#+TITLE: A Goofism Guide to PARZIVAL
#+EXPORT_FILE_NAME: /ssh:colin@cicadas.surf:~/public/parzival-tutorial.html
#+OPTIONS: html-postamble:nil html-preamble:nil html-style:nil num:nil html-scripts:nil
#+HTML_HEAD:
* The Goofism Guide to PARZIVAL
_A Gentle Introduction_
This tutorial will guide you through the process of creating a
moderately sophisticated parser using [[https://github.com/cbeo/parzival][parzival]], a [[https://common-lisp.net/][Common Lisp]]
library for writing stream parsers using the [[https://en.wikipedia.org/wiki/Parser_combinator]["parser combinator"]]
approach.
To motivate your learning, you will be building up a parser for the
familiar JSON format, so take a minute to scroll all the way
through the [[https://www.json.org/json-en.html][JSON definition document]].
Notice the document's structure. At the top is the definition of a
JSON object. That definition refers to other terms, like arrays and
strings, the definitions of which appear below. And those terms
refer to still simpler terms, all the way to the bottom of the
document where the term for whitespace is defined.
As you work through this tutorial, you will work *up* the JSON
definition. That is, you will begin at the very bottom, with
whitespace. From there you will define parsers for increasingly
complex terms, all the way up to the top where the JSON object
appears.
Enough chit-chat. Time to get going.
** Concepts & Conventions
You should first understand a few concepts and conventions that
will come up in the rest of the tutorial.
*** Parsers
In =parzival= a parser is a function that accepts a character
stream and returns three values:
1. A result value, which can be anything.
2. A success indicator, which is ~T~ if the parse succeeded, or
~NIL~ otherwise.
3. The stream itself, which can be passed to further parsers or can
be examined if the parse failed.
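To make the three return values concrete, here is a minimal hand-written sketch of a function that follows this convention. It is not part of =parzival=: the name =parse-letter-a= is hypothetical, and it reads from a plain Common Lisp string input stream, which is an assumption made only for illustration.

#+BEGIN_SRC lisp
;; Hypothetical example, not taken from parzival itself.
;; Accepts the single character #\a at the front of STREAM and
;; returns the three values described above.
(defun parse-letter-a (stream)
  (let ((next (peek-char nil stream nil nil))) ; NIL at end of input
    (if (eql next #\a)
        ;; Success: the consumed character, T, and the stream.
        (values (read-char stream) t stream)
        ;; Failure: no result, NIL, and the untouched stream.
        (values nil nil stream))))

;; Collect the three values so they can be inspected one by one.
(with-input-from-string (s "abc")
  (multiple-value-bind (result success-p stream) (parse-letter-a s)
    (list result success-p (streamp stream))))
#+END_SRC

On the input ="abc"= the parse succeeds, so the first two values are the character =#\a= and =T=. On an input that begins with any other character, both are =NIL=.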
*** On the Terms "Accept", "Succeed", "Fail" and "Result"
Some of the terms used to talk about parsing are easily confused
or conflated with terms used to talk about functions. This is
especially the case in =parzival= because a parser *is* just a
function.
When parsing an input stream, the parser is said to "accept" the
input when the parse "succeeds" with a "result". Otherwise the
parser is said to "fail" to accept the input it was given.
That is, on the one hand, you may be said to *call* a *function*
with *arguments* so that it *returns* a value. On the other hand, a
*parser* will *accept* *input* and either *result* in a value or
*fail*.
It may seem like nitpicking, but these terms are used frequently
in =parzival='s documentation and in this tutorial. It is my hope
that explicit mention of the terms here will make the tutorial
easier to read and understand.
*** Naming Conventions
The =parzival= package exports a number of tragically un-lispy
looking symbols. You'll see things like =< (let ((string " "))
(parse string
PZ-JSON> (let ((string "
"))
(parse string
PZ-JSON>
#+END_SRC
So what is going on? The combinators =< (parse "hey dude" (<
#+END_SRC
The parser =(< (parse "hey dude" (<
#+END_SRC
The parse resulted in failure (indicated by a second return value of
=NIL=) because, though /dude/ appeared in the input, it was not at
the beginning of the stream.
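The same behaviour can be imitated in plain Common Lisp, without =parzival=. The sketch below uses a hypothetical helper, =make-prefix-parser=, that builds a parser accepting a given string only at the front of the stream. It is an illustration of the idea, not the library's implementation, and it assumes an ordinary string input stream.

#+BEGIN_SRC lisp
;; Hypothetical helper, not part of parzival.
;; Returns a parser that accepts EXPECTED only when it appears at the
;; very front of the stream.
(defun make-prefix-parser (expected)
  (lambda (stream)
    (loop for ch across expected
          do (if (eql ch (peek-char nil stream nil nil))
                 (read-char stream)                 ; matched, consume it
                 (return (values nil nil stream)))  ; mismatch, fail
          finally (return (values expected t stream)))))

;; Fails: "dude" is in the input, but not at the beginning.
(with-input-from-string (s "hey dude")
  (funcall (make-prefix-parser "dude") s))

;; Succeeds: the input starts with "dude".
(with-input-from-string (s "dude, hey")
  (funcall (make-prefix-parser "dude") s))
#+END_SRC

Unlike a real parser library, this sketch does not restore characters it has already consumed when a mismatch occurs partway through the expected string.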
At this point it seems clear that you will want to define parsers
that look something like this:
#+BEGIN_SRC lisp
(< (parse "hey dude" (<