* Argot: A System For Grammar-Driven Macros

The ~argot~ system provides a macro-defining macro called ~deflanguage~. It accepts a context-free grammar, which is then used to parse the body of calls to the defined macro.

** A brief tour

In ~/examples/calc.lisp~ you will find the following ~deflanguage~:

#+begin_src lisp
(deflanguage calc (:documentation "A calculator language")
  ( :match (:or (:seq (:eof))
                (:seq (:eof))
                (:seq (:eof))
                (:seq (:eof)))
    :then car)
  ( :match (:or ))
  ( :match (:{} calc))
  ( :match (:item)
    :if numberp)
  ( :match (:seq (:@ lhs ) (:@ rhs (:+ (:seq (:or= + - / * ^ %) ))))
    :then (expand-binop lhs rhs))
  ( :match (:seq (:or= sin cos tan -) )))
#+end_src

This defines a calculator language that you can use like so:

#+begin_src lisp
> (calc 1)
1
> (calc 1 + 2 + 3)
6
> (calc 1 + 2 * 3)
7
> (calc (1 + 2) * 3)
9
> (calc (1 + 2) * 3 + 1)
10
> (calc (1 + 2) * (3 + 1))
12
> (calc 2 ^ (2 * (1 + 1)))
16
#+end_src

The symbol ~calc~ also has a function docstring. If you are using SLIME, you can invoke ~M-x slime-documentation~ with your cursor over the ~calc~ symbol to see this:

#+begin_src
Documentation for the symbol CALC:

Function:
 Arglist: (&BODY TOKENS)

A calculator language

 ::= ( ::EOF:: | ::EOF:: | ::EOF:: | ::EOF::)
 ::= ( | | | )
 ::= {CALC}
 ::= ::TOKEN::
 ::= ('+' | '-' | '/' | '*' | '^' | '%') +
 ::= ('SIN' | 'COS' | 'TAN' | '-')

KEY:
::TOKEN::   Any ole token
::EOF::     Explicitly match the end of the input
{GRAMMAR}   Parse a sublist of tokens with GRAMMAR
(a|b|..)    One of the alternavites a b ...
PATTERN+    One or more PATTERN
PATTERN*    Zero or more PATTERN
            A nonterminal symbol - naming a parse rule
[OPT]       Zero or one of OPT
#+end_src

** User Guide

A ~deflanguage~ form looks like this:

~(deflanguage NAME (&key (DOCUMENTATION "")) START-RULE &body RULES)~

~NAME~ is a symbol. This symbol becomes the name of the macro you're defining.
~START-RULE~ and ~RULES~ are *rule definition forms*, each of which looks like this:

~(NONTERMINAL :match PATTERN [:if PREDICATE] [:then ACTION])~

A nonterminal is a symbol whose name is surrounded by angle brackets.

*** Pattern Expressions

There are two kinds of patterns. First, any nonterminal counts as a valid pattern. Second, every other pattern is a list whose ~CAR~ is a keyword and whose ~CDR~ depends on that keyword.

| PATTERN              | MATCHES                                          |
|----------------------+--------------------------------------------------|
| ~(:seq . PATTERNS)~  | Matches a sequence of patterns, results          |
|                      | in the sequence of results.                      |
|----------------------+--------------------------------------------------|
| ~(:? PATTERN)~       | Optional match of PATTERN. Always succeeds.      |
|                      | Succeeds with NIL if PATTERN doesn't match.      |
|----------------------+--------------------------------------------------|
| ~(:* PATTERN)~       | Zero or more of PATTERN, results in a sequence   |
|                      | of matches. Always succeeds.                     |
|----------------------+--------------------------------------------------|
| ~(:+ PATTERN)~       | One or more of PATTERN, results in a sequence.   |
|----------------------+--------------------------------------------------|
| ~(:or . PATTERNS)~   | Matches one of PATTERNS, checked left to right.  |
|----------------------+--------------------------------------------------|
| ~(:= LITERAL)~       | Literal pattern matches. They match exactly      |
| ~(:seq= . LITERALS)~ | their arguments (according to EQUALP). These     |
| ~(:?= LITERAL)~      | variants behave like their counterparts above,   |
| ~(:*= LITERAL)~      | except with literal value matches instead of     |
| ~(:+= LITERAL)~      | pattern expressions.                             |
| ~(:or= . LITERALS)~  |                                                  |
|----------------------+--------------------------------------------------|
| ~(:@ VAR PATTERN)~   | Variable binding. Matches PATTERN and binds the  |
|                      | result to VAR, which is in scope for the body of |
|                      | :IF and :THEN clauses (see below).               |
|----------------------+--------------------------------------------------|
| ~(:{} LANGUAGE)~     | Match a list of TOKENS using a grammar named     |
|                      | by LANGUAGE. This lets you compose languages     |
|                      | defined with DEFLANGUAGE. This is also the only  |
|                      | way to parse sublists in the TOKENS list.        |
|----------------------+--------------------------------------------------|
| ~(:item)~            | Matches any token in TOKENS.                     |
|----------------------+--------------------------------------------------|
| ~(:eof)~             | Explicitly matches the end of the TOKENS list.   |
|----------------------+--------------------------------------------------|

*** IF clauses

An ~IF~ clause checks the result of a match against a predicate. If the predicate returns ~NIL~, the match fails.

An ~IF~ clause can be either a function designator or an arbitrary s-expression.

**** Example 1: Function Designator IF Clauses

#+begin_src
( :match (:item)
  :if symbolp)
#+end_src

This checks that the token returned by matching against the ~(:item)~ pattern is a symbol.

**** Example 2: Expression IF Clauses

#+begin_src
( :match (:seq (:= index-of) (:@ idx (:item)) (:@ str (:item)))
  :if (and (integerp idx) (stringp str)))
#+end_src

This matches sequences like ~INDEX-OF 3 "Hello"~. It ensures that ~3~, which is bound to ~idx~, is an integer, and that ~"Hello"~, which is bound to ~str~, is a string. For example, ~INDEX-OF "Hello" 4~ would fail to match.

*** THEN clauses

A ~THEN~ clause transforms a match result. Just like an ~IF~ clause, it can be either a function designator or an arbitrary expression.

**** Example

#+begin_src
( :match (:seq (:= :foo) (:@ part1 ) (:= :bar) (:@ part2 ))
  :if (and (good? part1) (also-good? part2))
  :then (list part1 part2))
#+end_src

When this rule succeeds, it results in a list of two values: the results bound to ~part1~ and ~part2~.

*** The start rule

The very first rule in a ~deflanguage~ body is the ~START-RULE~.
On a successful parse of the ~TOKENS~, the result of the start rule is what the macro defined by ~deflanguage~ expands into.
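
To make the expansion step concrete, here is a minimal hand-written sketch in plain Common Lisp. It does not use ~argot~ and is not how ~argot~ is implemented. The names ~parse-swap~ and ~swap-pair~, and the nonterminal ~<swap>~ in the comment, are hypothetical and exist only for illustration. The function plays the role of a start rule: it receives the macro's ~TOKENS~ and returns a form, and that form becomes the macro's expansion.

#+begin_src lisp
;; Hand-written stand-in for a language macro; NOT argot's implementation.
;; It imitates a start rule of roughly this shape, where <swap> is a
;; hypothetical nonterminal name:
;;
;;   (<swap> :match (:seq (:@ a (:item)) (:@ b (:item)) (:eof))
;;           :then (list 'list b a))

(eval-when (:compile-toplevel :load-toplevel :execute)
  (defun parse-swap (tokens)
    "Play the role of the start rule: accept exactly two tokens and
return the form the macro should expand into."
    (destructuring-bind (a b) tokens
      (list 'list b a))))

(defmacro swap-pair (&body tokens)
  ;; Whatever the stand-in start rule returns becomes the expansion.
  (parse-swap tokens))

(macroexpand-1 '(swap-pair 1 2)) ; => (LIST 2 1), T
(swap-pair 1 2)                  ; => (2 1)
#+end_src

The point of the sketch is that the start rule's result is code, not a value: a ~THEN~ clause on the start rule should build the form you want evaluated at the call site.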