* Argot: A System For Grammar-Driven Macros

The ~argot~ system provides a macro-defining macro called
~deflanguage~. It accepts a context-free grammar, which is then used
to parse the body of every call to the macro it defines.

** A brief tour

The file ~/examples/calc.lisp~ contains the following ~deflanguage~ form:

#+begin_src lisp

(deflanguage calc (:documentation "A calculator language")
  (<start>
   :match (:or
           (:seq <subexpr> (:eof))
           (:seq <value> (:eof))
           (:seq <unop> (:eof))
           (:seq <binop> (:eof)))
   :then car)
  (<expr>
   :match (:or <subexpr> <value> <unop> <binop>))
  (<subexpr>
   :match (:{} calc)
   :note "A subexpression, like (1 + 2 / cos(1.5))")
  (<value>
   :match (:item)
   :if numberp
   :note "A Number")
  (<binop>
   :match (:seq
            (:@ lhs <expr>)
            (:@ rhs (:+ (:seq (:or= + - / * ^ %) <expr>))))
   :then (expand-binop lhs rhs))
  (<unop>
   :match (:seq (:or= sin cos tan -) <expr>)))

#+end_src

This defines a calculator language that you can use like so:

#+begin_src lisp

> (calc 1)
1
> (calc 1 + 2 + 3)
6
> (calc 1 + 2 * 3)
7
> (calc (1 + 2) * 3)
9
> (calc (1 + 2) * 3 + 1)
10
> (calc (1 + 2) * (3 + 1))
12
> (calc 2 ^ (2 * (1 + 1)))
16

#+end_src
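
The ~<binop>~ rule hands its bindings to ~expand-binop~, which is not
shown in this README. The following is a minimal sketch of what such a
helper could look like, not the actual code from ~/examples/calc.lisp~.
It assumes that ~lhs~ is a single operand, that ~rhs~ is a list of
~(OPERATOR OPERAND)~ pairs, and that ~^~ maps to ~expt~ and ~%~ to
~mod~. The names ~*binop-table*~ and ~sketch-expand-binop~ are
hypothetical. It uses precedence climbing, which would explain why
~1 + 2 * 3~ evaluates to 7 in the session above.

#+begin_src lisp
(defparameter *binop-table*
  '((+ 1 + nil) (- 1 - nil)
    (* 2 * nil) (/ 2 / nil) (% 2 mod nil)
    (^ 3 expt t))
  "Hypothetical table: OPERATOR -> (PRECEDENCE LISP-FUNCTION RIGHT-ASSOC-P).")

(defun sketch-expand-binop (lhs rhs)
  "Build a nested prefix form from LHS and RHS by precedence climbing.
RHS is assumed to be a list of (OPERATOR OPERAND) pairs."
  (labels ((info (op)
             (or (rest (assoc op *binop-table*))
                 (error "Unknown operator: ~s" op)))
           (climb (left min-prec)
             ;; Consume pairs while the next operator binds at least
             ;; as tightly as MIN-PREC.
             (loop while rhs
                   do (destructuring-bind (prec fn right-assoc-p)
                          (info (first (first rhs)))
                        (when (< prec min-prec) (return))
                        (let ((operand (second (pop rhs))))
                          ;; Tighter operators further right claim OPERAND first.
                          (setf left
                                (list fn left
                                      (climb operand
                                             (if right-assoc-p prec (1+ prec))))))))
             left))
    (climb lhs 1)))
#+end_src

Under these assumptions, ~(sketch-expand-binop 1 '((+ 2) (* 3)))~
returns the form ~(+ 1 (* 2 3))~.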

The symbol ~calc~ also has a function docstring. If you are using
[[https://slime.common-lisp.dev/][SLIME]], you can invoke ~M-x slime-documentation~ with your cursor over
the ~calc~ symbol to see this:

#+begin_src
Documentation for the symbol CALC:

Function:
 Arglist: (&BODY TOKENS)

 A calculator language

start           ::= (subexpr eof | value eof | unop eof | binop eof)
expr            ::= (subexpr | value | unop | binop)
subexpr         ::= {CALC}
value           ::= token
binop           ::= expr ('+' | '-' | '/' | '*' | '^' | '%') expr⁤+
unop            ::= ('SIN' | 'COS' | 'TAN' | '-') expr
------------------------------------------
ADDITIONAL NOTES:
subexpr           A subexpression, like (1 + 2 / cos(1.5))
value             A Number
------------------------------------------
KEY: 
token      Any ole token
eof        Explicitly match the end of the input
{LANGUAGE} Parse a sublist of tokens with LANGUAGE
(A|B|...)  One of the alternavites a b ...
PATTERN+   One or more PATTERN
PATTERN*   Zero or more PATTERN
[OPT]      Zero or one of OPT

#+end_src

** User Guide

A ~deflanguage~ form looks like this:

~(deflanguage NAME (&key (DOCUMENTATION "")) START-RULE &body RULES)~

~NAME~ is a symbol. This symbol becomes the name of the macro
you're defining.

~START-RULE~ and ~RULES~ are *rule definition forms*, each of which
looks like this:

~(NONTERMINAL :match PATTERN [:if PREDICATE] [:then ACTION] [:note STRING])~

A *nonterminal* is a symbol whose name is surrounded by angle brackets,
e.g. ~<RULE1>~ or ~<FOOBAR>~.

*** The start rule

The first rule in a ~deflanguage~ body is the ~START-RULE~. When the
~TOKENS~ parse successfully, the result of the start rule becomes the
expansion of the macro defined by ~deflanguage~.

*** Pattern Expressions

There are two kinds of patterns. First, any nonterminal counts as a
valid pattern.  Every other pattern is a list whose ~CAR~ is a keyword
and whose ~CDR~ varies depending on the value of the ~CAR~.

  | PATTERN              | MATCHES                                          |
  |----------------------+--------------------------------------------------|
  | ~(:seq . PATTERNS)~  | Matches a sequence of patterns, results          |
  |                      | in the sequence of results.                      |
  |----------------------+--------------------------------------------------|
  | ~(:?  PATTERN)~      | Optional match of PATTERN. Always succeeds.      |
  |                      | Succeeds with NIL if PATTERN doesn't match.      |
  |----------------------+--------------------------------------------------|
  | ~(:* PATTERN)~       | Zero or more of PATTERN, results in sequence     |
  |                      | of matches. Always succeeds.                     |
  |----------------------+--------------------------------------------------|
  | ~(:+ PATTERN)~       | One or more of PATTERN, results in a sequence.   |
  |----------------------+--------------------------------------------------|
  | ~(:or . PATTERNS)~   | Matches one of PATTERNS, checked left to right.  |
  |----------------------+--------------------------------------------------|
  | ~(:=  LITERAL)~      | Literal pattern matches. They match exactly      |
  | ~(:seq= . LITERALS)~ | their arguments (according to EQUALP). These     |
  | ~(:?=  LITERAL)~     | variants behave like their counterparts above,   |
  | ~(:*=  LITERAL)~     | except with literal value matches instead of     |
  | ~(:+=  LITERAL)~     | pattern expressions.                             |
  | ~(:or= . LITERALS)~  |                                                  |
  |----------------------+--------------------------------------------------|
  | ~(:@  VAR PATTERN)~  | Variable Binding. Matches PATTERN and binds the  |
  |                      | result to VAR, which is in-scope for the body of |
  |                      | :IF and :THEN clauses (see below).               |
  |----------------------+--------------------------------------------------|
  | ~(:{} LANGUAGE)~     | Matches a list of TOKENS using the grammar named |
  |                      | by LANGUAGE, i.e., it lets you compose languages |
  |                      | defined with DEFLANGUAGE. This is also the only  |
  |                      | way to parse sublists in the TOKENS list.        |
  |----------------------+--------------------------------------------------|
  | ~(:item)~            | Matches any token in TOKENS.                     |
  |----------------------+--------------------------------------------------|
  | ~(:eof)~             | Explicitly matches the end of the TOKENS list.   |
  |----------------------+--------------------------------------------------|
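
To make the table more concrete, here is a toy matcher for a subset of
these patterns, written in plain Common Lisp. It is an illustration of
the semantics described above, not Argot's implementation: the name
~toy-match~ is hypothetical, nonterminals, ~:@~ and ~:{}~ are left out,
and the result of ~(:eof)~ is assumed to be ~NIL~.

#+begin_src lisp
(defun toy-match (pattern tokens)
  "Toy matcher. Returns three values: success flag, result, remaining tokens."
  (flet ((fail () (values nil nil tokens)))
    (ecase (first pattern)
      (:item (if tokens
                 (values t (first tokens) (rest tokens))
                 (fail)))
      (:eof (values (null tokens) nil tokens))
      (:= (if (and tokens (equalp (second pattern) (first tokens)))
              (values t (first tokens) (rest tokens))
              (fail)))
      (:seq (let ((results '()) (remaining tokens))
              ;; Every sub-pattern must match, one after the other.
              (dolist (sub (rest pattern)
                           (values t (nreverse results) remaining))
                (multiple-value-bind (ok result more) (toy-match sub remaining)
                  (unless ok (return (fail)))
                  (push result results)
                  (setf remaining more)))))
      (:or (dolist (sub (rest pattern) (fail))
             ;; The first alternative that matches wins.
             (multiple-value-bind (ok result more) (toy-match sub tokens)
               (when ok (return (values t result more))))))
      (:? (multiple-value-bind (ok result more)
              (toy-match (second pattern) tokens)
            (if ok (values t result more) (values t nil tokens))))
      (:* (let ((results '()) (remaining tokens))
            (loop
              (multiple-value-bind (ok result more)
                  (toy-match (second pattern) remaining)
                ;; Stop on failure, or if no input was consumed.
                (when (or (not ok) (eq more remaining))
                  (return (values t (nreverse results) remaining)))
                (push result results)
                (setf remaining more)))))
      (:+ (multiple-value-bind (ok results more)
              (toy-match (list :* (second pattern)) tokens)
            (declare (ignore ok))
            ;; Like :*, but at least one match is required.
            (if results (values t results more) (fail)))))))
#+end_src

With this sketch, matching
~(:seq (:= move) (:item) (:? (:= fast)) (:eof))~ against the tokens
~(move 10 fast)~ succeeds with the result ~(MOVE 10 FAST NIL)~ and no
remaining tokens.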


*** IF clauses

An ~IF~ clause checks the result of a match against a predicate. If
the predicate returns ~NIL~, the match fails.

An ~IF~ clause can be either a function designator or an arbitrary
s-expression. A function designator is called on the match result; an
expression is evaluated with any ~:@~ variables in scope.

**** Example 1: Function Designator IF Clauses

#+begin_src lisp

(<rule1>
   :match (:item)
   :if symbolp)

#+end_src

This checks that the token matched by the ~(:item)~ pattern is a
symbol.

**** Example 2: Expression IF Clauses

#+begin_src lisp

(<rule2>
  :match (:seq (:= index-of) (:@ idx (:item)) (:@ str (:item)))
  :if (and (integerp idx) (stringp str)))

#+end_src

This matches sequences like ~INDEX-OF 3 "Hello"~. It ensures that
~3~, which is bound to ~idx~, is an integer, and that ~"Hello"~,
which is bound to ~str~, is a string.

For example, ~INDEX-OF "Hello" 4~ would fail to match, because
~"Hello"~ is not an integer.
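
As a rough illustration of what this rule checks, the same test can be
written by hand as an ordinary function. This is only a sketch for
comparison, not code that ~deflanguage~ generates, and the name
~match-index-of~ is hypothetical.

#+begin_src lisp
(defun match-index-of (tokens)
  "Hand-written analogue of <rule2>. Returns MATCHED-P, IDX and STR."
  (destructuring-bind (&optional head idx str &rest more) tokens
    ;; <rule2> has no (:eof), so trailing tokens are not inspected here.
    (declare (ignore more))
    (if (and (equalp head 'index-of)  ; (:= index-of)
             (integerp idx)           ; first half of the :if expression
             (stringp str))           ; second half of the :if expression
        (values t idx str)
        (values nil nil nil))))
#+end_src

Here ~(match-index-of '(index-of 3 "Hello"))~ succeeds, while
~(match-index-of '(index-of "Hello" 4))~ does not.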

*** THEN clauses

A ~THEN~ clause lets users transform a match result. Just like an ~IF~
clause, it can be either a function designator or an arbitrary expression.

**** Example

#+begin_src lisp

(<rule3>
  :match (:seq
            (:= :foo)
            (:@ part1 <rule4>)
            (:= :bar)
            (:@ part2 <rule5>))
  :if (and (good? part1) (also-good? part2))
  :then (list part1 part2))

#+end_src

When ~<rule3>~ succeeds, it returns a list of two values: the results
bound to ~part1~ and ~part2~.

*** NOTE clauses

Each rule accepts an optional ~:note~ keyword argument followed by a
string. The note appears in the documentation string of the macro
being defined, so keep it brief.