THEORY NOTES
We have discussed an algorithm for parsing a string of symbols (ie, is "[ ] { ( } )" legal? Yes, it is "balanced", but it has no meaning in C++). We have also developed an algorithm for translating an infix string into postfix. So, it makes sense to now think about grammars which are rules governing legal inputs.
The following grammar accepts (or produces) the string "abc".
1. S->ABC
A->a
B->b
C->c
Some terminology:
Terminals are denoted by lower-case letters. They are "terminal" because they do not lead anywhere.
Non-terminals are denoted by upper-case letters. They are "non-terminal" because they can be expanded.
The start symbol is traditionally denoted by an 'S'. It is the first rule to be expanded.
A lambda rule () denotes an empty string.
So, to modify the above example:
2. S->ABC
S->
A->a
B->b
C->c
This grammar accepts precisely the string "abc" and the empty string.
A->aB
B->bB
B->b
This grammar accepts precisely strings consisting of 1 a followed by 1or more b's. Note: It is not acceptable to say "an a followed by some b's." Be as precise as possible!
A context-free grammar is when:
Most context-free grammars can be rewritten in Chomsky Normal Form which means:
rules.
Examples 1, 2, and 3 are all context-free languages, but neither is in CNF. Examples 1 and 3 can be rewritten in CNF, but 2 cannot because of the rule.
All context-free grammars can be written in CNF except those containing a .
A-> a
B->
B->b
In the above example, even though there is a expression, an equivalent grammar could be written in CNF as follows:
S->a
S->AB
A->a
B->b
Note: When you have two non-terminals on the right side of the rule, you can expand them in any order. For consistency in this class, please use a left-hand derivation which means expanding the left-most non-terminal at any given time.
We can also create a "modified CNF" which will allow us to accept the empty string. To do this, create another start symbol.
A->a S'->AB
B->b S->AB
A->a
B->b