CMSC 330: Organization of Programming Languages
Context-Free Grammars

Review
• Why should we study CFGs?
• What are the four parts of a CFG?
• How do we tell if a string is accepted by a CFG?
• What’s a parse tree?

Another Example (cont’d)
    S → a | SbS
• Is ababa in this language?
  A leftmost derivation
    S ⇒ SbS ⇒ abS ⇒ abSbS ⇒ ababS ⇒ ababa
  Another leftmost derivation
    S ⇒ SbS ⇒ SbSbS ⇒ abSbS ⇒ ababS ⇒ ababa

Ambiguity
• A string is ambiguous for a grammar if it has more than one parse tree
  – Equivalent to more than one leftmost (or more than one rightmost) derivation
• A grammar is ambiguous if it generates an ambiguous string
  – It can be hard to see this with manual inspection
• Exercise: can you create an unambiguous grammar for S → a | SbS ?

Are these Grammars Ambiguous?
(1) S → aS | T
    T → bT | U
    U → cU | ε
(2) S → T | T
    T → Tx | Tx | x | x
(3) S → SS | () | (S)

Tips for Designing Grammars
1. Use recursive productions to generate an arbitrary number of symbols
    A → xA | ε    Zero or more x’s
    A → yA | y    One or more y’s
2. Use separate non-terminals to generate disjoint parts of a language, and then combine in a production
    G = S → AB
        A → aA | ε
        B → bB | ε
    L(G) = a*b*

Tips for Designing Grammars (cont’d)
3. To generate languages with matching, balanced, or related numbers of symbols, write productions which generate strings from the middle
    {a^n b^n | n ≥ 0} (not a regular language!)
    S → aSb | ε
    Example: S ⇒ aSb ⇒ aaSbb ⇒ aabb

    {a^n b^(2n) | n ≥ 0}
    S → aSbb | ε

Tips for Designing Grammars (cont’d)
    {a^n b^m | m ≥ 2n, n ≥ 0}
    S → aSbb | B | ε
    B → bB | b
The following grammar also works:
    S → aSbb | B
    B → bB | ε
How about the following?
    S → aSbb | bS | ε

Tips for Designing Grammars (cont’d)
    {a^n b^m | m > n ≥ 0} ∪ {a^n c^m | m > n ≥ 0}
    S → T | U
    T → aTb | Tb | b    T generates the first set
    U → aUc | Uc | c    U generates the second set
• What’s the parse tree for string abbb?
• Ambiguous!

Tips for Designing Grammars (cont’d)
    {a^n b^m | m > n ≥ 0} ∪ {a^n c^m | m > n ≥ 0}
Will this fix the ambiguity?
    S → T | U
    T → aTb | bT | b
    U → aUc | cU | c
• It's not ambiguous, but it can generate invalid strings such as babb

Tips for Designing Grammars (cont’d)
    {a^n b^m | m > n ≥ 0} ∪ {a^n c^m | m > n ≥ 0}
Unambiguous version
    S → T | V
    T → aTb | U
    U → Ub | b
    V → aVc | W
    W → Wc | c

The Issue: Associativity
• Ambiguity is bad here because if the compiler needs to generate code for this expression, it doesn’t know what the programmer intended
• So what do we mean when we write a-b-c?
  – In mathematics, this only has one possible meaning
  – It’s (a-b)-c, since subtraction is left-associative
  – a-(b-c) would be the meaning if subtraction was right-associative

Another Example: If-Then-Else
    <stmt> ::= <assignment> | <if-stmt> | ...
    <if-stmt> ::= if (<expr>) <stmt>
                | if (<expr>) <stmt> else <stmt>
  – (Here <>’s are used to denote nonterminals and ::= for productions)
• Consider the following program fragment:
    if (x > y)
    if (x < z)
    a = 1;
    else a = 2;
  – Note: Ignore newlines

Parse Tree #1
• Else belongs to inner if

What if We Wanted Right-Associativity?
• Left-recursive productions are used for left-associative operators
• Right-recursive productions are used for right-associative operators
• Left:
    E → E+T | E-T | E*T | T
    T → a | b | c | (E)
• Right:
    E → T+E | T-E | T*E | T
    T → a | b | c | (E)

Parse Tree Shape
• The kind of recursion/associativity determines the shape of the parse tree
  – Exercise: draw a parse tree for a-b-c in the prior grammar in which subtraction is right-associative
[Figure: parse trees labeled "left recursion" and "right recursion"]

A Different Problem
• How about the string a+b*c ?
    E → E+T | E-T | E*T | T
    T → a | b | c | (E)
• Doesn’t have correct precedence for *
  – When a nonterminal has productions for several operators, they effectively have the same precedence
• How can we fix this?
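
The grouping behavior of the two grammars above can be checked with a small sketch. The Python below is illustrative only and is not part of the course material: the names parse, _expr, _term, and show are hypothetical, trees are plain nested tuples, and the left-recursive rule E → E+T | E-T | E*T | T is handled with a loop (a standard way to avoid infinite recursion in a hand-written parser) that produces the same grouping as the grammar's parse tree.

```python
OPERATORS = {"+", "-", "*"}
OPERANDS = {"a", "b", "c"}


def parse(text, left_assoc=True):
    """Parse text with the left-recursive grammar (default) or the right-recursive one."""
    tokens = [ch for ch in text if not ch.isspace()]
    tree, pos = _expr(tokens, 0, left_assoc)
    if pos != len(tokens):
        raise ValueError(f"unexpected token at position {pos}")
    return tree


def _term(tokens, pos, left_assoc):
    # T -> a | b | c | (E)
    if pos >= len(tokens):
        raise ValueError("unexpected end of input")
    tok = tokens[pos]
    if tok in OPERANDS:
        return tok, pos + 1
    if tok == "(":
        tree, pos = _expr(tokens, pos + 1, left_assoc)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise ValueError("expected ')'")
        return tree, pos + 1
    raise ValueError(f"unexpected token {tok!r}")


def _expr(tokens, pos, left_assoc):
    tree, pos = _term(tokens, pos, left_assoc)
    if left_assoc:
        # E -> E op T: each new operator wraps everything parsed so far,
        # so the tree grows down and to the left.
        while pos < len(tokens) and tokens[pos] in OPERATORS:
            op = tokens[pos]
            right, pos = _term(tokens, pos + 1, left_assoc)
            tree = (op, tree, right)
        return tree, pos
    # E -> T op E: the rest of the input becomes the right operand,
    # so the tree grows down and to the right.
    if pos < len(tokens) and tokens[pos] in OPERATORS:
        op = tokens[pos]
        right, pos = _expr(tokens, pos + 1, left_assoc)
        return (op, tree, right), pos
    return tree, pos


def show(tree):
    """Render a tree with explicit parentheses around every operator node."""
    if isinstance(tree, str):
        return tree
    op, left, right = tree
    return f"({show(left)}{op}{show(right)})"


print(show(parse("a-b-c")))                    # ((a-b)-c)
print(show(parse("a-b-c", left_assoc=False)))  # (a-(b-c))
print(show(parse("a+b*c")))                    # ((a+b)*c)
```

The last line illustrates the precedence problem: because +, - and * all come from the same nonterminal E, the left-recursive grammar groups a+b*c as (a+b)*c rather than a+(b*c).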