Understanding Parsing and Ambiguity in Computer Science: Intro to Context-Free Grammars (Study notes, Computer Science)

A lecture outline for Topic 15 of CS421 at the University of Illinois at Urbana-Champaign, focusing on context-free grammars. It covers the basics of grammars, their properties, and the concept of ambiguity. Students will learn how to identify and explain the parts of a grammar, how to show ambiguity using derivations or parse trees, and how strings are converted to trees through lexers, tokens, parsers, and derivations.

Typology: Study notes

Pre 2010

Uploaded on 03/16/2009


CS421 Topic 15: Introduction to Grammars¹
Sameer Sundresh
sundresh@uiuc.edu
University of Illinois at Urbana-Champaign
June 19, 2007

¹ Based on slides by Mattox Beckman, as updated by Vikram Adve, Gul Agha, Elsa Gunter, and Mark Hills.

Outline

- Objectives and Review
- Context-Free Grammars
  - CFGs
  - Example
  - Derivations and Parse Trees
  - Ambiguity
- Properties of Grammars

Reminder: The Solution

Characters → Lexer → Tokens → Parser → Tree

The conversion from strings to trees is accomplished in two steps.

- First, convert the stream of characters into a stream of tokens.
  - This is called lexing or scanning.
  - Turns characters into words and categorizes them.
  - We did this in the last few lectures!
- Second, convert the stream of tokens into an abstract syntax tree.
  - This is called parsing.
  - Turns words into sentences.

Context-free Grammars

Def: A Context-free Grammar (CFG) is a 4-tuple G = (N, Σ, P, S), where:

1. N is a finite, nonempty set of symbols (non-terminals)
2. Σ is a finite set of symbols (terminals)
   N ∩ Σ = ∅
   V ≡ N ∪ Σ (vocabulary)
3. P is a finite subset of N × V∗ (production rules)
4. S ∈ N (goal symbol or start symbol)

Sometimes written as G = (V, Σ, P, S), with N = V − Σ.

Example Grammar: Arithmetic Expressions

G = (N, Σ, P, S) where:

N = {E, T, F}
Σ = {(, ), +, ∗, id}
P = { E → T
      E → E + T
      T → F
      T → T ∗ F
      F → id
      F → (E) }
S = E

Note: P ⊆ N × V∗, where V = N ∪ Σ = {E, T, F, (, ), +, ∗, id}

Note: (A, α) ∈ P is usually written A → α, or A ::= α, or A : α

Parse Trees of a Grammar

A Parse Tree for a grammar G is any tree in which:

- The root is labeled with S.
- Each leaf is labeled with a token a (a ∈ Σ) or ε (the empty string).
- Each interior node is labeled by a non-terminal.
- If an interior node is labeled A and has children labeled X1, ..., Xn, then A → X1 ... Xn is a production of G.
- If A → ε is a production in G, then a node labeled A may have a single child labeled ε.

The string formed by the leaf labels (left to right) is the yield of the parse tree.

Parse Trees (continued)

- An intermediate parse tree is the same as a parse tree except the leaves can be non-terminals.

Notes:

- Every α ∈ L(G) is the yield of some parse tree. Why?
  - Consider a derivation, α0 ⇒ α1 ⇒ ... ⇒ αn, where αn ∈ L(G). For each αi, we can construct an intermediate parse tree. The last one will be the parse tree for the sentence αn.
- A parse tree ignores the order in which symbols are replaced to derive a string.
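As a rough illustration of the definitions above, the arithmetic grammar and its parse trees can be written down as OCaml values. This is only a sketch: the type and function names (`symbol`, `tree`, `well_formed`, `yield_of`, `is_parse_tree`) are invented for this example and are not from the lecture, and ε leaves are left out because this particular grammar has no ε productions.

```ocaml
(* Vocabulary V = N ∪ Σ for the arithmetic grammar (names are illustrative). *)
type nonterminal = E | T | F
type terminal = LParen | RParen | Plus | Star | Id
type symbol = N of nonterminal | Tm of terminal

(* A production (A, alpha) in P, usually written A -> alpha. *)
type production = nonterminal * symbol list

let productions : production list = [
  (E, [N T]);
  (E, [N E; Tm Plus; N T]);
  (T, [N F]);
  (T, [N T; Tm Star; N F]);
  (F, [Tm Id]);
  (F, [Tm LParen; N E; Tm RParen]);
]

(* The start symbol S. *)
let start = E

(* Leaves carry terminals; interior nodes carry non-terminals. *)
type tree =
  | Leaf of terminal
  | Node of nonterminal * tree list

(* The label at the root of a subtree, viewed as a vocabulary symbol. *)
let label = function
  | Leaf a -> Tm a
  | Node (a, _) -> N a

(* Each interior node A with children X1 ... Xn must match A -> X1 ... Xn. *)
let rec well_formed = function
  | Leaf _ -> true
  | Node (a, kids) ->
      List.mem (a, List.map label kids) productions
      && List.for_all well_formed kids

(* A parse tree is a well-formed tree whose root is labeled S. *)
let is_parse_tree t = label t = N start && well_formed t

(* The yield: leaf labels read left to right. *)
let rec yield_of = function
  | Leaf a -> [a]
  | Node (_, kids) -> List.concat (List.map yield_of kids)

(* A parse tree whose yield is id * id. *)
let id_times_id =
  Node (E, [Node (T, [Node (T, [Node (F, [Leaf Id])]);
                      Leaf Star;
                      Node (F, [Leaf Id])])])
```

Here `is_parse_tree id_times_id` holds and `yield_of id_times_id` is `[Id; Star; Id]`. Representing productions as plain pairs keeps the check for "is a production of G" down to a list membership test.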
Derivations and Parse Trees

Example: The rightmost derivation and the parse tree for id ∗ id

E ⇒ T ⇒ T ∗ F ⇒ T ∗ id ⇒ F ∗ id ⇒ id ∗ id

The final parse tree (the last in the sequence of intermediate parse trees built along the derivation):

        E
        |
        T
      / | \
     T  ∗  F
     |     |
     F     id
     |
     id

Order of Evaluation of Parse Tree

Note: These are conventions, not theorems.

- Code for a non-terminal is evaluated as a single "block"
  - I.e., cannot partially execute it, then execute something else, then evaluate the rest
  - A different parse tree would be needed to achieve that
  - E.g. 1: Non-terminal T enforces precedence of * over +
  - E.g. 2: E → E + T enforces left-associativity, E → T + E enforces right-associativity.
- Parse tree does not specify order of execution of code blocks
  - Must be enforced by code generated for parent block. Obey:
    - Operator (e.g., +) cannot be evaluated before operands
    - Associativity rules

Common Sources of Ambiguity

There are two common forms of ambiguity:

- The "dangling else" form:

  E → if E then E else E
  E → if E then E
  E → whatever

  Example: if a then if x then y else z
  ... to which if does the else belong?

- The "double-ended recursion" form:

  E → E + E
  E → E * E

  Example: "3 + 4 * 5"
  ... is it "(3 + 4) * 5" or "3 + (4 * 5)"?
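To make the double-ended recursion example concrete, the two readings of "3 + 4 * 5" can be built by hand as trees and compared. This is a minimal sketch: the `expr` type, the `Int` leaf (standing in for an integer token that the slide's grammar fragment leaves implicit), and the function names are assumptions made for illustration.

```ocaml
(* Trees for the ambiguous grammar E -> E + E | E * E, plus an assumed
   integer leaf. *)
type expr =
  | Int of int
  | Add of expr * expr
  | Mul of expr * expr

(* Evaluate a tree bottom-up: operands before operators. *)
let rec eval = function
  | Int n -> n
  | Add (a, b) -> eval a + eval b
  | Mul (a, b) -> eval a * eval b

(* Read the leaves and operators left to right, ignoring grouping. *)
let rec yield_of = function
  | Int n -> [string_of_int n]
  | Add (a, b) -> yield_of a @ ["+"] @ yield_of b
  | Mul (a, b) -> yield_of a @ ["*"] @ yield_of b

(* The two parse trees the ambiguous grammar allows for 3 + 4 * 5. *)
let grouped_left = Mul (Add (Int 3, Int 4), Int 5)    (* (3 + 4) * 5 *)
let grouped_right = Add (Int 3, Mul (Int 4, Int 5))   (* 3 + (4 * 5) *)
```

Both trees have the same yield, `["3"; "+"; "4"; "*"; "5"]`, yet `eval grouped_left` is 35 while `eval grouped_right` is 23. Two distinct trees for one sentence is exactly what makes the grammar ambiguous, and the differing results show why it matters.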
The Dangling-Else Ambiguity

Draw two separate parse trees for the "dangling else" example:

if a then if x then y else z

E → if E then E else E
E → if E then E
E → id

Note: id is the common token for variable names a, x, y, z.

Fixing Ambiguity

- Ambiguity can often be eliminated by thinking more carefully about what you are trying to express with your grammar.
  - Double-ended recursion usually reveals a lack of precedence and associativity information.
  - "Dangling else" usually matches with the nearest if. This can be encoded in the grammar. See §4.3 of the Dragon Book for details.
  - Language fixes can eliminate this problem, for instance keywords or symbols that identify the start and end of control blocks (e.g. if-then-else-fi).

Fixing Ambiguity

- The "double-ended recursion" form usually reveals a lack of precedence and associativity information. A technique called stratification often fixes this.
  - Left-recursive means "associates to the left"; similarly for right-recursive.
  - Higher precedence rules occur lower in the grammar.

E → E + T
E → T
T → T * F
T → F
F → ( E )
F → integer

Properties of Grammars

It is important to be able to say what properties a grammar has. Informally:

Epsilon Productions
  A production of the form "E → ε", where ε represents the empty string.

Right Linear Grammar
  A grammar in which all the productions have the form "E → x E" or "E → x".

Left-Recursive Grammar
  A grammar that can generate "E ⟶ E + X" (for example). Similarly, "right-recursive grammars."

Ambiguous Grammar
  More than one parse tree is possible for a specific sentence.

Left-Recursive Grammars

- A grammar is recursive if a symbol being produced (the one on the left-hand side) reappears in the right-hand side after one or more steps.
  Example: "E → if E then E else E"
- A grammar is left-recursive if the production symbol appears as the first symbol on the right-hand side (in one or more steps).
  Example: "E → E + F"
- Example with indirect left recursion (two steps):
  A → Bx
  B → Ay

Representing Parse Trees in OCaml

Our end goal of parsing will be to build an OCaml data structure representing the parsed form of a program. Although we will focus more on this later, our strategy will be to

- create an OCaml datatype for each syntactic category in the language
  - this datatype will most likely be mutually recursive, to represent the inherent recursive structure of most language definitions
- generate an OCaml term, using these mutually recursive types, representing the parsed form of the program; containment in a type constructor shows that the contained items are children of the containing item in the AST
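The strategy above can be sketched for the stratified expression grammar (E, T, F), with one datatype per syntactic category. The constructor names (`Plus`, `ExpTerm`, `Times`, `TermFactor`, `Paren`, `Integer`) and the evaluator are hypothetical choices for this example, not definitions taken from the course.

```ocaml
(* One datatype per syntactic category; the types are mutually recursive
   because F -> ( E ) refers back to E. *)
type exp =
  | Plus of exp * term         (* E -> E + T *)
  | ExpTerm of term            (* E -> T *)
and term =
  | Times of term * factor     (* T -> T * F *)
  | TermFactor of factor       (* T -> F *)
and factor =
  | Paren of exp               (* F -> ( E ) *)
  | Integer of int             (* F -> integer *)

(* A mutually recursive evaluator that follows the shape of the types. *)
let rec eval_exp = function
  | Plus (e, t) -> eval_exp e + eval_term t
  | ExpTerm t -> eval_term t
and eval_term = function
  | Times (t, f) -> eval_term t * eval_factor f
  | TermFactor f -> eval_factor f
and eval_factor = function
  | Paren e -> eval_exp e
  | Integer n -> n

(* The parsed form of 3 + 4 * 5: the multiplication sits inside a term,
   below the addition, so it is grouped first. *)
let three_plus_four_times_five =
  Plus (ExpTerm (TermFactor (Integer 3)),
        Times (TermFactor (Integer 4), Integer 5))

(* The parsed form of (3 + 4) * 5: parentheses are the only way to put an
   addition under a multiplication. *)
let parenthesized =
  ExpTerm (Times (TermFactor (Paren (Plus (ExpTerm (TermFactor (Integer 3)),
                                           TermFactor (Integer 4)))),
                  Integer 5))
```

With these types, `eval_exp three_plus_four_times_five` is 23 and `eval_exp parenthesized` is 35. Because `Times` only accepts a `term` and a `factor`, a term such as `Times (Plus ..., ...)` does not type-check, so the precedence encoded by the grammar is also enforced by the datatypes.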