Context-Free Grammars and Pushdown Automata in CMSC 330 - Prof. Atif M. Memon, Study notes of Programming Languages

An introduction to context-free grammars and pushdown automata for the computer science course CMSC 330. Topics covered include the differences between context-free grammars and regular expressions, the definition and workings of pushdown automata, and the equivalence of DFAs and regular grammars. The document also discusses the practical applications of context-free grammars in programming languages, including the use of regular expressions in lex and of context-free grammars in yacc.

Typology: Study notes

Pre 2010

Uploaded on 02/13/2009

koofers-user-oz9
Partial preview of the text

CMSC 330: Organization of Programming Languages

Context-Free Grammars: Pushdown Automaton

Reminders
• Project 2 Due Oct. 12

Regular expressions and CFGs
• Programming languages are not regular
  – Matching (an arbitrary number of) brackets so that they are balanced
• Usually almost context-free, with some hacks

  Machine                     Description              Languages
  pushdown automata (PDAs)    context-free grammars    context-free languages
  DFAs, NFAs                  regular expressions      regular languages

Equivalence of DFA and regular grammars

Pushdown Automaton (PDA)
• A pushdown automaton (PDA) is an abstract machine similar to the DFA
  – Has a finite set of states
  – Also has a pushdown stack
• Moves of the PDA are as follows:
  – An input symbol is read and the top symbol on the stack is read
  – Based on both inputs, the machine
    • Enters a new state, and
    • Writes zero or more symbols onto the pushdown stack
    • Or pops zero or more symbols from the stack
  – String accepted if the stack is empty AND the string has ended

Power of PDAs
• PDAs are more powerful than DFAs
  – a^n b^n, which cannot be recognized by a DFA, can easily be recognized by the PDA
    • Stack all a symbols and, for each b, pop an a off the stack.
    • If the end of input is reached at the same time that the stack becomes empty, the string is accepted

Context-free Grammars in Practice
• Regular expressions are used to turn raw text into a string of tokens
  – E.g., “if”, “then”, “identifier”, etc.
  – Whitespace and comments are simply skipped
  – These tokens are the input for the next phase of compilation
  – Standard tools used include lex and flex
    • Many others for Java
• CFGs are used to turn tokens into parse trees
  – This process is called parsing
  – Standard tools used include yacc and bison
• Those trees are then analyzed by the compiler, which eventually produces object code

Parsing
• There are many efficient techniques for turning strings into parse trees
  – They all have strange names, like LL(k), SLR(k), LR(k)...
  – Take CMSC 430 for more details
• We will look at one very simple technique: recursive descent parsing
  – This is a “top-down” parsing algorithm because we’re going to begin at the start symbol and try to produce the string

Example
  E → id = n | { L }
  L → E ; L | ε
  – Here n is an integer and id is an identifier
• One input might be
  – { x = 3; { y = 4; }; }
  – This would get turned into a list of tokens
      { x = 3 ; { y = 4 ; } ; }
  – And we want to turn it into a parse tree

Example (cont’d)
  E → id = n | { L }
  L → E ; L | ε

  { x = 3; { y = 4; }; }

  E
    {
    L
      E
        x = 3
      ;
      L
        E
          {
          L
            E
              y = 4
            ;
            L
              ε
          }
        ;
        L
          ε
    }

Parsing Algorithm
• Goal: determine if we can produce the string to be parsed from the grammar's start symbol
• At each step, we'll keep track of two facts
  – What tree node are we trying to match?
  – What is the next token (lookahead) of the input string?
• There are three cases:
  – If we’re trying to match a terminal and the next token (lookahead) is that token, then succeed, advance the lookahead, and continue
  – If we’re trying to match a nonterminal then pick which production to apply based on the lookahead
  – Otherwise, fail with a parsing error

Example (cont’d)
  E → id = n | { L }
  L → E ; L | ε

  { x = 3 ; { y = 4 ; } ; }

  [Figure: the parse tree shown above, annotated with a “lookahead” marker]

What’s Wrong with Parse Trees?
• Parse trees contain too much information
  – E.g., they have parentheses and they have extra nonterminals for precedence
  – This extra stuff is needed for parsing
• But when we want to reason about languages, it gets in the way (it’s too much detail)

Abstract Syntax Trees (ASTs)
• An abstract syntax tree is a more compact, abstract representation of a parse tree, with only the essential parts
  [Figure: a parse tree and the corresponding AST]

ASTs (cont’d)
• Intuitively, ASTs correspond to the data structure you’d use to represent strings in the language
  – Note that grammars describe trees (so do OCaml datatypes which we’ll see later)
  – E → a | b | c | E+E | E-E | E*E | (E)

The Compilation Process

Producing an AST
• To produce an AST, we modify the parse() functions to construct the AST along the way

Producing an AST (cont’d)

  type ast = Assn of string * int
           | Block of ast list

  (* lookahead and parse_term are helpers not shown here *)
  let rec parse_E () =
    if lookahead = "id" then
      let id = parse_term "id" in
      let _ = parse_term "=" in
      let n = parse_term "n" in
      Assn (id, int_of_string n)
    else if lookahead = "{" then begin
      let _ = parse_term "{" in
      let l = parse_L () in
      let _ = parse_term "}" in
      Block l
    end
    else failwith "Parse error"

  (* L starts with E, so check both tokens that can begin an E *)
  and parse_L () =
    if lookahead = "id" || lookahead = "{" then
      let e = parse_E () in
      let _ = parse_term ";" in
      let l = parse_L () in
      e :: l
    else []
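
The parse_E and parse_L functions above depend on a lookahead value and a parse_term helper that this excerpt does not define. As a minimal, self-contained sketch of the same recursive descent idea, the version below assumes the input has already been tokenized. The token type, the mutable tokens list, and the helpers lookahead, advance, expect and parse are all hypothetical names chosen for illustration; they are not taken from the slides.

```ocaml
(* Grammar:  E -> id = n | { L }      L -> E ; L | epsilon
   The token type and helper names are assumptions made for this sketch. *)
type token = Id of string | Num of int | Eq | Semi | LBrace | RBrace | EOF

type ast = Assn of string * int | Block of ast list

exception Parse_error of string

(* Remaining input; the head of the list is the lookahead token. *)
let tokens : token list ref = ref []

let lookahead () = match !tokens with [] -> EOF | t :: _ -> t

(* Consume one token (does nothing at end of input). *)
let advance () = match !tokens with [] -> () | _ :: rest -> tokens := rest

(* Match one specific terminal, or fail with a parse error. *)
let expect tok what =
  if lookahead () = tok then advance ()
  else raise (Parse_error ("expected " ^ what))

(* E -> id = n | { L } : the lookahead selects the production. *)
let rec parse_E () =
  match lookahead () with
  | Id name ->
      advance ();
      expect Eq "'='";
      (match lookahead () with
       | Num n -> advance (); Assn (name, n)
       | _ -> raise (Parse_error "expected an integer"))
  | LBrace ->
      advance ();
      let l = parse_L () in
      expect RBrace "'}'";
      Block l
  | _ -> raise (Parse_error "expected an identifier or '{'")

(* L -> E ; L | epsilon : take the first production only when the
   lookahead can begin an E, otherwise produce the empty list. *)
and parse_L () =
  match lookahead () with
  | Id _ | LBrace ->
      let e = parse_E () in
      expect Semi "';'";
      let l = parse_L () in
      e :: l
  | _ -> []

(* Parse a whole token list as a single E and reject trailing tokens. *)
let parse toks =
  tokens := toks;
  let e = parse_E () in
  if lookahead () <> EOF then raise (Parse_error "trailing input");
  e
```

With these definitions, the token list for { x = 3; { y = 4; }; }, written as [LBrace; Id "x"; Eq; Num 3; Semi; LBrace; Id "y"; Eq; Num 4; Semi; RBrace; Semi; RBrace], parses to Block [Assn ("x", 3); Block [Assn ("y", 4)]]. Matching on the token variant rather than comparing strings lets the identifier name and the integer value travel inside the token, which is one possible design and not necessarily the one used in the course.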