Attribute Grammars and Ad-hoc Syntax-Directed Translation for Semantic Analysis, Study notes of Computer Science

An introduction to attribute grammars and their use in semantic analysis. Attribute grammars augment context-free grammars with rules that compute the values of attributes for each symbol in a derivation. The notes cover the basics of attribute grammars, including inherited and synthesized attributes, and discuss evaluation methods. They then introduce ad-hoc syntax-directed translation and show how it addresses the shortcomings of the attribute grammar paradigm, with examples and a comparison of the two approaches.

Typology: Study notes

Pre 2010

Uploaded on 07/30/2009

koofers-user-7bt



Partial preview of the text

Slide 1: Context-Sensitive Analysis (CS430)

Slide 2: Roadmap (Where are we?)

Last lecture
• LR(1) parsing
  → Building ACTION / GOTO tables
  → Shift / reduce and reduce / reduce conflicts
  → SLR(1), LALR(1) parsers

This lecture
• Context-sensitive analysis
  → Motivation
  → Attribute grammars
      - Attributes
      - Evaluation order
  → Ad hoc syntax-directed translation

Slide 3: Context-Sensitive Analysis: Beyond Syntax

There is a level of correctness that is deeper than grammar.
To generate code, we need to understand its meaning!

    fie(a,b,c,d)
      int a, b, c, d;
    { … }

    fee() {
      int f[3], g[0], h, i, j, k;
      char *p;
      fie(h,i,"ab",j, k);
      k = f * i + j;
      h = g[17];
      printf("<%s,%s>.\n", p,q);
      p = 10;
    }

What is wrong with this program? (let me count the ways …)
• declared g[0], used g[17]
• wrong number of args to fie()
• “ab” is not an int
• wrong dimension on use of f
• undeclared variable q
• 10 is not a character string

All these errors are “deeper than syntax”

Slide 4: Beyond Syntax

To generate code, the compiler needs to answer many questions
• Is “x” a scalar, an array, or a function? Is “x” declared?
• Are there names that are not declared? Declared but not used?
• Which declaration of “x” does each use reference?
• Is the expression “x * y + z” type-consistent?
• In “a[i,j,k]”, does a have three dimensions?
• Where can “z” be stored? (register, local, global, heap, static)
• In “f ← 15”, how should 15 be represented?
• How many arguments does “fie()” take? What about “printf()”?
• Does “*p” reference the result of a “malloc()”?
• Do “p” & “q” refer to the same memory location?
• Is “x” defined before it is used?
These are beyond a CFG

Slide 5: Beyond Syntax

These questions are part of context-sensitive analysis
• Answers depend on “values”, not parts of speech
• Questions & answers involve non-local information
• Answers may involve computation

How can we answer these questions?
• Use formal methods
  → Context-sensitive grammars?
  → Attribute grammars? (attributed grammars?)
• Use ad-hoc techniques
  → Symbol tables
  → Ad-hoc code (action routines)

In scanning & parsing, formalism won; different story here.

Slide 6: Beyond Syntax

Telling the story
• The attribute grammar formalism is important
  → Succinctly makes many points clear
  → Sets the stage for actual, ad-hoc practice
• The problems with attribute grammars motivate practice
  → Non-local computation
  → Need for centralized information

We will cover attribute grammars, then move on to ad-hoc ideas

Slide 7: Attribute Grammars

What is an attribute grammar?
• A context-free grammar augmented with a set of rules
• Each symbol in the derivation has a set of values, or attributes
• The rules specify how to compute a value for each attribute

Example grammar

    Number → Sign List
    Sign   → + | –
    List   → List Bit | Bit
    Bit    → 0 | 1

This grammar describes signed binary numbers.
We would like to augment it with rules that compute the decimal value of each valid input string.

Slide 8: Examples

We will use these two throughout the lecture

For “–1”
    Number ⇒ Sign List ⇒ – List ⇒ – Bit ⇒ – 1
    Parse tree: Number( Sign(–), List( Bit(1) ) )

For “–101”
    Number ⇒ Sign List ⇒ Sign List Bit ⇒ Sign List 1 ⇒ Sign List Bit 1 ⇒ Sign List 0 1 ⇒ Sign Bit 0 1 ⇒ Sign 1 0 1 ⇒ – 101
    Parse tree: Number( Sign(–), List( List( List( Bit(1) ), Bit(0) ), Bit(1) ) )

Slide 9: Attribute Grammars

Add rules to compute the decimal value of a signed binary number

    Productions             Attribution Rules
    Number → Sign List      List.pos ← 0
                            if Sign.neg
                              then Number.val ← – List.val
                              else Number.val ← List.val
    Sign → +                Sign.neg ← false
         | –                Sign.neg ← true
    List0 → List1 Bit       List1.pos ← List0.pos + 1
                            Bit.pos ← List0.pos
                            List0.val ← List1.val + Bit.val
          | Bit             Bit.pos ← List.pos
                            List.val ← Bit.val
    Bit → 0                 Bit.val ← 0
        | 1                 Bit.val ← 2^Bit.pos

    Symbol    Attributes
    Number    val
    Sign      neg
    List      pos, val
    Bit       pos, val

Slide 10: Attribute Grammars

    Productions             Attribution Rules
    List0 → List1 Bit       List1.pos ← List0.pos + 1
                            Bit.pos ← List0.pos
                            List0.val ← List1.val + Bit.val

[Figure: dependency graph over the pos and val attributes of LIST0, LIST1, and BIT]

• Semantic rules define a partial dependency graph
• Values that flow top-down or across: inherited attributes
• Values that flow bottom-up: synthesized attributes

Slide 11: Attribute Grammars

Note:
• Semantic rules associated with production A → α have to specify the values for all
  - synthesized attributes for A (root)
  - inherited attributes for grammar symbols in α (children)
  ⇒ rules must specify local value flow!
• Terminals can be associated with values returned by the scanner. These input values are associated with a synthesized attribute.
• The start symbol cannot have inherited attributes.

[Figure: pos and val attributes of LIST0, LIST1, and BIT]

Slide 12: Attribute Grammars

• Question: What rules specify values for [...], [...] and [...]?

[Figure: pos and val attributes of LIST0, LIST1, and BIT]

Slide 25: Circularity

We can only evaluate acyclic instances
• We can prove that some grammars can only generate instances with acyclic dependence graphs
• Largest such class is “strongly non-circular” grammars (SNC)
• SNC grammars can be tested in polynomial time

Many evaluation methods discover circularity dynamically
⇒ Bad property for a compiler to have

SNC grammars were first defined by Kennedy & Warren

Slide 26: An Extended Example

Grammar for a basic block (§ 4.3.3)

    Block0 → Block1 Assign
           | Assign
    Assign → Ident = Expr ;
    Expr0  → Expr1 + Term
           | Expr1 – Term
           | Term
    Term0  → Term1 * Factor
           | Term1 / Factor
           | Factor
    Factor → ( Expr )
           | Number
           | Identifier

Let’s estimate cycle counts
• Each operation has a COST
• Add them, bottom up
• Assume a load per value
• Assume no reuse

Simple problem for an AG

Hey, this looks useful!
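The estimate described above amounts to a bottom-up sum over the tree. The following is a minimal Python sketch of that idea, not the course's implementation. The tuple encoding of tree nodes, the function names, and the cycle counts in `COST` are all assumptions made for illustration; the slides name the operations but give no numbers.

```python
# Hypothetical cycle counts: the slides name the operations but not their costs.
COST = {"store": 3, "load": 3, "loadI": 1, "add": 1, "sub": 1, "mult": 2, "div": 4}
OPS = {"+": "add", "-": "sub", "*": "mult", "/": "div"}


def expr_cost(node):
    """Synthesized cost of an expression (node encoding is an assumption).

    ("num", 2) is a Number, ("id", "a") an Identifier,
    (op, left, right) a binary operation with op in OPS.
    """
    kind = node[0]
    if kind == "num":
        return COST["loadI"]
    if kind == "id":
        return COST["load"]  # a load per value, no reuse
    _, left, right = node
    return expr_cost(left) + COST[OPS[kind]] + expr_cost(right)


def assign_cost(assign):
    """Cost of `target = expr ;`, given as the pair (target, expr)."""
    _target, expr = assign
    return COST["store"] + expr_cost(expr)


def block_cost(assigns):
    """Cost of a basic block: the sum over its assignments."""
    return sum(assign_cost(a) for a in assigns)


# x = a + a * 2 ;
block = [("x", ("+", ("id", "a"), ("*", ("id", "a"), ("num", 2))))]
print(block_cost(block))
```

With these assumed counts the block costs 13 cycles. Note that `a` is charged a load at each of its two uses, which is exactly the "no reuse" assumption.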
Slide 27: An Extended Example (continued)

    Block0 → Block1 Assign     Block0.cost ← Block1.cost + Assign.cost
           | Assign            Block0.cost ← Assign.cost
    Assign → Ident = Expr ;    Assign.cost ← COST(store) + Expr.cost
    Expr0  → Expr1 + Term      Expr0.cost ← Expr1.cost + COST(add) + Term.cost
           | Expr1 – Term      Expr0.cost ← Expr1.cost + COST(sub) + Term.cost
           | Term              Expr0.cost ← Term.cost
    Term0  → Term1 * Factor    Term0.cost ← Term1.cost + COST(mult) + Factor.cost
           | Term1 / Factor    Term0.cost ← Term1.cost + COST(div) + Factor.cost
           | Factor            Term0.cost ← Factor.cost
    Factor → ( Expr )          Factor.cost ← Expr.cost
           | Number            Factor.cost ← COST(loadI)
           | Identifier        Factor.cost ← COST(load)

These are all synthesized attributes!
Values flow from rhs to lhs in productions

Slide 28: An Extended Example (continued)

Properties of the example grammar
• All attributes are synthesized ⇒ S-attributed grammar
• Rules can be evaluated bottom-up in a single pass
  → Good fit to bottom-up, shift/reduce parser
• Easily understood solution
• Seems to fit the problem well

What about an improvement?
• Values are loaded only once per block (not at each use)
• Need to track which values have already been loaded

Things will get more complicated.

Slide 29: A Better Execution Model

Adding load tracking
• Need sets Before and After for each production
• Must be initialized, updated, and passed around the tree

    Factor → ( Expr )       Factor.cost ← Expr.cost ;
                            Expr.Before ← Factor.Before ;
                            Factor.After ← Expr.After
           | Number         Factor.cost ← COST(loadI) ;
                            Factor.After ← Factor.Before
           | Identifier     if (Identifier.name ∉ Factor.Before)
                              then
                                Factor.cost ← COST(load) ;
                                Factor.After ← Factor.Before ∪ { Identifier.name }
                              else
                                Factor.cost ← 0 ;
                                Factor.After ← Factor.Before

This looks more complex!
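The Before and After sets above can be mimicked in a purely functional style by passing a set into each subtree and returning an updated set alongside the cost. This is only a sketch under assumptions: the node encoding, the function names, and the cycle counts in `COST` are hypothetical. It also assumes that storing to a target does not mark that name as loaded, a point the slides do not address.

```python
# Hypothetical cycle counts: the slides name the operations but not their costs.
COST = {"store": 3, "load": 3, "loadI": 1, "add": 1, "sub": 1, "mult": 2, "div": 4}
OPS = {"+": "add", "-": "sub", "*": "mult", "/": "div"}


def expr_cost(node, before):
    """Return (cost, after) for an expression, given the set `before`.

    `before` plays the role of the inherited Before attribute and
    `after` the role of the synthesized After attribute.
    """
    kind = node[0]
    if kind == "num":
        return COST["loadI"], before
    if kind == "id":
        name = node[1]
        if name in before:
            return 0, before                    # already loaded in this block
        return COST["load"], before | {name}    # new set; `before` is unchanged
    _, left, right = node
    left_cost, middle = expr_cost(left, before)   # Expr1.Before <- Expr0.Before
    right_cost, after = expr_cost(right, middle)  # Term.Before  <- Expr1.After
    return left_cost + COST[OPS[kind]] + right_cost, after


def block_cost(assigns, before=frozenset()):
    """Thread the set through every assignment of the block, left to right."""
    total = 0
    for _target, expr in assigns:
        cost, before = expr_cost(expr, before)
        total += COST["store"] + cost
    return total, before


# x = a + a * 2 ;
block = [("x", ("+", ("id", "a"), ("*", ("id", "a"), ("num", 2))))]
print(block_cost(block))
```

With the assumed counts the total drops to 10, because the second use of `a` is free. Every function has to accept and return the set even when it does nothing with it, which is the code-level counterpart of the copy rules.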
Slide 30: A Better Execution Model

• Load tracking adds complexity
• But, most of it is in the “copy rules”
• Every production needs rules to copy Before & After

A sample production

    Expr0 → Expr1 + Term    Expr0.cost ← Expr1.cost + COST(add) + Term.cost ;
                            Expr1.Before ← Expr0.Before ;
                            Term.Before ← Expr1.After ;
                            Expr0.After ← Term.After

These copy rules multiply rapidly
Each creates an instance of the set
Lots of work, lots of space, lots of rules to write

Slide 31: The Moral of the Story

• Non-local computation needed lots of supporting rules
• “Complex” local computation is relatively easy

The Problems
• Copy rules increase cognitive overhead
• Copy rules increase space requirements
  → Need copies of attributes
• Result is an attributed tree
  → Must build the parse tree
  → Either search tree for answers or copy them to the root

Slide 32: Addressing the Problem

What would a good programmer do?
• Introduce a central repository for facts
• Table of names
  → Field in table for loaded/not loaded state
• Avoids all the copy rules, allocation & storage headaches
• All inter-assignment attribute flow is through table
  → Clean, efficient implementation
  → Good techniques for implementing the table (hashing, § B.4)
  → When it’s done, information is in the table!
  → Cures most of the problems
• Unfortunately, this design violates the functional paradigm
  → Do we care?
Slide 33: The Realist’s Alternative

Ad-hoc syntax-directed translation
• Associate pieces of code with each production
• At each reduction, the corresponding code is executed
• Allowing arbitrary code provides complete flexibility
  → Includes ability to do tasteless & bad things

To make this work
• Need names for attributes of each symbol on lhs & rhs
  → Typically, one attribute passed through parser + arbitrary code (structures, globals, statics, …)
  → Yacc introduced $$, $1, $2, … $n, left to right
• Need an evaluation scheme
  → Fits nicely into LR(1) parsing algorithm

Slide 34: Reworking the Example (with load tracking)

    Block0 → Block1 Assign
           | Assign
    Assign → Ident = Expr ;    cost ← cost + COST(store);
    Expr0  → Expr1 + Term      cost ← cost + COST(add);
           | Expr1 – Term      cost ← cost + COST(sub);
           | Term
    Term0  → Term1 * Factor    cost ← cost + COST(mult);
           | Term1 / Factor    cost ← cost + COST(div);
           | Factor
    Factor → ( Expr )
           | Number            cost ← cost + COST(loadI);
           | Identifier        { i ← hash(Identifier);
                                 if (Table[i].loaded = false)
                                   then {
                                     cost ← cost + COST(load);
                                     Table[i].loaded ← true;
                                   }
                               }

This looks cleaner & simpler than the AG solution!

One missing detail: initializing “cost” (we ignore “Table[ ]” for now)

Slide 35: Reworking the Example (with load tracking)

    Start  → Init Block
    Init   → ε                 cost ← 0;
    Block0 → Block1 Assign
           | Assign
    Assign → Ident = Expr ;    cost ← cost + COST(store);

    … and so on as in the previous version of the example …

• Before the parser can reach Block, it must reduce Init
• Reduction by Init sets cost to zero

This is an example of splitting a production to create a reduction in the middle, for the sole purpose of hanging an action routine there (marker production)!
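The global-counter version, including the Init marker production, can be imitated without a real parser by replaying the sequence of reductions an LR parser would perform. This is a rough Python sketch under assumptions: the production strings, the `on_reduce` and `run` names, the dictionary standing in for `Table[ ]`, and the cycle counts are all hypothetical. It also resets the table in the Init action, which the slides leave open.

```python
# Hypothetical cycle counts: the slides name the operations but not their costs.
COST = {"store": 3, "load": 3, "loadI": 1, "add": 1, "sub": 1, "mult": 2, "div": 4}

# Productions whose action only adds a fixed cost.
FIXED = {
    "Assign -> Ident = Expr ;": "store",
    "Expr -> Expr + Term": "add",
    "Expr -> Expr - Term": "sub",
    "Term -> Term * Factor": "mult",
    "Term -> Term / Factor": "div",
    "Factor -> Number": "loadI",
}

cost = 0    # global counter updated by the actions
table = {}  # stands in for Table[]: name -> {"loaded": bool}


def on_reduce(production, lexeme=None):
    """Action code executed when the parser reduces by `production`."""
    global cost
    if production == "Init -> epsilon":       # marker production
        cost = 0
        table.clear()                          # assumption: Init also resets the table
    elif production == "Factor -> Identifier":
        entry = table.setdefault(lexeme, {"loaded": False})
        if not entry["loaded"]:
            cost += COST["load"]
            entry["loaded"] = True
    elif production in FIXED:
        cost += COST[FIXED[production]]
    # every other production has no action


def run(reductions):
    """Replay reductions in the order an LR parser would perform them."""
    for production, lexeme in reductions:
        on_reduce(production, lexeme)
    return cost


# Reduction sequence for:  x = a + a * 2 ;
REDUCTIONS = [
    ("Init -> epsilon", None),
    ("Factor -> Identifier", "a"),
    ("Term -> Factor", None),
    ("Expr -> Term", None),
    ("Factor -> Identifier", "a"),
    ("Term -> Factor", None),
    ("Factor -> Number", "2"),
    ("Term -> Term * Factor", None),
    ("Expr -> Expr + Term", None),
    ("Assign -> Ident = Expr ;", None),
    ("Block -> Assign", None),
]
print(run(REDUCTIONS))
```

With the assumed counts the result is 10 cycles. No set is passed between actions; the loaded state lives in `table`, which is why the copy rules disappear and also why the actions are no longer purely functional.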
Slide 36: Reworking the Example (with load tracking)

    Block0 → Block1 Assign     $$ ← $1 + $2 ;
           | Assign            $$ ← $1 ;
    Assign → Ident = Expr ;    $$ ← COST(store) + $3 ;
    Expr0  → Expr1 + Term      $$ ← $1 + COST(add) + $3 ;
           | Expr1 – Term      $$ ← $1 + COST(sub) + $3 ;
           | Term              $$ ← $1 ;
    Term0  → Term1 * Factor    $$ ← $1 + COST(mult) + $3 ;
           | Term1 / Factor    $$ ← $1 + COST(div) + $3 ;
           | Factor            $$ ← $1 ;
    Factor → ( Expr )          $$ ← $2 ;
           | Number            $$ ← COST(loadI) ;
           | Identifier        { i ← hash(Identifier);
                                 if (Table[i].loaded = false)
                                   then {
                                     $$ ← COST(load);
                                     Table[i].loaded ← true;
                                   }
                                   else $$ ← 0
                               }

This version passes the values through attributes. It avoids the need for initializing “cost”.
However, Table[ ] still needs to be initialized.

Slide 37: Example: Building an Abstract Syntax Tree

• Assume constructors for each node
• Assume stack holds pointers to nodes
• Assume yacc syntax

    Goal   → Expr              $$ = $1;
    Expr   → Expr + Term       $$ = MakeAddNode($1,$3);
           | Expr – Term       $$ = MakeSubNode($1,$3);
           | Term              $$ = $1;
    Term   → Term * Factor     $$ = MakeMulNode($1,$3);
           | Term / Factor     $$ = MakeDivNode($1,$3);
           | Factor            $$ = $1;
    Factor → ( Expr )          $$ = $2;
           | number            $$ = MakeNumNode(token);
           | id                $$ = MakeIdNode(token);

Slide 38: Reality

Most parsers are based on this ad-hoc style of context-sensitive analysis

Advantages
• Addresses the shortcomings of the AG paradigm
• Efficient, flexible

Disadvantages
• Must write the code with little assistance
• Programmer deals directly with the details

Most parser generators support a yacc-like notation

Slide 39: Typical Uses (Semantic Analysis)

• Building a symbol table
  → Enter declaration information as processed
  → At end of declaration syntax, do some post processing
  → Use table to check errors as parsing progresses
• Simple error checking/type checking
  → Define before use → lookup on reference
  → Dimension, type, ... → check as encountered
  → Type conformability of expression → bottom-up walk
  → Procedure interfaces are harder
      - Build a representation for parameter list & types
      - Check actual vs. formal parameter list
      - Positional or keyword associations

(assumes table is global)

Slide 40: Is This Really “Ad-hoc”?

Relationship between practice and attribute grammars

Similarities
• Both rules & actions associated with productions
• Application order determined by tools
• (Somewhat) abstract names for symbols

Differences
• Actions applied as a unit; not true for AG rules
• Anything goes in ad-hoc actions; AG rules are (purely) functional
• AG rules are higher level than ad-hoc actions

Slide 41: Making Ad-hoc Syntax-Directed Translation Work

How do we fit this into an LR(1) parser?
• Need a place to store the attributes
  → Stash them in the stack, along with state and symbol
  → Push three items each time, pop 3 × |β| symbols
• Need a naming scheme to access them
  → $n translates into stack location: top - 3 × (|β| - n)
• Need to sequence rule applications
  → On every reduce action, perform the action rule

What about a rule that must work in mid-production?
• Can transform the grammar
  → Split it into two parts at the point where rule must go and apply the rule on reduction to the appropriate part
  → Introduce marker productions M → ε with appropriate action
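The stack arithmetic above can be checked with a small Python sketch. It makes one layout assumption that the slide does not spell out: each push places symbol, state, and attribute in that order, so the attribute is the topmost item of its triple. The helper names `push`, `dollar`, and `reduce_by` are hypothetical, and the state numbers are placeholders rather than entries of a real GOTO table.

```python
def push(stack, symbol, state, attribute):
    """Push one triple; the attribute ends up on top (layout is an assumption)."""
    stack.extend([symbol, state, attribute])


def dollar(stack, rhs_len, n):
    """Attribute $n (1-based) for a right-hand side of rhs_len symbols."""
    top = len(stack) - 1
    return stack[top - 3 * (rhs_len - n)]


def reduce_by(stack, lhs, rhs_len, action, goto_state):
    """Run the action on $1..$n, pop 3 x |beta| items, push the result as $$."""
    args = [dollar(stack, rhs_len, n) for n in range(1, rhs_len + 1)]
    value = action(*args)
    del stack[len(stack) - 3 * rhs_len:]   # removes nothing when rhs_len == 0
    push(stack, lhs, goto_state, value)
    return value


# Stack just before reducing by Expr -> Expr + Term, with costs 3 and 4.
stack = []
push(stack, "Expr", 1, 3)
push(stack, "+", 2, None)
push(stack, "Term", 3, 4)

# Action: $$ <- $1 + $3
result = reduce_by(stack, "Expr", 3, lambda left, _op, right: left + right, 1)
print(result, stack)
```

A marker production M → ε is the case `rhs_len == 0`: nothing is popped, the action runs with no arguments, and one new triple is pushed.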