CD Put Paper Solution




    UNIT 1

Ques1(a). Explain the phases of a compiler, with a neat schematic.

    The process of compilation is very complex, so it is customary, from the logical as well as the implementation point of view, to partition the compilation process into several phases. A phase is a logically cohesive operation that takes as input one representation of the source program and produces as output another representation.

    The source program is a stream of characters, e.g. pos = init + rate * 60.

    Lexical analysis: groups characters into non-separable units, called tokens, and generates the token stream: id1 = id2 + id3 * const.

    The information about the identifiers must be stored somewhere (the symbol table).
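    As a small illustration of how this phase can work (a minimal sketch in Python, not the compiler described here; the token names and regular expressions are assumptions), the fragment below groups the characters of pos = init + rate * 60 into tokens and enters the identifiers into a symbol table:

        import re

        # Token specification: order matters, most specific patterns first.
        TOKEN_SPEC = [
            ("NUM",    r"\d+"),
            ("ID",     r"[A-Za-z_]\w*"),
            ("ASSIGN", r"="),
            ("PLUS",   r"\+"),
            ("STAR",   r"\*"),
            ("SKIP",   r"\s+"),
        ]
        MASTER = re.compile("|".join("(?P<%s>%s)" % p for p in TOKEN_SPEC))

        def tokenize(source, symtab):
            """Yield (kind, value) pairs; identifiers are entered into symtab."""
            for m in MASTER.finditer(source):
                kind, value = m.lastgroup, m.group()
                if kind == "SKIP":
                    continue
                if kind == "ID":
                    symtab.setdefault(value, {"type": None})   # symbol-table entry
                yield kind, value

        symtab = {}
        print(list(tokenize("pos = init + rate * 60", symtab)))
        # [('ID','pos'), ('ASSIGN','='), ('ID','init'), ('PLUS','+'),
        #  ('ID','rate'), ('STAR','*'), ('NUM','60')]
        print(sorted(symtab))   # ['init', 'pos', 'rate']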

    Syntax analysis: checks whether the token stream meets the grammatical specification of the language and generates the syntax tree.

    Semantic analysis: checks whether the program has a meaning (e.g. if pos is a record and init and rate are integers, then the assignment does not make sense).

    [Figure: the syntax tree for id1 := id2 + id3 * 60 and, after semantic analysis, the same tree with an inttoreal node inserted above the constant 60.]

    Intermediate code generation: intermediate code is something that is both close to the final machine code and easy to manipulate (for optimization). One example is three-address code:

    dst = op1 op op2

    The three-address code for the assignment statement:
        temp1 = inttoreal(60)
        temp2 = id3 * temp1
        temp3 = id2 + temp2
        id1 = temp3

    Code optimization: produces better, but semantically equivalent, code.

        temp1 = id3 * 60.0
        id1 = id2 + temp1

    Code generation: generates assembly.
        MOVF id3, R2
        MULF #60.0, R2
        MOVF id2, R1
        ADDF R2, R1
        MOVF R1, id1

    Symbol Table Creation / Maintenance

    Contains information (storage, type, scope, arguments) on each meaningful token, typically identifiers.

    The data structure is created/initialized during lexical analysis and utilized/updated during the later analysis and synthesis phases.

    Error handling: errors corresponding to all phases are detected. Each phase should know how to deal with an error, so that compilation can proceed and further errors can be detected.


    Ques1(b). Explain in detail about compiler constructions tools.

    - Parser generators: produce syntax analyzers from a grammatical description of the language.
    - Scanner generators: produce lexical analyzers from a regular-expression description of the tokens.
    - Syntax-directed translation engines: generate intermediate code.
    - Automatic code generators: generate the actual target code.
    - Data-flow engines: support optimization by supplying data-flow information.

    Ques1(c). Draw pictures of DFAs for each of the following regular expressions.

    a. (a/b)*c/d

    b. Ab*/cd*

    OR

    Ques2(a). Given the regular expression RE = a.(b*).(aa | a) over the alphabet {a, b}, answer the

    following questions:

    (a) Derive the NFA corresponding to this RE.
    (b) Convert this NFA to a DFA using the closure computation.

    Ans:


    (a) See NFA below.

    (b) The subset construction yields the DFA below, where for each DFA state we have noted the subset of NFA states it stands for.
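    A minimal Python sketch of the subset (ε-closure) construction under an assumed encoding of the NFA for a.(b*).(aa | a); the state numbering is an assumption and need not match the figure:

        # NFA for a.(b*).(aa|a); None marks an epsilon move; accepting states {3, 4}.
        NFA = {
            (0, 'a'): {1},
            (1, 'b'): {1},
            (1, None): {2},
            (2, 'a'): {3},
            (3, 'a'): {4},
        }
        ACCEPT = {3, 4}

        def eps_closure(states):
            stack, closure = list(states), set(states)
            while stack:
                s = stack.pop()
                for t in NFA.get((s, None), ()):
                    if t not in closure:
                        closure.add(t)
                        stack.append(t)
            return frozenset(closure)

        def subset_construction(start=0, alphabet="ab"):
            start_set = eps_closure({start})
            dfa, worklist = {}, [start_set]
            while worklist:
                S = worklist.pop()
                if S in dfa:
                    continue
                dfa[S] = {}
                for c in alphabet:
                    moved = set()
                    for s in S:
                        moved |= NFA.get((s, c), set())
                    T = eps_closure(moved)
                    dfa[S][c] = T
                    if T and T not in dfa:
                        worklist.append(T)
            return dfa, start_set

        dfa, start = subset_construction()
        for S, trans in dfa.items():
            mark = "*" if S & ACCEPT else " "   # '*' marks accepting DFA states
            print(mark, set(S), {c: set(T) for c, T in trans.items() if T})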

    Ques2(b). Explain in detail about the role of the lexical analyzer.

    Few errors are discernible at the lexical level alone, because a lexical analyzer has a very localized view of the source program. The simplest recovery strategy is panic mode recovery: delete successive characters from the remaining input until the lexical analyzer can find a well-formed token. Other possible error recovery actions are:
    - Deleting an extraneous character
    - Inserting a missing character
    - Replacing an incorrect character by a correct character
    - Transposing two adjacent characters

    Ques2(c). Explain briefly the input buffering technique with the algorithm.

    Determining the next lexeme requires reading the input beyond the end of the lexeme.

    Buffer Pairs:
    - Concerned with efficiency issues
    - Used with a lookahead on the input

    This is a specialized buffering technique used to reduce the overhead required to process an input character. The buffer is divided into two N-character halves and two pointers are used. It is used when the lexical analyzer needs to look ahead several characters beyond the lexeme before a match for a pattern is announced. One pointer, the forward pointer, points to the first character of the next lexeme being found; the other marks the beginning of that lexeme. The string of characters between the two pointers forms the lexeme.

    Increment procedure for the forward pointer: (2)

        if forward at end of first half then begin
            reload second half
            forward += 1
        end
        else if forward at end of second half then begin
            reload first half
            move forward to beginning of first half
        end
        else
            forward += 1

    Sentinels: (2)


    A sentinel is a special character that cannot be a part of the source program. It is used to reduce the two tests (end of buffer, and which character) to one, e.g. eof.

    Increment procedure for the forward pointer using sentinels:

        forward += 1
        if forward = eof then begin
            if forward at end of first half then begin
                reload second half
                forward += 1
            end
            else if forward at end of second half then begin
                reload first half
                move forward to beginning of first half
            end
            else
                terminate lexical analysis
        end
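    A minimal Python sketch of the buffer-pair scheme with sentinels, assuming a tiny half size N and the character '\0' as the eof sentinel (the class and helper names are assumptions):

        EOF = '\0'      # sentinel character, assumed never to occur in the source text
        N = 4           # size of each buffer half (tiny here, for illustration)

        class TwoBufferReader:
            def __init__(self, text):
                self.src = iter(text)
                # halves occupy 0..N-1 and N+1..2N; sentinels sit at N and 2N+1
                self.buf = [EOF] * (2 * N + 2)
                self._load(0)                 # fill the first half
                self.forward = -1             # next_char advances to index 0

            def _load(self, start):
                # reload one half starting at `start`; unread slots stay EOF at true end of input
                for i in range(start, start + N):
                    self.buf[i] = next(self.src, EOF)

            def next_char(self):
                self.forward += 1
                while self.buf[self.forward] == EOF:
                    if self.forward == N:                # sentinel ending the first half
                        self._load(N + 1)
                        self.forward += 1
                    elif self.forward == 2 * N + 1:      # sentinel ending the second half
                        self._load(0)
                        self.forward = 0
                    else:
                        return None                      # eof inside a half: real end of input
                return self.buf[self.forward]

        r = TwoBufferReader("pos = init + rate * 60")
        out = []
        c = r.next_char()
        while c is not None:
            out.append(c)
            c = r.next_char()
        print("".join(out))    # the statement is re-read correctly across the two halves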

    UNIT 2

    Ques3(a). Construct predictive parsing table for the grammar

    S->(L) | a

    L->L,S | S

    After the elimination of left recursion:
        S -> (L) | a
        L -> SL'
        L' -> ,SL' | ε

    Calculation of First:   First(S) = {(, a}   First(L) = {(, a}   First(L') = {',', ε}

    Calculation of Follow:  Follow(S) = {$, ',', )}   Follow(L) = {)}   Follow(L') = {)}

    Predictive parsing table:

        Non-terminal |  a          (           )          ,             $
        S            |  S -> a     S -> (L)
        L            |  L -> SL'   L -> SL'
        L'           |                         L' -> ε    L' -> ,SL'
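    The table above can be driven by the standard predictive (LL(1)) parsing loop; a minimal Python sketch, with the table encoded as a dictionary and ε represented by an empty production, is:

        # Predictive parsing table for S -> (L) | a, L -> S L', L' -> ,S L' | epsilon
        TABLE = {
            ("S",  "a"): ["a"],
            ("S",  "("): ["(", "L", ")"],
            ("L",  "a"): ["S", "L'"],
            ("L",  "("): ["S", "L'"],
            ("L'", ")"): [],                  # L' -> epsilon
            ("L'", ","): [",", "S", "L'"],
        }
        TERMINALS = {"a", "(", ")", ",", "$"}

        def ll1_parse(tokens):
            tokens = list(tokens) + ["$"]
            stack = ["$", "S"]                # start symbol on top of the stack
            i = 0
            while stack:
                top = stack.pop()
                if top in TERMINALS:
                    if top == tokens[i]:
                        i += 1                # matched a terminal
                    else:
                        return False
                else:
                    prod = TABLE.get((top, tokens[i]))
                    if prod is None:
                        return False          # empty table entry: syntax error
                    stack.extend(reversed(prod))
            return i == len(tokens)

        print(ll1_parse(list("(a,a)")))   # True
        print(ll1_parse(list("(a,)")))    # False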

    Ques3(a). Eliminate immediate left recursion for the following grammar: E->E+T | T,

    T->T*F | F, F->(E) | id. The rule to eliminate left recursion is: A -> Aα | β can be converted to A -> βA' and A' -> αA' | ε. So, the grammar after eliminating left recursion is

        E -> TE'     E' -> +TE' | ε     T -> FT'     T' -> *FT' | ε     F -> (E) | id

    Ques3(b). Consider the following grammar G for expressions and lists of statements (StatList) using

    assignment statements (Assign) and basic expressions (Expr), with the productions presented below,

    already augmented by the initial production G -> StatList $.

    (0) G -> StatList $

    (1) StatList -> Stat ; StatList

    (2) StatList -> Stat

    (3) Stat -> Assign

    (4) Assign -> id = Expr

    (5) Expr -> id

    (6) Expr -> const

    For the grammar presented above determine the following:

    (a) Compute the DFA that recognizes the set of LR(0) items.

    (b) Construct the LR(0) parsing table.


    (c) Identify the nature of and explain how to resolve any conflicts in the table found in (b).

    Answer: (a) The set of LR(0) items and the corresponding DFA that recognizes them is depicted below, where for each state (labeled at the upper right corner) we indicate the items and the transitions on terminals and non-terminals.

    (b) The LR(0) parsing table is shown below.

    (c) As can be seen, there is a shift/reduce conflict in state 2, resulting from the fact that on the

    terminal ';', having seen a Stat, the parser could reduce immediately by production rule (2). But

    because there might be a valid statement list coming next, there is also the possibility of executing a shift operation and moving to state 3. Typically, as one tries to parse

    the longest possible sentences, preference is given to the shift operation.
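    For reference, the closure and goto computations that produce these item sets can be sketched in Python as follows (the item encoding is an assumption; the item set exhibiting the conflict is the one reached from the initial set on Stat):

        # Productions, numbered as above; an item (p, k) has the dot before RHS[k].
        PRODS = [
            ("G",        ["StatList", "$"]),          # 0
            ("StatList", ["Stat", ";", "StatList"]),  # 1
            ("StatList", ["Stat"]),                   # 2
            ("Stat",     ["Assign"]),                 # 3
            ("Assign",   ["id", "=", "Expr"]),        # 4
            ("Expr",     ["id"]),                     # 5
            ("Expr",     ["const"]),                  # 6
        ]
        NONTERMS = {"G", "StatList", "Stat", "Assign", "Expr"}

        def closure(items):
            items = set(items)
            changed = True
            while changed:
                changed = False
                for (p, k) in list(items):
                    rhs = PRODS[p][1]
                    if k < len(rhs) and rhs[k] in NONTERMS:
                        for q, (lhs, _) in enumerate(PRODS):
                            if lhs == rhs[k] and (q, 0) not in items:
                                items.add((q, 0))
                                changed = True
            return frozenset(items)

        def goto(items, X):
            moved = {(p, k + 1) for (p, k) in items
                     if k < len(PRODS[p][1]) and PRODS[p][1][k] == X}
            return closure(moved) if moved else frozenset()

        I0 = closure({(0, 0)})
        print(sorted(goto(I0, "Stat")))
        # [(1, 1), (2, 1)]: StatList -> Stat . ; StatList  and  StatList -> Stat .
        # which is exactly the shift/reduce conflict on ';' described in (c)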

    OR

    Ques4(a). Construct a parse tree of (a+b)*c for the grammar E->E+E | E*E | (E) | id.


    Ques4(b). Find the SLR parsing table for the given grammar and parse the sentence

    (a+b)*c. E->E+E | E*E | (E) | id.

    Ans: Given grammar:
        1. E -> E+E
        2. E -> E*E
        3. E -> (E)
        4. E -> id

    Augmented grammar:
        E' -> E
        E  -> E+E
        E  -> E*E
        E  -> (E)
        E  -> id

    I0:  E'->.E,  E->.E+E,  E->.E*E,  E->.(E),  E->.id

    I1: goto(I0, E):   E'->E.,  E->E.+E,  E->E.*E

    I2: goto(I0, ():   E->(.E),  E->.E+E,  E->.E*E,  E->.(E),  E->.id

    I3: goto(I0, id):  E->id.

    I4: goto(I1, +):   E->E+.E,  E->.E+E,  E->.E*E,  E->.(E),  E->.id

    I5: goto(I1, *):   E->E*.E,  E->.E+E,  E->.E*E,  E->.(E),  E->.id

    I6: goto(I2, E):   E->(E.),  E->E.+E,  E->E.*E

    I7: goto(I4, E):   E->E+E.,  E->E.+E,  E->E.*E

    I8: goto(I5, E):   E->E*E.,  E->E.+E,  E->E.*E

    I9: goto(I6, )):   E->(E).

    Remaining transitions:
        goto(I2, () = I2,  goto(I2, id) = I3,  goto(I4, () = I2,  goto(I4, id) = I3,  goto(I5, () = I2,  goto(I5, id) = I3
        goto(I6, +) = I4,  goto(I6, *) = I5,  goto(I7, +) = I4,  goto(I7, *) = I5,  goto(I8, +) = I4,  goto(I8, *) = I5

    First(E) = {(, id}

    Follow(E)={+, *, ), $}


    SLR parsing table:

        State |    +        *        (        )        id       $     |   E
          0   |                      S2                S3             |   1
          1   |   S4       S5                                   Acc   |
          2   |                      S2                S3             |   6
          3   |   r4       r4                 r4                r4    |
          4   |                      S2                S3             |   7
          5   |                      S2                S3             |   8
          6   |   S4       S5                 S9                      |
          7   |  S4,r1    S5,r1               r1                r1    |
          8   |  S4,r2    S5,r2               r2                r2    |
          9   |   r3       r3                 r3                r3    |

    Parsing the sentence (a+b)*c:

        Stack         Input        Action
        0             (a+b)*c$     shift 2
        0(2           a+b)*c$      shift 3
        0(2a3         +b)*c$       reduce by E->id
        0(2E6         +b)*c$       shift 4
        0(2E6+4       b)*c$        shift 3
        0(2E6+4b3     )*c$         reduce by E->id
        0(2E6+4E7     )*c$         reduce by E->E+E
        0(2E6         )*c$         shift 9
        0(2E6)9       *c$          reduce by E->(E)
        0E1           *c$          shift 5
        0E1*5         c$           shift 3
        0E1*5c3       $            reduce by E->id
        0E1*5E8       $            reduce by E->E*E
        0E1           $            accept
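    The trace above is produced by the standard LR driver loop; a minimal Python sketch follows, with the ACTION/GOTO tables encoded as dictionaries and the shift/reduce conflicts of states 7 and 8 resolved in favour of shift, as discussed above (a, b, c are tokenized as id):

        # ACTION[state, terminal] = ('s', j) shift, ('r', k) reduce by rule k, or 'acc'.
        RULES = [None, ("E", 3), ("E", 3), ("E", 3), ("E", 1)]   # rule -> (lhs, rhs length)
        ACTION = {
            (0, "("): ("s", 2), (0, "id"): ("s", 3),
            (1, "+"): ("s", 4), (1, "*"): ("s", 5), (1, "$"): "acc",
            (2, "("): ("s", 2), (2, "id"): ("s", 3),
            (3, "+"): ("r", 4), (3, "*"): ("r", 4), (3, ")"): ("r", 4), (3, "$"): ("r", 4),
            (4, "("): ("s", 2), (4, "id"): ("s", 3),
            (5, "("): ("s", 2), (5, "id"): ("s", 3),
            (6, "+"): ("s", 4), (6, "*"): ("s", 5), (6, ")"): ("s", 9),
            (7, "+"): ("s", 4), (7, "*"): ("s", 5), (7, ")"): ("r", 1), (7, "$"): ("r", 1),
            (8, "+"): ("s", 4), (8, "*"): ("s", 5), (8, ")"): ("r", 2), (8, "$"): ("r", 2),
            (9, "+"): ("r", 3), (9, "*"): ("r", 3), (9, ")"): ("r", 3), (9, "$"): ("r", 3),
        }
        GOTO = {(0, "E"): 1, (2, "E"): 6, (4, "E"): 7, (5, "E"): 8}

        def slr_parse(tokens):
            tokens = list(tokens) + ["$"]
            stack, i = [0], 0                     # stack of states
            while True:
                act = ACTION.get((stack[-1], tokens[i]))
                if act == "acc":
                    return True
                if act is None:
                    return False                  # error entry
                kind, n = act
                if kind == "s":                   # shift: push the new state
                    stack.append(n)
                    i += 1
                else:                             # reduce by rule n
                    lhs, length = RULES[n]
                    del stack[len(stack) - length:]
                    stack.append(GOTO[(stack[-1], lhs)])

        toks = ["(", "id", "+", "id", ")", "*", "id"]   # (a+b)*c with a, b, c as id
        print(slr_parse(toks))                           # True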

    UNIT 3

    Ques5(a). Define three-address code. Describe the various types and methods of implementing three-address

    statements with an example.

    It is one of the intermediate representations. It is a sequence of statements of the form x := y op z, where x, y and z are names, constants or compiler-generated temporaries, and op is an operator which can be an arithmetic or a logical operator. E.g. x+y*z is translated as t1=y*z and t2=x+t1. (4)

    The reason for the term three-address code is that each statement usually contains three addresses: two for the operands and one for the result. (2)
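    A minimal sketch of how such code can be generated from an expression tree, with compiler-generated temporaries (the tree encoding and temporary naming are assumptions):

        counter = 0
        def new_temp():
            global counter
            counter += 1
            return "t%d" % counter

        def gen_tac(node, code):
            """node is a name (str) or a tuple (op, left, right); returns the place holding its value."""
            if isinstance(node, str):
                return node
            op, left, right = node
            l = gen_tac(left, code)
            r = gen_tac(right, code)
            t = new_temp()
            code.append("%s = %s %s %s" % (t, l, op, r))
            return t

        code = []
        gen_tac(("+", "x", ("*", "y", "z")), code)   # x + y*z
        print("\n".join(code))
        # t1 = y * z
        # t2 = x + t1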



    Implementation:

    Quadruples: a record structure with four fields: op, arg1, arg2 and result.

    Triples: a record structure with three fields: op, arg1 and arg2, used to avoid entering temporary names

    into the symbol table. Here, a temporary value is referred to by the position of the statement that computes it.

    Indirect triples: list pointers to triples rather than listing the triples themselves.

    For a := b * -c + b * -c:

    Quadruples

             op       arg1    arg2    result
        (0)  uminus   c               t1
        (1)  *        b       t1      t2
        (2)  uminus   c               t3
        (3)  *        b       t3      t4
        (4)  +        t2      t4      t5
        (5)  :=       t5              a

    Triples

             op       arg1    arg2
        (0)  uminus   c
        (1)  *        b       (0)
        (2)  uminus   c
        (3)  *        b       (2)
        (4)  +        (1)     (3)
        (5)  assign   a       (4)

    Indirect Triples

        Triples stored at positions (14)-(19):
              op       arg1    arg2
        (14)  uminus   c
        (15)  *        b       (14)
        (16)  uminus   c
        (17)  *        b       (16)
        (18)  +        (15)    (17)
        (19)  assign   a       (18)

        Statement list of pointers:
        statement   pointer
        (0)         (14)
        (1)         (15)
        (2)         (16)
        (3)         (17)
        (4)         (18)
        (5)         (19)

    Ques5(b). Explain different storage allocation strategies.

    Strategies:

    - Static allocation lays out storage for all data objects at compile time.
    - Stack allocation manages the run-time storage as a stack.
    - Heap allocation allocates and deallocates storage as needed at run time from a heap area.


    Static allocation: (2)
    - Names are bound to storage at compile time.
    - No run-time support package is needed.
    - When a procedure is activated, its names are bound to the same storage locations.
    - The compiler must decide where the activation records should go.

    Limitations:
    - The size of every data object must be known at compile time.
    - Recursive procedures are restricted.
    - Data structures can't be created dynamically.

    Stack allocation: (3)
    - Activation records are pushed and popped as activations begin and end.
    - Locals are bound to fresh storage in each activation and deleted when the activation ends.
    - The call sequence and return sequence are divided between the caller and the callee.
    - Dangling references may arise.

    Heap allocation: (3)
    Stack allocation cannot be used if either of the following is possible:
    1. The values of local names must be retained when an activation ends.
    2. A called activation outlives the caller.

    - Pieces of memory are allocated for activation records, and they can be deallocated in any order.
    - A linked list of free blocks is maintained.
    - A request for size s is filled with a block of size s', where s' is the smallest size greater than or equal to s.
    - A heap manager takes care of defragmentation and garbage collection.

    OR

    Ques6(a). Generate intermediate code for the following code segment: i=1; s=0; while(i


    (11) s = t4
    (12) t5 = i + 1
    (13) i = t5
    (14) goto (3)

    Ques6(b). Explain (1) Storage organization (2) Parameter passing.

    (1) Storage organization.

    Subdivision of run-time memory: the run-time storage is the block of memory obtained by the compiler from the OS to execute the compiled program. It is subdivided into:
    - the generated target code
    - data objects
    - a stack to keep track of procedure activations
    - a heap to store all other information

    Activation record (frame): it is used to store the information required by a single procedure call. Its fields are:
    - Returned value
    - Actual parameters
    - Optional control link
    - Optional access link
    - Saved machine status
    - Local data
    - Temporaries

    Temporaries are used to hold values that arise in the evaluation of expressions. Local data is the data that is local to the execution of the procedure. The saved machine status represents the status of the machine just before the procedure is called. The control link (dynamic link) points to the activation record of the calling procedure. The access link refers to non-local data held in other activation records. The actual parameters are those passed to the called procedure. The returned value field is used by the called procedure to return a value to the calling procedure.

    Compile-time layout of local data: the amount of storage needed for a name is determined by its type. The field for local data is laid out as the declarations in a procedure are examined at compile time. The storage layout for data objects is strongly influenced by the addressing constraints of the target machine.

    (2) Parameter passing.

    Call by value:
    - A formal parameter is treated just like a local name; its storage is in the activation record of the called procedure.


    - The caller evaluates the actual parameters and places their r-values in the storage for the formals.

    Call by reference:
    - If an actual parameter is a name or an expression having an l-value, then that l-value itself is passed.
    - However, if it is not (e.g. a+b or 2, which have no l-value), then the expression is evaluated in a new location and the address of that location is passed.

    Copy-restore: a hybrid between call-by-value and call-by-reference (copy in, copy out).
    - The actual parameters are evaluated: their r-values are passed and the l-values of the actuals are determined.
    - When the called procedure is done, the r-values of the formals are copied back into the l-values of the actuals.

    Call by name:
    - Inline expansion (the procedure is treated like a macro).
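    The first three mechanisms can be contrasted with a small simulation in Python that models memory as an explicit store mapping locations to values (the store, the location names and the procedures are assumptions made only for illustration):

        store = {"a": 10}               # the caller's variable a lives at location "a"

        def inc_by_value(v):            # call by value: works on a copy of the r-value
            v = v + 1
            return v                    # the caller's store is untouched

        def inc_by_reference(loc):      # call by reference: the l-value (location) is passed
            store[loc] = store[loc] + 1

        def inc_copy_restore(loc):      # copy-restore: copy in, work on the copy, copy out at return
            v = store[loc]
            v = v + 1
            store[loc] = v              # copy the formal's final r-value back to the actual's l-value

        inc_by_value(store["a"]);   print(store["a"])   # 10: unchanged
        inc_by_reference("a");      print(store["a"])   # 11
        inc_copy_restore("a");      print(store["a"])   # 12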

    UNIT 4

    Ques7(a). Define back patching with an example.

    Back patching is the activity of filling in the unspecified information of labels using appropriate semantic actions during the code generation process. The functions used in the semantic actions are mklist(i), merge_list(p1, p2) and backpatch(p, i).

    Source:
        if a or b then
            if c then
                x = y + 1

    Translation:
            if a goto L1
            if b goto L1
            goto L3
        L1: if c goto L2
            goto L3
        L2: x = y + 1
        L3:

    After backpatching:
        100: if a goto 103
        101: if b goto 103
        102: goto 106
        103: if c goto 105
        104: goto 106
        105: x = y + 1
        106:
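    A minimal Python sketch of the three functions over a list of quadruples is given below; jumps are emitted with their targets left as '_', and the targets 103, 105 and 106 are supplied at the points where the semantic actions would know the next quadruple number (the instruction encoding is an assumption):

        code = []                                   # generated instructions, indexed from 100

        def emit(instr):
            code.append(instr)
            return 100 + len(code) - 1              # index of the emitted instruction

        def mklist(i):
            return [i]                              # a new list containing only instruction i

        def merge_list(p1, p2):
            return p1 + p2                          # concatenation of the two lists

        def backpatch(p, i):
            for j in p:                             # fill target i into every instruction on list p
                code[j - 100] = code[j - 100].replace("goto _", "goto %d" % i)

        # Translation of: if a or b then if c then x = y + 1
        a_true = mklist(emit("if a goto _"))
        b_true = mklist(emit("if b goto _"))
        out1   = mklist(emit("goto _"))
        backpatch(merge_list(a_true, b_true), 103)  # a or b true -> test c at 103
        c_true = mklist(emit("if c goto _"))
        out2   = mklist(emit("goto _"))
        backpatch(c_true, 105)
        emit("x = y + 1")
        backpatch(merge_list(out1, out2), 106)

        for addr, instr in enumerate(code, start=100):
            print(addr, instr)       # reproduces instructions 100-105 shown above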

    Ques7(b). Define basic block and flow graph.

    A basic block is a sequence of consecutive statements in which flow of control enters at the beginning and leaves at the end without halting or any possibility of branching except at the end.

    A flow graph is a directed graph in which flow-of-control information is added to the basic blocks:
    - The nodes of the flow graph are the basic blocks.
    - The block whose leader is the first statement is called the initial block.
    - There is a directed edge from block B1 to block B2 if B2 can immediately follow B1 in some execution sequence. We say that B1 is a predecessor of B2 and B2 is a successor of B1.
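    The partition into basic blocks is found with the usual leader rules; a minimal Python sketch, assuming three-address instructions written as strings with jump targets of the form goto (n):

        import re

        def leaders(code):
            """Leaders: the first statement, any target of a jump, any statement after a jump."""
            lead = {0}
            for i, instr in enumerate(code):
                m = re.search(r"goto \((\d+)\)", instr)
                if m:
                    lead.add(int(m.group(1)))          # the jump target
                    if i + 1 < len(code):
                        lead.add(i + 1)                # the statement following the jump
            return sorted(lead)

        def basic_blocks(code):
            ls = leaders(code) + [len(code)]
            return [code[ls[i]:ls[i + 1]] for i in range(len(ls) - 1)]

        code = [
            "i = 1",               # (0)
            "s = 0",               # (1)
            "if i > 10 goto (6)",  # (2)
            "s = s + i",           # (3)
            "i = i + 1",           # (4)
            "goto (2)",            # (5)
            "end",                 # (6)
        ]
        for b in basic_blocks(code):
            print(b)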

    Ques7(c). Discuss briefly about the DAG representation of basic blocks. Draw the DAG for
        t1 := 4*i
        t2 := a[t1]

    A DAG for a basic block is a directed acyclic graph in which:


    - leaves are labeled by unique identifiers (variable names or constants),
    - interior nodes are labeled by operator symbols, and
    - nodes may also carry a list of identifiers holding the values computed at that node.

    It is useful for implementing transformations on basic blocks and shows how values computed by a statement are used in subsequent statements.

    E.g. for t1 := 4*i and t2 := a[t1], the DAG has a * node (labeled t1) with children 4 and i, and an indexing node [] (labeled t2) with children a and the * node.

    Algorithm for the construction of a DAG:

    Input: a basic block.

    Output: a DAG for that basic block, having:
    - a label for each node: leaves are labeled by identifiers, interior nodes by an operator symbol;
    - for each node, a list of identifiers holding the computed value.

    The statements have one of the forms: 1) x = y op z   2) x = op y   3) x = y

    Step 1: If node(y) is undefined, create a leaf labeled y and let node(y) be this node. In case 1), if node(z) is undefined, create a leaf labeled z and let that leaf be node(z).

    Step 2: For case 1), create a node labeled op with left child node(y) and right child node(z), after checking for a common subexpression. For case 2), check for a node labeled op with child node(y); if there is none, create it. For case 3), let n be node(y).

    Step 3: Delete x from the list of attached identifiers of node(x). Append x to the list of attached identifiers of the node n found in step 2 and set node(x) to n.
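    A minimal Python sketch of this algorithm for statements of the form x = y op z, run on the block d = b*c; e = a+b; b = b*c; a = e-d (the node representation is an assumption):

        nodes = []       # each node: op (None for a leaf), label, left/right child indices, attached ids
        node_of = {}     # maps a name to the index of the node currently holding its value

        def leaf(name):
            """Return the node holding name's value, creating a leaf for its initial value if needed."""
            if name not in node_of:
                nodes.append({"op": None, "label": name, "left": None, "right": None, "ids": []})
                node_of[name] = len(nodes) - 1
            return node_of[name]

        def build(x, op, y, z):
            l, r = leaf(y), leaf(z)                          # Step 1
            for i, n in enumerate(nodes):                    # Step 2: common-subexpression check
                if n["op"] == op and n["left"] == l and n["right"] == r:
                    target = i
                    break
            else:
                nodes.append({"op": op, "label": None, "left": l, "right": r, "ids": []})
                target = len(nodes) - 1
            old = node_of.get(x)                             # Step 3: move x to the new node
            if old is not None and x in nodes[old]["ids"]:
                nodes[old]["ids"].remove(x)
            nodes[target]["ids"].append(x)
            node_of[x] = target

        build("d", "*", "b", "c")
        build("e", "+", "a", "b")
        build("b", "*", "b", "c")      # detects b*c as a common subexpression
        build("a", "-", "e", "d")
        for i, n in enumerate(nodes):
            print(i, n["op"] or n["label"], n["left"], n["right"], n["ids"])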

    Applications of DAG:
    - Determining the common subexpressions.
    - Determining which identifiers have their values used in the block.
    - Determining which statements compute values that could be used outside the block.
    - Simplifying the list of quadruples by eliminating the common subexpressions and by not performing assignments of the form x := y unless they are necessary.

    OR

    Ques8(a). Discuss briefly about peephole optimization.

    Peephole optimization is a simple and effective technique for locally improving target code. This technique improves the performance of the target program by examining a short sequence of target instructions and replacing it by a shorter or faster sequence whenever possible. The peephole is a small, moving window on the target program.


    - Local in nature
    - Pattern driven
    - Limited by the size of the window

    Characteristics of peephole optimization:
    - Redundant instruction elimination
    - Flow-of-control optimization
    - Algebraic simplification
    - Use of machine idioms

    Constant folding:
        x := 32
        x := x + 32      becomes      x := 64

    Unreachable code:
    An unlabeled instruction immediately following an unconditional jump is removed. For example, an assignment such as x := x + 1 placed right after goto L2 can never be executed and is not needed, so it is eliminated.

    Flow-of-control optimizations:
    Unnecessary jumps are eliminated.
        goto L1
        ...
        L1: goto L2      becomes      goto L2

    Algebraic simplification:
    A statement such as x := x + 0 changes nothing and is not needed, so it can be eliminated.

    Dead code elimination:
    If x is not used after the statement, then
        x := 32
        y := x + y       becomes      y := y + 32

    Reduction in strength:
    Replace expensive operations by equivalent cheaper ones, e.g. x := x * 2 can be replaced by x := x + x.
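    A minimal Python sketch of a peephole pass that applies two of the transformations above, namely removal of an unlabeled instruction after an unconditional jump and algebraic simplification of x := x + 0 (the instruction format is an assumption):

        import re

        def peephole(code):
            out, i = [], 0
            while i < len(code):
                instr = code[i]
                # Unreachable code: drop an unlabeled instruction right after an unconditional goto.
                if instr.startswith("goto") and i + 1 < len(code) and ":" not in code[i + 1]:
                    out.append(instr)
                    i += 2
                    continue
                # Algebraic simplification: x := x + 0 has no effect.
                m = re.match(r"(\w+) := (\w+) \+ 0$", instr)
                if m and m.group(1) == m.group(2):
                    i += 1
                    continue
                out.append(instr)
                i += 1
            return out

        code = [
            "x := x + 0",
            "goto L2",
            "x := x + 1",        # unreachable: no label, follows an unconditional jump
            "L2: y := y + 32",
        ]
        print(peephole(code))    # ['goto L2', 'L2: y := y + 32']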

    Ques8(b). Explain the various issues in the design of code generation.

    Input to the code generator: the intermediate representation of the source program, e.g. linear representations such as postfix notation, three-address representations such as quadruples, virtual machine representations such as stack machine code, and graphical representations such as syntax trees and DAGs.

    Target programs: the output, which may be absolute machine language, relocatable machine language or assembly language.

    Memory management: mapping names in the source program to addresses of data objects in run-time memory is done jointly by the front end and the code generator.

    Instruction selection: the nature of the instruction set of the target machine determines the difficulty of instruction selection.

    Register allocation: instructions involving registers are shorter and faster. The use of registers is divided into two subproblems:
    - During register allocation, we select the set of variables that will reside in registers at a point in the program.


    - During a subsequent register assignment phase, we pick the specific register that a variable will reside in.

    Choice of evaluation order: the order in which computations are performed affects the efficiency of the target code.

    Approaches to code generation.

    UNIT 5

    Ques9(a). Write short notes on global data flow analysis.

    Global data flow analysis collects information about the way data is used in a program, taking control flow into account.

    Forward flow vs. backward flow:
    - Forward: compute OUT for given IN, GEN and KILL. Information propagates from the predecessors of a vertex. Examples: reachability, available expressions, constant propagation.
    - Backward: compute IN for given OUT, GEN and KILL. Information propagates from the successors of a vertex. Example: live variable analysis.
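    A minimal Python sketch of a backward analysis (live variables) computed iteratively with IN[B] = USE[B] ∪ (OUT[B] - DEF[B]) and OUT[B] = union of IN over the successors of B; the flow graph and the USE/DEF sets below are assumptions:

        # Flow graph: block -> successors; USE/DEF are given per block.
        SUCC = {"B1": ["B2"], "B2": ["B2", "B3"], "B3": []}
        USE  = {"B1": set(),       "B2": {"i", "s"}, "B3": {"s"}}
        DEF  = {"B1": {"i", "s"},  "B2": {"i", "s"}, "B3": set()}

        def live_variables():
            IN  = {b: set() for b in SUCC}
            OUT = {b: set() for b in SUCC}
            changed = True
            while changed:                      # iterate to a fixed point
                changed = False
                for b in SUCC:
                    OUT[b] = set().union(*(IN[s] for s in SUCC[b])) if SUCC[b] else set()
                    new_in = USE[b] | (OUT[b] - DEF[b])
                    if new_in != IN[b]:
                        IN[b] = new_in
                        changed = True
            return IN, OUT

        IN, OUT = live_variables()
        for b in SUCC:
            print(b, "IN =", IN[b], "OUT =", OUT[b])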

    Ques9(b). Describe in detail about optimization of basic blocks with example.

    Code improving transformations:
    - Structure-preserving transformations
      o Common subexpression elimination
      o Dead-code elimination

    Structure-preserving transformations: these are implemented by constructing a DAG for the basic block. A common subexpression can be detected by noticing, as a new node m is about to be added, whether there is an existing node n with the same children, in the same order, and with the same operator. If so, n computes the same value as m and may be used in its place. E.g. the DAG for the basic block

        d := b*c
        e := a+b
        b := b*c
        a := e-d

    contains a single node for the common subexpression b*c, with both d and b attached to it.

    For dead-code elimination, delete from the DAG any root (a node with no ancestors) that has no live variables attached. Repeated application of this transformation removes all nodes of the DAG that correspond to dead code.

    Use of algebraic identities:
    e.g. x+0 = 0+x = x


    x-0 = x
    x*1 = 1*x = x
    x/1 = x

    Reduction in strength: replace an expensive operator by a cheaper one, e.g. x ** 2 = x * x.

    Constant folding: evaluate constant expressions at compile time and replace them by their values.

    Use of commutative and associative laws: e.g. for
        a = b + c
        e = c + d + b
    the intermediate code is
        a = b + c
        t = c + d
        e = t + b
    If t is not needed outside the block, this can be changed to
        a = b + c
        e = a + d
    using both the associativity and commutativity of +.

    OR

    Ques10(a). Explain in detail about code-improving transformations.

    Ques10(b). Describe in detail the principal sources of optimization.

    Code optimization is needed to make the code run faster, or take less space, or both.

    Function-preserving transformations:
    - Common subexpression elimination
    - Copy propagation
    - Dead-code elimination
    - Constant folding

    Common subexpression elimination: an expression E is called a common subexpression if E was previously computed and the values of the variables in E have not changed since the previous computation.

    Copy propagation: assignments of the form f := g are called copy statements, or copies for short. The idea is to use g for f wherever possible after the copy statement.

    Dead code elimination: a variable is live at a point in the program if its value can be used subsequently; otherwise it is dead. Deducing at compile time that the value of an expression is a constant and using the constant instead is called constant folding.

    Loop optimization:

    - Code motion: takes an expression that yields the same result independent of the number of times a loop is executed (a loop-invariant computation) and places the expression before the loop.

    - Reduction in strength: replaces an expensive operation by a cheaper one.