Code optimization by solving data flow equations

Date
1993
Authors
Akarsu, Erol
Journal Title
Journal ISSN
Volume Title
Publisher
Fen Bilimleri Enstitüsü
Abstract
In this study, we solve the data flow equations that an optimizing compiler sets up to compute the various kinds of data flow information reaching the basic blocks; the equations are solved, with the help of the computation model of stack machines, while a special algorithm reduces the flow graph to its root node. First, taking as input a high-level program written in a subgrammar of C, we generated an intermediate code consisting of three-address statements. We then extracted the basic blocks from this intermediate code and built a flow graph whose nodes are the basic blocks. Next, we set up the various data flow equations for each block. These equations were solved while the graph was being reduced to a single node. Using these solutions, we implemented several optimization methods.
A global data flow graph with N nodes and E directed edges is constructed, and all the data flow equations that an optimizing compiler must solve are solved, for each node, using the global data-flow analysis algorithm introduced by Graham and Wegman [1]. Using the solutions of these equations, three kinds of code-improving transformations are implemented: common subexpression elimination, constant folding and copy propagation. First, for a subset of the C language, called Sub_C, two input files, lex.l and yacc.y, are prepared for the automated compiler-construction tools LEX and YACC respectively. LEX takes an input file, lex.l, consisting of a table of regular expressions and corresponding program fragments, and produces a program, lex.yy.c, which is the actual lexical analyzer. As each token is recognized, the corresponding program fragment specified in lex.l is executed. YACC takes a specification of the input process, yacc.y, which includes rules describing the input structure, code to be invoked when these rules are recognized (actions), and a low-level routine to do the basic input; this routine here is the output code of LEX, lex.yy.c. YACC then generates a function, the parser, which calls the lexical analyzer, picks up tokens, and organizes them according to the input-structure rules, called grammar rules. When one of these rules has been recognized, the code supplied for it, an action, is invoked. This action is called the semantic rule of the corresponding grammar rule. YACC is used to build an intermediate-code generator for a syntax-directed definition of a subset of C. In this syntax-directed definition, each grammar production A -> B has associated with it a set of semantic rules of the form b := f(c1, c2, ..., ck), where f is a function, and either 1. b is a synthesized attribute of A and c1, c2, ..., ck are attributes belonging to the grammar symbols of the production, or 2.
b is an inherited attribute of one of the grammar symbols on the right side of the production, and c1, c2, ..., ck are attributes belonging to the grammar symbols of the production. Functions in semantic rules are often written as expressions, procedure calls or program fragments. An attribute can represent anything we choose: a string, a number, a type, a memory location, or whatever. An attribute corresponds to the name of a field of a symbol-table record, which is kept to hold various information about each user-defined variable. The YACC parser has two stacks: the state stack, which holds integer-valued states, and the value stack (attribute stack), which holds the attribute values of grammar symbols. The parser keeps the values of synthesized attributes associated with grammar symbols on its value stack. Whenever a reduction is made, the values of the new synthesized attributes are computed from the attributes appearing on the value stack for the grammar symbols on the right side of the production. Suppose a grammar rule A -> B1 B2 ... Bn { an action }. The attribute value of Bi is referenced from the value stack by the symbol $i just before the reduction to A, and the attribute of A is accessed by the symbol $$. If B1 -> A1 A2 A3, then the attribute value of Ai is referenced by $(1-i). The type of an entry on the value stack is integer by default. If the user wishes to put values of other types on the value stack, the stack is declared to be a union of the various types, and the type of the entry must be inserted in <> after $ when referencing the attribute. Suppose the production A -> XY. Since the value of X.s (the synthesized attribute of symbol X) is already on the value stack before any reductions take place in the subtree below Y, this value can be inherited by Y. That is, if the inherited attribute Y.i is defined by a copy rule Y.i := X.s, then the value X.s can be used wherever Y.i is called for.
Using these reference mechanisms for inherited and synthesized attributes, we can attach semantic rules to the grammar productions, and in this way translate a high-level C program into intermediate code consisting of three-address statements (TAS). The general form of a TAS is x = y op z, where x, y and z are pointers into their symbol-table entries. Compiler-generated temporaries are inserted into the symbol table. There are many types of TASs. These statements can be implemented as records with fields for the operator and the operands; one quadruple holds one TAS. A quadruple is a record structure with four main fields, which we call op, arg1, arg2 and res. The op field contains an integer number for the operator. The TAS x = y op z is represented by placing y in arg1, z in arg2 and x in res. All TASs are inserted into an array of quadruples called quadrup. After producing the intermediate code, we partition the sequence of three-address statements into basic blocks. We first determine the set of leaders, the first statements of basic blocks: the first statement of the program, any statement that is the target of a conditional or unconditional goto, and any statement that immediately follows a conditional or unconditional goto are leaders. For each leader, its basic block consists of the leader and all statements up to, but not including, the next leader or the end of the program. If there is a jump from the last statement of B1 to the first statement of B2, or B2 immediately follows B1 in the program order and B1 does not end in an unconditional jump, then there is a directed edge from block B1 to block B2. A flow graph is a directed graph whose nodes correspond to basic blocks. The block whose leader is the first statement is the initial node. We make jumps point to blocks rather than quadruples.
A basic block b is declared as a record with fields such as tac_list, succ_list and pred_list, all pointing to the head of a linked list whose nodes contain, respectively, the indices of the TASs included in b, the indices of the successor blocks of b, and the indices of the predecessor blocks of b. As explained above, the main purpose of this work is to solve data flow equations using a specific algorithm. In order to do code optimization and a good job of code generation, a compiler needs to collect information about the program and distribute this information to each block in the flow graph. This information can be available expressions used to eliminate redundant computations, reaching definitions, copy statements and so on. These are just a few examples of the data-flow information that an optimizing compiler collects by a process known as data-flow analysis. Data-flow information can be collected by setting up and solving data-flow equations that relate information at various points in a program. A typical equation has the form out(S) = gen(S) U ( in(S) - kill(S) ) and can be read as "the information at the end of statement S is either generated within statement S, or enters at the beginning and is not killed as control flows through statement S". We solved five types of data-flow equations. The first is for reaching definitions and has, for a basic block B, the form in(B) = U out(P) /* P is a predecessor of B */ out(B) = gen(B) U ( in(B) - kill(B) ) Here in(B) and out(B) are unknown quantities and gen(B) and kill(B) known ones. All variables are represented compactly using bit-vector arrays. We assign a number to each definition of interest in the flow graph. Because we define the elements of this array as unsigned long with 32-bit length, the bit-vector array representing a set of definitions has a 1 in bit (i mod 32) of element (i div 32) of that array if and only if the definition numbered i is in the set.
The bit-vector array representation for sets also allows the set operations to be implemented efficiently. The union (U) and intersection (∩) of two sets can be implemented by "logical or" and "logical and" respectively. The difference A - B of sets A and B can be implemented as "A and (not B)". gen(B) and kill(B) can be defined by noting that each block may be viewed as a cascade of one or more statements, and that gen(S) and kill(S) of a statement S consisting of two cascaded statements S1; S2 may be defined as gen(S) = gen(S2) U (gen(S1) - kill(S2)) kill(S) = kill(S2) U (kill(S1) - gen(S2)) The second type of data flow equation is for available expressions and has, for a basic block B, the form e_in(B) = ∩ e_out(P) /* P is a predecessor of B */ e_out(B) = e_gen(B) U ( e_in(B) - e_kill(B) ) e_in(B_1) = empty set, where B_1 is the initial block e_gen(B) is the set of expressions generated by B, and e_kill(B) the set of expressions in U killed by B, where U is the set of all expressions appearing on the right side of one or more statements of the intermediate program. e_out(B) and e_in(B) are the expressions available at the end of B and at the beginning of B respectively. e_in and e_out are unknown variables and e_gen and e_kill known ones. The third type of equation is for live-variable analysis and has, for a basic block B, the form liv_in(B) = use(B) U ( liv_out(B) - def(B) ) liv_out(B) = U liv_in(S) /* S is a successor of B */ def(B) is the set of variables definitely assigned values in B prior to any use of that variable in B, and use(B) is the set of variables whose values may be used in B prior to any definition of the variable. The first group of equations says that a variable is live coming into a block if either it is used before definition in the block, or it is live coming out of the block and is not defined in the block.
The second group of equations says that a variable is live coming out of a block if and only if it is live coming into one of its successors. liv_in and liv_out are unknown variables and def and use known ones. The fourth type of equation is for definition-use (du) chains and has, for a basic block B, the form du_in(B) = du_use(B) U ( du_out(B) - du_def(B) ) du_out(B) = U du_in(S) /* S is a successor of B */ du_use(B) is the set of upwards-exposed uses in B, that is, the set of pairs (s, x) such that s is a statement in B which uses variable x and such that no prior definition of x occurs in B. du_def(B) is the set of pairs (s, x) such that s is a statement which uses x, s is not in B, and B has a definition of x. du_in and du_out are unknown variables and du_use and du_def known ones. The last equation, for copy propagation, has, for a basic block B, the form c_in(B) = ∩ c_out(P) /* P is a predecessor of B */ c_out(B) = c_gen(B) U ( c_in(B) - c_kill(B) ) c_in(B_1) = empty set, where B_1 is the initial block c_gen(B) is the set of all copies x := y generated in block B, and c_kill(B) is the set of copy statements in the program that are killed. c_in and c_out are unknown variables and the others known ones. After the known quantities in the above equations are calculated, in, c_in, e_in, liv_out and du_out are substituted into out, c_out, e_out, liv_in and du_in respectively. We then take the postfix form of out, c_out, e_out, liv_in and du_in, and insert, for each block b, each element of each postfix form into the corresponding linked list, whose head address is a field of the basic block. After performing all the tasks mentioned above, we arrive at the Graham-Wegman algorithm. The algorithm takes a reducible flow graph and reduces it to its root node. We make a depth-first search on the given flow graph and find the depth-first number of each node. Then we find a dominator number for each node such that if node x dominates node y, then the dominator number of x is less than the dominator number of y.
One node x dominates another node y if all paths from the initial node to y go through x. Using the dominator information and the depth-first numbers of all nodes, we find the loops and their header nodes. The algorithm uses three transformations, T1, T2 and T3, in reducing a given flow graph. The T1 transformation eliminates loops consisting of only one node v, that is, it deletes the edge (v, v). When applying the T1 transformation on a basic block b, we set/reset the contents of all nodes whose block index equals b in the linked list that keeps the postfix form of one data flow equation, and we repeat this process for the postfix forms of all data flow equations. The T2 transformation takes as input nodes h, v and w such that there are edges (h, v) and (v, w). T2 breaks the dependency of w on v and makes w dependent on h; that is, the edge (v, w) is deleted and the edge (h, w) is constructed. If node v has no successors after this step, node v and the edge (h, v) are deleted from the flow graph. At the same time as the T2 transformation is made, we search, for a basic block b, all nodes of the linked list holding the postfix form of one equation, find the nodes whose basic block index equals v, delete each such node, and insert in its place the corresponding linked list holding the postfix form of the same type of equation for block h. Thus the postfix form of that equation for basic block w no longer depends on the out(v) value but on out(h). We repeat this process for the postfix forms of all types of equations for block w. After we find the head nodes of the loops in the graph, we insert them into a list, head_list, and sort this list in ascending order of the dominator numbers of the head nodes. Then we take one loop head h from head_list and form the loop nodes of that head; then, as long as there are nodes h, v and w among the current loop nodes such that there are edges (h, v) and (v, w), we apply the T2 transformation on these nodes.
At the end of the T2 transformations, the loop consists of only the single node h with the edge (h, h). We remove this loop by the T1 transformation. We perform this process for every set of loop nodes whose head is in head_list, and obtain from the original graph a flow graph which contains no loops. On the nodes of this loopless flow graph we apply the T2 transformation until we get a flow graph containing only edges whose source is the root node of the original flow graph. We reduce the resulting graph to the root node by the T3 transformation. After the flow graph is reduced to the root node, we calculate the unknown quantities of the postfix forms of all data flow equations for each basic block by using the computation model of stack machines: for each postfix-form list, we start at the first node; if the current node contains a basic block index, we set/reset the value of this node; if the current node contains an operator symbol, we apply this operator to the values of the previous two nodes, store the result in their place, and delete the consumed nodes. This process continues until only one node remains in the postfix-form list. The resulting value of the postfix form is assigned to the unknown quantity (the first unknown) that owns this postfix-form list. According to the data flow equation for a basic block b, the second unknown of this equation is calculated from the first unknowns, already calculated for the corresponding equation, of the predecessor nodes of b. After solving the five types of data flow equations, we can determine the du-chain of every definition in the program, that is, if there is a definition of variable x in a block, the du-chain of this definition is the list of possible uses of this definition. We also find ud-chains, which are, for each use of a variable, the lists of all the definitions that reach that use.
Using the solutions of the data flow equations, which give us the collected data-flow information, we implemented three code-improving transformations: common subexpression elimination, copy propagation and constant folding.
Description
Thesis (M.Sc.) -- İstanbul Teknik Üniversitesi, Fen Bilimleri Enstitüsü, 1993
Keywords
Flow equations, Decoding
Citation