Data Flow

The material in these slides have been taken from Lecture Notes in Static Analysis” (Sec.6), by Michael I. Schwartzbach, “Principles of Program Analysis”, Chapter 6, by Niesen et al, and from Miachel Schwartzbach’s “Lecture notes in Static Analysis”, Chapter 6, First Section.

IN and OUT

EQUATIONS

Liveness Example

Summary by basic blocks

The dataflow equations used for a given basic block b and exiting block final in live variable analysis:

$\operatorname{GEN}[b]$ - The set of variables that are used in b before any assignment in the same basic block.

$\operatorname{KILL}[b]$ - The set of variables that are assigned a value in b

The in-state of a block is the set of variables that are live at the start of the block. Its out-state is the set of variables that are live at the end of it. The out-state is the union of the in-states of the block’s successors. The transfer function of a statement is applied by making the variables that are written dead, then making the variables that are read live.

equations

$ \[\begin{aligned} & \operatorname{IN}[b]=\operatorname{GEN}[b] \cup\left(\operatorname{OUT}[b]-\operatorname{KILL}[s]\right) \\ & \operatorname{OUT}[\text { final }]=\emptyset \\ & \operatorname{OUT}[b]=\bigcup_{p \in s u c c[b]} \operatorname{IN}[p] \\ & \operatorname{GEN}\left[b: y \leftarrow f\left(x_1, \cdots, x_n\right)\right]=\left\{x_1, \ldots, x_n\right\} \\ & \operatorname{KILL}\left[b: y \leftarrow f\left(x_1, \cdots, x_n\right)\right]=\{y\} \end{aligned}\]

an example

b1: 
a = 3
b = 5
d = 4
x = 100
if a > b then 

   b2: 
    c = a + b
    d = 2 

b3: 
   c = 4
   return b*d +c

$\operatorname{GEN}[b]$ - The set of variables that are used in b before any assignment in the same basic block.

$\operatorname{KILL}[b]$ - The set of variables that are assigned a value in b

GEN[b1] = []         kill[b1] = [a,b,d,x]
GEN[b2] = [a,b]      kill[b2] = [c,d]
GEN[b3] = [b,d]      Kill[b3] = [c]

processing

%%{init: {"flowchart": {"htmlLabels": false}} }%%
graph TB
b1--> b2
b1 --> b3
b2--> b3

%%{init: {"flowchart": {"htmlLabels": false}} }%%
graph TB
b1--> b2
b1 --> b3
b2--> b3

GEN[b1] = []         kill[b1] = [a,b,d,x]
GEN[b2] = [a,b]      kill[b2] = [c,d]
GEN[b3] = [b,d]      Kill[b3] = [c]

block OUT       IN   Next IN        worklist  
b3    []        []    [b,d]           b1,b2
b1    [b,d]     []    []              b2
b2    [b,d]     []    [a,b]           b1
b1    [a,b,d]   []    []              empty

frameworks

common properties Direction

Direction

backward

liveness
very busy expressions

OUT is a function of the IN of successors

forward

reaching Defs
Available Expressions

IN is a function of the OUT of Preds

common properties Operation

May union

Liveness
Reaching defs

merge using intersection

must

very busy expressions
Available Expressions

merge using union

transfer functions with a block or for one statement

Forward
\[ \text{OUT}_b = f_b(\text{IN}_b) \]

Backward
\[ \text{IN}_b = f_b(\text{OUT}_b) \]

liveness IN = (OUT-def) union (args)

Very busy expressions IN = (OUT - exprs(def)) union (this expr)

an example

if b1 
    while b2 { x = a1}
else 
    while b3 { x = a2}
x = a3

%%{init: {"flowchart": {"htmlLabels": false}} }%%

graph TD;
b1["p1: if b1"]
b2["p2: use b3"]
b3["p3: x = a2"]
b4["p4: use b2"]
b5["p5: x = a1"]
b6["p6: x = a3"]
b1--> b4
b1 --> b2


b2--> b6
b2--> b3

b3--> b2
b5--> b4

b4--> b5
b4--> b6

%%{init: {"flowchart": {"htmlLabels": false}} }%%

graph TD;
b1["p1: if b1"]
b2["p2: use b3"]
b3["p3: x = a2"]
b4["p4: use b2"]
b5["p5: x = a1"]
b6["p6: x = a3"]
b1--> b4
b1 --> b2


b2--> b6
b2--> b3

b3--> b2
b5--> b4

b4--> b5
b4--> b6

reaching defs - a definition of a variable v at pv reaches a point p if there is a path from pv tp p and v is not redefined along the path

equations

%%{init: {"flowchart": {"htmlLabels": false}} }%%


graph TD;
b1["p1: if b1"]
b2["p2: use b3"]
b3["p3: x = a2"]
b4["p4: use b2"]
b5["p5: x = a1"]
b6["p6: x = a3"]
b1--> b4
b1 --> b2


b2--> b6
b2--> b3

b3--> b2
b5--> b4

b4--> b5
b4--> b6

%%{init: {"flowchart": {"htmlLabels": false}} }%%


graph TD;
b1["p1: if b1"]
b2["p2: use b3"]
b3["p3: x = a2"]
b4["p4: use b2"]
b5["p5: x = a1"]
b6["p6: x = a3"]
b1--> b4
b1 --> b2


b2--> b6
b2--> b3

b3--> b2
b5--> b4

b4--> b5
b4--> b6

\[ \small \text{IN}_p = \bigcup \text{OUT}_{ps}, ps \in pred(p)\]

IN[1] = empty
IN[2] = OUT[1] union OUT[3]
IN[3] = OUT[2]
IN[4] = OUT[1] union OUT[5]
IN[5] = OUT[4]
IN[6] = OUT[2] union OUT[4]

\[ \small \text{OUT}_p = (\text{IN)}_p - defs(v)) \cup \{ (p,v) \} \]

OUT[1] = IN[1]
OUT[2] = IN[2]
OUT[3] = (IN[3] -{3,5,6}) union {3}
OUT[4] = IN[4]
OUT[5] = (IN[5] - {3,5,6}) union {5}
OUT[6] = (IN[6] - {3,5,6}) union {6}

complexity

graph of equations

IN[1] = empty
IN[2] = OUT[1] union OUT[3]
IN[3] = OUT[2]
IN[4] = OUT[1] union OUT[5]
IN[5] = OUT[4]
IN[6] = OUT[2] union OUT[4]
OUT[1] = IN[1]
OUT[2] = IN[2]
OUT[3] = (IN[3] -{3,5,6}) union {3}
OUT[4] = IN[4]
OUT[5] = (IN[5] - {3,5,6}) union {5}
OUT[6] = (IN[6] - {3,5,6}) union {6}

%%{init: {"flowchart": {"htmlLabels": false}} }%%

graph TB;
IN1-->OUT1
OUT1-->IN2
OUT1-->IN4
OUT3--> IN2
OUT2--> IN3
OUT5--> IN4
OUT4--> IN5
OUT2--> IN6
OUT4--> IN6
IN2--> OUT2
IN3--> OUT3
IN4--> OUT4
IN5--> OUT5
IN6--> OUT6

%%{init: {"flowchart": {"htmlLabels": false}} }%%

graph TB;
IN1-->OUT1
OUT1-->IN2
OUT1-->IN4
OUT3--> IN2
OUT2--> IN3
OUT5--> IN4
OUT4--> IN5
OUT2--> IN6
OUT4--> IN6
IN2--> OUT2
IN3--> OUT3
IN4--> OUT4
IN5--> OUT5
IN6--> OUT6

Reverse Postorder

visit successors first (need an ordering)

order

%%{init: {"flowchart": {"htmlLabels": false}} }%%

graph TB;
IN1-->OUT1
OUT1-->IN2
OUT1-->IN4
OUT3--> IN2
OUT2--> IN3
OUT5--> IN4
OUT4--> IN5
OUT2--> IN6
OUT4--> IN6
IN2--> OUT2
IN3--> OUT3
IN4--> OUT4
IN5--> OUT5
IN6--> OUT6

%%{init: {"flowchart": {"htmlLabels": false}} }%%

graph TB;
IN1-->OUT1
OUT1-->IN2
OUT1-->IN4
OUT3--> IN2
OUT2--> IN3
OUT5--> IN4
OUT4--> IN5
OUT2--> IN6
OUT4--> IN6
IN2--> OUT2
IN3--> OUT3
IN4--> OUT4
IN5--> OUT5
IN6--> OUT6

%%{init: {"flowchart": {"htmlLabels": false}} }%%
graph TD
b1[b1-1]--> b4[b4-4]
b1--> b2[b2-2]
b4--> b6[b6-5]
b4--> b5[b5-6]
b2--> b3[b3-3]

%%{init: {"flowchart": {"htmlLabels": false}} }%%
graph TD
b1[b1-1]--> b4[b4-4]
b1--> b2[b2-2]
b4--> b6[b6-5]
b4--> b5[b5-6]
b2--> b3[b3-3]

order b5 b6 b3 b3 b2 b1

implement

keep two data structures

C current list
P set of pending lists

initially C is a reverse post order sort of the nodes

process each element of C

when we find a changed add it to P

When C is empty, sort P in reverse post order and move to C

representing sets

we keep doing union and intersection for sets, which are sparse

compilers generally use bit vectors

pseudo code

// Initialize
for all CFG nodes n in N,
    OUT[n] = emptyset; // can optimize by OUT[n] = GEN[n];

// put all nodes into the changed set
// N is all nodes in graph,
Changed = N;

// Iterate 
while (Changed != emptyset)
{
    choose a node n in Changed;
    // remove it from the changed set
    Changed = Changed -{ n };

    // init IN[n] to be empty
    IN[n] = emptyset;
  
    // calculate IN[n] from predecessors' OUT[p]
    for all nodes p in predecessors(n)
         IN[n] = IN[n] Union OUT[p];

    oldout = OUT[n]; // save old OUT[n]
    
    // update OUT[n] using transfer function f_n ()
    OUT[n] = GEN[n] Union (IN[n] -KILL[n]);

    // any change to OUT[n] compared to previous value?
    if (OUT[n] changed) // compare oldout vs. OUT[n]
    {    
        // if yes, put all successors of n into the changed set
        for all nodes s in successors(n)
             Changed = Changed U { s };
    }
}

loops

This algorithm has no problems with loops!

homework 3

Implement one data flow analysis - For Bonus points make it generic so that the same code supports multiple analysis. As always, think about how to test it. use a simple ordering- Not necessary to use reverse post order

as always think about testing

Other Formats

IN and OUT

EQUATIONS

Liveness Example

Summary by basic blocks

equations

an example

processing

frameworks

common properties Direction

common properties Operation

transfer functions with a block or for one statement

an example

equations

complexity

graph of equations

Reverse Postorder

order

implement

representing sets

pseudo code

loops

homework 3