Homework 3

Author

Rohit Gurusamy Anandakumar

Data Flow Analysis

About data flow analysis:

Definition: Data Flow Analysis (DFA) is a technique used in compiler optimization to analyze how values flow through program variables during execution.

  • The code should remove the instructions that are reassigned without being used.
  • If the value in being used in between the reassigning, then the instruction should be retained.

Forward (Top-Down) Analysis:

Reaching Definitions
  • Flows in direction of program execution
  • Propagates information about variable definitions forward
  • Initial state flows from start to end
Available Expressions
  • Information flows from top to bottom
  • Computes expressions already evaluated at each point
  • Propagates forward through control flow graph

Backward (Bottom-Up) Analysis:

Liveness Analysis
  • Information flows opposite to program execution
  • Starts from variable uses and works backwards
  • Determines if variable value will be needed later
Very Busy Expressions
  • Flows from bottom to top
  • Works backwards from expression uses
  • Determines what will definitely be computed later

The direction matters because:

  • Forward analyses use information from predecessor nodes
  • Backward analyses use information from successor nodes
  • Direction affects how we iterate through the control flow graph
  • Initialization points differ (entry vs exit nodes)

Implementation:

Import Libraries and Define Named Tuples

  • Import necessary modules (sys, json, etc.)
  • Define Analysis named tuple with attributes: forward, init, merge, transfer

Define Union and Intersection Functions

  • Define union function: Merge multiple sets into one
  • Define intersection function: Find common elements among sets

Define Data Flow Worklist Algorithm

  • Determine edges based on analysis direction (forward/backward)
  • Initialize in_ and out sets for each block
  • Add all blocks to worklist
  • While worklist is not empty:
    • Pop a block, calculate in_ value by merging predecessor/successor out values
    • Update out value using transfer function
    • If out value changes, re-add successors/predecessors to worklist

Define Formatting Function

  • Format sets and dictionaries into readable strings

Run Data Flow Analysis

  • For each function in the BRIL program:
    • Convert instructions into blocks
    • Add terminators to blocks
    • Execute worklist algorithm
    • Print in_ and out values for each block

Define Generation and Killing Functions for Analyses

  • Define functions to generate and kill variables/expressions for different analyses (liveness, reaching defs, very busy expressions, available expressions)

Define Analyses

  • Define attributes and transfer functions for each analysis type (LIVENESS, VERYBUSY, REACHING, AVAILABLE)

Main Execution

  • Load BRIL program from input
  • Run the specified data flow analysis

Full code: https://github.com/gurusamyanandakuma-r/bril/blob/main/HW/HW3_Rohit/dfa.py

Testing:

Testing Tool: (Turnt)

```{turnt.toml} [envs.LIVENESS] command = “bril2json < {filename} | python3 dfa.py LIVENESS” output.”LIVENESS.out” = “-”

[envs.VERYBUSY]
command = "bril2json < {filename} | python3 dfa.py VERYBUSY"
output."VERYBUSY.out" = "-"

[envs.REACHING]
command = "bril2json < {filename} | python3 dfa.py REACHING"
output."REACHING.out" = "-"

[envs.AVAILABLE]
command = "bril2json < {filename} | python3 dfa.py AVAILABLE"
output."AVAILABLE.out" = "-"
#### Example 1: Without branches (one block code)
##### Input bril:






```{bril}
    @main {

        divisor: int = const 7;
        divident: int = const 42;
        quotient: int = div divident divisor;
        product: int = mul quotient divisor;
        remainder: int = sub divident product;  

        print remainder; 

    }
Output: LIVENESS:
    BLOCK1:
        IN:  NULL
        OUT: NULL
Output: VERY BUSY:
    BLOCK1:
        IN:  ('div', ('divident', 'divisor')), ('mul', ('quotient', 'divisor')), ('sub', ('divident', 'product'))
        OUT: NULL
Output: REACHING:
    BLOCK1:
        IN:  NULL
        OUT: ('divident', 'BLOCK1'), ('divisor', 'BLOCK1'), ('product', 'BLOCK1'), ('quotient', 'BLOCK1'), ('remainder', 'BLOCK1')
Output: AVAILABLE:
    BLOCK1:
        IN:  NULL
        OUT: ('div', ('divident', 'divisor')), ('mul', ('quotient', 'divisor')), ('sub', ('divident', 'product'))

Example 2: With branches

Input bril:
    @main {
    x: int = const 3;
    y: int = const 4;
    
    .header:
        cond: bool = lt x y;   # Check if x < y
        br cond .then .else;   # Branch based on condition

    .then:
        result: int = add x y;  # result = x + y
        jmp .end;

    .else:
        result: int = sub x y;  # result = x - y
        jmp .end;

    .end:
        print result;           # Print the result
    }
Output: LIVENESS:
    BLOCK1:
        IN:  NULL
        OUT: x, y
    header:
        IN:  x, y
        OUT: x, y
    then:
        IN:  x, y
        OUT: result
    else:
        IN:  x, y
        OUT: result
    end:
        IN:  result
        OUT: NULL
Output: VERY BUSY:
    BLOCK1:
        IN:  ('lt', ('x', 'y'))
        OUT: ('lt', ('x', 'y'))
    header:
        IN:  ('lt', ('x', 'y'))
        OUT: NULL
    then:
        IN:  ('add', ('x', 'y'))
        OUT: NULL
    else:
        IN:  ('sub', ('x', 'y'))
        OUT: NULL
    end:
        IN:  NULL
        OUT: NULL
Output: REACHING:
    BLOCK1:
        IN:  NULL
        OUT: ('x', 'BLOCK1'), ('y', 'BLOCK1')
    header:
        IN:  ('x', 'BLOCK1'), ('y', 'BLOCK1')
        OUT: ('cond', 'header'), ('x', 'BLOCK1'), ('y', 'BLOCK1')
    then:
        IN:  ('cond', 'header'), ('x', 'BLOCK1'), ('y', 'BLOCK1')
        OUT: ('cond', 'header'), ('result', 'then'), ('x', 'BLOCK1'), ('y', 'BLOCK1')
    else:
        IN:  ('cond', 'header'), ('x', 'BLOCK1'), ('y', 'BLOCK1')
        OUT: ('cond', 'header'), ('result', 'else'), ('x', 'BLOCK1'), ('y', 'BLOCK1')
    end:
        IN:  ('cond', 'header'), ('result', 'else'), ('result', 'then'), ('x', 'BLOCK1'), ('y', 'BLOCK1')
        OUT: ('cond', 'header'), ('result', 'else'), ('result', 'then'), ('x', 'BLOCK1'), ('y', 'BLOCK1')
Output: AVAILABLE:
    BLOCK1:
        IN:  NULL
        OUT: NULL
    header:
        IN:  NULL
        OUT: ('lt', ('x', 'y'))
    then:
        IN:  ('lt', ('x', 'y'))
        OUT: ('add', ('x', 'y')), ('lt', ('x', 'y'))
    else:
        IN:  ('lt', ('x', 'y'))
        OUT: ('lt', ('x', 'y')), ('sub', ('x', 'y'))
    end:
        IN:  ('lt', ('x', 'y'))
        OUT: ('lt', ('x', 'y'))

Challenges

  • Balancing the forward and backward data flow analysis while managing the worklist efficiently wass tough. Each direction affects the edges and node values differently, which can lead to complicated logic.

Conclusion

Developed code that handles liveness, availability, reaching definitions, and very busy expressions using a worklist algorithm. This ensures effective forward and backward analysis by generating and killing sets for each analysis type.

Back to top