Homework 3

Author

Rohit Gurusamy Anandakumar

Data Flow Analysis

About data flow analysis:

Definition: Data Flow Analysis (DFA) is a technique used in compiler optimization to analyze how values flow through program variables during execution.

The code should remove the instructions that are reassigned without being used.
If the value in being used in between the reassigning, then the instruction should be retained.

Forward (Top-Down) Analysis:

Reaching Definitions

Flows in direction of program execution
Propagates information about variable definitions forward
Initial state flows from start to end

Available Expressions

Information flows from top to bottom
Computes expressions already evaluated at each point
Propagates forward through control flow graph

Backward (Bottom-Up) Analysis:

Liveness Analysis

Information flows opposite to program execution
Starts from variable uses and works backwards
Determines if variable value will be needed later

Very Busy Expressions

Flows from bottom to top
Works backwards from expression uses
Determines what will definitely be computed later

The direction matters because:

Forward analyses use information from predecessor nodes
Backward analyses use information from successor nodes
Direction affects how we iterate through the control flow graph
Initialization points differ (entry vs exit nodes)

Implementation:

Import Libraries and Define Named Tuples

Import necessary modules (sys, json, etc.)
Define Analysis named tuple with attributes: forward, init, merge, transfer

Define Union and Intersection Functions

Define union function: Merge multiple sets into one
Define intersection function: Find common elements among sets

Define Data Flow Worklist Algorithm

Determine edges based on analysis direction (forward/backward)
Initialize in_ and out sets for each block
Add all blocks to worklist
While worklist is not empty:
- Pop a block, calculate in_ value by merging predecessor/successor out values
- Update out value using transfer function
- If out value changes, re-add successors/predecessors to worklist

Define Formatting Function

Format sets and dictionaries into readable strings

Run Data Flow Analysis

For each function in the BRIL program:
- Convert instructions into blocks
- Add terminators to blocks
- Execute worklist algorithm
- Print in_ and out values for each block

Define Generation and Killing Functions for Analyses

Define functions to generate and kill variables/expressions for different analyses (liveness, reaching defs, very busy expressions, available expressions)

Define Analyses

Define attributes and transfer functions for each analysis type (LIVENESS, VERYBUSY, REACHING, AVAILABLE)

Main Execution

Load BRIL program from input
Run the specified data flow analysis

Full code: https://github.com/gurusamyanandakuma-r/bril/blob/main/HW/HW3_Rohit/dfa.py

Testing:

Testing Tool: (Turnt)

```{turnt.toml} [envs.LIVENESS] command = “bril2json < {filename} | python3 dfa.py LIVENESS” output.”LIVENESS.out” = “-”

[envs.VERYBUSY]
command = "bril2json < {filename} | python3 dfa.py VERYBUSY"
output."VERYBUSY.out" = "-"

[envs.REACHING]
command = "bril2json < {filename} | python3 dfa.py REACHING"
output."REACHING.out" = "-"

[envs.AVAILABLE]
command = "bril2json < {filename} | python3 dfa.py AVAILABLE"
output."AVAILABLE.out" = "-"

#### Example 1: Without branches (one block code)
##### Input bril:






```{bril}
    @main {

        divisor: int = const 7;
        divident: int = const 42;
        quotient: int = div divident divisor;
        product: int = mul quotient divisor;
        remainder: int = sub divident product;  

        print remainder; 

    }

Output: LIVENESS:

    BLOCK1:
        IN:  NULL
        OUT: NULL

Output: VERY BUSY:

    BLOCK1:
        IN:  ('div', ('divident', 'divisor')), ('mul', ('quotient', 'divisor')), ('sub', ('divident', 'product'))
        OUT: NULL

Output: REACHING:

    BLOCK1:
        IN:  NULL
        OUT: ('divident', 'BLOCK1'), ('divisor', 'BLOCK1'), ('product', 'BLOCK1'), ('quotient', 'BLOCK1'), ('remainder', 'BLOCK1')

Output: AVAILABLE:

    BLOCK1:
        IN:  NULL
        OUT: ('div', ('divident', 'divisor')), ('mul', ('quotient', 'divisor')), ('sub', ('divident', 'product'))

Example 2: With branches

Input bril:

    @main {
    x: int = const 3;
    y: int = const 4;
    
    .header:
        cond: bool = lt x y;   # Check if x < y
        br cond .then .else;   # Branch based on condition

    .then:
        result: int = add x y;  # result = x + y
        jmp .end;

    .else:
        result: int = sub x y;  # result = x - y
        jmp .end;

    .end:
        print result;           # Print the result
    }

Output: LIVENESS:

    BLOCK1:
        IN:  NULL
        OUT: x, y
    header:
        IN:  x, y
        OUT: x, y
    then:
        IN:  x, y
        OUT: result
    else:
        IN:  x, y
        OUT: result
    end:
        IN:  result
        OUT: NULL

Output: VERY BUSY:

    BLOCK1:
        IN:  ('lt', ('x', 'y'))
        OUT: ('lt', ('x', 'y'))
    header:
        IN:  ('lt', ('x', 'y'))
        OUT: NULL
    then:
        IN:  ('add', ('x', 'y'))
        OUT: NULL
    else:
        IN:  ('sub', ('x', 'y'))
        OUT: NULL
    end:
        IN:  NULL
        OUT: NULL

Output: REACHING:

    BLOCK1:
        IN:  NULL
        OUT: ('x', 'BLOCK1'), ('y', 'BLOCK1')
    header:
        IN:  ('x', 'BLOCK1'), ('y', 'BLOCK1')
        OUT: ('cond', 'header'), ('x', 'BLOCK1'), ('y', 'BLOCK1')
    then:
        IN:  ('cond', 'header'), ('x', 'BLOCK1'), ('y', 'BLOCK1')
        OUT: ('cond', 'header'), ('result', 'then'), ('x', 'BLOCK1'), ('y', 'BLOCK1')
    else:
        IN:  ('cond', 'header'), ('x', 'BLOCK1'), ('y', 'BLOCK1')
        OUT: ('cond', 'header'), ('result', 'else'), ('x', 'BLOCK1'), ('y', 'BLOCK1')
    end:
        IN:  ('cond', 'header'), ('result', 'else'), ('result', 'then'), ('x', 'BLOCK1'), ('y', 'BLOCK1')
        OUT: ('cond', 'header'), ('result', 'else'), ('result', 'then'), ('x', 'BLOCK1'), ('y', 'BLOCK1')

Output: AVAILABLE:

    BLOCK1:
        IN:  NULL
        OUT: NULL
    header:
        IN:  NULL
        OUT: ('lt', ('x', 'y'))
    then:
        IN:  ('lt', ('x', 'y'))
        OUT: ('add', ('x', 'y')), ('lt', ('x', 'y'))
    else:
        IN:  ('lt', ('x', 'y'))
        OUT: ('lt', ('x', 'y')), ('sub', ('x', 'y'))
    end:
        IN:  ('lt', ('x', 'y'))
        OUT: ('lt', ('x', 'y'))

Challenges

Balancing the forward and backward data flow analysis while managing the worklist efficiently wass tough. Each direction affects the edges and node values differently, which can lead to complicated logic.

Conclusion

Developed code that handles liveness, availability, reaching definitions, and very busy expressions using a worklist algorithm. This ensures effective forward and backward analysis by generating and killing sets for each analysis type.