Homework4 - dominance

Author

Sana Taghipour Anvari

Link for the code: cfgdominance

Explanation of the code

This implementation performs dominator analysis on a control flow graph (CFG). It includes several key functions:

build_cfg: Constructs a Control Flow Graph from a given function’s instructions.
compute_dominators: Calculates the set of dominators for each node in the CFG.
build_dominator_tree: Constructs the dominator tree by finding the immediate dominator for each node.
compute_dominance_frontier: Computes the dominance frontier for each node in the CFG.
test_dominance: Verifies the correctness of the computed dominators.

The main function reads a JSON input file containing function definitions, applies these analyses to each function, and outputs the results. Input:

main {
  entry:
    x: int = const 0;
    jmp L1;

  L1:
    y: int = const 1;
    br y L2 L3;

  L2:
    z: int = add x y;
    jmp L4;

  L3:
    w: int = sub x y;
    jmp L4;

  L4:
    ret z;
}

Control Flow Graph:

          [ Block 0 (Entry) ]
                   |
                   v
             [ Block 1 (L1) ]
              /             \
             v               v
    [ Block 2 (L2) ]     [ Block 3 (L3) ]
             \               /
              \             /
               v           v
             [ Block 4 (L4) ]
                   |
                [ Exit ]

import json
from collections import defaultdict

def build_cfg(func):
    cfg = defaultdict(list)
    blocks = []
    current_block = []
    label_to_block = {}

    for instr in func['instrs']:
        if isinstance(instr, dict) and 'label' in instr:
            if current_block:
                blocks.append(current_block)
            current_block = [instr]
            label_to_block[instr['label']] = len(blocks)
        else:
            current_block.append(instr)

        if isinstance(instr, dict) and instr.get('op') in ['jmp', 'br']:
            blocks.append(current_block)
            current_block = []

    if current_block:
        blocks.append(current_block)

    for i, block in enumerate(blocks):
        last_instr = block[-1]
        if isinstance(last_instr, dict):
            if last_instr.get('op') in ['jmp', 'br']:
                for label in last_instr['labels']:
                    if label in label_to_block:
                        cfg[i].append(label_to_block[label])
            elif last_instr.get('op') != 'ret' and i + 1 < len(blocks):
                cfg[i].append(i + 1)
        cfg[i] 

    return cfg

def compute_dominators(cfg):
    entry = 0
    all_nodes = set(cfg.keys())
    dom = {node: all_nodes.copy() for node in all_nodes}
    dom[entry] = {entry}

    changed = True
    while changed:
        changed = False
        for node in cfg:
            predecessors = get_predecessors(cfg, node)
            if predecessors:
                new_dom = set.intersection(*(dom[pred] for pred in predecessors))
                new_dom.add(node)
            else:
                new_dom = {node}
            if new_dom != dom[node]:
                dom[node] = new_dom
                changed = True

    return dom

def get_predecessors(cfg, node):
    return [pred for pred, succs in cfg.items() if node in succs]

def build_dominator_tree(dom):
    idom = {}
    for node in dom:
        if node == 0:  
            idom[node] = None
        else:
            # Sort dominators by the size of their dominator sets in descending order
            sorted_doms = sorted(dom[node] - {node}, key=lambda x: len(dom[x]), reverse=True)
            idom[node] = next(d for d in sorted_doms if d != node)
    return idom

def compute_dominance_frontier(cfg, dom):
    df = defaultdict(set)
    for node in cfg:
        for succ in cfg[node]:
            runner = node
            while runner not in dom[succ]:
                df[runner].add(succ)
                runner = next(d for d in sorted(dom[runner] - {runner}) if d != runner)
    return df

def test_dominance(cfg, dom):
    entry = 0
    for node in cfg:
        if node == entry:
            continue
        for d in dom[node] - {node}:
            if d == entry:
                continue
            #Temporarily remove dominator d from the CFG
            modified_cfg = remove_node_from_cfg(cfg, d)
            # Check if node is still reachable from entry
            if is_reachable(modified_cfg, entry, node):
                #if node is still reachable without d, then d does not dominate node
                return False
    return True


def remove_node_from_cfg(cfg, node_to_remove):
    modified_cfg = {node: succs.copy() for node, succs in cfg.items() if node != node_to_remove}
    for succs in modified_cfg.values():
        if node_to_remove in succs:
            succs.remove(node_to_remove)
    return modified_cfg

def is_reachable(cfg, start, target):
    visited = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node == target:
            return True
        if node not in visited:
            visited.add(node)
            successors = cfg.get(node, [])
            stack.extend(successors)
    return False


def main(json_input):
    data = json.loads(json_input)

    for func in data['functions']:
        print(f"Function: {func['name']}")

        cfg = build_cfg(func)
        print("\nControl Flow Graph:")
        print(json.dumps(cfg, indent=2))

        dom = compute_dominators(cfg)
        print("\nDominators:")
        print(json.dumps({k: sorted(list(v)) for k, v in dom.items()}, indent=2))

        idom = build_dominator_tree(dom)
        print("\nImmediate Dominators (Dominator Tree):")
        print(json.dumps(idom, indent=2))

        df = compute_dominance_frontier(cfg, dom)
        print("\nDominance Frontier:")
        print(json.dumps({k: sorted(list(v)) for k, v in df.items()}, indent=2))

        test_result = test_dominance(cfg, dom)
        print("\nDominance Test Result:")
        print(test_result)

if __name__ == "__main__":
    input_json = '''
    {
      "functions": [
        {
          "name": "main",
          "instrs": [
            { "label": "entry" },
            { "op": "const", "dest": "x", "type": "int", "value": 0 },
            { "op": "jmp", "labels": ["L1"] },
            
            { "label": "L1" },
            { "op": "const", "dest": "y", "type": "int", "value": 1 },
            { "op": "br", "args": ["y"], "labels": ["L2", "L3"] },

            { "label": "L2" },
            { "op": "add", "args": ["x", "y"], "dest": "z", "type": "int" },
            { "op": "jmp", "labels": ["L4"] },

            { "label": "L3" },
            { "op": "sub", "args": ["x", "y"], "dest": "w", "type": "int" },
            { "op": "jmp", "labels": ["L4"] },

            { "label": "L4" },
            { "op": "ret", "args": ["z"] }
          ]
        }
      ]
    }
    '''
    main(input_json)

Function: main

Control Flow Graph:
{
  "0": [
    1
  ],
  "1": [
    2,
    3
  ],
  "2": [
    4
  ],
  "3": [
    4
  ],
  "4": []
}

Dominators:
{
  "0": [
    0
  ],
  "1": [
    0,
    1
  ],
  "2": [
    0,
    1,
    2
  ],
  "3": [
    0,
    1,
    3
  ],
  "4": [
    0,
    1,
    4
  ]
}

Immediate Dominators (Dominator Tree):
{
  "0": null,
  "1": 0,
  "2": 1,
  "3": 1,
  "4": 1
}

Dominance Frontier:
{
  "2": [
    4
  ],
  "3": [
    4
  ]
}

Dominance Test Result:
True

How we testing our implementation

The test_dominance function verifies the correctness of the computed dominators. It does this by:

It iterates through all nodes in the CFG except the entry node.
For each node, it considers all of its computed dominators (excluding the node itself and the entry node).
For each dominator, it temporarily removes that node from the CFG using the remove_node_from_cfg function.
It then checks if the original node is still reachable from the entry node in this modified CFG using the is_reachable function.
If the node is still reachable after removing a supposed dominator, it means that dominator wasn’t actually necessary to reach the node, so the test fails.
If all nodes pass this test for all their dominators, the function returns True. Also, in the code directory I tested this code with two different inputs and both passed the test function.

Hardest Part of the Task and How We Addressed It

The hardest part of this task was likely the correct implementation of the dominator tree construction, specifically in the build_dominator_tree function. This is challenging because:

It requires finding the immediate dominator for each node, which is the closest strict dominator in the dominator graph.
The naive approach of simply choosing any dominator can lead to incorrect results in complex control flow structures.

We addressed this challenge by:

Sorting the dominators of each node based on the size of their own dominator sets, in descending order.
This sorting ensures that we consider closer dominators first, as nodes closer in the dominator tree will have larger dominator sets.
We then select the first dominator from this sorted list (excluding the node itself) as the immediate dominator.