Overview of Bril

How to use Bril with real code

Bril is very simple, very regular, ir.
Bril can be extended easily.
Bril has lots of tools and examples.
Bril tools are written in lots of languages so setup can be messy

Lets look at a bril program.

Bril is written in JSON format. Almost all programming languages have a way to read json.

import json
import subprocess
import os 
import sys


### temp 
out = subprocess.check_output('which python', shell=True)
print(out)
print('***********************')

# read from a file 
with open("images/add.json","r") as f:
    bril_program = json.load(f)

# read from a pipe
# bril_program = json.load(sys.stdin)
    
print(json.dumps(bril_program, 
    indent=2))

b'/opt/hostedtoolcache/Python/3.10.15/x64/bin/python\n'
***********************
{
  "functions": [
    {
      "name": "main",
      "instrs": [
        {
          "op": "const",
          "type": "int",
          "dest": "v0",
          "value": 1
        },
        {
          "op": "const",
          "type": "int",
          "dest": "v1",
          "value": 2
        },
        {
          "op": "add",
          "type": "int",
          "dest": "v2",
          "args": [
            "v0",
            "v1"
          ]
        },
        {
          "op": "print",
          "args": [
            "v2"
          ]
        }
      ],
      "args": []
    }
  ]
}

Formatted

{
  "functions": [
    {
      "instrs": [
        {"dest": "v0", "op": "const","type": "int","value": 1},
        {"dest": "v1", "op": "const","type": "int","value": 2},
        {"dest": "v2", "op": "add",  "type": "int","args": ["v0","v1"],},
                       "op": "print","args": [ "v2"],}],
      "name": "main",
    }
  ]
}

getting started

links:

step 1 clone the bril repo on a linux or wsl machine

git clone https://github.com/sampsyo/bril.git

step 2 support packages

deno is the runtime for typescript/javascript

curl -fsSL https://deno.land/install.sh | sh

on my ubuntu machine ‘sudo snap install deno’ also worked

you may need to add $HOME/.deno/bin to your $PATH.

flit a python package manager

python3 -m pip install flit

step 3 install the bril interpreter, and the typescript to bril compiler

cd bril
deno install brili.ts 
deno install --allow-env --allow-read ts2bril.ts

running the interpreter

brili <images/add.json

brili -p <images/add.json

the -p flag turns on profiling

text to json and back

There are programs bril2txt and bril2json that make it easy to convert. Keep in mind that the json format is Bril and thats where you will do all the work.

install text tools

cd bril-txt
flit install --symlink --user

run json to text

bril2txt < images/add.json

connect tools via pipes

cat images/add.json'
bril2txt < images/add.json | bril2json

Other tools

There is also a fast interpreter written in Rust see docs for installation

turnt Tiny unified runner and tester

Bril uses turnt as a test tool

Turnt is a simple snapshot testing tool inspired by LLVM’s lit. It’s good for testing things that translate text files to other text files, like compilers. The idea is that each test is one input file, and you want to run a command and check that it still matches the saved output file.

pip install –user turnt

As you think about your projects, you might consider adding a new tool. you can setup Bril on your local linux (can be wsl) machine

Gen CFG

Lets write a sample program - that generates the cfg

How would you do that?

{
  "functions": [
    {
      "name": "main",
      "instrs": [
        { "op": "const", "type": "int", "dest": "v0", "value": 1 },
        { "op": "const", "type": "int", "dest": "v1", "value": 2 },
        { "op": "add", "type": "int", "dest": "v2",
          "args": ["v0", "v1"] },
        { "op": "print", "args": ["v2"] }
      ],
      "args": []
    }
  ]
}

. . .

I’ll do this in two steps

find all the basic blocks
add all the cfg edges

You can also do this in a single step, adding cfg edges as soon as you reach the successor node.

basic blocks from a list of instructions-

keep adding instructions till we get to a terminator or a label (do we add labels?)

. . .

in: list of instrs 
out: list of lists of instrs 
 
blocks = []
curr_block = []
for each instr in list 
    if the instruction is not a label put it on curr_block
    if instr is a label or terminator 
       put curr_block on blocks
       curr_block = []
 
 if curr_block is not empty add it to blocks
 return blocks

two labels in a row do not need another block

step 2 add edges

find cfg: in: is bril program in json 
for each function find the list of basic blocks
for each basic block
   get last_instr 
   if it is a terminator  br/jmp/ret 
     add edge from current block to successor  
    --- what do we want to do with call? 
   else it is a fall through
      add edge to next block

. . .

we need a map (block_map) label->block so we can add edges for blocks that end in br/jmp - can build this while getting the blocks or we can put the label as the first instruction

how do we handle fall through?

what about a return

if every block ends with a terminator, and every block has a label, then no fall through case

what happens if try to delete the terminator (because the block never executes)

code

I’ll use a python data structure called OrderedDict, when you iterate over the items in a ordered dict, they come back in the order that they were installed.

GitHub Copilot says:

OrderedDict in Python is a dictionary subclass that maintains the order in which keys are inserted. When iterating over an OrderedDict, the items are returned in the order they were added. This behavior contrasts with a standard dictionary in Python 3.6 and earlier, where the iteration order was not guaranteed. However, starting from Python 3.7, the built-in dict type also maintains insertion order by default, making OrderedDict less necessary for most applications. OrderedDict still provides additional functionality, such as the move_to_end method, which allows moving an existing key to either end of the dictionary.

I’ll use a generator

In Python, a generator is an iterator that is defined with a function using the yield statement.

Produce items only once
Do not store all the items in memory
When items from the generator are requested, the function executes until it reaches a yield statement, which produces the next value. Execution then pauses, preserving the function’s state, until the next item is requested.

Given a list of Bril instructions, generate a sequence of instruction lists representing the basic blocks in the program.

Every instruction in instr will show up in exactly one block. Jump and branch instructions may only appear at the end of a block, and control can transfer only to the top of a basic block—so labels can only appear at the start of a basic block. Basic blocks may not be empty.

    #Instructions that terminate a basic block.
    TERMINATORS = 'br', 'jmp', 'ret'

    def form_blocks(instrs):
        # Start with an empty block.
        cur_block = []

        for instr in instrs:
            if 'op' in instr:  # It's an instruction.
                # Add the instruction to the currently-being-formed block.
                cur_block.append(instr)

                # If this is a terminator (branching instruction), it's the
                # last instruction in the block. Finish this block and
                # start a new one.
                if instr['op'] in TERMINATORS:
                    yield cur_block
                    cur_block = []
            
            else:  # It's a label.
                # End the block here (if it contains anything).
                if cur_block:
                    yield cur_block

                # Start a new block with the label.
                cur_block = [instr]

        # Produce the final block, if any.
        if cur_block:
            yield cur_block

as a test, lets print out the blocks

def print_blocks(bril):
    """Given a Bril program, print out its basic blocks.
    """


    func = bril['functions'][0]  # We only process one function.
    for block in form_blocks(func['instrs']):
        # Mark the block.
        leader = block[0]
        if 'label' in leader:
            print( f"block {leader['label']}")
            block = block[1:]  # Hide the label
        else:
            print('anonymous block:')

        # Print the instructions.
        for instr in block:
            print(instr)

print_blocks(bril_program)

anonymous block:
{'op': 'const', 'type': 'int', 'dest': 'v0', 'value': 1}
{'op': 'const', 'type': 'int', 'dest': 'v1', 'value': 2}
{'op': 'add', 'type': 'int', 'dest': 'v2', 'args': ['v0', 'v1']}
{'op': 'print', 'args': ['v2']}

this test program has one block so pretty easy

lets try a second example with a jmp

@main {
  v: int = const 4;
  jmp .somewhere;
  v: int = const 2;
.somewhere:
  print v;
}

running commands inside python

GitHub Copilot: subprocess.check_output is a function in Python’s subprocess module that runs a command with arguments and returns its output as a byte string. If the command exits with a non-zero exit status, it raises a CalledProcessError, which includes the exit status and output of the command. This function is useful for capturing the output of a command for further processing in Python.

import subprocess

# Run a command and capture its output
output = subprocess.check_output(['ls', '-l'])

# Convert the byte string to a regular string (assuming UTF-8 encoding)
output_str = output.decode('utf-8')

print(output_str)

total 2288
-rw-r--r-- 1 runner docker  40732 Dec 15 16:54 010_compiler_overview.html
-rw-r--r-- 1 runner docker  11181 Dec 15 16:53 010_compiler_overview.qmd
-rw-r--r-- 1 runner docker  32547 Dec 15 16:54 01a1_performance_measurement.html
-rw-r--r-- 1 runner docker   3018 Dec 15 16:53 01a1_performance_measurement.qmd
drwxr-xr-x 4 runner docker   4096 Dec 15 16:54 01a1_performance_measurement_files
-rw-r--r-- 1 runner docker  24184 Dec 15 16:53 01a2_performance_measurement.html
-rw-r--r-- 1 runner docker   3018 Dec 15 16:53 01a2_performance_measurement.qmd
-rw-r--r-- 1 runner docker  33835 Dec 15 16:53 01a_performance_measurement.html
-rw-r--r-- 1 runner docker   4152 Dec 15 16:53 01a_performance_measurement.qmd
drwxr-xr-x 4 runner docker   4096 Dec 15 16:53 01a_performance_measurement_files
-rw-r--r-- 1 runner docker  40303 Dec 15 16:54 02a_representation.html
-rw-r--r-- 1 runner docker   7431 Dec 15 16:53 02a_representation.qmd
-rw-r--r-- 1 runner docker  13795 Dec 15 16:53 02b_bril.qmd
-rw-r--r-- 1 runner docker  23632 Dec 15 16:54 02b_bril.quarto_ipynb
-rw-r--r-- 1 runner docker  74899 Dec 15 16:53 03_local.html
-rw-r--r-- 1 runner docker  12760 Dec 15 16:53 03_local.qmd
-rw-r--r-- 1 runner docker  13285 Dec 15 16:53 03b_local_value_numbering.qmd
-rw-r--r-- 1 runner docker   9456 Dec 15 16:53 04_data_flow.qmd
-rw-r--r-- 1 runner docker  56505 Dec 15 16:54 05_global.html
-rw-r--r-- 1 runner docker   9992 Dec 15 16:53 05_global.qmd
-rw-r--r-- 1 runner docker   8730 Dec 15 16:53 05b_licm.qmd
-rw-r--r-- 1 runner docker  33395 Dec 15 16:54 05c_pre.html
-rw-r--r-- 1 runner docker   3166 Dec 15 16:53 05c_pre.qmd
-rw-r--r-- 1 runner docker  20202 Dec 15 16:53 06_ssa.qmd
-rwxr-xr-x 1 runner docker  25643 Dec 15 16:52 07_llvm.notebook
-rw-r--r-- 1 runner docker   5707 Dec 15 16:53 08_classic_loop_ops.qmd
-rw-r--r-- 1 runner docker  99019 Dec 15 16:54 09_poly.html
-rwxr-xr-x 1 runner docker  32475 Dec 15 16:53 09_poly.qmd
drwxr-xr-x 4 runner docker   4096 Dec 15 16:54 09_poly_files
-rw-r--r-- 1 runner docker  20936 Dec 15 16:54 100_mlir.html
-rw-r--r-- 1 runner docker    320 Dec 15 16:53 100_mlir.qmd
-rw-r--r-- 1 runner docker  20970 Dec 15 16:54 110_whole_program.html
-rw-r--r-- 1 runner docker    338 Dec 15 16:53 110_whole_program.qmd
-rw-r--r-- 1 runner docker  21028 Dec 15 16:54 12_memory.html
-rw-r--r-- 1 runner docker    334 Dec 15 16:53 12_memory.qmd
-rw-r--r-- 1 runner docker 102297 Dec 15 16:53 13_dynamic_compilers.html
-rw-r--r-- 1 runner docker  31886 Dec 15 16:53 13_dynamic_compilers.qmd
-rw-r--r-- 1 runner docker  35314 Dec 15 16:54 14_gpu_compilers.html
-rw-r--r-- 1 runner docker  10135 Dec 15 16:53 14_gpu_compilers.qmd
-rw-r--r-- 1 runner docker   4682 Dec 15 16:52 a.json
-rw-r--r-- 1 runner docker    248 Dec 15 16:52 a.ts
drwxr-xr-x 2 runner docker   4096 Dec 15 16:52 df
-rw-r--r-- 1 runner docker  71550 Dec 15 16:54 diverg.html
-rw-r--r-- 1 runner docker  27522 Dec 15 16:53 diverg.qmd
-rw-r--r-- 1 runner docker  26721 Dec 15 16:54 diverg1.html
-rw-r--r-- 1 runner docker   5001 Dec 15 16:53 diverg1.qmd
-rw-r--r-- 1 runner docker   4595 Dec 15 16:52 errors
-rw-r--r-- 1 runner docker   3623 Dec 15 16:52 foo.ll
-rw-r--r-- 1 runner docker   2296 Dec 15 16:52 identity.bc
drwxr-xr-x 2 runner docker   4096 Dec 15 16:52 images
-rw-r--r-- 1 runner docker  30615 Dec 15 16:54 junk.html
-rw-r--r-- 1 runner docker   2152 Dec 15 16:53 junk.qmd
-rw-r--r-- 1 runner docker    905 Dec 15 16:52 junk.txt
-rw-r--r-- 1 runner docker  20731 Dec 15 16:53 llvm.qmd
-rw-r--r-- 1 runner docker   7659 Dec 15 16:52 llvm.qmd.next
-rw-r--r-- 1 runner docker  44411 Dec 15 16:54 mem_consistancy.html
-rw-r--r-- 1 runner docker  15878 Dec 15 16:53 mem_consistancy.qmd
-rw-r--r-- 1 runner docker  24125 Dec 15 16:54 mlir.html
-rw-r--r-- 1 runner docker   2278 Dec 15 16:53 mlir.qmd
drwxr-xr-x 2 runner docker   4096 Dec 15 16:52 papers
-rwxr-xr-x 1 runner docker  30724 Dec 15 16:53 poly_final.qmd
drwxr-xr-x 2 runner docker   4096 Dec 15 16:52 pytorch
drwxr-xr-x 2 runner docker   4096 Dec 15 16:52 ra
-rw-r--r-- 1 runner docker  41702 Dec 15 16:54 ra-checking.html
-rw-r--r-- 1 runner docker  15172 Dec 15 16:53 ra-checking.qmd
-rw-r--r-- 1 runner docker   8032 Dec 15 16:53 register_allocation.qmd
-rw-r--r-- 1 runner docker  40546 Dec 15 16:54 revealjs_010_compiler_overview.qmd.html
-rw-r--r-- 1 runner docker  33435 Dec 15 16:54 revealjs_01a1_performance_measurement.qmd.html
-rw-r--r-- 1 runner docker  24507 Dec 15 16:53 revealjs_01a2_performance_measurement.qmd.html
-rw-r--r-- 1 runner docker  34770 Dec 15 16:53 revealjs_01a_performance_measurement.qmd.html
-rw-r--r-- 1 runner docker  39260 Dec 15 16:54 revealjs_02a_representation.qmd.html
-rw-r--r-- 1 runner docker  73582 Dec 15 16:53 revealjs_03_local.qmd.html
-rw-r--r-- 1 runner docker  51948 Dec 15 16:54 revealjs_05_global.qmd.html
-rw-r--r-- 1 runner docker  32916 Dec 15 16:54 revealjs_05c_pre.qmd.html
-rw-r--r-- 1 runner docker  86492 Dec 15 16:54 revealjs_09_poly.qmd.html
-rw-r--r-- 1 runner docker  21418 Dec 15 16:54 revealjs_100_mlir.qmd.html
-rw-r--r-- 1 runner docker  21445 Dec 15 16:54 revealjs_110_whole_program.qmd.html
-rw-r--r-- 1 runner docker  21468 Dec 15 16:54 revealjs_12_memory.qmd.html
-rw-r--r-- 1 runner docker  87753 Dec 15 16:53 revealjs_13_dynamic_compilers.qmd.html
-rw-r--r-- 1 runner docker  34449 Dec 15 16:54 revealjs_14_gpu_compilers.qmd.html
-rw-r--r-- 1 runner docker  70744 Dec 15 16:54 revealjs_diverg.qmd.html
-rw-r--r-- 1 runner docker  31673 Dec 15 16:54 revealjs_diverg1.qmd.html
-rw-r--r-- 1 runner docker  30291 Dec 15 16:54 revealjs_junk.qmd.html
-rw-r--r-- 1 runner docker  44717 Dec 15 16:54 revealjs_mem_consistancy.qmd.html
-rw-r--r-- 1 runner docker  28411 Dec 15 16:54 revealjs_mlir.qmd.html
-rw-r--r-- 1 runner docker  40196 Dec 15 16:54 revealjs_ra-checking.qmd.html
-rw-r--r-- 1 runner docker  22316 Dec 15 16:52 xx

print the blocks

import json 
import os
import subprocess

result =  subprocess.check_output('bril2json < images/jmp.bril', shell=True)
test2json = json.loads(result)
print_blocks(test2json)

anonymous block:
{'dest': 'v', 'op': 'const', 'type': 'int', 'value': 4}
{'labels': ['somewhere'], 'op': 'jmp'}
anonymous block:
{'dest': 'v', 'op': 'const', 'type': 'int', 'value': 2}
block somewhere
{'args': ['v'], 'op': 'print'}

the map (label names to blocks)

from collections import OrderedDict


def block_map(blocks):
    """Given a sequence of basic blocks, which are lists of instructions,
    produce a `OrderedDict` mapping names to blocks.

    The name of the block comes from the label it starts with, if any.
    Anonymous blocks, which don't start with a label, get an
    automatically generated name. Blocks in the mapping have their
    labels removed.
    """
    by_name = OrderedDict()

    for block in blocks:
        # Generate a name for the block.
        if 'label' in block[0]:
            # The block has a label. Remove the label but use it for the
            # block's name.
            name = block[0]['label']
            block = block[1:]
        else:
            # Make up a new name for this anonymous block.
            name = f'gen_bk_{len(by_name)}'

        # Add the block to the mapping.
        by_name[name] = block

    return by_name


blks = form_blocks(test2json['functions'][0]['instrs'])
od = block_map(blks)
for (name, instrs) in od.items():
    print (name, instrs)

gen_bk_0 [{'dest': 'v', 'op': 'const', 'type': 'int', 'value': 4}, {'labels': ['somewhere'], 'op': 'jmp'}]
gen_bk_1 [{'dest': 'v', 'op': 'const', 'type': 'int', 'value': 2}]
somewhere [{'args': ['v'], 'op': 'print'}]

the cfg given the block map (pseudo code)

out cfg = {} 
# map label -> list of labels the successors of the block

for i,block in enumerate(blocks)  # blocks is a ordereddict 
    last = block[i]  # last instruction
    if last is jmp:
        cfg[block_name] = jmp.dest
    elif last is br:
        cfg[block.name] = [ last.if_label, last.else_label]
    else
        # fall through
        cfg[block_name] = blocks[i+1].name  ## special case for last block

cfg

def get_cfg(ordered_blocks):
    cfg = {}

    labels = list(ordered_blocks.keys())

    for i, (block_name, block) in enumerate(ordered_blocks.items()):
        last = block[-1]
        op = last['op']

        if op == 'jmp':
            cfg[block_name] = last['labels']
        elif op == 'br':
            cfg[block_name] = last['labels']
        else:
            if i+1 < len(labels):  # last block does not fall through
                cfg[block_name] = [labels[i+1]]
            else:
                cfg[block_name] = []
    return cfg


blks = form_blocks(test2json['functions'][0]['instrs'])
od = block_map(blks)
cfg = get_cfg(od)

print(cfg)

{'gen_bk_0': ['somewhere'], 'gen_bk_1': ['somewhere'], 'somewhere': []}

graph

%%{init: {"flowchart": {"htmlLabels": false}} }%%

graph TD
gen_bk_0--> somewhere
gen_bk_1 --> somewhere

%%{init: {"flowchart": {"htmlLabels": false}} }%%

graph TD
gen_bk_0--> somewhere
gen_bk_1 --> somewhere

homework

Due in 1 week

Your goal is to get familiar with Bril.

Part 1

Write a new benchmark.

You can write it by hand, use the TypeScript compiler, or generate it some other way. Try running it with brili.

Use turnt –save yours.bril to create the test outputs for your new benchmark. (See the Turnt README for details.)

Start your blog post, talking about your benchmark.

part 2

Write a program to analyze or transform Bril programs in some small way. Pick your favorite programming language—there is no “starter code,” so you can start from scratch.

Load up a JSON file. You can start with the tiny ones in lectures/images! Read the docs.

Do something unambitious with it: count the number of add instructions, or add a print instruction before every jump, or whatever. Pick something small and contrived! Use Turnt to test your new tool.

Along the way, you will run into problems! Ask questions on github discussions, use open issues and pull requests to describe or fix problems. For example, even super simple benchmarks you might imagine probably can’t be written easily because Bril is too simple. Mention this in discussions, and consider pitching in to help add features.

Think about how to write a good test, and add to your post describing your work, submit the post on github, and finally add a link to the post in canvas, homework 1

Other Formats

How to use Bril with real code

Lets look at a bril program.

Formatted

getting started

step 1 clone the bril repo on a linux or wsl machine

step 2 support packages

step 3 install the bril interpreter, and the typescript to bril compiler

running the interpreter

text to json and back

connect tools via pipes

Other tools

turnt Tiny unified runner and tester

Gen CFG

basic blocks from a list of instructions-

step 2 add edges

code

I’ll use a generator

as a test, lets print out the blocks

lets try a second example with a jmp

running commands inside python

print the blocks

the map (label names to blocks)

the cfg given the block map (pseudo code)

cfg

graph

homework