homework 7 dynamic compile
This task is to implement a trace-based speculative optimizer for Bril. You’ll implement the same concept as in a tracing JIT, but in a profile-guided AOT setting: profiling, transformation, and execution will be distinct phases. The idea is to implement the “heavy lifting” for a trace-based JIT without needing all the scaffolding that a complete JIT requires, such as on-stack replacement.
Concretely, there are three main phases:
- Modify the reference interpreter to produce traces.
- Build an optimizer that injects traces back into the original program using the speculation extension to provide a “fast path.”
- Check that the whole process is correct and had some effect on performance (it needn’t actually be good!).
Start by reading the documentation for the speculation extension (and watch the video!). That should give you an idea of what’s required to augment a program with a speculative execution of an extracted trace. Then make a plan for how you’ll hack the interpreter to produce one of those traces.
Here’s a recipe:
- Start interpreting normally.
- At some point during execution (at the very beginning of main, for example, or when reaching a backedge), start tracing.
- While tracing, record every instruction as it executes. Eliminate jumps; replace branches with guard instructions. Feel free to do the interprocedural version, and to bail out on any other instruction you don’t want to handle.
- Stop tracing at some point (after a fixed number of instructions, for example, or at the next backedge) and save the trace to a file.
- For bonus “points,” statically optimize the trace by eliminating instructions that depend on foregone conclusions enforced by guards.
- Transform the program to stitch the trace back into the program using speculate and commit instructions.
- For these tasks, unlike some previous lessons, I recommend not attempting to support all the benchmarks. It’s more important that you understand a few programs well than you apply your transformation to a large body of code. (In other words, I recommend that you work depth-first instead of breadth-first.)
In particular, you do not need to support Bril’s memory extension, which makes things more complicated because it doesn’t get automatically rolled back on speculation aborts. If you are feeling very ambitious, you can try devising a scheme to manually roll back memory modifications on aborts (consider an “undo log” or “redo log,” which are concepts from databases).
Finally, evaluate your work:
Check that you didn’t break programs. For at least one benchmark (and ideally a few), create multiple inputs to the program that result in different outputs. Use one input to generate the trace and optimize the program, and use other inputs to check correctness. This approach guards against cases where your tracing optimization “overfits” and you end up with code that only works on one input. Measure performance impact, i.e., the effect of your transformation on the dynamic instruction count.
for at least one benchmark, use at least two inputs to evaluate tracing’s impact on executions that are not identical to the traced execution. If you implemented optimizations on the traced code, consider comparing the optimized vs. un-optimized versions. (It’s OK if your tracing apparatus makes programs slower, especially on unseen inputs! We just want to measure the difference.)