Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

IR Design Philosophy

Morok uses a single unified IR (UOp) for all compilation stages, following Tinygrad’s design philosophy.

The UOp Structure

#![allow(unused)]
fn main() {
pub struct UOp {
    pub id: u64,                    // Unique stable ID
    pub(crate) op: Op,              // Operation (Rust tagged union)
    pub(crate) dtype: DType,        // Data type
    // Cached properties (computed lazily)
    pub(crate) shape_cache: OnceCell<...>,
    pub(crate) vmin_vmax_cache: OnceCell<...>,
}
}

Key fields:

  • op: An Op enum with 80+ operations spanning all abstraction levels
  • dtype: Type information (scalars, vectors, pointers)
  • id: Unique identifier for caching and provenance tracking

Why One IR?

Traditional ML compilers use multiple IRs:

  • TensorFlow: Graph → XLA HLO → MLIR → LLVM IR
  • PyTorch: Python AST → TorchScript → FX Graph → Inductor IR → Triton
  • JAX: Python → Jaxpr → StableHLO → MHLO → platform IR

Morok/Tinygrad uses one IR that represents all abstraction levels:

Same UOp at Different Stages

# High-level (after tensor ops):
BUFFER.reshape([2,3]).reduce(ADD, axis=0)

# Mid-level (after rangeify):
RANGE(2) -> RANGE(3) -> LOAD -> REDUCE -> STORE -> END

# Low-level (after codegen passes):
DEFINE_GLOBAL -> INDEX -> LOAD -> ADD -> STORE

Enabling Factors

  1. Graph Rewriting as Universal Mechanism

    Pattern matching + rewriting handles all transformations:

    #![allow(unused)]
    fn main() {
    let optimized = graph_rewrite(&patterns, uop_graph, &mut ctx);
    }
  2. Hash Consing (Structural Sharing)

    Identical subgraphs share memory via a thread-local cache:

    #![allow(unused)]
    fn main() {
    thread_local! {
        static CACHE: RefCell<HashMap<UOpKey, Weak<UOp>>> = ...;
    }
    }

    Benefits:

    • O(1) equality checking (pointer comparison)
    • No duplicate subgraphs in memory
    • Pattern matching can use pointer identity
  3. Lazy Property Computation

    Expensive analyses computed once and cached:

    #![allow(unused)]
    fn main() {
    pub(crate) shape_cache: OnceCell<Result<Option<Shape>>>,
    pub(crate) vmin_vmax_cache: OnceCell<(ConstValue, ConstValue)>,
    }
  4. Operation Hierarchy

    Ops organized by level to support progressive lowering:

    #![allow(unused)]
    fn main() {
    impl Op {
        pub fn is_movement(&self) -> bool { ... }
        pub fn is_buffer(&self) -> bool { ... }
        pub fn is_alu(&self) -> bool { ... }
    }
    }

Trade-offs

AspectSingle IRMulti-IR
ComplexityLowerHigher
Translation bugsNonePossible
Cross-level optimizationNaturalRequires bridging
Compile-time safetyRuntime checksPer-IR guarantees
Codebase size~15k lines100k+ lines

Morok vs Tinygrad UOp

AspectTinygradMorok
LanguagePython dataclassRust struct
Childrensrc: tuple[UOp, ...]Encoded in Op variants
Type safetyRuntimeCompile-time
Extra dataarg: Any (untyped)Typed per variant
MemoryWeakref + GCRc<UOp> + explicit cleanup

Morok’s Rust implementation adds compile-time guarantees:

#![allow(unused)]
fn main() {
// Each Op variant encodes its exact structure
Op::Binary(BinaryOp::Add, lhs, rhs)  // vs Tinygrad's (Ops.ADD, (lhs, rhs))
Op::Reduce { src, ranges, reduce_op } // Named fields, typed
}