Runtime

Packed ternary execution path

The current runtime targets the environment where the work has already been validated: Linux, NVIDIA, and Triton-backed execution. It is separate from both the research notebooks and the model logic itself.

The runtime path loads student_pruned, applies the recovered packed adapter, then swaps the targeted MLP base layers with the packed ternary execution path.
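To make the swap step concrete, here is a minimal pure-Python sketch of what replacing targeted layers with a packed ternary path can look like. Everything in it is an assumption made for illustration: the 2-bit encoding, the `DenseLinear` and `PackedTernaryLinear` classes, the `swap_targeted_layers` helper, and the layer names are hypothetical, not the runtime's actual format or API. The adapter step is left out because the text does not describe its form, and the real path runs on GPU kernels rather than Python loops.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical 2-bit codes for ternary weights. The real packed layout is not
# described in the text; this encoding is an assumption for illustration.
_ENCODE = {-1: 0b10, 0: 0b00, 1: 0b01}
_DECODE = {code: weight for weight, code in _ENCODE.items()}


def pack_ternary(weights: List[int]) -> bytes:
    """Pack weights drawn from {-1, 0, +1} into bytes, four weights per byte."""
    packed = bytearray((len(weights) + 3) // 4)
    for i, w in enumerate(weights):
        packed[i // 4] |= _ENCODE[w] << (2 * (i % 4))  # KeyError if w is not ternary
    return bytes(packed)


def unpack_ternary(packed: bytes, count: int) -> List[int]:
    """Recover the first `count` ternary weights from a packed byte string."""
    return [_DECODE[(packed[i // 4] >> (2 * (i % 4))) & 0b11] for i in range(count)]


@dataclass
class DenseLinear:
    """Stand-in for an MLP base layer: a row-major matrix of ternary weights."""
    rows: List[List[int]]

    def forward(self, x: List[float]) -> List[float]:
        return [sum(w * v for w, v in zip(row, x)) for row in self.rows]


@dataclass
class PackedTernaryLinear:
    """Same linear map as DenseLinear, with weights stored at 2 bits each."""
    packed_rows: List[bytes]
    in_features: int

    @classmethod
    def from_dense(cls, layer: DenseLinear) -> "PackedTernaryLinear":
        in_features = len(layer.rows[0])
        return cls([pack_ternary(row) for row in layer.rows], in_features)

    def forward(self, x: List[float]) -> List[float]:
        out = []
        for packed in self.packed_rows:
            acc = 0.0
            # Ternary weights need no multiplies: add, subtract, or skip.
            for w, v in zip(unpack_ternary(packed, self.in_features), x):
                if w == 1:
                    acc += v
                elif w == -1:
                    acc -= v
            out.append(acc)
        return out


def swap_targeted_layers(model: Dict[str, DenseLinear], targets: List[str]) -> Dict[str, object]:
    """Return a copy of `model` with only the named layers replaced by packed ones."""
    swapped: Dict[str, object] = dict(model)
    for name in targets:
        swapped[name] = PackedTernaryLinear.from_dense(model[name])
    return swapped


# Toy "model": one targeted MLP layer and one layer that is left untouched.
model = {
    "mlp.up": DenseLinear([[1, -1, 0], [0, 1, 1]]),
    "attn.q": DenseLinear([[1, 0, 0]]),
}
runtime_model = swap_targeted_layers(model, ["mlp.up"])
```

In this sketch the swapped layer computes the same outputs as the layer it replaces, while layers outside the target list are passed through unchanged. That mirrors the idea in the paragraph above: the swap changes how the targeted layers execute, not what the model computes.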

This matters because the model and the runtime are not the same thing. The model logic can carry over to other targets; the current production runtime is one concrete target.

The site and bundle are therefore structured around a deployable package, not around a single checkpoint file pretending to be a universal format.

Figure: RPT dog technical diagram