Benchmarks

What was actually proven

These numbers are the stable project claims we should carry into public docs: the original local baseline, the improved dense checkpoint, and the final packed recovered runtime. All three scores come from the same local suite.
| Variant | Score | Meaning |
| --- | --- | --- |
| HF Gemma 2B original | 0.90 | Reference local benchmark result |
| student_pruned | 0.92 | Dense improved checkpoint after our conversion and pruning path |
| Packed recovered runtime | 1.00 | Final Triton-backed packed path on the same local suite |
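
When quoting these results elsewhere, it can help to state each variant's change against the baseline rather than the raw score alone. The sketch below is only an illustration of that arithmetic: the scores are copied from the table, while the `scores` dictionary, the `BASELINE` constant, and the `deltas_vs_baseline` helper are hypothetical names that are not part of the project. It assumes all scores are on the same scale and that higher is better.

```python
# Scores copied from the table above; the keys are illustrative labels only.
scores = {
    "HF Gemma 2B original": 0.90,
    "student_pruned": 0.92,
    "Packed recovered runtime": 1.00,
}

BASELINE = "HF Gemma 2B original"


def deltas_vs_baseline(scores, baseline):
    """Return {variant: (absolute_delta, relative_delta)} against the baseline."""
    base = scores[baseline]
    return {
        name: (round(score - base, 4), round((score - base) / base, 4))
        for name, score in scores.items()
        if name != baseline  # the baseline has no delta against itself
    }


for name, (abs_delta, rel_delta) in deltas_vs_baseline(scores, BASELINE).items():
    print(f"{name}: {abs_delta:+.2f} absolute, {rel_delta:+.1%} relative")
```

Rounding to four decimals keeps floating-point noise out of the reported figures; the underlying scores are only given to two decimals, so finer precision would not be meaningful.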