Memory Budget Ablations

CATS adapts LSV to the available DRAM and improves speed at memory budgets of 2 GB, 6 GB, and 8 GB. At the 2 GB budget, CATS with LDM=3 and LSV=5 reached 0.329 tokens/s, a 2.82x speedup. CATS achieved acceptance similar to or better than that of deep Kangaroo while reducing bytes per token and improving wall-clock speed. Deepening Kangaroo's draft model, by contrast, improves acceptance but worsens both bytes per token and speed. The default drafting horizon of gamma-bar = 5 captures most of the acceptance benefit before memory traffic and loop latency rise.
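
The arithmetic behind two of these claims can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the paper's evaluation code. It assumes the reported speedup is the ratio of CATS throughput to a baseline throughput, and it uses the standard speculative-decoding estimate of expected tokens per verification step, (1 - a^(g+1)) / (1 - a), which holds only if each drafted token is accepted independently with the same probability a. The function names and the acceptance rate of 0.7 are hypothetical and do not come from the source.

```python
def implied_baseline(throughput: float, speedup: float) -> float:
    """Baseline tokens/s implied by a throughput and its speedup.

    Assumes speedup = throughput / baseline (an assumption, not stated
    in the source).
    """
    return throughput / speedup


def expected_tokens_per_step(alpha: float, gamma: int) -> float:
    """Expected tokens emitted per verification step.

    Assumes each of the gamma drafted tokens is accepted independently
    with probability alpha (a simplifying assumption).
    """
    if alpha >= 1.0:
        return float(gamma + 1)
    return (1.0 - alpha ** (gamma + 1)) / (1.0 - alpha)


if __name__ == "__main__":
    # Figures from the 2 GB result: 0.329 tokens/s at a 2.82x speedup.
    print(f"implied baseline: {implied_baseline(0.329, 2.82):.3f} tokens/s")

    # Hypothetical acceptance rate, chosen only to show the shape of the curve.
    alpha = 0.7
    previous = expected_tokens_per_step(alpha, 0)
    for gamma in range(1, 9):
        current = expected_tokens_per_step(alpha, gamma)
        # The marginal gain shrinks by a factor of alpha at every step.
        print(f"gamma={gamma}: {current:.3f} tokens/step (+{current - previous:.3f})")
        previous = current
```

Under the ratio assumption, the 2 GB figures imply a baseline of about 0.117 tokens/s. In the toy acceptance model, each additional drafted token adds alpha times the gain of the previous one, so the benefit of a longer horizon flattens quickly while the drafting cost per step keeps growing, which is consistent with a moderate default horizon.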