Perplexity

Depth-scaling experiments showed N-vium perplexity equal to or better than matched dense baselines. After supervised fine-tuning, N-vium improved perplexity over dense baselines across tested depths. IsoFLOP and isoSpeed comparisons favored Quadrivium over dense baselines trained for more steps. The reported mixture perplexity was much lower than any individual exit perplexity in a 24-layer Quadrivium analysis.
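
A mixture perplexity below every individual exit perplexity can look surprising, so here is a minimal toy sketch of how it can arise. It assumes the mixture is a fixed convex combination of the per-token probabilities that each exit assigns to the correct token; the source does not state how the exits are actually combined, and the probabilities, weights, and function names below are made up for illustration and are not reported results.

```python
import math

def perplexity(token_probs):
    """Perplexity: exp of the mean negative log-probability of the target tokens."""
    return math.exp(-sum(math.log(p) for p in token_probs) / len(token_probs))

def mixture_probs(exit_probs, weights):
    """Per-token target probability under a fixed convex combination of exits.

    Assumption: this mixing rule is hypothetical, not taken from the source.
    """
    n_tokens = len(exit_probs[0])
    return [
        sum(w * probs[t] for w, probs in zip(weights, exit_probs))
        for t in range(n_tokens)
    ]

# Toy data (invented): two exits that are confident on complementary tokens.
exit_probs = [
    [0.9, 0.1, 0.9, 0.1],  # hypothetical exit A
    [0.1, 0.9, 0.1, 0.9],  # hypothetical exit B
]
weights = [0.5, 0.5]  # hypothetical uniform mixture weights

exit_ppl = [perplexity(p) for p in exit_probs]
mix_ppl = perplexity(mixture_probs(exit_probs, weights))

print("per-exit perplexity:", exit_ppl)
print("mixture perplexity:", mix_ppl)
```

In this toy case each exit is badly wrong on half of the tokens, so its own perplexity is high, while the mixture never assigns the target less than 0.5 and ends up lower than both. When exits err on the same tokens instead, the mixture cannot beat the best exit by much, so the size of the gap is a rough indicator of how complementary the exits are.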