Adaptive Control Ablations
PS-ada produced bucket-level rerollout pass rates near 0.5 across the 1/8, 2/8, 6/8, and 7/8 buckets. Adaptive control kept rerollout pass rates closer to the 50% target. By contrast, hard-only adaptation controlled the hard buckets but ignored the easy ones, and fixed prefix replay left the 1/8 bucket too hard and the 7/8 bucket too easy. Replay produced the largest raw reduction in convergence steps in the 4B math ablations. Having more valid groups per step did not by itself determine performance.
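The contrast between full adaptive control and hard-only adaptation can be illustrated with a minimal sketch. It is not the PS-ada implementation, which the summary does not describe. The names `BucketController`, `prefix_frac`, `step`, and `adapt` are hypothetical, and the sketch assumes that revealing a longer prefix makes a problem easier. Only the 0.5 target and the bucket labels come from the text.

```python
from dataclasses import dataclass

TARGET = 0.5  # the 50% rerollout pass-rate target named in the text


@dataclass
class BucketController:
    """Hypothetical per-bucket controller (not the source's method)."""
    prefix_frac: float = 0.5  # assumed: fraction of a reference solution shown as prefix
    step: float = 0.1         # assumed: proportional adjustment gain

    def update(self, pass_rate: float) -> float:
        # Below target: too hard at this prefix, so reveal more.
        # Above target: too easy, so reveal less.
        self.prefix_frac += self.step * (TARGET - pass_rate)
        # Keep the fraction in [0, 1].
        self.prefix_frac = min(1.0, max(0.0, self.prefix_frac))
        return self.prefix_frac


def adapt(controllers, pass_rates, hard_only=False):
    """Update each bucket's controller from its observed rerollout pass rate.

    Buckets are keyed by k, meaning an original pass rate of k/8.
    With hard_only=True, buckets with k >= 4 are skipped, mimicking
    a scheme that never adjusts the easy buckets.
    """
    for k, rate in pass_rates.items():
        if hard_only and k >= 4:
            continue
        controllers[k].update(rate)
    return controllers


if __name__ == "__main__":
    ctrl = {k: BucketController() for k in (1, 2, 6, 7)}
    # Made-up rerollout pass rates, for illustration only.
    adapt(ctrl, {1: 0.2, 2: 0.4, 6: 0.7, 7: 0.9})
    for k, c in ctrl.items():
        print(f"{k}/8 bucket -> prefix_frac {c.prefix_frac:.3f}")
```

Under these assumptions, a hard-only run leaves the 6/8 and 7/8 controllers at their initial value, which matches the summary's point that easy buckets go uncontrolled in that variant.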