Response Time
All four models completed the 240-question assessment in under three minutes. Gemini 3 Pro had the best accuracy-speed profile in the study. GPT-5.2 was the slowest model and showed the greatest overall score instability. Human candidates are allocated 4.5 hours (270 minutes) for the same exam, so every model finished more than 90 times faster than that allowance.