Latency Evaluation
With the OpenAI Realtime API, AsyncIO reduced latency on both HotpotQA and TinyAgent at the cost of small accuracy drops; on TinyAgent, latency fell from 7.6 seconds to 4.4 seconds while accuracy decreased. Across open-source models and tasks, Async-SFT kept accuracy near the synchronous baseline while reducing latency by a factor of 1.6 to 2.2. Two caveats apply: latency is partly simulated rather than directly measured in every setting, and the paper reports no inferential uncertainty (such as confidence intervals or significance tests) for the latency and accuracy differences.
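To put the TinyAgent figures on the same scale as the "1.6 to 2.2 times" range, the two reported latencies can be converted into a speedup factor and a relative reduction. The sketch below is illustrative arithmetic only; the function names `speedup` and `relative_reduction` are hypothetical helpers, not anything from the paper, and the only inputs are the 7.6 s and 4.4 s values quoted above.

```python
def speedup(baseline_s: float, new_s: float) -> float:
    """Return how many times faster the new latency is than the baseline."""
    if baseline_s <= 0 or new_s <= 0:
        raise ValueError("latencies must be positive")
    return baseline_s / new_s


def relative_reduction(baseline_s: float, new_s: float) -> float:
    """Return the fraction of baseline latency that was removed."""
    if baseline_s <= 0 or new_s <= 0:
        raise ValueError("latencies must be positive")
    return (baseline_s - new_s) / baseline_s


# Latencies quoted in the summary for TinyAgent with the OpenAI Realtime API.
baseline, asyncio_latency = 7.6, 4.4

print(f"speedup: {speedup(baseline, asyncio_latency):.2f}x")               # speedup: 1.73x
print(f"reduction: {relative_reduction(baseline, asyncio_latency):.1%}")   # reduction: 42.1%
```

Under this reading, the Realtime API result on TinyAgent corresponds to roughly a 1.73x speedup, or about 42% less latency. The comparison with the Async-SFT range is loose, since that range refers to a different setting (open-source models), and because no uncertainty is reported, these point values should not be treated as precise.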