Infoprop Dyna

1 source - 4 claims

Infoprop Dyna is designed to limit exploitation of errors in learned dynamics models during synthetic rollouts. Before its use in autonomous racing, it had achieved state-of-the-art performance on simulated MuJoCo benchmarks. The method was selected for the racing task because it tracks accumulated model uncertainty and thereby mitigates exploitation of inaccurate model predictions. In that setting, Infoprop Dyna improved the autonomous racing policy using both real robot interactions and learned-model rollouts.
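
The idea of tracking accumulated model uncertainty during a synthetic rollout can be illustrated with a minimal sketch. This is not Infoprop Dyna's actual algorithm: the source does not describe its uncertainty measure or stopping rule, so the sketch swaps in ensemble disagreement (variance across model predictions) and a fixed uncertainty budget. The names `ensemble_predict`, `uncertainty_limited_rollout`, and `budget`, as well as the one-dimensional linear models, are all hypothetical and chosen only for illustration.

```python
import statistics


def ensemble_predict(models, state, action):
    """Return each hypothetical ensemble member's next-state prediction.

    Each model is an assumed 1-D linear map (a, b): next = a * state + b * action.
    """
    return [a * state + b * action for (a, b) in models]


def uncertainty_limited_rollout(models, policy, start_state, horizon, budget):
    """Roll out the ensemble mean, stopping once accumulated disagreement
    exceeds `budget`. Assumed stand-in for uncertainty tracking, not the
    Infoprop mechanism itself.
    """
    state = start_state
    accumulated = 0.0
    transitions = []
    for _ in range(horizon):
        action = policy(state)
        predictions = ensemble_predict(models, state, action)
        # Disagreement between ensemble members serves as per-step uncertainty.
        accumulated += statistics.pvariance(predictions)
        if accumulated > budget:
            # Discard this step: the model is no longer trusted here.
            break
        next_state = statistics.fmean(predictions)
        transitions.append((state, action, next_state))
        state = next_state
    # `accumulated` includes the step that triggered the stop, if any.
    return transitions, accumulated


# Usage: models that agree allow a full rollout; models that disagree cut it short.
agreeing = [(0.9, 0.1)] * 3
disagreeing = [(0.5, 0.0), (1.0, 0.0), (1.5, 0.0)]
policy = lambda s: -s

full, _ = uncertainty_limited_rollout(agreeing, policy, 1.0, horizon=5, budget=0.01)
short, total = uncertainty_limited_rollout(disagreeing, policy, 1.0, horizon=5, budget=0.2)
print(len(full), len(short))
```

In a Dyna-style loop, only the retained synthetic transitions would be mixed with real robot interactions for policy updates, so a policy cannot profit from regions where the learned model is unreliable. How the budget is set, and what replaces ensemble variance, are design choices this sketch leaves open.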