LunarLander-v3
1 sources - 3 claims
The paper used LunarLander-v3 as a classical validation benchmark for the replay methods. On this environment, ReaPER+ achieved a normalized cumulative-return AUC advantage over both PER and ReaPER, and it reached its first solve earlier than either of them.
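The two metrics named above can be illustrated with a minimal sketch. The definitions here are assumptions, not the paper's: `normalized_auc` takes one plausible reading of a normalized AUC (trapezoidal area under a min-max scaled per-episode return curve, divided by curve length), and `first_solve` treats "solve" as the first episode whose trailing-window mean return reaches 200, the score at which the Gymnasium documentation considers a LunarLander episode a solution. The function names, the `low`/`high` bounds, and the `window` parameter are hypothetical.

```python
from typing import Optional, Sequence

# Assumption: 200 is the solve threshold, following the Gymnasium LunarLander docs.
SOLVE_THRESHOLD = 200.0


def normalized_auc(returns: Sequence[float], low: float, high: float) -> float:
    """Trapezoidal area under the min-max scaled return curve, per unit length.

    `low` and `high` are hypothetical normalization bounds chosen by the caller.
    """
    if len(returns) < 2:
        raise ValueError("need at least two episodes to integrate")
    if high <= low:
        raise ValueError("high must be greater than low")
    scaled = [(r - low) / (high - low) for r in returns]
    # Sum of trapezoids between consecutive episodes (unit spacing).
    area = sum((a + b) / 2.0 for a, b in zip(scaled, scaled[1:]))
    return area / (len(scaled) - 1)


def first_solve(
    returns: Sequence[float],
    threshold: float = SOLVE_THRESHOLD,
    window: int = 1,
) -> Optional[int]:
    """0-based index of the first episode whose trailing-window mean return
    reaches `threshold`, or None if that never happens."""
    if window < 1:
        raise ValueError("window must be at least 1")
    for i in range(window - 1, len(returns)):
        chunk = returns[i - window + 1 : i + 1]
        if sum(chunk) / window >= threshold:
            return i
    return None
```

Under these assumed definitions, a method with a larger `normalized_auc` and a smaller `first_solve` index on the same episode budget would match the kind of advantage described above. The paper's exact normalization and solve criterion may differ.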