LunarLander-v3

The paper used LunarLander-v3 as a classical validation benchmark for the replay methods. On this benchmark, ReaPER+ achieved a normalized cumulative-return AUC advantage over both PER and ReaPER, and it reached the first solve earlier than either of them.
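
The two metrics named above can be illustrated with a minimal sketch. The paper's exact definitions are not given here, so everything below is an assumption: the AUC is taken as the trapezoidal area under a min-max-normalized per-episode return curve, and "first solve" as the first episode whose moving-average return reaches 200, the score that the Gymnasium documentation treats as a solution for LunarLander. The function names, the normalization bounds, and the toy curves are hypothetical and are not results from the paper.

```python
import numpy as np

# Assumed solve score (Gymnasium's documented threshold for LunarLander);
# the paper's own criterion may differ.
SOLVE_THRESHOLD = 200.0


def normalized_auc(returns, low, high):
    """Area under the return curve after scaling returns to [0, 1].

    `low` and `high` are assumed normalization bounds. The area uses the
    trapezoid rule with unit spacing and is divided by the number of
    intervals, so a curve that sits at `high` throughout scores 1.0.
    """
    r = np.asarray(returns, dtype=float)
    scaled = (r - low) / (high - low)
    area = np.sum((scaled[1:] + scaled[:-1]) / 2.0)
    return float(area / (len(r) - 1))


def first_solve(returns, threshold=SOLVE_THRESHOLD, window=1):
    """Index of the first episode whose trailing `window`-episode mean
    return is at least `threshold`, or None if that never happens."""
    r = np.asarray(returns, dtype=float)
    if len(r) < window:
        return None
    means = np.convolve(r, np.ones(window) / window, mode="valid")
    hits = means >= threshold
    if not hits.any():
        return None
    # means[j] covers episodes j .. j + window - 1, so report the last one.
    return int(np.argmax(hits)) + window - 1


# Toy curves (made up for illustration, not data from the paper).
fast_learner = [-200.0, 0.0, 200.0, 200.0]
slow_learner = [-200.0, -100.0, 0.0, 200.0]

auc_gap = normalized_auc(fast_learner, -200.0, 200.0) - normalized_auc(
    slow_learner, -200.0, 200.0
)
print("AUC advantage:", auc_gap)
print("first solve:", first_solve(fast_learner), first_solve(slow_learner))
```

Under these assumptions, a method with an AUC advantage accumulates more normalized return over the whole training run, while an earlier first solve only says when the threshold was first crossed. The two can disagree, which is presumably why both are reported.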