Learning-Rate Clocks

In discounted reinforcement learning, no divergence counterexamples are known to arise from the distinction between local and global clocks under standard assumptions. Classical convergence proofs often index the step size by the number of visits to each state (a local clock), whereas practitioners more commonly index it by elapsed time (a global clock). In average-reward reinforcement learning, by contrast, switching from local clocks to global clocks can affect stability.
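
To make the two indexing schemes concrete, here is a minimal sketch of tabular TD(0) in the discounted setting with a switchable clock. The names `StepSizeClock` and `td0`, the 1/n step-size schedule, and the toy trajectory are all assumptions made for illustration; they do not come from the source. The sketch only shows how the step sizes differ and makes no attempt to reproduce the stability effects mentioned for the average-reward case.

```python
from collections import defaultdict


class StepSizeClock:
    """Hypothetical helper: tracks a global counter and per-state visit counters."""

    def __init__(self, mode="local"):
        if mode not in ("local", "global"):
            raise ValueError("mode must be 'local' or 'global'")
        self.mode = mode
        self.t = 0                      # global clock: total updates so far
        self.visits = defaultdict(int)  # local clocks: updates per state

    def step_size(self, state):
        """Advance both clocks and return 1/n for the selected clock.

        The 1/n schedule is an illustrative assumption, not a recommendation.
        """
        self.t += 1
        self.visits[state] += 1
        n = self.visits[state] if self.mode == "local" else self.t
        return 1.0 / n


def td0(trajectory, mode, gamma=0.9):
    """Tabular TD(0) over a list of (state, reward, next_state) transitions."""
    clock = StepSizeClock(mode)
    values = defaultdict(float)
    for state, reward, next_state in trajectory:
        alpha = clock.step_size(state)
        td_error = reward + gamma * values[next_state] - values[state]
        values[state] += alpha * td_error
    return dict(values)


# Toy trajectory (assumed): state "b" is visited only once, at the tenth step.
trajectory = [("a", 1.0, "a")] * 9 + [("b", 1.0, "a")]
local_values = td0(trajectory, "local")
global_values = td0(trajectory, "global")
```

In this toy run the two clocks agree for state "a" during its nine visits, because its visit count equals the elapsed time. They differ for the rarely visited state "b": its single update uses step size 1 under the local clock but 1/10 under the global clock.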