202509282005
Status: #idea
Tags: #reinforcement_learning #ai
# The core mental model of reinforcement learning
All reinforcement learning models fit under the following three step framework:
1. First, the agent gathers experience samples, either from interacting with the environment or from querying a learned model of an environment.
2. Second, every reinforcement learning agent learns to estimate something, perhaps a model of the environment, or possibly a policy, a value function, or just the returns.
3. Last, every reinforcement learning agent attempts to improve a policy
---
# References
[[Grokking Deep Reinforcement Learning]]
[[Generalized policy iteration (GPI)]]