202509282005 Status: #idea Tags: #reinforcement_learning #ai # The core mental model of reinforcement learning All reinforcement learning models fit under the following three step framework: 1. First, the agent gathers experience samples, either from interacting with the environment or from querying a learned model of an environment. 2. Second, every reinforcement learning agent learns to estimate something, perhaps a model of the environment, or possibly a policy, a value function, or just the returns. 3. Last, every reinforcement learning agent attempts to improve a policy --- # References [[Grokking Deep Reinforcement Learning]] [[Generalized policy iteration (GPI)]]