Reinforcement learning is supervised learning with uncertainty

202409231726 Status: #idea Tags: #ai # Reinforcement learning is supervised learning with uncertainty In supervised learning, our goal is to learn a model that predict, given an input $x_i$, the correct output $y_i$ In reinforcement learning, our goal is to learn policy that predicts, given an input state $S_t$, the optimal action $A_t$ that maximizes our long run reward. In RL set ups, we typically only have a desired "end state" in mind (e.g. win this game of chess), and so we don't know a priori which move in each state is optimal. So we instead play out a game and back propagate credit for losses/wins to the actions taken in each state along the way. But if we had a massive dataset of states & action pairs where we *knew* the optimal action to be taken in that state to maximize reward? Then we could formulate this is as a supervised learning problem, learning a function that maps states to actions via standardized supervised learning rather than RL techniques. --- # References