1 – Introduction:
Let me begin by saying that PyBrain is more than just a reinforcement learning library. For example, it also implements principal component analysis, supervised training of neural networks, genetic algorithms, and particle swarm optimizers.
This tutorial, however, will focus entirely on the reinforcement learning module. The target audience is any competent Python programmer, with or without machine learning experience: the tutorial explains everything from scratch and is relatively informal.
Reinforcement learning is, loosely speaking, the machine learning analogue of the dopaminergic system in the basal ganglia, i.e. pleasure versus pain. You do something, your brain likes it, and you get pleasure (a "reward"). Nobody tells you what the optimal or desired action is (in contrast to supervised learning, for example). You only know that what you just did was better than that other thing you did that gave you less pleasure, or even pain, and you try to adjust your behavior accordingly. And yet you have to keep exploring a little, otherwise you might never discover the optimal action, the one that provides the most pleasure.
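To make the reward-versus-exploration idea a bit more concrete, here is a minimal sketch in plain Python. It is not PyBrain code, and it is not the Q-learning algorithm described later. It is just a toy "multi-armed bandit" with an epsilon-greedy agent, and every name in it (run_bandit, reward_probs, epsilon) is my own invention for illustration. The agent keeps a running average of the reward each action has given so far, usually picks the action with the best average, and with probability epsilon picks a random action instead.

```python
import random

def run_bandit(reward_probs, epsilon, steps, seed=0):
    """Toy epsilon-greedy agent (illustrative sketch, not PyBrain's API).

    reward_probs[a] is the (hidden) probability that action a pays a reward of 1.
    Returns the agent's reward estimates and how often it tried each action.
    """
    rng = random.Random(seed)
    n_actions = len(reward_probs)
    estimates = [0.0] * n_actions  # running average reward per action
    counts = [0] * n_actions       # number of times each action was tried

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random action, even one that looks bad so far.
            action = rng.randrange(n_actions)
        else:
            # Exploit: take the action with the best estimate
            # (ties go to the lowest-numbered action).
            action = max(range(n_actions), key=lambda a: estimates[a])

        # The environment hands back pleasure (1) or nothing (0).
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0

        # Incremental update of the running average for the chosen action.
        counts[action] += 1
        estimates[action] += (reward - estimates[action]) / counts[action]

    return estimates, counts

# An agent that never explores versus one that explores 10% of the time.
print(run_bandit([0.5, 1.0], epsilon=0.0, steps=100))
print(run_bandit([0.5, 1.0], epsilon=0.1, steps=100))
```

With epsilon=0.0 the agent never leaves action 0: its estimate for action 1 stays at zero, ties are broken in favor of the first action, and so it never finds out that action 1 pays more often. That is exactly the trap that a little exploration is meant to avoid.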
I will start by describing the general concept of Q-learning (and its variations) from a mathematical perspective, then proceed to the PyBrain API. Finally, I will give a few examples of a Blackjack (the card game) learning agent implemented with PyBrain.