I have created an applet in Processing that demonstrates the Q-learning algorithm.
It is based on the example calculations in the document 'Q learning with example'.
Applet:
- the circle in the middle is our goal, the position we want to reach.
- the second circle is the current position of the algorithm.
- every cell (rectangle) has 4 values: the reward for moving UP, DOWN, RIGHT, or LEFT.
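
The four values per cell can be illustrated with a minimal Python sketch of tabular Q-learning on a small grid. This is not the applet's Processing code. The 5x5 grid, the discount factor of 0.8, the goal reward of 100, the simplified update Q(s, a) = r + gamma * max Q(s', a') with no separate learning rate, and all names (step, train, GAMMA, ...) are assumptions chosen for illustration.

```python
import random

ROWS, COLS = 5, 5        # grid size (assumed)
GOAL = (2, 2)            # the goal cell in the middle, as (row, col)
GAMMA = 0.8              # discount factor (assumed)
GOAL_REWARD = 100.0      # reward for stepping onto the goal (assumed)

# One entry per value shown in a cell: UP, DOWN, RIGHT, LEFT.
ACTIONS = {"UP": (-1, 0), "DOWN": (1, 0), "RIGHT": (0, 1), "LEFT": (0, -1)}


def step(state, action):
    """Move one cell; a move off the grid leaves the position unchanged."""
    row, col = state
    d_row, d_col = ACTIONS[action]
    new_row, new_col = row + d_row, col + d_col
    if 0 <= new_row < ROWS and 0 <= new_col < COLS:
        state = (new_row, new_col)
    reward = GOAL_REWARD if state == GOAL else 0.0
    return state, reward


def train(episodes=5000, seed=0):
    """Run random-walk episodes and return the learned table of cell values."""
    rng = random.Random(seed)
    # q[cell][action] holds the four values of every cell, all starting at 0.
    q = {(r, c): {a: 0.0 for a in ACTIONS}
         for r in range(ROWS) for c in range(COLS)}
    starts = [cell for cell in q if cell != GOAL]
    for _ in range(episodes):
        state = rng.choice(starts)          # current position of the agent
        while state != GOAL:                # an episode ends at the goal
            action = rng.choice(list(ACTIONS))
            next_state, reward = step(state, action)
            # Immediate reward plus the discounted best value of the next cell.
            q[state][action] = reward + GAMMA * max(q[next_state].values())
            state = next_state
    return q
```

Calling q = train() and then reading q[(2, 1)] gives the four values of the cell directly to the left of the goal. Under these assumptions the value for RIGHT is the largest, because that move steps straight onto the goal.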
A very nice explanation and an applet are available here.


