Description
This time the maze becomes more complex: it now
contains a K(ey) and a D(oor).
The door only opens if the agent has previously found
the key.
In terms of the Q-Table, this is realised by adding an
additional dimension.
Instead of looking up the action ratings for a position
alone, we also index by whether the key has been
obtained or not.
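
As a minimal sketch of that extra dimension (not the project's actual code), the table below is indexed by key status first, then by position, then by action. The maze size, the action names and the helper functions make_q_table and action_ratings are assumptions made up for this illustration.

```python
# Hypothetical maze dimensions and action set, chosen only for illustration.
ROWS, COLS = 4, 5
ACTIONS = ["up", "down", "left", "right"]


def make_q_table(rows, cols, n_actions):
    """Build a Q-table indexed as q[has_key][row][col][action], all zeros."""
    return [
        [[[0.0] * n_actions for _ in range(cols)] for _ in range(rows)]
        for _ in range(2)  # index 0: key not yet found, index 1: key found
    ]


def action_ratings(q, row, col, has_key):
    """Return the list of action ratings for a position and key status."""
    return q[int(has_key)][row][col]


q = make_q_table(ROWS, COLS, len(ACTIONS))

# The same cell now holds two independent sets of ratings, so moving
# "right" (say, into the door) can be rated badly without the key and
# well with it.
right = ACTIONS.index("right")
action_ratings(q, 2, 3, has_key=False)[right] = -1.0
action_ratings(q, 2, 3, has_key=True)[right] = 1.0
```

Putting the key flag in the outermost index keeps the lookup for a given position unchanged apart from one extra index in front.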
This is a crucial point of Q-Learning: the complexity
of the environment is directly reflected in the size
of the Q-Table.
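
To make the growth concrete, here is a small sketch that counts Q-values under the assumption that the state consists of a grid position plus some number of yes/no flags such as "has key". The function q_table_size and the 4 x 5 maze with four actions are hypothetical and only serve the arithmetic.

```python
def q_table_size(rows, cols, n_actions, n_flags=0):
    """Count Q-values when the state is a position plus n_flags booleans."""
    # Every boolean flag doubles the number of distinct states.
    return rows * cols * (2 ** n_flags) * n_actions


position_only = q_table_size(4, 5, 4)        # 4 * 5 * 4 = 80
with_key_flag = q_table_size(4, 5, 4, 1)     # 80 * 2 = 160
with_three_flags = q_table_size(4, 5, 4, 3)  # 80 * 8 = 640
```

Under this assumption a single key doubles the table, and each further independent item doubles it again.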
The code can be found on GitHub.