In February 2015 the artificial intelligence company DeepMind published an article in Nature showing that they could learn to play a number of Atari games using deep Q-learning. The amazing part is that they didn't create a compact representation of the state of the games. They trained their algorithm purely on the pixels visible on screen and the score. Using this approach they got better results than all previously available algorithms, and reached a level comparable to that of a professional human games tester across a set of 49 games.

Since reading about their result I have been tinkering with reinforcement learning. I managed to create some working implementations, but they were never very stable. They learned how to solve a problem, but after some more training they would oscillate out of control and end up having to learn everything again.

An infinite cycle of success and failure.

A robust deep Q-learning algorithm

This time I set myself the challenge not only to solve the problem, but to solve it in a way that maintains its results.

OpenAI created a platform for testing reinforcement learning agents. It contains implementations of different problems for reinforcement learning agents to solve, from simple grid worlds to more complex control problems. The goal for this blog post is to solve the LunarLander-v2 environment.

An agent playing the LunarLander environment with random actions.

The goal of LunarLander is to land a small spacecraft between two flags. At every time step you have a choice between 4 actions: fire your main engine, fire your left engine, fire your right engine, or do nothing. If the spacecraft hits the ground at the wrong angle or too fast, it crashes and you get a large penalty. You also get a small penalty for firing your main engine. You get rewards for getting closer to the desired end position and for touching the ground. The environment is considered solved if the agent manages to get an average score of 200 or more over the last 100 episodes.

Let's start by explaining a bit about the algorithm I am going to use. Q-learning is a reinforcement learning algorithm. In reinforcement learning an agent exists inside an environment. At every time step the agent can observe the environment and take an action.
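To make the observe-and-act loop and the "solved" criterion concrete, here is a minimal sketch. It does not use the real LunarLander physics: `ToyEnv`, `run_episode`, `random_policy` and `is_solved` are hypothetical names I made up for illustration, and the reward numbers inside `ToyEnv` are placeholders. Only the 4 actions, the 200-point threshold and the 100-episode window come from the description of the environment.

```python
import random

N_ACTIONS = 4         # do nothing, left engine, main engine, right engine
SOLVED_SCORE = 200.0  # required average score
WINDOW = 100          # number of most recent episodes to average over


class ToyEnv:
    """Hypothetical stand-in for an environment (NOT the real LunarLander)."""

    def __init__(self, episode_length=50, seed=0):
        self.episode_length = episode_length
        self.rng = random.Random(seed)
        self.t = 0

    def reset(self):
        """Start a new episode and return the first observation."""
        self.t = 0
        return (0.0,)

    def step(self, action):
        """Apply an action; return (observation, reward, done)."""
        self.t += 1
        # Placeholder reward: noise, minus a small penalty when the
        # main engine fires (assumed here to be action index 2).
        reward = self.rng.uniform(-1.0, 1.0)
        if action == 2:
            reward -= 0.3
        done = self.t >= self.episode_length
        return (float(self.t),), reward, done


_policy_rng = random.Random(1)


def random_policy(observation):
    """Ignore the observation and pick one of the 4 actions at random."""
    return _policy_rng.randrange(N_ACTIONS)


def run_episode(env, policy):
    """Observe, act, collect reward, repeat until the episode ends."""
    observation = env.reset()
    total_reward, done = 0.0, False
    while not done:
        action = policy(observation)
        observation, reward, done = env.step(action)
        total_reward += reward
    return total_reward


def is_solved(scores, window=WINDOW, threshold=SOLVED_SCORE):
    """True if the mean of the last `window` episode scores >= threshold."""
    if len(scores) < window:
        return False
    recent = list(scores)[-window:]
    return sum(recent) / window >= threshold


env = ToyEnv()
scores = [run_episode(env, random_policy) for _ in range(WINDOW)]
print("solved:", is_solved(scores))
```

With a real environment the loop stays the same; only `ToyEnv` and the policy would be swapped out, and a random policy should not be expected to come anywhere near the threshold.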