Degryse, Baptiste
[UCL]
Lee, John
[UCL]
Q-learning can be used to find an optimal action-selection policy for any finite Markov decision process. A Q-network is a neural network that approximates the value of an action in a vast state space. This work studies deep Q-learning, a combination of Q-learning and neural networks, and evaluates the impact of its meta-parameters. The applicative context is the game Robocode: we evaluate the impact of different state representations, rewards, actions, and neural network architectures in order to build an artificial intelligence. A hybrid architecture combining a deep feedforward neural network with two convolutional layers that process a log of past states was successful. Adding a long short-term memory layer in parallel, and replacing random memory replay with sequential training, proved unsuitable. The resulting artificial intelligence outperformed other public machine learning projects thanks to the simplicity of its actions, which enabled it to learn complex behaviour. This project can be used to gain intuition on efficient state and action representations, on the importance of a complete reward function, and on the advantages of a hybrid architecture that exploits the strengths of each network type.
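As background for the abstract above, the Q-learning rule it builds on can be sketched in a few lines. This is a minimal tabular illustration, not code from the thesis; the hyperparameter values, state names, and action names are assumptions chosen for the example.

```python
from collections import defaultdict

# Assumed hyperparameters for illustration only.
ALPHA = 0.1    # learning rate
GAMMA = 0.99   # discount factor

# Q[(state, action)] -> estimated return; defaults to 0.0 for unseen pairs.
Q = defaultdict(float)

def update(state, action, reward, next_state, actions):
    """One Q-learning step: move Q(s, a) toward r + gamma * max_a' Q(s', a')."""
    best_next = max(Q[(next_state, a)] for a in actions)
    target = reward + GAMMA * best_next
    Q[(state, action)] += ALPHA * (target - Q[(state, action)])

# Toy usage: a single transition in a hypothetical two-action problem.
actions = ["left", "right"]
update("s0", "right", 1.0, "s1", actions)
```

A deep Q-network replaces the table `Q` with a neural network that maps a state to the values of all actions, which is what makes the vast state spaces of a game like Robocode tractable.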

Bibliographic reference:
Degryse, Baptiste. *Deep Q-Learning for Robocode.* Ecole polytechnique de Louvain, Université catholique de Louvain, 2017. Supervisor: Lee, John.

Permanent URL:
http://hdl.handle.net/2078.1/thesis:10589