Zonation Method for Efficient Training of Collaborative Multi-Agent Reinforcement Learning in Double Snake Game

Marvin Yonathan Hadiyanto; Budi Harsono; Indra Karnadi

doi:10.26877/asset.v6i1.17562

Abstract

This paper proposes a zonation method for training two reinforcement learning agents. We demonstrate the method's effectiveness in the double snake game, in which two snakes operate in a fully cooperative setting to maximize the score. The problem posed by this game relates to real-world problems such as coordination among autonomous vehicles and the operation of collaborative mobile robots in warehouse applications. We use a deep Q-network algorithm to train the two agents to play the double snake game collaboratively through a decentralized approach, in which distinct state and reward functions are assigned to each agent. To improve training efficiency, we use the snake's sensory data about surrounding objects as the input state, which reduces the neural network complexity. The results show that the proposed approaches can train collaborative multi-agent systems efficiently, especially when computing resources and training time are limited.
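To illustrate why a sensor-based input state can be much smaller than a full-grid input, the following is a minimal sketch in Python. The abstract does not specify the sensor layout, so everything here is an assumption made for illustration: the function name `sensor_state`, the three danger flags, the four food-direction flags, and the 10 x 10 board are hypothetical and are not taken from the paper.

```python
from typing import List, Set, Tuple

Cell = Tuple[int, int]

# Headings in clockwise order: up, right, down, left (x grows right, y grows down).
HEADINGS: List[Cell] = [(0, -1), (1, 0), (0, 1), (-1, 0)]


def sensor_state(head: Cell, heading: int, obstacles: Set[Cell],
                 food: Cell, size: int) -> List[int]:
    """Hypothetical 7-value state: 3 danger flags followed by 4 food-direction flags."""

    def blocked(direction: int) -> int:
        # A neighbouring cell is dangerous if it is off the board or occupied.
        dx, dy = HEADINGS[direction % 4]
        x, y = head[0] + dx, head[1] + dy
        outside = not (0 <= x < size and 0 <= y < size)
        return int(outside or (x, y) in obstacles)

    # Danger relative to the current heading: ahead, to the right, to the left.
    danger = [blocked(heading), blocked(heading + 1), blocked(heading - 1)]
    food_flags = [
        int(food[1] < head[1]),  # food is above the head
        int(food[0] > head[0]),  # food is to the right of the head
        int(food[1] > head[1]),  # food is below the head
        int(food[0] < head[0]),  # food is to the left of the head
    ]
    return danger + food_flags


# One agent's view: heading up at the left wall, the other snake's body at (1, 5).
state = sensor_state(head=(0, 5), heading=0, obstacles={(1, 5)}, food=(3, 2), size=10)
print(state, len(state))
```

In a decentralized setup, each agent would call such a function with its own head position and treat the other snake's cells as obstacles, so each Q-network would receive 7 inputs instead of the 100 cells of a 10 x 10 grid under these assumptions.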

Keywords

Zonation method; double snake game; collaborative multi-agent reinforcement learning; training efficiency

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.