Zonation Method for Efficient Training of Collaborative Multi-Agent Reinforcement Learning in Double Snake Game

Marvin Yonathan Hadiyanto, Budi Harsono, Indra Karnadi

Abstract


This paper proposes a zonation method for training two reinforcement learning agents and demonstrates its effectiveness in the double snake game. In this game, two snakes operate in a fully cooperative setting to maximize the score. The problem relates to real-world tasks such as coordination among autonomous vehicles and the operation of collaborative mobile robots in warehouses. We use a deep Q-network algorithm to train the two agents to play the double snake game collaboratively through a decentralized approach, in which each agent is assigned its own state and reward functions. To improve training efficiency, we use each snake's sensory data about surrounding objects as the input state, which reduces the neural network complexity. The results show that the proposed approaches can train collaborative multi-agent systems efficiently, especially when computing resources and training time are limited.
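The idea of feeding sensory data to the network instead of the full board can be illustrated with a minimal Python sketch. The abstract does not specify the sensor layout, the reward functions, or the zonation rule, so everything here is an assumption made for illustration: the function names `is_blocked` and `sensory_state`, the 11-element feature layout, and the grid coordinates are all hypothetical and are not taken from the paper.

```python
from typing import List, Set, Tuple

Cell = Tuple[int, int]

# Headings in clockwise order: up, right, down, left (x grows right, y grows down).
DIRECTIONS: List[Cell] = [(0, -1), (1, 0), (0, 1), (-1, 0)]


def is_blocked(cell: Cell, width: int, height: int, obstacles: Set[Cell]) -> bool:
    """Return True if the cell is outside the board or occupied by an obstacle."""
    x, y = cell
    return x < 0 or y < 0 or x >= width or y >= height or cell in obstacles


def sensory_state(head: Cell, heading: int, food: Cell,
                  obstacles: Set[Cell], width: int, height: int) -> List[int]:
    """Build a compact binary state vector for one snake (hypothetical layout).

    Layout assumed for this sketch:
      3 flags: danger straight ahead, to the right, to the left
      4 flags: food is above, right of, below, left of the head
      4 flags: one-hot encoding of the current heading
    """
    danger = []
    for turn in (0, 1, -1):  # straight, turn right, turn left
        dx, dy = DIRECTIONS[(heading + turn) % 4]
        neighbour = (head[0] + dx, head[1] + dy)
        danger.append(int(is_blocked(neighbour, width, height, obstacles)))

    food_flags = [
        int(food[1] < head[1]),  # food is above
        int(food[0] > head[0]),  # food is to the right
        int(food[1] > head[1]),  # food is below
        int(food[0] < head[0]),  # food is to the left
    ]
    heading_flags = [int(heading == i) for i in range(4)]
    return danger + food_flags + heading_flags


# In a decentralized setup, each agent would build its own vector; the
# obstacle set would hold both snakes' body cells (an assumption here).
state_a = sensory_state(head=(2, 2), heading=1, food=(4, 0),
                        obstacles={(3, 2)}, width=5, height=5)
```

Under these assumptions, each agent's Q-network would take an 11-element vector rather than the whole grid, which is one way a sensory state can keep the input layer small regardless of board size.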

Keywords


Zonation method; double snake game; collaborative multi-agent reinforcement learning; training efficiency

DOI: https://doi.org/10.26877/asset.v6i1.17562


Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Advance Sustainable Science, Engineering and Technology (ASSET)

E-ISSN: 2715-4211
Published by Science and Technology Research Centre

Universitas PGRI Semarang, Indonesia

Website: http://journal.upgris.ac.id/index.php/asset/index 
Email: asset@upgris.ac.id