Multiagent Systems

New submissions

Submissions received from Tue 16 Apr 24 to Wed 17 Apr 24, announced Thu, 18 Apr 24

New submissions
Cross-lists
Replacements

[ total of 7 entries: 1-7 ]
[ showing up to 1000 entries per page: fewer | more ]

New submissions for Thu, 18 Apr 24

[1] arXiv:2404.11014 [pdf, other]: Title: Towards Multi-agent Reinforcement Learning based Traffic Signal Control through Spatio-temporal Hypergraphs

Authors: Kang Wang, Zhishu Shen, Zhen Lei, Tiehua Zhang

Subjects: Multiagent Systems (cs.MA); Artificial Intelligence (cs.AI)

Traffic signal control systems (TSCSs) are integral to intelligent traffic management, fostering efficient vehicle flow. Traditional approaches often simplify road networks into standard graphs, which results in a failure to consider the dynamic nature of traffic data at neighboring intersections, thereby neglecting higher-order interconnections necessary for real-time control. To address this, we propose a novel TSCS framework to realize intelligent traffic control. This framework collaborates with multiple neighboring edge computing servers to collect traffic information across the road network. To elevate the efficiency of traffic signal control, we have crafted a multi-agent soft actor-critic (MA-SAC) reinforcement learning algorithm. Within this algorithm, individual agents are deployed at each intersection with a mandate to optimize traffic flow across the entire road network collectively. Furthermore, we introduce hypergraph learning into the critic network of MA-SAC to enable the spatio-temporal interactions from multiple intersections in the road network. This method fuses hypergraph and spatio-temporal graph structures to encode traffic data and capture the complex spatial and temporal correlations between multiple intersections. Our empirical evaluation, tested on varied datasets, demonstrates the superiority of our framework in minimizing average vehicle travel times and sustaining high-throughput performance. This work facilitates the development of more intelligent and reactive urban traffic management solutions.
[2] arXiv:2404.11351 [pdf, ps, other]: Title: Circular Distribution of Agents using Convex Layers

Authors: Gautam Kumar, Ashwini Ratnoo

Subjects: Multiagent Systems (cs.MA)

This paper considers the problem of conflict-free distribution of agents on a circular periphery encompassing all agents. The two key elements of the proposed policy include the construction of a set of convex layers (nested convex polygons) using the initial positions of the agents, and a novel search space region for each of the agents. The search space for an agent on a convex layer is defined as the region enclosed between the lines passing through the agent's position and normal to its supporting edges. Guaranteeing collision-free paths, a goal assignment policy designates a unique goal position within the search space of an agent. In contrast to the existing literature, this work presents a one-shot, collision-free solution to the circular distribution problem by utilizing only the initial positions of the agents. Illustrative examples demonstrate the effectiveness of the proposed policy.

Cross-lists for Thu, 18 Apr 24

[3] arXiv:2404.10786 (cross-list from cs.DC) [pdf, ps, other]: Title: Sustainability of Data Center Digital Twins with Reinforcement Learning

Authors: Soumyendu Sarkar, Avisek Naug, Antonio Guillen, Ricardo Luna, Vineet Gundecha, Ashwin Ramesh Babu, Sajad Mousavi

Comments: 2024 Proceedings of the AAAI Conference on Artificial Intelligence

Journal-ref: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 38, no. 20, pp. 22322-22330, Mar. 2024

Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Multiagent Systems (cs.MA); Systems and Control (eess.SY)

The rapid growth of machine learning (ML) has led to an increased demand for computational power, resulting in larger data centers (DCs) and higher energy consumption. To address this issue and reduce carbon emissions, intelligent design and control of DC components such as IT servers, cabinets, HVAC cooling, flexible load shifting, and battery energy storage are essential. However, the complexity of designing and controlling them in tandem presents a significant challenge. While some individual components like CFD-based design and Reinforcement Learning (RL) based HVAC control have been researched, there's a gap in the holistic design and optimization covering all elements simultaneously. To tackle this, we've developed DCRL-Green, a multi-agent RL environment that empowers the ML community to design data centers and research, develop, and refine RL controllers for carbon footprint reduction in DCs. It is a flexible, modular, scalable, and configurable platform that can handle large High Performance Computing (HPC) clusters. Furthermore, in its default setup, DCRL-Green provides a benchmark for evaluating single as well as multi-agent RL algorithms. It easily allows users to subclass the default implementations and design their own control approaches, encouraging community development for sustainable data centers. Open Source Link: https://github.com/HewlettPackard/dc-rl
[4] arXiv:2404.10976 (cross-list from cs.LG) [pdf, other]: Title: Group-Aware Coordination Graph for Multi-Agent Reinforcement Learning

Authors: Wei Duan, Jie Lu, Junyu Xuan

Comments: Accepted by IJCAI 2024

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Multiagent Systems (cs.MA)

Cooperative Multi-Agent Reinforcement Learning (MARL) necessitates seamless collaboration among agents, often represented by an underlying relation graph. Existing methods for learning this graph primarily focus on agent-pair relations, neglecting higher-order relationships. While several approaches attempt to extend cooperation modelling to encompass behaviour similarities within groups, they commonly fall short in concurrently learning the latent graph, thereby constraining the information exchange among partially observed agents. To overcome these limitations, we present a novel approach to infer the Group-Aware Coordination Graph (GACG), which is designed to capture both the cooperation between agent pairs based on current observations and group-level dependencies from behaviour patterns observed across trajectories. This graph is further used in graph convolution for information exchange between agents during decision-making. To further ensure behavioural consistency among agents within the same group, we introduce a group distance loss, which promotes group cohesion and encourages specialization between groups. Our evaluations, conducted on StarCraft II micromanagement tasks, demonstrate GACG's superior performance. An ablation study further provides experimental evidence of the effectiveness of each component of our method.
[5] arXiv:2404.11144 (cross-list from cs.AI) [pdf, other]: Title: Self-adaptive PSRO: Towards an Automatic Population-based Game Solver

Authors: Pengdeng Li, Shuxin Li, Chang Yang, Xinrun Wang, Xiao Huang, Hau Chan, Bo An

Comments: Accepted to 33rd International Joint Conference on Artificial Intelligence (IJCAI 2024)

Subjects: Artificial Intelligence (cs.AI); Computer Science and Game Theory (cs.GT); Multiagent Systems (cs.MA)

Policy-Space Response Oracles (PSRO) as a general algorithmic framework has achieved state-of-the-art performance in learning equilibrium policies of two-player zero-sum games. However, the hand-crafted hyperparameter value selection in most of the existing works requires extensive domain knowledge, forming the main barrier to applying PSRO to different games. In this work, we make the first attempt to investigate the possibility of self-adaptively determining the optimal hyperparameter values in the PSRO framework. Our contributions are three-fold: (1) Using several hyperparameters, we propose a parametric PSRO that unifies the gradient descent ascent (GDA) and different PSRO variants. (2) We propose the self-adaptive PSRO (SPSRO) by casting the hyperparameter value selection of the parametric PSRO as a hyperparameter optimization (HPO) problem where our objective is to learn an HPO policy that can self-adaptively determine the optimal hyperparameter values during the running of the parametric PSRO. (3) To overcome the poor performance of online HPO methods, we propose a novel offline HPO approach to optimize the HPO policy based on the Transformer architecture. Experiments on various two-player zero-sum games demonstrate the superiority of SPSRO over different baselines.
[6] arXiv:2404.11354 (cross-list from math.OC) [pdf, other]: Title: Distributed Fractional Bayesian Learning for Adaptive Optimization

Authors: Yaqun Yang, Jinlong Lei, Guanghui Wen, Yiguang Hong

Comments: 16 pages, 6 figures

Subjects: Optimization and Control (math.OC); Distributed, Parallel, and Cluster Computing (cs.DC); Machine Learning (cs.LG); Multiagent Systems (cs.MA)

This paper considers a distributed adaptive optimization problem, where all agents only have access to their local cost functions with a common unknown parameter, whereas they mean to collaboratively estimate the true parameter and find the optimal solution over a connected network. A general mathematical framework for such a problem has not been studied yet. We aim to provide valuable insights for addressing parameter uncertainty in distributed optimization problems and simultaneously find the optimal solution. Thus, we propose a novel Prediction while Optimization scheme, which utilizes distributed fractional Bayesian learning through weighted averaging on the log-beliefs to update the beliefs of unknown parameters, and distributed gradient descent for renewing the estimation of the optimal solution. Then under suitable assumptions, we prove that all agents' beliefs and decision variables converge almost surely to the true parameter and the optimal solution under the true parameter, respectively. We further establish a sublinear convergence rate for the belief sequence. Finally, numerical experiments are implemented to corroborate the theoretical analysis.

Replacements for Thu, 18 Apr 24

[7] arXiv:2306.12037 (replaced) [pdf, other]: Title: Distributed Random Reshuffling Methods with Improved Convergence

Authors: Kun Huang, Linli Zhou, Shi Pu

Comments: 16 pages, 8 figures

Subjects: Optimization and Control (math.OC); Machine Learning (cs.LG); Multiagent Systems (cs.MA)

New submissions
Cross-lists
Replacements

[ total of 7 entries: 1-7 ]
[ showing up to 1000 entries per page: fewer | more ]

Disable MathJax (What is MathJax?)

Links to: arXiv, form interface, find, cs, recent, 2404, contact, help (Access key information)

> cs > cs.MA

Multiagent Systems

New submissions

New submissions for Thu, 18 Apr 24

Cross-lists for Thu, 18 Apr 24

Replacements for Thu, 18 Apr 24