Multiagent Systems
New submissions
[ showing up to 500 entries per page: fewer | more ]
New submissions for Fri, 29 Mar 24
- [1] arXiv:2403.18985 [pdf, ps, other]
-
Title: Robustness and Visual Explanation for Black Box Image, Video, and ECG Signal Classification with Reinforcement LearningAuthors: Soumyendu Sarkar, Ashwin Ramesh Babu, Sajad Mousavi, Vineet Gundecha, Avisek Naug, Sahand GhorbanpourComments: AAAI Proceedings reference: this https URLJournal-ref: 2024 Proceedings of the AAAI Conference on Artificial IntelligenceSubjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV); Multiagent Systems (cs.MA)
We present a generic Reinforcement Learning (RL) framework optimized for crafting adversarial attacks on different model types spanning from ECG signal analysis (1D), image classification (2D), and video classification (3D). The framework focuses on identifying sensitive regions and inducing misclassifications with minimal distortions and various distortion types. The novel RL method outperforms state-of-the-art methods for all three applications, proving its efficiency. Our RL approach produces superior localization masks, enhancing interpretability for image classification and ECG analysis models. For applications such as ECG analysis, our platform highlights critical ECG segments for clinicians while ensuring resilience against prevalent distortions. This comprehensive tool aims to bolster both resilience with adversarial training and transparency across varied applications and data types.
- [2] arXiv:2403.19253 [pdf, other]
-
Title: Inferring Latent Temporal Sparse Coordination Graph for Multi-Agent Reinforcement LearningSubjects: Machine Learning (cs.LG); Multiagent Systems (cs.MA)
Effective agent coordination is crucial in cooperative Multi-Agent Reinforcement Learning (MARL). While agent cooperation can be represented by graph structures, prevailing graph learning methods in MARL are limited. They rely solely on one-step observations, neglecting crucial historical experiences, leading to deficient graphs that foster redundant or detrimental information exchanges. Additionally, high computational demands for action-pair calculations in dense graphs impede scalability. To address these challenges, we propose inferring a Latent Temporal Sparse Coordination Graph (LTS-CG) for MARL. The LTS-CG leverages agents' historical observations to calculate an agent-pair probability matrix, where a sparse graph is sampled from and used for knowledge exchange between agents, thereby simultaneously capturing agent dependencies and relation uncertainty. The computational complexity of this procedure is only related to the number of agents. This graph learning process is further augmented by two innovative characteristics: Predict-Future, which enables agents to foresee upcoming observations, and Infer-Present, ensuring a thorough grasp of the environmental context from limited data. These features allow LTS-CG to construct temporal graphs from historical and real-time information, promoting knowledge exchange during policy learning and effective collaboration. Graph learning and agent training occur simultaneously in an end-to-end manner. Our demonstrated results on the StarCraft II benchmark underscore LTS-CG's superior performance.
- [3] arXiv:2403.19375 [pdf, other]
-
Title: Multi-Agent Team Access Monitoring: Environments that Benefit from Target Information SharingSubjects: Robotics (cs.RO); Multiagent Systems (cs.MA)
Robotic access monitoring of multiple target areas has applications including checkpoint enforcement, surveillance and containment of fire and flood hazards. Monitoring access for a single target region has been successfully modeled as a minimum-cut problem. We generalize this model to support multiple target areas using two approaches: iterating on individual targets and examining the collections of targets holistically. Through simulation we measure the performance of each approach on different scenarios.
- [4] arXiv:2403.19648 [pdf, other]
-
Title: Human-compatible driving partners through data-regularized self-play reinforcement learningSubjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Multiagent Systems (cs.MA)
A central challenge for autonomous vehicles is coordinating with humans. Therefore, incorporating realistic human agents is essential for scalable training and evaluation of autonomous driving systems in simulation. Simulation agents are typically developed by imitating large-scale, high-quality datasets of human driving. However, pure imitation learning agents empirically have high collision rates when executed in a multi-agent closed-loop setting. To build agents that are realistic and effective in closed-loop settings, we propose Human-Regularized PPO (HR-PPO), a multi-agent algorithm where agents are trained through self-play with a small penalty for deviating from a human reference policy. In contrast to prior work, our approach is RL-first and only uses 30 minutes of imperfect human demonstrations. We evaluate agents in a large set of multi-agent traffic scenes. Results show our HR-PPO agents are highly effective in achieving goals, with a success rate of 93%, an off-road rate of 3.5%, and a collision rate of 3%. At the same time, the agents drive in a human-like manner, as measured by their similarity to existing human driving logs. We also find that HR-PPO agents show considerable improvements on proxy measures for coordination with human driving, particularly in highly interactive scenarios. We open-source our code and trained agents at https://github.com/Emerge-Lab/nocturne_lab and provide demonstrations of agent behaviors at https://sites.google.com/view/driving-partners.
Replacements for Fri, 29 Mar 24
- [5] arXiv:2310.03394 (replaced) [pdf, other]
-
Title: Kinodynamic Motion Planning for a Team of Multirotors Transporting a Cable-Suspended Payload in Cluttered EnvironmentsComments: Submitted to IROS, 2024Subjects: Robotics (cs.RO); Multiagent Systems (cs.MA)
- [6] arXiv:2402.01088 (replaced) [pdf, other]
-
Title: The Danger Of Arrogance: Welfare Equilibra As A Solution To Stackelberg Self-Play In Non-Coincidental GamesComments: 31 pages, 23 figuresSubjects: Computer Science and Game Theory (cs.GT); Multiagent Systems (cs.MA)
[ showing up to 500 entries per page: fewer | more ]
Disable MathJax (What is MathJax?)
Links to: arXiv, form interface, find, cs, recent, 2403, contact, help (Access key information)