Robotics
New submissions
New submissions for Fri, 29 Mar 24
- [1] arXiv:2403.18841 [pdf, other]
Title: Minimal activation with maximal reach: Reachability clouds of bio-inspired slender manipulators
Comments: 13 pages, 4 figures
Subjects: Robotics (cs.RO)
In the field of soft robotics, flexibility, adaptability, and functionality define a new era of robotic systems that can safely deform, reach, and grasp. To optimize the design of soft robotic systems, it is critical to understand their configuration space and full range of motion across a wide variety of design parameters. Here we integrate extreme mechanics and soft robotics to provide quantitative insights into the design of bio-inspired soft slender manipulators using the concept of reachability clouds. For a minimal three-actuator design inspired by the elephant trunk, we establish an efficient and robust reduced-order method to generate reachability clouds of almost half a million points each to visualize the accessible workspace of a wide variety of manipulator designs. We generate an atlas of 256 reachability clouds by systematically varying the key design parameters including the fiber count, revolution, tapering angle, and activation magnitude. Our results demonstrate that reachability clouds not only offer an immediately clear perspective into the inverse problem of control, but also introduce powerful metrics to characterize reachable volumes, unreachable regions, and actuator redundancy to quantify the performance of soft slender robots. Our study provides new insights into the design of soft robotic systems with minimal activation and maximal reach with potential applications in medical robotics, flexible manufacturing, and the autonomous exploration of space.
- [2] arXiv:2403.18960 [pdf, other]
Title: Robust In-Hand Manipulation with Extrinsic Contacts
Comments: Accepted at ICRA 24
Subjects: Robotics (cs.RO)
We present in-hand manipulation tasks where a robot moves an object in grasp, maintains its external contact mode with the environment, and adjusts its in-hand pose simultaneously. The proposed manipulation task leads to complex contact interactions which can be very susceptible to uncertainties in kinematic and physical parameters. Therefore, we propose a robust in-hand manipulation method, which consists of two parts. First, an in-gripper mechanics model computes a naive motion cone assuming all parameters are precise. Then, a robust planning method refines the motion cone to maintain the desired contact mode regardless of parametric errors. Real-world experiments were conducted to illustrate the accuracy of the mechanics model and the effectiveness of the robust planning framework in the presence of kinematic parameter errors.
- [3] arXiv:2403.18965 [pdf, other]
Title: LORD: Large Models based Opposite Reward Design for Autonomous Driving
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Reinforcement learning (RL) based autonomous driving has emerged as a promising alternative to data-driven imitation learning approaches. However, crafting effective reward functions for RL poses challenges due to the complexity of defining and quantifying good driving behaviors across diverse scenarios. Recently, large pretrained models have gained significant attention as zero-shot reward models for tasks specified with desired linguistic goals. However, the desired linguistic goals for autonomous driving such as "drive safely" are ambiguous and incomprehensible to pretrained models. On the other hand, undesired linguistic goals like "collision" are more concrete and tractable. In this work, we introduce LORD, a novel large models based opposite reward design through undesired linguistic goals to enable the efficient use of large pretrained models as zero-shot reward models. Through extensive experiments, our proposed framework shows its efficiency in leveraging the power of large pretrained models for achieving safe and enhanced autonomous driving. Moreover, the proposed approach shows improved generalization capabilities as it outperforms counterpart methods across diverse and challenging driving scenarios.
- [4] arXiv:2403.18972 [pdf, other]
Title: Risk-Aware Robotics: Tail Risk Measures in Planning, Control, and Verification
Authors: Prithvi Akella, Anushri Dixit, Mohamadreza Ahmadi, Lars Lindemann, Margaret P. Chapman, George J. Pappas, Aaron D. Ames, Joel W. Burdick
Subjects: Robotics (cs.RO); Systems and Control (eess.SY)
The need for a systematic approach to risk assessment has increased in recent years due to the ubiquity of autonomous systems that alter our day-to-day experiences and their need for safety, e.g., for self-driving vehicles, mobile service robots, and bipedal robots. These systems are expected to function safely in unpredictable environments and interact seamlessly with humans, whose behavior is notably challenging to forecast. We present a survey of risk-aware methodologies for autonomous systems. We adopt a contemporary risk-aware approach to mitigate rare and detrimental outcomes by advocating the use of tail risk measures, a concept borrowed from financial literature. This survey will introduce these measures and explain their relevance in the context of robotic systems for planning, control, and verification applications.
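As a rough illustration of what a tail risk measure computes, the sketch below estimates Value-at-Risk (VaR) and Conditional Value-at-Risk (CVaR) from a list of sampled losses. The abstract does not say which measures the survey covers or how it estimates them, so the choice of CVaR, the simple empirical estimator, and the function names are all assumptions made for this example.

```python
import math


def value_at_risk(losses, alpha):
    """Empirical VaR: smallest sampled loss with at least a fraction alpha of samples at or below it."""
    ordered = sorted(losses)
    index = max(0, math.ceil(alpha * len(ordered)) - 1)
    return ordered[index]


def conditional_value_at_risk(losses, alpha):
    """Simple empirical CVaR: average of the sampled losses at or above the VaR level."""
    var = value_at_risk(losses, alpha)
    tail = [loss for loss in losses if loss >= var]
    return sum(tail) / len(tail)


# Hypothetical loss samples, e.g. one cost value per simulated rollout.
sample_losses = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
mean_loss = sum(sample_losses) / len(sample_losses)
tail_loss = conditional_value_at_risk(sample_losses, 0.5)
```

Unlike the mean, the tail average is dominated by the rare large loss, which is why such measures are attractive when rare but harmful outcomes matter more than average performance.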
- [5] arXiv:2403.19006 [pdf, ps, other]
Title: Ensuring Safe Autonomy: Navigating the Future of Autonomous Vehicles
Authors: Patrick Wolf
Comments: S. Bernardi, T. Zoppi (Editors), "Fast Abstracts and Student Forum Proceedings" - EDCC 2024 - 19th European Dependable Computing Conference, Leuven, Belgium, 8-11 April
Subjects: Robotics (cs.RO)
Autonomous driving vehicles provide a vast potential for realizing use cases in the on-road and off-road domains. Consequently, remarkable solutions exist for autonomous systems' environmental perception and control. Nevertheless, proof of safety remains an open challenge preventing such machinery from being introduced to markets and deployed in the real world. Traditional approaches for safety assurance of autonomously driving vehicles often lead to underperformance due to conservative safety assumptions that cannot handle the overall complexity. Besides, the more sophisticated safety systems rely on the vehicle's perception systems. However, perception is often unreliable due to uncertainties resulting from disturbances or the lack of context incorporation for data interpretation. Accordingly, this paper illustrates the potential of a modular, self-adaptive autonomy framework with integrated dynamic risk management to overcome the abovementioned drawbacks.
- [6] arXiv:2403.19010 [pdf, other]
Title: Gaussian Process-based Traversability Analysis for Terrain Mapless Navigation
Comments: This paper has been accepted for publication at 2024 IEEE International Conference on Robotics and Automation (ICRA 2024)
Subjects: Robotics (cs.RO)
Efficient navigation through uneven terrain remains a challenging endeavor for autonomous robots. We propose a new geometric-based uneven terrain mapless navigation framework combining a Sparse Gaussian Process (SGP) local map with a Rapidly-Exploring Random Tree* (RRT*) planner. Our approach begins with the generation of a high-resolution SGP local map, providing an interpolated representation of the robot's immediate environment. This map captures crucial environmental variations, including height, uncertainties, and slope characteristics. Subsequently, we construct a traversability map based on the SGP representation to guide our planning process. The RRT* planner efficiently generates real-time navigation paths, avoiding untraversable terrain in pursuit of the goal. This combination of SGP-based terrain interpretation and RRT* planning enables ground robots to safely navigate environments with varying elevations and steep obstacles. We evaluate the performance of our proposed approach through robust simulation testing, highlighting its effectiveness in achieving safe and efficient navigation compared to existing methods.
- [7] arXiv:2403.19027 [pdf, other]
Title: Should I Help a Delivery Robot? Cultivating Prosocial Norms through Observations
Comments: Accepted as a Late Breaking Work at CHI'24
Subjects: Robotics (cs.RO); Human-Computer Interaction (cs.HC)
We propose leveraging prosocial observations to cultivate new social norms to encourage prosocial behaviors toward delivery robots. With an online experiment, we quantitatively assess updates in norm beliefs regarding human-robot prosocial behaviors through observational learning. Results demonstrate the initially perceived normativity of helping robots is influenced by familiarity with delivery robots and perceptions of robots' social intelligence. Observing human-robot prosocial interactions notably shifts people's normative beliefs about prosocial actions, thereby changing their perceived obligations to offer help to delivery robots. Additionally, we found that observing robots offering help to humans, rather than receiving help, more significantly increased participants' feelings of obligation to help robots. Our findings provide insights into prosocial design for future mobility systems. Improved familiarity with robot capabilities and portraying them as desirable social partners can help foster wider acceptance. Furthermore, robots need to be designed to exhibit higher levels of interactivity and reciprocal capabilities for prosocial behavior.
- [8] arXiv:2403.19060 [pdf, other]
Title: Towards Human-Centered Construction Robotics: An RL-Driven Companion Robot For Contextually Assisting Carpentry Workers
Comments: 8 pages, 9 figures. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG)
In the dynamic construction industry, traditional robotic integration has primarily focused on automating specific tasks, often overlooking the complexity and variability of human aspects in construction workflows. This paper introduces a human-centered approach with a "work companion rover" designed to assist construction workers within their existing practices, aiming to enhance safety and workflow fluency while respecting construction labor's skilled nature. We conduct an in-depth study on deploying a robotic system in carpentry formwork, showcasing a prototype that emphasizes mobility, safety, and comfortable worker-robot collaboration in dynamic environments through a contextual Reinforcement Learning (RL)-driven modular framework. Our research advances robotic applications in construction, advocating for collaborative models where adaptive robots support rather than replace humans, underscoring the potential for an interactive and collaborative human-robot workforce.
- [9] arXiv:2403.19093 [pdf, other]
Title: Task2Morph: Differentiable Task-inspired Framework for Contact-Aware Robot Design
Comments: 9 pages, 10 figures, published to IROS
Journal-ref: 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2023: 452-459
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI)
Optimizing the morphologies and the controllers that adapt to various tasks is a critical issue in the field of robot design, a.k.a. embodied intelligence. Previous works typically model it as a joint optimization problem and use search-based methods to find the optimal solution in the morphology space. However, they ignore the implicit knowledge of task-to-morphology mapping which can directly inspire robot design. For example, flipping heavier boxes tends to require more muscular robot arms. This paper proposes a novel and general differentiable task-inspired framework for contact-aware robot design called Task2Morph. We abstract task features highly related to task performance and use them to build a task-to-morphology mapping. Further, we embed the mapping into a differentiable robot design process, where the gradient information is leveraged for both the mapping learning and the whole optimization. The experiments are conducted on three scenarios, and the results validate that Task2Morph outperforms DiffHand, which lacks a task-inspired morphology module, in terms of efficiency and effectiveness.
- [10] arXiv:2403.19102 [pdf, other]
Title: Automatic Fingerpad Customization for Precise and Stable Grasping of 3D-Print Parts
Subjects: Robotics (cs.RO)
The rise in additive manufacturing comes with unique opportunities and challenges. Additive manufacturing makes massive part customization and rapid design changes possible. However, manufacturing industries that want to use robotic automation to improve production efficiency could face challenges in gripper design and grasp planning due to the highly complex geometrical shapes resulting from massive part customization. Yet, current gripper design for such objects is often manual and relies on ad-hoc design intuition. This is limiting, as such grippers lack the ability to grasp different objects or grasp points, which is important for practical implementations. Hence, we introduce a fast, end-to-end approach to customize rigid gripper fingerpads that can achieve precise and stable grasping for different objects at multiple grasp points. Our approach relies on two key components: (i) a method based on set Boolean operations (e.g., intersections, subtractions, and unions) to extract object features and synthesize gripper surfaces that conform to different local shapes to form caging grasps; (ii) a method to evaluate the grasp quality of synthesized grippers. We experimentally demonstrate the validity of our approach by synthesizing fingerpads that, once mounted on a physical robot gripper, are able to grasp different objects at multiple grasp points, all with tightly constrained grasps.
- [11] arXiv:2403.19122 [pdf, other]
Title: Safety-Critical Planning and Control for Dynamic Obstacle Avoidance Using Control Barrier Functions
Comments: 9 pages, 4 figures. arXiv admin note: text overlap with arXiv:2210.04361
Subjects: Robotics (cs.RO); Optimization and Control (math.OC)
Dynamic obstacle avoidance is a challenging topic for optimal control and optimization-based trajectory planning problems, especially in tight environments. Many existing works use control barrier functions (CBFs) to enforce safety constraints within control systems. In these works, CBFs are usually formulated under a model predictive control (MPC) framework to anticipate future states and make informed decisions, or integrated with path planning algorithms as a safety enhancement tool. However, these approaches usually require knowledge of the obstacle boundary equations or are computationally slow. In this paper, we propose a novel framework for iterative MPC with discrete-time CBFs (DCBFs) to generate a collision-free trajectory. The DCBFs are obtained from convex polyhedra generated in sequential grid maps, without the need to know the boundary equations of obstacles. Additionally, a path planning algorithm is incorporated into this framework to ensure the global optimality of the generated trajectory. We demonstrate through numerical examples that our framework enables a unicycle robot to safely and efficiently navigate through tight and dynamically changing environments, tackling both convex and nonconvex obstacles with remarkable computing efficiency and reliability in control and trajectory generation.
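To make the idea of a discrete-time CBF constraint concrete, the sketch below checks the commonly used condition h(x_next) >= (1 - gamma) * h(x_now), with 0 < gamma <= 1, for a barrier function defined by a single half-space (one face of a convex polyhedron). This is a generic textbook-style form and not necessarily the exact constraint used in the paper; the function names and the example numbers are assumptions made for illustration.

```python
def halfspace_barrier(normal, offset):
    """Return h(x) = normal . x - offset, which is non-negative on the safe side of the plane."""
    def h(point):
        return sum(n * p for n, p in zip(normal, point)) - offset
    return h


def satisfies_dcbf(h, x_now, x_next, gamma):
    """Check the discrete-time CBF condition h(x_next) >= (1 - gamma) * h(x_now)."""
    if not 0.0 < gamma <= 1.0:
        raise ValueError("gamma must lie in (0, 1]")
    return h(x_next) >= (1.0 - gamma) * h(x_now)


# Hypothetical safe region: all points with first coordinate >= 1.
barrier = halfspace_barrier((1.0, 0.0), 1.0)
slow_approach = satisfies_dcbf(barrier, (3.0, 0.0), (2.5, 0.0), gamma=0.5)
fast_approach = satisfies_dcbf(barrier, (3.0, 0.0), (1.5, 0.0), gamma=0.5)
```

A smaller gamma forces the state to approach the boundary more slowly, so the same step can pass with one gamma and fail with another. In an MPC setting a condition of this kind would appear as a constraint at every step of the prediction horizon.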
- [12] arXiv:2403.19129 [pdf, other]
Title: Stable Object Placing using Curl and Diff Features of Vision-based Tactile Sensors
Comments: 9 pages, 7 figures
Subjects: Robotics (cs.RO)
Ensuring stable object placement is crucial to prevent objects from toppling over, breaking, or causing spills. When an object makes initial contact with a surface, and some force is exerted, the moment of rotation caused by the instability of the object's placing can cause the object to rotate in a certain direction (henceforth referred to as direction of corrective rotation). Existing methods often employ a Force/Torque (F/T) sensor to estimate the direction of corrective rotation by detecting the moment of rotation as a torque. However, its effectiveness may be hampered by sensor noise and the tension of the external wiring of robot cables. To address these issues, we propose a method for stable object placing using GelSights, vision-based tactile sensors, as an alternative to F/T sensors. Our method estimates the direction of corrective rotation of objects using the displacement of the black dot pattern on the elastomeric surface of GelSight. We calculate the Curl from vector analysis, indicative of the rotational field magnitude and direction of the displacement of the black dot pattern. Simultaneously, we calculate the difference (Diff) of displacement between the black dots of the left and right fingers' GelSights. Then, the robot can manipulate the objects' pose using Curl and Diff features, facilitating stable placing. Across experiments, handling 18 differently characterized objects, our method achieves precise placing accuracy (less than 1-degree error) in nearly 100% of cases. An accompanying video is available at the following link: https://youtu.be/fQbmCksVHlU
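The two features can be sketched on toy data as follows. The abstract does not give the exact formulas, so this example makes its own assumptions: Curl is estimated by fitting a rigid rotation to the marker displacements about their centroid (for a pure rotation field with angular rate w, the curl equals 2w), and Diff is taken as the difference of the mean displacement vectors of the two fingers. All names are hypothetical.

```python
def curl_estimate(positions, displacements):
    """Estimate the scalar curl of a 2D displacement field sampled at marker positions.

    A least-squares rotation rate about the marker centroid is fitted first;
    the curl of a rigid rotation field is twice that rate.
    """
    count = len(positions)
    cx = sum(p[0] for p in positions) / count
    cy = sum(p[1] for p in positions) / count
    cross_sum = 0.0
    radius_sq_sum = 0.0
    for (px, py), (dx, dy) in zip(positions, displacements):
        rx, ry = px - cx, py - cy
        cross_sum += rx * dy - ry * dx          # z-component of r x d
        radius_sq_sum += rx * rx + ry * ry
    return 2.0 * cross_sum / radius_sq_sum


def diff_feature(left_displacements, right_displacements):
    """Difference between the mean marker displacement of the left and right finger."""
    def mean_vector(vectors):
        n = len(vectors)
        return (sum(v[0] for v in vectors) / n, sum(v[1] for v in vectors) / n)
    lx, ly = mean_vector(left_displacements)
    rx, ry = mean_vector(right_displacements)
    return (lx - rx, ly - ry)


# Toy marker layout and a counter-clockwise rotation with rate 0.1.
markers = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]
rotation = [(-0.1 * y, 0.1 * x) for x, y in markers]
translation = [(0.3, 0.0)] * 4
```

The sign of the curl estimate indicates the rotation direction of the marker field, while a pure sliding motion of all markers yields a curl of zero.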
- [13] arXiv:2403.19293 [pdf, other]
Title: Adaptive Preload Control of Cable-Driven Parallel Robots for Handling Task
Comments: Submitted to "Annals of Scientific Society for Assembly, Handling and Industrial Robotics" (MHI2024 conference/colloquium)
Subjects: Robotics (cs.RO); Systems and Control (eess.SY)
This paper presents a method for dynamic adjustment of cable preloads based on the actuation redundancy of cable-driven parallel robots (CDPRs), which allows increasing or decreasing the platform stiffness depending on task requirements. This is achieved by computing preload parameters with an extended nullspace formulation of the kinematics. The method facilitates the operator's ability to specify a defined preload within the operation space. The algorithms are implemented in a real-time environment, allowing for the use of optimization in hybrid position-force control. To validate the effectiveness of this approach, a simulation study is performed, and the obtained results are compared to existing methods. Furthermore, the method is investigated experimentally and compared with the conventional position-controlled operation of a cable robot. The results demonstrate the feasibility of adaptively adjusting cable preloads during platform motion and manipulation of additional objects.
- [14] arXiv:2403.19310 [pdf, other]
Title: MRNaB: Mixed Reality-based Robot Navigation Interface using Optical-see-through MR-beacon
Subjects: Robotics (cs.RO)
Recent advancements in robotics have led to the development of numerous interfaces to enhance the intuitiveness of robot navigation. However, the reliance on traditional 2D displays imposes limitations on the simultaneous visualization of information. Mixed Reality (MR) technology addresses this issue by enhancing the dimensionality of information visualization, allowing users to perceive multiple pieces of information concurrently. This paper proposes Mixed reality-based robot navigation interface using an optical-see-through MR-beacon (MRNaB), a novel approach that incorporates an MR-beacon, situated atop the real-world environment, to function as a signal transmitter for robot navigation. This MR-beacon is designed to be persistent, eliminating the need for repeated navigation inputs for the same location. Our system is built around four primary functions: "Add", "Move", "Delete", and "Select". These allow for the addition of an MR-beacon, its movement to a new location, its deletion, and the selection of an MR-beacon for navigation purposes, respectively. The effectiveness of the proposed method was then validated through experiments by comparing it with the traditional 2D system. As a result, MRNaB was shown to improve user performance, both subjectively and objectively, when navigating the robot to a given place. For additional material, please check: https://mertcookimg.github.io/mrnab
- [15] arXiv:2403.19332 [pdf, other]
Title: Learning a Formally Verified Control Barrier Function in Stochastic Environment
Comments: 8 pages, 3 figures
Subjects: Robotics (cs.RO)
Safety is a fundamental requirement of control systems. Control Barrier Functions (CBFs) are proposed to ensure the safety of the control system by constructing safety filters or synthesizing control inputs. However, the safety guarantee and performance of safe controllers rely on the construction of valid CBFs. Inspired by universal approximatability, CBFs are represented by neural networks, known as neural CBFs (NCBFs). This paper presents an algorithm for synthesizing formally verified continuous-time neural Control Barrier Functions in stochastic environments in a single step. The proposed training process ensures efficacy across the entire state space with only a finite number of data points by constructing a sample-based learning framework for Stochastic Neural CBFs (SNCBFs). Our methodology eliminates the need for post hoc verification by enforcing Lipschitz bounds on the neural network, its Jacobian, and Hessian terms. We demonstrate the effectiveness of our approach through case studies on the inverted pendulum system and obstacle avoidance in autonomous driving, showcasing larger safe regions compared to baseline methods.
- [16] arXiv:2403.19369 [pdf, other]
Title: RAIL: Robot Affordance Imagination with Large Language Models
Subjects: Robotics (cs.RO)
This paper introduces an automatic affordance reasoning paradigm tailored to minimal semantic inputs, addressing the critical challenges of classifying and manipulating unseen classes of objects in household settings. Inspired by human cognitive processes, our method integrates generative language models and physics-based simulators to foster analytical thinking and creative imagination of novel affordances. Structured with a tripartite framework consisting of analysis, imagination, and evaluation, our system "analyzes" the requested affordance names into interaction-based definitions, "imagines" the virtual scenarios, and "evaluates" the object affordance. If an object is recognized as possessing the requested affordance, our method also predicts the optimal pose for such functionality, and how a potential user can interact with it. Tuned on only a few synthetic examples across 3 affordance classes, our pipeline achieves a very high success rate on affordance classification and functional pose prediction of 8 classes of novel objects, outperforming learning-based baselines. Validation through real robot manipulating experiments demonstrates the practical applicability of the imagined user interaction, showcasing the system's ability to independently conceptualize unseen affordances and interact with new objects and scenarios in everyday settings.
- [17] arXiv:2403.19375 [pdf, other]
Title: Multi-Agent Team Access Monitoring: Environments that Benefit from Target Information Sharing
Subjects: Robotics (cs.RO); Multiagent Systems (cs.MA)
Robotic access monitoring of multiple target areas has applications including checkpoint enforcement, surveillance and containment of fire and flood hazards. Monitoring access for a single target region has been successfully modeled as a minimum-cut problem. We generalize this model to support multiple target areas using two approaches: iterating on individual targets and examining the collections of targets holistically. Through simulation we measure the performance of each approach on different scenarios.
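The minimum-cut view of access monitoring can be sketched on a small graph. The example below computes the minimum cut between an entry node and a target by running the Edmonds-Karp maximum-flow algorithm (by max-flow/min-cut duality the two values coincide). The graph, the node names, the interpretation of capacities as monitoring cost, and the super-sink construction for treating several targets together are assumptions made for this illustration, not the paper's model.

```python
from collections import deque


def min_cut_value(capacity, source, sink):
    """Capacity of a minimum source-sink cut, computed as the Edmonds-Karp max flow."""
    residual = {source: {}, sink: {}}
    for u, edges in capacity.items():
        for v, cap in edges.items():
            residual.setdefault(u, {})
            residual.setdefault(v, {})
            residual[u][v] = residual[u].get(v, 0) + cap
            residual[v].setdefault(u, 0)        # reverse edge for flow cancellation
    total = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total
        # Find the bottleneck along the path, then push that much flow.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck


def with_super_sink(capacity, targets, sink_name="all_targets"):
    """Copy the graph and link every target to one extra sink with unlimited capacity."""
    merged = {u: dict(edges) for u, edges in capacity.items()}
    for target in targets:
        merged.setdefault(target, {})[sink_name] = float("inf")
    return merged


# Hypothetical environment: passages with the cost of monitoring each one.
passages = {
    "outside": {"a": 3, "b": 2},
    "a": {"b": 1, "t1": 1},
    "b": {"t2": 2},
}
cut_t1 = min_cut_value(passages, "outside", "t1")
cut_t2 = min_cut_value(passages, "outside", "t2")
cut_all = min_cut_value(with_super_sink(passages, ["t1", "t2"]), "outside", "all_targets")
```

Calling `min_cut_value` once per target corresponds to handling targets individually, while the super-sink variant treats the collection of targets as one problem.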
- [18] arXiv:2403.19460 [pdf, other]
Title: RiEMann: Near Real-Time SE(3)-Equivariant Robot Manipulation without Point Cloud Segmentation
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI)
We present RiEMann, an end-to-end near Real-time SE(3)-Equivariant Robot Manipulation imitation learning framework from scene point cloud input. Compared to previous methods that rely on descriptor field matching, RiEMann directly predicts the target poses of objects for manipulation without any object segmentation. RiEMann learns a manipulation task from scratch with 5 to 10 demonstrations, generalizes to unseen SE(3) transformations and instances of target objects, resists visual interference of distracting objects, and follows the near real-time pose change of the target object. The scalable action space of RiEMann facilitates the addition of custom equivariant actions such as the direction of turning the faucet, which makes articulated object manipulation possible for RiEMann. In simulation and real-world 6-DOF robot manipulation experiments, we test RiEMann on 5 categories of manipulation tasks with a total of 25 variants and show that RiEMann outperforms baselines in both task success rates and SE(3) geodesic distance errors on predicted poses (reduced by 68.6%), and achieves a 5.4 frames per second (FPS) network inference speed. Code and video results are available at https://riemann-web.github.io/.
- [19] arXiv:2403.19461 [pdf, other]
Title: Learning Sampling Distribution and Safety Filter for Autonomous Driving with VQ-VAE and Differentiable Optimization
Subjects: Robotics (cs.RO)
Sampling trajectories from a distribution followed by ranking them based on a specified cost function is a common approach in autonomous driving. Typically, the sampling distribution is hand-crafted (e.g., a Gaussian or a grid). Recently, there have been efforts towards learning the sampling distribution through generative models such as Conditional Variational Autoencoder (CVAE). However, these approaches fail to capture the multi-modality of the driving behaviour due to the Gaussian latent prior of the CVAE. Thus, in this paper, we re-imagine the distribution learning through vector quantized variational autoencoder (VQ-VAE), whose discrete latent space is well equipped to capture multi-modal sampling distribution. The VQ-VAE is trained with demonstration data of optimal trajectories. We further propose a differentiable optimization based safety filter to minimally correct the VQ-VAE sampled trajectories to ensure collision avoidance. We use backpropagation through the optimization layers in a self-supervised learning set-up to learn good initialization and optimal parameters of the safety filter. We perform extensive comparisons with a state-of-the-art CVAE-based baseline in dense and aggressive traffic scenarios and show a reduction of up to 12 times in collision rate while being competitive in driving speeds.
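The core of a discrete latent space is the quantization step, in which a continuous encoder output is replaced by its nearest entry in a learned codebook. The sketch below shows only that nearest-neighbour lookup on hand-written numbers; the codebook values and function name are assumptions made for illustration, and training of the encoder, decoder, and codebook is omitted.

```python
def quantize(latent, codebook):
    """Return (index, code) of the codebook vector closest to latent in squared Euclidean distance."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    index = min(range(len(codebook)), key=lambda i: squared_distance(latent, codebook[i]))
    return index, codebook[index]


# Hypothetical 2D codebook; each entry could stand for one driving mode,
# such as keeping the lane or changing lanes to either side.
codebook = [(0.0, 0.0), (1.0, 1.0), (-1.0, 2.0)]
index_a, code_a = quantize((0.9, 1.2), codebook)
index_b, code_b = quantize((-0.8, 1.7), codebook)
```

Because every latent is snapped to one of a finite set of codes, distinct codes can represent distinct behaviour modes, which a single Gaussian prior tends to blur together.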
- [20] arXiv:2403.19545 [pdf, other]
Title: Lamarckian Inheritance Improves Robot Evolution in Dynamic Environments
Comments: Nature. arXiv admin note: substantial text overlap with arXiv:2309.13099; text overlap with arXiv:2303.12594, arXiv:2309.14387
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI)
This study explores the integration of a Lamarckian system into evolutionary robotics (ER), comparing it with the traditional Darwinian model across various environments. By adopting Lamarckian principles, where robots inherit learned traits, alongside Darwinian learning without inheritance, we investigate adaptation in dynamic settings. Our research, conducted in six distinct environmental setups, demonstrates that Lamarckian systems outperform Darwinian ones in adaptability and efficiency, particularly in challenging conditions. Our analysis highlights the critical role of the interplay between controller and morphological evolution and environment adaptation, with parent-offspring similarities and newborns and survivors before and after learning providing insights into the effectiveness of trait inheritance. Our findings suggest Lamarckian principles could significantly advance autonomous system design, highlighting the potential for more adaptable and robust robotic solutions in complex, real-world applications. These theoretical insights were validated using real physical robots, bridging the gap between simulation and practical application.
- [21] arXiv:2403.19578 [pdf, other]
Title: Keypoint Action Tokens Enable In-Context Imitation Learning in Robotics
Subjects: Robotics (cs.RO); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)
We show that off-the-shelf text-based Transformers, with no additional training, can perform few-shot in-context visual imitation learning, mapping visual observations to action sequences that emulate the demonstrator's behaviour. We achieve this by transforming visual observations (inputs) and trajectories of actions (outputs) into sequences of tokens that a text-pretrained Transformer (GPT-4 Turbo) can ingest and generate, via a framework we call Keypoint Action Tokens (KAT). Despite being trained only on language, we show that these Transformers excel at translating tokenised visual keypoint observations into action trajectories, performing on par or better than state-of-the-art imitation learning (diffusion policies) in the low-data regime on a suite of real-world, everyday tasks. Rather than operating in the language domain as is typical, KAT leverages text-based Transformers to operate in the vision and action domains to learn general patterns in demonstration data for highly efficient imitation learning, indicating promising new avenues for repurposing natural language models for embodied tasks. Videos are available at https://www.robot-learning.uk/keypoint-action-tokens.
- [22] arXiv:2403.19602 [pdf, other]
Title: Behavior Trees in Industrial Applications: A Case Study in Underground Explosive Charging
Authors: Mattias Hallen (1), Matteo Iovino (2), Shiva Sander-Tavallaey (2), Christian Smith (3) ((1) ABB Mining R&D, Umeå, Sweden, (2) ABB Corporate Research, Västerås, Sweden, (3) Division of Robotics, Perception and Learning, KTH - Royal Institute of Technology, Stockholm, Sweden)
Subjects: Robotics (cs.RO)
In industrial applications, Finite State Machines (FSMs) are often used to implement decision-making policies for autonomous systems. In recent years, the use of Behavior Trees (BTs) as an alternative policy representation has gained considerable attention. The benefits of using BTs over FSMs are modularity and reusability, enabling a system that is easy to extend and modify. However, there exist few published studies on successful implementations of BTs for industrial applications. This paper contributes the lessons learned from implementing BTs in a complex industrial use case, where a robotic system assembles explosive charges and places them in holes on the rock face. The main result of the paper is that even if it is possible to model the entire system as a BT, combining BTs with FSMs can increase the readability and maintainability of the system. The benefit of such a combination is especially notable in the use case studied in this paper, where the full system cannot run autonomously but needs human supervision and feedback.
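For readers unfamiliar with Behavior Trees, the sketch below shows the two standard composite nodes, Sequence and Fallback, and a small tree built from them. The task (detect a hole automatically, otherwise wait for an operator, then place a charge), the blackboard keys, and all names are hypothetical and only loosely inspired by the use case in the abstract; they are not taken from the paper's implementation.

```python
SUCCESS, FAILURE, RUNNING = "SUCCESS", "FAILURE", "RUNNING"


class Action:
    """Leaf node wrapping a function that maps a blackboard dict to a status."""

    def __init__(self, name, fn):
        self.name = name
        self.fn = fn

    def tick(self, blackboard):
        return self.fn(blackboard)


class Sequence:
    """Ticks children in order and stops at the first one that does not succeed."""

    def __init__(self, children):
        self.children = children

    def tick(self, blackboard):
        for child in self.children:
            status = child.tick(blackboard)
            if status != SUCCESS:
                return status
        return SUCCESS


class Fallback:
    """Ticks children in order and stops at the first one that does not fail."""

    def __init__(self, children):
        self.children = children

    def tick(self, blackboard):
        for child in self.children:
            status = child.tick(blackboard)
            if status != FAILURE:
                return status
        return FAILURE


def detect_hole(blackboard):
    return SUCCESS if blackboard.get("hole_detected") else FAILURE


def ask_operator(blackboard):
    # Stays RUNNING until a human confirms the hole position.
    return SUCCESS if blackboard.get("operator_confirmed") else RUNNING


def place_charge(blackboard):
    blackboard["charge_placed"] = True
    return SUCCESS


def build_tree():
    locate = Fallback([Action("detect_hole", detect_hole),
                       Action("ask_operator", ask_operator)])
    return Sequence([locate, Action("place_charge", place_charge)])
```

Ticking the tree with an empty blackboard returns RUNNING, because detection fails and the operator has not yet answered. The modularity mentioned in the abstract shows up here as the ability to swap or extend the `locate` subtree without touching the rest of the tree.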
- [23] arXiv:2403.19607 [pdf, other]
Title: SAID-NeRF: Segmentation-AIDed NeRF for Depth Completion of Transparent Objects
Authors: Avinash Ummadisingu, Jongkeum Choi, Koki Yamane, Shimpei Masuda, Naoki Fukaya, Kuniyuki Takahashi
Comments: 8 pages. An accompanying video is available at this https URL
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
Acquiring accurate depth information of transparent objects using off-the-shelf RGB-D cameras is a well-known challenge in Computer Vision and Robotics. Depth estimation/completion methods are typically employed and trained on datasets with quality depth labels acquired from either simulation, additional sensors or specialized data collection setups and known 3D models. However, acquiring reliable depth information for datasets at scale is not straightforward, limiting training scalability and generalization. Neural Radiance Fields (NeRFs) are learning-free approaches and have demonstrated wide success in novel view synthesis and shape recovery. However, heuristics and controlled environments (lights, backgrounds, etc.) are often required to accurately capture specular surfaces. In this paper, we propose using Visual Foundation Models (VFMs) for segmentation in a zero-shot, label-free way to guide the NeRF reconstruction process for these objects via the simultaneous reconstruction of semantic fields and extensions to increase robustness. Our proposed method Segmentation-AIDed NeRF (SAID-NeRF) shows strong performance on depth completion datasets for transparent objects and robotic grasping.
- [24] arXiv:2403.19622 [pdf, other]
-
Title: RH20T-P: A Primitive-Level Robotic Dataset Towards Composable Generalization Agents
Authors: Zeren Chen, Zhelun Shi, Xiaoya Lu, Lehan He, Sucheng Qian, Hao Shu Fang, Zhenfei Yin, Wanli Ouyang, Jing Shao, Yu Qiao, Cewu Lu, Lu Sheng
Comments: 24 pages, 12 figures, 6 tables
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
The ultimate goal of robotic learning is to acquire a comprehensive and generalizable robotic system capable of performing both seen skills within the training distribution and unseen skills in novel environments. Recent progress in utilizing language models as high-level planners has demonstrated that the complexity of tasks can be reduced by decomposing them into primitive-level plans, making it possible to generalize to novel robotic tasks in a composable manner. Despite the promising future, the community is not yet adequately prepared for composable generalization agents, particularly due to the lack of primitive-level real-world robotic datasets. In this paper, we propose a primitive-level robotic dataset, namely RH20T-P, which contains about 33,000 video clips covering 44 diverse and complicated robotic tasks. Each clip is manually annotated according to a set of meticulously designed primitive skills, facilitating the future development of composable generalization agents. To validate the effectiveness of RH20T-P, we also construct a potential and scalable agent based on RH20T-P, called RA-P. Equipped with two planners specialized in task decomposition and motion planning, RA-P can adapt to novel physical skills through composable generalization. Our website and videos can be found at https://sites.google.com/view/rh20t-primitive/main. Dataset and code will be made available soon.
- [25] arXiv:2403.19648 [pdf, other]
-
Title: Human-compatible driving partners through data-regularized self-play reinforcement learning
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Multiagent Systems (cs.MA)
A central challenge for autonomous vehicles is coordinating with humans. Therefore, incorporating realistic human agents is essential for scalable training and evaluation of autonomous driving systems in simulation. Simulation agents are typically developed by imitating large-scale, high-quality datasets of human driving. However, pure imitation learning agents empirically have high collision rates when executed in a multi-agent closed-loop setting. To build agents that are realistic and effective in closed-loop settings, we propose Human-Regularized PPO (HR-PPO), a multi-agent algorithm where agents are trained through self-play with a small penalty for deviating from a human reference policy. In contrast to prior work, our approach is RL-first and only uses 30 minutes of imperfect human demonstrations. We evaluate agents in a large set of multi-agent traffic scenes. Results show our HR-PPO agents are highly effective in achieving goals, with a success rate of 93%, an off-road rate of 3.5%, and a collision rate of 3%. At the same time, the agents drive in a human-like manner, as measured by their similarity to existing human driving logs. We also find that HR-PPO agents show considerable improvements on proxy measures for coordination with human driving, particularly in highly interactive scenarios. We open-source our code and trained agents at https://github.com/Emerge-Lab/nocturne_lab and provide demonstrations of agent behaviors at https://sites.google.com/view/driving-partners.
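The idea of adding a small penalty for deviating from a human reference policy can be illustrated with a minimal sketch. The abstract does not give the form of the penalty, so the KL divergence used here, the function names, and the default `penalty_weight` are all assumptions made for illustration, not the authors' implementation.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete action distributions given as lists of probabilities."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def regularized_loss(ppo_loss, policy_probs, human_probs, penalty_weight=0.05):
    """Assumed form: the RL loss plus a weighted penalty for drifting from the human policy."""
    return ppo_loss + penalty_weight * kl_divergence(policy_probs, human_probs)


# A policy that matches the human reference pays no penalty;
# one that deviates pays a small extra cost on top of the RL loss.
matching = regularized_loss(1.0, [0.5, 0.5], [0.5, 0.5])
deviating = regularized_loss(1.0, [0.9, 0.1], [0.5, 0.5])
```

Keeping `penalty_weight` small matches the "RL-first" framing in the abstract: the reward still drives learning, and the human data only nudges behavior.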
- [26] arXiv:2403.19649 [pdf, other]
-
Title: GraspXL: Generating Grasping Motions for Diverse Objects at Scale
Comments: Project Page: this https URL
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
Human hands possess the dexterity to interact with diverse objects such as grasping specific parts of the objects and/or approaching them from desired directions. More importantly, humans can grasp objects of any shape without object-specific skills. Recent works synthesize grasping motions following single objectives such as a desired approach heading direction or a grasping area. Moreover, they usually rely on expensive 3D hand-object data during training and inference, which limits their capability to synthesize grasping motions for unseen objects at scale. In this paper, we unify the generation of hand-object grasping motions across multiple motion objectives, diverse object shapes and dexterous hand morphologies in a policy learning framework GraspXL. The objectives are composed of the graspable area, heading direction during approach, wrist rotation, and hand position. Without requiring any 3D hand-object interaction data, our policy trained with 58 objects can robustly synthesize diverse grasping motions for more than 500k unseen objects with a success rate of 82.2%. At the same time, the policy adheres to objectives, which enables the generation of diverse grasps per object. Moreover, we show that our framework can be deployed to different dexterous hands and work with reconstructed or generated objects. We quantitatively and qualitatively evaluate our method to show the efficacy of our approach. Our model and code will be available.
Cross-lists for Fri, 29 Mar 24
- [27] arXiv:2403.18947 (cross-list from cs.LG) [pdf, other]
-
Title: Self-Supervised Interpretable Sensorimotor Learning via Latent Functional Modularity
Comments: 10 pages, 6 figures. Accepted for an oral presentation at the AAAI 2024 Workshop on Explainable AI Approaches for Deep Reinforcement Learning
Subjects: Machine Learning (cs.LG); Robotics (cs.RO)
We introduce MoNet, a novel method that combines end-to-end learning with modular network architectures for self-supervised and interpretable sensorimotor learning. MoNet is composed of three functionally distinct neural modules: Perception, Planning, and Control. Leveraging its inherent modularity through a cognition-guided contrastive loss function, MoNet efficiently learns task-specific decision-making processes in latent space, without requiring task-level supervision. Moreover, our method incorporates an online post-hoc explainability approach, which enhances the interpretability of the end-to-end inferences without a trade-off in sensorimotor performance. In real-world indoor environments, MoNet demonstrates effective visual autonomous navigation, surpassing baseline models by 11% to 47% in task specificity analysis. We further delve into the interpretability of our network through the post-hoc analysis of perceptual saliency maps and latent decision vectors. This offers insights into the incorporation of explainable artificial intelligence within the realm of robotic learning, encompassing both perceptual and behavioral perspectives.
- [28] arXiv:2403.19024 (cross-list from cs.LG) [pdf, other]
-
Title: Exploiting Symmetry in Dynamics for Model-Based Reinforcement Learning with Asymmetric Rewards
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Robotics (cs.RO); Systems and Control (eess.SY)
Recent work in reinforcement learning has leveraged symmetries in the model to improve sample efficiency in training a policy. A commonly used simplifying assumption is that the dynamics and reward both exhibit the same symmetry. However, in many real-world environments, the dynamical model exhibits symmetry independent of the reward model: the reward may not satisfy the same symmetries as the dynamics. In this paper, we investigate scenarios where only the dynamics are assumed to exhibit symmetry, extending the scope of problems in reinforcement learning and learning in control theory where symmetry techniques can be applied. We use Cartan's moving frame method to introduce a technique for learning dynamics which, by construction, exhibit specified symmetries. We demonstrate through numerical experiments that the proposed method learns a more accurate dynamical model.
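The distinction between symmetric dynamics and an asymmetric reward can be made concrete with a toy example. The sketch below is not the paper's method and does not use Cartan's moving frame method; it assumes a hypothetical 1D point mass whose dynamics commute with reflection while the goal-seeking reward does not.

```python
def step(state, action, dt=0.5):
    """Toy point-mass dynamics: position and velocity updated by explicit Euler."""
    x, v = state
    return (x + v * dt, v + action * dt)


def reflect(state):
    """Reflection symmetry: mirror position and velocity about the origin."""
    x, v = state
    return (-x, -v)


def reward(state, goal=1.0):
    """Reward for reaching an off-center goal; not invariant under reflection."""
    x, _ = state
    return -(x - goal) ** 2


state, action = (2.0, 1.0), 0.5

# Dynamics are equivariant: reflecting then stepping equals stepping then reflecting.
dynamics_symmetric = step(reflect(state), -action) == reflect(step(state, action))

# The reward breaks the symmetry because the goal sits at x = 1, not at the origin.
reward_symmetric = reward(reflect(state)) == reward(state)
```

A model that builds the reflection into its dynamics can share data between mirrored states even though the reward must still be learned separately for each side.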
- [29] arXiv:2403.19062 (cross-list from eess.SY) [pdf, other]
-
Title: GENESIS-RL: GEnerating Natural Edge-cases with Systematic Integration of Safety considerations and Reinforcement Learning
Authors: Hsin-Jung Yang, Joe Beck, Md Zahid Hasan, Ekin Beyazit, Subhadeep Chakraborty, Tichakorn Wongpiromsarn, Soumik Sarkar
Subjects: Systems and Control (eess.SY); Robotics (cs.RO)
In the rapidly evolving field of autonomous systems, the safety and reliability of the system components are fundamental requirements. These components are often vulnerable to complex and unforeseen environments, making natural edge-case generation essential for enhancing system resilience. This paper presents GENESIS-RL, a novel framework that leverages system-level safety considerations and reinforcement learning techniques to systematically generate naturalistic edge cases. By simulating challenging conditions that mimic real-world situations, our framework aims to rigorously test the entire system's safety and reliability. Although demonstrated within the autonomous driving application, our methodology is adaptable across diverse autonomous systems. Our experimental validation, conducted on a high-fidelity simulator, underscores the overall effectiveness of this framework.
- [30] arXiv:2403.19104 (cross-list from cs.CV) [pdf, other]
-
Title: CRKD: Enhanced Camera-Radar Object Detection with Cross-modality Knowledge Distillation
Comments: Accepted to CVPR 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
In the field of 3D object detection for autonomous driving, LiDAR-Camera (LC) fusion is the top-performing sensor configuration. Still, LiDAR is relatively costly, which hinders adoption of this technology for consumer automobiles. Alternatively, camera and radar are commonly deployed on vehicles already on the road today, but the performance of Camera-Radar (CR) fusion falls behind LC fusion. In this work, we propose Camera-Radar Knowledge Distillation (CRKD) to bridge the performance gap between LC and CR detectors with a novel cross-modality KD framework. We use the Bird's-Eye-View (BEV) representation as the shared feature space to enable effective knowledge distillation. To accommodate the unique cross-modality KD path, we propose four distillation losses to help the student learn crucial features from the teacher model. We present extensive evaluations on the nuScenes dataset to demonstrate the effectiveness of the proposed CRKD framework. The project page for CRKD is https://song-jingyu.github.io/CRKD.
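Distilling in a shared BEV feature space can be sketched with a generic feature-matching loss. The abstract does not describe the four CRKD losses, so this is not one of them: it is an assumed, plain mean-squared-error term between a student grid and a teacher grid of the same shape, with hypothetical names throughout.

```python
def feature_distillation_loss(student_bev, teacher_bev):
    """Mean squared error between two BEV feature grids of equal shape (lists of rows)."""
    total, count = 0.0, 0
    for student_row, teacher_row in zip(student_bev, teacher_bev):
        for s, t in zip(student_row, teacher_row):
            total += (s - t) ** 2
            count += 1
    return total / count


# Hypothetical 2x2 BEV feature maps: the teacher stands in for the LC detector,
# the student for the CR detector.
teacher = [[1.0, 2.0], [0.0, 4.0]]
student = [[1.0, 0.0], [0.0, 4.0]]
loss = feature_distillation_loss(student, teacher)
```

Because both detectors express their features on the same BEV grid, the comparison is cell by cell and needs no alignment between camera, radar, and LiDAR coordinates.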
- [31] arXiv:2403.19438 (cross-list from cs.CV) [pdf, other]
-
Title: SubjectDrive: Scaling Generative Data in Autonomous Driving via Subject Control
Authors: Binyuan Huang, Yuqing Wen, Yucheng Zhao, Yaosi Hu, Yingfei Liu, Fan Jia, Weixin Mao, Tiancai Wang, Chi Zhang, Chang Wen Chen, Zhenzhong Chen, Xiangyu Zhang
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
Autonomous driving progress relies on large-scale annotated datasets. In this work, we explore the potential of generative models to produce vast quantities of freely-labeled data for autonomous driving applications and present SubjectDrive, the first model proven to scale generative data production in a way that could continuously improve autonomous driving applications. We investigate the impact of scaling up the quantity of generative data on the performance of downstream perception models and find that enhancing data diversity plays a crucial role in effectively scaling generative data production. Therefore, we have developed a novel model equipped with a subject control mechanism, which allows the generative model to leverage diverse external data sources for producing varied and useful data. Extensive evaluations confirm SubjectDrive's efficacy in generating scalable autonomous driving training data, marking a significant step toward revolutionizing data production methods in this field.
- [32] arXiv:2403.19474 (cross-list from cs.CV) [pdf, other]
-
Title: SG-PGM: Partial Graph Matching Network with Semantic Geometric Fusion for 3D Scene Graph Alignment and Its Downstream Tasks
Comments: 16 pages, 10 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
Scene graphs have been recently introduced into 3D spatial understanding as a comprehensive representation of the scene. The alignment between 3D scene graphs is the first step of many downstream tasks such as scene graph aided point cloud registration, mosaicking, overlap checking, and robot navigation. In this work, we treat 3D scene graph alignment as a partial graph-matching problem and propose to solve it with a graph neural network. We reuse the geometric features learned by a point cloud registration method and associate the clustered point-level geometric features with the node-level semantic feature via our designed feature fusion module. Partial matching is enabled by using a learnable method to select the top-k similar node pairs. Subsequent downstream tasks such as point cloud registration are achieved by running a pre-trained registration network within the matched regions. We further propose a point-matching rescoring method that uses the node-wise alignment of the 3D scene graph to reweight the matching candidates from a pre-trained point cloud registration method. It reduces false point correspondences, especially in low-overlap cases. Experiments show that our method improves the alignment accuracy by 10% to 20% in low-overlap and random transformation scenarios and outperforms the existing work in multiple downstream tasks.
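Selecting the top-k similar node pairs can be pictured with a fixed, non-learned version of the step. The paper describes a learnable selection, which this sketch does not reproduce; it assumes a precomputed node-to-node similarity matrix and simply keeps the k highest-scoring pairs. The function name and data are hypothetical.

```python
def top_k_pairs(similarity, k):
    """Return the k highest-scoring (source_node, target_node, score) pairs."""
    scored = [
        (score, i, j)
        for i, row in enumerate(similarity)
        for j, score in enumerate(row)
    ]
    # Sort by score only, highest first; ties keep their original row-major order.
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(i, j, score) for score, i, j in scored[:k]]


# Rows are nodes of one scene graph, columns are nodes of the other.
similarity = [
    [0.9, 0.1, 0.3],
    [0.2, 0.8, 0.4],
]
matches = top_k_pairs(similarity, 2)
```

Keeping only k pairs is what makes the matching partial: nodes without a confident counterpart in the other graph are left unmatched instead of being forced into a correspondence.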
- [33] arXiv:2403.19549 (cross-list from cs.CV) [pdf, other]
-
Title: GlORIE-SLAM: Globally Optimized RGB-only Implicit Encoding Point Cloud SLAM
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
Recent RGB-only dense Simultaneous Localization and Mapping (SLAM) systems predominantly use grid-based neural implicit encodings and/or struggle to efficiently realize global map and pose consistency. To this end, we propose an efficient RGB-only dense SLAM system using a flexible neural point cloud scene representation that adapts to keyframe poses and depth updates, without needing costly backpropagation. Another critical challenge of RGB-only SLAM is the lack of geometric priors. To alleviate this issue, with the aid of a monocular depth estimator, we introduce a novel DSPO layer for bundle adjustment which optimizes the pose and depth of keyframes along with the scale of the monocular depth. Finally, our system benefits from loop closure and online global bundle adjustment and performs better than or competitively with existing dense neural RGB SLAM methods in tracking, mapping and rendering accuracy on the Replica, TUM-RGBD and ScanNet datasets. The source code will be made available.
Replacements for Fri, 29 Mar 24
- [34] arXiv:2211.14361 (replaced) [pdf, other]
-
Title: gatekeeper: Online Safety Verification and Control for Nonlinear Systems in Dynamic Environments
Comments: Accepted at IROS 2023, 8 pages, 4 figures, Conditional Accept at IEEE T-RO
Subjects: Robotics (cs.RO); Systems and Control (eess.SY)
- [35] arXiv:2309.01898 (replaced) [pdf, other]
-
Title: Safe Legged Locomotion using Collision Cone Control Barrier Functions (C3BFs)
Comments: 5 Pages, 5 Figures. Updated citation. arXiv admin note: substantial text overlap with arXiv:2303.15871
Subjects: Robotics (cs.RO); Systems and Control (eess.SY)
- [36] arXiv:2310.03394 (replaced) [pdf, other]
-
Title: Kinodynamic Motion Planning for a Team of Multirotors Transporting a Cable-Suspended Payload in Cluttered Environments
Comments: Submitted to IROS, 2024
Subjects: Robotics (cs.RO); Multiagent Systems (cs.MA)
- [37] arXiv:2311.11016 (replaced) [pdf, other]
-
Title: SNI-SLAM: Semantic Neural Implicit SLAM
Authors: Siting Zhu, Guangming Wang, Hermann Blum, Jiuming Liu, Liang Song, Marc Pollefeys, Hesheng Wang
Comments: Accepted to CVPR 2024
Subjects: Robotics (cs.RO)
- [38] arXiv:2312.11598 (replaced) [pdf, other]
-
Title: SkillDiffuser: Interpretable Hierarchical Planning via Skill Abstractions in Diffusion-Based Task Execution
Comments: Accepted by CVPR 2024. Camera ready version. Project page: this https URL
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [39] arXiv:2312.12566 (replaced) [pdf, other]
-
Title: Johnsen-Rahbek Capstan Clutch: A High Torque Electrostatic Clutch
Subjects: Robotics (cs.RO); Systems and Control (eess.SY)
- [40] arXiv:2402.04061 (replaced) [pdf, other]
-
Title: TopoNav: Topological Navigation for Efficient Exploration in Sparse Reward Environments
Authors: Jumman Hossain, Abu-Zaher Faridee, Nirmalya Roy, Jade Freeman, Timothy Gregory, Theron T. Trout
Comments: Paper under review for IROS 2024
Subjects: Robotics (cs.RO); Machine Learning (cs.LG)
- [41] arXiv:2403.11742 (replaced) [pdf, other]
-
Title: Accelerating Model Predictive Control for Legged Robots through Distributed Optimization
Subjects: Robotics (cs.RO)
- [42] arXiv:2403.16535 (replaced) [pdf, other]
-
Title: Arm-Constrained Curriculum Learning for Loco-Manipulation of the Wheel-Legged Robot
Authors: Zifan Wang, Yufei Jia, Lu Shi, Haoyu Wang, Haizhou Zhao, Xueyang Li, Jinni Zhou, Jun Ma, Guyue Zhou
Subjects: Robotics (cs.RO)
- [43] arXiv:2308.10757 (replaced) [pdf, other]
-
Title: To Whom are You Talking? A Deep Learning Model to Endow Social Robots with Addressee Estimation Skills
Comments: Accepted v. of IJCNN 2023 publication. Funded by the Horizon Europe project TERAIS (G.A. 101079338), the UKRI Node on Trust (EP/V026682/1), the EU projects TRAINCREASE and MUSAE, and the US project THRIVE++. Cite: this https URL Code: this https URL Data: this https URL 10 pages, 8 Figures, 3 Tables
Journal-ref: 2023 International Joint Conference on Neural Networks (IJCNN), pp. 1-10
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Robotics (cs.RO)
- [44] arXiv:2311.03284 (replaced) [pdf, other]
-
Title: Safe Collective Control under Noisy Inputs and Competing Constraints via Non-Smooth Barrier Functions
Comments: Accepted to the 2024 European Control Conference. See Section VI.B (in particular, Theorem 1, Proposition 2, and Remark 2) for updates incorporating new results (from Reference 3) on almost-sure safety of ZCBFs
Subjects: Systems and Control (eess.SY); Robotics (cs.RO)
- [45] arXiv:2402.19161 (replaced) [pdf, other]
-
Title: MemoNav: Working Memory Model for Visual Navigation
Comments: Accepted to CVPR 2024. Code: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
- [46] arXiv:2403.09412 (replaced) [pdf, other]
-
Title: OpenGraph: Open-Vocabulary Hierarchical 3D Graph Representation in Large-Scale Outdoor Environments
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
- [47] arXiv:2403.17633 (replaced) [pdf, other]
-
Title: UADA3D: Unsupervised Adversarial Domain Adaptation for 3D Object Detection with Sparse LiDAR and Large Domain Gaps
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)