Robotics

New submissions

New submissions for Thu, 9 Feb 23

[1]  arXiv:2302.03728 [pdf, other]
Title: Magnetic Ball Chain Robots for Endoluminal Interventions
Subjects: Robotics (cs.RO); Systems and Control (eess.SY)

This paper introduces a novel class of hyperredundant robots comprised of chains of permanently magnetized spheres enclosed in a cylindrical polymer skin. With their shape controlled using an externally-applied magnetic field, the spherical joints of these robots enable them to bend to very small radii of curvature. These robots can be used as steerable tips for endoluminal instruments. A kinematic model is derived based on minimizing magnetic and elastic potential energy. Simulation is used to demonstrate the enhanced steerability of these robots in comparison to magnetic soft continuum robots designed using either distributed or lumped magnetic material. Experiments are included to validate the model and to demonstrate the steering capability of ball chain robots in bifurcating channels.

[2]  arXiv:2302.03793 [pdf, other]
Title: Self-Supervised Unseen Object Instance Segmentation via Long-Term Robot Interaction
Comments: 11 pages, 7 figures, 5 tables
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

We introduce a novel robotic system for improving unseen object instance segmentation in the real world by leveraging long-term robot interaction with objects. Previous approaches either grasp or push an object and then obtain the segmentation mask of the grasped or pushed object after one action. Instead, our system defers the decision on segmenting objects until after a sequence of robot pushing actions. By applying multi-object tracking and video object segmentation on the images collected via robot pushing, our system can generate segmentation masks of all the objects in these images in a self-supervised way. These include images in which objects are very close to each other, where existing object segmentation networks usually make errors. We demonstrate the usefulness of our system by fine-tuning segmentation networks trained on synthetic data with real-world data collected by our system. We show that, after fine-tuning, the segmentation accuracy of the networks is significantly improved both in the same domain and across different domains. In addition, we verify that the fine-tuned networks improve top-down robotic grasping of unseen objects in the real world.

[3]  arXiv:2302.03901 [pdf, other]
Title: Guided Learning from Demonstration for Robust Transferability
Comments: 7 Pages, 7 Figures, accepted to the 2023 IEEE International Conference on Robotics and Automation (ICRA)
Subjects: Robotics (cs.RO)

Learning from demonstration (LfD) has the potential to greatly increase the applicability of robotic manipulators in modern industrial applications. Recent progress in LfD methods has put more emphasis on learning robustness than on guiding the demonstration itself in order to improve robustness. The latter is particularly important to consider when the target system reproducing the motion is structurally different from the demonstration system, as some demonstrated motions may not be reproducible. In light of this, this paper introduces a new guided learning from demonstration paradigm where an interactive graphical user interface (GUI) guides the user during demonstration, preventing them from demonstrating non-reproducible motions. The key aspect of our approach is determining the space of reproducible motions based on a motion planning framework which finds regions in the task space where trajectories are guaranteed to be of bounded length. We evaluate our method on two different setups with a six-degree-of-freedom (DOF) UR5 as the target system. First, our method is validated using a seven-DOF Sawyer as the demonstration system. Then, an extensive user study is carried out where several participants are asked to demonstrate, with and without guidance, a mock weld task using a handheld tool tracked by a VICON system. With guidance, users always carried out the task successfully, compared to only 44% of the time without guidance.

[4]  arXiv:2302.03939 [pdf, other]
Title: Learning Interaction-aware Motion Prediction Model for Decision-making in Autonomous Driving
Subjects: Robotics (cs.RO)

Predicting the behaviors of other road users is crucial to safe and intelligent decision-making for autonomous vehicles (AVs). However, most motion prediction models ignore the influence of the AV's actions and the planning module has to treat other agents as unalterable moving obstacles. To address this problem, this paper proposes an interaction-aware motion prediction model that is able to predict other agents' future trajectories according to the ego agent's future plan, i.e., their reactions to the ego's actions. Specifically, we employ Transformers to effectively encode the driving scene and incorporate the AV's plan in decoding the predicted trajectories. To train the model to accurately predict the reactions of other agents, we develop an online learning framework, where the ego agent explores the environment and collects other agents' reactions to itself. We validate the decision-making and learning framework in three highly interactive simulated driving scenarios. The results reveal that our decision-making method significantly outperforms the reinforcement learning methods in terms of data efficiency and performance. We also find that using the interaction-aware model can bring better performance than the non-interaction-aware model and the exploration process helps improve the success rate in testing.

[5]  arXiv:2302.03942 [pdf, other]
Title: Challenges in Designing Teacher Robots with Motivation Based Gestures
Comments: The manuscript was submitted to, accepted at, and presented at the RCHI workshop at the 17th Annual ACM/IEEE International Conference on Human-Robot Interaction (HRI 2022). However, it was not published in the proceedings or in a journal, and the workshop's current Google site does not display our contribution
Subjects: Robotics (cs.RO)

Humanoid robots are increasingly being integrated into learning contexts to assist teaching and learning. However, challenges remain in how to design and incorporate such robots in an educational context. An important part of teaching is monitoring the motivational and emotional state of the learner and adapting the interaction style and learning content accordingly. In this paper, we therefore discuss the role of gestures displayed by a humanoid robot (i.e., the Pepper robot) in a learning and teaching context and present our ongoing research on designing and developing a teacher robot.

[6]  arXiv:2302.03954 [pdf]
Title: Temporal Video-Language Alignment Network for Reward Shaping in Reinforcement Learning
Subjects: Robotics (cs.RO)

Designing appropriate reward functions for Reinforcement Learning (RL) approaches has been a significant problem, especially for complex environments such as Atari games. Utilizing natural language instructions to provide intermediate rewards to RL agents, in a process known as reward shaping, can help the agent reach the goal state faster. In this work, we propose a natural language-based reward shaping approach that maps trajectories from the Montezuma's Revenge game environment to corresponding natural language instructions using an extension of the LanguagE-Action Reward Network (LEARN) framework. These trajectory-language mappings are further used to generate intermediate rewards, which are integrated into reward functions that can be used to learn an optimal policy with any standard RL algorithm. For a set of 15 tasks from Atari's Montezuma's Revenge game, the Ext-LEARN approach leads to the successful completion of tasks more often on average than the reward shaping approach that uses the LEARN framework and performs even better than the reward shaping framework without natural language-based rewards.

[7]  arXiv:2302.03971 [pdf, other]
Title: Communicative Robot Signals: Presenting a New Typology for Human-Robot Interaction
Comments: Full paper at HRI '23, March 13-16, 2023, Stockholm, Sweden
Subjects: Robotics (cs.RO)

We present a new typology for classifying signals from robots when they communicate with humans. For inspiration, we use ethology, the study of animal behaviour, and previous efforts from the literature as guides in defining the typology. The typology is based on communicative signals that consist of five properties: the origin of the signal, the deliberateness of the signal, the signal's reference, the genuineness of the signal, and its clarity (i.e., how implicit or explicit it is). Using the accompanying worksheet, the typology is straightforward to apply when examining communicative signals from previous human-robot interactions, and it provides guidance for designers when designing new robot behaviours.

[8]  arXiv:2302.04031 [pdf, other]
Title: FR-LIO: Fast and Robust Lidar-Inertial Odometry by Tightly-Coupled Iterated Kalman Smoother and Robocentric Voxels
Subjects: Robotics (cs.RO)

This paper presents a fast lidar-inertial odometry (LIO) system that is robust to aggressive motion. To achieve robust tracking in aggressive motion scenes, we exploit the continuous scanning property of lidar to adaptively divide the full scan into multiple partial scans (named sub-frames) according to the motion intensity. To avoid the degradation of sub-frames resulting from insufficient constraints, we propose a robust state estimation method based on a tightly-coupled iterated error state Kalman smoother (ESKS) framework. Furthermore, we propose a robocentric voxel map (RC-Vox) to improve the system's efficiency. The RC-Vox allows efficient maintenance of map points and k-nearest-neighbor (k-NN) queries by mapping local map points into a fixed-size, two-layer 3D array structure. Extensive experiments were conducted on 27 sequences from 4 public datasets and our own dataset. The results show that our system achieves stable tracking in aggressive motion scenes that cannot be handled by other state-of-the-art methods, while remaining competitive with these methods in general scenes. In terms of efficiency, the RC-Vox allows our system to run faster than current advanced LIO systems.
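To make the idea of mapping local points into a fixed-size, robot-centred 3D array concrete, here is a minimal single-layer sketch. It is not the RC-Vox structure from the paper (which is two-layer and not specified in the abstract); the function name `voxel_index` and the parameters `voxel_size` and `grid_dim` are hypothetical choices for illustration only.

```python
import math


def voxel_index(point, robot_pos, voxel_size=0.5, grid_dim=64):
    """Map a 3D point to an index in a fixed-size grid centred on the robot.

    Hypothetical sketch: the grid has grid_dim cells per axis, each cell is
    voxel_size metres wide, and the robot sits at the centre of the grid.
    Returns an (i, j, k) tuple, or None if the point lies outside the grid.
    """
    half = grid_dim // 2
    idx = tuple(
        math.floor((p - r) / voxel_size) + half
        for p, r in zip(point, robot_pos)
    )
    # Points beyond the fixed-size local map are simply not stored.
    if all(0 <= i < grid_dim for i in idx):
        return idx
    return None


if __name__ == "__main__":
    print(voxel_index((1.2, -0.3, 0.0), (1.0, 0.0, 0.0)))
    print(voxel_index((100.0, 0.0, 0.0), (0.0, 0.0, 0.0)))
```

Because the array size is fixed, looking up the cell for a point is a constant-time index computation, which is the property that makes nearest-neighbour candidate lookup cheap in grids of this kind.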

[9]  arXiv:2302.04094 [pdf, other]
Title: Learning Graph-Enhanced Commander-Executor for Multi-Agent Navigation
Comments: This paper is accepted by AAMAS 2023
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Multiagent Systems (cs.MA)

This paper investigates the multi-agent navigation problem, which requires multiple agents to reach the target goals in a limited time. Multi-agent reinforcement learning (MARL) has shown promising results for solving this issue. However, it is inefficient for MARL to directly explore the (nearly) optimal policy in the large search space, which is exacerbated as the agent number increases (e.g., 10+ agents) or the environment is more complex (e.g., 3D simulator). Goal-conditioned hierarchical reinforcement learning (HRL) provides a promising direction to tackle this challenge by introducing a hierarchical structure to decompose the search space, where the low-level policy predicts primitive actions under the guidance of the goals derived from the high-level policy. In this paper, we propose Multi-Agent Graph-Enhanced Commander-Executor (MAGE-X), a graph-based goal-conditioned hierarchical method for multi-agent navigation tasks. MAGE-X comprises a high-level Goal Commander and a low-level Action Executor. The Goal Commander predicts the probability distribution of goals and leverages it to assign each agent the most appropriate final target. The Action Executor utilizes graph neural networks (GNN) to construct a subgraph for each agent that only contains crucial partners to improve cooperation. Additionally, the Goal Encoder in the Action Executor captures the relationship between the agent and the designated goal to encourage the agent to reach the final target. The results show that MAGE-X outperforms the state-of-the-art MARL baselines with a 100% success rate with only 3 million training steps in multi-agent particle environments (MPE) with 50 agents, and at least a 12% higher success rate and 2x higher data efficiency in a more complicated quadrotor 3D navigation task.

Cross-lists for Thu, 9 Feb 23

[10]  arXiv:2302.03724 (cross-list from eess.SY) [pdf, other]
Title: Assigning Optimal Integer Harmonic Periods to Hard Real Time Tasks
Comments: 10 pages
Subjects: Systems and Control (eess.SY); Performance (cs.PF); Robotics (cs.RO)

Selecting period values for tasks is a very important step in the design process of a real-time system, especially due to the significance of its impact on system schedulability. It is well known that, under RMS, the utilization bound for a harmonic task set is 100%. Also, polynomial-time algorithms have been developed for response-time analysis of harmonic task sets. In practice, the largest acceptable value for the period of a task is determined by the performance and safety requirements of the application. In this paper, we address the problem of assigning harmonic periods to a task set such that every task gets assigned an integer period less than or equal to its application specified upper bound and the task utilization of every task is less than 1. We focus on integer solutions given the discrete nature of time in real-time computer systems.
We first express this problem of assigning harmonic periods to a task set as a discrete piecewise optimization problem. We then present the 'Discrete Piecewise Harmonic Search' (DPHS) algorithm that outputs an optimal harmonic task assignment. We then define conditions for a metric to be rational for harmonization. We show that commonly used metrics, such as the total percentage error (TPE), total system utilization (TSU), first order error (FOE), and maximum percentage error (MPE), are rational. We next prove that the DPHS algorithm finds the optimal feasible assignment, if one exists, for these rational metrics. We apply the DPHS algorithm to harmonize task sets used in real-world applications to highlight its benefits. We compare the performance of the DPHS algorithm against a brute-force search and find that the DPHS searches up to 94% fewer task sets than the brute-force search that obtains the optimal solution.
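As a small illustration of the terms used in this abstract, the sketch below checks whether a set of integer periods is harmonic (every period divides every larger one) and whether total utilization stays within the 100% bound that the abstract cites for harmonic task sets under RMS. This is not the DPHS algorithm; the function names and the `(wcet, period)` task representation are assumptions made for the example.

```python
from fractions import Fraction


def is_harmonic(periods):
    """Return True if every period divides every larger period.

    Checking consecutive pairs of the sorted list is enough, because
    divisibility is transitive.
    """
    ordered = sorted(periods)
    return all(b % a == 0 for a, b in zip(ordered, ordered[1:]))


def total_utilization(tasks):
    """Sum of wcet / period over (wcet, period) pairs, as an exact fraction."""
    return sum(Fraction(wcet, period) for wcet, period in tasks)


def within_harmonic_rms_bound(tasks):
    """Harmonic periods and total utilization of at most 1 (the 100% bound)."""
    periods = [period for _, period in tasks]
    return is_harmonic(periods) and total_utilization(tasks) <= 1


if __name__ == "__main__":
    tasks = [(1, 2), (1, 4), (2, 8)]  # hypothetical (wcet, period) pairs
    print(is_harmonic([p for _, p in tasks]))
    print(total_utilization(tasks))
    print(within_harmonic_rms_bound(tasks))
```

Exact fractions are used instead of floats so that a task set sitting exactly on the 100% bound is not rejected because of rounding.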

[11]  arXiv:2302.03802 (cross-list from cs.CV) [pdf, other]
Title: Standing Between Past and Future: Spatio-Temporal Modeling for Multi-Camera 3D Multi-Object Tracking
Comments: 15 pages, 8 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Robotics (cs.RO)

This work proposes an end-to-end multi-camera 3D multi-object tracking (MOT) framework. It emphasizes spatio-temporal continuity and integrates both past and future reasoning for tracked objects. Thus, we name it "Past-and-Future reasoning for Tracking" (PF-Track). Specifically, our method adapts the "tracking by attention" framework and represents tracked instances coherently over time with object queries. To explicitly use historical cues, our "Past Reasoning" module learns to refine the tracks and enhance the object features by cross-attending to queries from previous frames and other objects. The "Future Reasoning" module digests historical information and predicts robust future trajectories. In the case of long-term occlusions, our method maintains the object positions and enables re-association by integrating motion predictions. On the nuScenes dataset, our method improves AMOTA by a large margin and reduces ID-Switches by 90% compared to prior approaches, an order-of-magnitude reduction. The code and models are made available at https://github.com/TRI-ML/PF-Track.

[12]  arXiv:2302.04013 (cross-list from cs.LG) [pdf, other]
Title: Zero-shot Sim2Real Adaptation Across Environments
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Robotics (cs.RO)

Simulation-based learning often provides a cost-efficient recourse for reinforcement learning applications in robotics. However, simulators are generally incapable of accurately replicating real-world dynamics, and thus bridging the sim2real gap is an important problem in simulation-based learning. Current solutions to bridge the sim2real gap involve hybrid simulators that are augmented with neural residual models. Unfortunately, they require a separate residual model for each individual environment configuration (i.e., a fixed setting of environment variables such as mass, friction, etc.), and thus are not transferable to new environments quickly. To address this issue, we propose a Reverse Action Transformation (RAT) policy which learns to imitate simulated policies in the real world. Once learnt from a single environment, RAT can then be deployed on top of a Universal Policy Network to achieve zero-shot adaptation to new environments. We empirically evaluate our approach in a set of continuous control tasks and observe its advantage as a few-shot and zero-shot learner over competing baselines.

[13]  arXiv:2302.04163 (cross-list from eess.SY) [pdf, ps, other]
Title: Task Space Control of Robot Manipulators based on Visual SLAM
Subjects: Systems and Control (eess.SY); Robotics (cs.RO)

This paper aims to address the open problem of designing a globally stable vision-based controller for robot manipulators. Accordingly, based on a hybrid mechanism, this paper proposes a novel task-space control law attained by taking the gradient of a potential function in SE(3). The key idea is to employ the Visual Simultaneous Localization and Mapping (VSLAM) algorithm to estimate a robot pose. The estimated robot pose is then used in the proposed hybrid controller as feedback information. Invoking Barbalat's lemma and Lyapunov's stability theorem, it is guaranteed that the resulting closed-loop system is globally asymptotically stable, which is the main accomplishment of the proposed structure. Simulation studies are conducted on a six-degrees-of-freedom (6-DOF) robot manipulator to demonstrate the effectiveness and validate the performance of the proposed VSLAM-based control scheme.

[14]  arXiv:2302.04233 (cross-list from cs.CV) [pdf, other]
Title: SkyEye: Self-Supervised Bird's-Eye-View Semantic Mapping Using Monocular Frontal View Images
Comments: 14 pages, 7 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)

Bird's-Eye-View (BEV) semantic maps have become an essential component of automated driving pipelines due to the rich representation they provide for decision-making tasks. However, existing approaches for generating these maps still follow a fully supervised training paradigm and hence rely on large amounts of annotated BEV data. In this work, we address this limitation by proposing the first self-supervised approach for generating a BEV semantic map using a single monocular image from the frontal view (FV). During training, we overcome the need for BEV ground truth annotations by leveraging the more easily available FV semantic annotations of video sequences. Thus, we propose the SkyEye architecture that learns based on two modes of self-supervision, namely, implicit supervision and explicit supervision. Implicit supervision trains the model by enforcing spatial consistency of the scene over time based on FV semantic sequences, while explicit supervision exploits BEV pseudolabels generated from FV semantic annotations and self-supervised depth estimates. Extensive evaluations on the KITTI-360 dataset demonstrate that our self-supervised approach performs on par with the state-of-the-art fully supervised methods and achieves competitive results using only 1% of direct supervision in the BEV compared to fully supervised approaches. Finally, we publicly release both our code and the BEV datasets generated from the KITTI-360 and Waymo datasets.

Replacements for Thu, 9 Feb 23

[15]  arXiv:2202.00129 (replaced) [pdf, other]
Title: Fundamental Performance Limits for Sensor-Based Robot Control and Policy Learning
Comments: Submitted to the International Journal of Robotics Research for review
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Information Theory (cs.IT); Machine Learning (cs.LG); Optimization and Control (math.OC)
[16]  arXiv:2209.00414 (replaced) [pdf, other]
Title: Hey, robot! An investigation of getting robot's attention through touch
Comments: 14 pages, 4 figures; 'International Conference on Social Robotics (ICSR)'. Lecture Notes in Computer Science, vol 13817. Springer, Cham, pp. 388-401 (2022)
Subjects: Robotics (cs.RO); Human-Computer Interaction (cs.HC)
[17]  arXiv:2209.07899 (replaced) [pdf, other]
Title: Versatile Skill Control via Self-supervised Adversarial Imitation of Unlabeled Mixed Motions
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[18]  arXiv:2209.08349 (replaced) [src]
Title: Reinforcement Learning for Self-exploration in Narrow Spaces
Comments: Plan to redo the experiments in section 4, and will use different evaluation methods due to expert comments
Subjects: Robotics (cs.RO)
[19]  arXiv:2209.08499 (replaced) [pdf, other]
Title: Multi-segmented Adaptive Feet for Versatile Legged Locomotion in Natural Terrain
Subjects: Robotics (cs.RO)
[20]  arXiv:2301.05206 (replaced) [pdf, other]
Title: ImMesh: An Immediate LiDAR Localization and Meshing Framework
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[21]  arXiv:2301.09708 (replaced) [pdf, other]
Title: Koopman Operators for Modeling and Control of Soft Robotics
Comments: Accepted to Current Robotics Reports
Subjects: Robotics (cs.RO)
[22]  arXiv:2301.12934 (replaced) [pdf, other]
Title: Coarse-to-fine Hybrid 3D Mapping System with Co-calibrated Omnidirectional Camera and Non-repetitive LiDAR
Comments: Accepted by IEEE Robotics and Automation Letters (RA-L)
Subjects: Robotics (cs.RO)
[23]  arXiv:2302.01703 (replaced) [pdf, other]
Title: DAMS-LIO: A Degeneration-Aware and Modular Sensor-Fusion LiDAR-inertial Odometry
Subjects: Robotics (cs.RO)
[24]  arXiv:2302.02956 (replaced) [pdf, other]
Title: RoboCup 2022 AdultSize Winner NimbRo: Upgraded Perception, Capture Steps Gait and Phase-based In-walk Kicks
Journal-ref: In: RoboCup 2022: Robot World Cup XXV. LNCS 13561, Springer, May 2023
Subjects: Robotics (cs.RO)
[25]  arXiv:2209.00465 (replaced) [pdf, other]
Title: On Grounded Planning for Embodied Tasks with Language Models
Comments: Accepted to AAAI 2023
Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG); Robotics (cs.RO)
[26]  arXiv:2210.06518 (replaced) [pdf, other]
Title: Semi-Supervised Offline Reinforcement Learning with Action-Free Trajectories
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Robotics (cs.RO)
[27]  arXiv:2212.01346 (replaced) [pdf, other]
Title: Guaranteed Conformance of Neurosymbolic Models to Natural Constraints
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Robotics (cs.RO)
[28]  arXiv:2302.02367 (replaced) [pdf, other]
Title: FastPillars: A Deployment-friendly Pillar-based 3D Detector
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)