
Computer Science

New submissions

[ total of 395 entries: 1-395 ]

New submissions for Thu, 23 Sep 21

[1]  arXiv:2109.10347 [pdf, other]
Title: Security-Hardening Software Libraries with Ada and SPARK -- A TCP Stack Use Case
Comments: 37 pages, 4 figures, 2 tables, white paper, Software/Program Verification
Subjects: Cryptography and Security (cs.CR)

This white paper demonstrates how the assurance, reliability, and security of an existing professional-grade, open-source embedded TCP/IP stack implementation written in the C programming language are significantly enhanced by adopting the SPARK technology. A multifaceted approach achieves this. Firstly, the TCP layer's C code is being replaced with formally verified SPARK, a subset of the Ada programming language supported by formal verification tools. Then the lower layers, still written in C and on which the TCP layer depends, are modeled using SPARK contracts and validated using symbolic execution with KLEE. Finally, formal contracts for the upper layers are defined to call the TCP layer. The work allowed the detection and correction of two bugs in the TCP layer. In an increasingly connected world, where cybersecurity is of paramount importance, the powerful approach detailed in this work can be applied to any existing critical C library to harden its reliability and security significantly.

[2]  arXiv:2109.10376 [pdf, other]
Title: Learning through structure: towards deep neuromorphic knowledge graph embeddings
Comments: Accepted for publication at the International Conference on Neuromorphic Computing (ICNC 2021)
Subjects: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Neurons and Cognition (q-bio.NC)

Computing latent representations for graph-structured data is a ubiquitous learning task in many industrial and academic applications ranging from molecule synthesis to social network analysis and recommender systems. Knowledge graphs are among the most popular and widely used data representations related to the Semantic Web. Next to structuring factual knowledge in a machine-readable format, knowledge graphs serve as the backbone of many artificial intelligence applications and allow the ingestion of context information into various learning algorithms. Graph neural networks attempt to encode graph structures in low-dimensional vector spaces via a message passing heuristic between neighboring nodes. In recent years, a multitude of different graph neural network architectures have demonstrated ground-breaking performance in many learning tasks. In this work, we propose a strategy to map deep graph learning architectures for knowledge graph reasoning to neuromorphic architectures. Based on the insight that randomly initialized and untrained (i.e., frozen) graph neural networks are able to preserve local graph structures, we compose a frozen neural network with shallow knowledge graph embedding models. We experimentally show that already on conventional computing hardware, this leads to a significant speedup and memory reduction while maintaining a competitive performance level. Moreover, we extend the frozen architecture to spiking neural networks, introducing a novel, event-based and highly sparse knowledge graph embedding algorithm that is suitable for implementation in neuromorphic hardware.

[3]  arXiv:2109.10380 [pdf, other]
Title: Deep Policies for Online Bipartite Matching: A Reinforcement Learning Approach
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)

From assigning computing tasks to servers and advertisements to users, sequential online matching problems arise in a wide variety of domains. The challenge in online matching lies in making irrevocable assignments while there is uncertainty about future inputs. In the theoretical computer science literature, most policies are myopic or greedy in nature. In real-world applications where the matching process is repeated on a regular basis, the underlying data distribution can be leveraged for better decision-making. We present an end-to-end Reinforcement Learning framework for deriving better matching policies based on trial-and-error on historical data. We devise a set of neural network architectures, design feature representations, and empirically evaluate them across two online matching problems: Edge-Weighted Online Bipartite Matching and Online Submodular Bipartite Matching. We show that most of the learning approaches perform significantly better than classical greedy algorithms on four synthetic and real-world datasets. Our code is publicly available at https://github.com/lyeskhalil/CORL.git.
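
For context, the kind of classical greedy baseline the abstract compares against can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `greedy_online_matching` and the representation of each arrival as a dictionary of edge weights are assumptions made for this example.

```python
from typing import Dict, List, Optional, Tuple


def greedy_online_matching(
    arrivals: List[Dict[str, float]],
) -> Tuple[List[Optional[str]], float]:
    """Greedy edge-weighted online bipartite matching (illustrative sketch).

    Each element of `arrivals` maps offline node names to edge weights for
    one arriving online node. Every decision is irrevocable.
    """
    taken = set()                        # offline nodes already matched
    assignment: List[Optional[str]] = []
    total = 0.0
    for weights in arrivals:
        # Only offline nodes that are still free can be chosen.
        free = {u: w for u, w in weights.items() if u not in taken}
        if not free:
            assignment.append(None)      # the arrival stays unmatched
            continue
        best = max(free, key=free.get)   # myopic choice: heaviest free edge
        taken.add(best)
        assignment.append(best)
        total += free[best]
    return assignment, total


# Hypothetical instance: the second arrival can only use node "a".
arrivals = [{"a": 1.0, "b": 0.9}, {"a": 1.0}]
assignment, total = greedy_online_matching(arrivals)
print(assignment, total)  # ['a', None] 1.0
```

On this toy instance greedy collects weight 1.0, whereas matching the first arrival to "b" and the second to "a" would collect 1.9. That gap is the kind of myopia a policy learned from historical data could reduce.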

[4]  arXiv:2109.10385 [pdf, other]
Title: Learning to Guide Human Attention on Mobile Telepresence Robots with 360 degree Vision
Subjects: Robotics (cs.RO)

Mobile telepresence robots (MTRs) allow people to navigate and interact with a remote environment that is in a place other than the person's true location. Thanks to the recent advances in 360-degree vision, many MTRs are now equipped with an all-degree visual perception capability. However, people's visual field horizontally spans only about 120 degrees of the visual field captured by the robot. To bridge this observability gap toward human-MTR shared autonomy, we have developed a framework, called GHAL360, to enable the MTR to learn a goal-oriented policy from reinforcements for guiding human attention using visual indicators. Three telepresence environments were constructed using datasets that are extracted from Matterport3D and collected from a real robot, respectively. Experimental results show that GHAL360 outperformed the baselines from the literature in the efficiency of a human-MTR team completing target search tasks.

[5]  arXiv:2109.10387 [pdf, other]
Title: Toward Reusable Science with Readable Code and Reproducibility
Comments: 10 pages, 10 figures
Subjects: Digital Libraries (cs.DL)

An essential part of research and scientific communication is researchers' ability to reproduce the results of others. While there have been increasing standards for authors to make data and code available, many of these files are hard to re-execute in practice, leading to a lack of research reproducibility. This poses a major problem for students and researchers in the same field who cannot leverage the previously published findings for study or further inquiry. To address this, we propose an open-source platform named RE3 that helps improve the reproducibility and readability of research projects involving R code. Our platform incorporates assessing code readability with a machine learning model trained on a code readability survey and an automatic containerization service that executes code files and warns users of reproducibility errors. This process helps ensure the reproducibility and readability of projects and therefore fast-track their verification and reuse.

[6]  arXiv:2109.10390 [pdf, other]
Title: Coast Sargassum Level Estimation from Smartphone Pictures
Comments: Under preparation for submission
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Since 2011, significant and atypical arrivals of two species of surface-dwelling algae, Sargassum natans and Sargassum fluitans, have been detected in the Mexican Caribbean. This massive accumulation of algae has had a great environmental and economic impact. Therefore, for the government, ecologists, and local businesses, it is important to keep track of the amount of sargassum that arrives on the Caribbean coast. High-resolution satellite imagery is expensive or may be time delayed. Therefore, we propose to estimate the amount of sargassum based on ground-level smartphone photographs. From the computer vision perspective, the problem is quite difficult since no information about the 3D world is provided; consequently, we model it as a classification problem, where a set of five labels defines the amount. For this purpose, we have built a dataset with more than one thousand examples from public forums such as Facebook or Instagram and we have tested several state-of-the-art convolutional networks. As a result, the VGG network trained under fine-tuning showed the best performance. Even though the reached accuracy could be improved with more examples, the current prediction distribution is narrow, so the predictions are adequate for keeping a record and taking quick ecological actions.

[7]  arXiv:2109.10392 [pdf, other]
Title: Multi-Modal Model Predictive Control through Batch Non-Holonomic Trajectory Optimization: Application to Highway Driving
Comments: Submitted to IEEE Robotics and Automation Letters (RA-L)
Subjects: Robotics (cs.RO)

Standard Model Predictive Control (MPC) or trajectory optimization approaches perform only a local search to solve a complex non-convex optimization problem. As a result, they cannot capture the multi-modal characteristic of human driving. A global optimizer can be a potential solution but is computationally intractable in a real-time setting. In this paper, we present a real-time MPC capable of searching over different driving modalities. Our basic idea is simple: we run several goal-directed parallel trajectory optimizations and score the resulting trajectories based on user-defined meta cost functions. This allows us to perform a global search over several locally optimal motion plans. Although conceptually straightforward, realizing this idea in real-time with existing optimizers is highly challenging from technical and computational standpoints. With this motivation, we present a novel batch non-holonomic trajectory optimization whose underlying matrix algebra is easily parallelizable across problem instances and reduces to computing large batch matrix-vector products. This structure, in turn, is achieved by deriving a linearization-free multi-convex reformulation of the non-holonomic kinematics and collision avoidance constraints. We extensively validate our approach using both synthetic and real data sets (NGSIM) of traffic scenarios. We highlight how our algorithm automatically takes lane-change and overtaking decisions based on the defined meta cost function. Our batch optimizer achieves trajectories with lower meta cost, up to 6x faster than competing baselines.

[8]  arXiv:2109.10393 [pdf, other]
Title: Towards a Real-Time Facial Analysis System
Comments: Accepted in IEEE MMSP 2021
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

Facial analysis is an active research area in computer vision, with many practical applications. Most of the existing studies focus on addressing one specific task and maximizing its performance. For a complete facial analysis system, one needs to solve these tasks efficiently to ensure a smooth experience. In this work, we present a system-level design of a real-time facial analysis system. With a collection of deep neural networks for object detection, classification, and regression, the system recognizes age, gender, facial expression, and facial similarity for each person that appears in the camera view. We investigate the parallelization and interplay of individual tasks. Results on common off-the-shelf architecture show that the system's accuracy is comparable to the state-of-the-art methods, and the recognition speed satisfies real-time requirements. Moreover, we propose a multitask network for jointly predicting the first three attributes, i.e., age, gender, and facial expression. Source code and trained models are available at https://github.com/mahehu/TUT-live-age-estimator.

[9]  arXiv:2109.10397 [pdf, other]
Title: Grammatical Profiling for Semantic Change Detection
Comments: CoNLL 2021
Subjects: Computation and Language (cs.CL)

Semantics, morphology and syntax are strongly interdependent. However, the majority of computational methods for semantic change detection use distributional word representations which encode mostly semantics. We investigate an alternative method, grammatical profiling, based entirely on changes in the morphosyntactic behaviour of words. We demonstrate that it can be used for semantic change detection and even outperforms some distributional semantic methods. We present an in-depth qualitative and quantitative analysis of the predictions made by our grammatical profiling system, showing that they are plausible and interpretable.

[10]  arXiv:2109.10400 [pdf, other]
Title: ARROCH: Augmented Reality for Robots Collaborating with a Human
Subjects: Robotics (cs.RO)

Human-robot collaboration frequently requires extensive communication, e.g., using natural language and gestures. Augmented reality (AR) has provided an alternative way of bridging the communication gap between robots and people. However, most current AR-based human-robot communication methods are unidirectional, focusing on how the human adapts to robot behaviors, and are limited to single-robot domains. In this paper, we develop AR for Robots Collaborating with a Human (ARROCH), a novel algorithm and system that supports bidirectional, multi-turn, human-multi-robot communication in indoor multi-room environments. The human can see through obstacles to observe the robots' current states and intentions, and provide feedback, while the robots' behaviors are then adjusted toward human-multi-robot teamwork. Experiments have been conducted with real robots and human participants using collaborative delivery tasks. Results show that ARROCH outperformed a standard non-AR approach in both user experience and teamwork efficiency. In addition, we have developed a novel simulation environment using Unity (for AR and human simulation) and Gazebo (for robot simulation). Results in simulation demonstrate ARROCH's superiority over AR-based baselines in human-robot collaboration.

[11]  arXiv:2109.10408 [pdf, ps, other]
Title: Non-intrusive Balancing Transformation of Highly Stiff Systems with Lightly-damped Impulse Response
Subjects: Systems and Control (eess.SY)

Balanced truncation (BT) is a model reduction method that utilizes a coordinate transformation to retain eigen-directions that are highly observable and reachable. To address realizability and scalability of BT applied to highly stiff and lightly-damped systems, a non-intrusive data-driven method is developed for balancing discrete-time systems via the eigensystem realization algorithm (ERA). The advantage of ERA for balancing transformation makes full-state outputs tractable. Further, ERA enables balancing despite stiffness, by eliminating computation of balancing modes and adjoint simulations. As a demonstrative example, we create balanced ROMs for a one-dimensional reactive flow with pressure forcing, where the stiffness introduced by the chemical source term is extreme (condition number $10^{13}$), preventing analytical implementation of BT. We investigate the performance of ROMs in prediction of dynamics with unseen forcing inputs and demonstrate stability and accuracy of balanced ROMs in truly predictive scenarios whereas without ERA, POD-Galerkin and Least-squares Petrov-Galerkin projections fail to represent the true dynamics. We show that after the initial transients under unit impulse forcing, the system undergoes lightly-damped oscillations, which magnifies the influence of sampling properties on predictive performance of the balanced ROMs. The importance of proper sampling is established via sensitivity analysis in a predictive setting.

[12]  arXiv:2109.10410 [pdf, other]
Title: RETRONLU: Retrieval Augmented Task-Oriented Semantic Parsing
Comments: 12 pages, 9 figures, 5 Tables
Subjects: Computation and Language (cs.CL); Information Retrieval (cs.IR); Machine Learning (cs.LG)

While large pre-trained language models accumulate a lot of knowledge in their parameters, it has been demonstrated that augmenting them with non-parametric retrieval-based memory has a number of benefits, from accuracy improvements to data efficiency, for knowledge-focused tasks such as question answering. In this paper, we apply retrieval-based modeling ideas to the problem of multi-domain task-oriented semantic parsing for conversational assistants. Our approach, RetroNLU, extends a sequence-to-sequence model architecture with a retrieval component, used to fetch existing similar examples and provide them as an additional input to the model. In particular, we analyze two settings, where we augment an input with (a) retrieved nearest neighbor utterances (utterance-nn), and (b) ground-truth semantic parses of nearest neighbor utterances (semparse-nn). Our technique outperforms the baseline method by 1.5% absolute macro-F1, especially in the low-resource setting, matching the baseline model accuracy with only 40% of the data. Furthermore, we analyze the nearest neighbor retrieval component's quality and model sensitivity, and break down the performance for semantic parses of different utterance complexity.

[13]  arXiv:2109.10412 [pdf]
Title: Social, Environmental, and Technical: Factors at Play in the Current Use and Future Design of Small-Group Captioning
Comments: 25 pages, 3 figures, to be published in the PACMHCI-CSCW2 October 2021 edition, to be presented at CSCW 2021
Subjects: Human-Computer Interaction (cs.HC)

Real-time captioning is a critical accessibility tool for many d/Deaf and hard of hearing (DHH) people. While the vast majority of captioning work has focused on formal settings and technical innovations, in contrast, we investigate captioning for informal, interactive small-group conversations, which have a high degree of spontaneity and foster dynamic social interactions. This paper reports on semi-structured interviews and design probe activities we conducted with 15 DHH participants to understand their use of existing real-time captioning services and future design preferences for both in-person and remote small-group communication. We found that our participants' experiences of captioned small-group conversations are shaped by social, environmental, and technical considerations (e.g., interlocutors' pre-established relationships, the type of captioning displays available, and how far captions lag behind speech). When considering future captioning tools, participants were interested in greater feedback on non-speech elements of conversation (e.g., speaker identity, speech rate, volume) both for their personal use and to guide hearing interlocutors toward more accessible communication. We contribute a qualitative account of DHH people's real-time captioning experiences during small-group conversation and future design considerations to better support the groups being captioned, both in person and online.

[14]  arXiv:2109.10415 [pdf, other]
Title: What Would it Take to get Biomedical QA Systems into Practice?
Comments: Accepted by MRQA workshop at EMNLP 2021
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)

Medical question answering (QA) systems have the potential to answer clinicians' uncertainties about treatment and diagnosis on demand, informed by the latest evidence. However, despite the significant progress in general QA made by the NLP community, medical QA systems are still not widely used in clinical environments. One likely reason for this is that clinicians may not readily trust QA system outputs, in part because transparency, trustworthiness, and provenance have not been key considerations in the design of such models. In this paper we discuss a set of criteria that, if met, we argue would likely increase the utility of biomedical QA systems, which may in turn lead to adoption of such systems in practice. We assess existing models, tasks, and datasets with respect to these criteria, highlighting shortcomings of previously proposed approaches and pointing toward what might be more usable QA systems.

[15]  arXiv:2109.10417 [pdf, other]
Title: Attacks on Visualization-Based Malware Detection: Balancing Effectiveness and Executability
Subjects: Cryptography and Security (cs.CR)

With the rapid development of machine learning for image classification, researchers have found new applications of visualization techniques in malware detection. By converting binary code into images, researchers have shown satisfactory results in applying machine learning to extract features that are difficult to discover manually. Such visualization-based malware detection methods can capture malware patterns from many different malware families and improve malware detection speed. On the other hand, recent research has also shown adversarial attacks against such visualization-based malware detection. Attackers can generate adversarial examples by perturbing the malware binary in non-reachable regions, such as padding at the end of the binary. Alternatively, attackers can perturb the malware image embedding and then verify the executability of the malware post-transformation. One major limitation of the first attack scenario is that a simple pre-processing step can remove the perturbations before classification. For the second attack scenario, it is hard to maintain the original malware's executability and functionality. In this work, we provide a literature review of existing malware visualization techniques and attacks against them. We summarize the limitations of the previous work, and design a new adversarial example attack against visualization-based malware detection that can evade pre-processing filtering and maintain the original malware functionality. We test our attack on a public malware dataset and achieve a 98% success rate.

[16]  arXiv:2109.10429 [pdf, other]
Title: Exploring Coevolutionary Dynamics of Competitive Arms-Races Between Infinitely Diverse Heterogenous Adaptive Automated Trader-Agents
Comments: 17 pages; 4 figures; 54 references
Subjects: Computational Engineering, Finance, and Science (cs.CE); Trading and Market Microstructure (q-fin.TR)

We report on a series of experiments in which we study the coevolutionary "arms-race" dynamics among groups of agents that engage in adaptive automated trading in an accurate model of contemporary financial markets. At any one time, every trader in the market is trying to make as much profit as possible given the current distribution of different other trading strategies that it finds itself pitched against in the market; but the distribution of trading strategies and their observable behaviors is constantly changing, and changes in any one trader are driven to some extent by the changes in all the others. Prior studies of coevolutionary dynamics in markets have concentrated on systems where traders can choose one of a small number of fixed pure strategies, and can change their choice occasionally, thereby giving a market with a discrete phase-space, made up of a finite set of possible system states. Here we present first results from two independent sets of experiments, where we use minimal-intelligence trading-agents but in which the space of possible strategies is continuous and hence infinite. Our work reveals that by taking only a small step in the direction of increased realism we move immediately into high-dimensional phase-spaces, which then present difficulties in visualising and understanding the coevolutionary dynamics unfolding within the system. We conclude that further research is required to establish better analytic tools for monitoring activity and progress in co-adapting markets. We have released relevant Python code as open-source on GitHub, to enable others to continue this work.

[17]  arXiv:2109.10430 [pdf, other]
Title: GAP2WSS: A Genetic Algorithm based on the Pareto Principle for Web Service Selection
Subjects: Neural and Evolutionary Computing (cs.NE); Distributed, Parallel, and Cluster Computing (cs.DC)

Despite all the progress in Web service selection, the need for an approach with better optimality and performance remains. This paper presents a genetic algorithm that adopts the Pareto principle, called GAP2WSS, for selecting a Web service for each task of a composite Web service from a pool of candidate Web services. In contrast to the existing approaches, all global QoS constraints, interservice constraints, and transactional constraints are considered simultaneously. At first, all candidate Web services are scored and ranked for each task using the proposed mechanism. Then, the top 20 percent of the candidate Web services of each task are retained as the candidates for the corresponding task to reduce the problem search space. Finally, the Web service selection problem is solved with a genetic algorithm that focuses only on this top 20 percent of candidates for each task. Empirical studies demonstrate that this approach leads to higher efficiency and efficacy compared with the case in which all the candidate Web services are considered in solving the problem.
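
The pruning step described above (keep the top 20 percent of candidates per task before running the genetic algorithm) can be sketched as follows. This is only an illustration under assumptions: the abstract does not specify the scoring mechanism, so scores are taken as given inputs, and the name `prune_candidates`, the round-up rule, and the "keep at least one candidate" rule are choices made for this example.

```python
import math
from typing import Dict, List


def prune_candidates(
    scores_by_task: Dict[str, Dict[str, float]],
    fraction: float = 0.2,
) -> Dict[str, List[str]]:
    """Keep the highest-scoring `fraction` of candidate services per task.

    `scores_by_task` maps task -> {service name -> score}. How the scores
    are produced is outside this sketch (assumed to be given).
    """
    pruned: Dict[str, List[str]] = {}
    for task, scores in scores_by_task.items():
        ranked = sorted(scores, key=scores.get, reverse=True)
        # Assumption: round up and never leave a task without a candidate.
        keep = max(1, math.ceil(fraction * len(ranked)))
        pruned[task] = ranked[:keep]
    return pruned


# Hypothetical scores: 10 candidates for "pay", 3 candidates for "ship".
scores_by_task = {
    "pay": {f"s{i}": i / 10 for i in range(10)},
    "ship": {"x": 0.3, "y": 0.8, "z": 0.5},
}
pruned = prune_candidates(scores_by_task)
print(pruned)  # {'pay': ['s9', 's8'], 'ship': ['y']}
```

With n tasks and m candidates per task, an exhaustive search would face m^n combinations; after pruning to 20 percent, it faces about (0.2m)^n, which is the search-space reduction the abstract refers to.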

[18]  arXiv:2109.10431 [pdf, other]
Title: Fairness without Imputation: A Decision Tree Approach for Fair Prediction with Missing Values
Subjects: Machine Learning (cs.LG); Computers and Society (cs.CY); Information Theory (cs.IT); Machine Learning (stat.ML)

We investigate the fairness concerns of training a machine learning model using data with missing values. Even though there are a number of fairness intervention methods in the literature, most of them require a complete training set as input. In practice, data can have missing values, and data missing patterns can depend on group attributes (e.g. gender or race). Simply applying off-the-shelf fair learning algorithms to an imputed dataset may lead to an unfair model. In this paper, we first theoretically analyze different sources of discrimination risks when training with an imputed dataset. Then, we propose an integrated approach based on decision trees that does not require a separate process of imputation and learning. Instead, we train a tree with missing incorporated as attribute (MIA), which does not require explicit imputation, and we optimize a fairness-regularized objective function. We demonstrate that our approach outperforms existing fairness intervention methods applied to an imputed dataset, through several experiments on real-world datasets.

[19]  arXiv:2109.10432 [pdf, other]
Title: Beyond Discriminant Patterns: On the Robustness of Decision Rule Ensembles
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)

Local decision rules are commonly understood to be more explainable, due to the local nature of the patterns involved. With numerical optimization methods such as gradient boosting, ensembles of local decision rules can gain good predictive performance on data involving global structure. Meanwhile, machine learning models are being increasingly used to solve problems in high-stakes domains including healthcare and finance. Here, there is an emerging consensus regarding the need for practitioners to understand whether and how those models could perform robustly in the deployment environments, in the presence of distributional shifts. Past research on local decision rules has focused mainly on maximizing discriminant patterns, without due consideration of robustness against distributional shifts. In order to fill this gap, we propose a new method to learn and ensemble local decision rules that are robust both in the training and deployment environments. Specifically, we propose to leverage causal knowledge by regarding the distributional shifts in subpopulations and deployment environments as the results of interventions on the underlying system. We propose two regularization terms based on causal knowledge to search for optimal and stable rules. Experiments on both synthetic and benchmark datasets show that our method is effective and robust against distributional shifts in multiple environments.

[20]  arXiv:2109.10433 [pdf, other]
Title: Transcoding Billions of Unicode Characters per Second with SIMD Instructions
Comments: Software: this https URL
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Data Structures and Algorithms (cs.DS)

In software, text is often represented using Unicode formats (UTF-8 and UTF-16). We frequently have to convert text from one format to the other, a process called transcoding. Popular transcoding functions are slower than state-of-the-art disks and networks. These transcoding functions make little use of the single-instruction-multiple-data (SIMD) instructions available on commodity processors. By designing transcoding algorithms for SIMD instructions, we multiply the speed of transcoding on current systems (x64 and ARM). To ensure reproducibility, we make our software freely available as an open source library.
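
To make the term transcoding concrete, a scalar (one character at a time) UTF-8 to UTF-16 conversion can be sketched as follows. This is not the SIMD algorithm from the paper; it is a plain reference version, and the function name `utf8_to_utf16_units` is hypothetical. UTF-8 decoding and validation are delegated to Python's built-in codec.

```python
from typing import List


def utf8_to_utf16_units(data: bytes) -> List[int]:
    """Transcode UTF-8 bytes into a list of 16-bit UTF-16 code units."""
    units: List[int] = []
    # bytes.decode raises UnicodeDecodeError on invalid UTF-8 input.
    for ch in data.decode("utf-8"):
        cp = ord(ch)
        if cp < 0x10000:
            # Basic Multilingual Plane: one code unit.
            units.append(cp)
        else:
            # Supplementary planes: split into a surrogate pair.
            cp -= 0x10000
            units.append(0xD800 | (cp >> 10))    # high surrogate
            units.append(0xDC00 | (cp & 0x3FF))  # low surrogate
    return units


text = "h\u00e9llo \U0001F600"  # contains a 2-byte and a 4-byte UTF-8 sequence
units = utf8_to_utf16_units(text.encode("utf-8"))
print([hex(u) for u in units])
```

A scalar loop like this processes one character per iteration; the approach summarized in the abstract instead relies on SIMD instructions, which operate on many bytes at once.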

[21]  arXiv:2109.10435 [pdf]
Title: ICEV dismantling or recycling on a challenging environment
Comments: 17 pages and no figures
Subjects: Systems and Control (eess.SY)

Nowadays, sustainability is a major issue. Sustainability deals with the need to protect the natural environment and the health of ecosystems, and it requires innovation and commitment to the future. This manuscript uses the infinite-server queue system with Poisson arrivals to model Internal Combustion Engine Vehicles (ICEV), normally but not only cars, which become idle when conventional energy becomes scarce or a new status quo is required. In such a case, they are recycled, becoming either EV (Electric Vehicles), HEV (Hybrid Electric Vehicles), or FCEV (Fuel Cell Electric Vehicles), or they are dismantled (DV, Dismantled Vehicles). Our model shows that when the rate at which ICEV become EV, HEV, FCEV, and DV is greater than the rate at which they become idle, the system tends to balance. From a cost-benefit analysis perspective, there are minimum benefits above which both dismantling and recycling are worthwhile. Additionally, the more attractive option is the one for which the minimum benefit is the least.

[22]  arXiv:2109.10436 [pdf, ps, other]
Title: Classification with Nearest Disjoint Centroids
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)

In this paper, we develop a new classification method based on nearest centroid, and it is called the nearest disjoint centroid classifier. Our method differs from the nearest centroid classifier in the following two aspects: (1) the centroids are defined based on disjoint subsets of features instead of all the features, and (2) the distance is induced by the dimensionality-normalized norm instead of the Euclidean norm. We provide a few theoretical results regarding our method. In addition, we propose a simple algorithm based on adapted k-means clustering that can find the disjoint subsets of features used in our method, and extend the algorithm to perform feature selection. We evaluate and compare the performance of our method to other closely related classifiers on both simulated data and real-world gene expression datasets. The results demonstrate that our method is able to outperform other competing classifiers by having smaller misclassification rates and/or using fewer features in various settings and situations.
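
For reference, the standard nearest centroid classifier that this method departs from can be sketched as follows. This is the baseline only (all features, Euclidean norm), not the nearest disjoint centroid classifier itself; the function names `fit_centroids` and `predict` and the toy data are assumptions made for this example.

```python
import math
from typing import Dict, Hashable, List, Sequence


def fit_centroids(
    X: Sequence[Sequence[float]], y: Sequence[Hashable]
) -> Dict[Hashable, List[float]]:
    """Compute one centroid (feature-wise mean) per class label."""
    sums: Dict[Hashable, List[float]] = {}
    counts: Dict[Hashable, int] = {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {
        label: [s / counts[label] for s in acc] for label, acc in sums.items()
    }


def predict(centroids: Dict[Hashable, List[float]], x: Sequence[float]) -> Hashable:
    """Assign x to the class whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(x, centroids[label]))


# Hypothetical two-class toy data.
X = [[0.0, 0.0], [0.0, 2.0], [10.0, 10.0], [12.0, 10.0]]
y = ["a", "a", "b", "b"]
centroids = fit_centroids(X, y)
print(centroids)                      # {'a': [0.0, 1.0], 'b': [11.0, 10.0]}
print(predict(centroids, [1.0, 1.0])) # a
```

According to the abstract, the proposed method changes two things relative to this baseline: centroids are defined on disjoint subsets of features, and the Euclidean norm is replaced by a dimensionality-normalized norm.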

[23]  arXiv:2109.10441 [pdf, other]
Title: Evaluating Debiasing Techniques for Intersectional Biases
Comments: To appear in EMNLP 2021
Subjects: Computation and Language (cs.CL)

Bias is pervasive in NLP models, motivating the development of automatic debiasing techniques. Evaluation of NLP debiasing methods has largely been limited to binary attributes in isolation, e.g., debiasing with respect to binary gender or race; however, many corpora involve multiple such attributes, possibly with higher cardinality. In this paper we argue that a truly fair model must consider `gerrymandering' groups which comprise not only single attributes, but also intersectional groups. We evaluate a form of bias-constrained model which is new to NLP, as well as an extension of the iterative nullspace projection technique which can handle multiple protected attributes.

[24]  arXiv:2109.10442 [pdf, other]
Title: Selecting Datasets for Evaluating an Enhanced Deep Learning Framework
Comments: 5 pages, 2 figures, Submitted to SATNAC 2021, Drakensberg, South Africa
Subjects: Machine Learning (cs.LG)

A framework was developed to address limitations associated with existing techniques for analysing sequences. This work deals with the steps followed to select suitable datasets characterised by discrete irregular sequential patterns. To identify, select, explore and evaluate datasets from various sources extracted from more than 400 research articles, an interquartile range method for outlier calculation and a qualitative Billauer's algorithm were adapted to provide periodical peak detection in such datasets. The developed framework was then tested using the most appropriate datasets. The research concluded that the financial market-daily currency exchange domain is the most suitable kind of dataset for the evaluation of the designed deep learning framework, as it provides high levels of discrete irregular patterns.
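
As an illustration of the interquartile range method mentioned above, the following is a minimal sketch of IQR-based outlier flagging. It is not the authors' adapted procedure: the function name `iqr_outliers`, the conventional fence multiplier `k=1.5`, and the sample values are all assumptions made for this example.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return the values outside the fences [Q1 - k*IQR, Q3 + k*IQR].

    k=1.5 is the conventional choice, assumed here; the abstract
    does not state which multiplier the authors used.
    """
    # Quartiles of the sample (inclusive method treats data as the full population)
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical sequence with one obvious spike
sample = [10, 12, 11, 13, 12, 11, 10, 95]
outliers = iqr_outliers(sample)
print(outliers)
```

In this sketch a value is flagged only relative to the spread of the middle half of the data, so a single extreme spike does not widen the fences.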

[25]  arXiv:2109.10443 [pdf, other]
Title: Geometric Fabrics: Generalizing Classical Mechanics to Capture the Physics of Behavior
Subjects: Robotics (cs.RO); Systems and Control (eess.SY)

Classical mechanical systems are central to controller design in energy shaping methods of geometric control. However, their expressivity is limited by position-only metrics and the intimate link between metric and geometry. Recent work on Riemannian Motion Policies (RMPs) has shown that shedding these restrictions results in powerful design tools, but at the expense of theoretical guarantees. In this work, we generalize classical mechanics to what we call geometric fabrics, whose expressivity and theory enable the design of systems that outperform RMPs in practice. Geometric fabrics strictly generalize classical mechanics forming a new physics of behavior by first generalizing them to Finsler geometries and then explicitly bending them to shape their behavior. We develop the theory of fabrics and present both a collection of controlled experiments examining their theoretical properties and a set of robot system experiments showing improved performance over a well-engineered and hardened implementation of RMPs, our current state-of-the-art in controller design.

[26]  arXiv:2109.10444 [pdf, other]
Title: Fairness-aware Class Imbalanced Learning
Comments: To appear in EMNLP 2021
Subjects: Computation and Language (cs.CL)

Class imbalance is a common challenge in many NLP tasks, and has clear connections to bias, in that bias in training data often leads to higher accuracy for majority groups at the expense of minority groups. However, there has traditionally been a disconnect between research on class-imbalanced learning and mitigating bias, and only recently have the two been looked at through a common lens. In this work we evaluate long-tail learning methods for tweet sentiment and occupation classification, and extend a margin-loss based approach with methods to enforce fairness. We empirically show through controlled experiments that the proposed approaches help mitigate both class imbalance and demographic biases.

[27]  arXiv:2109.10445 [pdf, other]
Title: Robust Visual Teach and Repeat for UGVs Using 3D Semantic Maps
Subjects: Robotics (cs.RO)

In this paper, we propose a Visual Teach and Repeat (VTR) algorithm using semantic landmarks extracted from environmental objects for ground robots with fixed mount monocular cameras. The proposed algorithm is robust to changes in the starting pose of the camera/robot, where a pose is defined as the planar position plus the orientation around the vertical axis. VTR consists of a teach phase in which a robot moves in a prescribed path, and a repeat phase in which the robot tries to repeat the same path starting from the same or a different pose. Most available VTR algorithms are pose dependent and cannot perform well in the repeat phase when starting from an initial pose far from that of the teach phase. To achieve more robust pose independence, during the teach phase, we collect the camera poses and the 3D point clouds of the environment using ORB-SLAM. We also detect objects in the environment using YOLOv3. We then combine the two outputs to build a 3D semantic map of the environment consisting of the 3D position of the objects and the robot path. In the repeat phase, we relocalize the robot based on the detected objects and the stored semantic map. The robot is then able to move toward the teach path, and repeat it in both forward and backward directions. The results show that our algorithm is highly robust with respect to pose variations as well as environmental alterations. Our code and data are available at the following GitHub page: https://github.com/mmahdavian/semantic_visual_teach_repeat

[28]  arXiv:2109.10447 [pdf, ps, other]
Title: Adding Negation to Lambda Mu
Comments: 37 pages
Subjects: Logic in Computer Science (cs.LO)

We present $\cal L$, an extension of Parigot's $\lambda\mu$-calculus by adding negation as a type constructor, together with syntactic constructs that represent negation introduction and elimination. We will define a notion of reduction that extends $\lambda\mu$'s reduction system with two new reduction rules, and show that the system satisfies subject reduction. Using Aczel's generalisation of Tait and Martin-L\"of's notion of parallel reduction, we show that this extended reduction is confluent. Although the notion of type assignment has its limitations with respect to representation of proofs in natural deduction with implication and negation, we will show that all propositions that can be shown in there have a witness in $\cal L$. Using Girard's approach of reducibility candidates, we show that all typeable terms are strongly normalisable, and conclude the paper by showing that type assignment for $\cal L$ enjoys the principal typing property.

[29]  arXiv:2109.10450 [pdf, other]
Title: Towards cyber-physical systems robust to communication delays: A differential game approach
Comments: 7 pages, 5 figures, Submitted to IEEE Control Systems Letters
Subjects: Systems and Control (eess.SY); Robotics (cs.RO); Dynamical Systems (math.DS)

Collaboration between interconnected cyber-physical systems is becoming increasingly pervasive. Time-delays in communication channels between such systems are known to induce catastrophic failure modes, like high frequency oscillations in robotic manipulators in bilateral teleoperation or string instability in platoons of autonomous vehicles. This paper considers nonlinear time-delay systems representing coupled robotic agents, and proposes controllers that are robust to time-varying communication delays. We introduce approximations that allow the delays to be considered as implicit control inputs themselves, and formulate the problem as a zero-sum differential game between the stabilizing controllers and the delays acting adversarially. The ensuing optimal control law is finally compared to known results from Lyapunov-Krasovskii based approaches via numerical experiments.

[30]  arXiv:2109.10453 [pdf, other]
Title: Extracting Fine-Grained Knowledge Graphs of Scientific Claims: Dataset and Transformer-Based Results
Comments: 8 pages, 4 tables, 4 figures, accepted at EMNLP 2021
Subjects: Computation and Language (cs.CL)

Recent transformer-based approaches demonstrate promising results on relational scientific information extraction. Existing datasets focus on high-level description of how research is carried out. Instead we focus on the subtleties of how experimental associations are presented by building SciClaim, a dataset of scientific claims drawn from Social and Behavior Science (SBS), PubMed, and CORD-19 papers. Our novel graph annotation schema incorporates not only coarse-grained entity spans as nodes and relations as edges between them, but also fine-grained attributes that modify entities and their relations, for a total of 12,738 labels in the corpus. By including more label types and more than twice the label density of previous datasets, SciClaim captures causal, comparative, predictive, statistical, and proportional associations over experimental variables along with their qualifications, subtypes, and evidence. We extend work in transformer-based joint entity and relation extraction to effectively infer our schema, showing the promise of fine-grained knowledge graphs in scientific claims and beyond.

[31]  arXiv:2109.10454 [pdf, other]
Title: Modewise Operators, the Tensor Restricted Isometry Property, and Low-Rank Tensor Recovery
Subjects: Numerical Analysis (math.NA)

Recovery of sparse vectors and low-rank matrices from a small number of linear measurements is well-known to be possible under various model assumptions on the measurements. The key requirement on the measurement matrices is typically the restricted isometry property, that is, approximate orthonormality when acting on the subspace to be recovered. Among the most widely used random matrix measurement models are (a) independent sub-gaussian models and (b) randomized Fourier-based models, allowing for the efficient computation of the measurements.
For the now ubiquitous tensor data, direct application of the known recovery algorithms to the vectorized or matricized tensor is awkward and memory-heavy because of the huge measurement matrices to be constructed and stored. In this paper, we propose modewise measurement schemes based on sub-gaussian and randomized Fourier measurements. These modewise operators act on pairs or other small subsets of the tensor modes separately. They require significantly less memory than measurements working on the vectorized tensor, provably satisfy the tensor restricted isometry property, and experimentally can recover the tensor data from fewer measurements without requiring impractical storage.

[32]  arXiv:2109.10455 [pdf, other]
Title: An Audio Synthesis Framework Derived from Industrial Process Control
Authors: Ashwin Pillay
Comments: 10 pages, 24 figures
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)

Since its conception, digital synthesis has significantly influenced the advancement of music, leading to new genres and production styles. Through existing synthesis techniques, one can recreate naturally occurring sounds as well as generate innovative artificial timbres. However, research in audio technology continues to pursue new methods of synthesizing sounds, keeping the transformation of music constant. This research attempts to formulate the framework of a new synthesis technique by redefining the popular Proportional-Integral-Derivative (PID) algorithm used in feedback-based process control. The framework is then implemented as a Python application to study the available control parameters and their effect on the synthesized output. Further, applications of this technique as an audio signal and LFO generator, including its potentiality as an alternative to FM and Wavetable synthesis techniques, are studied in detail. The research concludes by highlighting some of the imperfections in the current framework and the possible research directions to be considered to address them.
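
For readers unfamiliar with the PID algorithm referred to above, here is a minimal sketch of the textbook discrete-time PID loop. It is not the author's synthesis framework: the plant model (a pure integrator), the gains, the time step, and the names `pid_step` and `simulate` are assumptions chosen for illustration. The abstract does not specify how the controller output is mapped to audio.

```python
def pid_step(error, state, kp, ki, kd, dt):
    """One update of a textbook PID controller; returns the control output."""
    state["integral"] += error * dt                    # accumulate the I term
    derivative = (error - state["prev_error"]) / dt    # finite-difference D term
    state["prev_error"] = error
    return kp * error + ki * state["integral"] + kd * derivative


def simulate(setpoint=1.0, kp=2.0, ki=1.0, kd=0.0, dt=0.01, steps=2000):
    """Drive a hypothetical integrator plant (dy/dt = u) toward the setpoint."""
    state = {"integral": 0.0, "prev_error": 0.0}
    y = 0.0
    trace = []
    for _ in range(steps):
        u = pid_step(setpoint - y, state, kp, ki, kd, dt)
        y += u * dt          # forward-Euler step of the assumed plant
        trace.append(y)
    return trace


trace = simulate()
print(round(trace[-1], 3))
```

The recorded `trace` is simply a sampled signal; changing the gains alters how quickly and how smoothly it approaches the setpoint, which is the kind of control parameter a PID-based generator would expose.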

[33]  arXiv:2109.10457 [pdf, other]
Title: Infrastructure Node-based Vehicle Localization for Autonomous Driving
Comments: 7 pages, 8 figures
Subjects: Robotics (cs.RO)

Vehicle localization is essential for autonomous vehicle (AV) navigation and Advanced Driver Assistance Systems (ADAS). Accurate vehicle localization is often achieved via expensive inertial navigation systems or by employing compute-intensive vision processing (LiDAR/camera) to augment the low-cost and noisy inertial sensors. Here we have developed a framework for fusing the information obtained from a smart infrastructure node (ix-node) with the autonomous vehicle's on-board localization engine to estimate the pose of the ego-vehicle robustly and accurately even with cheap inertial sensors. A smart ix-node is typically used to augment the perception capability of an autonomous vehicle, especially when the onboard perception sensors of AVs are blocked by the dynamic and static objects in the environment, thereby making them ineffectual. In this work, we utilize this perception output from an ix-node to increase the localization accuracy of the AV. The fusion of ix-node perception output with the vehicle's low-cost inertial sensors allows us to perform reliable vehicle localization without relying on expensive inertial navigation systems or compute-intensive vision processing onboard the AVs. The proposed approach has been tested on real-world datasets collected from a test track in Ann Arbor, Michigan. Detailed analysis of the experimental results shows that incorporating ix-node data improves localization performance.

[34]  arXiv:2109.10458 [pdf, other]
Title: Achieving Counterfactual Fairness for Causal Bandit
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computers and Society (cs.CY)

In online recommendation, customers arrive in a sequential and stochastic manner from an underlying distribution and the online decision model recommends a chosen item for each arriving individual based on some strategy. We study how to recommend an item at each step to maximize the expected reward while achieving user-side fairness for customers, i.e., customers who share similar profiles will receive a similar reward regardless of their sensitive attributes and items being recommended. By incorporating causal inference into bandits and adopting soft intervention to model the arm selection strategy, we first propose the d-separation based UCB algorithm (D-UCB) to explore the utilization of the d-separation set in reducing the amount of exploration needed to achieve low cumulative regret. Based on that, we then propose the fair causal bandit (F-UCB) for achieving counterfactual individual fairness. Both theoretical analysis and empirical evaluation demonstrate the effectiveness of our algorithms.
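
As background for the UCB family that D-UCB and F-UCB build on, the following is a minimal sketch of the classical UCB1 rule on Bernoulli arms. It does not implement D-UCB or F-UCB, and it involves no causal graph or fairness constraint. The arm probabilities, horizon, seed, and the function name `ucb1` are assumptions for this example.

```python
import math
import random


def ucb1(arm_probs, rounds, seed=0):
    """Run classical UCB1 on Bernoulli arms; return the pull count per arm."""
    rng = random.Random(seed)
    n_arms = len(arm_probs)
    counts = [0] * n_arms      # number of pulls per arm
    sums = [0.0] * n_arms      # total reward per arm
    for t in range(1, rounds + 1):
        if t <= n_arms:
            arm = t - 1        # pull each arm once to initialise the estimates
        else:
            # empirical mean plus an exploration bonus that shrinks with pulls
            arm = max(
                range(n_arms),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = 1.0 if rng.random() < arm_probs[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
    return counts


# Hypothetical two-item problem: item 1 has the higher expected reward
counts = ucb1([0.2, 0.8], rounds=2000)
print(counts)
```

The exploration bonus is what an approach like the one described above would aim to reduce; in this plain sketch every arm is explored independently.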

[35]  arXiv:2109.10459 [pdf, other]
Title: Optimal excitation and measurement pattern for cascade networks
Comments: Submitted for review
Subjects: Systems and Control (eess.SY); Optimization and Control (math.OC); Chaotic Dynamics (nlin.CD)

This work deals with accuracy analysis of dynamical systems interconnected in a cascade structure. For a cascade network there are a number of experimental settings for which the dynamic systems within the network can be identified. We study the problem of choosing which excitation and measurement pattern delivers the most accurate parameter estimates for the whole network. The optimal experiment is based on the accuracy assessed through the asymptotic covariance matrix of the prediction error method, while the cost criterion is the number of excitations and measurements. We develop theoretical results under the assumptions that all dynamic systems are equal and with equal signal-to-noise ratio throughout the network. We show that there are experimental settings which result in equal overall precision and that there is an excitation and measurement pattern that yields more accurate results than others. From these results a guideline based on the topology of the network emerges for the choice of the experimental setting. We provide numerical results which attest that the principles behind this guideline are also valid for more general situations.

[36]  arXiv:2109.10460 [pdf, other]
Title: Graph-based Cluttered Scene Generation and Interactive Exploration using Deep Reinforcement Learning
Subjects: Robotics (cs.RO)

We introduce a novel method to teach a robotic agent to interactively explore cluttered yet structured scenes, such as kitchen pantries and grocery shelves, by leveraging the physical plausibility of the scene. We propose a novel learning framework to train an effective scene exploration policy to discover hidden objects with minimal interactions. First, we define a novel scene grammar to represent structured clutter. Then we train a Graph Neural Network (GNN) based Scene Generation agent using deep reinforcement learning (deep RL), to manipulate this Scene Grammar to create a diverse set of stable scenes, each containing multiple hidden objects. Given such cluttered scenes, we then train a Scene Exploration agent, using deep RL, to uncover hidden objects by interactively rearranging the scene. We show that our learned agents hide and discover significantly more objects than the baselines. We present quantitative results that prove the generalization capabilities of our agents. We also demonstrate sim-to-real transfer by successfully deploying the learned policy on a real UR10 robot to explore real-world cluttered scenes. The supplemental video can be found at https://www.youtube.com/watch?v=T2Jo7wwaXss.

[37]  arXiv:2109.10462 [pdf, other]
Title: A Hierarchical Network-Oriented Analysis of User Participation in Misinformation Spread on WhatsApp
Comments: Paper Accepted in Information Processing & Management, Elsevier
Subjects: Social and Information Networks (cs.SI); Artificial Intelligence (cs.AI); Computers and Society (cs.CY); Machine Learning (cs.LG); Computation (stat.CO)

WhatsApp has emerged as a major communication platform in many countries in recent years. Despite offering only one-to-one and small group conversations, WhatsApp has been shown to enable the formation of a rich underlying network, crossing the boundaries of existing groups, and with structural properties that favor information dissemination at large. Indeed, WhatsApp has reportedly been used as a forum of misinformation campaigns with significant social, political and economic consequences in several countries. In this article, we aim at complementing recent studies on misinformation spread on WhatsApp, mostly focused on content properties and propagation dynamics, by looking into the network that connects users sharing the same piece of content. Specifically, we present a hierarchical network-oriented characterization of the users engaged in misinformation spread by focusing on three perspectives: individuals, WhatsApp groups and user communities, i.e., groupings of users who, intentionally or not, share the same content disproportionately often. By analyzing sharing and network topological properties, our study offers valuable insights into how WhatsApp users leverage the underlying network connecting different groups to gain large reach in the spread of misinformation on the platform.

[38]  arXiv:2109.10465 [pdf, other]
Title: Scalable and Efficient MoE Training for Multitask Multilingual Models
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)

The Mixture of Experts (MoE) models are an emerging class of sparsely activated deep learning models that have sublinear compute costs with respect to their parameters. In contrast with dense models, the sparse architecture of MoE offers opportunities for drastically growing model size with significant accuracy gain while consuming much lower compute budget. However, supporting large scale MoE training also has its own set of system and modeling challenges. To overcome the challenges and embrace the opportunities of MoE, we first develop a system capable of scaling MoE models efficiently to trillions of parameters. It combines multi-dimensional parallelism and heterogeneous memory technologies harmoniously with MoE to empower 8x larger models on the same hardware compared with existing work. Besides boosting system efficiency, we also present new training methods to improve MoE sample efficiency and leverage expert pruning strategy to improve inference time efficiency. By combining the efficient system and training methods, we are able to significantly scale up large multitask multilingual models for language generation which results in a great improvement in model accuracy. A model trained with 10 billion parameters on 50 languages can achieve state-of-the-art performance in Machine Translation (MT) and multilingual natural language generation tasks. The system support of efficient MoE training has been implemented and open-sourced with the DeepSpeed library.

[39]  arXiv:2109.10466 [pdf, ps, other]
Title: Efficient Partial Rewind of Polar Codes' Successive Cancellation-based Decoders for Re-decoding Attempts
Subjects: Information Theory (cs.IT)

The successive cancellation (SC) process is an essential component of various decoding algorithms used for polar codes and their variants. Rewinding this process seems trivial if we have access to all intermediate log-likelihood ratios (LLRs) and partial sums. However, as the block length increases, retaining all of the intermediate information becomes inefficient and impractical. Rewinding the SC process in a memory-efficient way is a problem that we address in this paper. We first explore the properties of the SC process based on the binary representation of the bit indices by introducing a new operator used for grouping the bit indices. This special grouping helps us in finding the closest bit index to the target index for rewinding. We also analytically prove that this approach gives access to the untouched intermediate information stored in the memory, which is essential in resuming the SC process. Then, we adapt the proposed approach to multiple rewinds and apply it to SC-flip decoding and shifted-pruning based list decoding. The numerical evaluation of the proposed solution shows a significant reduction of >=50% in the complexity of the additional decoding attempts at medium and high SNR regimes for SC-flip decoding and less for shifted-pruning based list decoding.

[40]  arXiv:2109.10469 [pdf, other]
Title: Differentiable Scaffolding Tree for Molecular Optimization
Subjects: Machine Learning (cs.LG)

The structural design of functional molecules, also called molecular optimization, is an essential chemical science and engineering task with important applications, such as drug discovery. Deep generative models and combinatorial optimization methods achieve initial success but still struggle with directly modeling discrete chemical structures and often heavily rely on brute-force enumeration. The challenge comes from the discrete and non-differentiable nature of molecule structures. To address this, we propose the differentiable scaffolding tree (DST), which utilizes a learned knowledge network to convert discrete chemical structures to locally differentiable ones. DST enables gradient-based optimization on a chemical graph structure by back-propagating the derivatives from the target properties through a graph neural network (GNN). Our empirical studies show that the gradient-based molecular optimizations are both effective and sample efficient. Furthermore, the learned graph parameters can also provide an explanation that helps domain experts understand the model output.

[41]  arXiv:2109.10471 [pdf, other]
Title: The First Vision For Vitals (V4V) Challenge for Non-Contact Video-Based Physiological Estimation
Comments: ICCVw'21. V4V Dataset and Challenge: this https URL
Subjects: Computers and Society (cs.CY); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)

Telehealth has the potential to offset the high demand for help during public health emergencies, such as the COVID-19 pandemic. Remote Photoplethysmography (rPPG) - the problem of non-invasively estimating blood volume variations in the microvascular tissue from video - would be well suited for these situations. Over the past few years a number of research groups have made rapid advances in remote PPG methods for estimating heart rate from digital video and obtained impressive results. How these various methods compare in naturalistic conditions, where spontaneous behavior, facial expressions, and illumination changes are present, is relatively unknown. To enable comparisons among alternative methods, the 1st Vision for Vitals Challenge (V4V) presented a novel dataset containing high-resolution videos time-locked with varied physiological signals from a diverse population. In this paper, we outline the evaluation protocol, the data used, and the results. V4V is to be held in conjunction with the 2021 International Conference on Computer Vision.

[42]  arXiv:2109.10473 [pdf, other]
Title: MVM3Det: A Novel Method for Multi-view Monocular 3D Detection
Comments: 7 pages, 3 figures, submitted to ICRA 2022
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Monocular 3D object detection encounters occlusion problems in many application scenarios, such as traffic monitoring, pedestrian monitoring, etc., which lead to serious false negatives. Multi-view object detection effectively solves this problem by combining data from different perspectives. However, due to label confusion and feature confusion, orientation estimation, which is important for object tracking and intention prediction, is intractable in multi-view 3D object detection. In this paper, we propose a novel multi-view 3D object detection method named MVM3Det which simultaneously estimates the 3D position and orientation of the object according to the multi-view monocular information. The method consists of two parts: 1) Position proposal network, which integrates the features from different perspectives into consistent global features through feature orthogonal transformation to estimate the position. 2) Multi-branch orientation estimation network, which introduces feature perspective pooling to overcome the two confusion problems during the orientation estimation. In addition, we present a first dataset for multi-view 3D object detection named MVM3D. Compared with state-of-the-art (SOTA) methods on our dataset and the public dataset WildTrack, our method achieves very competitive results.

[43]  arXiv:2109.10475 [pdf, other]
Title: Salience-Aware Event Chain Modeling for Narrative Understanding
Comments: EMNLP 2021
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)

Storytelling, whether via fables, news reports, documentaries, or memoirs, can be thought of as the communication of interesting and related events that, taken together, form a concrete process. It is desirable to extract the event chains that represent such processes. However, this extraction remains a challenging problem. We posit that this is due to the nature of the texts from which chains are discovered. Natural language text interleaves a narrative of concrete, salient events with background information, contextualization, opinion, and other elements that are important for a variety of necessary discourse and pragmatics acts but are not part of the principal chain of events being communicated. We introduce methods for extracting this principal chain from natural language text, by filtering away non-salient events and supportive sentences. We demonstrate the effectiveness of our methods at isolating critical event chains by comparing their effect on downstream tasks. We show that by pre-training large language models on our extracted chains, we obtain improvements in two tasks that benefit from a clear understanding of event chains: narrative prediction and event-based temporal question answering. The demonstrated improvements and ablative studies confirm that our extraction method isolates critical event chains.

[44]  arXiv:2109.10476 [pdf, other]
Title: Self-Supervised Learning to Prove Equivalence Between Programs via Semantics-Preserving Rewrite Rules
Comments: 18 pages
Subjects: Machine Learning (cs.LG); Programming Languages (cs.PL)

We target the problem of synthesizing proofs of semantic equivalence between two programs made of sequences of statements with complex symbolic expressions. We propose a neural network architecture based on the transformer to generate axiomatic proofs of equivalence between program pairs. We generate expressions which include scalars and vectors and support multi-typed rewrite rules to prove equivalence. For training the system, we develop an original training technique, which we call self-supervised sample selection. This incremental training improves the quality, generalizability and extensibility of the learned model. We study the effectiveness of the system to generate proofs of increasing length, and we demonstrate how transformer models learn to represent complex and verifiable symbolic reasoning. Our system, S4Eq, achieves 97% proof success on 10,000 pairs of programs while ensuring zero false positives by design.

[45]  arXiv:2109.10477 [pdf, other]
Title: Generating Compositional Color Representations from Text
Comments: Accepted as a full paper at CIKM 2021
Subjects: Computer Vision and Pattern Recognition (cs.CV); Information Retrieval (cs.IR); Machine Learning (cs.LG)

We consider the cross-modal task of producing color representations for text phrases. Motivated by the fact that a significant fraction of user queries on an image search engine follow an (attribute, object) structure, we propose a generative adversarial network that generates color profiles for such bigrams. We design our pipeline to learn composition - the ability to combine seen attributes and objects into unseen pairs. We propose a novel dataset curation pipeline from existing public sources. We describe how a set of phrases of interest can be compiled using a graph propagation technique, and then mapped to images. While this dataset is specialized for our investigations on color, the method can be extended to other visual dimensions where composition is of interest. We provide detailed ablation studies that test the behavior of our GAN architecture with loss functions from the contrastive learning literature. We show that the generative model achieves lower Fréchet Inception Distance than discriminative ones, and therefore predicts color profiles that better match those from real images. Finally, we demonstrate improved performance in image retrieval and classification, indicating the crucial role that color plays in these downstream tasks.

[46]  arXiv:2109.10478 [pdf, other]
Title: AI in Osteoporosis
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

In this chapter we explore and evaluate methods for trabecular bone characterization and osteoporosis diagnosis with increased interest in sparse approximations. We first describe texture representation and classification techniques, patch-based methods such as Bag of Keypoints, and more recent deep neural networks. Then we introduce the concept of sparse representations for pattern recognition and we detail integrative sparse analysis methods and classifier decision fusion methods. We report cross-validation results on osteoporosis datasets of bone radiographs and compare the results produced by the different categories of methods. We conclude that advances in the AI and machine learning fields have enabled the development of methods that can be used as diagnostic tools in clinical settings.

[47]  arXiv:2109.10480 [pdf]
Title: DialogueBERT: A Self-Supervised Learning based Dialogue Pre-training Encoder
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)

With the rapid development of artificial intelligence, conversational bots have become prevalent in mainstream E-commerce platforms, where they can provide convenient and timely customer service. To satisfy the user, the conversational bots need to understand the user's intention, detect the user's emotion, and extract the key entities from the conversational utterances. However, understanding dialogues is regarded as a very challenging task. Different from common language understanding, utterances in dialogues appear alternately from different roles and are usually organized as hierarchical structures. To facilitate the understanding of dialogues, in this paper, we propose a novel contextual dialogue encoder (i.e., DialogueBERT) based on the popular pre-trained language model BERT. Five self-supervised learning pre-training tasks are devised for learning the particularity of dialogue utterances. Four different input embeddings are integrated to capture the relationship between utterances, including turn embedding, role embedding, token embedding and position embedding. DialogueBERT was pre-trained with 70 million dialogues from real scenarios, and then fine-tuned on three different downstream dialogue understanding tasks. Experimental results show that DialogueBERT achieves exciting results with 88.63% accuracy for intent recognition, 94.25% accuracy for emotion recognition and 97.04% F1 score for named entity recognition, outperforming several strong baselines by a large margin.

[48]  arXiv:2109.10484 [pdf, other]
Title: Numerically Stable Binary Coded Computations
Comments: 25 pages, 4 figures, 1 table
Subjects: Information Theory (cs.IT); Numerical Analysis (math.NA)

This paper addresses the gradient coding and coded matrix multiplication problems in distributed optimization and coded computing. We present a numerically stable binary coding method which overcomes the drawbacks of the gradient coding method proposed by Tandon et al., and can also be leveraged by coded computing networks whose servers are of heterogeneous nature. The proposed binary encoding avoids operations over the real and complex numbers which are inherently numerically unstable, thereby enabling numerically stable distributed encodings of the partial gradients. We then make connections between gradient coding and coded matrix multiplication. Specifically, we show that any gradient coding scheme can be extended to coded matrix multiplication. Furthermore, we show how the proposed binary gradient coding scheme can be used to construct three different coded matrix multiplication schemes, each achieving different trade-offs.

[49]  arXiv:2109.10485 [pdf, other]
Title: The NiuTrans Machine Translation Systems for WMT21
Subjects: Computation and Language (cs.CL)

This paper describes the NiuTrans neural machine translation systems for the WMT 2021 news translation tasks. We made submissions to 9 language directions, including English$\leftrightarrow$$\{$Chinese, Japanese, Russian, Icelandic$\}$ and English$\rightarrow$Hausa tasks. Our primary systems are built on several effective variants of Transformer, e.g., Transformer-DLCL and ODE-Transformer. We also utilize back-translation, knowledge distillation, post-ensemble, and iterative fine-tuning techniques to further enhance model performance.

[50]  arXiv:2109.10488 [pdf, other]
Title: A Model-free Deep Reinforcement Learning Approach To Maneuver A Quadrotor Despite Single Rotor Failure
Subjects: Robotics (cs.RO)

The ability to recover from faults and continue the mission is desirable for many quadrotor applications. A quadrotor's rotor may fail during a mission, and it is essential to develop recovery strategies so that the vehicle is not damaged. In this paper, we develop a model-free deep reinforcement learning approach for a quadrotor to recover from a single rotor failure. The approach is based on Soft Actor-Critic and enables the vehicle to hover, land, and perform complex maneuvers. Simulation results obtained with a custom simulator are presented to validate the proposed approach. The results show that the proposed approach achieves hover, landing, and path following in 2D and 3D. We also show that the proposed approach is robust to wind disturbances.

[51]  arXiv:2109.10489 [pdf, ps, other]
Title: Enabling Large-Scale Federated Learning over Wireless Edge Networks
Comments: Accepted to appear in Proc. IEEE Global Communications Conference (GLOBECOM), Madrid, Spain, Dec. 2021
Subjects: Information Theory (cs.IT); Networking and Internet Architecture (cs.NI)

The major bottlenecks of large-scale Federated Learning (FL) networks are the high costs of communication and computation. This is because most current FL frameworks only consider a star network topology in which all locally trained models are aggregated at a single server (e.g., a cloud server). This causes significant overhead at the server when the number of users is huge and the local models are large. This paper proposes a novel edge network architecture which decentralizes the model aggregation process at the server, thereby significantly reducing the aggregation latency of the whole network. In this architecture, we propose a highly effective in-network computation protocol consisting of two components. First, an in-network aggregation process is designed so that the majority of aggregation computations can be offloaded from the cloud server to edge nodes. Second, a joint routing and resource allocation optimization problem is formulated to minimize the aggregation latency of the whole system at every learning round. The problem turns out to be NP-hard, and thus we propose a polynomial-time routing algorithm which can achieve near-optimal performance with a theoretical bound. Numerical results show that our proposed framework can reduce the network latency by up to 4.6 times. Furthermore, in comparison with conventional baselines, this framework can decrease the cloud's traffic and computing overhead by a factor of K/M, where K is the number of users and M is the number of edge nodes.
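The K/M factor mentioned above can be illustrated with a minimal Python sketch. It is not the protocol from the paper: the function names, the fixed user-to-edge assignment, and the use of weighted averaging are all assumptions made for illustration. Each edge node forwards one partial weighted sum, so the cloud merges M messages instead of K while obtaining the same global average.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def weighted_sum(models: Sequence[Vector], weights: Sequence[float]) -> Tuple[Vector, float]:
    """Return (sum_i w_i * model_i, sum_i w_i) for a group of local models."""
    total = [0.0] * len(models[0])
    for model, w in zip(models, weights):
        for j, value in enumerate(model):
            total[j] += w * value
    return total, float(sum(weights))


def flat_average(models: Sequence[Vector], weights: Sequence[float]) -> Vector:
    """Star topology: the cloud receives and averages all K local models."""
    total, wsum = weighted_sum(models, weights)
    return [t / wsum for t in total]


def hierarchical_average(models: Sequence[Vector], weights: Sequence[float],
                         assignment: Sequence[int], num_edges: int) -> Tuple[Vector, int]:
    """Edge nodes pre-aggregate; the cloud only merges one partial sum per edge.

    assignment[i] is the (hypothetical) edge node serving user i.
    Returns the global average and the number of messages the cloud received.
    """
    partials = []
    for edge in range(num_edges):
        members = [i for i, a in enumerate(assignment) if a == edge]
        if members:
            partials.append(weighted_sum([models[i] for i in members],
                                         [weights[i] for i in members]))
    total = [0.0] * len(models[0])
    wsum = 0.0
    for partial, w in partials:
        for j, value in enumerate(partial):
            total[j] += value
        wsum += w
    return [t / wsum for t in total], len(partials)
```

With K = 6 users split over M = 2 edge nodes, the cloud merges 2 partial sums rather than 6 models, a reduction by K/M = 3, and the result matches the flat average up to floating-point rounding.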

[52]  arXiv:2109.10490 [pdf]
Title: Benchmarking Lane-changing Decision-making for Deep Reinforcement Learning
Comments: 10 pages, 5 figures, 3 tables
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)

The development of autonomous driving has attracted extensive attention in recent years, and it is essential to evaluate the performance of autonomous driving. However, testing on the road is expensive and inefficient. Virtual testing is the primary way to validate and verify self-driving cars, and the basis of virtual testing is to build simulation scenarios. In this paper, we propose a training, testing, and evaluation pipeline for the lane-changing task from the perspective of deep reinforcement learning. First, we design lane-changing scenarios for training and testing, where the test scenarios include stochastic and deterministic parts. Then, we deploy a set of benchmarks consisting of learning and non-learning approaches. We train several state-of-the-art deep reinforcement learning methods in the designed training scenarios and report benchmark metrics for the trained models in the test scenarios. The designed lane-changing scenarios and benchmarks are both made publicly available to provide a consistent experimental environment for the lane-changing task.

[53]  arXiv:2109.10492 [pdf, other]
Title: Single Image Dehazing with An Independent Detail-Recovery Network
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Single image dehazing is a prerequisite which affects the performance of many computer vision tasks and has attracted increasing attention in recent years. However, most existing dehazing methods place more emphasis on haze removal than on the detail recovery of the dehazed images. In this paper, we propose a single image dehazing method with an independent Detail Recovery Network (DRN), which captures the details from the input image in a separate network and then integrates them into a coarse dehazed image. The overall network consists of two independent networks, the DRN and the dehazing network. Specifically, the DRN aims to recover the dehazed image details through a local branch and a global branch. The local branch obtains local detail information through convolution layers, and the global branch captures more global information through Smooth Dilated Convolution (SDC). The detail feature map is fused into the coarse dehazed image to obtain a dehazed image with rich image details. In addition, we integrate the DRN, the physical-model-based dehazing network and the reconstruction loss into an end-to-end joint learning framework. Extensive experiments on the public image dehazing datasets (RESIDE-Indoor, RESIDE-Outdoor and TrainA-TestA) illustrate the effectiveness of the modules in the proposed method and show that our method outperforms state-of-the-art dehazing methods both quantitatively and qualitatively. The code is released at https://github.com/YanLi-LY/Dehazing-DRN.

[54]  arXiv:2109.10493 [pdf, other]
Title: Learning Robust Agents for Visual Navigation in Dynamic Environments: The Winning Entry of iGibson Challenge 2021
Comments: Submitted to ICRA2022
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI)

This paper presents an approach for improving navigation in dynamic and interactive environments, which won first place in the iGibson Interactive Navigation Challenge 2021. While the last few years have produced impressive progress on PointGoal Navigation in static environments, relatively little effort has been devoted to more realistic dynamic environments. The iGibson Challenge proposed two new navigation tasks, Interactive Navigation and Social Navigation, which add displaceable obstacles and moving pedestrians to the simulator environment. Our approach to studying these problems uses two key ideas. First, we employ large-scale reinforcement learning by leveraging the Habitat simulator, which supports high-performance parallel computing for both simulation and synchronized learning. Second, we employ a new data augmentation technique that adds more dynamic objects to the environment, which can also be combined with traditional image-based augmentation techniques to boost performance further. Lastly, we achieve sim-to-sim transfer from Habitat to the iGibson simulator, and demonstrate that our proposed methods allow us to train robust agents in dynamic environments with interactive objects or moving humans. Video link: https://www.youtube.com/watch?v=HxUX2HeOSE4

[55]  arXiv:2109.10497 [pdf, other]
Title: A Simple Approach to Jointly Rank Passages and Select Relevant Sentences in the OBQA Context
Comments: 5 pages
Subjects: Computation and Language (cs.CL)

In the open question answering (OBQA) task, how to select the relevant information from a large corpus is a crucial problem for reasoning and inference. Some datasets (e.g., HotpotQA) mainly focus on testing the model's reasoning ability at the sentence level. To overcome this challenge, many existing frameworks use a deep learning model to select relevant passages and then answer each question by matching a sentence in the corresponding passage. However, such frameworks require long inference time and fail to take advantage of the relationship between passages and sentences. In this work, we present a simple yet effective framework to address these problems by jointly ranking passages and selecting sentences. We propose consistency and similarity constraints to promote the correlation and interaction between passage ranking and sentence selection. In our experiments, we demonstrate that our framework can achieve competitive results and outperform the baseline by 28% in terms of exact matching of relevant sentences on the HotpotQA dataset.

[56]  arXiv:2109.10498 [pdf, other]
Title: Less is More: Learning from Synthetic Data with Fine-grained Attributes for Person Re-Identification
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Person re-identification (re-ID) plays an important role in applications such as public security and video surveillance. Recently, learning from synthetic data, which benefits from the popularity of synthetic data engines, has attracted attention from both academia and the public. However, existing synthetic datasets are limited in quantity, diversity and realism, and cannot be efficiently used for the generalizable re-ID problem. To address this challenge, we construct and label a large-scale synthetic person dataset named FineGPR with fine-grained attribute distribution. Moreover, aiming to fully exploit the potential of FineGPR and promote efficient training from millions of synthetic data, we propose an attribute analysis pipeline, AOST, that learns the attribute distribution in the target domain and then applies a style transfer network to eliminate the gap between synthetic and real-world data, so that it can be freely deployed to new scenarios. Experiments conducted on benchmarks demonstrate that FineGPR with AOST outperforms (or is on par with) existing real and synthetic datasets, which suggests its feasibility for re-ID and proves the proverbial less-is-more principle. We hope this fine-grained dataset could advance research towards re-ID in real scenarios.

[57]  arXiv:2109.10500 [pdf, other]
Title: HyperExpan: Taxonomy Expansion with Hyperbolic Representation Learning
Comments: To appear in Findings of ACL: EMNLP 2021
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)

Taxonomies are valuable resources for many applications, but the limited coverage due to the expensive manual curation process hinders their general applicability. Prior works attempt to automatically expand existing taxonomies to improve their coverage by learning concept embeddings in Euclidean space, while taxonomies, inherently hierarchical, more naturally align with the geometric properties of a hyperbolic space. In this paper, we present HyperExpan, a taxonomy expansion algorithm that seeks to preserve the structure of a taxonomy in a more expressive hyperbolic embedding space and learn to represent concepts and their relations with a Hyperbolic Graph Neural Network (HGNN). Specifically, HyperExpan leverages position embeddings to exploit the structure of the existing taxonomies, and characterizes the concept profile information to support the inference on unseen concepts during training. Experiments show that our proposed HyperExpan outperforms baseline models with representation learning in a Euclidean feature space and achieves state-of-the-art performance on the taxonomy expansion benchmarks.

[58]  arXiv:2109.10501 [pdf, other]
Title: Third-party Evaluation of Robotic Hand Designs Using a Mechanical Glove
Comments: 5 pages, 7 figures
Journal-ref: Journal of the Robotics Society of Japan, Vol.39, No.6, pp.557-560, 2021
Subjects: Robotics (cs.RO)

A robotic hand design suitable for dexterity should be examined using functional tests. To achieve this, we designed a mechanical glove, which is a rigid wearable glove that enables us to develop the corresponding isomorphic robotic hand and evaluate its hardware properties. Subsequently, the effectiveness of multiple degrees-of-freedom (DOFs) was evaluated by human participants. Several fine motor skills were evaluated using the mechanical glove under two conditions: one- and three-DOF conditions. To the best of our knowledge, this is the first extensive evaluation method for robotic hand designs suitable for dexterity. (This paper was peer-reviewed and is the full translation from the Journal of the Robotics Society of Japan, Vol.39, No.6, pp.557-560, 2021.)

[59]  arXiv:2109.10502 [pdf, other]
Title: A Spectral Approach to Off-Policy Evaluation for POMDPs
Authors: Yash Nair, Nan Jiang
Subjects: Machine Learning (cs.LG)

We consider off-policy evaluation (OPE) in Partially Observable Markov Decision Processes, where the evaluation policy depends only on observable variables but the behavior policy depends on latent states (Tennenholtz et al. (2020a)). Prior work on this problem uses a causal identification strategy based on one-step observable proxies of the hidden state, which relies on the invertibility of certain one-step moment matrices. In this work, we relax this requirement by using spectral methods and extending one-step proxies both into the past and future. We empirically compare our OPE methods to existing ones and demonstrate their improved prediction accuracy and greater generality. Lastly, we derive a separate Importance Sampling (IS) algorithm which relies on rank, distinctness, and positivity conditions, and not on the strict sufficiency conditions of observable trajectories with respect to the reward and hidden-state structure required by Tennenholtz et al. (2020a).

[60]  arXiv:2109.10504 [pdf, other]
Title: KD-VLP: Improving End-to-End Vision-and-Language Pretraining with Object Knowledge Distillation
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Self-supervised vision-and-language pretraining (VLP) aims to learn transferable multi-modal representations from large-scale image-text data and to achieve strong performance on a broad scope of vision-language tasks after finetuning. Previous mainstream VLP approaches typically adopt a two-step strategy relying on external object detectors to encode images in a multi-modal Transformer framework, which suffers from a restrictive object concept space, limited image context and inefficient computation. In this paper, we propose an object-aware end-to-end VLP framework, which directly feeds image grid features from CNNs into the Transformer and learns the multi-modal representations jointly. More importantly, we propose to perform object knowledge distillation to facilitate learning cross-modal alignment at different semantic levels. To achieve that, we design two novel pretext tasks that take object features and their semantic labels from external detectors as supervision: 1) an object-guided masked vision modeling task, which focuses on enforcing object-aware representation learning in the multi-modal Transformer; 2) a phrase-region alignment task, which aims to improve cross-modal alignment by utilizing the similarities between noun phrases and object labels in the linguistic space. Extensive experiments on a wide range of vision-language tasks demonstrate the efficacy of our proposed framework, and we achieve competitive or superior performance compared with existing pretraining strategies. The code is available in the supplementary materials.

[61]  arXiv:2109.10506 [pdf, other]
Title: Tecnologica cosa: Modeling Storyteller Personalities in Boccaccio's Decameron
Comments: The 5th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature (co-located with EMNLP 2021)
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG)

We explore Boccaccio's Decameron to see how digital humanities tools can be used for tasks that have limited data in a language no longer in contemporary use: medieval Italian. We focus our analysis on the question: Do the different storytellers in the text exhibit distinct personalities? To answer this question, we curate and release a dataset based on the authoritative edition of the text. We use supervised classification methods to predict storytellers based on the stories they tell, confirming the difficulty of the task, and demonstrate that topic modeling can extract thematic storyteller "profiles."

[62]  arXiv:2109.10509 [pdf, other]
Title: Unsupervised Contextualized Document Representation
Comments: 9 Pages, 4 Figures, 7 tables, SustaiNLP2021 @ EMNLP-2021
Subjects: Computation and Language (cs.CL); Information Retrieval (cs.IR); Machine Learning (cs.LG)

Several NLP tasks need an effective representation of text documents. Arora et al., 2017 demonstrate that simple weighted averaging of word vectors frequently outperforms neural models. SCDV (Mekala et al., 2017) further extends this from sentences to documents by employing soft and sparse clustering over pre-computed word vectors. However, both techniques ignore the polysemy and contextual character of words. In this paper, we address this issue by proposing SCDV+BERT(ctxd), a simple and effective unsupervised representation that combines contextualized BERT (Devlin et al., 2019) based word embeddings for word sense disambiguation with the SCDV soft clustering approach. We show that our embeddings outperform the original SCDV, pre-trained BERT, and several other baselines on many classification datasets. We also demonstrate our embeddings' effectiveness on other tasks, such as concept matching and sentence similarity. In addition, we show that SCDV+BERT(ctxd) outperforms fine-tuned BERT and different embedding approaches in scenarios with limited data and only few-shot examples.

[63]  arXiv:2109.10510 [pdf, other]
Title: FCM: A Fine-grained Comparison Model for Multi-turn Dialogue Reasoning
Comments: EMNLP2021 Findings
Subjects: Computation and Language (cs.CL)

Despite the success of neural dialogue systems in achieving high performance on leaderboards, they cannot meet users' requirements in practice because of their poor reasoning skills. The underlying reason is that most neural dialogue models only capture syntactic and semantic information but fail to model the logical consistency between the dialogue history and the generated response. Recently, a new multi-turn dialogue reasoning task has been proposed to facilitate dialogue reasoning research. However, this task is challenging because there are only slight differences between the illogical response and the dialogue history. How to effectively solve this challenge is still worth exploring. This paper proposes a Fine-grained Comparison Model (FCM) to tackle this problem. Inspired by human behavior in reading comprehension, a comparison mechanism is proposed to focus on the fine-grained differences in the representation of each response candidate. Specifically, each candidate representation is compared with the whole history to obtain a history consistency representation. Furthermore, the consistency signals between each candidate and the speaker's own history are considered to drive the model to prefer a candidate that is logically consistent with the speaker's history. Finally, the above consistency representations are employed to output a ranking list of the candidate responses for multi-turn dialogue reasoning. Experimental results on two public dialogue datasets show that our method obtains higher ranking scores than the baseline models.

[64]  arXiv:2109.10512 [pdf, other]
Title: Backdoor Attacks on Federated Learning with Lottery Ticket Hypothesis
Subjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR)

Edge devices in federated learning usually have much more limited computation and communication resources than servers in a data center. Recently, advanced model compression methods, like the Lottery Ticket Hypothesis, have been applied to federated learning to reduce the model size and communication cost. However, backdoor attacks can compromise such implementations in the federated learning scenario. A malicious edge device trains the client model with poisoned private data and uploads its parameters to the center, embedding a backdoor in the globally shared model after unwitting aggregative optimization. During the inference phase, the model with backdoors classifies samples carrying a certain trigger as one target category, while showing only a slight decrease in inference accuracy on clean samples. In this work, we empirically demonstrate that Lottery Ticket models are as vulnerable to backdoor attacks as the original dense models, and that backdoor attacks can influence the structure of the extracted tickets. Based on the similarities between tickets, we provide a feasible defense for federated learning against backdoor attacks on various datasets.

[65]  arXiv:2109.10513 [pdf, other]
Title: Anti-degenerated UWB-LiDAR Localization for Automatic Road Roller in Tunnel
Subjects: Robotics (cs.RO)

The automatic road roller, as a popular type of construction robot, has attracted much interest from both industry and the research community in recent years. However, in tunnels, where degeneration issues are prone to occur, providing an accurate positioning result for the robot remains a challenging problem. In this paper, we aim to deal with this problem by fusing LiDAR and UWB measurements based on optimization. In the proposed localization method, the directions of non-degeneration are constrained and the covariance of the UWB reconstruction is introduced to improve the accuracy of localization. In addition, a method that extracts features of the inner wall of tunnels to assist positioning is also presented. To evaluate the effectiveness of the proposed method, three experiments with a real road roller were carried out, and the results show that our method achieves better performance than existing methods and can be applied to automatic road rollers working inside tunnels. Finally, we discuss the feasibility of deploying the system in real applications and make several recommendations.

[66]  arXiv:2109.10514 [pdf]
Title: Towards The Automatic Coding of Medical Transcripts to Improve Patient-Centered Communication
Comments: Society for Design and Process Science (SDPS) 2016
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG)

This paper aims to provide an approach for automatic coding of physician-patient communication transcripts to improve patient-centered communication (PCC). PCC is a central part of high-quality health care. To improve PCC, dialogues between physicians and patients have been recorded and tagged with predefined codes. Trained human coders have manually coded the transcripts. Since manual coding entails huge labor costs and is prone to human error, automatic coding methods should be considered for efficiency and effectiveness. We adopted three machine learning algorithms (Naïve Bayes, Random Forest, and Support Vector Machine) to categorize lines in transcripts into corresponding codes. The results showed that there is evidence to distinguish the codes, and this is considered to be sufficient for training of human annotators.

[67]  arXiv:2109.10521 [pdf, other]
Title: Incorporating Data Uncertainty in Object Tracking Algorithms
Comments: For associated video, see this https URL
Subjects: Systems and Control (eess.SY); Computer Vision and Pattern Recognition (cs.CV)

Methodologies for incorporating the uncertainties characteristic of data-driven object detectors into object tracking algorithms are explored. Object tracking methods rely on measurement error models, typically in the form of measurement noise, false positive rates, and missed detection rates. Each of these quantities, in general, can be dependent on object or measurement location. However, for detections generated from neural-network processed camera inputs, these measurement error statistics are not sufficient to represent the primary source of errors, namely a dissimilarity between run-time sensor input and the training data upon which the detector was trained. To this end, we investigate incorporating data uncertainty into object tracking methods so as to improve the ability to track objects, particularly those which are out-of-distribution w.r.t. the training data. The proposed methodologies are validated on an object tracking benchmark as well as in experiments with a real autonomous aircraft.

[68]  arXiv:2109.10523 [pdf, other]
Title: Investigating and Modeling the Dynamics of Long Ties
Comments: 46 pages, 18 figures
Subjects: Social and Information Networks (cs.SI); Computer Science and Game Theory (cs.GT); Machine Learning (cs.LG); Physics and Society (physics.soc-ph)

Long ties, the social ties that bridge different communities, are widely believed to play crucial roles in spreading novel information in social networks. However, some existing network theories and prediction models indicate that long ties might dissolve quickly or eventually become redundant, thus putting into question the long-term value of long ties. Our empirical analysis of real-world dynamic networks shows that contrary to such reasoning, long ties are more likely to persist than other social ties, and that many of them constantly function as social bridges without being embedded in local networks. Using a novel cost-benefit analysis model combined with machine learning, we show that long ties are highly beneficial, which instinctively motivates people to expend extra effort to maintain them. This partly explains why long ties are more persistent than what has been suggested by many existing theories and models. Overall, our study suggests the need for social interventions that can promote the formation of long ties, such as mixing people with diverse backgrounds.

[69]  arXiv:2109.10524 [pdf, other]
Title: A Method For Adding Motion-Blur on Arbitrary Objects By using Auto-Segmentation and Color Compensation Techniques
Comments: This paper was accepted at ICIP 2021
Journal-ref: 2021 IEEE International Conference on Image Processing (ICIP)
Subjects: Computer Vision and Pattern Recognition (cs.CV)

When dynamic objects are captured by a camera, motion blur inevitably occurs. Such blur is sometimes considered mere noise; however, it can also add dynamism to a scene in photographs or videos. Unlike similar effects such as defocus blur, which is now easily controlled even on smartphones, motion blur is still uncontrollable and produces undesired effects in photographs. In this paper, a unified framework to add motion blur on a per-object basis is proposed. In the method, multiple frames are captured without motion blur and accumulated to create motion blur on target objects. To capture images without motion blur, the shutter speed must be short; however, this makes the captured images dark, and thus the sensor gain must be increased to compensate. Since a high sensor gain causes severe noise in the image, we propose a color compensation algorithm based on a non-linear filtering technique as a solution. Another contribution is that our technique can be used to make HDR images of fast-moving objects by using multi-exposure images. In the experiments, the effectiveness of the method is confirmed by an ablation study using several datasets.

[70]  arXiv:2109.10526 [pdf, other]
Title: Extremism & Whataboutism: A Case Study on Bangalore Riots
Comments: Accepted at CSCW'21 Workshop on Addressing Challenges and Opportunities in Online Extremism Research
Subjects: Social and Information Networks (cs.SI)

A common diversionary tactic used to deflect attention from contested issues is whataboutery which, when used by majoritarian groups to justify their behaviour against marginalised communities, can quickly devolve into extremism. We explore the manifestations of extreme speech in the Indian context through a case study of violent protests and policing in the city of Bangalore, provoked by a derogatory Facebook post. Analyses of the dominant narratives on Twitter surrounding the incident reveal that most of them employ whataboutism to deflect attention from the triggering post and serve as breeding grounds for religion-based extreme speech. We conclude by discussing how our study proposes an alternative lens for viewing extremism in the Global South.

[71]  arXiv:2109.10528 [pdf, other]
Title: A unified interpretation of the Gaussian mechanism for differential privacy through the sensitivity index
Comments: Under review at PETS 2022
Subjects: Cryptography and Security (cs.CR); Information Theory (cs.IT); Machine Learning (cs.LG)

The Gaussian mechanism (GM) represents a universally employed tool for achieving differential privacy (DP), and a large body of work has been devoted to its analysis. We argue that the three prevailing interpretations of the GM, namely $(\varepsilon, \delta)$-DP, f-DP and Rényi DP, can be expressed by using a single parameter $\psi$, which we term the sensitivity index. $\psi$ uniquely characterises the GM and its properties by encapsulating its two fundamental quantities: the sensitivity of the query and the magnitude of the noise perturbation. With strong links to the ROC curve and the hypothesis-testing interpretation of DP, $\psi$ offers the practitioner a powerful method for interpreting, comparing and communicating the privacy guarantees of Gaussian mechanisms.
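The two quantities named above, the query sensitivity and the noise magnitude, can be made concrete with a small Python sketch of the Gaussian mechanism. The abstract does not state how $\psi$ is computed, so the ratio of sensitivity to noise standard deviation used below is an assumption chosen purely for illustration, and the function names are hypothetical.

```python
import random


def gaussian_mechanism(true_value: float, sigma: float, rng: random.Random) -> float:
    """Release a query result perturbed with zero-mean Gaussian noise of std sigma."""
    return true_value + rng.gauss(0.0, sigma)


def sensitivity_to_noise_ratio(sensitivity: float, sigma: float) -> float:
    """Single-number summary combining sensitivity and noise scale.

    ASSUMPTION: the listing does not define the sensitivity index; this plain
    ratio is only one plausible way to combine the two quantities it names.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return sensitivity / sigma
```

Under this assumed form, scaling the sensitivity and the noise by the same factor leaves the summary unchanged, which is the sense in which one parameter could stand in for both quantities.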

[72]  arXiv:2109.10529 [pdf, other]
Title: Numerical Continued Fraction Interpolation
Subjects: Numerical Analysis (math.NA)

We show that highly accurate approximations can often be obtained by constructing Thiele interpolating continued fractions with a greedy selection of the interpolation points together with an early termination condition. The obtained results are comparable with the outcome from state-of-the-art rational interpolation techniques based on the barycentric form.
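A minimal Python sketch of the idea follows. The Thiele continued fraction and its inverse differences are standard, but the greedy rule (take the sample with the largest current residual) and the stopping test (largest residual below a tolerance) are assumptions, since the abstract does not spell out either criterion; the function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def _div(num: float, den: float) -> float:
    """Division that maps x/0 to a signed infinity instead of raising."""
    if den == 0.0:
        return math.copysign(math.inf, num) if num != 0.0 else 0.0
    return num / den


def thiele_eval(nodes: Sequence[float], coeffs: Sequence[float], x: float) -> float:
    """Evaluate a0 + (x - x0)/(a1 + (x - x1)/(a2 + ...)) from the bottom up."""
    if not coeffs:
        return 0.0  # empty interpolant, used only to pick the first node
    val = coeffs[-1]
    for xk, ak in zip(reversed(nodes[:-1]), reversed(coeffs[:-1])):
        val = ak + _div(x - xk, val)
    return val


def greedy_thiele(xs: Sequence[float], fs: Sequence[float],
                  tol: float = 1e-12) -> Tuple[List[float], List[float]]:
    """Build a Thiele interpolant, adding the worst-fitted sample at each step."""
    remaining = list(range(len(xs)))
    phi = [float(f) for f in fs]  # current inverse difference at every sample
    nodes: List[float] = []
    coeffs: List[float] = []
    while remaining:
        errs = [abs(fs[i] - thiele_eval(nodes, coeffs, xs[i])) for i in remaining]
        worst = max(range(len(remaining)), key=errs.__getitem__)
        if coeffs and errs[worst] <= tol:
            break  # early termination: every sample is already fitted
        k = remaining.pop(worst)
        nodes.append(xs[k])
        coeffs.append(phi[k])  # a_k is the inverse difference at the new node
        for i in remaining:
            phi[i] = _div(xs[i] - xs[k], phi[i] - phi[k])
    return nodes, coeffs
```

For samples of a low-degree rational function such as (1 + 2x)/(1 + x), the loop stops after only a few nodes instead of using every sample, which is the effect an early termination condition is meant to achieve.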

[73]  arXiv:2109.10534 [pdf, other]
Title: Role of Language Relatedness in Multilingual Fine-tuning of Language Models: A Case Study in Indo-Aryan Languages
Comments: Accepted in EMNLP 2021
Subjects: Computation and Language (cs.CL)

We explore the impact of leveraging the relatedness of languages that belong to the same family in NLP models using multilingual fine-tuning. We hypothesize and validate that multilingual fine-tuning of pre-trained language models can yield better performance on downstream NLP applications, compared to models fine-tuned on individual languages. A first-of-its-kind detailed study is presented to track performance change as languages are added to a base language in a graded and greedy (in the sense of best boost of performance) manner, which reveals that careful selection of a subset of related languages can improve performance significantly more than utilizing all related languages. The Indo-Aryan (IA) language family is chosen for the study, the exact languages being Bengali, Gujarati, Hindi, Marathi, Oriya, Punjabi and Urdu. The script barrier is crossed by simple rule-based transliteration of the text of all languages to Devanagari. Experiments are performed on mBERT, IndicBERT, MuRIL and two RoBERTa-based LMs, the last two being pre-trained by us. Low resource languages, such as Oriya and Punjabi, are found to be the largest beneficiaries of multilingual fine-tuning. Textual Entailment, Entity Classification, Section Title Prediction, tasks of IndicGLUE and POS tagging form our test bed. Compared to monolingual fine-tuning, we get relative performance improvement of up to 150% in the downstream tasks. The surprise take-away is that for any language there is a particular combination of other languages which yields the best performance, and any additional language is in fact detrimental.

[74]  arXiv:2109.10535 [pdf, other]
Title: Cramér-Rao bound-informed training of neural networks for quantitative MRI
Comments: Xiaoxia Zhang, Quentin Duchemin, and Kangning Liu contributed equally to this work
Subjects: Machine Learning (cs.LG); Image and Video Processing (eess.IV); Medical Physics (physics.med-ph)

Neural networks are increasingly used to estimate parameters in quantitative MRI, in particular in magnetic resonance fingerprinting. Their advantages over the gold standard non-linear least squares fitting are their superior speed and their immunity to the non-convexity of many fitting problems. We find, however, that in heterogeneous parameter spaces, i.e. in spaces in which the variance of the estimated parameters varies considerably, good performance is hard to achieve and requires arduous tweaking of the loss function, hyperparameters, and the distribution of the training data in parameter space. Here, we address these issues with a theoretically well-founded loss function: the Cramér-Rao bound (CRB) provides a theoretical lower bound for the variance of an unbiased estimator and we propose to normalize the squared error with the respective CRB. With this normalization, we balance the contributions of hard-to-estimate and not-so-hard-to-estimate parameters and areas in parameter space, and avoid a dominance of the former in the overall training loss. Further, the CRB-based loss function equals one for a maximally-efficient unbiased estimator, which we consider the ideal estimator. Hence, the proposed CRB-based loss function provides an absolute evaluation metric. We compare a network trained with the CRB-based loss with a network trained with the commonly used mean squared error loss and demonstrate the advantages of the former in numerical, phantom, and in vivo experiments.
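
The normalization can be sketched in a few lines. This is an illustrative sketch only: the function names are hypothetical, the averaging over parameters is an assumption, and the CRB values for the MRI signal model are not given in the abstract. As a stand-in, the helper uses the textbook CRB for estimating the mean of a Gaussian with known standard deviation from n samples, which is sigma^2 / n.

```python
def gaussian_mean_crb(sigma: float, n_samples: int) -> float:
    """Textbook CRB for the mean of a Gaussian with known sigma (stand-in model)."""
    return sigma ** 2 / n_samples


def mean_squared_error(estimates, targets):
    """Plain squared error averaged over parameters."""
    terms = [(e - t) ** 2 for e, t in zip(estimates, targets)]
    return sum(terms) / len(terms)


def crb_normalized_loss(estimates, targets, crbs):
    """Squared error of each parameter divided by its CRB, then averaged."""
    terms = [(e - t) ** 2 / c for e, t, c in zip(estimates, targets, crbs)]
    return sum(terms) / len(terms)


if __name__ == "__main__":
    # Two parameters whose achievable variance differs by orders of magnitude.
    estimates, targets = [1.5, 30.0], [1.0, 10.0]
    crbs = [gaussian_mean_crb(2.0, 16), 400.0]
    print(mean_squared_error(estimates, targets))         # dominated by the second term
    print(crb_normalized_loss(estimates, targets, crbs))  # both terms contribute equally
```

In this toy case each squared error equals its bound, so the normalized loss is one, whereas the plain mean squared error is driven almost entirely by the parameter with the larger variance.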

[75]  arXiv:2109.10538 [pdf, other]
Title: Index $t$-SNE: Tracking Dynamics of High-Dimensional Datasets with Coherent Embeddings
Comments: International Conference on Big Data Visual Analytics (ICBDVA), Venice, Italy, August 12-13 2021 this https URL Best paper award
Journal-ref: International Journal of Computer and Systems Engineering (2021), 15(8), 500 - 512
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)

$t$-SNE is an embedding method that the data science community has widely used. Two interesting characteristics of $t$-SNE are the structure preservation property and the answer to the crowding problem, where all neighbors in high dimensional space cannot be represented correctly in low dimensional space. $t$-SNE preserves the local neighborhood, and similar items are nicely spaced by adjusting to the local density. These two characteristics produce a meaningful representation, where the cluster area is proportional to its size in number, and relationships between clusters are materialized by closeness on the embedding.
This algorithm is non-parametric, therefore two initializations of the algorithm would lead to two different embeddings. In a forensic approach, analysts would like to compare two or more datasets using their embeddings. An approach would be to learn a parametric model over an embedding built with a subset of data. While this approach is highly scalable, points could be mapped to the exact same position, making them indistinguishable. This type of model would be unable to adapt to new outliers or concept drift.
This paper presents a methodology to reuse an embedding to create a new one, where cluster positions are preserved. The optimization process minimizes two costs, one relative to the embedding shape and the second relative to the support embedding match. The proposed algorithm has the same complexity as the original $t$-SNE to embed new items, and a lower one when considering the embedding of a dataset sliced into sub-pieces. The method showed promising results on a real-world dataset, allowing one to observe the birth, evolution and death of clusters. The proposed approach facilitates identifying significant trends and changes, which empowers the monitoring of high-dimensional datasets' dynamics.

[76]  arXiv:2109.10540 [pdf, other]
Title: Awakening Latent Grounding from Pretrained Language Models for Semantic Parsing
Comments: Accepted by ACL 2021 Findings. The first three authors contributed equally
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)

In recent years, pretrained language models (PLMs) have achieved success on several downstream tasks, showing their power in modeling language. To better understand and leverage what PLMs have learned, several techniques have emerged to explore syntactic structures entailed by PLMs. However, few efforts have been made to explore grounding capabilities of PLMs, which are also essential. In this paper, we highlight the ability of PLMs to discover which token should be grounded to which concept, if combined with our proposed erasing-then-awakening approach. Empirical studies on four datasets demonstrate that our approach can awaken latent grounding which is understandable to human experts, even if it is not exposed to such labels during training. More importantly, our approach shows great potential to benefit downstream semantic parsing models. Taking text-to-SQL as a case study, we successfully couple our approach with two off-the-shelf parsers, obtaining an absolute improvement of up to 9.8%.

[77]  arXiv:2109.10547 [pdf]
Title: K-AID: Enhancing Pre-trained Language Models with Domain Knowledge for Question Answering
Comments: CIKM 2021
Subjects: Artificial Intelligence (cs.AI)

Knowledge enhanced pre-trained language models (K-PLMs) are shown to be effective for many public tasks in the literature but few of them have been successfully applied in practice. To address this problem, we propose K-AID, a systematic approach that includes a low-cost knowledge acquisition process for acquiring domain knowledge, an effective knowledge infusion module for improving model performance, and a knowledge distillation component for reducing the model size and deploying K-PLMs on resource-restricted devices (e.g., CPU) for real-world application. Importantly, instead of capturing entity knowledge like the majority of existing K-PLMs, our approach captures relational knowledge, which helps improve the sentence-level text classification and text matching tasks that play a key role in question answering (QA). We conducted a set of experiments on five text classification tasks and three text matching tasks from three domains, namely E-commerce, Government, and Film&TV, and performed online A/B tests in E-commerce. Experimental results show that our approach is able to achieve substantial improvement on sentence-level question answering tasks and bring beneficial business value in industrial settings.

[78]  arXiv:2109.10549 [pdf, other]
Title: On the $2$-domination number of cylinders with small cycles
Comments: 15 pages, 1 figure
Subjects: Discrete Mathematics (cs.DM); Combinatorics (math.CO)

Domination-type parameters are difficult to manage in Cartesian product graphs and there is usually no general relationship between the parameter in both factors and in the product graph. This is the situation of the domination number, the Roman domination number or the $2$-domination number, among others. Contrary to what happens with the domination number and the Roman domination number, the $2$-domination number remains unknown in cylinders, that is, the Cartesian product of a cycle and a path. In this paper, we compute this parameter in the cylinders with small cycles. We develop two algorithms involving the $(\min,+)$ matrix product that allow us to compute the desired values of $\gamma_2(C_n\Box P_m)$, with $3\leq n\leq 15$ and $m\geq 2$. We also pose a conjecture about the general formulae for the $2$-domination number in this graph class.
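
For readers unfamiliar with the $(\min,+)$ matrix product, here is a minimal sketch of the operation itself. The abstract does not describe how the paper builds its matrices from the cylinder, so the example matrix below is arbitrary and the function names are hypothetical.

```python
import math

INF = math.inf  # plays the role of "no transition" in the (min,+) semiring


def min_plus_product(a, b):
    """C[i][j] = min over k of a[i][k] + b[k][j]."""
    inner, cols = len(b), len(b[0])
    return [
        [min(row[k] + b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]


def min_plus_power(a, exponent):
    """(min,+) power of a square matrix for exponent >= 1, by repeated squaring."""
    result = None
    base = a
    while exponent > 0:
        if exponent & 1:
            result = base if result is None else min_plus_product(result, base)
        base = min_plus_product(base, base)
        exponent >>= 1
    return result


if __name__ == "__main__":
    step = [[INF, 1], [1, INF]]  # arbitrary example weights
    print(min_plus_power(step, 3))
```

Entry (i, j) of the k-th power is the minimum total weight over all k-step sequences from state i to state j, which is why such products suit problems that are solved one column (or layer) at a time.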

[79]  arXiv:2109.10552 [pdf, other]
Title: MEPG: A Minimalist Ensemble Policy Gradient Framework for Deep Reinforcement Learning
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)

Ensemble reinforcement learning (RL) aims to mitigate instability in Q-learning and to learn a robust policy by introducing multiple value and policy functions. In this paper, we consider finding a novel but simple ensemble Deep RL algorithm to solve the resource consumption issue. Specifically, we consider integrating multiple models into a single model. To this end, we propose the Minimalist Ensemble Policy Gradient framework (MEPG), which introduces a minimalist ensemble consistent Bellman update. We also find that one value network is sufficient in our framework. Moreover, we theoretically show that the policy evaluation phase in the MEPG is mathematically equivalent to a deep Gaussian Process. To verify the effectiveness of the MEPG framework, we conduct experiments on the gym simulator, which show that the MEPG framework matches or outperforms the state-of-the-art ensemble methods and model-free methods without additional computational resource costs.

[80]  arXiv:2109.10554 [pdf, ps, other]
Title: On Conflict-Free Replicated Data Types and Equivocation in Byzantine Setups
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Data Structures and Algorithms (cs.DS)

We explore the property of equivocation tolerance for Conflict-Free Replicated Data Types (CRDTs). We show that a subclass of CRDTs is equivocation-tolerant and can thereby cope with any number of Byzantine faults: Without equivocation detection, prevention or remediation, they still fulfill strong eventual consistency (SEC). We also conjecture that there is only one operation-based CRDT design supporting non-commutative operations that fulfills SEC in Byzantine environments with any number of faults.

[81]  arXiv:2109.10557 [pdf, other]
Title: A Reinforcement Learning Benchmark for Autonomous Driving in Intersection Scenarios
Subjects: Artificial Intelligence (cs.AI); Robotics (cs.RO)

In recent years, control under urban intersection scenarios has become an emerging research topic. In such scenarios, the autonomous vehicle confronts complicated situations since it must deal with the interaction with social vehicles in a timely manner while obeying the traffic rules. Generally, the autonomous vehicle is supposed to avoid collisions while pursuing better efficiency. The existing work fails to provide a framework that emphasizes the integrity of the scenarios while being able to deploy and test reinforcement learning (RL) methods. Specifically, we propose a benchmark for training and testing RL-based autonomous driving agents in complex intersection scenarios, which is called RL-CIS. Then, a set of baselines consisting of various algorithms is deployed. The test benchmark and baselines are intended to provide a fair and comprehensive training and testing platform for the study of RL for autonomous driving in the intersection scenario, advancing the progress of RL-based methods for intersection autonomous driving control. The code of our proposed framework can be found at https://github.com/liuyuqi123/ComplexUrbanScenarios.

[82]  arXiv:2109.10559 [pdf, other]
Title: Hierarchical Multimodal Transformer to Summarize Videos
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)

Although video summarization has achieved tremendous success benefiting from Recurrent Neural Networks (RNN), RNN-based methods neglect the global dependencies and multi-hop relationships among video frames, which limits the performance. The transformer is an effective model to deal with this problem, and surpasses RNN-based methods in several sequence modeling tasks, such as machine translation, video captioning, etc. Motivated by the great success of the transformer and the natural structure of video (frame-shot-video), a hierarchical transformer is developed for video summarization, which can capture the dependencies among frames and shots, and summarize the video by exploiting the scene information formed by shots. Furthermore, we argue that both the audio and visual information are essential for the video summarization task. To integrate the two kinds of information, they are encoded in a two-stream scheme, and a multimodal fusion mechanism is developed based on the hierarchical transformer. In this paper, the proposed method is denoted as Hierarchical Multimodal Transformer (HMT). Practically, extensive experiments show that HMT surpasses most of the traditional, RNN-based and attention-based video summarization methods.

[83]  arXiv:2109.10560 [pdf, other]
Title: Why Don't You Click: Neural Correlates of Non-Click Behaviors in Web Search
Subjects: Information Retrieval (cs.IR); Information Theory (cs.IT)

Web search heavily relies on click-through behavior as an essential feedback signal for performance improvement and evaluation. Traditionally, click is usually treated as a positive implicit feedback signal of relevance or usefulness, while non-click (especially non-click after examination) is regarded as a signal of irrelevance or uselessness. However, there are many cases where users do not click on any search results but still satisfy their information need with the contents of the results shown on the Search Engine Result Page (SERP). This raises the problem of measuring result usefulness and modeling user satisfaction in "Zero-click" search scenarios.
Previous works have addressed this issue by (1) detecting user satisfaction for abandoned SERP with context information and (2) considering result-level click necessity with external assessors' annotations. However, few works have investigated the reason behind non-click behavior and estimated the usefulness of non-click results. A challenge for this research question is how to collect valuable feedback for non-click results. With neuroimaging technologies, we design a lab-based user study and reveal differences in brain signals while examining non-click search results with different usefulness levels. The findings in significant brain regions and electroencephalogram (EEG) spectrum also suggest that the process of usefulness judgment might involve similar cognitive functions of relevance perception and satisfaction decoding. Inspired by these findings, we conduct supervised learning tasks to estimate the usefulness of non-click results with brain signals and conventional information (i.e., content and context factors). Results show that it is feasible to utilize brain signals to improve usefulness estimation performance and enhance human-computer interactions in "Zero-click" search scenarios.

[84]  arXiv:2109.10561 [pdf]
Title: Few-Shot Sound Source Distance Estimation Using Relation Networks
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)

In this paper, we study the performance of few-shot learning, specifically meta learning empowered few-shot relation networks, over classic supervised learning in the problem of sound source distance estimation (SSDE). In previous research on deep supervised SSDE, obtaining low accuracies due to the mismatch between the training data (sound from known environments) and the test data (sound from unknown environments) has almost always been the case. By performing comparative experiments on a sufficient amount of data, we show that the few-shot relation network outperforms a classic CNN, which is a supervised deep learning approach, and hence it is possible to calibrate a microphone-equipped system with a few labeled examples of audio recorded in a particular unknown environment to adjust and generalize our classifier to the possible input data and gain higher accuracies.

[85]  arXiv:2109.10563 [pdf, other]
Title: Improving 360 Monocular Depth Estimation via Non-local Dense Prediction Transformer and Joint Supervised and Self-supervised Learning
Comments: 13 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Due to difficulties in acquiring ground truth depth of equirectangular (360) images, the quality and quantity of equirectangular depth data today are insufficient to represent the various scenes in the world. Therefore, 360 depth estimation studies, which relied solely on supervised learning, are destined to produce unsatisfactory results. Although self-supervised learning methods focusing on equirectangular images (EIs) have been introduced, they often have incorrect or non-unique solutions, causing unstable performance. In this paper, we propose 360 monocular depth estimation methods which improve on the areas that limited previous studies. First, we introduce a self-supervised 360 depth learning method that only utilizes gravity-aligned videos, which has the potential to eliminate the need for depth data during the training procedure. Second, we propose a joint learning scheme realized by combining supervised and self-supervised learning. The weaknesses of each learning approach are compensated, thus leading to more accurate depth estimation. Third, we propose a non-local fusion block, which retains global information encoded by the vision transformer when reconstructing the depths. With the proposed methods, we successfully apply the transformer to 360 depth estimation, which, to the best of our knowledge, has not been tried before. On several benchmarks, our approach achieves significant improvements over previous works and establishes a state of the art.

[86]  arXiv:2109.10569 [pdf, other]
Title: The Curse Revisited: a Newly Quantified Concept of Meaningful Distances for Learning from High-Dimensional Noisy Data
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)

Distances between data points are widely used in point cloud representation learning. Yet, it is no secret that under the effect of noise, these distances, and thus the models based upon them, may lose their usefulness in high dimensions. Indeed, the small marginal effects of the noise may then accumulate quickly, shifting empirical closest and furthest neighbors away from the ground truth. In this paper, we characterize such effects in high-dimensional data using an asymptotic probabilistic expression. Furthermore, while it has been previously argued that neighborhood queries become meaningless and unstable when there is a poor relative discrimination between the furthest and closest point, we conclude that this is not necessarily the case when explicitly separating the ground truth data from the noise. More specifically, we derive that under particular conditions, empirical neighborhood relations affected by noise are still likely to be true even when we observe this discrimination to be poor. We include thorough empirical verification of our results, as well as experiments that interestingly show our derived phase shift where neighbors become random or not is identical to the phase shift where common dimensionality reduction methods perform poorly or well for finding low-dimensional representations of high-dimensional data with dense noise.

[87]  arXiv:2109.10571 [pdf, other]
Title: Audio-Visual Grounding Referring Expression for Robotic Manipulation
Subjects: Robotics (cs.RO)

Referring expressions are commonly used when referring to a specific target in people's daily dialogue. In this paper, we develop a novel task of audio-visual grounding referring expression for robotic manipulation. The robot leverages both the audio and visual information to understand the referring expression in the given manipulation instruction and the corresponding manipulations are implemented. To solve the proposed task, an audio-visual framework is proposed for visual localization and sound recognition. We have also established a dataset which contains visual data, auditory data and manipulation instructions for evaluation. Finally, extensive experiments are conducted both offline and online to verify the effectiveness of the proposed audio-visual framework. It is also demonstrated that the robot performs better with the audio-visual data than with only the visual data.

[88]  arXiv:2109.10572 [pdf]
Title: Realism of Simulation Models in Serious Gaming: Two case studies from Urban Water Management Higher Education
Subjects: Multimedia (cs.MM); Human-Computer Interaction (cs.HC)

For games used in educational contexts, realism, i.e., the degree of congruence between the simulation models used in the games and the real-world systems represented, is an important characteristic for achieving learning goals well. However, in the past, the realism of especially entertainment games has often been identified as insufficient. Thus, this study investigates the degree of realism provided by current games. For this purpose, two games in the domain of urban water management, a subdomain of environmental engineering (EE), are examined. One is ANAWAK, a web-based serious game on water management and climate change. For ANAWAK, an analysis of the simulation model is conducted. Second, the simulation model of the entertainment game Cities: Skylines (CS) is analyzed. In addition, a survey among CS players (N=61) is conducted. Thereby, different degrees of realism in various EE subdomains are revealed. All in all, there are still considerable deficits regarding the degree of realism in the CS simulation model. However, modding as a means of achieving more realistic simulation models is more widely supported than in the past.

[89]  arXiv:2109.10573 [pdf, other]
Title: An automatic differentiation system for the age of differential privacy
Comments: 8 pages
Subjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR)

We introduce Tritium, an automatic differentiation-based sensitivity analysis framework for differentially private (DP) machine learning (ML). Optimal noise calibration in this setting requires efficient Jacobian matrix computations and tight bounds on the L2-sensitivity. Our framework achieves these objectives by relying on a functional analysis-based method for sensitivity tracking, which we briefly outline. This approach interoperates naturally and seamlessly with static graph-based automatic differentiation, which enables order-of-magnitude improvements in compilation times compared to previous work. Moreover, we demonstrate that optimising the sensitivity of the entire computational graph at once yields substantially tighter estimates of the true sensitivity compared to interval bound propagation techniques. Our work naturally befits recent developments in DP such as individual privacy accounting, aiming to offer improved privacy-utility trade-offs, and represents a step towards the integration of accessible machine learning tooling with advanced privacy accounting systems.

[90]  arXiv:2109.10575 [pdf, other]
Title: Autonomous Cooperative Transportation System involving Multi-Aerial Robots with Variable Attachment Mechanism
Comments: 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2021)(in press)
Subjects: Robotics (cs.RO)

Cooperative transportation by multi-aerial robots has the potential to support various payloads and improve failsafe against dropping. Furthermore, changing the attachment positions of robots according to payload characteristics increases the stability of transportation. However, there are almost no transportation systems capable of scaling to the payload weight and size and changing the optimal attachment positions. To address this issue, we propose a cooperative transportation system comprising autonomously executable software and suitable hardware, covering the entire process, from pre-takeoff setting to controlled flight. The proposed system decides the formation of the attachment positions by prioritizing controllability based on the center of gravity obtained from Bayesian estimations with robot pairs. We investigated the cooperative transportation of an unknown payload larger than that of whole carrier robots through numerical simulations. Furthermore, we performed cooperative transportation of an unknown payload (with a weight of about 3.2 kg and maximum length of 1.76 m) using eight robots. The proposed system was found to be versatile with regard to handling unknown payloads with different shapes and center-of-gravity positions.

[91]  arXiv:2109.10576 [pdf, other]
Title: Event-triggered observer design for linear systems
Subjects: Systems and Control (eess.SY)

We present an event-triggered observer design for linear time-invariant systems, where the measured output is sent to the observer only when a triggering condition is satisfied. We proceed by emulation and we first construct a continuous-time Luenberger observer. We then propose a dynamic rule to trigger transmissions, which only depends on the plant output and an auxiliary scalar state variable. The overall system is modeled as a hybrid system, for which a jump corresponds to an output transmission. We show that the proposed event-triggered observer guarantees global practical asymptotic stability for the estimation error dynamics. Moreover, under mild boundedness conditions on the plant state and its input, we prove that there exists a uniform strictly positive minimum inter-event time between any two consecutive transmissions, guaranteeing that the system does not exhibit Zeno solutions. Finally, the proposed approach is applied to a numerical case study of a lithium-ion battery.

[92]  arXiv:2109.10582 [pdf, other]
Title: Partial sensitivity analysis in differential privacy
Subjects: Cryptography and Security (cs.CR); Artificial Intelligence (cs.AI)

Differential privacy (DP) allows the quantification of privacy loss when the data of individuals is subjected to algorithmic processing such as machine learning, as well as the provision of objective privacy guarantees. However, while techniques such as individual Rényi DP (RDP) allow for granular, per-person privacy accounting, few works have investigated the impact of each input feature on the individual's privacy loss. Here we extend the view of individual RDP by introducing a new concept we call partial sensitivity, which leverages symbolic automatic differentiation to determine the influence of each input feature on the gradient norm of a function. We experimentally evaluate our approach on queries over private databases, where we obtain a feature-level contribution of private attributes to the DP guarantee of individuals. Furthermore, we explore our findings in the context of neural network training on synthetic data by investigating the partial sensitivity of input pixels on an image classification task.

[93]  arXiv:2109.10583 [pdf, other]
Title: Efficient Object Manipulation to an Arbitrary Goal Pose: Learning-based Anytime Prioritized Planning
Subjects: Robotics (cs.RO)

We focus on the task of object manipulation to an arbitrary goal pose, in which a robot is supposed to pick an assigned object and place it at the goal position with a specific pose. However, limited by the execution space of the manipulator with gripper, one-step picking, moving and releasing might fail, in which case an intermediate object pose is required as a transition. In this paper, we propose a learning-driven anytime prioritized search-based solver to find a feasible solution with low path cost in a short time. In our work, the problem is formulated as a hierarchical learning problem, with the high level aiming at finding an intermediate object pose, and the low level at manipulator path planning between adjacent grasps. We train a path cost estimator off-line to predict approximate path planning costs, which serve as pseudo rewards to allow for pre-training the high-level planner without interacting with the simulator. To deal with the problem of distribution mismatch between the cost net and the actual execution cost space, a refined training stage is conducted with simulation interaction. A series of experiments carried out in simulation and the real world indicate that our system can achieve better performance in the object manipulation task with less time and less cost.

[94]  arXiv:2109.10591 [pdf, other]
Title: High-dimensional Bayesian Optimization for CNN Auto Pruning with Clustering and Rollback
Comments: 7 pages with 1 page for references
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)

Pruning has been widely used to slim convolutional neural network (CNN) models to achieve a good trade-off between accuracy and model size so that the pruned models become feasible for power-constrained devices such as mobile phones. This process can be automated to avoid the expensive hand-crafted efforts and to explore a large pruning space automatically so that a high-performance pruning policy can be achieved efficiently. Nowadays, reinforcement learning (RL) and Bayesian optimization (BO)-based auto pruners are widely used due to their solid theoretical foundation, universality, and high compressing quality. However, the RL agent suffers from long training times and high variance of results, while the BO agent is time-consuming for high-dimensional design spaces. In this work, we propose an enhanced BO agent to obtain significant acceleration for auto pruning in high-dimensional design spaces. To achieve this, a novel clustering algorithm is proposed to reduce the dimension of the design space to speed up the searching process. Then, a roll-back algorithm is proposed to recover the high-dimensional design space so that higher pruning accuracy can be obtained. We validate our proposed method on ResNet, MobileNet, and VGG models, and our experiments show that the proposed method significantly improves the accuracy of BO when pruning very deep CNN models. Moreover, our method achieves lower variance and shorter time than the RL-based counterpart.

[95]  arXiv:2109.10593 [pdf, other]
Title: Emulating Aerosol Microphysics with a Machine Learning
Subjects: Machine Learning (cs.LG)

Aerosol particles play an important role in the climate system by absorbing and scattering radiation and influencing cloud properties. They are also one of the biggest sources of uncertainty for climate modeling. Many climate models do not include aerosols in sufficient detail. In order to achieve higher accuracy, aerosol microphysical properties and processes have to be accounted for. This is done in the ECHAM-HAM global climate aerosol model using the M7 microphysics model, but increased computational costs make it very expensive to run at higher resolutions or for a longer time. We aim to use machine learning to approximate the microphysics model at sufficient accuracy and reduce the computational cost by being fast at inference time. The original M7 model is used to generate data of input-output pairs to train a neural network on it. By using a special logarithmic transform we are able to learn the variables' tendencies, achieving an average $R^2$ score of $89\%$. On a GPU we achieve a speed-up of 120 compared to the original model.

[96]  arXiv:2109.10595 [pdf, other]
Title: Live Speech Portraits: Real-Time Photorealistic Talking-Head Animation
Comments: SIGGRAPH Asia 2021, 17 pages, 16 figures
Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)

To the best of our knowledge, we present the first live system that generates personalized photorealistic talking-head animation driven only by audio signals at over 30 fps. Our system contains three stages. The first stage is a deep neural network that extracts deep audio features along with a manifold projection to project the features to the target person's speech space. In the second stage, we learn facial dynamics and motions from the projected audio features. The predicted motions include head poses and upper body motions, where the former is generated by an autoregressive probabilistic model which models the head pose distribution of the target person. Upper body motions are deduced from head poses. In the final stage, we generate conditional feature maps from previous predictions and send them with a candidate image set to an image-to-image translation network to synthesize photorealistic renderings. Our method generalizes well to wild audio and successfully synthesizes high-fidelity personalized facial details, e.g., wrinkles, teeth. Our method also allows explicit control of head poses. Extensive qualitative and quantitative evaluations, along with user studies, demonstrate the superiority of our method over state-of-the-art techniques.

[97]  arXiv:2109.10596 [pdf, other]
Title: Fully probabilistic design for knowledge fusion between Bayesian filters under uniform disturbances
Authors: Lenka Kuklišová Pavelková (1), Ladislav Jirsa (1), Anthony Quinn (1 and 2) ((1) Czech Academy of Sciences, Institute of Information Theory and Automation, Czech Republic, (2) Trinity College Dublin, the University of Dublin, Ireland)
Comments: 39 pages
Subjects: Machine Learning (cs.LG)

This paper considers the problem of Bayesian transfer learning-based knowledge fusion between linear state-space processes driven by uniform state and observation noise processes. The target task conditions on probabilistic state predictor(s) supplied by the source filtering task(s) to improve its own state estimate. A joint model of the target and source(s) is not required and is not elicited. The resulting decision-making problem for choosing the optimal conditional target filtering distribution under incomplete modelling is solved via fully probabilistic design (FPD), i.e. via appropriate minimization of Kullback-Leibler divergence (KLD). The resulting FPD-optimal target learner is robust, in the sense that it can reject poor-quality source knowledge. In addition, the fact that this Bayesian transfer learning (BTL) scheme does not depend on a model of interaction between the source and target tasks ensures robustness to the misspecification of such a model. The latter is a problem that affects conventional transfer learning methods. The properties of the proposed BTL scheme are demonstrated via extensive simulations, and in comparison with two contemporary alternatives.

[98]  arXiv:2109.10598 [pdf, other]
Title: Diarisation using Location tracking with agglomerative clustering
Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)

Previous works have shown that spatial location information can be complementary to speaker embeddings for a speaker diarisation task. However, the models used often assume that speakers are fairly stationary throughout a meeting. This paper proposes to relax this assumption, by explicitly modelling the movements of speakers within an Agglomerative Hierarchical Clustering (AHC) diarisation framework. Kalman filters, which track the locations of speakers, are used to compute log-likelihood ratios that contribute to the cluster affinity computations for the AHC merging and stopping decisions. Experiments show that the proposed approach is able to yield improvements on a Microsoft rich meeting transcription task, compared to methods that do not use location information or that make stationarity assumptions.

[99]  arXiv:2109.10602 [pdf, ps, other]
Title: Context-aware Tree-based Deep Model for Recommender Systems
Subjects: Information Retrieval (cs.IR); Artificial Intelligence (cs.AI)

How to predict precise user preference and how to make efficient retrieval from a big corpus are two major challenges of large-scale industrial recommender systems. In tree-based methods, a tree structure T is adopted as an index and each item in the corpus is attached to a leaf node on T. Then the recommendation problem is converted into a hierarchical retrieval problem solved efficiently by a beam search process. In this paper, we argue that the tree index used to support efficient retrieval in tree-based methods also has rich hierarchical information about the corpus. Furthermore, we propose a novel context-aware tree-based deep model (ConTDM) for recommender systems. In ConTDM, a context-aware user preference prediction model M is designed to utilize both horizontal and vertical contexts on T. Horizontally, a graph convolutional layer is used to enrich the representation of both users and nodes on T with their neighbors. Vertically, a parent fusion layer is designed in M to transmit the user preference representation in higher levels of T to the current level, grasping the essence that tree-based methods generate the candidate set from coarse to fine during the beam search retrieval. Besides, we argue that the proposed user preference model in ConTDM can be conveniently extended to other tree-based methods for recommender systems. Both experiments on large-scale real-world datasets and online A/B tests in large-scale industrial applications show the significant improvements brought by ConTDM.
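
The hierarchical retrieval over a tree index can be sketched as a level-wise beam search. The Python below is a minimal illustration under stated assumptions, not ConTDM itself: the Node class, the toy tree, and the fixed SCORES table (a stand-in for a learned preference model such as M) are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    """Hypothetical tree-index node; items live at the leaves."""
    name: str
    children: List["Node"] = field(default_factory=list)


def beam_search(root: Node, score: Callable[[Node], float], beam_width: int) -> List[Node]:
    """Descend level by level, keeping only the beam_width best-scoring nodes.

    Leaves that survive the pruning at their level are collected and the
    best beam_width of them are returned, highest score first.
    """
    frontier = [root]
    leaves: List[Node] = []
    while frontier:
        # Expand every node kept on the previous level.
        candidates = [child for node in frontier for child in node.children]
        kept = sorted(candidates, key=score, reverse=True)[:beam_width]
        leaves.extend(n for n in kept if not n.children)
        frontier = [n for n in kept if n.children]
    return sorted(leaves, key=score, reverse=True)[:beam_width]


# Toy index: two coarse categories, each with two items (all values assumed).
root = Node("root", [
    Node("A", [Node("a1"), Node("a2")]),
    Node("B", [Node("b1"), Node("b2")]),
])
SCORES = {"A": 0.9, "B": 0.1, "a1": 0.8, "a2": 0.3, "b1": 0.95, "b2": 0.2}


def toy_score(node: Node) -> float:
    """Stand-in for a learned user-preference score of a tree node."""
    return SCORES[node.name]
```

With a beam width of 1 the search commits to category A and never visits b1, even though b1 has the highest leaf score; a width of 2 recovers it. This is the coarse-to-fine behaviour that makes scores at higher levels of the tree matter.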

[100]  arXiv:2109.10604 [pdf, other]
Title: NOAHQA: Numerical Reasoning with Interpretable Graph Question Answering Dataset
Comments: Findings of EMNLP 2021. Code will be released at: this https URL
Subjects: Computation and Language (cs.CL)

While diverse question answering (QA) datasets have been proposed and contributed significantly to the development of deep learning models for QA tasks, the existing datasets fall short in two aspects. First, we lack QA datasets covering complex questions that involve answers as well as the reasoning processes to get the answers. As a result, the state-of-the-art QA research on numerical reasoning still focuses on simple calculations and does not provide the mathematical expressions or evidences justifying the answers. Second, the QA community has contributed much effort to improving the interpretability of QA models. However, these models fail to explicitly show the reasoning process, such as the evidence order for reasoning and the interactions between different pieces of evidence. To address the above shortcomings, we introduce NOAHQA, a conversational and bilingual QA dataset with questions requiring numerical reasoning with compound mathematical expressions. With NOAHQA, we develop an interpretable reasoning graph as well as the appropriate evaluation metric to measure the answer quality. We evaluate the state-of-the-art QA models trained using existing QA datasets on NOAHQA and show that the best among them can only achieve 55.5 exact match scores, while the human performance is 89.7. We also present a new QA model for generating a reasoning graph where the reasoning graph metric still has a large gap compared with that of humans, e.g., 28 scores.

[101]  arXiv:2109.10606 [pdf, other]
Title: Privacy-preserving Credit Scoring via Functional Encryption
Comments: Computational Science and Its Applications -- ICCSA 2021 -- Springer International Publishing
Subjects: Cryptography and Security (cs.CR)

The majority of financial organizations managing confidential data are aware of security threats and leverage widely accepted solutions (e.g., storage encryption, transport-level encryption, intrusion detection systems) to prevent or detect attacks. Yet these hardening measures do little to face even worse threats posed to data-in-use. Solutions such as Homomorphic Encryption (HE) and hardware-assisted Trusted Execution Environment (TEE) are nowadays among the preferred approaches for mitigating this type of threat. However, given the high performance overhead of HE, financial institutions -- whose processing rate requirements are stringent -- are more oriented towards TEE-based solutions. The X-Margin Inc. company, for example, offers secure financial computations by combining the Intel SGX TEE technology and HE-based Zero-Knowledge Proofs, which shield customers' data-in-use even against malicious insiders, i.e., users having privileged access to the system. Although such a solution offers strong security guarantees, it is constrained by having to trust Intel and by the SGX hardware extension availability. In this paper, we evaluate a new frontier for X-Margin, i.e., performing privacy-preserving credit risk scoring via an emerging cryptographic scheme: Functional Encryption (FE), which allows a user to only learn a function of the encrypted data. We describe how the X-Margin application can benefit from this innovative approach and -- most importantly -- evaluate its performance impact.

[102]  arXiv:2109.10607 [pdf, other]
Title: Accuracy Evaluation of Touch Tasks in Commodity Virtual and Augmented Reality Head-Mounted Displays
Comments: To appear in SUI 2021, November 09-10, Virtual Conference
Subjects: Human-Computer Interaction (cs.HC)

An increasing number of consumer-oriented head-mounted displays (HMD) for augmented and virtual reality (AR/VR) are capable of finger and hand tracking. We report on the accuracy of off-the-shelf VR and AR HMDs when used for touch-based tasks such as pointing or drawing. Specifically, we report on the finger tracking accuracy of the VR head-mounted displays Oculus Quest, Vive Pro and the Leap Motion controller, when attached to a VR HMD, as well as the finger tracking accuracy of the AR head-mounted displays Microsoft HoloLens 2 and Magic Leap. We present the results of two experiments in which we compare the accuracy for absolute and relative pointing tasks using both human participants and a robot. The results suggest that HTC Vive has a lower spatial accuracy than the Oculus Quest and Leap Motion and that the Microsoft HoloLens 2 provides higher spatial accuracy than Magic Leap One. These findings can serve as decision support for researchers and practitioners in choosing which systems to use in the future.

[103]  arXiv:2109.10608 [pdf, ps, other]
Title: Noisy-to-Noisy Voice Conversion Framework with Denoising Model
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)

In a conventional voice conversion (VC) framework, a VC model is often trained with a clean dataset consisting of speech data carefully recorded and selected by minimizing background interference. However, collecting such a high-quality dataset is expensive and time-consuming. Leveraging crowd-sourced speech data in training is more economical. Moreover, for some real-world VC scenarios such as VC in video and VC-based data augmentation for speech recognition systems, the background sounds themselves are also informative and need to be maintained. In this paper, to explore VC with the flexibility of handling background sounds, we propose a noisy-to-noisy (N2N) VC framework composed of a denoising module and a VC module. With the proposed framework, we can convert the speaker's identity while preserving the background sounds. Both objective and subjective evaluations are conducted, and the results reveal the effectiveness of the proposed framework.

[104]  arXiv:2109.10610 [pdf, other]
Title: Relative-error stability of numerical algorithms
Subjects: Numerical Analysis (math.NA)

We formalize the definition of a stable algorithm that is (i) adapted to the use of multiple and variable precision arithmetic, (ii) sufficiently close to the actual practice of computing to be useful, and (iii) sufficiently robust from a mathematical point of view as to allow for the rigorous proof of theorems. This allows us to state some widely satisfied hypotheses, depending only on two functions $f$ and $g$, under which the composition of a stable algorithm for $f$ and a stable algorithm for $g$ is a stable algorithm for the composition $f \circ g$.

[105]  arXiv:2109.10611 [pdf, ps, other]
Title: Model Reference Adaptive Control with Linear-like Closed-loop Behavior
Comments: This is an extended version of a paper which will appear at the 60th IEEE Conference on Decision and Control. arXiv admin note: text overlap with arXiv:1902.09372
Subjects: Systems and Control (eess.SY); Optimization and Control (math.OC)

It is typically proven in adaptive control that asymptotic stabilization and tracking hold, and that at best a bounded-noise bounded-state property is proven. Recently, it has been shown in both the pole-placement control and the $d$-step ahead control settings that if, as part of the adaptive controller, a parameter estimator based on the original projection algorithm is used and the parameter estimates are restricted to a convex set, then the closed-loop system experiences linear-like behavior: exponential stability, a bounded gain on the noise in every $p$-norm, and a convolution bound on the exogenous inputs; this can be leveraged to provide tolerance to unmodelled dynamics and plant parameter time-variation. In this paper, we extend the approach to the more general Model Reference Adaptive Control (MRAC) problem and demonstrate that we achieve the same desirable linear-like closed-loop properties.

[106]  arXiv:2109.10613 [pdf, other]
Title: COVR: A test-bed for Visually Grounded Compositional Generalization with real images
Comments: EMNLP 2021
Subjects: Computation and Language (cs.CL)

While interest in models that generalize at test time to new compositions has risen in recent years, benchmarks in the visually-grounded domain have thus far been restricted to synthetic images. In this work, we propose COVR, a new test-bed for visually-grounded compositional generalization with real images. To create COVR, we use real images annotated with scene graphs, and propose an almost fully automatic procedure for generating question-answer pairs along with a set of context images. COVR focuses on questions that require complex reasoning, including higher-order operations such as quantification and aggregation. Due to the automatic generation process, COVR facilitates the creation of compositional splits, where models at test time need to generalize to new concepts and compositions in a zero- or few-shot setting. We construct compositional splits using COVR and demonstrate a myriad of cases where state-of-the-art pre-trained language-and-vision models struggle to compositionally generalize.

[107]  arXiv:2109.10616 [pdf, other]
Title: Enriching and Controlling Global Semantics for Text Summarization
Comments: Accepted to the main EMNLP 2021 conference
Subjects: Computation and Language (cs.CL)

Recently, Transformer-based models have been proven effective in the abstractive summarization task by creating fluent and informative summaries. Nevertheless, these models still suffer from the short-range dependency problem, causing them to produce summaries that miss the key points of the document. In this paper, we attempt to address this issue by introducing a neural topic model empowered with normalizing flow to capture the global semantics of the document, which are then integrated into the summarization model. In addition, to avoid the overwhelming effect of global semantics on contextualized representation, we introduce a mechanism to control the amount of global semantics supplied to the text generation module. Our method outperforms state-of-the-art summarization models on five common text summarization datasets, namely CNN/DailyMail, XSum, Reddit TIFU, arXiv, and PubMed.

[108]  arXiv:2109.10617 [pdf, other]
Title: Solving Large Steiner Tree Problems in Graphs for Cost-Efficient Fiber-To-The-Home Network Expansion
Comments: Submitted to ICAART 2022, 10 pages, 18 figures
Subjects: Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

The expansion of Fiber-To-The-Home (FTTH) networks creates high costs due to expensive excavation procedures. Optimizing the planning process and minimizing the cost of the earth excavation work therefore lead to large savings. Mathematically, the FTTH network problem can be described as a minimum Steiner Tree problem. Even though the Steiner Tree problem has already been investigated intensively in the last decades, it might be further optimized with the help of new computing paradigms and emerging approaches. This work studies upcoming technologies, such as Quantum Annealing, Simulated Annealing and nature-inspired methods like Evolutionary Algorithms or slime-mold-based optimization. Additionally, we investigate partitioning and simplifying methods. Evaluated on several real-life problem instances, we could outperform a traditional, widely-used baseline (NetworkX Approximate Solver) on most of the domains. Prior partitioning of the initial graph and the presented slime-mold-based approach were especially valuable for a cost-efficient approximation. Quantum Annealing seems promising, but was limited by the number of available qubits.

[109]  arXiv:2109.10619 [pdf, other]
Title: Identifying Fast/Slow Thinking without Prior
Subjects: Computer Science and Game Theory (cs.GT)

System 1 vs. 2 theory describes two modes of thought, a fast, instinctive one and a slow, logical one. When we ask a question (e.g. A bat and ball cost $1.10. The bat costs $1 more than the ball. How much does the ball cost?), with prior, we can identify fast/slow thinking ($.10/$.05). But what if we do not have prior? A very clever method, surprisingly popular, additionally asks what percentage of other people answer $.10/$.05 and selects the answer that is more popular than people predict. However, the distribution report is non-minimal for many people especially for non-binary choices, the choices design requires prior and only the best answer is selected. Here we propose a simple minimal paradigm that elicits the full hierarchy of the collected answers: we ask a single open response question and elicit each respondent's answer (e.g. $.05) and guess(es) for other people's answers (e.g. $.10). We record the number of people who report a specific answer-guess pair (e.g. 10 people answer $.05 and guess $.10) by an answer-guess matrix. By ranking the answers to maximize the sum of the upper triangular area of the matrix, we obtain and visualize the hierarchy of the answers without any prior. Our paradigm has minimal requirement for both the respondent (no distribution report) and the requester (no choices design; check the hierarchy visually) and can be also used to research how people reason about other people's minds.
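
The ranking step can be made concrete with a small sketch. The Python below builds the answer-guess matrix as a dictionary of counts and searches all orderings for the one that maximizes the upper-triangular sum. It is an illustration under assumptions, not the authors' implementation: the function name, the exhaustive search (factorial cost, so only suitable for a handful of distinct answers), and the toy responses are hypothetical, apart from the "10 people answer $.05 and guess $.10" pair taken from the abstract.

```python
from itertools import permutations
from typing import Dict, Iterable, List, Tuple


def rank_answers(pairs: Iterable[Tuple[str, str]]) -> List[str]:
    """Order answers so the answer-guess matrix is as upper-triangular as possible.

    pairs holds one (answer, guess) tuple per report. The returned ordering
    maximizes the number of reports whose answer is ranked strictly above
    the guessed answer (diagonal entries never count).
    """
    counts: Dict[Tuple[str, str], int] = {}
    answers = set()
    for answer, guess in pairs:
        counts[(answer, guess)] = counts.get((answer, guess), 0) + 1
        answers.update((answer, guess))

    def upper_sum(order: Tuple[str, ...]) -> int:
        # Sum of matrix entries above the diagonal for this ordering.
        return sum(
            counts.get((order[i], order[j]), 0)
            for i in range(len(order))
            for j in range(i + 1, len(order))
        )

    # Brute force over all orderings; sorting makes tie-breaking deterministic.
    return list(max(permutations(sorted(answers)), key=upper_sum))


# Toy data: 10 respondents answer 0.05 and guess 0.10 (from the abstract);
# the other two groups are assumed purely for illustration.
responses = (
    [("0.05", "0.10")] * 10
    + [("0.10", "0.05")] * 2
    + [("0.10", "0.10")] * 30
)
```

In this toy data the answer 0.10 is far more common, yet 0.05 ranks first, because respondents who answer 0.05 anticipate the 0.10 answer much more often than the reverse.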

[110]  arXiv:2109.10632 [pdf, other]
Title: Locality Matters: A Scalable Value Decomposition Approach for Cooperative Multi-Agent Reinforcement Learning
Subjects: Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Multiagent Systems (cs.MA); Systems and Control (eess.SY); Machine Learning (stat.ML)

Cooperative multi-agent reinforcement learning (MARL) faces significant scalability issues due to state and action spaces that are exponentially large in the number of agents. As environments grow in size, effective credit assignment becomes increasingly harder and often results in infeasible learning times. Still, in many real-world settings, there exist simplified underlying dynamics that can be leveraged for more scalable solutions. In this work, we exploit such locality structures effectively whilst maintaining global cooperation. We propose a novel, value-based multi-agent algorithm called LOMAQ, which incorporates local rewards in the Centralized Training Decentralized Execution paradigm. Additionally, we provide a direct reward decomposition method for finding these local rewards when only a global signal is provided. We test our method empirically, showing it scales well compared to other methods, significantly improving performance and convergence speed.

[111]  arXiv:2109.10633 [pdf, other]
Title: Reactive Answer Set Programming
Comments: Under consideration in Theory and Practice of Logic Programming (TPLP)
Subjects: Artificial Intelligence (cs.AI)

Logic Production System (LPS) is a logic-based framework for modelling reactive behaviour. Based on abductive logic programming, it combines reactive rules with logic programs, a database and a causal theory that specifies transitions between the states of the database. This paper proposes a systematic mapping of the Kernel of this framework (called KELPS) into an answer set program (ASP). For this purpose a new variant of KELPS with finite models, called $n$-distance KELPS, is introduced. A formal definition of the mapping from this $n$-distance KELPS to ASP is given and proven sound and complete. The Answer Set Programming paradigm makes it possible to capture behaviours beyond the basic reactivity of KELPS, in particular proactive, preemptive and prospective behaviours. These are all discussed and illustrated with examples. Then a hybrid framework is proposed that integrates KELPS and ASP, combining the strengths of both paradigms.

[112]  arXiv:2109.10634 [pdf, other]
Title: Filtered integration rules for finite Hilbert transforms
Subjects: Numerical Analysis (math.NA)

A product quadrature rule, based on the filtered de la Vallée Poussin polynomial approximation, is proposed for evaluating the finite Hilbert transform in [-1, 1]. Convergence results are stated in weighted uniform norm for functions belonging to suitable Besov type subspaces. Several numerical tests are provided, also comparing the rule with other formulas known in the literature.

[113]  arXiv:2109.10637 [pdf, other]
Title: Facilitating human-wildlife cohabitation through conflict prediction
Comments: 7 pages, 4 figures
Subjects: Artificial Intelligence (cs.AI)

With increasing world population and expanded use of forests as cohabited regions, interactions and conflicts with wildlife are increasing, leading to large-scale loss of lives (animal and human) and livelihoods (economic). While community knowledge is valuable, forest officials and conservation organisations can greatly benefit from predictive analysis of human-wildlife conflict, leading to targeted interventions that can potentially help save lives and livelihoods. However, the problem of prediction is a complex socio-technical problem in the context of limited data in low-resource regions.
Identifying the "right" features to make accurate predictions of conflicts at the required spatial granularity using a sparse conflict training dataset is the key challenge that we address in this paper. Specifically, we do an illustrative case study on human-wildlife conflicts in the Bramhapuri Forest Division in Chandrapur, Maharashtra, India. Most existing work has considered human-wildlife conflicts in protected areas and to the best of our knowledge, this is the first effort at prediction of human-wildlife conflicts in unprotected areas and using those predictions for deploying interventions on the ground.

[114]  arXiv:2109.10638 [pdf]
Title: Scholarly outputs of EU Research Funding Programs: Understanding differences between datasets of publications reported by grant holders and OpenAIRE Research Graph in H2020
Subjects: Digital Libraries (cs.DL)

Linking research results to grants is an essential prerequisite for effective monitoring and evaluation of funding programs. For the EU research funding programs, there are multiple datasets linking scholarly publications to the individual grants, including both open data and those from commercial bibliometric databases. In this paper, we systematically compare openly available data from two data sources: on the one hand, those reported by the grant holders (and subsequently published by the European Commission on its open data portal) and, on the other, those from the OpenAIRE Research Graph, which collects data from multiple sources. We describe the dataflow leading to their creation and assess the quality of data by validating, on a sample basis, the link <project, publications>. We report that, by and large, the OpenAIRE Research Graph offers a more complete dataset of scholarly outputs from EU research funding programs. We also identify possible improvements and make recommendations on how they can be addressed.

[115]  arXiv:2109.10640 [pdf, other]
Title: LDC-VAE: A Latent Distribution Consistency Approach to Variational AutoEncoders
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Information Theory (cs.IT)

Variational autoencoders (VAEs), an important class of generative models, have received considerable research interest and found many successful applications. However, it remains a challenge to achieve consistency between the learned latent distribution and the prior latent distribution when optimizing the evidence lower bound (ELBO), and this inconsistency ultimately leads to unsatisfactory performance in data generation. In this paper, we propose a latent distribution consistency approach to avoid such substantial inconsistency between the posterior and prior latent distributions in ELBO optimization. We name our method latent distribution consistency VAE (LDC-VAE). We achieve this by assuming that the real posterior distribution in latent space has a Gibbs form, and approximating it with our encoder. However, there is no analytical solution for such a Gibbs posterior, and traditional approximation methods, such as iterative sampling-based MCMC, are time-consuming. To address this problem, we use Stein Variational Gradient Descent (SVGD) to approximate the Gibbs posterior. Meanwhile, we use SVGD to train a sampler net that can efficiently draw samples from the Gibbs posterior. Comparative studies on popular image generation datasets show that our method achieves comparable or even better performance than several powerful improved VAEs.

[116]  arXiv:2109.10642 [pdf, other]
Title: Decentralized Learning of Tree-Structured Gaussian Graphical Models from Noisy Data
Authors: Akram Hussain
Comments: 32 pages, there are more authors of this paper
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)

This paper studies the decentralized learning of tree-structured Gaussian graphical models (GGMs) from noisy data. In decentralized learning, data set is distributed across different machines (sensors), and GGMs are widely used to model complex networks such as gene regulatory networks and social networks. The proposed decentralized learning uses the Chow-Liu algorithm for estimating the tree-structured GGM.
In previous works, upper bounds on the probability of incorrect tree structure recovery were mostly given without any practical noise, for simplification. This paper, in contrast, investigates the effects of three common types of noisy channels: Gaussian, erasure, and binary symmetric channels. For the Gaussian channel case, to satisfy the failure probability upper bound $\delta > 0$ in recovering a $d$-node tree structure, our proposed theorem requires only $\mathcal{O}(\log(\frac{d}{\delta}))$ samples for the smallest sample size ($n$), compared to the $\mathcal{O}(\log^4(\frac{d}{\delta}))$ samples required in the previous literature \cite{Nikolakakis}, by using the positive correlation coefficient assumption that is used in some important works in the literature. Moreover, the approximately bounded Gaussian random variable assumption does not appear in \cite{Nikolakakis}. Given some knowledge about the tree structure, the proposed Algorithmic Bound achieves clearly better performance with small sample sizes (e.g., $< 2000$) compared with formulaic bounds. Finally, we validate our theoretical results by performing simulations on synthetic data sets.
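
For readers unfamiliar with the Chow-Liu algorithm, the Python below is a minimal sketch of its Gaussian form: estimate pairwise correlations, convert them to mutual information via $I = -\frac{1}{2}\log(1-\rho^2)$, and take a maximum-weight spanning tree. It is a centralized, noise-free illustration only; the decentralized setting and the noisy channels studied in the paper are not modelled, and the function names and the synthetic chain generator are assumptions.

```python
import math
import random
from typing import List, Tuple


def correlation(x: List[float], y: List[float]) -> float:
    """Sample Pearson correlation of two equally long, non-constant columns."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def gaussian_mi(rho: float) -> float:
    """Mutual information of a bivariate Gaussian with correlation rho."""
    return -0.5 * math.log(1.0 - min(rho * rho, 1.0 - 1e-12))


def chow_liu_tree(columns: List[List[float]]) -> List[Tuple[int, int]]:
    """Maximum-weight spanning tree over pairwise mutual information (Kruskal)."""
    d = len(columns)
    edges = sorted(
        ((gaussian_mi(correlation(columns[i], columns[j])), i, j)
         for i in range(d) for j in range(i + 1, d)),
        reverse=True,
    )
    parent = list(range(d))

    def find(u: int) -> int:
        # Union-find root lookup with path halving.
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    tree = []
    for _, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # adding this edge does not create a cycle
            parent[ri] = rj
            tree.append((i, j))
    return sorted(tree)


def sample_chain(n: int, d: int = 4, coupling: float = 0.8, seed: int = 0) -> List[List[float]]:
    """Draw n samples from a hypothetical Gaussian chain X0 - X1 - ... - X(d-1)."""
    rng = random.Random(seed)
    noise_scale = math.sqrt(1.0 - coupling ** 2)  # keeps unit variance
    columns: List[List[float]] = [[] for _ in range(d)]
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        columns[0].append(x)
        for k in range(1, d):
            x = coupling * x + noise_scale * rng.gauss(0.0, 1.0)
            columns[k].append(x)
    return columns
```

Because the Gaussian mutual information is monotone in $|\rho|$, ranking edges by absolute correlation would give the same tree; the explicit conversion is kept here only to mirror the usual statement of the algorithm.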

[117]  arXiv:2109.10645 [pdf, other]
Title: Contrastive Learning for Fair Representations
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)

Trained classification models can unintentionally lead to biased representations and predictions, which can reinforce societal preconceptions and stereotypes. Existing debiasing methods for classification models, such as adversarial training, are often expensive to train and difficult to optimise. In this paper, we propose a method for mitigating bias in classifier training by incorporating contrastive learning, in which instances sharing the same class label are encouraged to have similar representations, while instances sharing a protected attribute are forced further apart. In such a way our method learns representations which capture the task label in focused regions, while ensuring the protected attribute has diverse spread, and thus has limited impact on prediction and thereby results in fairer models. Extensive experimental results across four tasks in NLP and computer vision show (a) that our proposed method can achieve fairer representations and realises bias reductions compared with competitive baselines; and (b) that it can do so without sacrificing main task performance; (c) that it sets a new state-of-the-art performance in one task despite reducing the bias. Finally, our method is conceptually simple and agnostic to network architectures, and incurs minimal additional compute cost.
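
One way to read the objective described above is as a difference of two supervised contrastive terms: one pulls together instances with the same class label, the other is subtracted so that instances sharing a protected attribute are pushed apart. The Python below is a hedged sketch of that reading and not the authors' formulation: the exact loss, the weighting, the temperature, and all names are assumptions, and representations are assumed to be non-zero vectors.

```python
import math
from typing import List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def sup_con_loss(reps: List[Sequence[float]], labels: Sequence[int],
                 temperature: float = 0.5) -> float:
    """Supervised contrastive loss: low when same-label instances are close."""
    n = len(reps)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [p for p in range(n) if p != i and labels[p] == labels[i]]
        if not positives:
            continue  # an anchor without positives contributes nothing
        denom = sum(math.exp(cosine(reps[i], reps[a]) / temperature)
                    for a in range(n) if a != i)
        total -= sum(
            math.log(math.exp(cosine(reps[i], reps[p]) / temperature) / denom)
            for p in positives
        ) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


def fair_contrastive_objective(reps: List[Sequence[float]], class_labels: Sequence[int],
                               protected: Sequence[int], weight: float = 1.0,
                               temperature: float = 0.5) -> float:
    """Attract same-class pairs, repel same-protected-attribute pairs (assumed form)."""
    return (sup_con_loss(reps, class_labels, temperature)
            - weight * sup_con_loss(reps, protected, temperature))


# Toy check with assumed data: four instances, two classes, two protected groups.
class_labels = [0, 0, 1, 1]
protected = [0, 1, 0, 1]
by_class = [(1.0, 0.0), (1.0, 0.0), (0.0, 1.0), (0.0, 1.0)]      # clustered by class
by_protected = [(1.0, 0.0), (0.0, 1.0), (1.0, 0.0), (0.0, 1.0)]  # clustered by group
```

Representations clustered by class score lower on this objective than representations clustered by protected group, which is the direction the training signal is meant to push. In practice such a term would be added to the usual classification loss rather than used alone.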

[118]  arXiv:2109.10647 [pdf, ps, other]
Title: Numerical analysis of a finite element formulation of the P2D model for Lithium-ion cells
Authors: Rodolfo Bermejo
Comments: 31 pages, 2 figures
Subjects: Numerical Analysis (math.NA)

The mathematical P2D model is a system of strongly coupled nonlinear parabolic-elliptic equations that describes the electrodynamics of lithium-ion batteries. In this paper, we present the numerical analysis of a finite element-implicit Euler scheme for such a model. We obtain error estimates for both the spatially semidiscrete and the fully discrete systems of equations, and establish the existence and uniqueness of the fully discrete solution.

[119]  arXiv:2109.10649 [pdf, other]
Title: Caption Enriched Samples for Improving Hateful Memes Detection
Comments: EMNLP 2021
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)

The recently introduced hateful meme challenge demonstrates the difficulty of determining whether a meme is hateful or not. Specifically, both unimodal language models and multimodal vision-language models cannot reach the human level of performance. Motivated by the need to model the contrast between the image content and the overlaid text, we suggest applying an off-the-shelf image captioning tool in order to capture the former. We demonstrate that the incorporation of such automatic captions during fine-tuning improves the results for various unimodal and multimodal models. Moreover, in the unimodal case, continuing the pre-training of language models on augmented and original caption pairs is highly beneficial to the classification accuracy.

[120]  arXiv:2109.10650 [pdf, other]
Title: MiRANews: Dataset and Benchmarks for Multi-Resource-Assisted News Summarization
Journal-ref: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing Findings (EMNLP2021 Findings)
Subjects: Computation and Language (cs.CL)

One of the most challenging aspects of current single-document news summarization is that the summary often contains 'extrinsic hallucinations', i.e., facts that are not present in the source document, which are often derived via world knowledge. This causes summarization systems to act more like open-ended language models tending to hallucinate facts that are erroneous. In this paper, we mitigate this problem with the help of multiple supplementary resource documents assisting the task. We present a new dataset MiRANews and benchmark existing summarization models. In contrast to multi-document summarization, which addresses multiple events from several source documents, we still aim at generating a summary for a single document. We show via data analysis that it's not only the models which are to blame: more than 27% of facts mentioned in the gold summaries of MiRANews are better grounded on assisting documents than in the main source articles. An error analysis of generated summaries from pretrained models fine-tuned on MiRANews reveals that this has an even bigger effect on models: assisted summarization reduces 55% of hallucinations when compared to single-document summarization models trained on the main article only. Our code and data are available at https://github.com/XinnuoXu/MiRANews.

[121]  arXiv:2109.10652 [pdf, other]
Title: Gotta catch 'em all: a Multistage Framework for honeypot fingerprinting
Subjects: Cryptography and Security (cs.CR)

Honeypots are decoy systems that lure attackers by presenting them with a seemingly vulnerable system. They provide an early detection mechanism as well as a method for learning how adversaries work and think. However, over recent years, a number of researchers have shown methods for fingerprinting honeypots. This significantly decreases the value of a honeypot; if an attacker is able to recognize the existence of such a system, they can evade it. In this article, we revisit the honeypot identification field by providing a holistic framework that includes state-of-the-art and novel fingerprinting components. We decrease the probability of false positives by proposing a rigid multi-step approach for labeling a system as a honeypot. We perform extensive scans covering 2.9 billion addresses of the IPv4 space and identify a total of 21,855 honeypot instances. Moreover, we present a number of interesting side-findings such as the identification of more than 354,431 non-honeypot systems that represent potentially vulnerable servers (e.g. SSH servers with default password configurations and vulnerable versions). Lastly, we discuss countermeasures against honeypot fingerprinting techniques.

[122]  arXiv:2109.10656 [pdf, other]
Title: Vehicle Behavior Prediction and Generalization Using Imbalanced Learning Techniques
Comments: Accepted for 2021 IEEE 24th International Conference on Intelligent Transportation Systems (ITSC)
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG)

The use of learning-based methods for vehicle behavior prediction is a promising research topic. However, many publicly available data sets suffer from class distribution skews, which limit learning performance if not addressed. This paper proposes an interaction-aware prediction model consisting of an LSTM autoencoder and an SVM classifier. Additionally, an imbalanced learning technique, the multiclass balancing ensemble, is proposed. Evaluations show that the method enhances model performance, resulting in improved classification accuracy. Good generalization properties of learned models are important, and therefore a generalization study is done where models are evaluated on unseen traffic data with dissimilar traffic behavior stemming from different road configurations. This is realized by using two distinct highway traffic recordings, the publicly available NGSIM US-101 and I80 data sets. Moreover, methods for encoding structural and static features into the learning process for improved generalization are evaluated. The resulting methods show substantial improvements in classification as well as generalization performance.

[123]  arXiv:2109.10657 [pdf, ps, other]
Title: Beamforming Design for IRS-aided Decode-and-Forward Relay Wireless Network
Subjects: Information Theory (cs.IT)

As a low-cost and low-power-consumption passive reflector, an intelligent reflecting surface (IRS) can make a significant rate improvement by building a programmable wireless environment. To improve the rate performance and coverage range of wireless networks, an IRS-aided decode-and-forward (DF) relay network is proposed with multiple antennas at the relay station (RS). To achieve a high rate, an alternately iterative structure (AIS) of maximizing receive power (Max-RP) at the RS is proposed to jointly optimize the beamforming vectors at the RS and the phase shifts at the IRS. Considering its high complexity, two low-complexity Max-RP schemes, null-space projection (NSP) plus maximum ratio combining (MRC) and IRS element selection (IRSES) plus MRC, are presented to reduce this complexity. For the former, NSP is used to separate the reflected signal from the IRS and the direct transmitted signal from the source, and MRC is adopted to combine the two signals at the RS. For the latter, the basic concept of IRSES is as follows: the IRS is partitioned into M subsets of elements, and adjusting the phases of all elements per subset makes all reflected signals and the direct signal from the source phase-aligned (PA) at the corresponding antenna of the relay. Simulation results show that the proposed three methods perform much better than the existing network with a single-antenna relay in terms of rate performance. In particular, an 85% rate gain over the existing scheme is achieved in the high signal-to-noise ratio region.

[124]  arXiv:2109.10658 [pdf, other]
Title: TACTIC: Joint Rate-Distortion-Accuracy Optimisation for Low Bitrate Compression
Subjects: Computer Vision and Pattern Recognition (cs.CV); Information Theory (cs.IT); Image and Video Processing (eess.IV)

We present TACTIC: Task-Aware Compression Through Intelligent Coding. Our lossy compression model learns based on the rate-distortion-accuracy trade-off for a specific task. By considering what information is important for the follow-on problem, the system trades off visual fidelity for good task performance at a low bitrate. When compared against JPEG at the same bitrate, our approach is able to improve the accuracy of ImageNet subset classification by 4.5%. We also demonstrate the applicability of our approach to other problems, providing improvements of 3.4% in accuracy and 4.9% in mean IoU over task-agnostic compression for semantic segmentation.

[125]  arXiv:2109.10659 [pdf, ps, other]
Title: Improved variants of the Hutch++ algorithm for trace estimation
Subjects: Numerical Analysis (math.NA)

This paper is concerned with two improved variants of the Hutch++ algorithm for estimating the trace of a square matrix, implicitly given through matrix-vector products. Hutch++ combines randomized low-rank approximation in a first phase with stochastic trace estimation in a second phase. In turn, Hutch++ only requires $O\left(\varepsilon^{-1}\right)$ matrix-vector products to approximate the trace within a relative error $\varepsilon$ with high probability. This compares favorably with the $O\left(\varepsilon^{-2}\right)$ matrix-vector products needed when using stochastic trace estimation alone. In Hutch++, the number of matrix-vector products is fixed a priori and distributed in a prescribed fashion among the two phases. In this work, we derive an adaptive variant of Hutch++, which outputs an estimate of the trace that is within some prescribed error tolerance with a controllable failure probability, while splitting the matrix-vector products in a near-optimal way among the two phases. For the special case of a symmetric positive semi-definite matrix, we present another variant of Hutch++, called Nyström++, which utilizes the so-called Nyström approximation and requires only one pass over the matrix, as compared to two passes with Hutch++. We extend the analysis of Hutch++ to Nyström++. Numerical experiments demonstrate the effectiveness of our two new algorithms.
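
The two-phase structure described above can be illustrated with a minimal NumPy sketch of the original (non-adaptive) Hutch++ estimator. It is not the adaptive variant or Nyström++ from the paper; the function name `hutchpp`, the `matvec` callback, the even three-way split of the budget `m`, and the use of random sign vectors are assumptions made for this illustration.

```python
import numpy as np


def hutchpp(matvec, n, m, rng):
    """Estimate tr(A) for an n x n matrix A available only through `matvec`.

    `matvec(X)` must return A @ X for an n x k block X. About m products
    with A are used, split evenly over three blocks of size m // 3
    (an assumed, fixed split; the adaptive variant chooses it differently).
    """
    k = m // 3
    S = rng.choice([-1.0, 1.0], size=(n, k))  # sketching vectors
    G = rng.choice([-1.0, 1.0], size=(n, k))  # probe vectors

    # Phase 1: randomized low-rank approximation of the range of A.
    Q, _ = np.linalg.qr(matvec(S))
    trace_lowrank = np.trace(Q.T @ matvec(Q))

    # Phase 2: stochastic trace estimation on the part of A not captured by Q.
    G_perp = G - Q @ (Q.T @ G)
    trace_residual = np.trace(G_perp.T @ matvec(G_perp)) / k

    return trace_lowrank + trace_residual
```

For example, `hutchpp(lambda X: A @ X, A.shape[0], 30, np.random.default_rng(0))` estimates the trace of a dense matrix `A`. When the rank of `A` does not exceed `m // 3`, the first phase already captures the whole range and the estimate is exact up to rounding.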

[126]  arXiv:2109.10660 [pdf, other]
Title: VIA: Analyzing Device Interfaces of Protected Virtual Machines
Subjects: Cryptography and Security (cs.CR)

Both AMD and Intel have presented technologies for confidential computing in cloud environments. The proposed solutions - AMD SEV (-ES, -SNP) and Intel TDX - protect Virtual Machines (VMs) against attacks from higher privileged layers through memory encryption and integrity protection. This model of computation draws a new trust boundary between virtual devices and the VM, which so far lacks thorough examination. In this paper, we therefore present an analysis of the virtual device interface and discuss several attack vectors against a protected VM. Further, we develop and evaluate VIA, an automated analysis tool to detect cases of improper sanitization of input received via the virtual device interface. VIA improves upon existing approaches for the automated analysis of device interfaces in the following aspects: (i) support for virtualization-relevant buses, (ii) efficient Direct Memory Access (DMA) support and (iii) performance. VIA builds upon the Linux Kernel Library and clang's libfuzzer to fuzz the communication between the driver and the device via MMIO, PIO, and DMA. An evaluation of VIA shows that it performs 570 executions per second on average and improves performance compared to existing approaches by an average factor of 2706. Using VIA, we analyzed 22 drivers in Linux 5.10.0-rc6, thereby uncovering 50 bugs and initiating multiple patches to the virtual device driver interface of Linux. To prove our findings' criticality under the threat model of AMD SEV and Intel TDX, we showcase three exemplary attacks based on the bugs found. The attacks enable a malicious hypervisor to corrupt the memory and gain code execution in protected VMs with SEV-ES and are theoretically applicable to SEV-SNP and TDX.

[127]  arXiv:2109.10661 [pdf, ps, other]
Title: Error bounds of fourth-order compact finite difference methods for the Dirac equation in the massless and nonrelativistic regime
Authors: Yue Feng, Ying Ma
Comments: 22 pages, 4 figures
Subjects: Numerical Analysis (math.NA)

We establish the error bounds of fourth-order compact finite difference (4cFD) methods for the Dirac equation in the massless and nonrelativistic regime, which involves a small dimensionless parameter $0 < \varepsilon \le 1$ inversely proportional to the speed of light. In this regime, the solution propagates waves with wavelength $O(\varepsilon)$ in time and $O(1)$ in space, as well as rapid outgoing waves with wave speed $O(1/\varepsilon)$. We adapt the conservative and semi-implicit 4cFD methods to discretize the Dirac equation and rigorously establish their error bounds depending explicitly on the mesh size $h$, time step $\tau$ and the small parameter $\varepsilon$. Based on the error bounds, the $\varepsilon$-scalability of the 4cFD methods is $h = O(\varepsilon^{1/4})$ and $\tau = O(\varepsilon^{3/2})$, which not only improves the spatial resolution capacity but also offers superior accuracy compared to classical second-order finite difference methods. Furthermore, the same conclusions hold for physical observables including the total density and current density. Numerical results are provided to validate the error bounds, and the dynamics of the Dirac equation with different potentials in 2D are presented.

[128]  arXiv:2109.10664 [pdf]
Title: A deep neural network for multi-species fish detection using multiple acoustic cameras
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Underwater acoustic cameras are high-potential devices for many applications in ecology, notably for fisheries management and monitoring. However, turning such data into high-value information without a time-consuming review of the entire dataset by an operator is still a challenge. Moreover, the analysis of acoustic imaging, due to its low signal-to-noise ratio, is a perfect training ground for experimenting with new approaches, especially concerning Deep Learning techniques. We present here a novel approach that takes advantage of both CNN (Convolutional Neural Network) and classical CV (Computer Vision) techniques and is able to detect a generic class "fish" in acoustic video streams. The pipeline pre-treats the acoustic images to extract 2 features, in order to localise the signals and improve the detection performance. To ensure the performance from an ecological point of view, we also propose a two-step validation, one to validate the results of the training runs and one to test the method on a real-world scenario. The YOLOv3-based model was trained with data of fish from multiple species recorded by the two common acoustic cameras, DIDSON and ARIS, including species of high ecological interest, such as Atlantic salmon or European eels. The model we developed provides satisfying results, detecting almost 80% of fish and minimizing the false positive rate; however, the model is much less efficient for eel detections on ARIS videos. The first CNN pipeline for fish monitoring exploiting video data from two models of acoustic cameras satisfies most of the required features. Many challenges are still present, such as the automation of fish species identification through a multiclass model. However, the results point to a new solution for dealing with complex data, such as sonar data, which can also be reapplied in other cases where the signal-to-noise ratio is a challenge.

[129]  arXiv:2109.10665 [pdf, other]
Title: A Survey on Reinforcement Learning for Recommender Systems
Comments: 25 pages, 4 figures
Subjects: Information Retrieval (cs.IR)

Recommender systems have been widely applied in different real-life scenarios to help us find useful information. Recently, Reinforcement Learning (RL) based recommender systems have become an emerging research topic. They often surpass traditional recommendation models and even most deep learning-based methods, owing to their interactive nature and autonomous learning ability. Nevertheless, there are various challenges when applying RL in recommender systems. Toward this end, we first provide a thorough overview, comparison, and summarization of RL approaches for five typical recommendation scenarios, following three main categories of RL: value-function, policy search, and Actor-Critic. Then, we systematically analyze the challenges and relevant solutions on the basis of existing literature. Finally, in discussing the open issues of RL and its limitations for recommendation, we highlight some potential research directions in this field.

[130]  arXiv:2109.10670 [pdf, other]
Title: Machines as Programs: P $\neq$ NP
Authors: Jonathan J. Mize
Comments: 23 pages, 2 figures
Subjects: Computational Complexity (cs.CC); Logic in Computer Science (cs.LO); Logic (math.LO)

The Curry-Howard correspondence is often called the proofs-as-programs result. I offer a generalization of this result, something which may be called machines as programs. Utilizing this insight, I introduce two new Turing Machines called "Ceiling Machines." The formal ingredients of these two machines are nearly identical. But there are crucial differences, splitting the two into a "Higher Ceiling Machine" and a "Lower Ceiling Machine." A potential graph of state transitions of the Higher Ceiling Machine is then offered. This graph is termed the "canonically nondeterministic solution" or CNDS, whose accompanying problem is its own replication, i.e., the problem, "Replicate CNDS" (whose accompanying algorithm is cast in Martin-Löf type theory). I then show that while this graph can be replicated (solved) in polynomial time by a nondeterministic machine -- of which the Higher Ceiling Machine is a canonical example -- it cannot be solved in polynomial time by a deterministic machine, of which the Lower Ceiling Machine is also canonical. It is consequently proven that P $\neq$ NP.

[131]  arXiv:2109.10678 [pdf, other]
Title: Natural Language Video Localization with Learnable Moment Proposals
Comments: emnlp21
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Given an untrimmed video and a natural language query, Natural Language Video Localization (NLVL) aims to identify the video moment described by the query. To address this task, existing methods can be roughly divided into two groups: 1) propose-and-rank models first define a set of hand-designed moment candidates and then find the best-matching one; 2) proposal-free models directly predict two temporal boundaries of the referential moment from frames. Currently, almost all the propose-and-rank methods perform worse than their proposal-free counterparts. In this paper, we argue that the propose-and-rank approach is underestimated due to its predefined proposals: 1) Hand-designed rules can hardly guarantee complete coverage of the targeted segments. 2) Densely sampled candidate moments cause redundant computation and degrade the performance of the ranking process. To this end, we propose a novel model termed LPNet (Learnable Proposal Network for NLVL) with a fixed set of learnable moment proposals. The position and length of these proposals are dynamically adjusted during the training process. Moreover, a boundary-aware loss is proposed to leverage frame-level information and further improve the performance. Extensive ablations on two challenging NLVL benchmarks demonstrate the effectiveness of LPNet over existing state-of-the-art methods.

[132]  arXiv:2109.10683 [pdf, other]
Title: Adaptive Neural Message Passing for Inductive Learning on Hypergraphs
Subjects: Machine Learning (cs.LG); Multimedia (cs.MM)

Graphs are the most ubiquitous data structures for representing relational datasets and performing inferences in them. They model, however, only pairwise relations between nodes and are not designed for encoding the higher-order relations. This drawback is mitigated by hypergraphs, in which an edge can connect an arbitrary number of nodes. Most hypergraph learning approaches convert the hypergraph structure to that of a graph and then deploy existing geometric deep learning methods. This transformation leads to information loss, and sub-optimal exploitation of the hypergraph's expressive power. We present HyperMSG, a novel hypergraph learning framework that uses a modular two-level neural message passing strategy to accurately and efficiently propagate information within each hyperedge and across the hyperedges. HyperMSG adapts to the data and task by learning an attention weight associated with each node's degree centrality. Such a mechanism quantifies both local and global importance of a node, capturing the structural properties of a hypergraph. HyperMSG is inductive, allowing inference on previously unseen nodes. Further, it is robust and outperforms state-of-the-art hypergraph learning methods on a wide range of tasks and datasets. Finally, we demonstrate the effectiveness of HyperMSG in learning multimodal relations through detailed experimentation on a challenging multimedia dataset.

[133]  arXiv:2109.10686 [pdf, other]
Title: Scale Efficiently: Insights from Pre-training and Fine-tuning Transformers
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

There remain many open questions pertaining to the scaling behaviour of Transformer architectures. These scaling decisions and findings can be critical, as training runs often come with an associated computational cost which has both financial and/or environmental impact. The goal of this paper is to present scaling insights from pretraining and finetuning Transformers. While Kaplan et al. present a comprehensive study of the scaling behaviour of Transformer language models, the scope is only on the upstream (pretraining) loss. Therefore, it is still unclear if this set of findings transfers to downstream tasks within the context of the pretrain-finetune paradigm. The key findings of this paper are as follows: (1) we show that aside from only the model size, model shape matters for downstream fine-tuning, (2) scaling protocols operate differently at different compute regions, (3) widely adopted T5-base and T5-large sizes are Pareto-inefficient. To this end, we present improved scaling protocols whereby our redesigned models achieve similar downstream fine-tuning quality while having 50% fewer parameters and training 40% faster compared to the widely adopted T5-base model. We publicly release over 100 pretrained checkpoints of different T5 configurations to facilitate future research and analysis.

[134]  arXiv:2109.10688 [pdf, other]
Title: Finding Facial Forgery Artifacts with Parts-Based Detectors
Comments: Accepted into the CVPR Workshop on Media Forensics 2021
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Manipulated videos, especially those where the identity of an individual has been modified using deep neural networks, are becoming an increasingly relevant threat in the modern day. In this paper, we seek to develop a generalizable, explainable solution to detecting these manipulated videos. To achieve this, we design a series of forgery detection systems that each focus on one individual part of the face. These parts-based detection systems, which can be combined and used together in a single architecture, meet all of our desired criteria - they generalize effectively between datasets and give us valuable insights into what the network is looking at when making its decision. We thus use these detectors to perform detailed empirical analysis on the FaceForensics++, Celeb-DF, and Facebook Deepfake Detection Challenge datasets, examining not just what the detectors find but also collecting and analyzing useful related statistics on the datasets themselves.

[135]  arXiv:2109.10689 [pdf, other]
Title: Towards Cognitive Navigation: Design and Implementation of a Biologically Inspired Head Direction Cell Network
Subjects: Neural and Evolutionary Computing (cs.NE)

As a vital cognitive function of animals, the navigation skill is first built on the accurate perception of the directional heading in the environment. Head direction cells (HDCs), found in the limbic system of animals, are proven to play an important role in identifying the directional heading allocentrically in the horizontal plane, independent of the animal's location and the ambient conditions of the environment. However, practical HDC models that can be implemented in robotic applications are rarely investigated, especially those that are biologically plausible and yet applicable to the real world. In this paper, we propose a computational HDC network which is consistent with several neurophysiological findings concerning biological HDCs, and then implement it in robotic navigation tasks. The HDC network keeps a representation of the directional heading only relying on the angular velocity as an input. We examine the proposed HDC model in extensive simulations and real-world experiments and demonstrate its excellent performance in terms of accuracy and real-time capability.

[136]  arXiv:2109.10691 [pdf, other]
Title: Query Evaluation in DatalogMTL -- Taming Infinite Query Results
Subjects: Databases (cs.DB); Artificial Intelligence (cs.AI)

In this paper, we investigate finite representations of DatalogMTL. First, we introduce programs that have finite models and propose a toolkit for structuring the execution of DatalogMTL rules into sequential phases. Then, we study infinite models that eventually become constant and introduce sufficient criteria for programs that allow for such a representation. We proceed by considering infinite models that are eventually periodic and show that such a representation encompasses all DatalogMTLFP programs, a widely discussed fragment. Finally, we provide a novel algorithm for reasoning over finitely representable DatalogMTL programs that incorporates all of the previously discussed representations.

[137]  arXiv:2109.10695 [pdf, other]
Title: Differentiable Surface Triangulation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)

Triangle meshes remain the most popular data representation for surface geometry. This ubiquitous representation is essentially a hybrid one that decouples continuous vertex locations from the discrete topological triangulation. Unfortunately, the combinatorial nature of the triangulation prevents taking derivatives over the space of possible meshings of any given surface. As a result, to date, mesh processing and optimization techniques have been unable to truly take advantage of modular gradient descent components of modern optimization frameworks. In this work, we present a differentiable surface triangulation that enables optimization for any per-vertex or per-face differentiable objective function over the space of underlying surface triangulations. Our method builds on the result that any 2D triangulation can be achieved by a suitably perturbed weighted Delaunay triangulation. We translate this result into a computational algorithm by proposing a soft relaxation of the classical weighted Delaunay triangulation and optimizing over vertex weights and vertex locations. We extend the algorithm to 3D by decomposing shapes into developable sets and differentiably meshing each set with suitable boundary constraints. We demonstrate the efficacy of our method on various planar and surface meshes on a range of difficult-to-optimize objective functions. Our code can be found online: https://github.com/mrakotosaon/diff-surface-triangulation.

[138]  arXiv:2109.10696 [pdf, other]
Title: CC-Cert: A Probabilistic Approach to Certify General Robustness of Neural Networks
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)

In safety-critical machine learning applications, it is crucial to defend models against adversarial attacks -- small modifications of the input that change the predictions. Besides rigorously studied $\ell_p$-bounded additive perturbations, recently proposed semantic perturbations (e.g. rotation, translation) raise serious concerns about deploying ML systems in the real world. Therefore, it is important to provide provable guarantees for deep learning models against semantically meaningful input transformations. In this paper, we propose a new universal probabilistic certification approach based on Chernoff-Cramer bounds that can be used in general attack settings. We estimate the probability that a model fails if the attack is sampled from a certain distribution. Our theoretical findings are supported by experimental results on different datasets.
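
As a rough illustration of the kind of bound mentioned above, the sketch below evaluates the generic Chernoff-Cramer inequality $P(X \ge t) \le \min_{\lambda \ge 0} E[e^{\lambda (X - t)}]$ on samples. It is not the certification procedure of the paper: the function name, the search grid over $\lambda$, and the interpretation of `samples` as some scalar discrepancy of the model under randomly drawn transformations are all assumptions, and replacing the expectation by a sample mean yields an estimate rather than a guarantee.

```python
import numpy as np


def empirical_chernoff_bound(samples, threshold, lambdas=None):
    """Sample-based Chernoff-Cramer estimate of an upper bound on P(X >= threshold).

    The expectation E[exp(lam * (X - threshold))] is replaced by a sample
    mean and minimized over a grid of lam >= 0 (hypothetical grid choice).
    """
    x = np.asarray(samples, dtype=float)
    if lambdas is None:
        lambdas = np.linspace(0.0, 10.0, 201)
    values = [np.mean(np.exp(lam * (x - threshold))) for lam in lambdas]
    return float(min(values))
```

Because the grid contains $\lambda = 0$, the returned value never exceeds 1, and it is never smaller than the fraction of samples that reach the threshold.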

[139]  arXiv:2109.10697 [pdf, other]
Title: Towards Automatic Bias Detection in Knowledge Graphs
Comments: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing: Findings (EMNLP 2021). Nov 7--11, 2021
Subjects: Artificial Intelligence (cs.AI); Machine Learning (cs.LG)

With the recent surge in social applications relying on knowledge graphs, the need for techniques to ensure fairness in KG-based methods is becoming increasingly evident. Previous works have demonstrated that KGs are prone to various social biases, and have proposed multiple methods for debiasing them. However, in such studies, the focus has been on debiasing techniques, while the relations to be debiased are specified manually by the user. As manual specification is itself susceptible to human cognitive bias, there is a need for a system capable of quantifying and exposing biases that can support more informed decisions on what to debias. To address this gap in the literature, we describe a framework for identifying biases present in knowledge graph embeddings, based on numerical bias metrics. We illustrate the framework with three different bias measures on the task of profession prediction, and it can be flexibly extended to further bias definitions and applications. The relations flagged as biased can then be handed to decision makers for judgement upon subsequent debiasing.

[140]  arXiv:2109.10698 [pdf]
Title: Complementing the Linear-Programming Learning Experience with the Design and Use of Computerized Games: The Formula 1 Championship Game
Comments: 21 pages, 12 Figures, 4 Tables
Subjects: Artificial Intelligence (cs.AI); Computer Science and Game Theory (cs.GT)

This document focuses on modeling complex situations to achieve an advantage within a competitive context. Our goal is to devise the characteristics of games to teach and exercise non-easily quantifiable tasks crucial to the math-modeling process. A computerized game to exercise the math-modeling process and optimization problem formulation is introduced. The game is named The Formula 1 Championship, and models of the game were developed in the computerized simulation platform MoNet. It resembles some situations in which team managers must make crucial decisions to enhance their racing cars up to the feasible, most advantageous conditions. This paper describes the game's rules, limitations, and five Formula 1 circuit simulators used for the championship development. We present several formulations of this situation in the form of optimization problems. Administering the budget to reach the best car adjustment to a set of circuits to win the respective races can be an approach. Focusing on the best distribution of each Grand Prix's budget and then deciding how to use the assigned money to improve the car is another valid approach. In general, there may be a degree of conflict among these approaches because they are different aspects of the same multi-scale optimization problem. Therefore, we evaluate the impact of assigning the highest priority to one element or another when formulating the optimization problem. Studying the effectiveness of solving such optimization problems turns out to be an exciting way of evaluating the advantages of focusing on one scale or another. Another thread of this research addresses the role of the game in the teaching-learning process. We believe applying the Formula 1 Game is an effective way to discover opportunities in a complex-system situation and formulate them to ultimately realize the related benefit in the context described.

[141]  arXiv:2109.10702 [pdf, other]
Title: A Quantitative Comparison of Epistemic Uncertainty Maps Applied to Multi-Class Segmentation
Authors: Robin Camarasa (1 and 2), Daniel Bos (2 and 3), Jeroen Hendrikse (4), Paul Nederkoorn (5), M. Eline Kooi (6), Aad van der Lugt (2), Marleen de Bruijne (1, 2 and 7), ((1) Biomedical Imaging Group Rotterdam, Department of Radiology and Nuclear Medicine, Erasmus MC, Rotterdam, The Netherlands, (2) Department of Radiology and Nuclear Medicine, Erasmus MC, Rotterdam, The Netherlands, (3) Department of Epidemiology, Erasmus MC, Rotterdam, The Netherlands, (4) Department of Radiology, University Medical Center Utrecht, Utrecht, The Netherlands, (5) Department of Neurology, Academic Medical Center University of Amsterdam, Amsterdam, The Netherlands, (6) Department of Radiology and Nuclear Medicine, CARIM School for Cardiovascular Diseases, Maastricht University Medical Center, Maastricht, The Netherlands, (7) Department of Computer Science, University of Copenhagen, Denmark)
Comments: 39 pages, 22 figures, to be published in Journal of Machine Learning for Biomedical Imaging for the Special Issue: Uncertainty for Safe Utilization of Machine Learning in Medical Imaging (UNSURE) 2020
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)

Uncertainty assessment has gained rapid interest in medical image analysis. A popular technique to compute epistemic uncertainty is the Monte-Carlo (MC) dropout technique. From a network with MC dropout and a single input, multiple outputs can be sampled. Various methods can be used to obtain epistemic uncertainty maps from those multiple outputs. In the case of multi-class segmentation, the number of methods is even larger as epistemic uncertainty can be computed voxelwise per class or voxelwise per image. This paper highlights a systematic approach to define and quantitatively compare those methods in two different contexts: class-specific epistemic uncertainty maps (one value per image, voxel and class) and combined epistemic uncertainty maps (one value per image and voxel). We applied this quantitative analysis to a multi-class segmentation of the carotid artery lumen and vessel wall, on a multi-center, multi-scanner, multi-sequence dataset of MR images. We validated our analysis over 144 sets of hyperparameters of a model. Our main analysis considers the relationship between the order of the voxels sorted according to their epistemic uncertainty values and the misclassification of the prediction. Under this consideration, the comparison of combined uncertainty maps reveals that the multi-class entropy and the multi-class mutual information statistically out-perform the other combined uncertainty maps under study. In a class-specific scenario, the one-versus-all entropy statistically out-performs the class-wise entropy, the class-wise variance and the one-versus-all mutual information. The class-wise entropy statistically out-performs the other class-specific uncertainty maps in terms of calibration. We made a Python package available to reproduce our analysis on different data and tasks.
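
To make the idea of a combined uncertainty map concrete, the following NumPy sketch computes two commonly used quantities from MC dropout samples: the entropy of the mean prediction and the mutual information between prediction and model parameters. The formulas are the standard textbook ones and are assumed here; the paper's exact definitions of "multi-class entropy" and "multi-class mutual information" may differ, and the function names and array layout are hypothetical. This is not the authors' package.

```python
import numpy as np


def entropy(p, axis=-1, eps=1e-12):
    """Shannon entropy along the class axis (eps avoids log(0))."""
    return -np.sum(p * np.log(p + eps), axis=axis)


def combined_uncertainty_maps(samples):
    """Combined (one value per voxel) uncertainty maps from MC dropout.

    samples: array of shape (T, ..., C) holding softmax outputs of T
    stochastic forward passes over C classes (assumed layout).
    Returns (predictive_entropy, mutual_information), each of shape (...).
    """
    mean_p = samples.mean(axis=0)
    predictive_entropy = entropy(mean_p)
    expected_entropy = entropy(samples).mean(axis=0)
    mutual_information = predictive_entropy - expected_entropy
    return predictive_entropy, mutual_information
```

When all passes agree, the mutual information is close to zero even if the shared prediction is itself uncertain; when confident passes disagree, both maps are large.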

[142]  arXiv:2109.10703 [pdf, other]
Title: The Banking Transactions Dataset and its Comparative Analysis with Scale-free Networks
Subjects: Social and Information Networks (cs.SI); Physics and Society (physics.soc-ph)

We construct a network of 1.6 million nodes from banking transactions of users of Rabobank. We assign two weights on each edge, which are the aggregate transferred amount and the total number of transactions between the users from the year 2010 to 2020. We present a detailed analysis of the unweighted and both weighted networks by examining their degree, strength, and weight distributions, as well as the topological assortativity and weighted assortativity, clustering, and weighted clustering, together with correlations between these quantities. We further study the meso-scale properties of the networks and compare them to a randomized reference system. We also analyze the characteristics of nodes and edges using centrality measures to understand their roles in the money transaction system. This will be the first publicly shared dataset of intra-bank transactions, and this work highlights the unique characteristics of banking transaction networks compared with other scale-free networks.
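
The two edge weights and the node-level degree and strength described above can be sketched on toy data as follows. The records are invented, and treating edges as undirected is an assumption; the abstract does not say whether the network is directed.

```python
from collections import defaultdict

def build_edges(transactions):
    """Aggregate (sender, receiver, amount) records into two edge weights."""
    edges = defaultdict(lambda: [0.0, 0])  # edge -> [total amount, number of transactions]
    for sender, receiver, amount in transactions:
        key = frozenset((sender, receiver))  # assumption: undirected edges
        edges[key][0] += amount
        edges[key][1] += 1
    return edges

def degree_and_strength(edges):
    """Return per-node degree and the two strengths (one per edge weight)."""
    degree = defaultdict(int)
    strength_amount = defaultdict(float)
    strength_count = defaultdict(int)
    for key, (amount, count) in edges.items():
        for node in key:
            degree[node] += 1
            strength_amount[node] += amount
            strength_count[node] += count
    return degree, strength_amount, strength_count

# Hypothetical transaction records, not taken from the dataset.
toy = [("a", "b", 10.0), ("b", "a", 5.0), ("a", "c", 2.5)]
degree, s_amount, s_count = degree_and_strength(build_edges(toy))
print(degree["a"], s_amount["a"], s_count["a"])
```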

[143]  arXiv:2109.10705 [pdf, other]
Title: On Crossing-Families in Planar Point Sets
Subjects: Computational Geometry (cs.CG); Combinatorics (math.CO)

A $k$-crossing family in a point set $S$ in general position is a set of $k$ segments spanned by points of $S$ such that all $k$ segments mutually cross. In this short note we present two statements on crossing families which are based on sets of small cardinality: (1)~Any set of at least 15 points contains a crossing family of size~4. (2)~There are sets of $n$ points which do not contain a crossing family of size larger than~$8\lceil \frac{n}{41} \rceil$. Both results improve the previously best known bounds.
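
To make the definition concrete, the sketch below checks whether a given set of segments forms a crossing family and evaluates the stated upper bound $8\lceil \frac{n}{41} \rceil$. It assumes points in general position (no three collinear); the helper names and the example hexagon are illustrative and not taken from the note.

```python
from itertools import combinations

def orient(p, q, r):
    """Sign of the cross product (q - p) x (r - p): +1 left turn, -1 right turn."""
    v = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (v > 0) - (v < 0)

def segments_cross(s, t):
    """True if the two segments intersect in their interiors."""
    (a, b), (c, d) = s, t
    return (orient(a, b, c) * orient(a, b, d) < 0
            and orient(c, d, a) * orient(c, d, b) < 0)

def is_crossing_family(segments):
    """True if all segments mutually cross."""
    return all(segments_cross(s, t) for s, t in combinations(segments, 2))

def crossing_family_upper_bound(n):
    """The bound 8 * ceil(n / 41), computed in integer arithmetic."""
    return 8 * -(-n // 41)

# The three long diagonals of a convex hexagon form a crossing family of size 3.
diagonals = [((0, 0), (4, 2)), ((2, -1), (2, 3)), ((4, 0), (0, 2))]
print(is_crossing_family(diagonals))
print(crossing_family_upper_bound(41), crossing_family_upper_bound(42))
```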

[144]  arXiv:2109.10708 [pdf, other]
Title: Graph type expressivity and transformations
Comments: 30 pages (including references and appendix), 12 figures
Subjects: Discrete Mathematics (cs.DM); Computational Complexity (cs.CC); Data Structures and Algorithms (cs.DS); Information Theory (cs.IT)

Graph representations have gained importance in almost every scientific field, ranging from mathematics, biology, social sciences and physics to computer science. In contrast to other data formats, graphs offer the possibility to model relations between entities. Together with the continuously rising amount of available data, graphs therefore open up a wide range of modeling capabilities for theoretical and real-world problems. However, the modeling possibilities of graphs have not been fully exploited. One reason for this is that there is neither an easily comprehensible overview of graph types nor an analysis of their modeling capacities available. As a result, the potential of modeling with certain graph types is not exhausted, and many users do not consider the higher modeling freedom and more efficient computation that a transformation to another graph type can provide. In order to clarify the modeling possibilities of graphs, we order the different graph types, collate their memory complexity and provide an expressivity measure on them. Furthermore, we introduce transformation algorithms between the graph types from which equal expressivity of all graph types can be inferred, i.e., they are able to represent the same information or properties respectively. Finally, we provide a guideline for the question of when a graph type transformation is efficient by defining a cost function dependent on the memory complexity and the transformation runtime as a decision-making tool.

[145]  arXiv:2109.10715 [pdf]
Title: Simulated Annealing for Emotional Dialogue Systems
Subjects: Computation and Language (cs.CL)

Explicitly modeling emotions in dialogue generation has important applications, such as building empathetic personal companions. In this study, we consider the task of expressing a specific emotion for dialogue generation. Previous approaches take the emotion as an input signal, which may be ignored during inference. We instead propose a search-based emotional dialogue system by simulated annealing (SA). Specifically, we first define a scoring function that combines contextual coherence and emotional correctness. Then, SA iteratively edits a general response and searches for a sentence with a higher score, enforcing the presence of the desired emotion. We evaluate our system on the NLPCC2017 dataset. Our proposed method shows a 12% improvement in emotion accuracy compared with the previous state-of-the-art method, without hurting the generation quality (measured by BLEU).
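
The search procedure can be illustrated with a toy simulated-annealing loop over word-level edits. Everything specific here is an assumption made for illustration: the lexicon, the vocabulary, the multiplicative scoring function, the replace/insert/delete edit operations, and the linear cooling schedule are hypothetical stand-ins, not the paper's models.

```python
import math
import random

EMOTION_WORDS = {"happy": {"glad", "great", "wonderful"}}  # hypothetical lexicon
VOCAB = ["glad", "great", "wonderful", "fine", "okay", "that", "is", "exam", "passed"]

def score(words, context, emotion):
    """Toy score: (1 + context overlap) * (1 + distinct emotion words)."""
    coherence = 1 + len(set(words) & set(context))
    emotional = 1 + len(set(words) & EMOTION_WORDS[emotion])
    return coherence * emotional

def propose(words, rng):
    """Apply one random word-level edit: replace, insert, or delete."""
    words = list(words)
    op = rng.choice(["replace", "insert", "delete"])
    pos = rng.randrange(len(words))
    if op == "replace":
        words[pos] = rng.choice(VOCAB)
    elif op == "insert":
        words.insert(pos, rng.choice(VOCAB))
    elif len(words) > 1:
        del words[pos]
    return words

def anneal(initial, context, emotion, steps=200, t0=1.0, seed=0):
    """Return the best-scoring sentence visited during the search."""
    rng = random.Random(seed)
    current = best = list(initial)
    for step in range(steps):
        temperature = t0 * (1 - step / steps) + 1e-6  # linear cooling
        candidate = propose(current, rng)
        delta = score(candidate, context, emotion) - score(current, context, emotion)
        # Always accept improvements; accept worse candidates with a
        # probability that shrinks as the temperature drops.
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current = candidate
        if score(current, context, emotion) > score(best, context, emotion):
            best = current
    return best

context = ["i", "passed", "my", "exam"]
general = ["that", "is", "okay"]
best = anneal(general, context, "happy")
print(" ".join(best), score(best, context, "happy"))
```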

[146]  arXiv:2109.10716 [pdf, ps, other]
Title: A formalisation of BPMN in Description Logics
Subjects: Artificial Intelligence (cs.AI)

In this paper we present a textual description, in terms of Description Logics, of the BPMN Ontology, which provides a clear semantic formalisation of the structural components of the Business Process Modelling Notation (BPMN), based on the latest stable BPMN specifications from OMG [BPMN Version 1.1 -- January 2008]. The development of the ontology was guided by the description of the complete set of BPMN Element Attributes and Types contained in Annex B of the BPMN specifications.

[147]  arXiv:2109.10717 [pdf, other]
Title: A generic fixed-point iteration-based hierarchical control design: Application to a cryogenic process
Subjects: Systems and Control (eess.SY)

This paper presents an extension of a recently proposed hierarchical control framework applied to a cryogenic system. While in the previous work, each sub-system in the decomposition needed to show at least one component of the control input, in the present contribution, this condition is removed, enabling a higher flexibility in the definition of the decomposition graph. The impact of this extended flexibility on the computation time is shown using the same cryogenic station where a decomposition in four sub-systems is made possible (instead of two in the previous setting).

[148]  arXiv:2109.10718 [pdf, other]
Title: Input-Output History Feedback Controller for Encrypted Control with Leveled Fully Homomorphic Encryption
Comments: 13 pages, 6 figures
Subjects: Systems and Control (eess.SY); Cryptography and Security (cs.CR)

Protecting the parameters, states, and input/output signals of a dynamic controller is essential for securely outsourcing its computation to an untrusted third party. Although a fully homomorphic encryption scheme allows the evaluation of controller operations with encrypted data, an encrypted dynamic controller with the encryption scheme destabilizes a closed-loop system or degrades the control performance due to overflow. This paper presents a novel controller representation based on input-output history data to implement an encrypted dynamic controller that operates without destabilization and performance degradation. An algorithm for efficient encrypted control computation is also proposed using single instruction/multiple data operations based on a batching technique. Furthermore, this study analyzes the stability and performance degradation of a closed-loop system caused by the effects of controller encryption. A numerical simulation demonstrates the feasibility of the proposed encrypted control scheme, which inherits the control performance of the original controller at a sufficient level.

[149]  arXiv:2109.10719 [pdf, other]
Title: Autonomous Blimp Control using Deep Reinforcement Learning
Subjects: Robotics (cs.RO); Systems and Control (eess.SY)

Aerial robot solutions are becoming ubiquitous for an increasing number of tasks. Among the various types of aerial robots, blimps are very well suited to perform long-duration tasks while being energy efficient, relatively silent and safe. To address the blimp navigation and control task, in our recent work, we have developed a software-in-the-loop simulation and a PID-based controller for large blimps in the presence of wind disturbance. However, blimps have a deformable structure and their dynamics are inherently non-linear and time-delayed, often resulting in large trajectory tracking errors. Moreover, the buoyancy of a blimp is constantly changing due to changes in the ambient temperature and pressure. In the present paper, we explore a deep reinforcement learning (DRL) approach to address these issues. We train only in simulation, while keeping conditions as close as possible to the real-world scenario. We derive a compact state representation to reduce the training time and a discrete action space to enforce control smoothness. Our initial results in simulation show a significant potential of DRL in solving the blimp control task and robustness against moderate wind and parameter uncertainty. Extensive experiments are presented to study the robustness of our approach. We also openly provide the source code of our approach.

[150]  arXiv:2109.10724 [pdf, other]
Title: Low-Latency Incremental Text-to-Speech Synthesis with Distilled Context Prediction Network
Comments: Accepted for ASRU2021
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)

Incremental text-to-speech (TTS) synthesis generates utterances in small linguistic units for the sake of real-time and low-latency applications. We previously proposed an incremental TTS method that leverages a large pre-trained language model to take unobserved future context into account without waiting for the subsequent segment. Although this method achieves comparable speech quality to that of a method that waits for the future context, it entails a huge amount of processing for sampling from the language model at each time step. In this paper, we propose an incremental TTS method that directly predicts the unobserved future context with a lightweight model, instead of sampling words from the large-scale language model. We perform knowledge distillation from a GPT2-based context prediction network into a simple recurrent model by minimizing a teacher-student loss defined between the context embedding vectors of those models. Experimental results show that the proposed method requires about ten times less inference time to achieve comparable synthetic speech quality to that of our previous method, and it can perform incremental synthesis much faster than the average speaking speed of human English speakers, demonstrating the applicability of our method to real-time applications.

[151]  arXiv:2109.10727 [pdf, other]
Title: Frisbee: automated testing of Cloud-native applications in Kubernetes
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

As more and more companies are migrating (or planning to migrate) from on-premise to Cloud, their focus is to find anomalies and deficits as early as possible in the development life cycle. We propose Frisbee, a declarative language and associated runtime components for testing cloud-native applications on top of Kubernetes. Given a template describing the system under test and a workflow describing the experiment, Frisbee automatically interfaces with Kubernetes to deploy the necessary software in containers, launch needed sidecars, execute the workflow steps, and perform automated checks for deviation from expected behavior. We evaluate Frisbee through a series of tests to demonstrate its role in designing and evaluating cloud-native applications; Frisbee helps in testing uncertainties at the level of application (e.g., dynamically changing request patterns), infrastructure (e.g., crashes, network partitions), and deployment (e.g., saturation points). Our findings have strong implications for the design, deployment, and evaluation of cloud applications. The most prominent are that erroneous benchmark outputs can cause an apparent performance improvement, that automated failover mechanisms may require interoperability with clients, and that a proper placement policy should also account for the clock frequency, not only the number of cores.

[152]  arXiv:2109.10733 [pdf, other]
Title: Animal inspired Application of a Variant of Mel Spectrogram for Seismic Data Processing
Comments: 6 pages, 5 figures, 1 table
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Predicting disaster events from seismic data is of paramount importance and can save thousands of lives, especially in earthquake-prone areas and habitations around volcanic craters. The drastic rise in the number of seismic monitoring stations in recent years has allowed the collection of a huge quantity of data, outpacing the capacity of seismologists. Due to the complex nature of the seismological data, it is often difficult for seismologists to detect subtle patterns with major implications. Machine learning algorithms have been demonstrated to be effective in classification and prediction tasks for seismic data. It is widely known that some animals can sense disasters like earthquakes from seismic signals well before the disaster strikes. The Mel spectrogram has been widely used for speech recognition as it scales the actual frequencies according to human hearing. In this paper, we propose a variant of the Mel spectrogram to scale the raw frequencies of seismic data to the hearing of such animals that can sense disasters from seismic signals. We use a computer vision algorithm along with clustering that allows for the classification of unlabelled seismic data.

[153]  arXiv:2109.10736 [pdf, other]
Title: Estimation Error Correction in Deep Reinforcement Learning for Deterministic Actor-Critic Methods
Comments: Accepted at ICTAI 2021
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)

In value-based deep reinforcement learning methods, approximation of value functions induces overestimation bias and leads to suboptimal policies. We show that in deep actor-critic methods that aim to overcome the overestimation bias, if the reinforcement signals received by the agent have a high variance, a significant underestimation bias arises. To minimize the underestimation, we introduce a parameter-free, novel deep Q-learning variant. Our Q-value update rule combines the notions behind Clipped Double Q-learning and Maxmin Q-learning by computing the critic objective through the nested combination of maximum and minimum operators to bound the approximate value estimates. We evaluate our modification on the suite of several OpenAI Gym continuous control tasks, improving the state-of-the-art in every environment tested.
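
For orientation, the sketch below shows the well-known Clipped Double Q-learning target next to one possible nested max/min bootstrap. The second function is purely illustrative: the abstract does not give the actual update rule, so the grouping of critics and the order of the operators are assumptions, as are all function and parameter names.

```python
def clipped_double_q_target(reward, gamma, q1_next, q2_next, done):
    """Clipped Double Q-learning: bootstrap from the smaller of two critic estimates."""
    bootstrap = min(q1_next, q2_next)
    return reward + gamma * (0.0 if done else bootstrap)

def nested_max_min_target(reward, gamma, critic_groups, done):
    """Hypothetical nesting: minimum within each critic group, then maximum across groups.

    This is NOT the paper's rule, only one way to nest the two operators.
    """
    bootstrap = max(min(group) for group in critic_groups)
    return reward + gamma * (0.0 if done else bootstrap)

# Toy next-state value estimates (hypothetical numbers).
print(clipped_double_q_target(1.0, 0.5, 4.0, 2.0, done=False))
print(nested_max_min_target(1.0, 0.5, [[4.0, 2.0], [3.0, 6.0]], done=False))
```

Taking the minimum counteracts overestimation, while an outer maximum raises the estimate again, which is one way such a nesting could bound the value estimate from both sides.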

[154]  arXiv:2109.10737 [pdf, other]
Title: DyStyle: Dynamic Neural Network for Multi-Attribute-Conditioned Style Editing
Comments: 23 pages, 21 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Great diversity and photorealism have been achieved by unconditional GAN frameworks such as StyleGAN and its variations. In the meantime, persistent efforts have been made to enhance the semantic controllability of StyleGANs. For example, a dozen style manipulation methods have been recently proposed to perform attribute-conditioned style editing. Although some of these methods work well in manipulating the style codes along one attribute, the control accuracy when jointly manipulating multiple attributes tends to be problematic. To address these limitations, we propose a Dynamic Style Manipulation Network (DyStyle) whose structure and parameters vary by input samples, to perform nonlinear and adaptive manipulation of latent codes for flexible and precise attribute control. Additionally, a novel easy-to-hard training procedure is introduced for efficient and stable training of the DyStyle network. Extensive experiments have been conducted on faces and other objects. As a result, our approach demonstrates fine-grained disentangled edits along multiple numeric and binary attributes. Qualitative and quantitative comparisons with existing style manipulation methods verify the superiority of our method in terms of the attribute control accuracy and identity preservation without compromising the photorealism. The advantage of our method is even more significant for joint multi-attribute control. The source code is made publicly available at https://github.com/phycvgan/DyStyle.

[155]  arXiv:2109.10739 [pdf, other]
Title: Predicting Efficiency/Effectiveness Trade-offs for Dense vs. Sparse Retrieval Strategy Selection
Subjects: Information Retrieval (cs.IR)

Over the last few years, contextualized pre-trained transformer models such as BERT have provided substantial improvements on information retrieval tasks. Recent approaches based on pre-trained transformer models such as BERT fine-tune dense low-dimensional contextualized representations of queries and documents in embedding space. While these dense retrievers enjoy substantial retrieval effectiveness improvements compared to sparse retrievers, they are computationally intensive, requiring substantial GPU resources, and dense retrievers are known to be more expensive from both time and resource perspectives. In addition, sparse retrievers have been shown to retrieve complementary information with respect to dense retrievers, leading to proposals for hybrid retrievers. These hybrid retrievers leverage low-cost, exact-matching based sparse retrievers along with dense retrievers to bridge the semantic gaps between query and documents. In this work, we address this trade-off between the cost and utility of sparse vs. dense retrievers by proposing a classifier to select a suitable retrieval strategy (i.e., sparse vs. dense vs. hybrid) for individual queries. Leveraging sparse retrievers for queries which can be answered with sparse retrievers decreases the number of calls to GPUs. Consequently, while utility is maintained, query latency decreases. Although we use fewer computational resources and spend less time, we still achieve improved performance. Our classifier can select between sparse and dense retrieval strategies based on the query alone. We conduct experiments on the MS MARCO passage dataset demonstrating an improved range of efficiency/effectiveness trade-offs between purely sparse, purely dense or hybrid retrieval strategies, allowing an appropriate strategy to be selected based on a target latency and resource budget.

[156]  arXiv:2109.10742 [pdf, other]
Title: Early Lane Change Prediction for Automated Driving Systems Using Multi-Task Attention-based Convolutional Neural Networks
Comments: 12 pages, 12 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

Lane change (LC) is one of the safety-critical manoeuvres in highway driving according to various road accident records. Thus, reliably predicting such manoeuvres in advance is critical for the safe and comfortable operation of automated driving systems. The majority of previous studies rely on detecting a manoeuvre that has already started, rather than predicting the manoeuvre in advance. Furthermore, most of the previous works do not estimate the key timings of the manoeuvre (e.g., crossing time), which can actually yield more useful information for the decision making in the ego vehicle. To address these shortcomings, this paper proposes a novel multi-task model to simultaneously estimate the likelihood of LC manoeuvres and the time-to-lane-change (TTLC). In both tasks, an attention-based convolutional neural network (CNN) is used as a shared feature extractor from a bird's eye view representation of the driving environment. The spatial attention used in the CNN model improves the feature extraction process by focusing on the most relevant areas of the surrounding environment. In addition, two novel curriculum learning schemes are employed to train the proposed approach. The extensive evaluation and comparative analysis of the proposed method on existing benchmark datasets show that the proposed method outperforms state-of-the-art LC prediction models, particularly considering long-term prediction performance.

[157]  arXiv:2109.10743 [pdf]
Title: Natural Typing Recognition via Surface Electromyography
Subjects: Human-Computer Interaction (cs.HC); Machine Learning (cs.LG)

By using a computer keyboard as a finger recording device, we construct the largest existing dataset for gesture recognition via surface electromyography (sEMG), and use deep learning to achieve over 90% character-level accuracy on reconstructing typed text entirely from measured muscle potentials. We prioritize the temporal structure of the EMG signal instead of the spatial structure of the electrode layout, using network architectures inspired by those used for real-time spoken language transcription. Our architecture recognizes the rapid movements of natural computer typing, which occur at irregular intervals and often overlap in time. The extensive size of our dataset also allows us to study gesture recognition after synthetically downgrading the spatial or temporal resolution, showing the system capabilities necessary for real-time gesture recognition.

[158]  arXiv:2109.10750 [pdf, other]
Title: Control of Pneumatic Artificial Muscles with SNN-based Cerebellar-like Model
Subjects: Robotics (cs.RO)

Soft robotics technologies have gained growing interest in recent years, enabling various applications from manufacturing to human-robot interaction. Pneumatic artificial muscle (PAM), a typical soft actuator, has been widely applied to soft robots. The compliance and resilience of soft actuators allow soft robots to behave compliantly when interacting with unstructured environments, while the utilization of soft actuators also introduces nonlinearity and uncertainty. Inspired by the cerebellum's vital functions in controlling human physical movement, a neural network model of the cerebellum based on spiking neural networks (SNNs) is designed. This model is used as a feed-forward controller in controlling a 1-DOF robot arm driven by PAMs. The simulation results show that this cerebellar-based system achieves good performance and increases the system's response.

[159]  arXiv:2109.10754 [pdf, ps, other]
Title: Optimal Operation of a Hydrogen-based Building Multi-Energy System Based on Deep Reinforcement Learning
Comments: 13 pages, 15 figures
Subjects: Systems and Control (eess.SY)

Since hydrogen has many advantages (e.g., no pollution, extensive sources, convenient storage and transportation), hydrogen-based multi-energy systems (HMESs) have received wide attention. However, existing works on the optimal operation of HMESs neglect building thermal dynamics, which means that the flexibility of building thermal loads cannot be utilized for reducing system operation cost. In this paper, we investigate an optimal operation problem of an HMES with the consideration of building thermal dynamics. Specifically, we first formulate an expected operational cost minimization problem related to an HMES. Due to the existence of uncertain parameters, inexplicit building thermal dynamics models, temporally coupled operational constraints related to three kinds of energy storage systems and indoor temperatures, as well as the coupling between electric energy subsystems and thermal energy subsystems, it is challenging to solve the formulated problem. To overcome the challenge, we reformulate the problem as a Markov game and propose an energy management algorithm to solve it based on multi-agent discrete actor-critic with rules (MADACR). Note that the proposed algorithm does not require any prior knowledge of uncertain parameters, parameter prediction, or an explicit building thermal dynamics model. Simulation results based on real-world traces show the effectiveness of the proposed algorithm.

[160]  arXiv:2109.10757 [pdf, other]
Title: Unsupervised Movement Detection in Indoor Positioning Systems
Subjects: Machine Learning (cs.LG); Applications (stat.AP)

In recent years, the usage of indoor positioning systems for manufacturing processes has become increasingly popular. Typically, the production hall is equipped with satellites which receive position data of sensors that can be pinned on components, load carriers or industrial trucks. This enables a company, e.g., to reduce search efforts and to optimize individual system processes. In our research context, a sensor only sends position information when it is moved. However, various circumstances frequently cause data to be sent undesirably, e.g. due to disrupting factors nearby. This has a negative impact on the data quality, the energy consumption, and the reliability of the whole system. Motivated by this, we aim to distinguish between actual movements and signals that were undesirably sent, which is particularly challenging due to the susceptibility of indoor systems to noise and measuring errors. Therefore, we propose two novel unsupervised classification algorithms suitable for this task. Depending on the question of interest, they rely either on a distance-based or on a time-based criterion, which makes it possible to use all essential information. Furthermore, we propose an approach to combine both classifications and to aggregate them on spatial production areas. This enables us to generate a comprehensive map of the underlying production hall with the sole usage of the position data. Aside from the analysis and detection of the underlying movement structure, the user benefits from a better understanding of their own system processes and from the detection of problematic system areas, which leads to a more efficient usage of positioning systems. Since all our approaches are constructed with unsupervised techniques, they are handily applicable in practice and do not require more information than the output data of the positioning system.

[161]  arXiv:2109.10760 [pdf, other]
Title: FaceEraser: Removing Facial Parts for Augmented Reality
Comments: 18 pages, 15 figures. ICCV 2021, Fifth Workshop on Computer Vision for AR/VR
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Our task is to remove all facial parts (e.g., eyebrows, eyes, mouth and nose), and then impose visual elements onto the ``blank'' face for augmented reality. Conventional object removal methods rely on image inpainting techniques (e.g., EdgeConnect, HiFill) that are trained in a self-supervised manner with randomly manipulated image pairs. Specifically, given a set of natural images, randomly masked images are used as inputs and the raw images are treated as ground truths. However, this technique does not satisfy the requirements of facial parts removal, as it is hard to obtain ``ground-truth'' images with real ``blank'' faces. To address this issue, we propose a novel data generation technique to produce paired training data that well mimic the ``blank'' faces. In the meantime, we propose a novel network architecture for improved inpainting quality for our task. Finally, we demonstrate various face-oriented augmented reality applications on top of our facial parts removal model. Our method has been integrated into commercial products and its effectiveness has been verified with unconstrained user inputs. The source code, pre-trained models and training data will be released for research purposes.

[162]  arXiv:2109.10761 [pdf, other]
Title: Stigmergy-based collision-avoidance algorithm for self-organising swarms
Comments: Accepted for publication in Proceedings of the 5th International Conference on Computational Vision and Bio Inspired Computing. To be published in Springer's Advances in Intelligent Systems and Computing
Subjects: Systems and Control (eess.SY); Robotics (cs.RO)

Real-time multi-agent collision-avoidance algorithms comprise a key enabling technology for the practical use of self-organising swarms of drones. This paper proposes a decentralised reciprocal collision-avoidance algorithm, which is scalable and based on stigmergy. The algorithm is computationally inexpensive, based on the gradient of the locally measured dynamic cumulative signal strength field which results from the signals emitted by the swarm. The signal strength acts as a repulsor on each drone, which then tends to steer away from the noisiest regions (cluttered environment), thus avoiding collisions. The magnitudes of these repulsive forces can be tuned to control the relative importance assigned to collision avoidance with respect to the other phenomena affecting the agent's dynamics. We carried out numerical experiments on a self-organising swarm of drones aimed at fighting wildfires autonomously. As expected, it has been found that the collision rate can be reduced by decreasing the cruise speed of the agents and/or by increasing the sampling frequency of the global signal strength field. A convenient by-product of the proposed collision-avoidance algorithm is that it helps maintain diversity in the swarm, thus enhancing exploration.
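
The gradient-based repulsion can be sketched in two dimensions as follows. The inverse-square decay of each emitter's signal, the central-difference gradient estimate, and the `gain` parameter are all assumptions made for illustration; the abstract does not specify the signal model or the control law.

```python
def signal_strength(point, emitters, eps=1e-6):
    """Cumulative field at `point`, assuming inverse-square decay per emitter."""
    x, y = point
    return sum(1.0 / ((x - ex) ** 2 + (y - ey) ** 2 + eps) for ex, ey in emitters)

def field_gradient(point, emitters, h=1e-3):
    """Central-difference estimate of the field gradient from local measurements."""
    x, y = point
    gx = (signal_strength((x + h, y), emitters)
          - signal_strength((x - h, y), emitters)) / (2 * h)
    gy = (signal_strength((x, y + h), emitters)
          - signal_strength((x, y - h), emitters)) / (2 * h)
    return gx, gy

def repulsive_velocity(point, emitters, gain=1.0):
    """Steer down the gradient, away from the strongest cumulative signal."""
    gx, gy = field_gradient(point, emitters)
    return -gain * gx, -gain * gy

# A drone at the origin with one neighbour to its right is pushed to the left.
vx, vy = repulsive_velocity((0.0, 0.0), [(1.0, 0.0)])
print(vx, vy)
```

The `gain` value plays the role of the tunable repulsion magnitude mentioned above: a larger gain gives collision avoidance more weight relative to the drone's other objectives.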

[163]  arXiv:2109.10763 [pdf]
Title: A Deep Learning Perspective on Connected Automated Vehicle (CAV) Cybersecurity and Threat Intelligence
Comments: Book chapter
Subjects: Cryptography and Security (cs.CR); Systems and Control (eess.SY)

The automation and connectivity of CAV inherit most of the cyber-physical vulnerabilities of incumbent technologies such as evolving network architectures, wireless communications, and AI-based automation. This book chapter covers the cyber-physical vulnerabilities and risks that originated in IT, OT, and the physical domains of the CAV ecosystem, eclectic threat landscapes, and threat intelligence. To deal with the security threats in high-speed, high dimensional, multimodal data and assets from eccentric stakeholders of the CAV ecosystem, this chapter presents and analyzes some of the state-of-the-art deep learning-based threat intelligence for attack detection. The frontiers in deep learning, namely Meta-Learning and Federated Learning, along with their challenges have been included in the chapter. We have proposed, trained, and tested the deep CNN-LSTM architecture for CAV threat intelligence, and assessed and compared the performance of the proposed model against other deep learning algorithms such as DNN, CNN, and LSTM. Our results indicate the superiority of the proposed model, although DNN and 1d-CNN also achieved more than 99% accuracy, precision, recall, f1-score, and AUC on the CAV-KDD dataset. The good performance of deep CNN-LSTM comes with increased model complexity and cumbersome hyperparameter tuning. Still, there are open challenges on deep learning adoption in the CAV cybersecurity paradigm due to lack of properly developed protocols and policies, poorly defined privileges between stakeholders, costlier training, adversarial threats to the model, and poor generalizability of the model on out-of-distribution data.

[164]  arXiv:2109.10767 [pdf, other]
Title: HybridSDF: Combining Free Form Shapes and Geometric Primitives for effective Shape Manipulation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computational Geometry (cs.CG)

CAD modeling typically involves the use of simple geometric primitives whereas recent advances in deep-learning based 3D surface modeling have opened new shape design avenues. Unfortunately, these advances have not yet been accepted by the CAD community because they cannot be integrated into engineering workflows. To remedy this, we propose a novel approach to effectively combining geometric primitives and free-form surfaces represented by implicit surfaces for accurate modeling that preserves interpretability, enforces consistency, and enables easy manipulation.

[165]  arXiv:2109.10770 [pdf, ps, other]
Title: Exploring Adversarial Examples for Efficient Active Learning in Machine Learning Classifiers
Subjects: Machine Learning (cs.LG)

Machine learning researchers have long noticed the phenomenon that the model training process will be more effective and efficient when the training samples are densely sampled around the underlying decision boundary. While this observation has already been widely applied in a range of machine learning security techniques, it lacks theoretical analyses of the correctness of the observation. To address this challenge, we first add particular perturbation to original training examples using adversarial attack methods so that the generated examples could lie approximately on the decision boundary of the ML classifiers. We then investigate the connections between active learning and these particular training examples. Through analyzing various representative classifiers such as k-NN classifiers, kernel methods as well as deep neural networks, we establish a theoretical foundation for the observation. As a result, our theoretical proofs provide support to more efficient active learning methods with the help of adversarial examples, contrary to previous works where adversarial examples are often used as destructive solutions. Experimental results show that the established theoretical foundation will guide better active learning strategies based on adversarial examples.

[166]  arXiv:2109.10774 [pdf, other]
Title: "It's a Trap!"-How Speculation Invariance Can Be Abused with Forward Speculative Interference
Comments: Presented at IEEE International Symposium On Secure And Private Execution Environment Design (SEED) 2021
Subjects: Cryptography and Security (cs.CR); Hardware Architecture (cs.AR)

Speculative side-channel attacks access sensitive data and use transmitters to leak the data during wrong-path execution. Various defenses have been proposed to prevent such information leakage. However, not all speculatively executed instructions are unsafe: Recent work demonstrates that speculation invariant instructions are independent of speculative control-flow paths and are guaranteed to eventually commit, regardless of the speculation outcome. Compile-time information coupled with run-time mechanisms can then selectively lift defenses for speculation invariant instructions, reclaiming some of the lost performance.
Unfortunately, speculation invariant instructions can easily be manipulated by a form of speculative interference to leak information via a new side-channel that we introduce in this paper. We show that forward speculative interference, where older speculative instructions interfere with younger speculation invariant instructions, effectively turns them into transmitters for secret data accessed during speculation. We demonstrate forward speculative interference on actual hardware by selectively filling the reorder buffer (ROB) with instructions, pushing speculation invariant instructions in or out of the ROB on demand, based on a speculatively accessed secret. This reveals the speculatively accessed secret, as the occupancy of the ROB itself becomes a new speculative side-channel.

[167]  arXiv:2109.10777 [pdf, other]
Title: Deep Variational Clustering Framework for Self-labeling of Large-scale Medical Images
Comments: arXiv admin note: text overlap with arXiv:2109.05232
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)

We propose a Deep Variational Clustering (DVC) framework for unsupervised representation learning and clustering of large-scale medical images. DVC simultaneously learns the multivariate Gaussian posterior through the probabilistic convolutional encoder and the likelihood distribution with the probabilistic convolutional decoder, and optimizes the cluster label assignment. Here, the learned multivariate Gaussian posterior captures the latent distribution of a large set of unlabeled images. Then, we perform unsupervised clustering on top of the variational latent space using a clustering loss. In this approach, the probabilistic decoder helps to prevent the distortion of data points in the latent space and to preserve the local structure of the data-generating distribution. The training process can be considered as a self-training process that iteratively refines the latent space while simultaneously optimizing cluster assignments. We evaluated our proposed framework on three public datasets that represented different medical imaging modalities. Our experimental results show that our proposed framework generalizes better across different datasets. It achieves compelling results on several medical imaging benchmarks. Thus, our approach offers potential advantages over conventional deep unsupervised learning in real-world applications. The source code of the method and all the experiments are available publicly at: https://github.com/csfarzin/DVC

[168]  arXiv:2109.10778 [pdf, other]
Title: Label Cleaning Multiple Instance Learning: Refining Coarse Annotations on Single Whole-Slide Images
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

Annotating cancerous regions in whole-slide images (WSIs) of pathology samples plays a critical role in clinical diagnosis, biomedical research, and machine learning algorithm development. However, generating exhaustive and accurate annotations is labor-intensive, challenging, and costly. Drawing only coarse and approximate annotations is a much easier task, less costly, and it alleviates pathologists' workload. In this paper, we study the problem of refining these approximate annotations in digital pathology to obtain more accurate ones. Some previous works have explored obtaining machine learning models from these inaccurate annotations, but few of them tackle the refinement problem where the mislabeled regions should be explicitly identified and corrected, and all of them require a (often very large) number of training samples. We present a method, named Label Cleaning Multiple Instance Learning (LC-MIL), to refine coarse annotations on a single WSI without the need for external training data. Patches cropped from a WSI with inaccurate labels are processed jointly with a MIL framework, and a deep-attention mechanism is leveraged to discriminate mislabeled instances, mitigating their impact on the predictive model and refining the segmentation. Our experiments on a heterogeneous WSI set with breast cancer lymph node metastasis, liver cancer, and colorectal cancer samples show that LC-MIL significantly refines the coarse annotations, outperforming the state-of-the-art alternatives, even while learning from a single slide. These results demonstrate that LC-MIL is a promising, lightweight tool to provide fine-grained annotations from coarsely annotated pathology sets.

[169]  arXiv:2109.10780 [pdf, other]
Title: Stability Assessment for Multi-Infeed Grid-Connected VSCs Modeled in the Admittance Matrix Form
Journal-ref: IEEE Transactions on Circuits and Systems I: Regular Papers ( Volume: 68, Issue: 9, Sept. 2021)
Subjects: Systems and Control (eess.SY)

The increasing use of power electronics converters to integrate renewable energy sources has been a subject of concern due to the resonance oscillatory phenomena caused by their interaction with poorly damped AC networks. Early studies focused on assessing the controller influence of a single converter connected to simple networks, and they are no longer representative of existing systems. Lately, studies of multi-infeed grid-connected converters are of particular interest, and their main aim is to apply traditional criteria and identify their difficulties in the stability assessment. An extension of traditional criteria is commonly proposed as a result of these analyses, but such extensions can be burdensome for large and complex power systems. The present work addresses this issue by proposing a simple criterion to assess the stability of large power systems with high penetration of power converters. The criterion has its origin in the mode analysis and positive-net damping stability criteria, and it addresses the stability in the frequency domain by studying the eigenvalue magnitude and real component of dynamic models in the admittance matrix form. Its effectiveness is tested in two case studies developed in Matlab/Simulink which compare it with traditional criteria, proving its simplicity.

[170]  arXiv:2109.10781 [pdf, other]
Title: Introducing Symmetries to Black Box Meta Reinforcement Learning
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)

Meta reinforcement learning (RL) attempts to discover new RL algorithms automatically from environment interaction. In so-called black-box approaches, the policy and the learning algorithm are jointly represented by a single neural network. These methods are very flexible, but they tend to underperform in terms of generalisation to new, unseen environments. In this paper, we explore the role of symmetries in meta-generalisation. We show that a recent successful meta RL approach that meta-learns an objective for backpropagation-based learning exhibits certain symmetries (specifically the reuse of the learning rule, and invariance to input and output permutations) that are not present in typical black-box meta RL systems. We hypothesise that these symmetries can play an important role in meta-generalisation. Building off recent work in black-box supervised meta learning, we develop a black-box meta RL system that exhibits these same symmetries. We show through careful experimentation that incorporating these symmetries can lead to algorithms with a greater ability to generalise to unseen action & observation spaces, tasks, and environments.

[171]  arXiv:2109.10787 [pdf, other]
Title: DHT-based Communications Survey: Architectures and Use Cases
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

Several distributed system paradigms utilize Distributed Hash Tables (DHTs) to realize structured peer-to-peer (P2P) overlays. DHT structures arise as the most commonly used organizations for peers that can efficiently perform crucial services such as data storage, replication, query resolution, and load balancing. With the advances in various distributed system technologies, novel and efficient solutions based on DHTs emerge and play critical roles in system design. DHT-based methods and communications have been proposed to address challenges such as scalability, availability, reliability and performance, by considering unique characteristics of these technologies. In this article, we propose a classification of the state-of-the-art DHT-based methods focusing on their system architecture, communication, routing and technological aspects across various system domains. To the best of our knowledge, there is no comprehensive survey on DHT-based applications from system architecture and communication perspectives that spans various domains of recent distributed system technologies. We investigate the recently emerged DHT-based solutions in the seven key domains of edge and fog computing, cloud computing, blockchain, the Internet of Things (IoT), Online Social Networks (OSNs), Mobile Ad Hoc Networks (MANETs), and Vehicular Ad Hoc Networks (VANETs). In contrast to the existing surveys, our study goes beyond the commonly known DHT methods such as storage, routing, and lookup, and identifies diverse DHT-based solutions including but not limited to aggregation, task scheduling, resource management and discovery, clustering and group management, federation, data dependency management, and data transmission. Furthermore, we identify open problems and discuss future research guidelines for each domain.
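
The lookup service that the abstract lists among the commonly known DHT methods can be illustrated with a small consistent-hashing ring. This is only a minimal sketch, not a method taken from the survey: the class name `HashRing`, the helper `ring_hash`, the use of SHA-1, and the 32-bit identifier space are all assumptions chosen for illustration.

```python
import bisect
import hashlib


def ring_hash(key, bits=32):
    """Map a string to an identifier on a ring of size 2**bits (assumed size)."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << bits)


class HashRing:
    """Hypothetical single-process model of a DHT identifier ring."""

    def __init__(self, nodes):
        # Sort peers by their position on the ring.
        self._ring = sorted((ring_hash(n), n) for n in nodes)
        self._ids = [h for h, _ in self._ring]

    def lookup(self, key):
        """Return the peer responsible for key: its clockwise successor."""
        if not self._ring:
            raise LookupError("ring has no nodes")
        # First peer whose identifier is >= the key's identifier, wrapping around.
        i = bisect.bisect_left(self._ids, ring_hash(key)) % len(self._ring)
        return self._ring[i][1]
```

In this sketch, a key stays with the same peer when an unrelated peer leaves the ring, which is the property that lets such overlays rebalance with limited data movement.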

[172]  arXiv:2109.10789 [pdf, other]
Title: Do I Get the Privacy I Need? Benchmarking Utility in Differential Privacy Libraries
Comments: 13 pages, 12 figures, 15 tables, and 1 algorithm
Subjects: Cryptography and Security (cs.CR)

An increasing number of open-source libraries promise to bring differential privacy to practice, even for non-experts. This paper studies five libraries that offer differentially private analytics: Google DP, SmartNoise, diffprivlib, diffpriv, and Chorus. We compare these libraries qualitatively (capabilities, features, and maturity) and quantitatively (utility and scalability) across four analytics queries (count, sum, mean, and variance) executed on synthetic and real-world datasets. We conclude that these libraries provide similar utility (except in some notable scenarios). However, there are significant differences in the features provided, and we find that no single library excels in all areas. Based on our results, we provide guidance for practitioners to help in choosing a suitable library, guidance for library designers to enhance their software, and guidance for researchers on open challenges in differential privacy tools for non-experts.
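
To make the count query mentioned above concrete, the following minimal sketch applies the textbook Laplace mechanism to a count. It does not use or reproduce the API of any of the five benchmarked libraries; the function names `laplace_noise` and `dp_count` are hypothetical, and a sensitivity of 1 is assumed (adding or removing one record changes the count by at most one).

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential samples."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_count(values, predicate, epsilon, rng=None):
    """Return a noisy count of the values matching predicate.

    Assumes sensitivity 1, so the Laplace noise scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    true_count = sum(1 for v in values if predicate(v))
    return true_count + laplace_noise(1.0 / epsilon, rng)
```

A smaller `epsilon` gives stronger privacy but noisier answers, which is the utility trade-off that a benchmark of this kind measures.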

[173]  arXiv:2109.10790 [pdf]
Title: Evaluation of mechanical and energy properties for the phase field modeling of failure
Subjects: Numerical Analysis (math.NA)

In recent years, various phase field models have been developed in variational methods to simulate the failure of brittle solids. However, there is a lack of objective evaluation of the existing results, and in particular, there are few studies on model nonhomogeneous resolution, stress-strain linear elastic properties, and failure stress estimation. To compensate for the above gaps, the commonly used variational phase field model is systematically analyzed to solve the problem of evaluating the mechanics and energy properties of the model in this paper. The unified expression of the analytical solution and the nonhomogeneous solution under specific boundary conditions is analyzed and verified. Additionally, we theoretically analyze the energy properties of the phase field model and study the influence of the critical strain energy on the damage field, stress and strain of different models. Finally, the effect of different effective material parameters on the material failure stress is analyzed, the stress estimation formula for the failure of mode I of various phase field models is given, and the accuracy of the theory is verified by some examples.

[174]  arXiv:2109.10795 [pdf, other]
Title: Neural network relief: a pruning algorithm based on neural activity
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)

Current deep neural networks (DNNs) are overparameterized and use most of their neuronal connections during inference for each task. The human brain, however, developed specialized regions for different tasks and performs inference with a small fraction of its neuronal connections. We propose an iterative pruning strategy introducing a simple importance-score metric that deactivates unimportant connections, tackling overparameterization in DNNs and modulating the firing patterns. The aim is to find the smallest number of connections that is still capable of solving a given task with comparable accuracy, i.e. a simpler subnetwork. We achieve comparable performance for LeNet architectures on MNIST, and significantly higher parameter compression than state-of-the-art algorithms for VGG and ResNet architectures on CIFAR-10/100 and Tiny-ImageNet. Our approach also performs well for the two different optimizers considered -- Adam and SGD. The algorithm is not designed to minimize FLOPs when considering current hardware and software implementations, although it performs reasonably when compared to the state of the art.

[175]  arXiv:2109.10797 [pdf, ps, other]
Title: Improved Multi-label Classification with Frequent Label-set Mining and Association
Subjects: Machine Learning (cs.LG)

Multi-label (ML) data deals with multiple classes associated with individual samples at the same time. This leads to the co-occurrence of several classes repeatedly, which indicates some existing correlation among them. In this article, the correlation among classes has been explored to improve the classification performance of existing ML classifiers. A novel approach of frequent label-set mining has been proposed to extract these correlated classes from the label-sets of the data. Both co-presence (CP) and co-absence (CA) of classes have been taken into consideration. The rules mined from the ML data have been further used to incorporate class correlation information into existing ML classifiers. The soft scores generated by an ML classifier are modified through a novel approach using the CP-CA rules. A concept of certain and uncertain scores has been defined here, where the proposed method aims to improve the uncertain scores with the help of the certain scores and their corresponding CP-CA rules. This has been experimentally analysed on ten ML datasets for three existing ML classifiers, which shows substantial improvement in their overall performance.
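
The idea of mining co-presence and co-absence from label-sets can be sketched for the simplest case of label pairs. This is an illustrative simplification, not the authors' algorithm: the function name `frequent_pairs`, the restriction to pairs, and the plain support threshold are assumptions.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(label_sets, all_labels, min_support):
    """Return (co_presence, co_absence) dicts mapping label pairs to support.

    Support is the fraction of samples in which both labels are present
    (co-presence) or both are absent (co-absence).
    """
    n = len(label_sets)
    cp, ca = Counter(), Counter()
    for present in label_sets:
        present = set(present)
        absent = set(all_labels) - present
        cp.update(combinations(sorted(present), 2))
        ca.update(combinations(sorted(absent), 2))

    def keep(counts):
        # Keep only pairs whose support reaches the threshold.
        return {pair: c / n for pair, c in counts.items() if c / n >= min_support}

    return keep(cp), keep(ca)
```

Pairs that survive the threshold are candidates for rules of the kind the abstract describes, where a confident score for one label could inform an uncertain score for its partner.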

[176]  arXiv:2109.10803 [pdf, other]
Title: Multi-Slice Clustering for 3-order Tensor Data
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)

Several methods of triclustering of three dimensional data require the specification of the cluster size in each dimension. This introduces a certain degree of arbitrariness. To address this issue, we propose a new method, namely the multi-slice clustering (MSC) for a 3-order tensor data set. We analyse, in each dimension or tensor mode, the spectral decomposition of each tensor slice, i.e. a matrix. Thus, we define a similarity measure between matrix slices up to a threshold (precision) parameter, and from that, identify a cluster. The intersection of all partial clusters provides the desired triclustering. The effectiveness of our algorithm is shown on both synthetic and real-world data sets.

[177]  arXiv:2109.10808 [pdf, other]
Title: Population-scale dietary interests during the COVID-19 pandemic
Subjects: Social and Information Networks (cs.SI)

The SARS-CoV-2 virus has altered people's lives around the world, not only through the coronavirus disease (COVID-19) it causes, but also through unprecedented non-pharmaceutical interventions such as full-scale national lockdowns. Here we document population-wide shifts in dietary interests in 12 countries in 2020, as revealed through timeseries of Google search volumes. We find that during the first wave of the COVID-19 pandemic there was an overall surge in food interest, larger and longer-lasting than the surge during typical end-of-year holidays. The changes were strongly associated with population-wide mobility patterns. Using a quasi-experimental regression discontinuity design, we estimate that the shock of decreased mobility manifested as a drastic increase in interest in consuming food at home, with interest in recipes and related entities increasing by 90% on average across countries, and a corresponding decrease in consuming food outside of home, with the interest in restaurants decreasing by 54% on average. We find that, in addition to the volume of searched foods, the nature of searched foods also changed. The most drastic (up to threefold) increases occurred for calorie-dense carbohydrate-based foods such as pastries, bakery products, bread, pies, and desserts. In terms of the relative share (rather than absolute volume) of search interest, the most prominent increases occurred for carbohydrate-based foods, whereas the share of interest in other food categories on average remained robust. The observed shifts in dietary interests have the potential to affect food consumption and health outcomes of people worldwide. These findings can inform governmental and organizational decisions regarding measures to mitigate the effects of the COVID-19 pandemic on diet and nutrition, and thus on population health.

[178]  arXiv:2109.10813 [pdf, other]
Title: A Workflow for Offline Model-Free Robotic Reinforcement Learning
Comments: CoRL 2021. Project Website: this https URL. First two authors contributed equally
Subjects: Machine Learning (cs.LG)

Offline reinforcement learning (RL) enables learning control policies by utilizing only prior experience, without any online interaction. This can allow robots to acquire generalizable skills from large and diverse datasets, without any costly or unsafe online data collection. Despite recent algorithmic advances in offline RL, applying these methods to real-world problems has proven challenging. Although offline RL methods can learn from prior data, there is no clear and well-understood process for making various design choices, from model architecture to algorithm hyperparameters, without actually evaluating the learned policies online. In this paper, our aim is to develop a practical workflow for using offline RL analogous to the relatively well-understood workflows for supervised learning problems. To this end, we devise a set of metrics and conditions that can be tracked over the course of offline training, and can inform the practitioner about how the algorithm and model architecture should be adjusted to improve final performance. Our workflow is derived from a conceptual understanding of the behavior of conservative offline RL algorithms and cross-validation in supervised learning. We demonstrate the efficacy of this workflow in producing effective policies without any online tuning, both in several simulated robotic learning scenarios and for three tasks on two distinct real robots, focusing on learning manipulation skills with raw image observations with sparse binary rewards. Explanatory video and additional results can be found at sites.google.com/view/offline-rl-workflow

[179]  arXiv:2109.10815 [pdf, ps, other]
Title: A modified block alternating splitting iteration method for solving a class of two-by-two block complex linear systems
Comments: Six pages, submitted
Subjects: Numerical Analysis (math.NA)

A modified version of the block alternating splitting (MBAS) iteration method is presented for solving the system arising from finite element discretization of the distributed optimal control problem with time-periodic parabolic equations. We prove that the MBAS iteration method is unconditionally convergent. We also present an estimation formula for the iteration parameter of the MBAS preconditioner. Numerical results are presented to verify the efficiency of both the MBAS iteration method and the MBAS preconditioner.

[180]  arXiv:2109.10817 [pdf, other]
Title: Causal Inference in Non-linear Time-series using Deep Networks and Knockoff Counterfactuals
Journal-ref: IEEE International Conference on Machine Learning and Applications (ICMLA) 2021
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)

Estimating causal relations is vital in understanding the complex interactions in multivariate time series. Non-linear coupling of variables is one of the major challenges in accurate estimation of cause-effect relations. In this paper, we propose to use deep autoregressive networks (DeepAR) in tandem with counterfactual analysis to infer nonlinear causal relations in multivariate time series. We extend the concept of Granger causality using probabilistic forecasting with DeepAR. Since deep networks can handle neither missing input nor out-of-distribution intervention, we propose to use the Knockoffs framework (Barber and Candès, 2015) for generating intervention variables and consequently counterfactual probabilistic forecasting. Knockoff samples are independent of their output given the observed variables and exchangeable with their counterpart variables without changing the underlying distribution of the data. We test our method on synthetic as well as real-world time series datasets. Overall, our method outperforms the widely used vector autoregressive Granger causality and PCMCI in detecting nonlinear causal dependency in multivariate time series.

[181]  arXiv:2109.10819 [pdf]
Title: Formulations and Approximations of the Branch Flow Model for Mesh Power Networks
Authors: Zhao Yuan
Comments: 9 pages submitted to the Journal of Modern Power Systems and Clean Energy. Manuscript ID: MPCE-2021-0647. September 2021. Email: zhaoyuan@hi.is, zhaoyuan.epslab@gmail.com
Subjects: Systems and Control (eess.SY)

The formulations and approximations of the branch flow model for mesh power networks (Mesh-BranchFlow) are given in this paper. Using different sets of the power flow equations, six formats of the exact Mesh-BranchFlow model are listed. These six formats are mathematically equivalent with each other. Linear approximation and second-order cone programming (SOCP) are then used to derive the six formats of the convex Mesh-BranchFlow model. The branch ampacity constraints considering the shunt conductance and capacitance of the transmission line $\Pi$-model are derived. The key foundation of deriving the ampacity constraints is the correct interpretation of the physical meaning of the transmission line $\Pi$-model. An exact linear expression of the ampacity constraints of the power loss variable is derived. The applications of the Mesh-BranchFlow model in deriving twelve formats of the exact optimal power flow (OPF) model and twelve formats of the approximate OPF model are formulated and analyzed. Using the Julia programming language, the extensive numerical investigations of all formats of the OPF models show the accuracy and computational efficiency of the Mesh-BranchFlow model. A penalty function based approximation gap reduction method is finally proposed and numerically validated to improve the AC-feasibility of the approximate Mesh-BranchFlow model.

[182]  arXiv:2109.10824 [pdf, other]
Title: Learning by Examples Based on Multi-level Optimization
Subjects: Machine Learning (cs.LG)

Learning by examples, which learns to solve a new problem by looking into how similar problems are solved, is an effective learning method in human learning. When a student learns a new topic, he/she finds out exemplar topics that are similar to this new topic and studies the exemplar topics to deepen the understanding of the new topic. We aim to investigate whether this powerful learning skill can be borrowed from humans to improve machine learning as well. In this work, we propose a novel learning approach called Learning By Examples (LBE). Our approach automatically retrieves a set of training examples that are similar to query examples and predicts labels for query examples by using class labels of the retrieved examples. We propose a three-level optimization framework to formulate LBE which involves three stages of learning: learning a Siamese network to retrieve similar examples; learning a matching network to make predictions on query examples by leveraging class labels of retrieved similar examples; learning the ``ground-truth'' similarities between training examples by minimizing the validation loss. We develop an efficient algorithm to solve the LBE problem and conduct extensive experiments on various benchmarks where the results demonstrate the effectiveness of our method on both supervised and few-shot learning.

[183]  arXiv:2109.10835 [pdf, other]
Title: Mapping and Validating a Point Neuron Model on Intel's Neuromorphic Hardware Loihi
Subjects: Neural and Evolutionary Computing (cs.NE); Emerging Technologies (cs.ET)

Neuromorphic hardware is based on emulating the natural biological structure of the brain. Since its computational model is similar to standard neural models, it could serve as a computational accelerator for research projects in the field of neuroscience and artificial intelligence, including biomedical applications. However, in order to exploit this new generation of computer chips, rigorous simulation and consequent validation of brain-based experimental data is imperative. In this work, we investigate the potential of Intel's fifth generation neuromorphic chip - `Loihi', which is based on the novel idea of Spiking Neural Networks (SNNs) emulating the neurons in the brain. The work is implemented in the context of simulating the Leaky Integrate and Fire (LIF) models based on the mouse primary visual cortex matched to a rich data set of anatomical, physiological and behavioral constraints. Simulations on the classical hardware serve as the validation platform for the neuromorphic implementation. We find that Loihi replicates classical simulations very efficiently and scales notably well in terms of both time and energy performance as the networks get larger.

[184]  arXiv:2109.10836 [html]
Title: AI-HRI 2021 Proceedings
Comments: Proceedings of the AI-HRI Symposium at AAAI-FSS 2021
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI)

The Artificial Intelligence (AI) for Human-Robot Interaction (HRI) Symposium has been a successful venue of discussion and collaboration since 2014. During that time, these symposia provided a fertile ground for numerous collaborations and pioneered many discussions revolving around trust in HRI, XAI for HRI, service robots, interactive learning, and more.
This year, we aim to review the achievements of the AI-HRI community in the last decade, identify the challenges facing ahead, and welcome new researchers who wish to take part in this growing community. Taking this wide perspective, this year there will be no single theme to lead the symposium and we encourage AI-HRI submissions from across disciplines and research interests. Moreover, with the rising interest in AR and VR as part of an interaction and following the difficulties in running physical experiments during the pandemic, this year we specifically encourage researchers to submit works that do not include a physical robot in their evaluation, but promote HRI research in general. In addition, acknowledging that ethics is an inherent part of the human-robot interaction, we encourage submissions of works on ethics for HRI. Over the course of the two-day meeting, we will host a collaborative forum for discussion of current efforts in AI-HRI, with additional talks focused on the topics of ethics in HRI and ubiquitous HRI.

[185]  arXiv:2109.10839 [pdf, other]
Title: Why Most Results of Socio-Technical Security User Studies Are False
Authors: Thomas Gross
Comments: Open Science Framework: this https URL, 19 pages, Author's copy of the work. The work was supported by the ERC Starting Grant CASCAde, GA no. 716980
Subjects: Cryptography and Security (cs.CR); Human-Computer Interaction (cs.HC)

Background. In recent years, cyber security user studies have been scrutinized for their reporting completeness, statistical reporting fidelity, statistical reliability and biases. It remains an open question what strength of evidence positive reports of such studies actually yield. We focus on the extent to which positive reports indicate a relation that is true in reality, that is, a probabilistic assessment.
Aim. This study aims at establishing the overall strength of evidence in cyber security user studies, along three dimensions: (i) Positive Predictive Value (PPV) and its complement False Positive Risk (FPR), (ii) Likelihood Ratio (LR), and (iii) Reverse-Bayesian Prior (RBP) for a fixed tolerated False Positive Risk.
Method. Based on $431$ coded statistical inferences in $146$ cyber security user studies from a published SLR covering the years 2006-2016, we first compute a simulation of the a posteriori false positive risk based on assumed prior and bias thresholds. Second, we establish the observed likelihood ratios for positive reports. Third, we compute the reverse Bayesian argument on the observed positive reports by computing the prior required for a fixed a posteriori false positive rate.
Results. We obtain a comprehensive analysis of the strength of evidence including an account of appropriate multiple comparison corrections. The simulations show that even in the face of well-controlled conditions and high prior likelihoods, only a few studies achieve good a posteriori probabilities.
Conclusions. Our work shows that the strength of evidence of the field is weak and that most positive reports are likely false. From this, we learn what to watch out for in studies to advance the knowledge of the field.
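
The quantities named in the Aim can be illustrated with the textbook Bayesian relations between prior, statistical power, and significance level. This is a minimal sketch under simplifying assumptions and not the paper's simulation: it ignores bias and multiple-comparison corrections, and the function names are hypothetical.

```python
def positive_predictive_value(prior, power, alpha):
    """Probability that a positive report reflects a true relation."""
    true_pos = power * prior
    false_pos = alpha * (1.0 - prior)
    return true_pos / (true_pos + false_pos)


def false_positive_risk(prior, power, alpha):
    """Complement of the PPV."""
    return 1.0 - positive_predictive_value(prior, power, alpha)


def reverse_bayesian_prior(target_fpr, power, alpha):
    """Prior needed so that a positive report has the tolerated FPR."""
    odds_term = alpha * (1.0 - target_fpr)
    return odds_term / (target_fpr * power + odds_term)
```

For instance, with an assumed prior of 0.1, power of 0.8, and a significance level of 0.05, these relations give a PPV of 0.64, so roughly one in three positive reports would be false even before any bias is considered.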

[186]  arXiv:2109.10847 [pdf, other]
Title: Small-Bench NLP: Benchmark for small single GPU trained models in Natural Language Processing
Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL)

Recent progress in the Natural Language Processing domain has given us several State-of-the-Art (SOTA) pretrained models which can be finetuned for specific tasks. These large models with billions of parameters trained on numerous GPUs/TPUs over weeks are leading in the benchmark leaderboards. In this paper, we discuss the need for a benchmark for cost- and time-effective smaller models trained on a single GPU. This will enable researchers with resource constraints to experiment with novel and innovative ideas on tokenization, pretraining tasks, architecture, fine-tuning methods, etc. We set up Small-Bench NLP, a benchmark for small efficient neural language models trained on a single GPU. The Small-Bench NLP benchmark comprises eight NLP tasks on the publicly available GLUE datasets and a leaderboard to track the progress of the community. Our ELECTRA-DeBERTa (15M parameters) small model architecture achieves an average score of 81.53, which is comparable to BERT-Base's 82.20 (110M parameters). Our models, code and leaderboard are available at https://github.com/smallbenchnlp

[187]  arXiv:2109.10852 [pdf, other]
Title: Pix2seq: A Language Modeling Framework for Object Detection
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)

This paper presents Pix2Seq, a simple and generic framework for object detection. Unlike existing approaches that explicitly integrate prior knowledge about the task, we simply cast object detection as a language modeling task conditioned on the observed pixel inputs. Object descriptions (e.g., bounding boxes and class labels) are expressed as sequences of discrete tokens, and we train a neural net to perceive the image and generate the desired sequence. Our approach is based mainly on the intuition that if a neural net knows about where and what the objects are, we just need to teach it how to read them out. Beyond the use of task-specific data augmentations, our approach makes minimal assumptions about the task, yet it achieves competitive results on the challenging COCO dataset, compared to highly specialized and well optimized detection algorithms.
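
To make the idea of expressing object descriptions as discrete tokens concrete, here is a minimal sketch that quantizes one bounding box into coordinate tokens and appends a class token. The number of bins, the coordinate order, and the vocabulary layout (class tokens placed after the coordinate bins) are assumptions chosen for illustration, not details taken from the abstract.

```python
def box_to_tokens(box, label, image_size, n_bins=1000):
    """Turn one labelled box into a short sequence of integer tokens.

    box: (ymin, xmin, ymax, xmax) in pixels (assumed order).
    label: integer class id.
    image_size: (height, width) in pixels.
    n_bins: number of quantization bins per coordinate (assumed value).
    """
    ymin, xmin, ymax, xmax = box
    height, width = image_size
    tokens = []
    for value, size in ((ymin, height), (xmin, width), (ymax, height), (xmax, width)):
        # Map the coordinate to a bin index in [0, n_bins - 1].
        bin_index = int(value * n_bins // size)
        tokens.append(min(n_bins - 1, max(0, bin_index)))
    # Class tokens occupy the ids directly after the coordinate bins.
    tokens.append(n_bins + label)
    return tokens


if __name__ == "__main__":
    print(box_to_tokens((10, 20, 110, 220), label=3, image_size=(200, 400), n_bins=100))
```

A sequence model trained on such token sequences would then be asked to generate them from the image, which is the framing the abstract describes.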

[188]  arXiv:2109.10855 [pdf, other]
Title: BFClass: A Backdoor-free Text Classification Framework
Comments: Accepted to appear in Findings of EMNLP 2021
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG)

Backdoor attacks introduce artificial vulnerabilities into a model by poisoning a subset of the training data via injecting triggers and modifying labels. Various trigger design strategies have been explored to attack text classifiers; however, defending against such attacks remains an open problem. In this work, we propose BFClass, a novel efficient backdoor-free training framework for text classification. The backbone of BFClass is a pre-trained discriminator that predicts whether each token in the corrupted input was replaced by a masked language model. To identify triggers, we utilize this discriminator to locate the most suspicious token from each training sample and then distill a concise set by considering their association strengths with particular labels. To recognize the poisoned subset, we examine the training samples with these identified triggers as the most suspicious token, and check if removing the trigger will change the poisoned model's prediction. Extensive experiments demonstrate that BFClass can identify all the triggers, remove 95% of the poisoned training samples with very limited false alarms, and achieve almost the same performance as the models trained on the benign training data.

[189]  arXiv:2109.10856 [pdf, other]
Title: Coarse2Fine: Fine-grained Text Classification on Coarsely-grained Annotated Data
Comments: Accepted to appear in EMNLP 2021
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG)

Existing text classification methods mainly focus on a fixed label set, whereas many real-world applications require extending to new fine-grained classes as the number of samples per label increases. To accommodate such requirements, we introduce a new problem called coarse-to-fine grained classification, which aims to perform fine-grained classification on coarsely annotated data. Instead of asking for new fine-grained human annotations, we opt to leverage label surface names as the only human guidance and weave rich pre-trained generative language models into the iterative weak supervision strategy. Specifically, we first propose a label-conditioned finetuning formulation to attune these generators for our task. Furthermore, we devise a regularization objective based on the coarse-fine label constraints derived from our problem setting, giving us even further improvements over the prior formulation. Our framework uses the fine-tuned generative models to sample pseudo-training data for training the classifier, and bootstraps on real unlabeled data for model refinement. Extensive experiments and case studies on two real-world datasets demonstrate superior performance over SOTA zero-shot classification baselines.

[190]  arXiv:2109.10859 [pdf, other]
Title: Pushing the Right Buttons: Adversarial Evaluation of Quality Estimation
Comments: Accepted to WMT 2021 Conference co-located with EMNLP 2021. 14 pages with a 4 page appendix
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)

Current Machine Translation (MT) systems achieve very good results on a growing variety of language pairs and datasets. However, they are known to produce fluent translation outputs that can contain important meaning errors, thus undermining their reliability in practice. Quality Estimation (QE) is the task of automatically assessing the performance of MT systems at test time. Thus, in order to be useful, QE systems should be able to detect such errors. However, this ability is yet to be tested in the current evaluation practices, where QE systems are assessed only in terms of their correlation with human judgements. In this work, we bridge this gap by proposing a general methodology for adversarial testing of QE for MT. First, we show that despite a high correlation with human judgements achieved by the recent SOTA, certain types of meaning errors are still problematic for QE to detect. Second, we show that on average, the ability of a given model to discriminate between meaning-preserving and meaning-altering perturbations is predictive of its overall performance, thus potentially allowing for comparing QE systems without relying on manual quality annotation.

[191]  arXiv:2109.10862 [pdf, other]
Title: Recursively Summarizing Books with Human Feedback
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)

A major challenge for scaling machine learning is training models to perform tasks that are very difficult or time-consuming for humans to evaluate. We present progress on this problem on the task of abstractive summarization of entire fiction novels. Our method combines learning from human feedback with recursive task decomposition: we use models trained on smaller parts of the task to assist humans in giving feedback on the broader task. We collect a large volume of demonstrations and comparisons from human labelers, and fine-tune GPT-3 using behavioral cloning and reward modeling to do summarization recursively. At inference time, the model first summarizes small sections of the book and then recursively summarizes these summaries to produce a summary of the entire book. Our human labelers are able to supervise and evaluate the models quickly, despite not having read the entire books themselves. Our resulting model generates sensible summaries of entire books, even matching the quality of human-written summaries in a few cases ($\sim5\%$ of books). We achieve state-of-the-art results on the recent BookSum dataset for book-length summarization. A zero-shot question-answering model using these summaries achieves state-of-the-art results on the challenging NarrativeQA benchmark for answering questions about books and movie scripts. We release datasets of samples from our model.
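
The inference-time procedure described above (summarize small sections, then summarize the summaries) can be sketched as a short recursion. The `summarize` callable stands in for the fine-tuned model and is purely hypothetical here, as are the chunk size and the word-based splitting; the sketch only illustrates the control flow.

```python
def summarize_recursively(text, summarize, chunk_words=200):
    """Summarize text of any length by recursively summarizing chunk summaries.

    summarize: callable mapping a short text to a shorter text
        (a placeholder for a learned summarization model).
    chunk_words: maximum number of words handed to `summarize` at the leaves
        (assumed value).
    """
    words = text.split()
    if len(words) <= chunk_words:
        return summarize(text)
    # Split into fixed-size sections and summarize each one independently.
    chunks = [" ".join(words[i:i + chunk_words])
              for i in range(0, len(words), chunk_words)]
    combined = " ".join(summarize(chunk) for chunk in chunks)
    # Guard against a summarizer that fails to shorten its input.
    if len(combined.split()) >= len(words):
        return summarize(combined)
    return summarize_recursively(combined, summarize, chunk_words)


def first_words(text, limit=20):
    """Toy stand-in for a model: keep only the first `limit` words."""
    return " ".join(text.split()[:limit])


if __name__ == "__main__":
    book = " ".join(f"w{i}" for i in range(1000))
    print(summarize_recursively(book, first_words))
```

Any function that reliably shortens its input can be passed as `summarize`; the toy `first_words` helper exists only so the example runs without a model.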

[192]  arXiv:2109.10867 [pdf]
Title: Towards a GDPR-Compliant Blockchain-Based COVID Vaccination Passport
Comments: Applied Sciences, 2021
Subjects: Cryptography and Security (cs.CR); Human-Computer Interaction (cs.HC)

The COVID-19 pandemic has shaken the world and limited work/personal life activities. Besides the loss of human lives and agony faced by humankind, the pandemic has badly hit different sectors economically, including the travel industry. Special arrangements, including COVID tests before departure and on arrival, and voluntary quarantine, were enforced to limit the risk of transmission. However, the hope for returning to a normal (pre-COVID) routine relies on the success of the current COVID vaccination drives administered by different countries. To open for tourism and other necessary travel, there is a need for a universally accessible proof of COVID vaccination, allowing travelers to cross borders without any hindrance. This paper presents an architectural framework for a GDPR-compliant blockchain-based COVID vaccination passport (VacciFi), whilst considering the relevant developments, especially in the European Union region.

[193]  arXiv:2109.10868 [pdf, other]
Title: A Context-aware Radio Resource Management in Heterogeneous Virtual RANs
Comments: Accepted for publication in IEEE Transactions on Network and Service Management
Subjects: Networking and Internet Architecture (cs.NI)

New-generation wireless networks are designed to support a wide range of services with diverse key performance indicator (KPI) requirements. A fundamental component of such networks, and a pivotal factor in the fulfillment of the target KPIs, is the virtual radio access network (vRAN), which allows high flexibility in the control of the radio link. However, to fully exploit the potential of vRANs, an efficient mapping of the rapidly varying context to radio control decisions is not only essential, but also challenging owing to the interdependence of user traffic demand, channel conditions, and resource allocation. Here, we propose CAREM, a reinforcement learning framework for dynamic radio resource allocation in heterogeneous vRANs, which selects the best available link and transmission parameters for packet transfer, so as to meet the KPI requirements. To show its effectiveness, we develop a testbed for proof-of-concept. Experimental results demonstrate that CAREM enables an efficient radio resource allocation under different settings and traffic demand. Also, compared to the closest existing scheme based on a neural network and to standard LTE, CAREM exhibits an improvement of one order of magnitude in packet loss and latency, while it provides a 65% latency improvement relative to the contextual bandit approach.

[194]  arXiv:2109.10870 [pdf, other]
Title: SoK: Machine Learning Governance
Subjects: Cryptography and Security (cs.CR); Machine Learning (cs.LG); Software Engineering (cs.SE)

The application of machine learning (ML) in computer systems introduces not only many benefits but also risks to society. In this paper, we develop the concept of ML governance to balance such benefits and risks, with the aim of achieving responsible applications of ML. Our approach first systematizes research towards ascertaining ownership of data and models, thus fostering a notion of identity specific to ML systems. Building on this foundation, we use identities to hold principals accountable for failures of ML systems through both attribution and auditing. To increase trust in ML systems, we then survey techniques for developing assurance, i.e., confidence that the system meets its security requirements and does not exhibit certain known failures. This leads us to highlight the need for techniques that allow a model owner to manage the life cycle of their system, e.g., to patch or retire their ML system. Taken together, our systematization of knowledge standardizes the interactions between principals involved in the deployment of ML throughout its life cycle. We highlight opportunities for future work, e.g., to formalize the resulting game between ML principals.

[195]  arXiv:2109.10871 [pdf, other]
Title: On Reference Solutions to Non-Gaussian SLAM Factor Graphs
Subjects: Robotics (cs.RO); Applications (stat.AP)

Many real-world applications of simultaneous localization and mapping (SLAM) require approximate inference approaches, as exact inference for high-dimensional non-Gaussian posterior distributions is often computationally intractable. There are substantial challenges, however, in evaluating the quality of a solution provided by such inference techniques. One approach to solution evaluation is to solve the non-Gaussian posteriors with a more computationally expensive but generally accurate approach to create a reference solution for side-by-side comparison. Our work takes this direction. This paper presents nested sampling for factor graphs (NSFG), a nested-sampling-based approach for posterior estimation in non-Gaussian factor graph inference. Although NSFG applies to any problem modeled as inference over a factor graph, we focus on providing reference solutions for evaluation of approximate inference approaches to SLAM problems. The sparsity structure of SLAM factor graphs is exploited for improved computational performance without sacrificing solution quality. We compare NSFG to two other sampling-based approaches, the No-U-Turn sampler (NUTS) and sequential Monte Carlo (SMC), as well as GTSAM, a state-of-the-art Gaussian SLAM solver. We evaluate across several synthetic examples of interest to the non-Gaussian SLAM community, including multi-robot range-only SLAM and range-only SLAM with ambiguous data associations. Quantitative and qualitative analyses show NSFG is capable of producing high-fidelity solutions to a wide range of non-Gaussian SLAM problems, with solutions notably superior to those of NUTS and SMC. In addition, NSFG demonstrated improved scalability over NUTS and SMC.

[196]  arXiv:2109.10872 [pdf, ps, other]
Title: Nonlinear Attitude Estimation Using Intermittent Linear Velocity and Vector Measurements
Comments: Accepted to the IEEE 60th Conference on Decision and Control (CDC 2021), 8 pages, 3 figures
Subjects: Systems and Control (eess.SY); Robotics (cs.RO); Optimization and Control (math.OC)

This paper investigates the problem of continuous attitude estimation on $SO(3)$ using continuous angular velocity and linear acceleration measurements as well as intermittent linear velocity and inertial vector measurements. First, we propose a nonlinear observer for the case where all the measurements are continuous and almost global asymptotic stability (AGAS) is shown using the notion of almost global input-to-state stability (ISS) on manifolds. Thereafter, a hybrid attitude observer, with AGAS guarantees, is proposed in terms of intermittent linear velocity and vector measurements. Numerical simulation results are presented to illustrate the performance of the proposed hybrid observer.

[197]  arXiv:2109.10876 [pdf, other]
Title: Code modernization strategies for short-range non-bonded molecular dynamics simulations
Comments: 9 pages, 8 figures
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Computational Physics (physics.comp-ph)

As modern HPC systems increasingly rely on greater core counts and wider vector registers, applications need to be adapted to fully utilize these hardware capabilities. One class of applications that can benefit from this increase in parallelism is molecular dynamics simulations. In this paper, we describe our efforts at modernizing the ESPResSo++ molecular dynamics simulation package by restructuring its particle data layout for efficient memory accesses and applying vectorization techniques to benefit the calculation of short-range non-bonded forces, which results in an overall 3 times speedup and serves as a baseline for further optimizations. We also implement finer-grain parallelism for multi-core CPUs through HPX, a C++ runtime system which uses lightweight threads and an asynchronous many-task approach to maximize parallelism. Our goal is to evaluate the performance of an HPX-based approach compared to the bulk-synchronous MPI-based implementation. This requires the introduction of an additional layer to the domain decomposition scheme that defines the task granularity. On spatially inhomogeneous systems, which impose a corresponding load imbalance in traditional MPI-based approaches, we demonstrate that by choosing an optimal task size, the efficient work-stealing mechanisms of HPX can overcome the overhead of communication, resulting in an overall 1.3 times speedup compared to the baseline MPI version.

[198]  arXiv:2109.10883 [pdf, other]
Title: ENERO: Efficient Real-Time Routing Optimization
Comments: 12 pages, 10 figures
Subjects: Networking and Internet Architecture (cs.NI); Machine Learning (cs.LG)

Wide Area Networks (WANs) are a key infrastructure in today's society. In recent years, WANs have seen a considerable increase in network traffic as well as in the number of network applications. To enable the deployment of emergent network applications (e.g., Vehicular networks, Internet of Things), existing Traffic Engineering (TE) solutions must be able to achieve high-performance real-time network operation. In addition, TE solutions must be able to adapt to dynamic scenarios (e.g., changes in the traffic matrix or topology link failures). However, current TE technologies rely on hand-crafted heuristics or computationally expensive solvers, which are not suitable for highly dynamic TE scenarios.
In this paper we propose Enero, an efficient real-time TE engine. Enero is based on a two-stage optimization process. In the first stage, it leverages Deep Reinforcement Learning (DRL) to optimize the routing configuration by generating a long-term TE strategy. We integrated a Graph Neural Network (GNN) into the DRL agent to enable efficient TE on dynamic networks. In the second stage, Enero uses a Local Search algorithm to improve DRL's solution without adding computational overhead to the optimization process. Enero offers a lower bound in performance, enabling the network operator to know the worst-case performance of the DRL agent. We believe that the lower bound in performance will ease the path to deploying DRL-based solutions in real-world network scenarios. The experimental results indicate that Enero is able to operate in real-world dynamic network topologies in 4.5 seconds on average for topologies up to 100 edges.

[199]  arXiv:2109.10884 [pdf, other]
Title: Simple exponential acceleration of the power iteration algorithm
Comments: 7 pages, 1 figure, 2 tables, 3 Python files
Subjects: Numerical Analysis (math.NA); Disordered Systems and Neural Networks (cond-mat.dis-nn)

Many real-world problems rely on finding eigenvalues and eigenvectors of a matrix. The power iteration algorithm is a simple method for determining the largest eigenvalue and associated eigenvector of a general matrix. This algorithm relies on the idea that repeated multiplication of a randomly chosen vector x by the matrix A gradually amplifies the component of the vector along the eigenvector of the largest eigenvalue of A while suppressing all other components. Unfortunately, the power iteration algorithm may demonstrate slow convergence. In this report, we demonstrate an exponential speed up in convergence of the power iteration algorithm with only a polynomial increase in computation by taking advantage of the commutativity of matrix multiplication.
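
One way to obtain this kind of speedup is repeated matrix squaring: k squarings produce a matrix proportional to A raised to the power 2^k, so applying it to a vector is equivalent to 2^k plain power-iteration steps at the cost of k matrix products. The sketch below illustrates that idea in plain Python. It is not necessarily the exact algorithm of the report, and the all-ones start vector, the number of squarings, and the rescaling step are assumptions made for this example.

```python
def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def rescale(m):
    """Divide by the largest absolute entry to avoid overflow (m must be nonzero)."""
    scale = max(abs(v) for row in m for v in row)
    return [[v / scale for v in row] for row in m]


def dominant_eigenpair(a, squarings=6):
    """Estimate the largest-magnitude eigenvalue and its eigenvector.

    After `squarings` squarings the working matrix is proportional to
    a ** (2 ** squarings).
    """
    n = len(a)
    power = rescale(a)
    for _ in range(squarings):
        power = rescale(matmul(power, power))
    # Apply the high matrix power to an all-ones start vector (assumed to
    # have a nonzero component along the dominant eigenvector).
    x = [sum(row) for row in power]
    norm = sum(v * v for v in x) ** 0.5
    x = [v / norm for v in x]
    # Rayleigh quotient with the original matrix gives the eigenvalue estimate.
    ax = [sum(a[i][j] * x[j] for j in range(n)) for i in range(n)]
    eigenvalue = sum(ax[i] * x[i] for i in range(n))
    return eigenvalue, x


if __name__ == "__main__":
    print(dominant_eigenpair([[2.0, 1.0], [1.0, 3.0]]))
```

The trade-off is that each step is a matrix-matrix product rather than a matrix-vector product, which matches the abstract's remark about a polynomial increase in computation.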

[200]  arXiv:2109.10885 [pdf, other]
Title: Easily computable continuous metrics on the space of isometry classes of all 2-dimensional lattices
Comments: 26 pages, 7 figures
Subjects: Computational Geometry (cs.CG)

A periodic lattice in Euclidean space is the infinite set of all integer linear combinations of basis vectors. Any lattice can be generated by infinitely many different bases. Motivated by rigid crystal structures, we consider lattices up to rigid motion or isometry, which preserves inter-point distances. Then all isometry classes of lattices form a continuous space. There are several parameterisations of this space in dimensions two and three, but this is the first which is not discontinuous in singular cases. We introduce new continuous coordinates (root products) on the space of lattices and new metrics between root forms satisfying all metric axioms and continuity under all perturbations. The root forms allow visualisations of hundreds of thousands of real crystal lattices from the Cambridge Structural Database for the first time.

[201]  arXiv:2109.10886 [pdf, other]
Title: Investigating Entropy for Extractive Document Summarization
Subjects: Information Retrieval (cs.IR)

Automatic text summarization aims to cut down readers' time and cognitive effort by reducing the content of a text document without compromising on its essence. Ergo, informativeness is the prime attribute of a document summary generated by an algorithm, and selecting sentences that capture the essence of a document is the primary goal of extractive document summarization. In this paper, we employ Shannon entropy to capture informativeness of sentences. We employ Non-negative Matrix Factorization (NMF) to reveal probability distributions for computing entropy of terms, topics, and sentences in latent space. We present an information-theoretic interpretation of the computed entropy, which is the bedrock of the proposed E-Summ algorithm, an unsupervised method for extractive document summarization. The algorithm systematically applies information-theoretic principles for selecting informative sentences from important topics in the document. The proposed algorithm is generic and fast, and hence amenable to use for summarization of documents in real time. Furthermore, it is domain- and collection-independent and agnostic to the language of the document. Benefiting from strictly positive NMF factor matrices, the E-Summ algorithm is transparent and explainable too. We use the standard ROUGE toolkit for performance evaluation of the proposed method on four well-known public datasets. We also perform quantitative assessment of E-Summ summary quality by computing its semantic similarity with respect to the original document. Our investigation reveals that though using NMF and an information-theoretic approach for document summarization promises efficient, explainable, and language-independent text summarization, it needs to be bolstered to match the performance of deep neural methods.
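
For readers unfamiliar with the quantity involved, the sketch below computes the Shannon entropy of a distribution obtained by normalizing a vector of non-negative weights, such as one row of an NMF factor matrix. How E-Summ actually derives its distributions and combines the entropies is not specified in the abstract, so the normalization step and the function name are assumptions for illustration only.

```python
import math


def shannon_entropy(weights):
    """Entropy in bits of the distribution obtained by normalizing `weights`.

    weights: non-negative numbers with a positive sum (for example, the
    topic weights of one sentence). Zero weights contribute nothing.
    """
    total = sum(weights)
    probabilities = [w / total for w in weights if w > 0]
    return -sum(p * math.log2(p) for p in probabilities)


if __name__ == "__main__":
    print(shannon_entropy([1, 1, 1, 1]))  # uniform over four outcomes
    print(shannon_entropy([5, 0, 0, 0]))  # all mass on one outcome
```

A uniform vector gives the maximum entropy for its length, while a vector concentrated on a single entry gives zero.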

[202]  arXiv:2109.10888 [pdf, other]
Title: Quantifying Model Predictive Uncertainty with Perturbation Theory
Comments: 16 pages, 12 figures, 4 tables. arXiv admin note: text overlap with arXiv:2103.01374
Subjects: Machine Learning (cs.LG); Information Theory (cs.IT)

We propose a framework for predictive uncertainty quantification of a neural network that replaces the conventional Bayesian notion of weight probability density function (PDF) with a physics based potential field representation of the model weights in a Gaussian reproducing kernel Hilbert space (RKHS) embedding. This allows us to use perturbation theory from quantum physics to formulate a moment decomposition problem over the model weight-output relationship. The extracted moments reveal successive degrees of regularization of the weight potential field around the local neighborhood of the model output. Such localized moments represent well the PDF tails and provide significantly greater accuracy of the model's predictive uncertainty than the central moments characterized by Bayesian and ensemble methods or their variants. We show that this consequently leads to a better ability to detect false model predictions of test data that has undergone a covariate shift away from the training PDF learned by the model. We evaluate our approach against baseline uncertainty quantification methods on several benchmark datasets that are corrupted using common distortion techniques. Our approach provides fast model predictive uncertainty estimates with much greater precision and calibration.

[203]  arXiv:2109.10889 [pdf, other]
Title: Space-Time Tradeoffs for Answering Boolean Conjunctive Queries
Comments: Comments and suggestions are always welcome
Subjects: Data Structures and Algorithms (cs.DS); Databases (cs.DB)

In this paper, we investigate space-time tradeoffs for answering Boolean conjunctive queries. The goal is to create a data structure in an initial preprocessing phase and use it for answering (multiple) queries. Previous work has developed data structures that trade off space usage for answering time and has proved conditional space lower bounds for queries of practical interest such as the path and triangle query. However, most of these results cater to only those queries, lack a comprehensive framework, and are not generalizable. The isolated treatment of these queries also fails to utilize the connections with extensive research on related problems within the database community. The key insight in this work is to exploit the formalism of relational algebra by casting the problems as answering join queries over a relational database. Using the notion of Boolean {\em adorned queries} and {\em access patterns}, we propose a unified framework that captures several widely studied algorithmic problems. Our main contribution is three-fold. First, we present an algorithm that recovers existing space-time tradeoffs for several problems. The algorithm is based on an application of the {\em join size bound} to capture the space usage of our data structure. We combine our data structure with {\em query decomposition} techniques to further improve the tradeoffs and show that it is readily extensible to queries with negation. Second, we falsify two conjectures proposed in the existing literature that relate to the space-time lower bound for path queries and triangle detection by proposing an unexpectedly better algorithm. This result opens a new avenue for improving several algorithmic results that have so far been assumed to be (conditionally) optimal. Finally, we prove new conditional space-time lower bounds for star and path queries.

[204]  arXiv:2109.10892 [pdf, other]
Title: The Design of Stretch: A Compact, Lightweight Mobile Manipulator for Indoor Human Environments
Comments: 6 pages plus references, 7 figures
Subjects: Robotics (cs.RO)

Mobile manipulators for indoor human environments can serve as versatile devices that perform a variety of tasks, yet adoption of this technology has been limited. Reducing size, weight, and cost could facilitate adoption, but risks restricting capabilities. We present a novel design that reduces size, weight, and cost, while still performing a variety of tasks. The core design consists of a two-wheeled differential-drive mobile base, a lift, and a telescoping arm configured to achieve Cartesian motion at the end of the arm. Design extensions include a 1 degree-of-freedom (DOF) wrist to stow a tool, a 2-DOF dexterous wrist to pitch and roll a tool, and a compliant gripper. We justify our design with mathematical models of static stability that relate the robot's size and weight to its workspace, payload, and applied forces. We also provide empirical support by teleoperating and autonomously controlling a commercial robot based on our design (the Stretch RE1 from Hello Robot Inc.) to perform tasks in real homes.

[205]  arXiv:2109.10893 [pdf, other]
Title: Intercept Graph: An Interactive Radial Visualization for Comparison of State Changes
Subjects: Human-Computer Interaction (cs.HC); Graphics (cs.GR)

State change comparison of multiple data items is often necessary in multiple application domains, such as medical science, financial engineering, sociology, and biological science. Slope graphs and grouped bar charts have been widely used to show a "before-and-after" story of different data states and indicate their changes. However, they visualize state changes as either slope or difference of bars, which has been shown to be less effective for quantitative comparison. Also, both visual designs suffer from visual clutter issues with an increasing number of data items. In this paper, we propose Intercept Graph, a novel visual design to facilitate effective interactive comparison of state changes. Specifically, a radial design is proposed to visualize the starting and ending states of each data item and the line segment length explicitly encodes the "state change". By interactively adjusting the radius of the inner circular axis, Intercept Graph can smoothly filter the large state changes and magnify the difference between similar state changes, mitigating the visual clutter issues and enhancing the effective comparison of state changes. We conducted a case study by comparing Intercept Graph with slope graphs and grouped bar charts on real datasets to demonstrate the effectiveness of Intercept Graph.

[206]  arXiv:2109.10894 [pdf, other]
Title: Evaluating Effects of Background Stories on Graph Perception
Comments: 15 pages, 6 figures
Subjects: Human-Computer Interaction (cs.HC)

A graph is an abstract model that represents relations among entities, for example, the interactions between characters in a novel. A background story endows entities and relations with real-world meanings and describes the semantics and context of the abstract model, for example, the actual story that the novel presents. Considering practical experience and prior research, human viewers who are familiar with the background story of a graph and those who do not know the background story may perceive the same graph differently. However, no previous research has adequately addressed this problem. This research paper thus presents an evaluation that investigated the effects of background stories on graph perception. Three hypotheses that focused on the role of visual focus areas, graph structure identification, and mental model formation on graph perception were formulated and guided three controlled experiments that evaluated the hypotheses using real-world graphs with background stories. An analysis of the resulting experimental data, which compared the performance of participants who read and did not read the background stories, obtained a set of instructive findings. First, having knowledge about a graph's background story influences participants' focus areas during interactive graph explorations. Second, such knowledge significantly affects one's ability to identify community structures but not high degree and bridge structures. Third, this knowledge influences graph recognition under blurred visual conditions. These findings can bring new considerations to the design of storytelling visualizations and interactive graph explorations.

[207]  arXiv:2109.10895 [pdf, other]
Title: Geo-Context Aware Study of Vision-Based Autonomous Driving Models and Spatial Video Data
Comments: 11 pages, 8 figures, and 1 table. This paper is accepted and to be published in IEEE Transactions on Visualization and Computer Graphics
Subjects: Human-Computer Interaction (cs.HC); Machine Learning (cs.LG)

Vision-based deep learning (DL) methods have made great progress in learning autonomous driving models from large-scale crowd-sourced video datasets. They are trained to predict instantaneous driving behaviors from video data captured by on-vehicle cameras. In this paper, we develop a geo-context aware visualization system for the study of Autonomous Driving Model (ADM) predictions together with large-scale ADM video data. The visual study is seamlessly integrated with the geographical environment by combining DL model performance with geospatial visualization techniques. Model performance measures can be studied together with a set of geospatial attributes over map views. Users can also discover and compare prediction behaviors of multiple DL models in both city-wide and street-level analysis, together with road images and video contents. Therefore, the system provides a new visual exploration platform for DL model designers in autonomous driving. Use cases and domain expert evaluation show the utility and effectiveness of the visualization system.

[208]  arXiv:2109.10896 [pdf, other]
Title: Updating Embeddings for Dynamic Knowledge Graphs
Subjects: Machine Learning (cs.LG)

Data in Knowledge Graphs often represents part of the current state of the real world. Thus, to stay up-to-date the graph data needs to be updated frequently. To utilize information from Knowledge Graphs, many state-of-the-art machine learning approaches use embedding techniques. These techniques typically compute an embedding, i.e., vector representations of the nodes as input for the main machine learning algorithm. If a graph update occurs later on -- specifically when nodes are added or removed -- the training has to be done all over again. This is undesirable, because of the time it takes and also because downstream models which were trained with these embeddings have to be retrained if they change significantly. In this paper, we investigate embedding updates that do not require full retraining and evaluate them in combination with various embedding models on real dynamic Knowledge Graphs covering multiple use cases. We study approaches that place newly appearing nodes optimally according to local information, but notice that this does not work well. However, we find that if we continue the training of the old embedding, interleaved with epochs during which we only optimize for the added and removed parts, we obtain good results in terms of typical metrics used in link prediction. This performance is obtained much faster than with a complete retraining and hence makes it possible to maintain embeddings for dynamic Knowledge Graphs.

[209]  arXiv:2109.10897 [pdf]
Title: ProvLet: A Provenance Management Service for Long Tail Microscopy Data
Comments: 5 pages, 5 figures
Subjects: Cryptography and Security (cs.CR); Systems and Control (eess.SY)

Provenance management must be present to enhance the overall security and reliability of long-tail microscopy (LTM) data management systems. However, there are challenges in provenance for domains with LTM data. The provenance data need to be collected more frequently, which increases system overheads (in terms of computation and storage) and results in scalability issues. Moreover, in most scientific application domains a provenance solution must consider network-related events as well. Therefore, provenance data in LTM data management systems are highly diverse and must be organized and processed carefully. In this paper, we introduce a novel provenance service, called ProvLet, to collect, distribute, analyze, and visualize provenance data in LTM data management systems. Specifically, (1) we address how to filter and store the desired transactions on disk; (2) we consider a data organization model at higher-level data abstractions, suitable for step-by-step scientific experiments, such as datasets and collections, and develop provenance algorithms over these data abstractions, rather than solutions considering low-level abstractions such as files and folders; and (3) we utilize ProvLet's log files and visualize provenance information for further forensic exploration. The validation of ProvLet with actual long-tail microscopy data, collected over a period of six years, shows a provenance service that yields a low system overhead and enables scalability.

[210]  arXiv:2109.10899 [pdf]
Title: Learning Geometric Transformations for Parametric Design: An Augmented Reality (AR)-Powered Approach
Subjects: Human-Computer Interaction (cs.HC)

Despite the remarkable development of parametric modeling methods for architectural design, a significant problem still exists: the lack of knowledge and skill regarding the professional implementation of parametric design in architectural modeling. Considering the numerous advantages of digital/parametric modeling in rapid prototyping and simulation, most instructors encourage students to use digital modeling even from the early stages of design; however, an appropriate context to learn the basics of digital design thinking is rarely provided in architectural pedagogy. This paper presents an educational tool, specifically an Augmented Reality (AR) intervention, to help students understand the fundamental concepts of parametric modeling before diving into complex parametric modeling platforms. The goal of the AR intervention is to illustrate geometric transformation and the associated math functions so that students learn the mathematical logic behind the algorithmic thinking of parametric modeling. We have developed BRICKxAR_T, an educational AR prototype that intends to help students learn geometric transformations in an immersive spatial AR environment. A LEGO set is used within the AR intervention as a physical manipulative to support physical interaction and improve spatial skill through body gesture.

[211]  arXiv:2109.10900 [pdf, other]
Title: Towards Multi-Agent Reinforcement Learning using Quantum Boltzmann Machines
Comments: Submitted to ICAART 2022, 10 pages, 11 figures
Subjects: Artificial Intelligence (cs.AI); Multiagent Systems (cs.MA)

Reinforcement learning has driven impressive advances in machine learning. Simultaneously, quantum-enhanced machine learning algorithms using quantum annealing are undergoing heavy development. Recently, a multi-agent reinforcement learning (MARL) architecture combining both paradigms has been proposed. This novel algorithm, which utilizes Quantum Boltzmann Machines (QBMs) for Q-value approximation, has outperformed regular deep reinforcement learning in terms of time-steps needed to converge. However, this algorithm was restricted to single-agent and small 2x2 multi-agent grid domains. In this work, we propose an extension to the original concept in order to solve more challenging problems. Similar to classic DQNs, we add an experience replay buffer and use different networks for approximating the target and policy values. The experimental results show that learning becomes more stable and enables agents to find optimal policies in grid-domains with higher complexity. Additionally, we assess how parameter sharing influences the agents' behavior in multi-agent domains. Quantum sampling proves to be a promising method for reinforcement learning tasks, but is currently limited by the QPU size and therefore by the size of the input and Boltzmann machine.
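
The two DQN-style additions named in this abstract, an experience replay buffer and separate policy and target networks, can be sketched generically as follows. This is a minimal illustration under assumptions: `ReplayBuffer`, `sync_target`, and the dictionary-based parameters are hypothetical names chosen here, and nothing below reflects the authors' QBM-based implementation.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of (state, action, reward, next_state, done) tuples."""

    def __init__(self, capacity, seed=0):
        # A bounded deque evicts the oldest transition once capacity is reached.
        self.buffer = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def push(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size):
        # Uniform sampling without replacement decorrelates consecutive steps.
        return self.rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def sync_target(policy_params, target_params):
    """Hard update: copy the policy parameters into the target network."""
    target_params.clear()
    target_params.update(policy_params)


# Hypothetical usage with toy transitions and toy parameters.
buffer = ReplayBuffer(capacity=3)
for t in range(5):
    buffer.push((t, 0, 1.0, t + 1, False))

policy = {"w": 0.5}
target = {"w": 0.0}
sync_target(policy, target)
batch = buffer.sample(2)
```

In a full training loop, the target parameters would typically be synchronized only every fixed number of steps, so that the bootstrapped targets change more slowly than the policy values.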

Cross-lists for Thu, 23 Sep 21

[212]  arXiv:2109.10241 (cross-list from physics.hist-ph) [pdf, other]
Title: Life, the universe and the hidden meaning of everything
Comments: 6 pages, 2 figures. Request for comments
Subjects: History and Philosophy of Physics (physics.hist-ph); Cosmology and Nongalactic Astrophysics (astro-ph.CO); Artificial Intelligence (cs.AI); Data Analysis, Statistics and Probability (physics.data-an)

It is hard to look at the universe and not wonder about the meaning, of, well, everything. A natural question is whether what we see is a sign of intelligent design. The antithesis of design would be a random universe or, assuming laws of physics, one whose fundamental physical parameters were randomly selected, but conditioned on life (ourselves) being here to observe it. In unpublished work, the British physicist Dennis Sciama argued that such a randomly selected universe would display a statistical signature. He concluded that a random universe would almost certainly have parameters only just allowing for the possibility of life. Here we consider whether this signature is definitive. We find that with plausible additional assumptions Sciama's signature would appear to reverse: Were our universe random, it could give the false impression of being intelligently designed, with the fundamental constants appearing to be fine-tuned to a strong probability for life to emerge and be maintained.

[213]  arXiv:2109.10353 (cross-list from eess.IV) [pdf, other]
Title: An Ultra-Fast Method for Simulation of Realistic Ultrasound Images
Comments: arXiv admin note: text overlap with arXiv:2109.09969
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)

Convolutional neural networks (CNNs) have attracted a rapidly growing interest in a variety of different processing tasks in the medical ultrasound community. However, the performance of CNNs is highly reliant on both the amount and fidelity of the training data. Therefore, scarce data is almost always a concern, particularly in the medical field, where clinical data is not easily accessible. The utilization of synthetic data is a popular approach to address this challenge. However, simulating a large number of images using packages such as Field II is time-consuming, and the distribution of simulated images is far from that of the real images. Herein, we introduce a novel ultra-fast ultrasound image simulation method based on the Fourier transform and evaluate its performance in a lesion segmentation task. We demonstrate that data augmentation using the images generated by the proposed method substantially outperforms Field II in terms of Dice similarity coefficient, while the simulation is almost 36000 times faster (both on CPU).
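
For readers unfamiliar with the evaluation metric, the Dice similarity coefficient between a predicted mask A and a reference mask B is 2|A ∩ B| / (|A| + |B|). The sketch below computes it on flattened binary masks; the function name `dice` and the convention of returning 1.0 when both masks are empty are assumptions made for illustration, not details from the paper.

```python
def dice(pred, target):
    """Dice similarity coefficient for two equal-length binary masks (0/1 values)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy masks: one overlapping pixel, two predicted pixels, one reference pixel.
score = dice([1, 1, 0, 0], [1, 0, 0, 0])
```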

[214]  arXiv:2109.10360 (cross-list from astro-ph.CO) [pdf, other]
Title: Robust marginalization of baryonic effects for cosmological inference at the field level
Comments: 7 pages, 4 figures. Second paper of a series of four. The 2D maps, codes, and network weights used in this paper are publicly available at this https URL
Subjects: Cosmology and Nongalactic Astrophysics (astro-ph.CO); Astrophysics of Galaxies (astro-ph.GA); Instrumentation and Methods for Astrophysics (astro-ph.IM); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

We train neural networks to perform likelihood-free inference from $(25\,h^{-1}{\rm Mpc})^2$ 2D maps containing the total mass surface density from thousands of hydrodynamic simulations of the CAMELS project. We show that the networks can extract information beyond one-point functions and power spectra from all resolved scales ($\gtrsim 100\,h^{-1}{\rm kpc}$) while performing a robust marginalization over baryonic physics at the field level: the model can infer the value of $\Omega_{\rm m} (\pm 4\%)$ and $\sigma_8 (\pm 2.5\%)$ from simulations completely different to the ones used to train it.

[215]  arXiv:2109.10399 (cross-list from physics.ao-ph) [pdf, other]
Title: Learned Benchmarks for Subseasonal Forecasting
Comments: 15 pages of main paper and 18 pages of appendix text
Subjects: Atmospheric and Oceanic Physics (physics.ao-ph); Machine Learning (cs.LG); Machine Learning (stat.ML)

We develop a subseasonal forecasting toolkit of simple learned benchmark models that outperform both operational practice and state-of-the-art machine learning and deep learning methods. Our new models include (a) Climatology++, an adaptive alternative to climatology that, for precipitation, is 9% more accurate and 250% more skillful than the United States operational Climate Forecasting System (CFSv2); (b) CFSv2++, a learned CFSv2 correction that improves temperature and precipitation accuracy by 7-8% and skill by 50-275%; and (c) Persistence++, an augmented persistence model that combines CFSv2 forecasts with lagged measurements to improve temperature and precipitation accuracy by 6-9% and skill by 40-130%. Across the contiguous U.S., our Climatology++, CFSv2++, and Persistence++ toolkit consistently outperforms standard meteorological baselines, state-of-the-art machine and deep learning methods, and the European Centre for Medium-Range Weather Forecasts ensemble. Overall, we find that augmenting traditional forecasting approaches with learned enhancements yields an effective and computationally inexpensive strategy for building the next generation of subseasonal forecasting benchmarks.

[216]  arXiv:2109.10404 (cross-list from eess.SP) [pdf]
Title: Digital Signal Processing Using Deep Neural Networks
Subjects: Signal Processing (eess.SP); Machine Learning (cs.LG)

Currently there is great interest in the utility of deep neural networks (DNNs) for the physical layer of radio frequency (RF) communications. In this manuscript, we describe a custom DNN specially designed to solve problems in the RF domain. Our model leverages the mechanisms of feature extraction and attention through the combination of an autoencoder convolutional network with a transformer network, to accomplish several important communications network and digital signal processing (DSP) tasks. We also present a new open dataset and physical data augmentation model that enables training of DNNs that can perform automatic modulation classification, infer and correct transmission channel effects, and directly demodulate baseband RF signals.

[217]  arXiv:2109.10452 (cross-list from stat.ML) [pdf, other]
Title: Personalized Online Machine Learning
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)

In this work, we introduce the Personalized Online Super Learner (POSL) -- an online ensembling algorithm for streaming data whose optimization procedure accommodates varying degrees of personalization. Namely, POSL optimizes predictions with respect to baseline covariates, so personalization can vary from completely individualized (i.e., optimization with respect to baseline covariate subject ID) to many individuals (i.e., optimization with respect to common baseline covariates). As an online algorithm, POSL learns in real-time. POSL can leverage a diversity of candidate algorithms, including online algorithms with different training and update times, fixed algorithms that are never updated during the procedure, pooled algorithms that learn from many individuals' time-series, and individualized algorithms that learn from within a single time-series. POSL's ensembling of this hybrid of base learning strategies depends on the amount of data collected, the stationarity of the time-series, and the mutual characteristics of a group of time-series. In essence, POSL decides whether to learn across samples, through time, or both, based on the underlying (unknown) structure in the data. For a wide range of simulations that reflect realistic forecasting scenarios, and in a medical data application, we examine the performance of POSL relative to other current ensembling and online learning methods. We show that POSL is able to provide reliable predictions for time-series data and adjust to changing data-generating environments. We further cultivate POSL's practicality by extending it to settings where time-series enter/exit dynamically over chronological time.

[218]  arXiv:2109.10472 (cross-list from physics.med-ph) [pdf, other]
Title: Rotor Localization and Phase Mapping of Cardiac Excitation Waves using Deep Neural Networks
Subjects: Medical Physics (physics.med-ph); Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV); Biological Physics (physics.bio-ph)

The analysis of electrical impulse phenomena in cardiac muscle tissue is important for the diagnosis of heart rhythm disorders and other cardiac pathophysiology. Cardiac mapping techniques acquire numerous local temporal measurements and combine them to visualize the spread of electrophysiological wave phenomena across the heart surface. However, low spatial resolutions, sparse measurement locations, noise and other artifacts make it challenging to accurately visualize spatio-temporal activity. For instance, electro-anatomical catheter mapping is severely limited by the sparsity of the measurements and optical mapping is prone to noise and motion artifacts. In the past, several approaches have been proposed to obtain more reliable maps from noisy or sparse mapping data. Here, we demonstrate that deep learning can be used to compute phase maps and detect phase singularities from both noisy and sparse electrical mapping data with high precision and efficiency. The self-supervised deep learning approach is fundamentally different from classical phase mapping techniques. Rather than encoding a phase signal from time-series data, the network instead learns to directly associate short spatio-temporal sequences of electrical data with phase maps and the positions of phase singularities. Using this method, we were able to accurately compute phase maps and locate rotor cores even from extremely sparse and noisy data, generated from both optical mapping experiments and computer simulations. Neural networks are a promising alternative to conventional phase mapping and rotor core localization methods, that could be used in optical mapping studies in basic cardiovascular research as well as in the clinical setting for the analysis of atrial fibrillation.

[219]  arXiv:2109.10474 (cross-list from q-bio.QM) [pdf, other]
Title: Rapid detection and recognition of whole brain activity in a freely behaving Caenorhabditis elegans
Subjects: Quantitative Methods (q-bio.QM); Computer Vision and Pattern Recognition (cs.CV)

Advanced volumetric imaging methods and genetically encoded activity indicators have permitted a comprehensive characterization of whole brain activity at single neuron resolution in \textit{Caenorhabditis elegans}. The constant motion and deformation of the nematode nervous system, however, impose a great challenge for a consistent identification of densely packed neurons in a behaving animal. Here, we propose a cascade solution for long-term and rapid recognition of head ganglion neurons in a freely moving \textit{C. elegans}. First, potential neuronal regions from a stack of fluorescence images are detected by a deep learning algorithm. Second, 2-dimensional neuronal regions are fused into 3-dimensional neuron entities. Third, by exploiting the neuronal density distribution surrounding a neuron and relative positional information between neurons, a multi-class artificial neural network transforms engineered neuronal feature vectors into digital neuronal identities. Under the constraint of a small number (20-40 volumes) of training samples, our bottom-up approach is able to process each volume - $1024 \times 1024 \times 18$ in voxels - in less than 1 second and achieves an accuracy of $91\%$ in neuronal detection and $74\%$ in neuronal recognition. Our work represents an important development towards a rapid and fully automated algorithm for decoding whole brain activity underlying natural animal behaviors.

[220]  arXiv:2109.10481 (cross-list from math.ST) [pdf, other]
Title: Sparse Uniformity Testing
Comments: 32 pages, 1 figure
Subjects: Statistics Theory (math.ST); Information Theory (cs.IT)

In this paper we consider the uniformity testing problem for high-dimensional discrete distributions (multinomials) under sparse alternatives. More precisely, we derive sharp detection thresholds for testing, based on $n$ samples, whether a discrete distribution supported on $d$ elements differs from the uniform distribution only in $s$ (out of the $d$) coordinates and is $\varepsilon$-far (in total variation distance) from uniformity. Our results reveal various interesting phase transitions which depend on the interplay of the sample size $n$ and the signal strength $\varepsilon$ with the dimension $d$ and the sparsity level $s$. For instance, if the sample size is less than a threshold (which depends on $d$ and $s$), then all tests are asymptotically powerless, irrespective of the magnitude of the signal strength. On the other hand, if the sample size is above the threshold, then the detection boundary undergoes a further phase transition depending on the signal strength. Here, a $\chi^2$-type test attains the detection boundary in the dense regime, whereas in the sparse regime a Bonferroni correction of two maximum-type tests and a version of the Higher Criticism test is optimal up to sharp constants. These results combined provide a complete description of the phase diagram for the sparse uniformity testing problem across all regimes of the parameters $n$, $d$, and $s$. One of the challenges in dealing with multinomials is that the parameters are always constrained to lie in the simplex. This results in the aforementioned two-layered phase transition, a new phenomenon which does not arise in classical high-dimensional sparse testing problems.
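
As a point of reference for the dense regime mentioned above, a textbook Pearson $\chi^2$ statistic against the uniform distribution on $d$ elements can be sketched as follows. This is an assumption-laden illustration: the paper's "$\chi^2$-type" test may use a different normalization or centering, and the function name `chi_square_uniformity` is hypothetical.

```python
from collections import Counter


def chi_square_uniformity(samples, d):
    """Pearson chi-square statistic of samples in {0, ..., d-1} against uniformity."""
    n = len(samples)
    expected = n / d  # expected count per element under the uniform null
    counts = Counter(samples)
    return sum((counts.get(j, 0) - expected) ** 2 / expected for j in range(d))


# Toy data: perfectly balanced counts versus all mass on a single element.
balanced = chi_square_uniformity([0, 1, 2, 3] * 5, d=4)
skewed = chi_square_uniformity([0] * 20, d=4)
```

A large value of the statistic is evidence against uniformity; the thresholds at which such a test becomes powerful are what the abstract's phase diagram describes.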

[221]  arXiv:2109.10496 (cross-list from cond-mat.soft) [pdf, other]
Title: Vibration Improves Performance in Granular Jamming Grippers
Comments: This paper is under consideration for publication by IEEE and may be removed without notice
Subjects: Soft Condensed Matter (cond-mat.soft); Robotics (cs.RO)

Granular jamming is a popular soft robotics technology that has seen recent widespread applications including industrial gripping, surgical robotics and haptics. However, to date the field has not fully exploited the fundamental science of the jamming phase transition, which has been rigorously studied in the field of statistical and condensed matter physics. This work introduces vibration as a means to improve the properties of granular jamming grippers through vibratory fluidisation and the exploitation of resonant modes within the granular material. We show that vibration in soft jamming grippers can improve holding strength, reduce the downwards force needed for the gripping action, and lead to a simplified setup where the second air pump, generally used for unjamming, could be removed. In a series of studies, we show that frequency and amplitude of the waveforms are key determinants to performance, and that jamming performance is also dependent on temporal properties of the induced waveform. We hope to encourage further study in transitioning fundamental jamming mechanisms into a soft robotics context to improve performance and increase diversity of applications for granular jamming grippers.

[222]  arXiv:2109.10499 (cross-list from eess.IV) [pdf, other]
Title: Joint Optical Neuroimaging Denoising with Semantic Tasks
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)

Optical neuroimaging is a vital tool for understanding the brain structure and the connection between regions and nuclei. However, the image noise introduced in the sample preparation and the imaging system hinders the extraction of knowledge from the dataset, so denoising of optical neuroimaging data is usually necessary. Supervised denoising methods often outperform unsupervised ones, but training supervised denoising models requires corresponding clean labels, which are not always available due to the high labeling cost. On the other hand, semantic labels, such as the located soma positions, the reconstructed neuronal fibers, and the nuclei segmentation results, are generally available and accumulated from everyday neuroscience research. This work connects a supervised denoising model and a semantic segmentation model to form an end-to-end model, which can make use of the semantic labels while still providing a denoised image as an intermediate product. We use both supervised and self-supervised models for the denoising and introduce a new cost term for the joint denoising and segmentation setup. We test the proposed approach on both synthetic data and real-world data, including an optical neuroimaging dataset and an electron microscope dataset. The results show that the joint denoising result outperforms the one using the denoising method alone, and that the joint model benefits segmentation and other downstream tasks as well.

[223]  arXiv:2109.10503 (cross-list from astro-ph.EP) [pdf, other]
Title: Identifying Potential Exomoon Signals with Convolutional Neural Networks
Comments: 14 pages, 13 figures, 1 table. Accepted for publication in Monthly Notices of the Royal Astronomical Society, 15 September 2021
Subjects: Earth and Planetary Astrophysics (astro-ph.EP); Machine Learning (cs.LG)

Targeted observations of possible exomoon host systems will remain difficult to obtain and time-consuming to analyze in the foreseeable future. As such, time-domain surveys such as Kepler, K2 and TESS will continue to play a critical role as the first step in identifying candidate exomoon systems, which may then be followed-up with premier ground- or space-based telescopes. In this work, we train an ensemble of convolutional neural networks (CNNs) to identify candidate exomoon signals in single-transit events observed by Kepler. Our training set consists of ${\sim}$27,000 examples of synthetic, planet-only and planet+moon single transits, injected into Kepler light curves. We achieve up to 88\% classification accuracy with individual CNN architectures and 97\% precision in identifying the moons in the validation set when the CNN ensemble is in total agreement. We then apply the CNN ensemble to light curves from 1880 Kepler Objects of Interest with periods $>10$ days ($\sim$57,000 individual transits), and further test the accuracy of the CNN classifier by injecting planet transits into each light curve, thus quantifying the extent to which residual stellar activity may result in false positive classifications. We find a small fraction of these transits contain moon-like signals, though we caution against strong inferences of the exomoon occurrence rate from this result. We conclude by discussing some ongoing challenges to utilizing neural networks for the exomoon search.

[224]  arXiv:2109.10505 (cross-list from eess.SP) [pdf, ps, other]
Title: Sensor-Based Satellite IoT for Early Wildfire Detection
Comments: To appear in IEEE GLOBECOM 2021 Workshops
Subjects: Signal Processing (eess.SP); Information Theory (cs.IT); Systems and Control (eess.SY)

Frequent and severe wildfires have been observed lately on a global scale. Wildfires not only threaten lives and properties, but also pose negative environmental impacts that transcend national boundaries (e.g., greenhouse gas emission and global warming). Thus, early wildfire detection with timely feedback is much needed. We propose to use the emerging beyond fifth-generation (B5G) and sixth-generation (6G) satellite Internet of Things (IoT) communication technology to enable massive sensor deployment for wildfire detection. We propose wildfire and carbon emission models that take into account real environmental data including wind speed, soil wetness, and biomass, to simulate the fire spreading process and quantify the fire burning areas, carbon emissions, and economic benefits of the proposed system against the backdrop of recent California wildfires. We also conduct a satellite IoT feasibility check by analyzing the satellite link budget. Future research directions to further illustrate the promise of the proposed system are discussed.

[225]  arXiv:2109.10553 (cross-list from eess.SP) [pdf]
Title: On the Comparison of Single-Carrier vs. Digital Multi-Carrier Signaling for Long-Haul Transmission of Probabilistically Shaped Constellation Formats
Comments: presented in Optical Fiber Communication (OFC) Conference 2021, June 2021; Paper M3H.6
Subjects: Signal Processing (eess.SP); Information Theory (cs.IT)

We report on theoretical and experimental investigations of the nonlinear tolerance of single carrier and digital multicarrier approaches with probabilistically shaped constellations. Experimental transmission of PCS16QAM is assessed at 120 GBd over an ultra-long-haul distance.

[226]  arXiv:2109.10581 (cross-list from eess.SP) [pdf, other]
Title: Deep Augmented MUSIC Algorithm for Data-Driven DoA Estimation
Comments: Submitted to ICASSP2022
Subjects: Signal Processing (eess.SP); Machine Learning (cs.LG)

Direction of arrival (DoA) estimation is a crucial task in sensor array signal processing, giving rise to various successful model-based (MB) algorithms as well as recently developed data-driven (DD) methods. This paper introduces a new hybrid MB/DD DoA estimation architecture, based on the classical multiple signal classification (MUSIC) algorithm. Our approach augments crucial aspects of the original MUSIC structure with specifically designed neural architectures, allowing it to overcome certain limitations of the purely MB method, such as its inability to successfully localize coherent sources. The deep augmented MUSIC algorithm is shown to outperform its unaltered version with a superior resolution.

[227]  arXiv:2109.10594 (cross-list from math.CO) [pdf, other]
Title: On the Connectivity and the Diameter of Betweenness-Uniform Graphs
Subjects: Combinatorics (math.CO); Discrete Mathematics (cs.DM)

Betweenness centrality is a centrality measure based on the overall number of shortest paths passing through a given vertex. A graph is betweenness-uniform if all its vertices have the same betweenness centrality. We study the properties of betweenness-uniform graphs. In particular, we show that every connected betweenness-uniform graph is either a cycle or a $3$-connected graph. Also, we show that betweenness-uniform graphs of high maximal degree have small diameter.
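
The definition above can be checked on small graphs. The sketch below uses Brandes' algorithm (a standard technique, not something taken from the paper) to compute betweenness centrality on unweighted, undirected graphs, and then tests uniformity; the helper names and the adjacency-list representation are assumptions made for illustration.

```python
from collections import deque


def betweenness(adj):
    """Brandes' algorithm for an unweighted, undirected graph given as adjacency lists."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        order = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s] = 1
        dist[s] = 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate pair dependencies in reverse breadth-first order.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair of endpoints was counted twice, once per source.
    return {v: b / 2 for v, b in bc.items()}


def is_betweenness_uniform(adj, tol=1e-9):
    values = list(betweenness(adj).values())
    return max(values) - min(values) <= tol


# A 6-cycle is betweenness-uniform; a path on 4 vertices is not.
cycle6 = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
path4 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
```

The cycle is consistent with the dichotomy stated in the abstract, while the path fails uniformity because its end vertices lie on no shortest path between other vertices.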

[228]  arXiv:2109.10601 (cross-list from eess.IV) [pdf, other]
Title: Efficient Context-Aware Network for Abdominal Multi-organ Segmentation
Authors: Fan Zhang, Yu Wang
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)

The contextual information present in abdominal CT scans is relatively consistent. In order to make full use of the overall 3D context, we develop a whole-volume-based coarse-to-fine framework for efficient and effective abdominal multi-organ segmentation. We propose a new efficientSegNet network, which is composed of an encoder, a decoder and a context block. For the decoder module, anisotropic convolution with a k*k*1 intra-slice convolution and a 1*1*k inter-slice convolution is designed to reduce the computation burden. For the context block, we propose a strip pooling module to capture the anisotropic and long-range contextual information that exists in abdominal scenes. In quantitative evaluation on the FLARE2021 validation cases, this method achieves an average dice similarity coefficient (DSC) of 0.895 and an average normalized surface distance (NSD) of 0.775. The average running time is 9.8 s per case in the inference phase, and the maximum GPU memory used is 1017 MB.
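
To see why the anisotropic factorization reduces computation, compare weight counts: a full k*k*k kernel has k^3 weights per input/output channel pair, while a k*k*1 kernel followed by a 1*1*k kernel has k^2 + k. The sketch below makes this arithmetic explicit; it assumes bias-free convolutions and an intermediate channel count equal to the output channel count, neither of which is specified in the abstract, and the function names are hypothetical.

```python
def conv3d_weights(c_in, c_out, kernel):
    """Number of weights in a bias-free 3D convolution with the given kernel shape."""
    k1, k2, k3 = kernel
    return c_in * c_out * k1 * k2 * k3


def anisotropic_weights(c_in, c_out, k):
    """Weights of a k*k*1 intra-slice conv followed by a 1*1*k inter-slice conv."""
    intra = conv3d_weights(c_in, c_out, (k, k, 1))
    inter = conv3d_weights(c_out, c_out, (1, 1, k))  # assumed: c_out intermediate channels
    return intra + inter


full = conv3d_weights(32, 32, (3, 3, 3))
factored = anisotropic_weights(32, 32, 3)
ratio = factored / full
```

For k = 3 and equal channel counts, the factored form uses 12/27 of the weights of the full kernel, and the saving grows with k.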

[229]  arXiv:2109.10623 (cross-list from stat.ML) [pdf, ps, other]
Title: Sharp Analysis of Random Fourier Features in Classification
Authors: Zhu Li
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)

We study the theoretical properties of random Fourier features classification with Lipschitz continuous loss functions such as support vector machine and logistic regression. Utilizing the regularity condition, we show for the first time that random Fourier features classification can achieve $O(1/\sqrt{n})$ learning rate with only $\Omega(\sqrt{n} \log n)$ features, as opposed to $\Omega(n)$ features suggested by previous results. Our study covers the standard feature sampling method for which we reduce the number of features required, as well as a problem-dependent sampling method which further reduces the number of features while still keeping the optimal generalization property. Moreover, we prove that the random Fourier features classification can obtain a fast $O(1/n)$ learning rate for both sampling schemes under Massart's low noise assumption. Our results demonstrate the potential effectiveness of random Fourier features approximation in reducing the computational complexity (roughly from $O(n^3)$ in time and $O(n^2)$ in space to $O(n^2)$ and $O(n\sqrt{n})$ respectively) without having to trade-off the statistical prediction accuracy. In addition, the achieved trade-off in our analysis is at least the same as the optimal results in the literature under the worst case scenario and significantly improves the optimal results under benign regularity conditions.
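
The random Fourier features approximation that this abstract analyzes can be illustrated for the Gaussian kernel $k(x, y) = \exp(-\|x - y\|^2 / (2\sigma^2))$: draw frequencies $w \sim N(0, \sigma^{-2} I)$ and phases $b \sim U[0, 2\pi]$, and use features $\sqrt{2/D}\cos(w^\top x + b)$. The sketch below is the standard construction rather than the paper's problem-dependent sampling scheme; the names `make_rff` and `gaussian_kernel`, the seed, and the toy inputs are assumptions.

```python
import math
import random


def make_rff(dim, num_features, sigma, seed=0):
    """Return a random Fourier feature map for the Gaussian kernel with bandwidth sigma."""
    rng = random.Random(seed)
    # Frequencies drawn from N(0, 1/sigma^2) per coordinate; phases uniform on [0, 2*pi].
    weights = [[rng.gauss(0.0, 1.0 / sigma) for _ in range(dim)]
               for _ in range(num_features)]
    offsets = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    scale = math.sqrt(2.0 / num_features)

    def feature_map(x):
        return [scale * math.cos(sum(wj * xj for wj, xj in zip(w, x)) + b)
                for w, b in zip(weights, offsets)]

    return feature_map


def gaussian_kernel(x, y, sigma):
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


# Toy comparison: the inner product of feature vectors approximates the kernel value.
x, y = [0.3, -0.1], [0.5, 0.4]
phi = make_rff(dim=2, num_features=4000, sigma=1.0)
approx = sum(a * b for a, b in zip(phi(x), phi(y)))
exact = gaussian_kernel(x, y, sigma=1.0)
```

Once the data are mapped to D features, a linear classifier can be trained on them, which is where the reduction in time and space complexity relative to a full kernel matrix comes from.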

[230]  arXiv:2109.10636 (cross-list from math.AP) [pdf, ps, other]
Title: Weak-strong Uniqueness for Heat Conducting non-Newtonian Incompressible Fluids
Subjects: Analysis of PDEs (math.AP); Numerical Analysis (math.NA)

In this work, we introduce a notion of dissipative weak solution for a system describing the evolution of a heat-conducting incompressible non-Newtonian fluid. This concept of solution is based on the balance of entropy instead of the balance of energy and has the advantage that it admits a weak-strong uniqueness principle, justifying the proposed formulation. We provide a proof of existence of solutions based on finite element approximations, thus obtaining the first convergence result of a numerical scheme for the full evolutionary system including temperature dependent coefficients and viscous dissipation terms. Then we proceed to prove the weak-strong uniqueness property of the system by means of a relative energy inequality.

[231]  arXiv:2109.10641 (cross-list from eess.IV) [pdf, other]
Title: Uncertainty-Aware Training for Cardiac Resynchronisation Therapy Response Prediction
Comments: STACOM 2021 Workshop
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

Evaluation of predictive deep learning (DL) models beyond conventional performance metrics has become increasingly important for applications in sensitive environments like healthcare. Such models might have the capability to encode and analyse large sets of data but they often lack comprehensive interpretability methods, preventing clinical trust in predictive outcomes. Quantifying uncertainty of a prediction is one way to provide such interpretability and promote trust. However, relatively little attention has been paid to how to include such requirements into the training of the model. In this paper we: (i) quantify the data (aleatoric) and model (epistemic) uncertainty of a DL model for Cardiac Resynchronisation Therapy response prediction from cardiac magnetic resonance images, and (ii) propose and perform a preliminary investigation of an uncertainty-aware loss function that can be used to retrain an existing DL image-based classification model to encourage confidence in correct predictions and reduce confidence in incorrect predictions. Our initial results are promising, showing a significant increase in the (epistemic) confidence of true positive predictions, with some evidence of a reduction in false negative confidence.

[232]  arXiv:2109.10666 (cross-list from math.OC) [pdf, other]
Title: Optimal Control for Linear Networked Control Systems with Information Transmission Constraints
Comments: Published in the 60th IEEE Conference on Decision and Control (2021). 8 pages, 4 figures, 1 table
Subjects: Optimization and Control (math.OC); Systems and Control (eess.SY)

This paper addresses the problem of robust control of a linear discrete-time system subject to bounded disturbances and to measurement and control budget constraints.
Using Q-parameterization and a polytope containment method, we prove that the co-design of an affine feedback controller, a measurement schedule and a control schedule can be exactly formulated as a mixed integer linear program with 2 binary variables per time step. As a consequence, this problem can be solved efficiently, even when an exhaustive search for measurement and control times would have been impossible in a reasonable amount of time.

[233]  arXiv:2109.10674 (cross-list from eess.IV) [pdf]
Title: Self-Training Based Unsupervised Cross-Modality Domain Adaptation for Vestibular Schwannoma and Cochlea Segmentation
Comments: 6 pages, 5 figures, MICCAI 2021 Cross-Modality Domain Adaptation for Medical Image Segmentation Challenge
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)

With the advances of deep learning, many medical image segmentation studies achieve human-level performance under fully supervised conditions. However, it is extremely expensive to acquire annotations for every sample in medical fields, especially for magnetic resonance images (MRI), which comprise many different contrasts. Unsupervised methods can alleviate this problem; however, a performance drop is inevitable compared to fully supervised methods. In this work, we propose a self-training based unsupervised-learning framework that performs automatic segmentation of Vestibular Schwannoma (VS) and cochlea on high-resolution T2 scans. Our method consists of 4 main stages: 1) VS-preserving contrast conversion from contrast-enhanced T1 scans to high-resolution T2 scans, 2) training segmentation on generated T2 scans with annotations on T1 scans, 3) inferring pseudo-labels on non-annotated real T2 scans, and 4) boosting the generalizability of VS and cochlea segmentation by training with combined data (i.e., real T2 scans with pseudo-labels and generated T2 scans with true annotations). Our method achieved a mean Dice score and Average Symmetric Surface Distance (ASSD) of 0.8570 (0.0705) and 0.4970 (0.3391) for VS, and 0.8446 (0.0211) and 0.1513 (0.0314) for cochlea, on the CrossMoDA2021 challenge validation phase leaderboard, outperforming most other approaches.
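
The Dice score reported above is the standard overlap measure 2|A∩B| / (|A| + |B|) between a predicted and a reference mask. A minimal sketch follows; the function name `dice_score` and the empty-mask convention are illustrative assumptions, not taken from the challenge's evaluation code.

```python
import numpy as np

def dice_score(pred, target):
    """Dice coefficient 2|A∩B| / (|A| + |B|) for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    denom = pred.sum() + target.sum()
    if denom == 0:
        # Both masks empty: treated here as perfect agreement (a convention).
        return 1.0
    return 2.0 * np.logical_and(pred, target).sum() / denom
```

For example, a prediction covering two voxels of which one overlaps a one-voxel reference gives 2·1 / (2 + 1) ≈ 0.667.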

[234]  arXiv:2109.10679 (cross-list from physics.flu-dyn) [pdf, ps, other]
Title: Application of Video-to-Video Translation Networks to Computational Fluid Dynamics
Authors: Hiromitsu Kigure
Comments: Published in Frontiers in Artificial Intelligence
Subjects: Fluid Dynamics (physics.flu-dyn); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

In recent years, the evolution of artificial intelligence, especially deep learning, has been remarkable, and its application to various fields has been growing rapidly. In this paper, I report the results of the application of generative adversarial networks (GANs), specifically video-to-video translation networks, to computational fluid dynamics (CFD) simulations. The purpose of this research is to reduce the computational cost of CFD simulations with GANs. The architecture of GANs in this research is a combination of the image-to-image translation networks (the so-called "pix2pix") and Long Short-Term Memory (LSTM). It is shown that the results of high-cost and high-accuracy simulations (with high-resolution computational grids) can be estimated from those of low-cost and low-accuracy simulations (with low-resolution grids). In particular, the time evolution of density distributions in the cases of a high-resolution grid is reproduced from that in the cases of a low-resolution grid through GANs, and the density inhomogeneity estimated from the image generated by GANs recovers the ground truth with good accuracy. Qualitative and quantitative comparisons of the results of the proposed method with those of several super-resolution algorithms are also presented.

[235]  arXiv:2109.10681 (cross-list from stat.ML) [pdf, ps, other]
Title: A Latent Restoring Force Approach to Nonlinear System Identification
Comments: 18 pages, 11 figures, preprint submitted to Mechanical Systems and Signal Processing
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Systems and Control (eess.SY)

Identification of nonlinear dynamic systems remains a significant challenge across engineering. This work suggests an approach based on Bayesian filtering to extract and identify the contribution of an unknown nonlinear term in the system which can be seen as an alternative viewpoint on restoring force surface type approaches. To achieve this identification, the contribution which is the nonlinear restoring force is modelled, initially, as a Gaussian process in time. That Gaussian process is converted into a state-space model and combined with the linear dynamic component of the system. Then, by inference of the filtering and smoothing distributions, the internal states of the system and the nonlinear restoring force can be extracted. In possession of these states a nonlinear model can be constructed. The approach is demonstrated to be effective in both a simulated case study and on an experimental benchmark dataset.
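
The conversion of a Gaussian process in time into a state-space model can be illustrated with the simplest case. The abstract does not name a kernel, so the sketch below assumes a Matérn-1/2 (Ornstein-Uhlenbeck) kernel k(τ) = σ² exp(−|τ|/ℓ), which has an exact one-dimensional discrete-time form; `ou_state_space` and `kalman_step` are hypothetical names, and the scalar filter stands in for the full filtering and smoothing step.

```python
import math

def ou_state_space(sigma, length_scale, dt):
    """Discrete-time form f[k+1] = a * f[k] + w, w ~ N(0, q), of a GP with
    kernel sigma^2 * exp(-|tau| / length_scale) sampled every dt."""
    a = math.exp(-dt / length_scale)
    q = sigma**2 * (1.0 - a**2)  # keeps the stationary variance at sigma^2
    return a, q

def kalman_step(mean, var, y, a, q, r):
    """One predict/update step for a noisy observation y = f + N(0, r)."""
    # Predict through the state-space model.
    mean_pred = a * mean
    var_pred = a * a * var + q
    # Update with the measurement.
    gain = var_pred / (var_pred + r)
    mean_new = mean_pred + gain * (y - mean_pred)
    var_new = (1.0 - gain) * var_pred
    return mean_new, var_new
```

In the setting described above, this latent state would be stacked with the states of the linear dynamic component before filtering; that coupling is omitted here.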

[236]  arXiv:2109.10731 (cross-list from eess.IV) [pdf, other]
Title: Automatic Plane Adjustment of Orthopedic Intra-operative Flat Panel Detector CT-Volumes
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Medical Physics (physics.med-ph)

3D volumes are often acquired to assess the result in orthopedic trauma surgery. With a mobile C-Arm system, these acquisitions can be performed intra-operatively, which reduces the number of required revision surgeries. However, due to the operating room setup, the acquisitions typically cannot be performed such that the acquired volumes are aligned to the anatomical regions. Thus, the multiplanar reconstructed (MPR) planes need to be adjusted manually during the review of the volume. In this paper, we present a detailed study of multi-task learning (MTL) regression networks to estimate the parameters of the MPR planes.
First, various mathematical descriptions for rotation, including Euler angle, quaternion, and matrix representation, are reviewed. Then, three different MTL network architectures based on the PoseNet are compared with a single task learning network.
Using a matrix description rather than the Euler angle description, the accuracy of the regressed normals improves from $7.7^{\circ}$ to $7.3^{\circ}$ in the mean value for single anatomies. The multi-head approach improves the regression of the plane position from $7.4$ mm to $6.1$ mm, while the orientation does not benefit from this approach.
The results show that a multi-head approach can lead to slightly better results than the individual tasks networks. The most important benefit of the MTL approach is that it is a single network for standard plane regression for all body regions with a reduced number of stored parameters.
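
The orientation accuracy quoted in degrees can be read as the angle between a regressed plane normal and the reference normal. A minimal sketch of that measure is given below; the function name `angular_error_deg` is an assumption, and the paper's exact error definition may differ.

```python
import numpy as np

def angular_error_deg(n_pred, n_true):
    """Angle in degrees between two plane normals (need not be unit length)."""
    n_pred = np.asarray(n_pred, dtype=float)
    n_true = np.asarray(n_true, dtype=float)
    cos = np.dot(n_pred, n_true) / (np.linalg.norm(n_pred) * np.linalg.norm(n_true))
    # Clip guards against rounding slightly outside [-1, 1].
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))
```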

[237]  arXiv:2109.10756 (cross-list from math.OC) [pdf, other]
Title: Constrained multi-agent ergodic area surveying control based on finite element approximation of the potential field
Subjects: Optimization and Control (math.OC); Multiagent Systems (cs.MA); Robotics (cs.RO); Systems and Control (eess.SY)

Heat Equation Driven Area Coverage (HEDAC) is a state-of-the-art multi-agent ergodic motion control guided by a gradient of a potential field. A finite element method is hereby implemented to obtain a solution of the Helmholtz partial differential equation, which models the potential field for surveying motion control. This allows us to survey arbitrarily shaped domains and to include obstacles in an elegant and robust manner intrinsic to HEDAC's fundamental idea. For a simple kinematic motion, the obstacle and boundary avoidance constraints are successfully handled by directing the agent motion with the gradient of the potential. However, including additional constraints, such as the minimal clearance distance from stationary and moving obstacles and the minimal path curvature radius, requires further alterations of the control algorithm. We introduce a relatively simple yet robust approach for handling these constraints by formulating a straightforward optimization problem based on collision-free escape route maneuvers. This approach provides a guaranteed collision avoidance mechanism, while being computationally inexpensive as a result of the optimization problem partitioning. The proposed motion control is evaluated in simulations of three realistic surveying scenarios, showing the effectiveness of the surveying and the robustness of the control algorithm. Furthermore, potential maneuvering difficulties due to improperly defined surveying scenarios are highlighted and we provide guidelines on how to overcome them. The results are promising and indicate real-world applicability of the proposed constrained multi-agent motion control for autonomous surveying and potentially other HEDAC utilizations.

[238]  arXiv:2109.10759 (cross-list from astro-ph.IM) [pdf, ps, other]
Title: Astronomical Pipeline Provenance: A Use Case Evaluation
Comments: 9 pages, 1 table
Journal-ref: 13th International Workshop on Theory and Practice of Provenance (2021) 54
Subjects: Instrumentation and Methods for Astrophysics (astro-ph.IM); Information Retrieval (cs.IR)

In this decade astronomy is undergoing a paradigm shift to handle data from next generation observatories such as the Square Kilometre Array (SKA) or the Vera C. Rubin Observatory (LSST). Producing real time data streams of up to 10 TB/s and data products of the order of 600 Pbytes/year, the SKA will be the biggest civil data producing machine of the world that demands novel solutions on how these data volumes can be stored and analysed. Through the use of complex, automated pipelines the provenance of this real time data processing is key to establish confidence within the system, its final data products, and ultimately its scientific results.
The intention of this paper is to lay the foundation for making an automated provenance generation tool for astronomical/data-processing pipelines. We therefore present a use case analysis, specific to the astronomical needs which addresses the issues of trust and reproducibility as well as other ulterior use cases which are of interest to astronomers. This analysis is subsequently used as the basis to discuss the requirements, challenges, and opportunities involved in designing both the tool and the associated provenance model.

[239]  arXiv:2109.10765 (cross-list from physics.flu-dyn) [pdf, other]
Title: An artificial neural network approach to bifurcating phenomena in computational fluid dynamics
Comments: 28 pages, 22 figures
Subjects: Fluid Dynamics (physics.flu-dyn); Machine Learning (cs.LG); Numerical Analysis (math.NA); Computational Physics (physics.comp-ph)

This work deals with the investigation of bifurcating fluid phenomena using a reduced order modelling setting aided by artificial neural networks. We discuss the POD-NN approach for non-smooth solution sets of nonlinear parametrized PDEs. Thus, we study the Navier-Stokes equations describing: (i) the Coanda effect in a channel, and (ii) the lid driven triangular cavity flow, in a physical/geometrical multi-parametrized setting, considering the effects of the domain's configuration on the position of the bifurcation points. Finally, we propose a reduced manifold-based bifurcation diagram for a non-intrusive recovery of the critical points evolution. Exploiting this detection tool, we are able to efficiently obtain information about the pattern flow behaviour, from symmetry breaking profiles to attaching/spreading vortices, even at high Reynolds numbers.

[240]  arXiv:2109.10793 (cross-list from math.OC) [pdf, other]
Title: Physics-informed Neural Networks-based Model Predictive Control for Multi-link Manipulators
Subjects: Optimization and Control (math.OC); Machine Learning (cs.LG); Robotics (cs.RO)

We discuss nonlinear model predictive control (NMPC) for multi-body dynamics via physics-informed machine learning methods. Physics-informed neural networks (PINNs) are a promising tool to approximate (partial) differential equations. PINNs are not suited for control tasks in their original form since they are not designed to handle variable control actions or variable initial values. We thus present the idea of enhancing PINNs by adding control actions and initial conditions as additional network inputs. The high-dimensional input space is subsequently reduced via a sampling strategy and a zero-hold assumption. This strategy enables the controller design based on a PINN as an approximation of the underlying system dynamics. The additional benefit is that the sensitivities are easily computed via automatic differentiation, thus leading to efficient gradient-based algorithms. Finally, we present our results using our PINN-based MPC to solve a tracking problem for a complex mechanical system, a multi-link manipulator.

[241]  arXiv:2109.10794 (cross-list from stat.ML) [pdf, ps, other]
Title: Entropic Issues in Likelihood-Based OOD Detection
Comments: NeurIPS Workshop Submission
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)

Deep generative models trained by maximum likelihood remain very popular methods for reasoning about data probabilistically. However, it has been observed that they can assign higher likelihoods to out-of-distribution (OOD) data than in-distribution data, thus calling into question the meaning of these likelihood values. In this work we provide a novel perspective on this phenomenon, decomposing the average likelihood into a KL divergence term and an entropy term. We argue that the latter can explain the curious OOD behaviour mentioned above, suppressing likelihood values on datasets with higher entropy. Although our idea is simple, we have not seen it explored yet in the literature. This analysis provides further explanation for the success of OOD detection methods based on likelihood ratios, as the problematic entropy term cancels out in expectation. Finally, we discuss how this observation relates to recent success in OOD detection with manifold-supported models, for which the above decomposition does not hold.
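
One standard identity with the shape described above is E_{x~p}[log q(x)] = −KL(p‖q) − H(p), where p is the distribution of the evaluated data and q the model. The sketch below checks it numerically for small discrete distributions; whether this is exactly the form used in the paper is an assumption, and `decompose_avg_loglik` is a hypothetical name.

```python
import numpy as np

def decompose_avg_loglik(p, q):
    """Return (average log-likelihood, KL(p||q), entropy H(p)) for discrete
    distributions p (data) and q (model) with strictly positive entries."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    avg_loglik = float(np.sum(p * np.log(q)))
    kl = float(np.sum(p * np.log(p / q)))
    entropy = float(-np.sum(p * np.log(p)))
    return avg_loglik, kl, entropy

# The average log-likelihood equals -KL(p||q) - H(p), so a dataset with
# higher entropy is pushed toward lower likelihood regardless of fit.
ll, kl, h = decompose_avg_loglik([0.5, 0.3, 0.2], [0.4, 0.4, 0.2])
```

For a ratio of two models evaluated on the same data, the H(p) terms cancel in expectation, leaving only a difference of KL terms.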

[242]  arXiv:2109.10833 (cross-list from quant-ph) [pdf, other]
Title: Bounds on approximating Max $k$XOR with quantum and classical local algorithms
Comments: 14+11 pages, 6 figures, code online at this https URL and this https URL
Subjects: Quantum Physics (quant-ph); Data Structures and Algorithms (cs.DS)

We consider the power of local algorithms for approximately solving Max $k$XOR, a generalization of two constraint satisfaction problems previously studied with classical and quantum algorithms (MaxCut and Max E3LIN2). On instances with either random signs or no overlapping clauses and $D+1$ clauses per variable, we calculate the average satisfying fraction of the depth-1 QAOA and compare with a generalization of the local threshold algorithm. Notably, the quantum algorithm outperforms the threshold algorithm for $k > 4$.
On the other hand, we highlight potential difficulties for the QAOA to achieve computational quantum advantage on this problem. We first compute a tight upper bound on the maximum satisfying fraction of nearly all large random regular Max $k$XOR instances by numerically calculating the ground state energy density $P(k)$ of a mean-field $k$-spin glass. The upper bound grows with $k$ much faster than the performance of both one-local algorithms. We also identify a new obstruction result for low-depth quantum circuits (including the QAOA) when $k=3$, generalizing a result of Bravyi et al [arXiv:1910.08980] when $k=2$. We conjecture that a similar obstruction exists for all $k$.
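
The "satisfying fraction" used above can be made concrete with a small evaluator. In this sketch a clause is a pair (variable indices, target parity) and is satisfied when the XOR of its variables equals the parity; this encoding and the name `satisfied_fraction` are illustrative assumptions rather than the authors' code.

```python
def satisfied_fraction(clauses, assignment):
    """Fraction of XOR clauses satisfied by a 0/1 assignment.

    clauses: iterable of (indices, parity), e.g. ((0, 1, 2), 1) for k = 3.
    assignment: sequence of 0/1 values indexed by variable.
    """
    clauses = list(clauses)
    satisfied = 0
    for indices, parity in clauses:
        # XOR of bits equals the sum of bits modulo 2.
        if sum(assignment[i] for i in indices) % 2 == parity:
            satisfied += 1
    return satisfied / len(clauses)
```

With k = 2 and all parities equal to 1, a clause is satisfied exactly when its two variables differ, which corresponds to counting cut edges in MaxCut.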

[243]  arXiv:2109.10834 (cross-list from astro-ph.SR) [pdf, other]
Title: SCSS-Net: Solar Corona Structures Segmentation by Deep Learning
Comments: accepted for publication in Monthly Notices of the Royal Astronomical Society; for associated code, see this https URL
Subjects: Solar and Stellar Astrophysics (astro-ph.SR); Instrumentation and Methods for Astrophysics (astro-ph.IM); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Space Physics (physics.space-ph)

Structures in the solar corona are the main drivers of space weather processes that might directly or indirectly affect the Earth. Thanks to the most recent space-based solar observatories, with capabilities to acquire high-resolution images continuously, the structures in the solar corona can be monitored over the years with a time resolution of minutes. For this purpose, we have developed a method for automatic segmentation of solar corona structures observed in the EUV spectrum that is based on a deep learning approach utilizing Convolutional Neural Networks. The available input datasets have been examined together with our own dataset based on the manual annotation of the target structures. Indeed, the input dataset is the main limitation of the developed model's performance. Our SCSS-Net model provides results for coronal holes and active regions that could be compared with other generally used methods for automatic segmentation. Moreover, it provides a universal procedure to identify structures in the solar corona with the help of the transfer learning technique. The outputs of the model can then be used for further statistical studies of connections between solar activity and the influence of space weather on Earth.

[244]  arXiv:2109.10849 (cross-list from eess.IV) [pdf, other]
Title: DVC-P: Deep Video Compression with Perceptual Optimizations
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)

Recent years have witnessed the significant development of learning-based video compression methods, which aim at optimizing objective or perceptual quality and bit rates. In this paper, we introduce deep video compression with perceptual optimizations (DVC-P), which aims at increasing perceptual quality of decoded videos. Our proposed DVC-P is based on Deep Video Compression (DVC) network, but improves it with perceptual optimizations. Specifically, a discriminator network and a mixed loss are employed to help our network trade off among distortion, perception and rate. Furthermore, nearest-neighbor interpolation is used to eliminate checkerboard artifacts which can appear in sequences encoded with DVC frameworks. Thanks to these two improvements, the perceptual quality of decoded sequences is improved. Experimental results demonstrate that, compared with the baseline DVC, our proposed method can generate videos with higher perceptual quality achieving 12.27% reduction in a perceptual BD-rate equivalent, on average.
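
Nearest-neighbor interpolation, mentioned above as the remedy for checkerboard artifacts, copies each input value into a block of identical outputs, so every output pixel receives the same contribution. A minimal NumPy sketch follows; the name `upsample_nearest` is an assumption, and where DVC-P applies this step in its network is not stated in the abstract.

```python
import numpy as np

def upsample_nearest(x, factor=2):
    """Nearest-neighbor upsampling of a 2-D array by an integer factor."""
    x = np.asarray(x)
    # Repeat each row, then each column, `factor` times.
    return np.repeat(np.repeat(x, factor, axis=0), factor, axis=1)
```

In learned decoders such an upsampling step is typically followed by an ordinary convolution.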

[245]  arXiv:2109.10854 (cross-list from math.OC) [pdf, ps, other]
Title: Imitation Learning of Stabilizing Policies for Nonlinear Systems
Authors: Sebastian East
Subjects: Optimization and Control (math.OC); Machine Learning (cs.LG)

There has been a recent interest in imitation learning methods that are guaranteed to produce a stabilizing control law with respect to a known system. Work in this area has generally considered linear systems and controllers, for which stabilizing imitation learning takes the form of a biconvex optimization problem. In this paper it is demonstrated that the same methods developed for linear systems and controllers can be readily extended to polynomial systems and controllers using sum of squares techniques. A projected gradient descent algorithm and an alternating direction method of multipliers algorithm are proposed as heuristics for solving the stabilizing imitation learning problem, and their performance is illustrated through numerical experiments.

[246]  arXiv:2109.10863 (cross-list from physics.soc-ph) [pdf]
Title: A Transportation Digital-Twin Approach for Adaptive Traffic Control Systems
Subjects: Physics and Society (physics.soc-ph); Systems and Control (eess.SY)

A transportation digital twin represents a digital version of a transportation physical object or process, such as a traffic signal controller, and thereby enables a two-way real-time data exchange between the physical twin and digital twin. This paper introduces a digital twin approach for adaptive traffic signal control (ATSC) to improve a traveler's driving experience by reducing and redistributing waiting time at an intersection. While an ATSC combined with a connected vehicle concept can reduce waiting time at an intersection and improve travel time in a signalized corridor, it is nearly impossible to reduce traffic delay for congested traffic conditions. To remedy this defect of the traditional ATSC with connected vehicle data, we have developed a digital twin-based ATSC (DT-based ATSC) that considers the waiting time of approaching vehicles towards a subject intersection along with the waiting time of those vehicles at the immediate upstream intersection. We conducted a case study using a microscopic traffic simulation, Simulation of Urban Mobility (SUMO), by developing a digital replica of a roadway network with signalized intersections in an urban setting where vehicle and traffic signal data were collected in real-time. Our analyses reveal that the DT-based ATSC outperforms the connected vehicle-based baseline ATSC in terms of average cumulative waiting time, distribution of drivers' waiting time, and level of service for each approach for different traffic demands and therefore demonstrates our method's superior efficacy.

[247]  arXiv:2109.10864 (cross-list from eess.SP) [pdf, other]
Title: Reliable Linearized Phase Retrieval for Near-Field Antenna Measurements with Truncated Measurement Surfaces
Comments: 7 pages, 5 figures, submitted to IEEE Transactions on Antennas and Propagation
Subjects: Signal Processing (eess.SP); Numerical Analysis (math.NA); Optimization and Control (math.OC)

Most methods tackling the phase retrieval problem of magnitude-only antenna measurements suffer from unrealistic sampling requirements, from unfeasible computational complexities, and, most severely, from the lacking reliability of nonlinear and nonconvex formulations. As an alternative, we propose a partially coherent (PC) multi-probe measurement technique and an associated linear reconstruction method which mitigate all these issues. Hence, reliable and accurate phase retrieval can be achieved in near-field far-field transformations (NFFFTs). In particular, we resolve the issues related to open measurement surfaces (as they may emerge in drone-based measurement setups) and we highlight the importance of considering the measurement setup and the phaseless NFFFT simultaneously. Specifically, the influence of special multi-probe arrangements on the reconstruction quality of PC solvers is shown.

[248]  arXiv:2109.10898 (cross-list from stat.ML) [pdf, other]
Title: A Robust Asymmetric Kernel Function for Bayesian Optimization, with Application to Image Defect Detection in Manufacturing Systems
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)

Response surface functions in complex engineering systems are often highly nonlinear, unformed, and expensive to evaluate. To tackle this challenge, Bayesian optimization, which conducts sequential design via a posterior distribution over the objective function, is a critical method used to find the global optimum of black-box functions. Kernel functions play an important role in shaping the posterior distribution of the estimated function. Widely used kernel functions, e.g., the radial basis function (RBF), are highly susceptible to outliers, whose presence causes the Gaussian process surrogate model to be sporadic. In this paper, we propose a robust kernel function, Asymmetric Elastic Net Radial Basis Function (AEN-RBF). Its validity as a kernel function and computational complexity are evaluated. When compared to the baseline RBF kernel, we prove theoretically that AEN-RBF can realize smaller mean squared prediction error under mild conditions. The proposed AEN-RBF kernel function can also realize faster convergence to the global optimum. We also show that the AEN-RBF kernel function is less sensitive to outliers, and hence improves the robustness of the corresponding Bayesian optimization with Gaussian processes. Through extensive evaluations carried out on synthetic and real-world optimization problems, we show that AEN-RBF outperforms existing benchmark kernel functions.

Replacements for Thu, 23 Sep 21

[249]  arXiv:1708.08103 (replaced) [pdf, other]
Title: Universal Weak Variable-Length Source Coding on Countable Infinite Alphabets
Comments: This article has been accepted for publication by IEEE. Digital Object Identifier 10.1109/TIT.2019.2941895. Link: this https URL The material in this paper was partially published in ISIT2016 [1] and ISIT2017 [2], International Symposium on Information Theory (ISIT)
Subjects: Information Theory (cs.IT)
[250]  arXiv:1804.00598 (replaced) [pdf, other]
Title: Small-d MSR Codes with Optimal Access, Optimal Sub-Packetization and Linear Field Size
Subjects: Information Theory (cs.IT)
[251]  arXiv:1906.06427 (replaced) [pdf, other]
Title: Real-Time Privacy-Preserving Data Release for Smart Meters
Journal-ref: in IEEE Transactions on Smart Grid, vol. 11, no. 6, pp. 5174-5183, Nov. 2020
Subjects: Signal Processing (eess.SP); Cryptography and Security (cs.CR); Information Theory (cs.IT); Machine Learning (cs.LG); Machine Learning (stat.ML)
[252]  arXiv:1906.08823 (replaced) [pdf, other]
Title: Cross-Subject Statistical Shift Estimation for Generalized Electroencephalography-based Mental Workload Assessment
Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
Subjects: Machine Learning (cs.LG); Signal Processing (eess.SP); Machine Learning (stat.ML)
[253]  arXiv:1908.05268 (replaced) [pdf, other]
Title: The Power of the Weisfeiler-Leman Algorithm to Decompose Graphs
Comments: 48 pages, 6 figures, full version of a paper accepted at MFCS 2019. Added Definition 5.4 and Theorem 5.6 to formalise the notions and arguments. New appendix contains extended proofs for Theorems 5.6 and 5.7. Results remain unchanged
Subjects: Discrete Mathematics (cs.DM); Logic in Computer Science (cs.LO); Combinatorics (math.CO)
[254]  arXiv:1908.05715 (replaced) [pdf, other]
Title: Automated classification of plasma regions using 3D particle energy distributions
Comments: Accepted to JGR: Space Physics
Subjects: Space Physics (physics.space-ph); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
[255]  arXiv:1910.05126 (replaced) [src]
Title: Prediction-based Resource Allocation using Bayesian Neural Networks and Minimum Cost and Maximum Flow Algorithm
Comments: The design of effect analysis on prediction accuracy is incomplete
Subjects: Artificial Intelligence (cs.AI)
[256]  arXiv:1910.13906 (replaced) [pdf, other]
Title: Probabilistic performance validation of deep learning-based robust NMPC controllers
Comments: 23 pages, 4 figures
Journal-ref: International Journal of Robust and Nonlinear Control, 2021, 1-22
Subjects: Systems and Control (eess.SY); Optimization and Control (math.OC)
[257]  arXiv:1912.09526 (replaced) [pdf, other]
Title: Inference for Hit Enrichment Curves, with Applications to Drug Discovery
Comments: 34 pages, 6 figures, 2 tables. Original version was a chapter in Jeremy Ash's dissertation. Current version is the paper submitted to Journal of American Statistical Association: Applications and Case Studies
Subjects: Applications (stat.AP); Information Retrieval (cs.IR); Machine Learning (cs.LG); Quantitative Methods (q-bio.QM); Methodology (stat.ME)
[258]  arXiv:1912.13321 (replaced) [pdf, other]
Title: OTEANN: Estimating the Transparency of Orthographies with an Artificial Neural Network
Authors: Xavier Marjou
Comments: 9 pages, 9 figures, 3 tables
Subjects: Computation and Language (cs.CL)
[259]  arXiv:2002.09055 (replaced) [pdf, other]
Title: Encryption without Centralization: Distributing DNS Queries Across Recursive Resolvers
Comments: Presented at the ACM/IRTF Applied Networking Research Workshop 2021 (ANRW '21)
Subjects: Networking and Internet Architecture (cs.NI)
[260]  arXiv:2002.09616 (replaced) [pdf, other]
Title: "Wait, I'm Still Talking!" Predicting the Dialogue Interaction Behavior Using Imagine-Then-Arbitrate Model
Comments: This is an outdated version of "Predict-then-Decide: A Predictive Approach for Wait or Answer Task in Dialogue Systems," in IEEE/ACM Transactions on Audio, Speech, and Language Processing, doi: 10.1109/TASLP.2021.3110145
Subjects: Computation and Language (cs.CL)
[261]  arXiv:2002.10631 (replaced) [pdf, other]
Title: Batch norm with entropic regularization turns deterministic autoencoders into generative models
Journal-ref: Published in the Proceedings of the International Conference on Uncertainty in Artificial Intelligence (UAI), 2020
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
[262]  arXiv:2003.11999 (replaced) [pdf]
Title: Software-Defined Elastic Provisioning of IoT Edge Computing Virtual Resources
Comments: 27 pages, 15 figures, 4 tables, 30 references; The current version evolved from a previous draft (unpublished) version entitled "A Software-defined solution for managing fog computing resources in sensor networks"; The current version is under submission to a journal
Subjects: Networking and Internet Architecture (cs.NI)
[263]  arXiv:2004.06383 (replaced) [pdf, other]
Title: Extending Adversarial Attacks to Produce Adversarial Class Probability Distributions
Comments: 31 pages, 10 figures, 3 tables, 1 algorithm
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
[264]  arXiv:2005.00033 (replaced) [pdf, other]
Title: Fighting the COVID-19 Infodemic: Modeling the Perspective of Journalists, Fact-Checkers, Social Media Platforms, Policy Makers, and the Society
Comments: disinformation, misinformation, factuality, fact-checking, fact-checkers, check-worthiness, Social Media Platforms, COVID-19, social media
Journal-ref: EMNLP-2021 (Findings)
Subjects: Computation and Language (cs.CL); Computers and Society (cs.CY); Information Retrieval (cs.IR)
[265]  arXiv:2005.12873 (replaced) [pdf, other]
Title: Benchmarking Graph Data Management and Processing Systems: A Survey
Comments: 26 pages, 5 figures
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Databases (cs.DB); Performance (cs.PF); Social and Information Networks (cs.SI)
[266]  arXiv:2005.13119 (replaced) [pdf, other]
Title: Predict-then-Decide: A Predictive Approach for Wait or Answer Task in Dialogue Systems
Comments: The latest version has been accepted to IEEE/ACM Transactions on Audio, Speech, and Language Processing, doi: 10.1109/TASLP.2021.3110145
Subjects: Computation and Language (cs.CL)
[267]  arXiv:2006.01738 (replaced) [pdf, other]
Title: Jointly Learning Environments and Control Policies with Projected Stochastic Gradient Ascent
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
[268]  arXiv:2006.05826 (replaced) [pdf, other]
Title: Transient Non-Stationarity and Generalisation in Deep Reinforcement Learning
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
[269]  arXiv:2006.07573 (replaced) [pdf, other]
Title: GIPFA: Generating IPA Pronunciation from Audio
Authors: Xavier Marjou
Comments: 10 pages, 2 figures, 7 tables
Journal-ref: Proceedings of the eLex 2021 conference, page 588
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[270]  arXiv:2007.05071 (replaced) [pdf, ps, other]
Title: Age-Limited Capacity of Massive MIMO
Subjects: Information Theory (cs.IT); Statistics Theory (math.ST)
[271]  arXiv:2007.08970 (replaced) [pdf, other]
Title: Compositional Generalization in Semantic Parsing: Pre-training vs. Specialized Architectures
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG)
[272]  arXiv:2007.12911 (replaced) [pdf, other]
Title: Tighter risk certificates for neural networks
Comments: New version includes: i) experiment showing the potential of the risk certificate for neural architecture search (Fig. 2); ii) experiments spanning uncertainty quantification and analysis of prior/posterior (Section 7.8); iii) an outline of the strengths of probabilistic neural networks trained by PBB (Section 7.9) and iv) a strengthened discussion on the connection to Bayesian learning
Journal-ref: Journal of Machine Learning Research, 2021
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
[273]  arXiv:2007.14863 (replaced) [pdf, other]
Title: Automatic Detection of Aedes aegypti Breeding Grounds Based on Deep Networks with Spatio-Temporal Consistency
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[274]  arXiv:2008.04980 (replaced) [pdf, other]
Title: Robust Model Predictive Control with State Estimation under Set-Membership Uncertainty
Subjects: Systems and Control (eess.SY)
[275]  arXiv:2008.05074 (replaced) [pdf, ps, other]
Title: A Review of Deep Reinforcement Learning for Smart Building Energy Management
Comments: 21 pages, 12 figures
Journal-ref: IEEE Internet of Things Journal, vol. 8, no. 15, pp. 12046-12063, 2021
Subjects: Systems and Control (eess.SY); Signal Processing (eess.SP)
[276]  arXiv:2008.10087 (replaced) [pdf, other]
Title: Blindness of score-based methods to isolated components and mixing proportions
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)
[277]  arXiv:2010.11773 (replaced) [pdf, other]
Title: On Resource-Efficient Bayesian Network Classifiers and Deep Neural Networks
Comments: Accepted at ICPR 2020, fixed Figure 5
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
[278]  arXiv:2010.15559 (replaced) [pdf]
Title: Quantum Computing: A Taxonomy, Systematic Review and Future Directions
Comments: 37 pages, 7 figures, 12 tables; paper accepted for publication in "Software: Practice and Experience", Wiley Press, USA on Sept. 22, 2021
Journal-ref: Software: Practice and Experience, Wiley Press, USA, Sept. 22, 2021
Subjects: Emerging Technologies (cs.ET); Distributed, Parallel, and Cluster Computing (cs.DC)
[279]  arXiv:2010.16177 (replaced) [pdf, ps, other]
Title: Near-Optimal Distributed Implementations of Dynamic Algorithms for Symmetry-Breaking Problems
Comments: Abstract truncated to fit arXiv limits
Subjects: Data Structures and Algorithms (cs.DS)
[280]  arXiv:2011.03186 (replaced) [pdf, other]
Title: Revisiting Model-Agnostic Private Learning: Faster Rates and Active Learning
Comments: The paper was published in AISTATS-2021 and the longer version of the paper is under review at JMLR
Subjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR)
[281]  arXiv:2011.04908 (replaced) [pdf, other]
Title: Effective Model Compression via Stage-wise Pruning
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[282]  arXiv:2011.05463 (replaced) [pdf, other]
Title: Deep Sound Change: Deep and Iterative Learning, Convolutional Neural Networks, and Language Change
Authors: Gašper Beguš
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[283]  arXiv:2011.07018 (replaced) [pdf, other]
Title: Synthetic Data -- Anonymisation Groundhog Day
Subjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR)
[284]  arXiv:2011.08558 (replaced) [pdf, other]
Title: On the Transferability of Adversarial Attacks against Neural Text Classifier
Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL)
[285]  arXiv:2011.09365 (replaced) [pdf, other]
Title: Learning in repeated auctions
Subjects: Computer Science and Game Theory (cs.GT)
[286]  arXiv:2011.11241 (replaced) [pdf, other]
Title: Data-driven Holistic Framework for Automated Laparoscope Optimal View Control with Learning-based Depth Perception
Comments: 7 pages, 7 figures, 2021 IEEE International Conference on Robotics and Automation (ICRA)
Subjects: Robotics (cs.RO)
[287]  arXiv:2011.11355 (replaced) [pdf, other]
Title: Data-Driven Control of Nonlinear Systems: Beyond Polynomial Dynamics
Comments: Accepted for 60th IEEE Conference on Decision and Control (CDC)
Subjects: Systems and Control (eess.SY); Optimization and Control (math.OC)
[288]  arXiv:2012.03209 (replaced) [pdf]
Title: Scheduling of Separable Mobile Energy Storage Systems with Mobile Generators and Fuel Tankers to Boost Distribution System Resilience
Comments: Accepted by IEEE Transactions on Smart Grid
Subjects: Systems and Control (eess.SY)
[289]  arXiv:2012.03597 (replaced) [pdf, other]
Title: PSGCNet: A Pyramidal Scale and Global Context Guided Network for Dense Object Counting in Remote Sensing Images
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[290]  arXiv:2012.04124 (replaced) [pdf, other]
Title: Parameter Efficient Multimodal Transformers for Video Representation Learning
Comments: Accepted to ICLR 2021
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[291]  arXiv:2012.06609 (replaced) [pdf, other]
Title: RegulaTor: A Straightforward Website Fingerprinting Defense
Subjects: Cryptography and Security (cs.CR)
[292]  arXiv:2012.07483 (replaced) [pdf, other]
Title: On the Treatment of Optimization Problems with L1 Penalty Terms via Multiobjective Continuation
Comments: Accepted by IEEE TPAMI 2021
Subjects: Optimization and Control (math.OC); Machine Learning (cs.LG)
[293]  arXiv:2012.08496 (replaced) [pdf, other]
Title: Spectral Methods for Data Science: A Statistical Perspective
Subjects: Machine Learning (stat.ML); Information Theory (cs.IT); Machine Learning (cs.LG); Signal Processing (eess.SP); Statistics Theory (math.ST)
[294]  arXiv:2012.15739 (replaced) [pdf, other]
Title: Uncertainty Bounds for Multivariate Machine Learning Predictions on High-Strain Brittle Fracture
Subjects: Materials Science (cond-mat.mtrl-sci); Machine Learning (cs.LG)
[295]  arXiv:2101.07415 (replaced) [pdf, other]
Title: ES-ENAS: Blackbox Optimization over Hybrid Spaces via Combinatorial and Continuous Evolution
Comments: 22 pages. This is an updated version of a previous submission which can be found at arXiv:1907.06511. See this https URL for associated code
Subjects: Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Robotics (cs.RO)
[296]  arXiv:2101.09028 (replaced) [pdf, other]
Title: A Closer Look at Temporal Sentence Grounding in Videos: Dataset and Metric
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[297]  arXiv:2101.11149 (replaced) [pdf, other]
Title: In-IDE Code Generation from Natural Language: Promise and Challenges
Comments: 47 pages, accepted to ACM Transactions on Software Engineering and Methodology
Subjects: Software Engineering (cs.SE)
[298]  arXiv:2101.12190 (replaced) [pdf, other]
Title: Practical distributed quantum information processing with LOCCNet
Comments: 19 pages
Subjects: Quantum Physics (quant-ph); Disordered Systems and Neural Networks (cond-mat.dis-nn); Information Theory (cs.IT); Machine Learning (cs.LG); High Energy Physics - Theory (hep-th)
[299]  arXiv:2102.00310 (replaced) [pdf, other]
Title: Symmetry-Aware Reservoir Computing
Comments: 10 pages, 7 Figures
Subjects: Neural and Evolutionary Computing (cs.NE); Machine Learning (cs.LG); Chaotic Dynamics (nlin.CD)
[300]  arXiv:2102.01621 (replaced) [pdf, ps, other]
Title: Depth separation beyond radial functions
Subjects: Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)
[301]  arXiv:2102.02771 (replaced) [pdf, other]
Title: Mask Guided Attention For Fine-Grained Patchy Image Classification
Comments: Accepted to ICIP2021
Journal-ref: 2021 IEEE International Conference on Image Processing (ICIP), 2021, pp. 1044-1048
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[302]  arXiv:2102.07252 (replaced) [pdf, other]
Title: On Topology Optimization and Routing in Integrated Access and Backhaul Networks: A Genetic Algorithm-based Approach
Comments: Revised manuscript in IEEE Open Journal of the Communications Society
Subjects: Networking and Internet Architecture (cs.NI); Information Theory (cs.IT); Machine Learning (cs.LG)
[303]  arXiv:2102.10242 (replaced) [pdf, other]
Title: Towards Automatic Evaluation of Dialog Systems: A Model-Free Off-Policy Evaluation Approach
Comments: Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[304]  arXiv:2102.10556 (replaced) [pdf, other]
Title: Inductive logic programming at 30
Comments: Extension of IJCAI20 survey paper. Accepted for the MLJ. arXiv admin note: substantial text overlap with arXiv:2002.11002, arXiv:2008.07912
Subjects: Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[305]  arXiv:2102.12452 (replaced) [pdf, ps, other]
Title: Probing Classifiers: Promises, Shortcomings, and Advances
Authors: Yonatan Belinkov
Comments: Accepted to Computational Linguistics as a squib
Subjects: Computation and Language (cs.CL)
[306]  arXiv:2103.03629 (replaced) [pdf, other]
Title: Self-supervised Mean Teacher for Semi-supervised Chest X-ray Classification
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[307]  arXiv:2103.04942 (replaced) [pdf, other]
Title: Task-Specific Design Optimization and Fabrication for Inflated-Beam Soft Robots with Growable Discrete Joints
Subjects: Robotics (cs.RO)
[308]  arXiv:2103.05134 (replaced) [pdf, other]
Title: Constrained Learning with Non-Convex Losses
Comments: arXiv admin note: text overlap with arXiv:2006.05487
Subjects: Machine Learning (cs.LG); Statistics Theory (math.ST); Machine Learning (stat.ML)
[309]  arXiv:2103.07397 (replaced) [pdf, ps, other]
Title: An extensible equality checking algorithm for dependent type theories
Subjects: Logic in Computer Science (cs.LO); Logic (math.LO)
[310]  arXiv:2103.09448 (replaced) [pdf, other]
Title: Adversarial Attacks on Camera-LiDAR Models for 3D Car Detection
Comments: arXiv admin note: text overlap with arXiv:2101.10747. Updates in v2: Expanded conclusion and future work, reduced Figure 5's size, and a small correction in Table 3
Subjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR); Graphics (cs.GR); Machine Learning (cs.LG)
[311]  arXiv:2103.12287 (replaced) [pdf, other]
Title: Optimising the selection of samples for robust lidar camera calibration
Comments: ITSC2021
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[312]  arXiv:2103.14122 (replaced) [pdf, other]
Title: Private and Resource-Bounded Locally Decodable Codes for Insertions and Deletions
Authors: Alexander R. Block (Purdue University), Jeremiah Blocki (Purdue University)
Subjects: Information Theory (cs.IT); Cryptography and Security (cs.CR)
[313]  arXiv:2104.05893 (replaced) [pdf, other]
Title: NewsCLIPpings: Automatic Generation of Out-of-Context Multimodal Media
Comments: EMNLP 2021
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[314]  arXiv:2104.10849 (replaced) [pdf]
Title: Transient Stability of Hybrid Power Systems Dominated by Different Types of Grid-Forming Devices
Authors: Xiuqiang He, Hua Geng
Subjects: Systems and Control (eess.SY)
[315]  arXiv:2104.11123 (replaced) [pdf, ps, other]
Title: Universal Horn Sentences and the Joint Embedding Property
Comments: 10 pages
Subjects: Logic in Computer Science (cs.LO); Logic (math.LO)
[316]  arXiv:2104.11414 (replaced) [pdf, other]
Title: Passive soft-reset controllers for nonlinear systems
Subjects: Systems and Control (eess.SY)
[317]  arXiv:2104.11709 (replaced) [pdf, other]
Title: Risk-Aware Path Planning for Ground Vehicles using Occluded Aerial Images
Subjects: Robotics (cs.RO)
[318]  arXiv:2105.06744 (replaced) [pdf, ps, other]
Title: A Separator Theorem for Hypergraphs and a CSP-SAT Algorithm
Subjects: Logic in Computer Science (cs.LO); Computational Complexity (cs.CC)
[319]  arXiv:2105.09266 (replaced) [pdf, ps, other]
Title: Copyright in Generative Deep Learning
Comments: 16 pages. Second version contains updates after entry into force of EU's directive on copyright in the Digital Single Market, and corrections of typos. Third version contains a new section about GitHub Copilot and its copyright implications. Fourth version contains improvements in abstract, introduction and conclusions, and a general rearrangement of the central sections
Subjects: Computers and Society (cs.CY); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[320]  arXiv:2105.10678 (replaced) [pdf, other]
Title: Video-based Person Re-identification without Bells and Whistles
Comments: This paper was accepted by CVPR 2021 Biometrics Workshop
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[321]  arXiv:2105.11450 (replaced) [pdf, other]
Title: SAT: 2D Semantics Assisted Training for 3D Visual Grounding
Comments: ICCV 2021 Oral
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[322]  arXiv:2105.11972 (replaced) [pdf]
Title: Reservoir Computing based on Mutually Injected Phase Modulated Semiconductor Lasers as a monolithic integrated hardware accelerator
Subjects: Emerging Technologies (cs.ET); Signal Processing (eess.SP)
[323]  arXiv:2105.12797 (replaced) [pdf, other]
Title: Self-supervised Monocular Multi-robot Relative Localization with Efficient Deep Neural Networks
Comments: 6+1 pages, submitted to ICRA 2022
Subjects: Robotics (cs.RO)
[324]  arXiv:2105.13327 (replaced) [pdf, other]
Title: Encoders and Ensembles for Task-Free Continual Learning
Subjects: Machine Learning (cs.LG)
[325]  arXiv:2105.14103 (replaced) [pdf, other]
Title: An Attention Free Transformer
Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[326]  arXiv:2105.14884 (replaced) [pdf, other]
Title: Control of bifurcation structures using shape optimization
Comments: 20 pages, 11 figures
Subjects: Numerical Analysis (math.NA); Optimization and Control (math.OC)
[327]  arXiv:2106.00730 (replaced) [pdf, other]
Title: Enabling Efficiency-Precision Trade-offs for Label Trees in Extreme Classification
Comments: To appear in CIKM 2021
Subjects: Machine Learning (cs.LG); Data Structures and Algorithms (cs.DS)
[328]  arXiv:2106.02036 (replaced) [pdf, other]
Title: Anticipative Video Transformer
Comments: ICCV 2021. Ranked #1 in CVPR'21 EPIC-Kitchens-100 Action Anticipation challenge. Webpage/code/models: this http URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Multimedia (cs.MM)
[329]  arXiv:2106.02473 (replaced) [pdf, other]
Title: GasHisSDB: A New Gastric Histopathology Image Dataset for Computer Aided Diagnosis of Gastric Cancer
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[330]  arXiv:2106.07024 (replaced) [pdf, other]
Title: Finite-Length Bounds on Hypothesis Testing Subject to Vanishing Type I Error Restrictions
Journal-ref: Vol. 28, 2021, 229 - 233
Subjects: Information Theory (cs.IT); Statistics Theory (math.ST)
[331]  arXiv:2106.07154 (replaced) [pdf, other]
Title: Local time stepping for the shallow water equations in MPAS
Subjects: Numerical Analysis (math.NA)
[332]  arXiv:2106.07233 (replaced) [pdf, other]
Title: Minimality Notions via Factorization Systems
Subjects: Formal Languages and Automata Theory (cs.FL); Discrete Mathematics (cs.DM); Category Theory (math.CT)
[333]  arXiv:2106.08433 (replaced) [pdf, other]
Title: Combining Lexical and Dense Retrieval for Computationally Efficient Multi-hop Question Answering
Comments: Accepted at the 2nd Workshop on Simple and Efficient Natural Language Processing (SustaiNLP 2021)
Subjects: Information Retrieval (cs.IR)
[334]  arXiv:2106.08620 (replaced) [pdf, ps, other]
Title: Accurate and efficient hydrodynamic analysis of structures with sharp edges by the Extended Finite Element Method (XFEM): 2D studies
Subjects: Numerical Analysis (math.NA); Fluid Dynamics (physics.flu-dyn)
[335]  arXiv:2106.10934 (replaced) [pdf, other]
Title: GRAND: Graph Neural Diffusion
Comments: 15 pages, 4 figures. Proceedings of the 38th International Conference on Machine Learning, PMLR 139, 2021. Copyright 2021 by the author(s)
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
[336]  arXiv:2106.14279 (replaced) [pdf, other]
Title: Numerical dispersion effects on the energy cascade in large-eddy simulation
Comments: 9 pages, 2 figures
Subjects: Fluid Dynamics (physics.flu-dyn); Numerical Analysis (math.NA)
[337]  arXiv:2107.01347 (replaced) [pdf, other]
Title: Traffic Signal Control with Communicative Deep Reinforcement Learning Agents: a Case Study
Subjects: Multiagent Systems (cs.MA); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[338]  arXiv:2107.01351 (replaced) [pdf, other]
Title: EAR-NET: Error Attention Refining Network For Retinal Vessel Segmentation
Comments: Accepted to DICTA2021
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[339]  arXiv:2107.02168 (replaced) [pdf, other]
Title: DPPIN: A Biological Repository of Dynamic Protein-Protein Interaction Network Data
Authors: Dongqi Fu, Jingrui He
Subjects: Machine Learning (cs.LG)
[340]  arXiv:2107.04782 (replaced) [pdf, other]
Title: TA2N: Two-Stage Action Alignment Network for Few-shot Action Recognition
Comments: manuscripts
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[341]  arXiv:2107.05583 (replaced) [pdf, other]
Title: Few-shot Learning with Global Relatedness Decoupled-Distillation
Comments: 11 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[342]  arXiv:2107.07208 (replaced) [pdf, other]
Title: Design of Distributed Reconfigurable Robotics Systems with ReconROS
Comments: Paper is under review
Subjects: Robotics (cs.RO); Distributed, Parallel, and Cluster Computing (cs.DC)
[343]  arXiv:2107.07634 (replaced) [pdf, other]
Title: Multi-task Learning with Cross Attention for Keyword Spotting
Comments: Accepted at ASRU 2021
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[344]  arXiv:2107.08047 (replaced) [pdf, other]
Title: Quantum computations (course of lectures)
Authors: Yuri I. Ozhigov
Comments: 145 pages, Latex, some figures added, some misprints corrected
Subjects: Quantum Physics (quant-ph); Emerging Technologies (cs.ET)
[345]  arXiv:2107.08209 (replaced) [pdf, other]
Title: Minimising quantifier variance under prior probability shift
Authors: Dirk Tasche
Comments: 12 pages, 1 figure
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Statistics Theory (math.ST)
[346]  arXiv:2107.08472 (replaced) [pdf, other]
Title: A locally calculable $P^3$-pressure in a decoupled method for incompressible Stokes equations
Authors: Chunjae Park
Comments: arXiv admin note: text overlap with arXiv:2104.05149
Subjects: Numerical Analysis (math.NA)
[347]  arXiv:2107.09896 (replaced) [pdf, ps, other]
Title: Terahertz-supported Untrusted UAV-Relaying: Secrecy Energy Efficiency Maximization via Trajectory and Communication Co-design
Comments: 14 pages, 12 figures, this work has been submitted to the IEEE for possible publication
Subjects: Information Theory (cs.IT); Computational Engineering, Finance, and Science (cs.CE); Networking and Internet Architecture (cs.NI); Signal Processing (eess.SP)
[348]  arXiv:2107.11774 (replaced) [pdf, other]
Title: SGD May Never Escape Saddle Points
Comments: Typos fixed
Subjects: Machine Learning (cs.LG); Optimization and Control (math.OC); Machine Learning (stat.ML)
[349]  arXiv:2107.14171 (replaced) [pdf, other]
Title: Tianshou: a Highly Modularized Deep Reinforcement Learning Library
Comments: 10 pages, 4 figures, 4 tables
Subjects: Machine Learning (cs.LG)
[350]  arXiv:2107.14432 (replaced) [pdf, other]
Title: Adaptive Optimizers with Sparse Group Lasso for Neural Networks in CTR Prediction
Comments: 24 pages. Published as a conference paper at ECML PKDD 2021. This version includes Appendix which was not included in the published version because of page limit
Journal-ref: Machine Learning and Knowledge Discovery in Databases. Research Track - European Conference, ECML PKDD 2021, Bilbao, Spain, September 13-17, 2021, Proceedings, Part III
Subjects: Machine Learning (cs.LG)
[351]  arXiv:2108.00344 (replaced) [pdf, other]
Title: Groot: An Event-graph-based Approach for Root Cause Analysis in Industrial Settings
Journal-ref: Proceedings of the thirty-sixth IEEE/ACM international conference on Automated software engineering, 2021
Subjects: Software Engineering (cs.SE)
[352]  arXiv:2108.02909 (replaced) [pdf, other]
Title: Lumos: Increasing Awareness of Analytic Behavior during Visual Data Analysis
Comments: 10 pages, 9 figures, TVCG Special Issue on the 2021 IEEE Visualization Conference (VIS)
Subjects: Human-Computer Interaction (cs.HC)
[353]  arXiv:2108.03536 (replaced) [pdf, other]
Title: Left, Right, and Gender: Exploring Interaction Traces to Mitigate Human Biases
Comments: 10 pages, 7 figures, TVCG Special Issue on the 2021 IEEE Visualization Conference (VIS)
Subjects: Human-Computer Interaction (cs.HC)
[354]  arXiv:2108.04417 (replaced) [pdf, other]
Title: Privacy-Preserving Machine Learning: Methods, Challenges and Directions
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR)
[355]  arXiv:2108.06076 (replaced) [pdf, other]
Title: PVT: Point-Voxel Transformer for 3D Deep Learning
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR)
[356]  arXiv:2108.07636 (replaced) [pdf, other]
Title: Semi-parametric Bayesian Additive Regression Trees
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)
[357]  arXiv:2108.09020 (replaced) [pdf, other]
Title: Online Continual Learning with Natural Distribution Shifts: An Empirical Study with Visual Data
Comments: Accepted to ICCV 2021
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[358]  arXiv:2108.09717 (replaced) [pdf, other]
Title: External Knowledge enabled Text Visual Question Answering
Comments: Submitted to Neurocomputing
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[359]  arXiv:2108.11993 (replaced) [pdf, other]
Title: Evaluating Transformer-based Semantic Segmentation Networks for Pathological Image Segmentation
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[360]  arXiv:2108.12211 (replaced) [pdf, other]
Title: Enel: Context-Aware Dynamic Scaling of Distributed Dataflow Jobs using Graph Propagation
Comments: 8 pages, 5 figures, 3 tables
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Machine Learning (cs.LG)
[361]  arXiv:2108.12251 (replaced) [pdf, other]
Title: Changes in Twitter geolocations: Insights and suggestions for future usage
Subjects: Social and Information Networks (cs.SI)
[362]  arXiv:2109.00590 (replaced) [pdf, other]
Title: WebQA: Multihop and Multimodal QA
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[363]  arXiv:2109.03310 (replaced) [pdf]
Title: Melatect: A Machine Learning Model Approach For Identifying Malignant Melanoma in Skin Growths
Comments: 7 Pages, Preprint
Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[364]  arXiv:2109.04699 (replaced) [pdf, other]
Title: EfficientCLIP: Efficient Cross-Modal Pre-training by Ensemble Confident Learning and Language Modeling
Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[365]  arXiv:2109.05184 (replaced) [pdf, other]
Title: MOMENTA: A Multimodal Framework for Detecting Harmful Memes and Their Targets
Comments: The paper has been accepted in the Findings of Empirical Methods in Natural Language Processing (EMNLP), 2021
Subjects: Multimedia (cs.MM); Computation and Language (cs.CL)
[366]  arXiv:2109.05443 (replaced) [pdf, other]
Title: CAN3D: Fast 3D Medical Image Segmentation via Compact Context Aggregation
Comments: 21 pages, 7 figures
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[367]  arXiv:2109.05486 (replaced) [pdf, other]
Title: A Socially Aware Reinforcement Learning Agent for The Single Track Road Problem
Subjects: Artificial Intelligence (cs.AI); Computer Science and Game Theory (cs.GT); Human-Computer Interaction (cs.HC); Robotics (cs.RO)
[368]  arXiv:2109.05979 (replaced) [pdf, other]
Title: Keyword Extraction for Improved Document Retrieval in Conversational Search
Comments: Accepted in IIR 2021
Subjects: Computation and Language (cs.CL); Information Retrieval (cs.IR)
[369]  arXiv:2109.06007 (replaced) [pdf, other]
Title: Visualization for Villainy
Comments: To appear at the alt.vis 2021 workshop
Subjects: Human-Computer Interaction (cs.HC)
[370]  arXiv:2109.06148 (replaced) [pdf, other]
Title: DAFNe: A One-Stage Anchor-Free Deep Model for Oriented Object Detection
Comments: Main paper: 7 pages, References: 2 pages, Appendix: 5 pages; Main paper: 5 figures, Appendix: 5 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[371]  arXiv:2109.06427 (replaced) [pdf, ps, other]
Title: Commonsense-Focused Dialogues for Response Generation: An Empirical Study
Comments: Accepted at SIGDIAL 2021. 12 pages, 5 tables
Subjects: Computation and Language (cs.CL)
[372]  arXiv:2109.06557 (replaced) [pdf, other]
Title: The concept of class invariant in object-oriented programming
Subjects: Programming Languages (cs.PL); Software Engineering (cs.SE)
[373]  arXiv:2109.06788 (replaced) [pdf, other]
Title: Computing Balanced Solutions for Large International Kidney Exchange Schemes
Subjects: Computer Science and Game Theory (cs.GT); Discrete Mathematics (cs.DM); Data Structures and Algorithms (cs.DS)
[374]  arXiv:2109.06831 (replaced) [pdf, other]
Title: GALOPP: Multi-Agent Deep Reinforcement Learning For Persistent Monitoring With Localization Constraints
Subjects: Robotics (cs.RO)
[375]  arXiv:2109.07236 (replaced) [pdf, ps, other]
Title: Recursive Hierarchical Projection for Whole-Body Control with Task Priority Transition
Comments: 6 pages, 9 figures, submitted to ICRA 2022
Subjects: Robotics (cs.RO); Systems and Control (eess.SY)
[376]  arXiv:2109.07243 (replaced) [pdf, other]
Title: Enhancing Clinical Information Extraction with Transferred Contextual Embeddings
Comments: 6 pages, 4 figures
Subjects: Computation and Language (cs.CL)
[377]  arXiv:2109.08111 (replaced) [pdf, ps, other]
Title: Stabilization of physical systems via saturated controllers with only partial state measurements
Comments: 38 pages, 11 figures, 5 tables
Subjects: Systems and Control (eess.SY)
[378]  arXiv:2109.08245 (replaced) [pdf, other]
Title: The 2021 RecSys Challenge Dataset: Fairness is not optional
Subjects: Social and Information Networks (cs.SI)
[379]  arXiv:2109.08336 (replaced) [pdf, other]
Title: LoGG3D-Net: Locally Guided Global Descriptor Learning for 3D Place Recognition
Comments: Under review
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[380]  arXiv:2109.08356 (replaced) [pdf, other]
Title: Accurate, Interpretable, and Fast Animation: An Iterative, Sparse, and Nonconvex Approach
Subjects: Machine Learning (cs.LG); Graphics (cs.GR)
[381]  arXiv:2109.08381 (replaced) [pdf, other]
Title: From Known to Unknown: Knowledge-guided Transformer for Time-Series Sales Forecasting in Alibaba
Comments: 8 pages, 7 figures
Subjects: Machine Learning (cs.LG)
[382]  arXiv:2109.08818 (replaced) [pdf, other]
Title: DyLex: Incorporating Dynamic Lexicons into BERT for Sequence Labeling
Comments: EMNLP 2021 Long Paper
Subjects: Computation and Language (cs.CL)
[383]  arXiv:2109.08890 (replaced) [src]
Title: Towards Joint Intent Detection and Slot Filling via Higher-order Attention
Comments: The authors withdraw the manuscript and will update it in the original arXiv address (arXiv:2108.11916)
Subjects: Computation and Language (cs.CL)
[384]  arXiv:2109.09142 (replaced) [src]
Title: Decentralized Wireless Federated Learning with Differential Privacy
Comments: The proof is not perfect in section 4
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
[385]  arXiv:2109.09307 (replaced) [pdf, other]
Title: Assisted Learning for Organizations with Limited Data
Comments: 16 pages, 18 figures
Subjects: Machine Learning (cs.LG)
[386]  arXiv:2109.09416 (replaced) [pdf, other]
Title: ElasticFace: Elastic Margin Loss for Deep Face Recognition
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[387]  arXiv:2109.09705 (replaced) [pdf, other]
Title: Neural forecasting at scale
Subjects: Machine Learning (cs.LG); Distributed, Parallel, and Cluster Computing (cs.DC); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)
[388]  arXiv:2109.09888 (replaced) [pdf, other]
Title: Chemical-Reaction-Aware Molecule Representation Learning
Subjects: Machine Learning (cs.LG); Chemical Physics (physics.chem-ph); Quantitative Methods (q-bio.QM)
[389]  arXiv:2109.09948 (replaced) [pdf, other]
Title: Neural networks with trainable matrix activation functions
Subjects: Machine Learning (cs.LG)
[390]  arXiv:2109.10020 (replaced) [pdf, other]
Title: Online Multi-horizon Transaction Metric Estimation with Multi-modal Learning in Payment Networks
Comments: 10 pages
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
[391]  arXiv:2109.10115 (replaced) [pdf, other]
Title: StereOBJ-1M: Large-scale Stereo Image Dataset for 6D Object Pose Estimation
Comments: ICCV 2021
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Robotics (cs.RO)
[392]  arXiv:2109.10267 (replaced) [pdf]
Title: Artificial Intelligence Edge Applications in 5G Networks
Comments: 11 pages, 11 figures. This is a preprint published in Proceedings of Sixth International Congress on Information and Communication Technology, edited by Yang XS., Sherratt S., Dey N., Joshi A., 2021, Springer, reproduced with permission of Springer Nature Singapore Pte Ltd. The final authenticated version is available online at: this https URL
Subjects: Networking and Internet Architecture (cs.NI)
[393]  arXiv:2109.10282 (replaced) [pdf, other]
Title: TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models
Comments: Work in Progress
Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[394]  arXiv:2109.10285 (replaced) [pdf, other]
Title: Early and Revocable Time Series Classification
Comments: submitted to ACML'21
Subjects: Artificial Intelligence (cs.AI)
[395]  arXiv:2109.10303 (replaced) [pdf, other]
Title: Computing Complexity-aware Plans Using Kolmogorov Complexity
Comments: Accepted to CDC 2021
Subjects: Systems and Control (eess.SY); Artificial Intelligence (cs.AI); Optimization and Control (math.OC)