Computer Science

New submissions

New submissions for Wed, 20 Oct 21

[1]  arXiv:2110.09519 [pdf]
Title: Protection of the patient data against intentional attacks using a hybrid robust watermarking code
Comments: 21 pages, 14 figures, 7 tables, PeerJ Computer Science, ISSN: 2376-5992, indexed by Scopus (Impact Factor 3.67, Cite Score 6.7, SNIP 2.341, Scimago Ranking 1.601, Q1 Computer Science). this https URL
Subjects: Cryptography and Security (cs.CR)

The security of patient information is important during the transfer of medical data. A hybrid spatial domain watermarking algorithm that includes encryption, integrity protection and steganography is proposed to strengthen the information originality based on the authentication. The proposed algorithm checks whether the information of patients has been deliberately changed or not. The created code is distributed at every pixel of the medical image and not only in the regions of non-interest, while the image details are still preserved. To enhance the security of the watermarking code, SHA-1 is used to get the initial key for the symmetric encryption algorithm. The target of this approach is to preserve the content of the image and the watermark simultaneously; this is achieved by synthesizing an encrypted watermark from one of the components of the original image rather than by embedding a watermark in the image. To evaluate the proposed code, the Least Significant Bit (LSB), Bit2SB, and Bit3SB were used. The evaluation showed that the LSB gives better image quality, but overall the Bit2SB is better in its ability to withstand active attacks up to a size of 2*2 pixels while preserving high image quality.
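
The bit-plane substitution named in the evaluation (LSB, Bit2SB, Bit3SB) can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the helper names, the toy pixel values, and the use of a truncated SHA-1 digest as the "code" are assumptions made only to show how writing into bit plane 0, 1, or 2 bounds the change to each pixel.

```python
import hashlib


def code_bits(data: bytes, n: int) -> list[int]:
    """Hypothetical protection code: the first n bits of SHA-1(data), n <= 160."""
    digest = hashlib.sha1(data).digest()
    bits = [(byte >> (7 - i)) & 1 for byte in digest for i in range(8)]
    return bits[:n]


def embed_bits(pixels: list[int], bits: list[int], plane: int = 0) -> list[int]:
    """Overwrite bit `plane` (0 = LSB, 1 = second bit, 2 = third bit) of each pixel."""
    if len(pixels) != len(bits):
        raise ValueError("need exactly one code bit per pixel")
    mask = 1 << plane
    return [(p & ~mask) | (b << plane) for p, b in zip(pixels, bits)]


def extract_bits(pixels: list[int], plane: int = 0) -> list[int]:
    """Read bit `plane` back out of every pixel."""
    return [(p >> plane) & 1 for p in pixels]


pixels = [200, 13, 255, 0, 97, 128, 64, 31]   # toy 8-bit grayscale values
bits = code_bits(b"patient record", len(pixels))
for plane in (0, 1, 2):
    marked = embed_bits(pixels, bits, plane)
    worst = max(abs(a - b) for a, b in zip(pixels, marked))
    print(f"plane {plane}: recovered={extract_bits(marked, plane) == bits}, "
          f"worst-case pixel change={worst}")
```

Writing into plane k changes a pixel by at most 2^k grey levels, which is one way to read the quality versus robustness trade-off the abstract reports between LSB and Bit2SB.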

[2]  arXiv:2110.09520 [pdf]
Title: Image Protection against Forgery and Pixel Tampering based on a Triple Hybrid Security Approach
Comments: 10 pages, 5 figures, 4 tables, (AISC, Vol. 1058), The 5th International Conference on Advanced Intelligent Systems and Informatics 2019, AISI2019. Advances in Intelligent Systems and Computing, Vol. 1058. Springer, Cham. pp. 588-597, ISSN 2194-5357, ISSN 2194-5365 (electronic) and Scopus. Online ISBN: 978-3-030-311292, this https URL
Subjects: Cryptography and Security (cs.CR)

With the widespread availability of advanced digital imaging devices, forgery of digital images has become a more serious attack pattern. In this attack scenario, the attacker tries to manipulate the digital image to conceal some meaningful information of the genuine image for malicious purposes. This has increased interest in protecting images against integrity tampering. This paper proposes a novel technique for protecting colored images against forgery and pixel tampering. The proposed approach is designed as a hybrid model of three security techniques: the Message Digest hashing algorithm (MD5), the 128-bit Advanced Encryption Standard (AES), and steganography. The proposed approach has been evaluated using a set of image quality metrics to test the impact of embedding the protection code on image quality. The evaluation results showed that protecting the image based on the Least Significant Bit (LSB) is the technique that best keeps image quality compared with the other two bit-substitution methods. Moreover, the results showed the superiority of the proposed approach compared with other techniques in the literature.

[3]  arXiv:2110.09524 [pdf, other]
Title: Understanding GNN Computational Graph: A Coordinated Computation, IO, and Memory Perspective
Subjects: Machine Learning (cs.LG); Performance (cs.PF)

Graph Neural Networks (GNNs) have been widely used in various domains, and GNNs with sophisticated computational graphs lead to higher latency and larger memory consumption. Optimizing the GNN computational graph suffers from: (1) Redundant neural operator computation. The same data are propagated through the graph structure to perform the same neural operation multiple times in GNNs, leading to redundant computation which accounts for 92.4% of total operators. (2) Inconsistent thread mapping. Efficient thread mapping schemes for vertex-centric and edge-centric operators are different. This inconsistency prevents the operator fusion that would reduce memory IO. (3) Excessive intermediate data. For GNN training which is usually performed concurrently with inference, intermediate data must be stored for the backward pass, consuming 91.9% of the total memory requirement. To tackle these challenges, we propose the following designs to optimize the GNN computational graph from a novel coordinated computation, IO, and memory perspective: (1) Propagation-postponed operator reorganization. We reorganize operators to perform neural operations before the propagation, thus eliminating the redundant computation. (2) Unified thread mapping for fusion. We propose a unified thread mapping scheme for both vertex- and edge-centric operators to enable fusion and reduce IO. (3) Intermediate data recomputation. Intermediate data are recomputed during the backward pass to reduce the total memory consumption. Extensive experimental results on three typical GNN models show that we achieve up to 2.75x end-to-end speedup, 6.89x less memory IO, and 7.73x less memory consumption over state-of-the-art frameworks.

[4]  arXiv:2110.09525 [pdf, other]
Title: Eigenbehaviour as an Indicator of Cognitive Abilities
Subjects: Machine Learning (cs.LG)

With growing usage of machine learning algorithms and big data in health applications, digital biomarkers have become an important key feature to ensure the success of those applications. In this paper, we focus on one important use-case, the long-term continuous monitoring of the cognitive ability of older adults. The cognitive ability is a factor both for long-term monitoring of people living alone and as an outcome in clinical studies. In this work, we propose a new digital biomarker for cognitive abilities based on location eigenbehaviour obtained from contactless ambient sensors. Indoor location information obtained from passive infrared sensors is used to build a location matrix covering several weeks of measurement. Based on the eigenvectors of this matrix, the reconstruction error is calculated for various numbers of used eigenvectors. The reconstruction error is used to predict cognitive ability scores collected at baseline, using linear regression. Additionally, classification of normal versus pathological cognition level is performed using a support vector machine. Prediction performance is strong for high levels of cognitive ability, but grows weaker for low levels of cognitive ability. Classification into normal versus pathological cognitive ability level reaches high accuracy, with an AUC of 0.94. Due to the unobtrusive method of measurement based on contactless ambient sensors, this digital biomarker of cognitive ability is easily obtainable. The reconstruction error is a strong digital biomarker for the binary classification and, to a lesser extent, for more detailed prediction of interindividual differences in cognition.

[5]  arXiv:2110.09526 [pdf]
Title: Infinite Servers Queue Systems Busy Period Time Length Distribution and Parameters Study through Computational Simulation
Comments: 15 pages and 9 figures
Subjects: Performance (cs.PF)

This work presents a FORTRAN program that simulates the operation of infinite servers queues. Poisson arrival processes are considered, but not exclusively. For many parameters of interest in the study or application of queuing systems, either there are no theoretical results or the existing ones are mathematically intractable, which makes their utility doubtful. In such cases, one way forward is to use simulation methods in order to get more useful results. Indeed, using simulation, experiments may be performed and their results used to conjecture about certain quantities of interest in queue systems. In this paper this procedure is followed to learn more about quantities of interest for infinite servers queue systems, in particular busy period parameters and probability distributions.
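
As a rough illustration of the kind of experiment the abstract describes, the following Python sketch (not the paper's FORTRAN program) samples busy periods of an infinite-server queue with Poisson arrivals and exponential service times. The function name, parameters, and the choice of exponential service are assumptions made for the example; the closed-form mean (e^rho - 1)/lambda with rho = lambda/mu is the standard M/G/infinity result and serves only as a sanity check.

```python
import math
import random


def simulate_busy_periods(lam, mu, n_periods, seed=0):
    """Sample busy-period lengths of an M/M/inf queue (arrival rate lam, service rate mu)."""
    rng = random.Random(seed)
    periods = []
    for _ in range(n_periods):
        # A busy period starts at time 0 with one arrival into an empty system.
        t = 0.0
        end = rng.expovariate(mu)              # latest departure time seen so far
        while True:
            t += rng.expovariate(lam)          # time of the next arrival
            if t >= end:                       # system emptied before that arrival
                break
            # Every customer is served immediately, so only the latest departure matters.
            end = max(end, t + rng.expovariate(mu))
        periods.append(end)
    return periods


lam, mu = 1.0, 1.0
samples = simulate_busy_periods(lam, mu, n_periods=20000, seed=1)
estimate = sum(samples) / len(samples)
theory = (math.exp(lam / mu) - 1.0) / lam
print(f"simulated mean busy period: {estimate:.3f}, theoretical mean: {theory:.3f}")
```

Swapping `rng.expovariate(mu)` for another service-time sampler gives other service distributions, and the sampled busy periods can then be used to estimate quantities for which no tractable formula is available.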

[6]  arXiv:2110.09548 [pdf, other]
Title: Path Regularization: A Convexity and Sparsity Inducing Regularization for Parallel ReLU Networks
Comments: Accepted to NeurIPS 2021. arXiv admin note: text overlap with arXiv:2110.05518
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)

Despite several attempts, the fundamental mechanisms behind the success of deep neural networks still remain elusive. To this end, we introduce a novel analytic framework to unveil hidden convexity in training deep neural networks. We consider a parallel architecture with multiple ReLU sub-networks, which includes many standard deep architectures and ResNets as its special cases. We then show that the training problem with path regularization can be cast as a single convex optimization problem in a high-dimensional space. We further prove that the equivalent convex program is regularized via a group sparsity inducing norm. Thus, a path regularized parallel architecture with ReLU sub-networks can be viewed as a parsimonious feature selection method in high-dimensions. More importantly, we show that the computational complexity required to globally optimize the equivalent convex problem is polynomial-time with respect to the number of data samples and feature dimension. Therefore, we prove exact polynomial-time trainability for path regularized deep ReLU networks with global optimality guarantees. We also provide several numerical experiments corroborating our theory.

[7]  arXiv:2110.09554 [pdf, other]
Title: TransFusion: Cross-view Fusion with Transformer for 3D Human Pose Estimation
Comments: BMVC 2021. Code is available at: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Estimating the 2D human poses in each view is typically the first step in calibrated multi-view 3D pose estimation. But the performance of 2D pose detectors suffers from challenging situations such as occlusions and oblique viewing angles. To address these challenges, previous works derive point-to-point correspondences between different views from epipolar geometry and utilize the correspondences to merge prediction heatmaps or feature representations. Instead of post-prediction merge/calibration, here we introduce a transformer framework for multi-view 3D pose estimation, aiming at directly improving individual 2D predictors by integrating information from different views. Inspired by previous multi-modal transformers, we design a unified transformer architecture, named TransFusion, to fuse cues from both current views and neighboring views. Moreover, we propose the concept of epipolar field to encode 3D positional information into the transformer model. The 3D position encoding guided by the epipolar field provides an efficient way of encoding correspondences between pixels of different views. Experiments on Human 3.6M and Ski-Pose show that our method is more efficient and has consistent improvements compared to other fusion methods. Specifically, we achieve 25.8 mm MPJPE on Human 3.6M with only 5M parameters on 256 x 256 resolution.

[8]  arXiv:2110.09557 [pdf, other]
Title: On-the-fly Code Activation for Attack Surface Reduction
Subjects: Cryptography and Security (cs.CR)

Modern code reuse attacks are taking full advantage of bloated software. Attackers piece together short sequences of instructions in otherwise benign code to carry out malicious actions. Eliminating these reusable code snippets, known as gadgets, has become one of the prime concerns of attack surface reduction. The aim is to break these chains of gadgets, thereby making such code reuse attacks impossible or substantially less common. Previous work on attack surface reduction has typically tried to eliminate such attacks by subsetting the application, e.g. via user-specified inputs, configurations, or features, or by focusing on third-party libraries to achieve high gadget reductions with minimal interference to the application.
In this work we present a general, whole-program attack surface reduction technique called OCA that significantly reduces gadgets and has minor performance degradation. OCA requires no user inputs and leaves all features intact. OCA identifies specific program points and through analysis determines key function sets to enable/disable at runtime. The runtime system, thus, controls the set of enabled functions during execution, thereby significantly reducing the set of active gadgets an attacker can use, and by extension, cutting down the set of active gadget chains dramatically. On SPEC CPU 2017, our framework achieves 73.2% total gadget reduction with only 4% average slowdown. On 10 GNU coreutils applications, it achieves 87.2% reduction. On the nginx server it achieves 80.3% reduction with 2% slowdown. We also provide a gadget chain-breaking study across all applications, and show that our framework breaks the shell-spawning chain in all cases.

[9]  arXiv:2110.09563 [pdf, other]
Title: WONDER: Workload Optimized Network Defined Edge Routing
Authors: Oleg Berzin
Comments: 17 pages, 28 figures
Subjects: Networking and Internet Architecture (cs.NI)

The 5G standards enable cellular network capabilities that significantly improve key network characteristics such as latency, capacity, throughput and reliability, compared to the previous generations of wireless networks. It is, however, clear that in order to achieve these improvements in real network implementations, the supporting physical and logical infrastructure needs to be designed appropriately. The key components of this infrastructure are Radio Access Network, Edge Data Centers, Packet/Optical Interconnection Fabric and Edge Computing. This paper concentrates on the Edge Data Centers, Interconnection and Edge Computing capabilities that target ability to deliver high-performing services on the 5G network by means of intelligent network slicing, traffic routing and coordinated compute workload distribution. We propose new methods of ensuring optimal traffic routing and edge compute workload placement under mobility conditions, subject to application requirements and constraints within a set of interconnected Edge Data Centers, utilizing Segment Routing/IPv6 and software defined control mechanisms.

[10]  arXiv:2110.09564 [pdf, other]
Title: BGaitR-Net: Occluded Gait Sequence reconstruction with temporally constrained model for gait recognition
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

Recent advancements in computational resources and Deep Learning methodologies have significantly benefited the development of intelligent vision-based surveillance applications. Gait recognition in the presence of occlusion is one of the challenging research topics in this area, and the solutions proposed by researchers to date lack robustness and also depend on several unrealistic constraints, which limits their practical applicability. We improve the state-of-the-art by developing novel deep learning-based algorithms to identify the occluded frames in an input sequence and then reconstruct these occluded frames by exploiting the spatio-temporal information present in the gait sequence. The multi-stage pipeline adopted in this work consists of key pose mapping, occlusion detection and reconstruction, and finally gait recognition. While the key pose mapping and occlusion detection phases are done via a graph sorting algorithm, reconstruction of occluded frames is done by fusing the key pose-specific information derived in the previous step with the spatio-temporal information contained in a gait sequence using a Bi-Directional Long Short-Term Memory. This occlusion reconstruction model has been trained using synthetically occluded CASIA-B and OU-ISIR data, and the trained model is termed the Bidirectional Gait Reconstruction Network (BGait-R-Net). Our LSTM-based model reconstructs occlusion and generates frames that are temporally consistent with the periodic pattern of a gait cycle, while simultaneously preserving the body structure.

[11]  arXiv:2110.09570 [pdf]
Title: A Data Bootstrapping Recipe for Low Resource Multilingual Relation Classification
Subjects: Computation and Language (cs.CL)

Relation classification (sometimes called 'extraction') requires trustworthy datasets for fine-tuning large language models, as well as for evaluation. Data collection is challenging for Indian languages, because they are syntactically and morphologically diverse, as well as different from resource-rich languages like English. Despite recent interest in deep generative models for Indian languages, relation classification is still not well served by public data sets. In response, we present IndoRE, a dataset with 21K entity and relation tagged gold sentences in three Indian languages, plus English. We start with a multilingual BERT (mBERT) based system that captures entity span positions and type information and provides competitive monolingual relation classification. Using this system, we explore and compare transfer mechanisms between languages. In particular, we study the accuracy efficiency tradeoff between expensive gold instances vs. translated and aligned 'silver' instances. We release the dataset for future research.

[12]  arXiv:2110.09571 [pdf, other]
Title: Hands Off: A Handshake Interaction Detection and Localization Model for COVID-19 Threat Control
Comments: 6 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV)

The COVID-19 outbreak has affected millions of people across the globe and is continuing to spread at a drastic scale. Out of the numerous steps taken to control the spread of the virus, social distancing has been a crucial and effective practice. However, recent reports of social distancing violations suggest the need for non-intrusive detection techniques to ensure safety in public spaces. In this paper, a real-time detection model is proposed to identify handshake interactions in a range of realistic scenarios with multiple people in the scene and also detect multiple interactions in a single frame. This is the first work that performs dyadic interaction localization in a multi-person setting. The efficacy of the proposed model was evaluated across two different datasets on more than 3200 frames, thus enabling a robust localization model in different environments. The proposed model is the first dyadic interaction localizer in a multi-person setting, which enables it to be used in public spaces to identify handshake interactions and thereby identify and mitigate COVID-19 transmission.

[13]  arXiv:2110.09574 [pdf, other]
Title: Multilingual Domain Adaptation for NMT: Decoupling Language and Domain Information with Adapters
Comments: Accepted at The Sixth Conference in Machine Translation (WMT21)
Subjects: Computation and Language (cs.CL)

Adapter layers are lightweight, learnable units inserted between transformer layers. Recent work explores using such layers for neural machine translation (NMT), to adapt pre-trained models to new domains or language pairs, training only a small set of parameters for each new setting (language pair or domain). In this work we study the compositionality of language and domain adapters in the context of Machine Translation. We aim to study 1) parameter-efficient adaptation to multiple domains and languages simultaneously (full-resource scenario) and 2) cross-lingual transfer in domains where parallel data is unavailable for certain language pairs (partial-resource scenario). We find that in the partial resource scenario a naive combination of domain-specific and language-specific adapters often results in 'catastrophic forgetting' of the missing languages. We study other ways to combine the adapters to alleviate this issue and maximize cross-lingual transfer. With our best adapter combinations, we obtain improvements of 3-4 BLEU on average for source languages that do not have in-domain data. For target languages without in-domain data, we achieve a similar improvement by combining adapters with back-translation. Supplementary material is available at https://tinyurl.com/r66stbxj

[14]  arXiv:2110.09578 [pdf, ps, other]
Title: Permutation Invariance of Deep Neural Networks with ReLUs
Authors: Diganta Mukhopadhyay (1), Kumar Madhukar (2), Mandayam Srivas (3) (Chennai Mathematical Institute (1), TCS Research (2))
Comments: There are 31 pages and 2 figures in this document. This paper was submitted to the 23rd International Conference on Verification, Model Checking, and Abstract Interpretation, but was not selected for publication
Subjects: Logic in Computer Science (cs.LO); Machine Learning (cs.LG)

Consider a deep neural network (DNN) that is being used to suggest the direction in which an aircraft must turn to avoid a possible collision with an intruder aircraft. Informally, such a network is well-behaved if it asks the own ship to turn right (left) when an intruder approaches from the left (right). Consider another network that takes four inputs -- the cards dealt to the players in a game of contract bridge -- and decides which team can bid game. Loosely speaking, if you exchange the hands of partners (north and south, or east and west), the decision would not change. However, it will change if, say, you exchange north's hand with east. This permutation invariance property, for certain permutations at input and output layers, is central to the correctness and robustness of these networks.
This paper proposes a sound, abstraction-based technique to establish permutation invariance in DNNs with ReLU as the activation function. The technique computes an over-approximation of the reachable states, and an under-approximation of the safe states, and propagates this information across the layers, both forward and backward. The novelty of our approach lies in a useful tie-class analysis, that we introduce for forward propagation, and a scalable 2-polytope under-approximation method that escapes the exponential blow-up in the number of regions during backward propagation.
An experimental comparison shows the efficiency of our algorithm over that of verifying permutation invariance as a two-safety property (using FFNN verification over two copies of the network).
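
To make the property concrete, here is a small Python sketch that searches for violations of input-permutation invariance by random sampling on a toy ReLU network. It is plain testing, not the sound abstraction-based technique of the paper: it can find counterexamples but can never prove invariance. The network weights, the function names, and the restriction to an identity permutation on the outputs are all assumptions made for the example.

```python
import random


def relu(v):
    return [max(0.0, x) for x in v]


def affine(W, b, x):
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def forward(layers, x):
    """layers is a list of (W, b) pairs; ReLU follows every layer except the last."""
    for i, (W, b) in enumerate(layers):
        x = affine(W, b, x)
        if i < len(layers) - 1:
            x = relu(x)
    return x


def find_counterexample(layers, perm, n_samples=1000, seed=0, tol=1e-9):
    """Return an input whose output changes when inputs are permuted, or None if none is found."""
    rng = random.Random(seed)
    for _ in range(n_samples):
        x = [rng.uniform(-1.0, 1.0) for _ in perm]
        x_perm = [x[j] for j in perm]
        y, y_perm = forward(layers, x), forward(layers, x_perm)
        if any(abs(a - c) > tol for a, c in zip(y, y_perm)):
            return x
    return None


swap = [1, 0]                                       # exchange the two inputs
hidden = ([[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0])
symmetric = [hidden, ([[1.0, 1.0]], [0.0])]         # computes |x0 - x1|
skewed = [hidden, ([[1.0, 2.0]], [0.0])]            # weights the two directions differently
print("symmetric net:", find_counterexample(symmetric, swap))
print("skewed net:   ", find_counterexample(skewed, swap))
```

A `None` result only means that no violation showed up in the sampled inputs, which is exactly the gap that a sound verification method is meant to close.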

[15]  arXiv:2110.09580 [pdf, other]
Title: Flexible Accuracy for Differential Privacy
Comments: 42 pages
Subjects: Cryptography and Security (cs.CR)

Differential Privacy (DP) has become a gold standard in privacy-preserving data analysis. While it provides one of the most rigorous notions of privacy, there are many settings where its applicability is limited.
Our main contribution is in augmenting differential privacy with Flexible Accuracy, which allows small distortions in the input (e.g., dropping outliers) before measuring accuracy of the output, allowing one to extend DP mechanisms to high-sensitivity functions. We present mechanisms that can help in achieving this notion for functions that had no meaningful differentially private mechanisms previously. In particular, we illustrate an application to differentially private histograms, which in turn yields mechanisms for revealing the support of a dataset or the extremal values in the data. Analyses of our constructions exploit new versatile composition theorems that facilitate modular design.
All the above extensions use our new definitional framework, which is in terms of "lossy Wasserstein distance" -- a 2-parameter error measure for distributions. This may be of independent interest.
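
For orientation, the following Python sketch shows the standard Laplace mechanism for a histogram, i.e. the baseline notion of differential privacy that the abstract builds on. It does not implement Flexible Accuracy or the lossy Wasserstein distance. The function names and toy data are assumptions, the neighbouring relation is assumed to be add/remove of one record (L1 sensitivity 1), and the fixed seed is for reproducibility only and would not be used in a real release.

```python
import random
from collections import Counter


def laplace_noise(rng, scale):
    """Laplace(0, scale) sample, drawn as the difference of two exponentials with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_histogram(values, bins, epsilon, seed=None):
    """Noisy count for every bin in `bins`; values outside `bins` are ignored."""
    rng = random.Random(seed)
    counts = Counter(values)
    scale = 1.0 / epsilon   # sensitivity 1 under the add/remove-one-record assumption
    return {b: counts.get(b, 0) + laplace_noise(rng, scale) for b in bins}


data = ["A", "B", "A", "O", "A", "AB", "O"]
print(dp_histogram(data, bins=["A", "B", "AB", "O"], epsilon=0.5, seed=7))
```

Noise is added to every bin, including empty ones, so the noisy counts by themselves do not reveal which bins are actually occupied; the abstract's mention of mechanisms for revealing the support should be read against this baseline.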

[16]  arXiv:2110.09581 [pdf, other]
Title: Improving GNSS Positioning using Neural Network-based Corrections
Comments: 13 pages, 6 figures, submitted to ION GNSS+ 2021
Subjects: Robotics (cs.RO)

Deep Neural Networks (DNNs) are a promising tool for Global Navigation Satellite System (GNSS) positioning in the presence of multipath and non-line-of-sight errors, owing to their ability to model complex errors using data. However, developing a DNN for GNSS positioning presents various challenges, such as 1) poor numerical conditioning caused by large variations in measurements and position values across the globe, 2) varying number and order within the set of measurements due to changing satellite visibility, and 3) overfitting to available data. In this work, we address the aforementioned challenges and propose an approach for GNSS positioning by applying DNN-based corrections to an initial position guess. Our DNN learns to output the position correction using the set of pseudorange residuals and satellite line-of-sight vectors as inputs. The limited variation in these input and output values improves the numerical conditioning for our DNN. We design our DNN architecture to combine information from the available GNSS measurements, which vary both in number and order, by leveraging recent advancements in set-based deep learning methods. Furthermore, we present a data augmentation strategy for reducing overfitting in the DNN by randomizing the initial position guesses. We first perform simulations and show an improvement in the initial positioning error when our DNN-based corrections are applied. After this, we demonstrate that our approach outperforms a WLS baseline on real-world data. Our implementation is available at github.com/Stanford-NavLab/deep_gnss.

[17]  arXiv:2110.09584 [pdf, other]
Title: Set-based State Estimation with Probabilistic Consistency Guarantee under Epistemic Uncertainty
Subjects: Systems and Control (eess.SY); Robotics (cs.RO)

Consistent state estimation is challenging, especially under the epistemic uncertainties arising from learned (nonlinear) dynamic and observation models. In this work, we develop a set-based estimation algorithm, that produces zonotopic state estimates that respect the epistemic uncertainties in the learned models, in addition to the aleatoric uncertainties. Our algorithm guarantees probabilistic consistency, in the sense that the true state is always bounded by the zonotopes, with a high probability. We formally relate our set-based approach with the corresponding probabilistic approach (GP-EKF) in the case of learned (nonlinear) models. In particular, when linearization errors and aleatoric uncertainties are omitted, and epistemic uncertainties are simplified, our set-based approach reduces to its probabilistic counterpart. Our method's improved consistency is empirically demonstrated in both a simulated pendulum domain and a real-world robot-assisted dressing domain, where the robot estimates the configuration of the human arm utilizing the force measurements at its end effector.
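
The zonotopic estimates mentioned above rest on a few simple set operations, sketched below in Python. This is only the linear propagation primitive under assumed toy dynamics, not the paper's algorithm for learned nonlinear models with epistemic uncertainty. A zonotope is represented here as a `(center, generators)` pair meaning {center + sum_i xi_i * g_i : |xi_i| <= 1}; the function names and the numbers are illustrative assumptions.

```python
def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def linear_map(A, center, gens):
    """Image of a zonotope under x -> A x: map the center and every generator."""
    return matvec(A, center), [matvec(A, g) for g in gens]


def minkowski_sum(z1, z2):
    """Sum of two zonotopes: add the centers and concatenate the generator lists."""
    (c1, g1), (c2, g2) = z1, z2
    return [a + b for a, b in zip(c1, c2)], g1 + g2


def interval_hull(center, gens):
    """Tightest axis-aligned box containing the zonotope."""
    radius = [sum(abs(g[i]) for g in gens) for i in range(len(center))]
    return [(c - r, c + r) for c, r in zip(center, radius)]


A = [[1.0, 1.0], [0.0, 1.0]]                        # toy linear dynamics (assumed)
state = ([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])      # unit box around the origin
noise = ([0.0, 0.0], [[0.1, 0.1]])                  # one bounded disturbance generator
center, gens = minkowski_sum(linear_map(A, *state), noise)
print("predicted set, interval hull:", interval_hull(center, gens))
```

Because both operations are exact for zonotopes and cost only matrix-vector products, a prediction step x+ = A x + w stays cheap while still enclosing every state consistent with the bounded disturbance.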

[18]  arXiv:2110.09585 [pdf, other]
Title: A-Optimal Active Learning
Comments: 11 pages, submitted to SIAM journal on Mathematics of Data Science
Subjects: Machine Learning (cs.LG)

In this work we discuss the problem of active learning. We present an approach that is based on A-optimal experimental design of ill-posed problems and show how one can optimally label a data set by partially probing it, and use it to train a deep network. We present two approaches that make different assumptions on the data set. The first is based on a Bayesian interpretation of the semi-supervised learning problem with the graph Laplacian that is used for the prior distribution and the second is based on a frequentist approach, that updates the estimation of the bias term based on the recovery of the labels. We demonstrate that this approach can be highly efficient for estimating labels and training a deep network.

[19]  arXiv:2110.09587 [pdf]
Title: Robust Control of a Surface Vessel with Adaptive Rejection of Disturbances with Unknown Parameters
Comments: in Russian
Subjects: Systems and Control (eess.SY)

This paper solves the problem of station-keeping control of a surface vessel under conditions of sinusoidal disturbances with unknown parameters. The proposed control algorithm is based on the geometric approach with the use of the adaptive internal model and the extended observer. The paper analytically proves the boundedness of the trajectories of the system and their semiglobal convergence to an arbitrarily small set. The performance of the algorithm is illustrated by computer simulation.

[20]  arXiv:2110.09591 [pdf]
Title: Geometry-Based Output Robust Tracking Control of a Quadrotor
Comments: in Russian
Subjects: Systems and Control (eess.SY)

The paper solves the problem of tracking control of a quadrotor with unmeasurable pitch and roll angles based on the geometric approach with the use of the enhanced extended observer and the internal model. The proposed approach makes it possible to ensure the movement of a quadrotor in a horizontal plane along a trajectory given in the form of a sinusoidal or second-order polynomial function with semiglobal asymptotic convergence of the tracking errors to zero.

[21]  arXiv:2110.09593 [pdf, other]
Title: Active Tapping via Gaussian Process for Efficient Unknown Object Surface Reconstruction
Subjects: Robotics (cs.RO)

Object surface reconstruction brings essential benefits to robot grasping, object recognition, and object manipulation. When measuring the surface distribution of an unknown object by tapping, the greatest challenge is to select tapping positions efficiently and accurately without prior knowledge of the object region. Given a search range, we propose an active exploration method that efficiently and intelligently guides the tapping to learn the object surface without exhaustive and unnecessary off-surface tapping. We analyze the performance of our approach in modeling object surfaces within an exploration range larger than the object, using a robot arm equipped with an end-of-arm tapping tool to execute tapping motions. Experimental results show that the approach successfully models the surface of unknown objects with a relative 59% improvement in the proportion of necessary taps among all taps compared with state-of-the-art performance.

[22]  arXiv:2110.09598 [pdf, ps, other]
Title: Adversarial Domain Adaptation with Paired Examples for Acoustic Scene Classification on Different Recording Devices
Comments: Accepted for publication in the Proceedings of the 29th European Signal Processing Conference (EUSIPCO), Dublin, Ireland, 2021
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)

In classification tasks, the classification accuracy diminishes when the data is gathered in different domains. To address this problem, in this paper, we investigate several adversarial models for domain adaptation (DA) and their effect on the acoustic scene classification task. The studied models include several types of generative adversarial networks (GAN), with different loss functions, and the so-called cycle GAN which consists of two interconnected GAN models. The experiments are performed on the DCASE20 challenge task 1A dataset, in which we can leverage the paired examples of data recorded using different devices, i.e., the source and target domain recordings. The results of the experiments indicate that the best performing domain adaptation can be obtained using the cycle GAN, which achieves as much as 66% relative improvement in accuracy for the target domain device, with only a 6% relative decrease in accuracy on the source domain. In addition, by utilizing the paired data examples, we are able to improve the overall accuracy over the model trained using a larger unpaired data set, while decreasing the computational cost of the model training.

[23]  arXiv:2110.09599 [pdf, other]
Title: Label-Descriptive Patterns and their Application to Characterizing Classification Errors
Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL)

State-of-the-art deep learning methods achieve human-like performance on many tasks, but make errors nevertheless. Characterizing these errors in easily interpretable terms gives insight into whether a model is prone to making systematic errors, but also gives a way to act and improve the model. In this paper we propose a method that allows us to do so for arbitrary classifiers by mining a small set of patterns that together succinctly describe the input data that is partitioned according to correctness of prediction. We show this is an instance of the more general label description problem, which we formulate in terms of the Minimum Description Length principle. To discover good pattern sets we propose the efficient and hyperparameter-free Premise algorithm, which through an extensive set of experiments we show on both synthetic and real-world data performs very well in practice; unlike existing solutions it ably recovers ground truth patterns, even on highly imbalanced data over many unique items, or where patterns are only weakly associated to labels. Through two real-world case studies we confirm that Premise gives clear and actionable insight into the systematic errors made by modern NLP classifiers.

[24]  arXiv:2110.09600 [pdf, other]
Title: Who calls the shots? Rethinking Few-Shot Learning for Audio
Comments: WASPAA 2021
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)

Few-shot learning aims to train models that can recognize novel classes given just a handful of labeled examples, known as the support set. While the field has seen notable advances in recent years, they have often focused on multi-class image classification. Audio, in contrast, is often multi-label due to overlapping sounds, resulting in unique properties such as polyphony and signal-to-noise ratios (SNR). This leads to unanswered questions concerning the impact such audio properties may have on few-shot learning system design, performance, and human-computer interaction, as it is typically up to the user to collect and provide inference-time support set examples. We address these questions through a series of experiments. We introduce two novel datasets, FSD-MIX-CLIPS and FSD-MIX-SED, whose programmatic generation allows us to explore these questions systematically. Our experiments lead to audio-specific insights on few-shot learning, some of which are at odds with recent findings in the image domain: there is no best one-size-fits-all model, method, or support set selection criterion. Rather, it depends on the expected application scenario. Our code and data are available at https://github.com/wangyu/rethink-audio-fsl.

[25]  arXiv:2110.09601 [pdf, ps, other]
Title: Fair and Efficient Allocations of Chores under Bivalued Preferences
Comments: 25 pages
Subjects: Computer Science and Game Theory (cs.GT)

We study the problem of fair and efficient allocation of a set of indivisible chores to agents with additive cost functions. We consider the popular fairness notion of envy-freeness up to one good (EF1) with the efficiency notion of Pareto-optimality (PO). While it is known that an EF1+PO allocation exists and can be computed in pseudo-polynomial time in the case of goods, the same problem is open for chores.
Our first result is a strongly polynomial-time algorithm for computing an EF1+PO allocation for bivalued instances, where agents have (at most) two disutility values for the chores. To the best of our knowledge, this is the first non-trivial class of indivisible chores to admit an EF1+PO allocation and an efficient algorithm for its computation.
We also study the problem of computing an envy-free (EF) and PO allocation for the case of divisible chores. While the existence of an EF+PO allocation is known via competitive equilibrium with equal incomes, its efficient computation is open. Our second result shows that for bivalued instances, an EF+PO allocation can be computed in strongly polynomial-time.
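
The EF1 condition for chores can be made concrete with a small checker. The sketch below assumes the usual chores reading of EF1 (an agent stops envying once one chore is removed from its own bundle) and additive costs; the function names and the toy bivalued instance are illustrative assumptions, and this is only a verifier, not the algorithm from the paper.

```python
from typing import Dict, List


def bundle_cost(costs: Dict[str, float], bundle: List[str]) -> float:
    """Additive cost of a bundle of chores for one agent."""
    return sum(costs[chore] for chore in bundle)


def is_ef1_chores(costs: Dict[str, Dict[str, float]],
                  allocation: Dict[str, List[str]]) -> bool:
    """Check EF1 for chores (assumed definition).

    costs[i][c] is agent i's cost for chore c; allocation[i] is i's bundle.
    Agent i is EF1-satisfied if, after dropping one chore from its own
    bundle, its cost is no larger than its cost for any other bundle.
    """
    for i, own in allocation.items():
        own_cost = bundle_cost(costs[i], own)
        # Dropping the most costly own chore is the best single removal.
        reduced = own_cost - max((costs[i][c] for c in own), default=0.0)
        for j, other in allocation.items():
            if i != j and reduced > bundle_cost(costs[i], other):
                return False
    return True


# Hypothetical bivalued instance: every cost is either 1 or 3.
costs = {
    "a": {"c1": 1, "c2": 3, "c3": 3},
    "b": {"c1": 3, "c2": 1, "c3": 1},
}
balanced = {"a": ["c1"], "b": ["c2", "c3"]}
lopsided = {"a": ["c1", "c2", "c3"], "b": []}
print(is_ef1_chores(costs, balanced), is_ef1_chores(costs, lopsided))
```

Checking EF1 this way takes time polynomial in the number of agents and chores; the hard part addressed in the abstract is finding an allocation that is simultaneously EF1 and Pareto-optimal.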

[26]  arXiv:2110.09605 [pdf, other]
Title: Neural Synthesis of Footsteps Sound Effects with Generative Adversarial Networks
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)

Footsteps are among the most ubiquitous sound effects in multimedia applications. There is substantial research into understanding the acoustic features and developing synthesis models for footstep sound effects. In this paper, we present a first attempt at adopting neural synthesis for this task. We implemented two GAN-based architectures and compared the results with real recordings as well as six traditional sound synthesis methods. Our architectures reached realism scores as high as recorded samples, showing encouraging results for the task at hand.

[27]  arXiv:2110.09606 [pdf, other]
Title: Efficient Analysis of COVID-19 Clinical Data using Machine Learning Models
Subjects: Machine Learning (cs.LG)

Because of the rapid spread of COVID-19 to almost every part of the globe, huge volumes of data and case studies have been made available, providing researchers with a unique opportunity to find trends and make discoveries like never before, by leveraging such big data. This data is of many different varieties, and can be of different levels of veracity, e.g., precise, imprecise, uncertain, and missing, making it challenging to extract important information from such data. Yet, efficient analysis of this continuously growing and evolving COVID-19 data is crucial to inform -- often in real-time -- the relevant measures needed for controlling, mitigating, and ultimately avoiding viral spread. Applying machine learning based algorithms to this big data is a natural approach to take to this aim, since they can quickly scale to such data, and extract the relevant information in the presence of variety and different levels of veracity. This is important for COVID-19, and for potential future pandemics in general.
In this paper, we design a straightforward encoding of clinical data (on categorical attributes) into a fixed-length feature vector representation, and then propose a model that first performs efficient feature selection from such representation. We apply this approach on two clinical datasets of COVID-19 patients and then apply different machine learning algorithms downstream for classification purposes. We show that with the efficient feature selection algorithm, we can achieve a prediction accuracy of more than 90% in most cases. We also computed the importance of different attributes in the dataset using information gain. This can help policy makers to focus on only certain attributes for the purposes of studying this disease rather than focusing on multiple random factors that may not be very informative to patient outcomes.
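
As a rough illustration of how information gain ranks a categorical attribute, the following sketch computes the entropy of the labels minus the expected entropy after splitting on the attribute. The function names and the four-row toy data are assumptions made for the example; they are not taken from the paper's datasets.

```python
from collections import Counter
from math import log2
from typing import Dict, Hashable, List, Sequence


def entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of a label sequence."""
    total = len(labels)
    return -sum((n / total) * log2(n / total)
                for n in Counter(labels).values())


def information_gain(values: Sequence[Hashable],
                     labels: Sequence[Hashable]) -> float:
    """Entropy of labels minus the weighted entropy after splitting on values."""
    total = len(labels)
    groups: Dict[Hashable, List[Hashable]] = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy data: one attribute predicts the outcome, one does not.
outcome = ["pos", "pos", "neg", "neg"]
informative = ["yes", "yes", "no", "no"]
uninformative = ["a", "b", "a", "b"]
print(information_gain(informative, outcome))
print(information_gain(uninformative, outcome))
```

An attribute with higher gain separates the outcome classes better, which is the sense in which it can be called more important.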

[28]  arXiv:2110.09607 [pdf]
Title: Hierarchical Mobility Label Based Network: System Model and Performance Analysis
Authors: Oleg Berzin
Subjects: Networking and Internet Architecture (cs.NI)

Hierarchical Mobility Label Based Network (H-MLBN) is a new approach to the network layer mobility management problem that relies on an MPLS-aware control plane and an MPLS-based forwarding plane to provide IP mobility support for IPv4 and IPv6 mobile hosts and routers while being able to ensure optimal traffic delivery between the communicating devices. The hierarchical system is capable of both macro- and micro-mobility support without the use of Mobile IP and its derivatives, thus eliminating the user- and network-facing performance penalties associated with triangular routing and bi-directional tunneling. This paper presents a system model and provides performance analysis for H-MLBN and compares its performance with the Mobile IP based schemes. The results indicate significant performance improvements in the forwarding plane traffic delivery as well as the control plane network update costs.

[29]  arXiv:2110.09609 [pdf]
Title: Mobility Label Based Network: Hierarchical Mobility Management and Packet Forwarding Architecture
Authors: Oleg Berzin
Subjects: Networking and Internet Architecture (cs.NI)

Scalability of the network layer mobility management solution is one of the most important requirements for the mobility control plane. Mobility Label Based Network (MLBN) is a new approach to the network layer mobility management problem that relies solely on MPLS to provide both macro- and micro-mobility for IPv4 and IPv6 mobile hosts and routers. This new approach does not rely on the existing IP mobility management protocols such as Mobile IP and is based on the combination of Multi-Protocol BGP (MP-BGP) and MPLS. In the context of the MLBN the scalable control plane should be capable of efficient Mobility Label distribution while allowing the MPLS based forwarding plane to deliver mobile traffic in an optimal manner. This paper presents a hierarchical mobility management system capable of both macro- and micro-mobility support without the use of Mobile IP and its derivatives and allows scalable Mobility Label distribution and MPLS label stack based packet forwarding in support of optimal traffic delivery between the communicating mobile users.

[30]  arXiv:2110.09610 [pdf, other]
Title: A Survey on Machine Learning Techniques for Source Code Analysis
Subjects: Software Engineering (cs.SE); Machine Learning (cs.LG)

Context: The advancements in machine learning techniques have encouraged researchers to apply these techniques to a myriad of software engineering tasks that use source code analysis, such as testing and vulnerability detection. The large number of studies makes it challenging for the community to understand the current landscape. Objective: We aim to summarize the current knowledge in the area of applied machine learning for source code analysis. Method: We investigate studies belonging to twelve categories of software engineering tasks and corresponding machine learning techniques, tools, and datasets that have been applied to solve them. To do so, we carried out an extensive literature search and identified 364 primary studies published between 2002 and 2021. We summarize our observations and findings with the help of the identified studies. Results: Our findings suggest that the usage of machine learning techniques for source code analysis tasks is consistently increasing. We synthesize commonly used steps and the overall workflow for each task, and summarize the employed machine learning techniques. Additionally, we collate a comprehensive list of available datasets and tools usable in this context. Finally, we summarize the perceived challenges in this area that include availability of standard datasets, reproducibility and replicability, and hardware resources.

[31]  arXiv:2110.09615 [pdf, other]
Title: A Generalised Logical Layered Architecture for Blockchain Technology
Comments: 24 pages, 7 figures
Subjects: Cryptography and Security (cs.CR); Software Engineering (cs.SE)

Precision, validity, reliability, timeliness, availability, and granularity are the desired characteristics for data and information systems. However, due to the desired trait of data mutability, information systems have inherently lacked the ability to enforce data integrity without governance. A resolution to this challenge has emerged in the shape of blockchain architecture, which ensures immutability of stored information, whilst remaining in an online state. Blockchain technology achieves this through the serial attachment of set-sized parcels of data called blocks. Links (likened to a chain) between these blocks are implemented using a cryptographic seal created using mathematical functions on the data inside the blocks. Practical implementations of blockchain vary by different components, concepts, and terminologies. Researchers proposed various architectural models using different layers to implement blockchain technologies. In this paper, we investigated those layered architectures for different use cases. We identified essential layers and components for a generalised blockchain architecture. We present a novel three-tiered storage model for the purpose of logically defining and categorising blockchain as a storage technology. We envision that this generalised model will be used as a guide when referencing and building any blockchain storage solution.
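
The chaining of blocks through a cryptographic seal can be sketched in a few lines. This is a minimal illustration only, assuming SHA-256 as the hash function and a made-up block layout (`index`, `data`, `prev_hash`, `hash`); it is not the layered architecture or the storage model proposed in the paper.

```python
import hashlib
import json
from typing import Dict, List

GENESIS_PREV = "0" * 64  # assumed placeholder for the first block's predecessor


def block_hash(index: int, data: str, prev_hash: str) -> str:
    """Seal a block by hashing its contents together with the previous hash."""
    payload = json.dumps(
        {"index": index, "data": data, "prev_hash": prev_hash},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_chain(records: List[str]) -> List[Dict]:
    """Attach records serially, each block pointing at its predecessor's hash."""
    chain, prev = [], GENESIS_PREV
    for index, data in enumerate(records):
        digest = block_hash(index, data, prev)
        chain.append({"index": index, "data": data,
                      "prev_hash": prev, "hash": digest})
        prev = digest
    return chain


def is_valid(chain: List[Dict]) -> bool:
    """Recompute every seal and check that the links are unbroken."""
    prev = GENESIS_PREV
    for block in chain:
        expected = block_hash(block["index"], block["data"], block["prev_hash"])
        if block["prev_hash"] != prev or block["hash"] != expected:
            return False
        prev = block["hash"]
    return True


chain = build_chain(["record A", "record B", "record C"])
print(is_valid(chain))
```

Changing the data of any earlier block changes its recomputed hash, so the stored seal no longer matches and validation fails; this is the sense in which the stored information is tamper-evident.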

[32]  arXiv:2110.09619 [pdf, other]
Title: Further Generalizations of the Jaccard Index
Comments: 10 pages, 9 figures, a working manuscript
Subjects: Machine Learning (cs.LG)

Quantifying the similarity between two sets constitutes a particularly interesting and useful operation in several theoretical and applied problems involving set theory. Aimed at quantifying the similarity between two sets, the Jaccard index has been extensively used in the most diverse types of problems, also motivating respective generalizations. The present work addresses further generalizations of this index, including its modification into a coincidence index capable of accounting also for the level of interiority of the sets, an extension for sets in continuous vector spaces, the consideration of weights associated to the involved set elements, the generalization to densities and generic scalar fields, as well as a means to quantify the joint interdependence between random variables. The possibility of taking into account more than two sets is also addressed, including the description of an index capable of quantifying the level of chaining between three sets. Several of the described and suggested generalizations have been illustrated with respect to numeric case examples. It is also posited that these indices can play an important role while analyzing and integrating datasets in modeling approaches and pattern recognition activities.
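
For reference, the basic Jaccard index and one common weighted generalization can be sketched as follows. The weighted form shown is the standard min/max ratio (sometimes called the Ruzicka similarity); it is included as an assumption about what a weighted extension can look like and is not necessarily the formulation used in the paper.

```python
from typing import Dict, Hashable, Iterable


def jaccard(a: Iterable[Hashable], b: Iterable[Hashable]) -> float:
    """Classic Jaccard index: |A intersect B| / |A union B|."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 1.0  # convention chosen here for two empty sets
    return len(set_a & set_b) / len(union)


def weighted_jaccard(x: Dict[Hashable, float],
                     y: Dict[Hashable, float]) -> float:
    """Min/max generalization for non-negative element weights."""
    keys = set(x) | set(y)
    numerator = sum(min(x.get(k, 0.0), y.get(k, 0.0)) for k in keys)
    denominator = sum(max(x.get(k, 0.0), y.get(k, 0.0)) for k in keys)
    return numerator / denominator if denominator > 0 else 1.0


print(jaccard({1, 2, 3}, {2, 3, 4}))
print(weighted_jaccard({"a": 1.0, "b": 2.0}, {"a": 1.0, "b": 1.0}))
```

With weights restricted to 0 and 1, the weighted form reduces to the classic index, which is one way to sanity-check a generalization.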

[33]  arXiv:2110.09621 [pdf, other]
Title: Probabilistic Semantic Data Association for Collaborative Human-Robot Sensing
Comments: 15 pages, 14 figures
Subjects: Robotics (cs.RO)

In collaborative human-robot semantic sensing problems, e.g. for scientific exploration, robots could potentially overtrust information given by a human partner, resulting in suboptimal state estimation and poor team performance. When humans cannot be treated as oracles, robots need to update state beliefs to correctly account for possible discrepancies between human semantic observations and the actual world states which lead to those observations. This work develops strategies for rigorous online calculation of probabilistic semantic data association (PSDA) probabilities for semantic likelihoods in general settings, unlike previous work which developed naive or heuristic approximations for specific settings. The new PSDA method is incorporated into a hybrid Bayesian data fusion scheme which uses Gaussian mixture priors for object states and softmax functions for semantic human sensor observation likelihoods, and is demonstrated in Monte Carlo simulations of collaborative multi-object search missions featuring a range of relevant human sensing characteristics (e.g. false detection rate). It is shown that PSDA leads to robust estimation of observation association probabilities under a wide range of conditions whenever semantic human sensor data contain significant target reference ambiguities for autonomous object search and localization.

[34]  arXiv:2110.09622 [pdf, other]
Title: Robust Representation and Efficient Feature Selection Allows for Effective Clustering of SARS-CoV-2 Variants
Subjects: Machine Learning (cs.LG); Quantitative Methods (q-bio.QM)

The widespread availability of large amounts of genomic data on the SARS-CoV-2 virus, as a result of the COVID-19 pandemic, has created an opportunity for researchers to analyze the disease at a level of detail unlike any virus before it. On the one hand, this will help biologists, policy makers and other authorities to make timely and appropriate decisions to control the spread of the coronavirus. On the other hand, such studies will help to more effectively deal with any possible future pandemic. Since the SARS-CoV-2 virus contains different variants, each of them having different mutations, performing any analysis on such data becomes a difficult task. It is well known that much of the variation in the SARS-CoV-2 genome happens disproportionately in the spike region of the genome sequence -- the relatively short region which codes for the spike protein(s). Hence, in this paper, we propose an approach to cluster spike protein sequences in order to study the behavior of different known variants that are increasing at a very high rate throughout the world. We use a k-mers based approach to first generate a fixed-length feature vector representation for the spike sequences. We then show that with the appropriate feature selection, we can efficiently and effectively cluster the spike sequences based on the different variants. Using a publicly available set of SARS-CoV-2 spike sequences, we perform clustering of these sequences using both hard and soft clustering methods and show that with our feature selection methods, we can achieve higher F1 scores for the clusters.
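
A k-mers based fixed-length representation can be sketched as a count vector over all length-k strings of an alphabet. The function name, the two-letter toy alphabet, and the choice to skip k-mers containing unknown symbols are assumptions for illustration; the paper's exact encoding and feature selection are not reproduced here.

```python
from itertools import product
from typing import List


def kmer_vector(sequence: str, k: int, alphabet: str) -> List[int]:
    """Count every length-k substring; vector length is len(alphabet) ** k."""
    # Fixed ordering of all possible k-mers gives every sequence the same layout.
    index = {"".join(p): i for i, p in enumerate(product(alphabet, repeat=k))}
    vector = [0] * len(index)
    for start in range(len(sequence) - k + 1):
        kmer = sequence[start:start + k]
        if kmer in index:  # k-mers with symbols outside the alphabet are skipped
            vector[index[kmer]] += 1
    return vector


# Toy alphabet "AB": positions correspond to AA, AB, BA, BB.
print(kmer_vector("ABAB", 2, "AB"))
```

For protein sequences the alphabet would be the 20 standard amino-acid letters, so the vector length grows as 20 to the power k, which is one reason a feature selection step afterwards is attractive.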

[35]  arXiv:2110.09624 [pdf]
Title: Ideal Partition of Resources for Metareasoning
Comments: 12 pages, 5 figures. January 1990 technical report on principles of metareasoning and bounded optimality
Subjects: Artificial Intelligence (cs.AI)

We can achieve significant gains in the value of computation by metareasoning about the nature or extent of base-level problem solving before executing a solution. However, resources that are irrevocably committed to metareasoning are not available for executing a solution. Thus, it is important to determine the portion of resources we wish to apply to metareasoning and control versus to the execution of a solution plan. Recent research on rational agency has highlighted the importance of limiting the consumption of resources by metareasoning machinery. We shall introduce the metareasoning-partition problem--the problem of ideally apportioning costly reasoning resources to planning a solution versus applying resource to executing a solution to a problem. We exercise prototypical metareasoning-partition models to probe the relationships between time allocated to metareasoning and to execution for different problem classes. Finally, we examine the value of metareasoning in the context of our functional analyses.

[36]  arXiv:2110.09635 [pdf, other]
Title: A ground-truth dataset of real security patches
Authors: Sofia Reis, Rui Abreu
Subjects: Cryptography and Security (cs.CR); Software Engineering (cs.SE)

Training machine learning approaches for vulnerability identification and producing reliable tools to assist developers in implementing quality software -- free of vulnerabilities -- is challenging due to the lack of large datasets and real data. Researchers have been looking at these issues and building datasets. However, these datasets usually miss natural language artifacts and programming language diversity. We scraped the entire CVE details database for GitHub references and augmented the data with 3 security-related datasets. We used the data to create a ground-truth dataset of natural language artifacts (such as commit messages, commit comments, and summaries), meta-data and code changes. Our dataset integrates a total of 8057 security-relevant commits -- the equivalent to 5942 security patches -- from 1339 different projects spanning 146 different types of vulnerabilities and 20 languages. A dataset of 110k non-security-related commits is also provided. Data and scripts are all available on GitHub. Data is stored in a .CSV file. Codebases can be downloaded using our scripts. Our dataset is a valuable asset to answer research questions on different topics such as the identification of security-relevant information using NLP models; software engineering and security best practices; vulnerability detection and patching; and security program analysis.

[37]  arXiv:2110.09637 [pdf, other]
Title: Go with the Flow? A Large-Scale Analysis of Health Care Delivery Networks in the United States Using Hodge Theory
Subjects: Social and Information Networks (cs.SI); Algebraic Topology (math.AT)

Health care delivery is a collaborative process, requiring close coordination among networks of providers with specialized expertise. Yet in the United States, care is often spread across multiple disconnected providers (e.g., primary care physicians, specialists), leading to fragmented care delivery networks, and contributing to higher costs and lower quality. While this problem is well known, there are relatively few quantitative tools available for characterizing care delivery networks at scale, thereby inhibiting deeper understanding of care fragmentation and efforts to address it. In this study, we conduct a large-scale analysis of care delivery networks across the United States using the discrete Hodge decomposition, an emerging method of topological data analysis. Using this technique, we decompose networks of patient flows among physicians into three orthogonal subspaces: gradient (acyclic flow), harmonic (global cyclic flow), and curl (local cyclic flow). We document substantial variation in the relative importance of each subspace, suggesting that there may be systematic differences in the organization of care delivery networks across health care markets. Moreover, we find that the relative importance of each subspace is predictive of local care cost and quality, with outcomes tending to be better with greater curl flow and worse with greater harmonic flow.

[38]  arXiv:2110.09638 [pdf, other]
Title: Repeated Games, Optimal Channel Capture, and Open Problems for Slotted Multiple Access
Authors: Michael J. Neely
Comments: 19 pages
Subjects: Computer Science and Game Theory (cs.GT); Optimization and Control (math.OC)

This paper revisits a classical problem of slotted multiple access with success, idle, and collision events on each slot. First, results of a 2-user multiple access game are reported. The game was conducted at the University of Southern California over multiple semesters and involved competitions between student-designed algorithms. An algorithm called 4-State was a consistent winner. This algorithm is analyzed and shown to have an optimal expected score when competing against an independent version of itself. The structure of 4-State motivates exploration of the open question of how to minimize the expected time to capture the channel for an $n$-user situation. It is assumed that the system delivers perfect feedback on the number of users who transmitted at the end of each slot. An efficient algorithm is developed and conjectured to have an optimal expected capture time for all positive integers $n$. Optimality is proven in the special cases $n \in \{1, 2, 3, 4, 6\}$ using a novel analytical technique that introduces virtual users with enhanced capabilities.

[39]  arXiv:2110.09641 [pdf, other]
Title: Dynamic Feature Alignment for Semi-supervised Domain Adaptation
Comments: BMVC 2021
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Most research on domain adaptation has focused on the purely unsupervised setting, where no labeled examples in the target domain are available. However, in many real-world scenarios, a small amount of labeled target data is available and can be used to improve adaptation. We address this semi-supervised setting and propose to use dynamic feature alignment to address both inter- and intra-domain discrepancy. Unlike previous approaches, which attempt to align source and target features within a mini-batch, we propose to align the target features to a set of dynamically updated class prototypes, which we use both for minimizing divergence and pseudo-labeling. By updating based on class prototypes, we avoid problems that arise in previous approaches due to class imbalances. Our approach, which doesn't require extensive tuning or adversarial training, significantly improves the state of the art for semi-supervised domain adaptation. We provide a quantitative evaluation on two standard datasets, DomainNet and Office-Home, and performance analysis.

[40]  arXiv:2110.09643 [pdf, other]
Title: In-memory Multi-valued Associative Processor
Subjects: Hardware Architecture (cs.AR)

In-memory associative processor architectures are offered as a great candidate to overcome the memory-wall bottleneck and to enable vector/parallel arithmetic operations. In this paper, we extend the functionality of the associative processor to multi-valued arithmetic. To allow for in-memory compute implementation of arithmetic or logic functions, we propose a structured methodology enabling the automatic generation of the corresponding look-up tables (LUTs). We propose two approaches to build the LUTs: a first approach that formalizes the intuition behind LUT pass ordering and a more optimized approach that reduces the number of required write cycles. To demonstrate these methodologies, we present a novel ternary associative processor (TAP) architecture that is employed to implement efficient ternary vector in-place addition. A SPICE-MATLAB co-simulator is implemented to test the functionality of the TAP and to evaluate the performance of the proposed AP ternary in-place adder implementations in terms of energy, delay, and area. Results show that compared to the binary AP adder, the ternary AP adder results in a 12.25% and 6.2% reduction in energy and area, respectively. The ternary AP also demonstrates a 52.64% reduction in energy and a delay that is up to 9.5x smaller when compared to a state-of-the-art ternary carry-lookahead adder.
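
To make the idea of generating a look-up table for multi-valued addition concrete, the sketch below enumerates the truth table of a radix-3 full adder and uses it for digit-wise addition. It only illustrates the table itself; the names are hypothetical, and the pass ordering, write-cycle optimization, and in-place behaviour described in the abstract are not modelled.

```python
from itertools import product
from typing import Dict, List, Tuple


def adder_lut(radix: int) -> Dict[Tuple[int, int, int], Tuple[int, int]]:
    """Map (a, b, carry_in) to (sum_digit, carry_out) for one digit position."""
    lut = {}
    for a, b in product(range(radix), repeat=2):
        for carry_in in (0, 1):
            total = a + b + carry_in
            lut[(a, b, carry_in)] = (total % radix, total // radix)
    return lut


def add_digits(x: List[int], y: List[int], radix: int = 3) -> List[int]:
    """Add two equal-length little-endian digit vectors using only the LUT."""
    lut = adder_lut(radix)
    carry, out = 0, []
    for a, b in zip(x, y):
        digit, carry = lut[(a, b, carry)]
        out.append(digit)
    out.append(carry)  # final carry becomes the most significant digit
    return out


# 5 is [2, 1] and 8 is [2, 2] in little-endian ternary; 13 is [1, 1, 1].
print(len(adder_lut(3)), add_digits([2, 1], [2, 2]))
```

The ternary table has 18 entries (3 x 3 digit pairs times two carry values), compared with 8 for a binary full adder.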

[41]  arXiv:2110.09646 [pdf, other]
Title: Monotonic Simultaneous Translation with Chunk-wise Reordering and Refinement
Comments: To be published in WMT2021
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)

Recent simultaneous machine translation models are often trained with conventional full sentence translation corpora, leading to either excessive latency or the necessity to anticipate as-yet-unarrived words when dealing with a language pair whose word orders significantly differ. This is unlike human simultaneous interpreters, who produce largely monotonic translations at the expense of the grammaticality of a sentence being translated. In this paper, we thus propose an algorithm to reorder and refine the target side of a full sentence translation corpus, so that the words/phrases between the source and target sentences are aligned largely monotonically, using word alignment and non-autoregressive neural machine translation. We then train a widely used wait-k simultaneous translation model on this reordered-and-refined corpus. The proposed approach improves BLEU scores and resulting translations exhibit enhanced monotonicity with source sentences.
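
The wait-k policy mentioned above follows a fixed read/write schedule: the model first reads k source tokens, then alternates between writing one target token and reading one more source token until the source is exhausted. The sketch below generates that schedule only; the function name is an assumption, k is assumed to be at least 1, and no translation model is involved.

```python
from typing import List


def wait_k_actions(src_len: int, tgt_len: int, k: int) -> List[str]:
    """READ/WRITE schedule of a wait-k policy (assumes k >= 1)."""
    actions, read = [], 0
    for t in range(1, tgt_len + 1):
        # Before emitting target token t, min(k + t - 1, src_len) source
        # tokens must have been read.
        needed = min(k + t - 1, src_len)
        while read < needed:
            actions.append("READ")
            read += 1
        actions.append("WRITE")
    return actions


print(wait_k_actions(src_len=4, tgt_len=4, k=2))
```

Because the schedule never looks further than k - 1 tokens ahead of the output position, training data whose target side is largely monotonic with the source suits it better than freely reordered full-sentence translations.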

[42]  arXiv:2110.09647 [pdf, other]
Title: Relational Neural Markov Random Fields
Comments: StarAI 2021 workshop on IJCLR 2021
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)

Statistical Relational Learning (SRL) models have attracted significant attention due to their ability to model complex data while handling uncertainty. However, most of these models have been limited to discrete domains due to their limited potential functions. We introduce Relational Neural Markov Random Fields (RN-MRFs) which allow for handling of complex relational hybrid domains. The key advantage of our model is that it makes minimal data distributional assumptions and can seamlessly allow for human knowledge through potentials or relational rules. We propose a maximum pseudolikelihood estimation-based learning algorithm with importance sampling for training the neural potential parameters. Our empirical evaluations across diverse domains, such as image processing and relational object mapping, clearly demonstrate its effectiveness against non-neural counterparts.

[43]  arXiv:2110.09654 [pdf, other]
Title: Privacy-Preserving Mutual Authentication and Key Agreement Scheme for Multi-Server Healthcare System
Comments: 22 Pages
Journal-ref: Information Systems Frontiers, Vol. 23, No. 4, p. 835, 2021
Subjects: Cryptography and Security (cs.CR)

The use of different technologies and smart devices helps people to get medical services remotely, with multiple benefits. Thus, critical and sensitive data is exchanged between a user and a doctor. When health data is transmitted over a common channel, it becomes essential to preserve various privacy and security properties in the system. Further, the number of users of remote services is increasing exponentially day by day, and thus a single server cannot adequately handle all users due to verification overhead, server failure, and scalability issues. Thus, researchers proposed various authentication protocols for multi-server architecture, but most of them are vulnerable to different security attacks and require high computational resources during the implementation. To tackle privacy and security issues using fewer computational resources, we propose a privacy-preserving mutual authentication and key agreement protocol for a multi-server healthcare system. We discuss the proposed scheme's security analysis and performance results to understand its security strengths and the computational resource requirement, respectively. Further, we compare the security and performance results with recent relevant authentication protocols.

[44]  arXiv:2110.09658 [pdf, ps, other]
Title: System Norm Regularization Methods for Koopman Operator Approximation
Comments: 7 pages. arXiv admin note: text overlap with arXiv:2102.03613
Subjects: Systems and Control (eess.SY); Machine Learning (cs.LG); Dynamical Systems (math.DS)

Approximating the Koopman operator from data is numerically challenging when many lifting functions are considered. Even low-dimensional systems can yield unstable or ill-conditioned results in a high-dimensional lifted space. In this paper, Extended DMD and DMD with control, two popular methods for approximating the Koopman operator, are reformulated as convex optimization problems with linear matrix inequality constraints. Both hard asymptotic stability constraints and system norm regularizers are considered as methods to improve the numerical conditioning of the approximate Koopman operator. In particular, the $\mathcal{H}_\infty$ norm is used as a regularizer to penalize the input-output gain of the linear system defined by the Koopman operator. Weighting functions are then applied to penalize the system gain at particular frequencies.

[45]  arXiv:2110.09660 [pdf, other]
Title: BEV-SGD: Best Effort Voting SGD for Analog Aggregation Based Federated Learning against Byzantine Attackers
Subjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR); Distributed, Parallel, and Cluster Computing (cs.DC); Information Theory (cs.IT)

As a promising distributed learning technology, analog aggregation based federated learning over the air (FLOA) provides high communication efficiency and privacy provisioning in the edge computing paradigm. When all edge devices (workers) simultaneously upload their local updates to the parameter server (PS) through the commonly shared time-frequency resources, the PS can only obtain the averaged update rather than the individual local ones. As a result, such a concurrent transmission and aggregation scheme reduces the latency and costs of communication but makes FLOA vulnerable to Byzantine attacks which then degrade FLOA performance. For the design of Byzantine-resilient FLOA, this paper starts from analyzing the channel inversion (CI) power control mechanism that is widely used in existing FLOA literature. Our theoretical analysis indicates that although CI can achieve good learning performance in the non-attacking scenarios, its limited defensive capability means it fails to work well under Byzantine attacks. Then, we propose a novel defending scheme called best effort voting (BEV) power control policy integrated with stochastic gradient descent (SGD). Our BEV-SGD improves the robustness of FLOA to Byzantine attacks by allowing all the workers to send their local updates at their maximum transmit power. Under the strongest-attacking circumstance, we derive the expected convergence rates of FLOA with CI and BEV power control policies, respectively. The rate comparison reveals that our BEV-SGD outperforms its counterpart with CI in terms of better convergence behavior, which is verified by experimental simulations.

[46]  arXiv:2110.09663 [pdf]
Title: EILEEN: A recommendation system for scientific publications and grants
Comments: 16 pages, 3 figures, 2 tables
Subjects: Information Retrieval (cs.IR); Digital Libraries (cs.DL)

Finding relevant scientific articles is crucial for advancing knowledge. Recommendation systems are helpful for such purpose, although they have only been applied to science recently. This article describes EILEEN (Exploratory Innovator of LitEraturE Networks), a recommendation system for scientific publications and grants with open source code and datasets. We describe EILEEN's architecture for ingesting and processing documents and modeling the recommendation system and keyphrase estimator. Using a unique dataset of log-in user behavior, we validate our recommendation system against Latent Semantic Analysis (LSA) and the standard ranking from Elasticsearch (Lucene scoring). We find that a learning-to-rank with Random Forest achieves an AUC of 0.9, significantly outperforming both baselines. Our results suggest that we can substantially improve science recommendations and learn about scientists' behavior through their search behavior. We make our system available through eileen.io

[47]  arXiv:2110.09665 [pdf, other]
Title: Ensemble ALBERT on SQuAD 2.0
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)

Machine question answering is an essential yet challenging task in natural language processing. Recently, Pre-trained Contextual Embeddings (PCE) models like Bidirectional Encoder Representations from Transformers (BERT) and A Lite BERT (ALBERT) have attracted a lot of attention due to their great performance in a wide range of NLP tasks. In this paper, we utilized fine-tuned ALBERT models and implemented combinations of additional layers (e.g. attention layer, RNN layer) on top of them to improve model performance on the Stanford Question Answering Dataset (SQuAD 2.0). We implemented four different models with different layers on top of the ALBERT-base model, and two other models based on ALBERT-xlarge and ALBERT-xxlarge. We compared their performance in detail to our baseline model, ALBERT-base-v2 + ALBERT-SQuAD-out. Our best-performing individual model is ALBERT-xxlarge + ALBERT-SQuAD-out, which achieved an F1 score of 88.435 on the dev set. Furthermore, we have implemented three different ensemble algorithms to boost overall performance. By passing several best-performing models' results into our weighted voting ensemble algorithm, our final result ranks first on the Stanford CS224N Test PCE SQuAD Leaderboard with F1 = 90.123.
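
The weighted voting idea mentioned in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `weighted_vote`, the choice of per-model weights, and the convention that an empty string stands for "unanswerable" are all assumptions made for illustration.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Return the answer string with the largest total weight.

    predictions: one answer string per model ("" means unanswerable;
                 this convention is an assumption of the sketch).
    weights:     one non-negative weight per model (hypothetical,
                 for example each model's dev-set F1).
    """
    if len(predictions) != len(weights):
        raise ValueError("need exactly one weight per prediction")
    if not predictions:
        raise ValueError("need at least one prediction")

    # Accumulate the weight behind every distinct answer string.
    totals = defaultdict(float)
    for answer, weight in zip(predictions, weights):
        totals[answer] += weight

    # Ties go to the answer that was seen first (dicts keep insertion order).
    return max(totals.items(), key=lambda item: item[1])[0]


if __name__ == "__main__":
    print(weighted_vote(["Paris", "Lyon", "Paris"], [0.5, 0.8, 0.4]))
```

In this sketch two weaker models that agree can outvote a single stronger model, which is the usual motivation for weighting votes rather than taking the single best model's answer.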

[48]  arXiv:2110.09667 [pdf, other]
Title: Performance of Low Synchronization Orthogonalization Methods in Anderson Accelerated Fixed Point Solvers
Comments: 11 pages, 6 figures
Subjects: Numerical Analysis (math.NA)

Anderson Acceleration (AA) is a method to accelerate the convergence of fixed point iterations for nonlinear, algebraic systems of equations. Due to the requirement of solving a least squares problem at each iteration and a reliance on modified Gram-Schmidt for updating the iteration space, AA requires extra costly synchronization steps for global reductions. Moreover, the number of reductions in each iteration depends on the size of the iteration space. In this work, we introduce three low synchronization orthogonalization algorithms into AA within SUNDIALS that reduce the total number of global reductions per iteration to a constant of 2 or 3, independent of the size of the iteration space. A performance study demonstrates the reduced time required by the new algorithms at large processor counts with CPUs and demonstrates the predicted performance on multi-GPU architectures. Most importantly, we provide convergence and timing data for multiple numerical experiments to demonstrate reliability of the algorithms within AA and improved performance at parallel strong-scaling limits.

[49]  arXiv:2110.09670 [pdf, other]
Title: Private measurement of nonlinear correlations between data hosted across multiple parties
Subjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR); Computation (stat.CO); Machine Learning (stat.ML)

We introduce a differentially private method to measure nonlinear correlations between sensitive data hosted across two entities. We provide utility guarantees for our private estimator. To the best of our knowledge, ours is the first such private estimator of nonlinear correlations within a multi-party setup. The measure of nonlinear correlation we consider is distance correlation. This work has direct applications to private feature screening, private independence testing, private k-sample tests, private multi-party causal inference and private data synthesis, in addition to exploratory data analysis. Code access: A link to publicly access the code is provided in the supplementary file.
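
For readers unfamiliar with distance correlation, the following sketch computes the standard (non-private) sample distance correlation for two one-dimensional samples. It does not include the differential privacy mechanism or the multi-party protocol from the paper; the function names are hypothetical, and the O(n^2) pure-Python loops are chosen for readability only.

```python
import math


def _centered_distances(values):
    """Pairwise |v_i - v_j| matrix, double-centered by row, column and grand means."""
    n = len(values)
    dist = [[abs(a - b) for b in values] for a in values]
    row_means = [sum(row) / n for row in dist]
    grand_mean = sum(row_means) / n
    # The distance matrix is symmetric, so column means equal row means.
    return [
        [dist[i][j] - row_means[i] - row_means[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def distance_correlation(x, y):
    """Sample distance correlation of two equal-length 1-D samples (non-private)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    n = len(x)
    a = _centered_distances(x)
    b = _centered_distances(y)
    dcov2 = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n**2
    dvar_x = sum(a[i][j] ** 2 for i in range(n) for j in range(n)) / n**2
    dvar_y = sum(b[i][j] ** 2 for i in range(n) for j in range(n)) / n**2
    denom = math.sqrt(dvar_x * dvar_y)
    if denom == 0.0:
        return 0.0  # one sample is constant
    # Clamp tiny negative values caused by floating-point rounding.
    return math.sqrt(max(dcov2, 0.0) / denom)


if __name__ == "__main__":
    xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
    print(distance_correlation(xs, [v * v for v in xs]))    # nonlinear dependence
    print(distance_correlation(xs, [2 * v + 1 for v in xs]))  # linear dependence
```

The first call uses y = x^2 on a symmetric sample, where the Pearson correlation is zero but the distance correlation is clearly positive, which is why distance correlation is described as a measure of nonlinear dependence.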

[50]  arXiv:2110.09672 [pdf, ps, other]
Title: A survey on active noise control techniques -- Part II: Nonlinear systems
Comments: 59 pages, 9 figures
Subjects: Machine Learning (cs.LG); Systems and Control (eess.SY)

Part I of this paper reviewed the development of the linear active noise control (ANC) technique in the past decade. However, ANC systems might have to deal with some nonlinear components and the performance of linear ANC techniques may degrade in this scenario. To overcome this limitation, nonlinear ANC (NLANC) algorithms were developed. In Part II, we review the development of NLANC algorithms during the last decade. The contributions of heuristic ANC algorithms are outlined. Moreover, we emphasize recent advances of NLANC algorithms, such as spline ANC algorithms, kernel adaptive filters, and nonlinear distributed ANC algorithms. Then, we present recent applications of ANC technique including linear and nonlinear perspectives. Future research challenges regarding ANC techniques are also discussed.

[51]  arXiv:2110.09674 [pdf, other]
Title: Adaptive Distillation: Aggregating Knowledge from Multiple Paths for Efficient Distillation
Comments: Accepted to BMVC 2021 for publication
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

Knowledge Distillation is becoming one of the primary trends among neural network compression algorithms to improve the generalization performance of a smaller student model with guidance from a larger teacher model. This momentous rise in applications of knowledge distillation is accompanied by the introduction of numerous algorithms for distilling the knowledge such as soft targets and hint layers. Despite this advancement in different techniques for distilling the knowledge, the aggregation of different paths for distillation has not been studied comprehensively. This is of particular significance, not only because different paths have different importance, but also due to the fact that some paths might have negative effects on the generalization performance of the student model. Hence, we need to adaptively adjust the importance of each path to maximize the impact of distillation on the student model. In this paper, we explore different approaches for aggregating these different paths and introduce our proposed adaptive approach based on multitask learning methods. We empirically demonstrate the effectiveness of the proposed approach over other baselines on the applications of knowledge distillation in classification, semantic segmentation, and object detection tasks.

[52]  arXiv:2110.09677 [pdf, other]
Title: Accelerated Graph Learning from Smooth Signals
Subjects: Machine Learning (cs.LG); Signal Processing (eess.SP)

We consider network topology identification subject to a signal smoothness prior on the nodal observations. A fast dual-based proximal gradient algorithm is developed to efficiently tackle a strongly convex, smoothness-regularized network inverse problem known to yield high-quality graph solutions. Unlike existing solvers, the novel iterations come with global convergence rate guarantees and do not require additional step-size tuning. Reproducible simulated tests demonstrate the effectiveness of the proposed method in accurately recovering random and real-world graphs, markedly faster than state-of-the-art alternatives and without incurring an extra computational burden.

[53]  arXiv:2110.09681 [pdf, other]
Title: Permutation invariant graph-to-sequence model for template-free retrosynthesis and reaction prediction
Subjects: Machine Learning (cs.LG)

Synthesis planning and reaction outcome prediction are two fundamental problems in computer-aided organic chemistry for which a variety of data-driven approaches have emerged. Natural language approaches that model each problem as a SMILES-to-SMILES translation lead to a simple end-to-end formulation, reduce the need for data preprocessing, and enable the use of well-optimized machine translation model architectures. However, SMILES representations are not an efficient representation for capturing information about molecular structures, as evidenced by the success of SMILES augmentation to boost empirical performance. Here, we describe a novel Graph2SMILES model that combines the power of Transformer models for text generation with the permutation invariance of molecular graph encoders that mitigates the need for input data augmentation. As an end-to-end architecture, Graph2SMILES can be used as a drop-in replacement for the Transformer in any task involving molecule(s)-to-molecule(s) transformations. In our encoder, an attention-augmented directed message passing neural network (D-MPNN) captures local chemical environments, and the global attention encoder allows for long-range and intermolecular interactions, enhanced by graph-aware positional embedding. Graph2SMILES improves the top-1 accuracy of the Transformer baselines by $1.7\%$ and $1.9\%$ for reaction outcome prediction on USPTO_480k and USPTO_STEREO datasets respectively, and by $9.8\%$ for one-step retrosynthesis on the USPTO_50k dataset.

[54]  arXiv:2110.09695 [pdf, other]
Title: Tackling Dynamics in Federated Incremental Learning with Variational Embedding Rehearsal
Subjects: Machine Learning (cs.LG)

Federated Learning is a fast growing area of ML where the training datasets are extremely distributed, all while dynamically changing over time. Models need to be trained on clients' devices without any guarantees for either homogeneity or stationarity of the local private data. The need for continual training has also risen, due to the ever-increasing production of in-task data. However, pursuing both directions at the same time is challenging, since client data privacy is a major constraint, especially for rehearsal methods. Herein, we propose a novel algorithm to address the incremental learning process in an FL scenario, based on realistic client enrollment scenarios where clients can drop in or out dynamically. We first propose using deep Variational Embeddings that secure the privacy of the client data. Second, we propose a server-side training method that enables a model to rehearse the previously learnt knowledge. Finally, we investigate the performance of federated incremental learning in dynamic client enrollment scenarios. The proposed method shows parity with offline training on domain-incremental learning, addressing challenges in both the dynamic enrollment of clients and the domain shifting of client data.

[55]  arXiv:2110.09696 [pdf, other]
Title: Near-Optimal Quantum Algorithms for String Problems
Authors: Shyan Akmal, Ce Jin
Comments: To appear in SODA 2022
Subjects: Data Structures and Algorithms (cs.DS); Quantum Physics (quant-ph)

We study quantum algorithms for several fundamental string problems, including Longest Common Substring, Lexicographically Minimal String Rotation, and Longest Square Substring. These problems have been widely studied in the stringology literature since the 1970s, and are known to be solvable by near-linear time classical algorithms. In this work, we give quantum algorithms for these problems with near-optimal query complexities and time complexities. Specifically, we show that:
- Longest Common Substring can be solved by a quantum algorithm in $\tilde O(n^{2/3})$ time, improving upon the recent $\tilde O(n^{5/6})$-time algorithm by Le Gall and Seddighin (2020). Our algorithm uses the MNRS quantum walk framework, together with a careful combination of string synchronizing sets (Kempa and Kociumaka, 2019) and generalized difference covers.
- Lexicographically Minimal String Rotation can be solved by a quantum algorithm in $n^{1/2 + o(1)}$ time, improving upon the recent $\tilde O(n^{3/4})$-time algorithm by Wang and Ying (2020). We design our algorithm by first giving a new classical divide-and-conquer algorithm in near-linear time based on exclusion rules, and then speeding it up quadratically using nested Grover search and quantum minimum finding.
- Longest Square Substring can be solved by a quantum algorithm in $\tilde O(\sqrt{n})$ time. Our algorithm is an adaptation of the algorithm by Le Gall and Seddighin (2020) for the Longest Palindromic Substring problem, but uses additional techniques to overcome the difficulty that binary search no longer applies.
Our techniques naturally extend to other related string problems, such as Longest Repeated Substring, Longest Lyndon Substring, and Minimal Suffix.
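
As background for the Lexicographically Minimal String Rotation problem, here is a minimal classical sketch. It is a well-known linear-time two-pointer method and serves only as a reference point; it is neither the quantum algorithm nor the exclusion-rule divide-and-conquer algorithm described in the abstract. The function name `least_rotation` is hypothetical.

```python
def least_rotation(s):
    """Return a start index of a lexicographically minimal rotation of s."""
    n = len(s)
    if n == 0:
        return 0
    i, j, k = 0, 1, 0  # two candidate start positions and the matched length
    while i < n and j < n and k < n:
        a = s[(i + k) % n]
        b = s[(j + k) % n]
        if a == b:
            k += 1
            continue
        # The candidate whose character is larger loses, together with
        # the next k start positions after it.
        if a > b:
            i += k + 1
        else:
            j += k + 1
        if i == j:
            j += 1
        k = 0
    return min(i, j)


if __name__ == "__main__":
    text = "banana"
    start = least_rotation(text)
    print(start, text[start:] + text[:start])
```

Each mismatch rules out a whole block of start positions, so the two pointers advance a linear number of times in total.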

[56]  arXiv:2110.09698 [pdf, other]
Title: Neural Lexicon Reader: Reduce Pronunciation Errors in End-to-end TTS by Leveraging External Textual Knowledge
Comments: 5 pages, 3 figures
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)

End-to-end TTS suffers from high data requirements: it is difficult both for costly speech corpora to cover all necessary knowledge and for neural models to learn that knowledge, so additional knowledge needs to be injected manually. For example, to capture pronunciation knowledge on languages without regular orthography, a complicated grapheme-to-phoneme pipeline needs to be built based on a structured, large pronunciation lexicon, leading to extra, sometimes high, costs to extend neural TTS to such languages. In this paper, we propose a framework to learn to extract knowledge from unstructured external resources using Token2Knowledge attention modules. The framework is applied to build a novel end-to-end TTS model named Neural Lexicon Reader that extracts pronunciations from raw lexicon texts. Experiments support the potential of our framework: the model significantly reduces pronunciation errors in low-resource, end-to-end Chinese TTS, and the lexicon-reading capability can be transferred to other languages with a smaller amount of data.

[57]  arXiv:2110.09699 [pdf, ps, other]
Title: Image Quality Assessment in the Modern Age
Authors: Kede Ma, Yuming Fang
Comments: ACM Multimedia 2021 Tutorial
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)

This tutorial provides the audience with the basic theories, methodologies, and current progress of image quality assessment (IQA). From an actionable perspective, we will first revisit several subjective quality assessment methodologies, with emphasis on how to properly select visual stimuli. We will then present in detail the design principles of objective quality assessment models, supplemented by an in-depth analysis of their advantages and disadvantages. Both hand-engineered and (deep) learning-based methods will be covered. Moreover, the limitations of the conventional model comparison methodology for objective quality models will be pointed out, and novel comparison methodologies such as those based on the theory of "analysis by synthesis" will be introduced. Finally, we will discuss the real-world multimedia applications of IQA and give a list of open challenging problems, in the hope of encouraging more talented researchers and engineers to devote themselves to this exciting and rewarding research field.

[58]  arXiv:2110.09702 [pdf, other]
Title: A non-hierarchical attention network with modality dropout for textual response generation in multimodal dialogue systems
Comments: For ICASSP 2022
Subjects: Computation and Language (cs.CL)

Existing text- and image-based multimodal dialogue systems use the traditional Hierarchical Recurrent Encoder-Decoder (HRED) framework, which has an utterance-level encoder to model utterance representation and a context-level encoder to model context representation. Although pioneering efforts have shown promising performance, they still suffer from the following challenges: (1) the interaction between textual features and visual features is not fine-grained enough; (2) the context representation cannot provide a complete representation of the context. To address these issues, we propose a non-hierarchical attention network with modality dropout, which abandons the HRED framework and utilizes attention modules to encode each utterance and model the context representation. To evaluate our proposed model, we conduct comprehensive experiments on a public multimodal dialogue dataset. Automatic and human evaluations demonstrate that our proposed model outperforms the existing methods and achieves state-of-the-art performance.

[59]  arXiv:2110.09707 [pdf]
Title: PI(t)D(t) Control and Motion Profiling for Omnidirectional Mobile Robots
Authors: Michael Zeng
Comments: 12 pages, 13 figures
Subjects: Robotics (cs.RO); Systems and Control (eess.SY)

Recently, a trend is emerging toward human-servicing autonomous mobile robots, with diverse applications including delivery of supplies in hospitals, hotels, or labs where personnel are scarce, or reacting to indoor emergencies. However, existing autonomous mobile robot (AMR) motion is slow and inefficient, a foundational barrier to proliferation of human-servicing applications. This research has developed a motion control architecture that demonstrates the potential of several algorithms for increasing speed and efficiency. These include a novel PI(t)D(t) controller that sets integral and derivative gains as functions of time, and motion-profiling applied for holonomic motion. Resulting performance indicates potential for faster, more efficient AMRs, that maintain high levels of accuracy and repeatability. The hope is that this research can serve as a proof of concept for faster motion-control, to remove a key barrier to further use of human-servicing mobile robots.
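
The idea of a controller whose integral and derivative gains are functions of time can be sketched as below. This is only an illustration under stated assumptions: the class name `TimeVaryingPID`, the gain schedules, and the choice to multiply the accumulated integral by the current integral gain (rather than integrating gain times error) are hypothetical and are not taken from the paper.

```python
class TimeVaryingPID:
    """PID-style controller with K_i(t) and K_d(t) supplied as callables."""

    def __init__(self, kp, ki_of_t, kd_of_t):
        self.kp = kp                # constant proportional gain
        self.ki_of_t = ki_of_t      # callable: time -> integral gain
        self.kd_of_t = kd_of_t      # callable: time -> derivative gain
        self.integral = 0.0
        self.prev_error = None

    def update(self, error, t, dt):
        """Return the control output for the current error at time t."""
        if dt <= 0:
            raise ValueError("dt must be positive")
        self.integral += error * dt
        # No derivative term on the first sample (no previous error yet).
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return (
            self.kp * error
            + self.ki_of_t(t) * self.integral
            + self.kd_of_t(t) * derivative
        )


if __name__ == "__main__":
    # Hypothetical schedules: integral gain ramps up, derivative gain fades out.
    pid = TimeVaryingPID(
        kp=2.0,
        ki_of_t=lambda t: min(1.0, 0.5 * t),
        kd_of_t=lambda t: max(0.0, 0.2 - 0.1 * t),
    )
    for step, err in enumerate([1.0, 0.6, 0.3, 0.1]):
        print(pid.update(err, t=step * 0.1, dt=0.1))
```

With constant gain functions the sketch reduces to an ordinary discrete PID controller, which makes it easy to compare against a fixed-gain baseline.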

[60]  arXiv:2110.09710 [pdf, other]
Title: Inter-Sense: An Investigation of Sensory Blending in Fiction
Comments: 18 pages
Journal-ref: CEUR-WS.org 2021
Subjects: Computation and Language (cs.CL)

This study reports on the semantic organization of English sensory descriptors of the five basic senses of sight, hearing, touch, taste, and smell in a large corpus of over 8,000 fiction books. We introduce a large-scale text data-driven approach based on distributional-semantic word embeddings to identify and extract these descriptors as well as analyze their mixing interconnections in the resulting conceptual and sensory space. The findings are relevant for research on concept acquisition and representation, as well as for applications that can benefit from a better understanding of perceptual spaces of sensory experiences, in fiction, in particular, and in language in general.

[61]  arXiv:2110.09712 [pdf, other]
Title: Balancing Value Underestimation and Overestimation with Realistic Actor-Critic
Subjects: Machine Learning (cs.LG)

Model-free deep reinforcement learning (RL) has been successfully applied to challenging continuous control domains. However, poor sample efficiency prevents these methods from being widely used in real-world domains. This paper introduces a novel model-free algorithm, Realistic Actor-Critic (RAC), which can be combined with any off-policy RL algorithm to improve sample efficiency. RAC employs Universal Value Function Approximators (UVFA) to simultaneously learn a policy family with the same neural network, each policy with a different trade-off between underestimation and overestimation. To learn such policies, we introduce uncertainty punished Q-learning, which uses uncertainty from the ensembling of multiple critics to build various confidence bounds of the Q-function. We evaluate RAC on the MuJoCo benchmark, achieving 10x sample efficiency and 25% performance improvement on the most challenging Humanoid environment compared to SAC.

[62]  arXiv:2110.09714 [pdf, other]
Title: Black-box Adversarial Attacks on Commercial Speech Platforms with Minimal Information
Comments: A version of this paper appears in the proceedings of the 28th ACM Conference on Computer and Communications Security (CCS 2021)
Subjects: Cryptography and Security (cs.CR); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)

Adversarial attacks against commercial black-box speech platforms, including cloud speech APIs and voice control devices, have received little attention until recent years. The current "black-box" attacks all rely heavily on knowledge of prediction/confidence scores to craft effective adversarial examples, which service providers can intuitively defend against by not returning these scores. In this paper, we propose two novel adversarial attacks in more practical and rigorous scenarios. For commercial cloud speech APIs, we propose Occam, a decision-only black-box adversarial attack, where only final decisions are available to the adversary. In Occam, we formulate the decision-only AE generation as a discontinuous large-scale global optimization problem, and solve it by adaptively decomposing this complicated problem into a set of sub-problems and cooperatively optimizing each one. Our Occam is a one-size-fits-all approach, which achieves 100% success rates of attacks with an average SNR of 14.23dB, on a wide range of popular speech and speaker recognition APIs, including Google, Alibaba, Microsoft, Tencent, iFlytek, and Jingdong, outperforming the state-of-the-art black-box attacks. For commercial voice control devices, we propose NI-Occam, the first non-interactive physical adversarial attack, where the adversary does not need to query the oracle and has no access to its internal information and training data. We combine adversarial attacks with model inversion attacks, and thus generate physically effective audio AEs with high transferability without any interaction with target devices. Our experimental results show that NI-Occam can successfully fool Apple Siri, Microsoft Cortana, Google Assistant, iFlytek and Amazon Echo with an average SRoA of 52% and SNR of 9.65dB, shedding light on non-interactive physical attacks against voice control devices.

[63]  arXiv:2110.09720 [pdf, other]
Title: Rep Works in Speaker Verification
Comments: submitted to ICASSP 2022
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)

Multi-branch convolutional neural network architectures have attracted much attention in speaker verification, since the aggregation of multiple parallel branches can significantly improve performance. However, this design is not efficient enough at inference time due to the increase in model parameters and extra operations. In this paper, we present a new multi-branch network architecture, RepSPKNet, that uses a re-parameterization technique. With this technique, our backbone model has an efficient VGG-like inference state while its training state is a complicated multi-branch structure. We first introduce the specific structure of RepVGG into speaker verification and propose several variants of this structure. The performance is evaluated on VoxCeleb-based test sets. We demonstrate that both the branch diversity and the branch capacity play important roles in the design of RepSPKNet. Our RepSPKNet achieves state-of-the-art performance with a 1.5982% EER and a 0.1374 minDCF on VoxCeleb1-H.

[64]  arXiv:2110.09721 [pdf, other]
Title: Exploring the Sensory Spaces of English Perceptual Verbs in Natural Language Data
Comments: 19 pages
Journal-ref: CEUR-WS.org 2021
Subjects: Computation and Language (cs.CL)

In this study, we explore how language captures the meaning of words, in particular meaning related to sensory experiences learned from statistical distributions across texts. We focus on the most frequent perception verbs of English analyzed from an Agentive vs. Experiential distinction across the five basic sensory modalities: Visual (to look vs. to see), Auditory (to listen vs. to hear), Tactile (to touch vs. to feel), Olfactory (to smell), and Gustatory (to taste). We report on a data-driven approach based on distributional-semantic word embeddings and clustering models to identify and uncover the descriptor sensory spaces of the perception verbs. In the analysis, we identified differences and similarities of the generated descriptors based on qualitative and quantitative differences of the perceptual experience they denote. For instance, our results show that while the perceptual spaces of experiential verbs like to see and to hear show a more detached, logical way of knowing and learning, their agentive counterparts (to look, to listen) provide a more intentional as well as more intimate and intuitive way of discovering and interacting with the world around us. We believe that such an approach has a high potential to expand our understanding and the applicability of such sensory spaces to different fields of social and cultural analysis. Research on the semantic organization of sensory spaces for various applications might benefit from the Agentive/Experiential account to address the complexity of multiple senses wired with each other in still unexplored ways.

[65]  arXiv:2110.09722 [pdf, ps, other]
Title: Batched Lipschitz Bandits
Subjects: Machine Learning (cs.LG)

In this paper, we study the batched Lipschitz bandit problem, where the expected reward is Lipschitz and the reward observations are collected in batches. We introduce a novel landscape-aware algorithm, called Batched Lipschitz Narrowing (BLiN), that naturally fits into the batched feedback setting. In particular, we show that for a $T$-step problem with Lipschitz reward of zooming dimension $d_z$, our algorithm achieves theoretically optimal regret rate of $ \widetilde{\mathcal{O}} \left( T^{\frac{d_z + 1}{d_z + 2}} \right) $ using only $ \mathcal{O} \left( \frac{\log T}{d_z} \right) $ batches. For the lower bound, we show that in an environment with $B$-batches, for any policy $\pi$, there exists a problem instance such that the expected regret is lower bounded by $ \widetilde{\Omega} \left(R_z(T)^\frac{1}{1-\left(\frac{1}{d+2}\right)^B}\right) $, where $R_z (T)$ is the regret lower bound for vanilla Lipschitz bandits that depends on the zooming dimension $d_z$, and $d$ is the dimension of the arm space.

[66]  arXiv:2110.09726 [pdf, other]
Title: CGNN: Traffic Classification with Graph Neural Network
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)

Traffic classification associates packet streams with known application labels, which is vital for network security and network management. With the rise of NAT, port dynamics, and encrypted traffic, it is increasingly challenging to obtain unified traffic features for accurate classification. Many state-of-the-art traffic classifiers automatically extract features from the packet stream based on deep learning models such as convolution networks. Unfortunately, the compositional and causal relationships between packets are not well extracted in these deep learning models, which affects both prediction accuracy and generalization on different traffic types.
In this paper, we present a chained graph model on the packet stream to keep the chained compositional sequence. Next, we propose CGNN, a graph neural network based traffic classification method, which builds a graph classifier over automatically extracted features over the chained graph.
Extensive evaluation over real-world traffic data sets, including normal, encrypted and malicious labels, shows that CGNN improves the prediction accuracy by 23% to 29% for application classification, by 2% to 37% for malicious traffic classification, and reaches the same accuracy level for encrypted traffic classification. CGNN is quite robust in terms of the recall and precision metrics. We have extensively evaluated the parameter sensitivity of CGNN, which yields optimized parameters that are quite effective for traffic classification.

[67]  arXiv:2110.09733 [pdf, ps, other]
Title: Franchised Quantum Money
Subjects: Cryptography and Security (cs.CR); Quantum Physics (quant-ph)

The construction of public key quantum money based on standard cryptographic assumptions is a longstanding open question. Here we introduce franchised quantum money, an alternative form of quantum money that is easier to construct. Franchised quantum money retains the features of a useful quantum money scheme, namely unforgeability and local verification: anyone can verify banknotes without communicating with the bank. In franchised quantum money, every user gets a unique secret verification key, and the scheme is secure against counterfeiting and sabotage, a new security notion that appears in the franchised model. Finally, we construct franchised quantum money and prove security assuming one-way functions.

[68]  arXiv:2110.09734 [pdf, other]
Title: Mask-aware IoU for Anchor Assignment in Real-time Instance Segmentation
Comments: BMVC 2021, camera ready version
Subjects: Computer Vision and Pattern Recognition (cs.CV)

This paper presents Mask-aware Intersection-over-Union (maIoU) for assigning anchor boxes as positives and negatives during training of instance segmentation methods. Unlike conventional IoU or its variants, which only consider the proximity of two boxes, maIoU consistently measures the proximity of an anchor box with not only a ground truth box but also its associated ground truth mask. Thus, by additionally considering the mask, which in fact represents the shape of the object, maIoU enables more accurate supervision during training. We present the effectiveness of maIoU on a state-of-the-art (SOTA) assigner, ATSS, by replacing the IoU operation with our maIoU and training YOLACT, a SOTA real-time instance segmentation method. Using ATSS with maIoU consistently outperforms (i) ATSS with IoU by $\sim 1$ mask AP and (ii) baseline YOLACT with a fixed IoU threshold assigner by $\sim 2$ mask AP over different image sizes, and (iii) decreases the inference time by $25 \%$ owing to the use of fewer anchors. Then, exploiting this efficiency, we devise maYOLACT, a faster and $+6$ AP more accurate detector than YOLACT. Our best model achieves $37.7$ mask AP at $25$ fps on COCO test-dev, establishing a new state-of-the-art for real-time instance segmentation. Code is available at https://github.com/kemaloksuz/Mask-aware-IoU

[69]  arXiv:2110.09741 [pdf, other]
Title: Trajectory Prediction with Linguistic Representations
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)

Language allows humans to build mental models that interpret what is happening around them resulting in more accurate long-term predictions. We present a novel trajectory prediction model that uses linguistic intermediate representations to forecast trajectories, and is trained using trajectory samples with partially annotated captions. The model learns the meaning of each of the words without direct per-word supervision. At inference time, it generates a linguistic description of trajectories which captures maneuvers and interactions over an extended time interval. This generated description is used to refine predictions of the trajectories of multiple agents. We train and validate our model on the Argoverse dataset, and demonstrate improved accuracy results in trajectory prediction. In addition, our model is more interpretable: it presents part of its reasoning in plain language as captions, which can aid model development and can aid in building confidence in the model before deploying it.

[70]  arXiv:2110.09742 [pdf, other]
Title: Learning Not to Reconstruct Anomalies
Comments: Accepted in BMVC 2021
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Video anomaly detection is often seen as a one-class classification (OCC) problem due to the limited availability of anomaly examples. Typically, to tackle this problem, an autoencoder (AE) is trained to reconstruct the input with a training set consisting only of normal data. At test time, the AE is then expected to reconstruct the normal data well while reconstructing the anomalous data poorly. However, several studies have shown that, even when trained only on normal data, AEs can often start reconstructing anomalies as well, which degrades the anomaly detection performance. To mitigate this problem, we propose a novel methodology to train AEs with the objective of reconstructing only normal data, regardless of the input (i.e., normal or abnormal). Since no real anomalies are available in the OCC setting, the training is assisted by pseudo anomalies that are generated by manipulating normal data to simulate the out-of-normal-data distribution. We additionally propose two ways to generate pseudo anomalies: patch based and skip-frame based. Extensive experiments on three challenging video anomaly datasets demonstrate the effectiveness of our method in improving conventional AEs, achieving state-of-the-art performance.
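
The skip-frame idea mentioned above can be sketched as a clip sampler. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `make_training_pair`, the probability `p_pseudo`, and the stride choices in `skips` are all hypothetical, and pairing a skipped input with the consecutive clip as the target is one reading of "reconstructing only normal data, regardless of the input".

```python
import random


def make_training_pair(num_frames, clip_len, p_pseudo=0.5, skips=(2, 3, 4), rng=random):
    """Return (input_indices, target_indices, is_pseudo) for one training clip."""
    # Reserve enough room for the largest stride so every choice stays in range.
    span = (clip_len - 1) * max(skips) + 1
    if span > num_frames:
        raise ValueError("video too short for the largest skip")
    start = rng.randrange(num_frames - span + 1)

    # Target: consecutive frames, i.e. a normal clip.
    target = [start + i for i in range(clip_len)]

    # Input: either the same normal clip, or a pseudo anomaly made by skipping frames.
    is_pseudo = rng.random() < p_pseudo
    stride = rng.choice(skips) if is_pseudo else 1
    inputs = [start + i * stride for i in range(clip_len)]
    return inputs, target, is_pseudo
```

Skipping frames makes motion appear faster than in the normal data, so the input looks anomalous while every individual frame still comes from normal video.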

[71]  arXiv:2110.09745 [pdf, other]
Title: UAV Path Planning for Optimal Coverage of Areas with Nonuniform Importance
Comments: 9 pages, 5 figures
Subjects: Robotics (cs.RO)

Coverage of an inaccessible or difficult terrain with potential health and safety hazards, such as a volcanic region, is difficult yet crucial from scientific and meteorological perspectives. Areas contained within this region can provide us with different types of valuable information of varying importance. We present an algorithm to optimally cover a volcanic region in Hawai`i with an unmanned aerial vehicle (UAV). The target region is assigned a nonuniform coverage importance score distribution. For a specified battery capacity of the UAV, the optimization problem seeks the path that maximizes the total coverage area and the accumulated importance score while penalizing the revisiting of the same area. Trajectories are generated offline for the UAV based on the available power and coverage information map. The optimal trajectory minimizes the unspent battery power while enforcing that the UAV returns to its starting location. This multi-objective optimization problem is solved by using sequential quadratic programming. The details of the competitive optimization problem are discussed along with the analysis and simulation results to demonstrate the applicability of the proposed algorithm.

[72]  arXiv:2110.09748 [pdf, other]
Title: User Based Design and Evaluation Pipeline for Indoor Airships
Comments: Submitting to ICRA 2022
Subjects: Systems and Control (eess.SY)

Designing a controllable airship for non-expert users or preemptively evaluating the performance of desired airships has always been a very challenging problem. This paper explores the blimp design parameter space from the aspect of the user by considering various distributions of thrust, combinations of propulsive mechanisms, and balloon shapes. We provide open-source modular hardware and reconfigurable software design tools that allow inexperienced users to design a custom airship in a short time. Based on these design parameters, this paper develops a more engineering-focused evaluation system that can characterize the performance of different indoor blimps. An analytical comparison and some case studies that consider various points in the design parameter space have been conducted to prove the feasibility and validity of our design and evaluation system.

[73]  arXiv:2110.09749 [pdf, other]
Title: Importance Estimation from Multiple Perspectives for Keyphrase Extraction
Comments: 11 pages, 3 figures, Accepted by EMNLP 2021
Subjects: Computation and Language (cs.CL); Information Retrieval (cs.IR)

Keyphrase extraction is a fundamental task in Natural Language Processing, which usually contains two main parts: candidate keyphrase extraction and keyphrase importance estimation. When humans read documents, they typically measure the importance of a phrase according to its syntactic accuracy, information saliency, and concept consistency simultaneously. However, most existing keyphrase extraction approaches focus on only some of these perspectives, which leads to biased results. In this paper, we propose a new approach to estimate the importance of keyphrases from multiple perspectives (called \textit{KIEMP}) and further improve the performance of keyphrase extraction. Specifically, \textit{KIEMP} estimates the importance of a phrase with three modules: a chunking module to measure its syntactic accuracy, a ranking module to check its information saliency, and a matching module to judge the concept (i.e., topic) consistency between the phrase and the whole document. These three modules are seamlessly joined together via an end-to-end multi-task learning model, which helps the three parts enhance each other and balances the effects of the three perspectives. Experimental results on six benchmark datasets show that \textit{KIEMP} outperforms the existing state-of-the-art keyphrase extraction approaches in most cases.

[74]  arXiv:2110.09751 [pdf, other]
Title: A Lightweight, High-Extension, Planar 3-Degree-of-Freedom Manipulator Using Pinched Bistable Tapes
Comments: ICRA 2022
Subjects: Robotics (cs.RO)

To facilitate sensing and physical interaction in remote and/or constrained environments, high-extension, lightweight robot manipulators are easier to transport and reach substantially further than traditional serial chain manipulators. We propose a novel planar 3-degree-of-freedom manipulator that achieves low weight and high extension through the use of a pair of spooling bistable tapes, commonly used in self-retracting tape measures, which are pinched together to form a reconfigurable revolute joint. The pinching action flattens the tapes to produce a localized bending region, resulting in a revolute joint that can change its orientation by cable tension and its location on the tapes through friction-driven movement of the pinching mechanism. We present the design, implementation, kinematic modeling, stiffness behavior of the revolute joint, and quasi-static performance of this manipulator. In particular, we demonstrate the ability of the manipulator to reach specified targets in free space, reach a 2D target with various orientations, and maintain an end-effector angle or stationary bending point while changing the other. The long-term goal of this work is to integrate the manipulator with an unmanned aerial vehicle to enable more capable aerial manipulation.

[75]  arXiv:2110.09753 [pdf, other]
Title: Unifying Multimodal Transformer for Bi-directional Image and Text Generation
Comments: ACM MM 2021 (Industrial Track). Code: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Multimedia (cs.MM)

We study the joint learning of image-to-text and text-to-image generation, which are naturally bi-directional tasks. Typical existing works design two separate task-specific models, one for each task, which imposes expensive design efforts. In this work, we propose a unified image-and-text generative framework based on a single multimodal model to jointly study the bi-directional tasks. We adopt Transformer as our unified architecture for its strong performance and task-agnostic design. Specifically, we formulate both tasks as sequence generation tasks, where we represent images and text as unified sequences of tokens, and the Transformer learns multimodal interactions to generate sequences. We further propose two-level granularity feature representations and sequence-level training to improve the Transformer-based unified framework. Experiments show that our approach significantly improves the FID of the previous Transformer-based model X-LXMERT from 37.0 to 29.9 (lower is better) for text-to-image generation, and improves the CIDEr-D score from 100.9% to 122.6% for fine-tuned image-to-text generation on the MS-COCO dataset. Our code is available online.

[76]  arXiv:2110.09755 [pdf, other]
Title: MetricHaven -- More Than 23,000 Metrics for Measuring Quality Attributes of Software Product Lines
Journal-ref: Proceedings of the 23rd International Systems and Software Product Line Conference (SPLC '19) - Volume B, 2019, pages 25-28
Subjects: Software Engineering (cs.SE)

Variability-aware metrics are designed to measure qualitative aspects of software product lines. As we identified in a prior SLR \cite{El-SharkawyYamagishi-EichlerSchmid19}, there exist already many metrics that address code or variability separately, while the combination of both has been less researched. MetricHaven fills this gap, as it extensively supports combining information from code files and variability models. Further, we also enable the combination of well established single system metrics with novel variability-aware metrics, going beyond existing variability-aware metrics. Our tool supports most prominent single system and variability-aware code metrics. We provide configuration support for already implemented metrics, resulting in 23,342 metric variations. Further, we present an abstract syntax tree developed for MetricHaven, that allows the realization of additional code metrics.
Tool: https://github.com/KernelHaven/MetricHaven
Video: https://youtu.be/vPEmD5Sr6gM

[77]  arXiv:2110.09756 [pdf, other]
Title: A Picture is Worth a Thousand Words: A Unified System for Diverse Captions and Rich Images Generation
Comments: ACM MM 2021 (Video and Demo Track). Code: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Multimedia (cs.MM)

A creative image-and-text generative AI system mimics humans' extraordinary abilities to provide users with diverse and comprehensive caption suggestions, as well as rich image creations. In this work, we demonstrate such an AI creation system to produce both diverse captions and rich images. When users imagine an image and associate it with multiple captions, our system paints a rich image to reflect all captions faithfully. Likewise, when users upload an image, our system depicts it with multiple diverse captions. We propose a unified multi-modal framework to achieve this goal. Specifically, our framework jointly models image-and-text representations with a Transformer network, which supports rich image creation by accepting multiple captions as input. We consider the relations among input captions to encourage diversity in training and adopt a non-autoregressive decoding strategy to enable real-time inference. Based on these, our system supports both diverse caption generation and rich image generation. Our code is available online.

[78]  arXiv:2110.09758 [pdf, other]
Title: KernelHaven -- An Open Infrastructure for Product Line Analysis
Journal-ref: Proceedings of the 22nd International Systems and Software Product Line Conference (SPLC '18) - Volume 2, 2018, pages 5-10
Subjects: Software Engineering (cs.SE)

KernelHaven is an open infrastructure for Software Product Line (SPL) analysis. It is intended both as a production-quality analysis tool set and as a research support tool, e.g., to support researchers in systematically exploring research hypotheses. For flexibility and ease of experimentation, KernelHaven components are plug-ins for extracting certain information from SPL artifacts and processing this information, e.g., to check the correctness and consistency of variability information or to apply metrics. A configuration-based setup along with automatic documentation functionality allows different experiments and supports their easy reproduction. Here, we describe KernelHaven as a product line analysis research tool and highlight its basic approach as well as its fundamental capabilities. In particular, we describe available information extraction and processing plug-ins and how to combine them. On this basis, researchers and interested professional users can rapidly conduct a first set of experiments. Further, we describe the concepts for extending KernelHaven with new plug-ins, which reduces development effort when realizing new experiments.

[79]  arXiv:2110.09759 [pdf]
Title: A Regularization Method to Improve Adversarial Robustness of Neural Networks for ECG Signal Classification
Comments: 12 pages, 7 figures
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Signal Processing (eess.SP)

The electrocardiogram (ECG) is the most widely used diagnostic tool to monitor the condition of the human heart. By using deep neural networks (DNNs), interpretation of ECG signals can be fully automated for the identification of potential abnormalities in a patient's heart in a fraction of a second. Studies have shown that, given a sufficiently large amount of training data, DNN accuracy for ECG classification can reach the level of human expert cardiologists. However, despite their excellent classification accuracy, DNNs are highly vulnerable to adversarial noises, which are subtle changes in the input of a DNN that may lead to a wrong class-label prediction. It is challenging and essential to improve the robustness of DNNs against adversarial noises, which are a threat to life-critical applications. In this work, we propose a regularization method to improve DNN robustness from the perspective of noise-to-signal ratio (NSR) for the application of ECG signal classification. We evaluated our method on the PhysioNet MIT-BIH dataset and the CPSC2018 ECG dataset, and the results show that our method can substantially enhance DNN robustness against adversarial noises generated by adversarial attacks, with a minimal change in accuracy on clean data.

[80]  arXiv:2110.09764 [pdf, other]
Title: Detecting Blurred Ground-based Sky/Cloud Images
Comments: Accepted in Proc. IEEE AP-S Symposium on Antennas and Propagation and USNC-URSI Radio Science Meeting, 2021
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Ground-based whole sky imagers (WSIs) are being used by researchers in various fields to study the atmospheric events. These ground-based sky cameras capture visible-light images of the sky at regular intervals of time. Owing to the atmospheric interference and camera sensor noise, the captured images often exhibit noise and blur. This may pose a problem in subsequent image processing stages. Therefore, it is important to accurately identify the blurred images. This is a difficult task, as clouds have varying shapes, textures, and soft edges whereas the sky acts as a homogeneous and uniform background. In this paper, we propose an efficient framework that can identify the blurred sky/cloud images. Using a static external marker, our proposed methodology has a detection accuracy of 94\%. To the best of our knowledge, our approach is the first of its kind in the automatic identification of blurred images for ground-based sky/cloud images.

[81]  arXiv:2110.09765 [pdf, ps, other]
Title: Evolutionary Equilibrium Analysis for Decision on Block Size in Blockchain Systems
Comments: 19 pages, 2 figures
Subjects: Computer Science and Game Theory (cs.GT)

In a PoW-based blockchain network, mining pools (a solo miner can be regarded as a mining pool containing one miner) compete to successfully mine blocks to pursue rewards. Generally, the rewards include the fixed block subsidies and time-varying transaction fees. The transaction fees are offered by the senders whose transactions are packaged into blocks, and they increase with the block size. However, a larger block brings longer latency, resulting in a smaller probability of successful mining. Therefore, finding the optimal block size to trade off these two factors is a complex and crucial problem for the mining pools. In this paper, we model the repeated mining competition dynamics in a blockchain system as an evolutionary game to study the interactions among mining pools. In this game, each pool has two strategies: to follow the default size $\bar{B}$, i.e., the upper bound of the block size, or not to follow it. Because of bounded rationality, each mining pool pursues its evolutionary stable block size (ESS) according to the mining pools' computing power and other factors by continuous learning and adjustment during the whole mining process. A study framework is built for the general evolutionary game, based on which we then theoretically explore the existence and stability of the ESSs for the case of two mining pools. Numerical experiments with real Bitcoin data are conducted to show the evolutionary decisions of mining pools and to demonstrate the theoretical findings of this paper.

[82]  arXiv:2110.09766 [pdf, ps, other]
Title: Memory-Augmented Deep Unfolding Network for Compressive Sensing
Comments: 10 pages, 7 figures
Journal-ref: ACM MM 2021
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)

By mapping a truncated optimization method into a deep neural network, the deep unfolding network (DUN) has attracted growing attention in compressive sensing (CS) due to its good interpretability and high performance. Each stage in a DUN corresponds to one iteration in optimization. By understanding DUNs from the perspective of the human brain's memory processing, we find that there exist two issues in existing DUNs. One is that the information between every two adjacent stages, which can be regarded as short-term memory, is usually severely lost. The other is that there is no explicit mechanism to ensure that the previous stages affect the current stage, which means memory is easily forgotten. To solve these issues, in this paper, a novel DUN with persistent memory for CS is proposed, dubbed Memory-Augmented Deep Unfolding Network (MADUN). We design a memory-augmented proximal mapping module (MAPMM) by combining two types of memory augmentation mechanisms, namely High-throughput Short-term Memory (HSM) and Cross-stage Long-term Memory (CLM). HSM is exploited to allow DUNs to transmit multi-channel short-term memory, which greatly reduces information loss between adjacent stages. CLM is utilized to develop the dependency of deep information across cascading stages, which greatly enhances network representation capability. Extensive CS experiments on natural and MR images show that, with its strong ability to maintain and balance information, our MADUN outperforms existing state-of-the-art methods by a large margin. The source code is available at https://github.com/jianzhangcs/MADUN/.

[83]  arXiv:2110.09767 [pdf, other]
Title: Pre and Post Counting for Scalable Statistical-Relational Model Discovery
Comments: Presented at the Tenth International Workshop on Statistical Relational AI at the 1st International Joint Conference on Learning & Reasoning
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Databases (cs.DB)

Statistical-Relational Model Discovery aims to find statistically relevant patterns in relational data. For example, a relational dependency pattern may stipulate that a user's gender is associated with the gender of their friends. As with propositional (non-relational) graphical models, the major scalability bottleneck for model discovery is computing instantiation counts: the number of times a relational pattern is instantiated in a database. Previous work on propositional learning utilized pre-counting or post-counting to solve this task. This paper takes a detailed look at the memory and speed trade-offs between pre-counting and post-counting strategies for relational learning. A pre-counting approach computes and caches instantiation counts for a large set of relational patterns before model search. A post-counting approach computes an instantiation count dynamically on-demand for each candidate pattern generated during the model search. We describe a novel hybrid approach, tailored to relational data, that achieves a sweet spot with pre-counting for patterns involving positive relationships (e.g. pairs of users who are friends) and post-counting for patterns involving negative relationships (e.g. pairs of users who are not friends). Our hybrid approach scales model discovery to millions of data facts.
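
The hybrid counting strategy described above can be illustrated on a toy database. This is a minimal sketch under stated assumptions: the users, the `friends` relation, and all function names are hypothetical and serve only to show the difference between caching counts up front and computing them on demand.

```python
from collections import Counter
from itertools import permutations

# Toy relational database (hypothetical data, for illustration only).
gender = {"ann": "F", "bob": "M", "cat": "F", "dan": "M"}
friends = {("ann", "bob"), ("bob", "ann"), ("ann", "cat"), ("cat", "ann")}


def precount_positive():
    """Pre-counting: scan the friendship pairs once and cache every count."""
    return Counter((gender[a], gender[b]) for a, b in friends)


def postcount_negative(ga, gb):
    """Post-counting: count one negative pattern only when the search asks for it."""
    return sum(
        1
        for a, b in permutations(gender, 2)
        if (a, b) not in friends and gender[a] == ga and gender[b] == gb
    )


positive_cache = precount_positive()


def count(ga, gb, are_friends):
    """Hybrid: cached lookup for positive relationships, on-demand for negative ones."""
    if are_friends:
        return positive_cache[(ga, gb)]
    return postcount_negative(ga, gb)
```

Pairs of users who are not friends typically far outnumber pairs who are, so caching counts for every negative pattern would cost much more memory than caching the positive ones; that asymmetry is what the toy hybrid reflects.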

[84]  arXiv:2110.09768 [pdf, other]
Title: Synthetic Temporal Anomaly Guided End-to-End Video Anomaly Detection
Comments: Published at ICCV Workshops 2021. this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Due to the limited availability of anomaly examples, video anomaly detection is often seen as a one-class classification (OCC) problem. A popular way to tackle this problem is by utilizing an autoencoder (AE) trained only on normal data. At test time, the AE is then expected to reconstruct the normal input well while reconstructing the anomalies poorly. However, several studies show that, even when trained only on normal data, AEs can often start reconstructing anomalies as well, which degrades their anomaly detection performance. To mitigate this, we propose a temporal pseudo anomaly synthesizer that generates fake anomalies using only normal data. An AE is then trained to maximize the reconstruction loss on pseudo anomalies while minimizing this loss on normal data. This way, the AE is encouraged to produce distinguishable reconstructions for normal and anomalous frames. Extensive experiments and analysis on three challenging video anomaly datasets demonstrate the effectiveness of our approach in improving basic AEs, which achieve superior performance over several existing state-of-the-art models.
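
One possible way to write the training objective described above is sketched below. The exact form used by the authors is not stated in the abstract, so the subtraction, the `weight` parameter, and the function names are assumptions made for illustration.

```python
def mse(a, b):
    """Mean squared error between two equally long sequences of numbers."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def ae_objective(normal, normal_recon, pseudo, pseudo_recon, weight=1.0):
    """Value to minimize: low error on normal data, high error on pseudo anomalies."""
    return mse(normal, normal_recon) - weight * mse(pseudo, pseudo_recon)
```

Minimizing this value pushes the reconstruction error down on normal inputs and up on pseudo anomalies, which is what makes the two kinds of frames distinguishable at test time.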

[85]  arXiv:2110.09769 [pdf, other]
Title: Digital transformation of droplet/aerosol infection risk assessment realized on "Fugaku" for the fight against COVID-19
Comments: 24 pages, 12 figures
Subjects: Computational Engineering, Finance, and Science (cs.CE); Fluid Dynamics (physics.flu-dyn)

The fastest supercomputer in 2020, Fugaku, has not only achieved a digital transformation of epidemiology by allowing end-to-end, detailed quantitative modeling of COVID-19 transmission for the first time, but has also transformed the behavior of the entire Japanese public through its detailed analysis of transmission risks in a multitude of high-risk societal situations. A novel aerosol simulation methodology was synthesized from a combination of new CFD methods meeting industrial demands, CUBE, which allowed the simulations not only to scale massively at the high resolution required for micrometer-scale virus-containing aerosol particles, but also to achieve extremely rapid time-to-solution, owing to its ability to generate digital twins representing a multitude of societal situations in minutes rather than weeks, thereby attaining true overall application high performance. Such simulations have been running for the past 1.5 years on Fugaku, cumulatively consuming top supercomputer-class resources, with the results communicated by the media and adopted as official public policies.

[86]  arXiv:2110.09770 [pdf, other]
Title: AEFE: Automatic Embedded Feature Engineering for Categorical Features
Comments: 24 pages, 6 figures, 13 tables
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)

The challenge of solving data mining problems in e-commerce applications such as recommendation systems (RS) and click-through rate (CTR) prediction is how to make inferences by constructing combinatorial features from a large number of categorical features while preserving the interpretability of the method. In this paper, we propose Automatic Embedded Feature Engineering (AEFE), an automatic feature engineering framework for representing categorical features, which consists of various components including custom paradigm feature construction and multiple feature selection. By selecting the potential field pairs intelligently and generating a series of interpretable combinatorial features, our framework can provide a set of unseen generated features for enhancing model performance and then assist data analysts in discovering the feature importance for particular data mining tasks. Furthermore, AEFE is implemented in a distributed manner using task parallelism, data sampling, and a searching schema based on Matrix Factorization field combination, to optimize the performance and enhance the efficiency and scalability of the framework. Experiments conducted on some typical e-commerce datasets indicate that our method outperforms the classical machine learning models and state-of-the-art deep learning models.

[87]  arXiv:2110.09771 [pdf, ps, other]
Title: On Reward-Free RL with Kernel and Neural Function Approximations: Single-Agent MDP and Markov Game
Comments: ICML 2021
Subjects: Machine Learning (cs.LG); Computer Science and Game Theory (cs.GT); Machine Learning (stat.ML)

To achieve sample efficiency in reinforcement learning (RL), it is necessary to explore the underlying environment efficiently. Under the offline setting, addressing the exploration challenge lies in collecting an offline dataset with sufficient coverage. Motivated by such a challenge, we study the reward-free RL problem, where an agent aims to thoroughly explore the environment without any pre-specified reward function. Then, given any extrinsic reward, the agent computes the policy via a planning algorithm with offline data collected in the exploration phase. Moreover, we tackle this problem under the context of function approximation, leveraging powerful function approximators.
Specifically, we propose to explore via an optimistic variant of the value-iteration algorithm incorporating kernel and neural function approximations, where we adopt the associated exploration bonus as the exploration reward. Moreover, we design exploration and planning algorithms for both single-agent MDPs and zero-sum Markov games and prove that our methods can achieve $\widetilde{\mathcal{O}}(1 /\varepsilon^2)$ sample complexity for generating an $\varepsilon$-suboptimal policy or $\varepsilon$-approximate Nash equilibrium when given an arbitrary extrinsic reward. To the best of our knowledge, we establish the first provably efficient reward-free RL algorithm with kernel and neural function approximators.

[88]  arXiv:2110.09772 [pdf, other]
Title: Synergy between 3DMM and 3D Landmarks for Accurate 3D Facial Geometry
Comments: Accepted at 3DV 2021. arXiv admin note: substantial text overlap with arXiv:2104.08403
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)

This work studies learning from a synergy process of 3D Morphable Models (3DMM) and 3D facial landmarks to predict complete 3D facial geometry, including 3D alignment, face orientation, and 3D face modeling. Our synergy process leverages a representation cycle for 3DMM parameters and 3D landmarks. 3D landmarks can be extracted and refined from face meshes built by 3DMM parameters. We next reverse the representation direction and show that predicting 3DMM parameters from sparse 3D landmarks improves the information flow. Together we create a synergy process that utilizes the relation between 3D landmarks and 3DMM parameters, and they collaboratively contribute to better performance. We extensively validate our contribution on full tasks of facial geometry prediction and show our superior and robust performance on these tasks for various scenarios. Particularly, we adopt only simple and widely-used network operations to attain fast and accurate facial geometry prediction. Codes and data: https://choyingw.github.io/works/SynergyNet/

[89]  arXiv:2110.09773 [pdf]
Title: Effective calculation of the causal capacitance matrix of a multiconductor transmission line in the range of parameters by the method of moments
Comments: in Russian
Subjects: Numerical Analysis (math.NA)

The paper proposes a technique for non-uniform segmentation of multiconductor transmission lines with edge coupling. This technique yields causal capacitance matrices at the lowest computational cost in multivariate analysis over a range of line parameters. Time savings were found to reach 49%.

[90]  arXiv:2110.09775 [pdf, other]
Title: Aesthetic Photo Collage with Deep Reinforcement Learning
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)

Photo collage aims to automatically arrange multiple photos on a given canvas with high aesthetic quality. Existing methods are based mainly on handcrafted feature optimization, which cannot adequately capture high-level human aesthetic senses. Deep learning provides a promising way, but owing to the complexity of collage and lack of training data, a solution has yet to be found. In this paper, we propose a novel pipeline for automatic generation of aspect ratio specified collage and the reinforcement learning technique is introduced in collage for the first time. Inspired by manual collages, we model the collage generation as sequential decision process to adjust spatial positions, orientation angles, placement order and the global layout. To instruct the agent to improve both the overall layout and local details, the reward function is specially designed for collage, considering subjective and objective factors. To overcome the lack of training data, we pretrain our deep aesthetic network on a large scale image aesthetic dataset (CPC) for general aesthetic feature extraction and propose an attention fusion module for structural collage feature representation. We test our model against competing methods on two movie datasets and our results outperform others in aesthetic quality evaluation. Further user study is also conducted to demonstrate the effectiveness.

[91]  arXiv:2110.09777 [pdf, other]
Title: Towards Toxic and Narcotic Medication Detection with Rotated Object Detector
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)

Recent years have witnessed the advancement of deep learning vision technologies and applications in the medical industry. Intelligent devices for special medication management are greatly needed, and they require more precise detection algorithms to identify medication specifications and locations. In this work, YOLO (You only look once) based object detectors are tailored for toxic and narcotic medication detection tasks. Specifically, a more flexible annotation with rotation angles ranging from $0^\circ$ to $90^\circ$ and a mask-mapping-based non-maximum suppression method are proposed to achieve a feasible and efficient medication detector aimed at arbitrarily oriented bounding boxes. Extensive experiments demonstrate that the rotated YOLO detectors are more suitable for identifying densely arranged drugs. The best shot mean average precision of the proposed network reaches 0.811 while the inference time is less than 300ms.

[92]  arXiv:2110.09778 [pdf, other]
Title: Explaining Deep Tractable Probabilistic Models: The sum-product network case
Comments: Main paper: 8 pages, references: 1 page. Main paper: 4 figures
Subjects: Machine Learning (cs.LG)

We consider the problem of explaining a tractable deep probabilistic model, the Sum-Product Network (SPN). To this effect, we define the notion of a context-specific independence tree (CSI-tree) and present an iterative algorithm that converts an SPN to a CSI-tree. The resulting CSI-tree is both interpretable and explainable to the domain expert. To further compress the tree, we approximate the CSIs by fitting a supervised classifier. Our extensive empirical evaluations on synthetic, standard, and real-world clinical data sets demonstrate that the resulting models exhibit superior explainability without loss in performance.

[93]  arXiv:2110.09779 [pdf, other]
Title: Open-domain clarification question generation without question examples
Comments: EMNLP 2021
Subjects: Computation and Language (cs.CL)

An overarching goal of natural language processing is to enable machines to communicate seamlessly with humans. However, natural language can be ambiguous or unclear. In cases of uncertainty, humans engage in an interactive process known as repair: asking questions and seeking clarification until their uncertainty is resolved. We propose a framework for building a visually grounded question-asking model capable of producing polar (yes-no) clarification questions to resolve misunderstandings in dialogue. Our model uses an expected information gain objective to derive informative questions from an off-the-shelf image captioner without requiring any supervised question-answer data. We demonstrate our model's ability to pose questions that improve communicative success in a goal-oriented 20 questions game with synthetic and human answerers.
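
The expected information gain objective can be illustrated with a small calculation: the gain of a yes-no question is the prior entropy over candidates minus the expected posterior entropy after hearing the answer. This is a minimal sketch under assumptions, not the paper's model. The belief vector, the per-candidate probabilities of a "yes" answer, and the function names are hypothetical stand-ins for quantities the paper derives from an image captioner.

```python
import math


def entropy(p):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def expected_information_gain(prior, p_yes):
    """EIG of a polar question.

    prior[i] is the belief that candidate i is the target;
    p_yes[i] is the probability of a "yes" answer if i is the target.
    """
    gain = entropy(prior)
    for answer_is_yes in (True, False):
        likelihood = [py if answer_is_yes else 1.0 - py for py in p_yes]
        marginal = sum(pr * l for pr, l in zip(prior, likelihood))
        if marginal == 0:
            continue  # this answer can never occur
        posterior = [pr * l / marginal for pr, l in zip(prior, likelihood)]
        gain -= marginal * entropy(posterior)
    return gain


prior = [0.25, 0.25, 0.25, 0.25]
print(expected_information_gain(prior, [1.0, 1.0, 0.0, 0.0]))  # splits candidates in half: 1 bit
print(expected_information_gain(prior, [0.5, 0.5, 0.5, 0.5]))  # uninformative question: 0 bits
```

Ranking candidate questions by this quantity favours those whose answers are expected to shrink the set of plausible targets the most.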

[94]  arXiv:2110.09780 [pdf, other]
Title: Improving Emotional Speech Synthesis by Using SUS-Constrained VAE and Text Encoder Aggregation
Comments: submitted to ICASSP2022
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)

Learning emotion embeddings from reference audio is a straightforward approach for multi-emotion speech synthesis in encoder-decoder systems. But how to obtain better emotion embeddings and how to inject them into the TTS acoustic model more effectively are still under investigation. In this paper, we propose an innovative constraint to help the VAE extract emotion embeddings with better cluster cohesion. Besides, the obtained emotion embedding is used as a query to aggregate latent representations of all encoder layers via attention. Moreover, the queries from encoder layers themselves are also helpful. Experiments show that the proposed methods can enhance the encoding of comprehensive syntactic and semantic information and produce more expressive emotional speech.

[95]  arXiv:2110.09783 [pdf, other]
Title: Spatial-Temporal Transformer for 3D Point Cloud Sequences
Journal-ref: WACV2022
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Effective learning of spatial-temporal information within a point cloud sequence is highly important for many downstream tasks such as 4D semantic segmentation and 3D action recognition. In this paper, we propose a novel framework named Point Spatial-Temporal Transformer (PST2) to learn spatial-temporal representations from dynamic 3D point cloud sequences. Our PST2 consists of two major modules: a Spatio-Temporal Self-Attention (STSA) module and a Resolution Embedding (RE) module. Our STSA module is introduced to capture the spatial-temporal context information across adjacent frames, while the RE module is proposed to aggregate features across neighbors to enhance the resolution of feature maps. We test the effectiveness of our PST2 with two different tasks on point cloud sequences, i.e., 4D semantic segmentation and 3D action recognition. Extensive experiments on three benchmarks show that our PST2 outperforms existing methods on all datasets. The effectiveness of our STSA and RE modules has also been verified with ablation experiments.

[96]  arXiv:2110.09784 [pdf, other]
Title: SSAST: Self-Supervised Audio Spectrogram Transformer
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS)

Recently, neural networks based purely on self-attention, such as the Vision Transformer (ViT), have been shown to outperform deep learning models constructed with convolutional neural networks (CNNs) on various vision tasks, thus extending the success of Transformers, which were originally developed for language processing, to the vision domain. A recent study showed that a similar methodology can also be applied to the audio domain. Specifically, the Audio Spectrogram Transformer (AST) achieves state-of-the-art results on various audio classification benchmarks. However, pure Transformer models tend to require more training data compared to CNNs, and the success of the AST relies on supervised pretraining that requires a large amount of labeled data and a complex training pipeline, thus limiting the practical usage of AST.
This paper focuses on audio and speech classification, and aims to alleviate the data requirement issues with the AST by leveraging self-supervised learning using unlabeled data. Specifically, we propose to pretrain the AST model with joint discriminative and generative masked spectrogram patch modeling (MSPM) using unlabeled audio from AudioSet and Librispeech. We evaluate our pretrained models on both audio and speech classification tasks including audio event classification, keyword spotting, emotion recognition, and speaker identification. The proposed self-supervised framework significantly boosts AST performance on all tasks, with an average improvement of 60.9%, leading to similar or even better results than a supervised pretrained AST. To the best of our knowledge, it is the first patch-based self-supervised learning framework in the audio and speech domain, and also the first self-supervised learning framework for AST.

[97]  arXiv:2110.09786 [pdf, other]
Title: Event-Triggered Tracking Control of Networked and Quantized Control Systems
Comments: Here is the 8-page version of the one accepted by ECC2021
Subjects: Systems and Control (eess.SY)

This paper studies the tracking control problem of networked and quantized control systems under both multiple networks and event-triggered mechanisms. Multiple networks are to connect the plant and reference system with decentralized controllers to guarantee their information transmission, whereas event-triggered mechanisms are to reduce the information transmission via multiple networks. In this paper, all networks are independent and asynchronous and have local event-triggered mechanisms, which are based on local measurements and determine whether the local measurements need to be transmitted. We first implement an emulation-based approach to develop a novel hybrid model for tracking control of networked and quantized control systems. Next, sufficient conditions are derived and decentralized event-triggered mechanisms are designed to ensure the tracking performance. Finally, a numerical example is given to illustrate the obtained results.

[98]  arXiv:2110.09788 [pdf, other]
Title: CIPS-3D: A 3D-Aware Generator of GANs Based on Conditionally-Independent Pixel Synthesis
Comments: 3D-aware GANs based on NeRF, this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)

The style-based GAN (StyleGAN) architecture achieved state-of-the-art results for generating high-quality images, but it lacks explicit and precise control over camera poses. The recently proposed NeRF-based GANs made great progress towards 3D-aware generators, but they are not yet able to generate high-quality images. This paper presents CIPS-3D, a style-based, 3D-aware generator that is composed of a shallow NeRF network and a deep implicit neural representation (INR) network. The generator synthesizes each pixel value independently without any spatial convolution or upsampling operation. In addition, we diagnose the problem of mirror symmetry that implies a suboptimal solution and solve it by introducing an auxiliary discriminator. Trained on raw, single-view images, CIPS-3D sets new records for 3D-aware image synthesis with an impressive FID of 6.97 for images at the $256\times256$ resolution on FFHQ. We also demonstrate several interesting directions for CIPS-3D such as transfer learning and 3D-aware face stylization. The synthesis results are best viewed as videos, so we recommend that readers check our GitHub project at https://github.com/PeterouZh/CIPS-3D

[99]  arXiv:2110.09789 [pdf, ps, other]
Title: Cyclic and Quasi-Cyclic DNA Codes
Comments: draft, 16 pages
Subjects: Information Theory (cs.IT); Rings and Algebras (math.RA)

In this paper, we discuss DNA codes that are cyclic or quasi-cyclic over $\Z_{4}+\omega \Z_{4}$, where $\omega^{2}=2+2\omega$, along with methods to construct these with combinatorial constraints. We also generalize results obtained for the ring $\Z_{4}+\omega \Z_{4}$, where $\omega^{2}=2+2\omega$, and some other rings to the sixteen rings $R_{\theta}=\Z_{4}+\omega \Z_{4}$, where $\omega^{2}=\theta\in \Z_{4}+\omega \Z_{4}$, using the generalized Gau map and Gau distance in \cite{3}. We determine a relationship between the Gau distance and Hamming distance for linear codes over the sixteen rings $R_{\theta}$, which enables us to obtain an upper bound for the Gau distance of free codes that are self-dual over the rings $R_{\theta}$.

[100]  arXiv:2110.09795 [pdf, other]
Title: Geo-DefakeHop: High-Performance Geographic Fake Image Detection
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)

A robust fake satellite image detection method, called Geo-DefakeHop, is proposed in this work. Geo-DefakeHop is developed based on the parallel subspace learning (PSL) methodology. PSL maps the input image space into several feature subspaces using multiple filter banks. By exploring response differences of different channels between real and fake images for a filter bank, Geo-DefakeHop learns the most discriminant channels and uses their soft decision scores as features. Then, Geo-DefakeHop selects a few discriminant features from each filter bank and ensembles them to make a final binary decision. Geo-DefakeHop offers a light-weight high-performance solution to fake satellite image detection. Its model size is analyzed, which ranges from 0.8 to 62K parameters. Furthermore, it is shown by experimental results that it achieves an F1-score higher than 95% under various common image manipulations such as resizing, compression and noise corruption.

[101]  arXiv:2110.09796 [pdf, other]
Title: Offline Reinforcement Learning with Value-based Episodic Memory
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)

Offline reinforcement learning (RL) shows promise of applying RL to real-world problems by effectively utilizing previously collected data. Most existing offline RL algorithms use regularization or constraints to suppress extrapolation error for actions outside the dataset. In this paper, we adopt a different framework, which learns the V-function instead of the Q-function to naturally keep the learning procedure within the support of an offline dataset. To enable effective generalization while maintaining proper conservatism in offline learning, we propose Expectile V-Learning (EVL), which smoothly interpolates between the optimal value learning and behavior cloning. Further, we introduce implicit planning along offline trajectories to enhance learned V-values and accelerate convergence. Together, we present a new offline method called Value-based Episodic Memory (VEM). We provide theoretical analysis for the convergence properties of our proposed VEM method, and empirical results in the D4RL benchmark show that our method achieves superior performance in most tasks, particularly in sparse-reward tasks.
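
The interpolation described above can be illustrated with the expectile of a set of sampled target values. With `tau = 0.5` the expectile is the plain mean (an average over the behaviour observed in the data), and as `tau` approaches 1 it approaches the maximum (closer to optimal value learning). This is a minimal sketch, assuming a fixed-point iteration on scalar samples; the function name and iteration count are hypothetical, and it is not the paper's EVL operator.

```python
def expectile(values, tau, iterations=100):
    """tau-expectile of a sample, i.e. the minimizer of the asymmetric
    squared loss sum(|tau - 1[v < e]| * (v - e)**2) over e."""
    estimate = sum(values) / len(values)
    for _ in range(iterations):
        # Values above the estimate get weight tau, the rest get 1 - tau.
        weights = [tau if v > estimate else 1.0 - tau for v in values]
        estimate = sum(w * v for w, v in zip(weights, values)) / sum(weights)
    return estimate


targets = [1.0, 2.0, 3.0, 4.0]
print(expectile(targets, 0.5))   # the mean, 2.5
print(expectile(targets, 0.99))  # close to the maximum, 4.0
```

Because the estimate is always a weighted average of values that actually appear in the data, it never exceeds the largest observed target, which is the conservative property the abstract emphasizes.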

[102]  arXiv:2110.09797 [pdf, other]
Title: An Interoperable Open Data Portal for Climate Analysis
Comments: Accepted in Proc. IEEE AP-S Symposium on Antennas and Propagation and USNC-URSI Radio Science Meeting, 2021
Subjects: Databases (cs.DB)

This work proposes an open interoperable data portal that offers access to a Web-wide climate domain knowledge graph created for Ireland and England's NOAA climate daily data. There are three main components contributing to this data portal: the first is the upper layer schema of the knowledge graph -- the climate analysis (CA) ontology -- the second is an ad hoc SPARQL server by which to store the graph data and provide public Web access, the last is a dereferencing engine deployed to resolve URIs for entity information. Our knowledge graph form of NOAA climate data facilitates the supply of semantic climate information to researchers and offers a variety of semantic applications that can be built on top of it.

[103]  arXiv:2110.09800 [pdf, other]
Title: Optimal Scheduling of Flexible Power-to-X Technologies in the Day-ahead Electricity Market
Subjects: Systems and Control (eess.SY); Computational Engineering, Finance, and Science (cs.CE)

The ambitious CO2 emission targets of the Paris Agreement are achievable only with renewable energy, CO2-free power generation, new policies, and planning. The main motivation of this paper is that future green fuels from power-to-X assets should be produced from power with the lowest possible emissions while still keeping the cost of electricity low. To this end, we propose a power-to-X scheduling framework that is capable of co-optimizing CO2 emission intensity and electricity prices in the day-ahead electricity market scheduling. Three realistic models for local production units are developed for flexible dispatch and the impact on electricity market scheduling is examined. Furthermore, the possible benefits of using the trade-off between CO2 emission intensity and electricity prices in scheduling are discussed. We find that there is a non-linear trade-off between CO2 emission intensity and cost, favoring a weighted optimization between the two objectives.
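
A weighted optimization between price and emission intensity can be sketched in its simplest form as picking the operating hours with the lowest weighted score. This is a minimal sketch under strong assumptions: a single flexible unit that must run a fixed number of hours, no ramping or storage constraints, and min-max normalized inputs. The function name, the `weight` parameter, and the example numbers are hypothetical and are not taken from the paper's production-unit models.

```python
def schedule_hours(prices, co2_intensity, hours_needed, weight):
    """Pick the hours with the lowest weighted price/emission score.

    weight = 1.0 considers price only; weight = 0.0 considers emissions only.
    """
    def normalize(xs):
        lo, hi = min(xs), max(xs)
        return [0.0 if hi == lo else (x - lo) / (hi - lo) for x in xs]

    p = normalize(prices)
    c = normalize(co2_intensity)
    score = [weight * pi + (1.0 - weight) * ci for pi, ci in zip(p, c)]
    chosen = sorted(range(len(score)), key=lambda h: score[h])[:hours_needed]
    return sorted(chosen)


prices = [10.0, 50.0, 20.0, 40.0]     # illustrative day-ahead prices per hour
co2 = [300.0, 100.0, 250.0, 120.0]    # illustrative emission intensity per hour
print(schedule_hours(prices, co2, hours_needed=2, weight=1.0))  # cheapest hours
print(schedule_hours(prices, co2, hours_needed=2, weight=0.0))  # cleanest hours
```

Sweeping `weight` between 0 and 1 traces out the cost and emission trade-off for this toy case.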

[104]  arXiv:2110.09803 [pdf, other]
Title: Latent reweighting, an almost free improvement for GANs
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)

Standard formulations of GANs, where a continuous function deforms a connected latent space, have been shown to be misspecified when fitting different classes of images. In particular, the generator will necessarily sample some low-quality images in between the classes. Rather than modifying the architecture, a line of work aims at improving the sampling quality from pre-trained generators at the expense of increased computational cost. Building on this, we introduce an additional network to predict latent importance weights and two associated sampling methods to avoid the poorest samples. This idea has several advantages: 1) it provides a way to inject disconnectedness into any GAN architecture, 2) since the rejection happens in the latent space, it avoids going through both the generator and the discriminator, saving computation time, 3) this importance weights formulation provides a principled way to reduce the Wasserstein distance to the target distribution. We demonstrate the effectiveness of our method on several datasets, both synthetic and high-dimensional.

[105]  arXiv:2110.09813 [pdf, other]
Title: Multi-Objective Loss Balancing for Physics-Informed Deep Learning
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)

Physics-Informed Neural Networks (PINNs) are deep learning algorithms that leverage physical laws by including partial differential equations (PDEs), together with a respective set of boundary and initial conditions (BC / IC), as penalty terms in their loss function. As the PDE, BC and IC parts of the loss function can differ significantly in magnitude, due to their underlying physical units or stochasticity of initialisation, training of PINNs may suffer from severe convergence and efficiency problems, causing PINNs to fall short of the desired approximation quality. In this work, we observe the significant role of correctly weighting the combination of multiple competitive loss functions for training PINNs effectively. To that end, we implement and evaluate different methods aiming at balancing the contributions of multiple terms of the PINN loss function and their gradients. After a review of three existing loss scaling approaches (Learning Rate Annealing, GradNorm and SoftAdapt), we propose a novel self-adaptive loss balancing method for PINNs called ReLoBRaLo (Relative Loss Balancing with Random Lookback). Finally, the performance of ReLoBRaLo is compared and verified against these approaches by solving both forward and inverse problems on three benchmark PDEs for PINNs: Burgers' equation, Kirchhoff's plate bending equation and Helmholtz's equation. Our simulation studies show that ReLoBRaLo training is much faster and achieves higher accuracy than training PINNs with other balancing methods and hence is very effective and increases sustainability of PINNs algorithms. The adaptability of ReLoBRaLo illustrates robustness across different PDE problem settings. The proposed method can also be applied to the wider class of penalised optimisation problems, including PDE-constrained and Sobolev training apart from the studied PINNs examples.
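
The general idea of relative loss balancing can be sketched as a softmax over the ratio of each term's current loss to its previous loss, so that terms making slower progress receive larger weights. This is a minimal sketch under assumptions: it omits the random lookback and any smoothing of the weights, so it is not the ReLoBRaLo update itself, and the function name and `temperature` parameter are hypothetical.

```python
import math


def relative_loss_weights(current, previous, temperature=1.0):
    """Softmax weights over per-term loss ratios, scaled to sum to the
    number of loss terms."""
    ratios = [c / (temperature * p) for c, p in zip(current, previous)]
    top = max(ratios)  # subtract the maximum for numerical stability
    exps = [math.exp(r - top) for r in ratios]
    total = sum(exps)
    return [len(current) * e / total for e in exps]


# Hypothetical PDE, BC and IC losses at two consecutive training steps.
previous = [1.0, 1.0, 1.0]
current = [0.9, 0.5, 0.1]
weights = relative_loss_weights(current, previous)
total_loss = sum(w * l for w, l in zip(weights, current))
print(weights, total_loss)
```

Scaling the weights to sum to the number of terms keeps the overall loss magnitude comparable to an unweighted sum, which makes the learning rate easier to reuse.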

[106]  arXiv:2110.09814 [pdf, other]
Title: Speech Pattern based Black-box Model Watermarking for Automatic Speech Recognition
Comments: 5 pages, 2 figures
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Cryptography and Security (cs.CR); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)

As an effective method for intellectual property (IP) protection, model watermarking technology has been applied to a wide variety of deep neural networks (DNNs), including speech classification models. However, how to design a black-box watermarking scheme for automatic speech recognition (ASR) models is still an unsolved problem, even though such a scheme is in significant demand for protecting remote ASR Application Programming Interfaces (APIs) deployed on cloud servers. Due to the conditional independence assumption and the risk of label-detection-based evasion attacks on ASR models, the black-box model watermarking scheme for speech classification models cannot be applied to ASR models. In this paper, we propose the first black-box model watermarking framework for protecting the IP of ASR models. Specifically, we synthesize trigger audios by spreading the speech clips of model owners over the entire input audios and labeling the trigger audios with the stego texts, which hide the authorship information with linguistic steganography. Experiments on the state-of-the-art open-source ASR system DeepSpeech demonstrate the feasibility of the proposed watermarking scheme, which is robust against five kinds of attacks and has little impact on accuracy.

[107]  arXiv:2110.09817 [pdf, other]
Title: State-based Episodic Memory for Multi-Agent Reinforcement Learning
Authors: Xiao Ma, Wu-Jun Li
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Multiagent Systems (cs.MA)

Multi-agent reinforcement learning (MARL) algorithms have made promising progress in recent years by leveraging the centralized training and decentralized execution (CTDE) paradigm. However, existing MARL algorithms still suffer from the sample inefficiency problem. In this paper, we propose a simple yet effective approach, called state-based episodic memory (SEM), to improve sample efficiency in MARL. SEM adopts episodic memory (EM) to supervise the centralized training procedure of CTDE in MARL. To the best of our knowledge, SEM is the first work to introduce EM into MARL. We can theoretically prove that, when used for MARL, SEM has lower space complexity and time complexity than state and action based EM (SAEM), which was originally proposed for single-agent reinforcement learning. Experimental results on the StarCraft multi-agent challenge (SMAC) show that introducing episodic memory into MARL can improve sample efficiency and that SEM can reduce storage cost and time cost compared with SAEM.

[108]  arXiv:2110.09819 [pdf, other]
Title: LSTC: Boosting Atomic Action Detection with Long-Short-Term Context
Comments: ACM Multimedia 2021
Subjects: Computer Vision and Pattern Recognition (cs.CV)

In this paper, we place the atomic action detection problem into a Long-Short Term Context (LSTC) to analyze how the temporal reliance among video signals affects the action detection results. To do this, we decompose the action recognition pipeline into short-term and long-term reliance, based on the hypothesis that the two kinds of context are conditionally independent given the objective action instance. Within our design, a local aggregation branch is utilized to gather dense and informative short-term cues, while a high-order long-term inference branch is designed to infer the objective action class from high-order interactions between the actor and other persons or person pairs. Both branches independently predict the context-specific actions and the results are merged in the end. We demonstrate that both temporal grains are beneficial to atomic action recognition. On the mainstream benchmarks of atomic action detection, our design can bring a significant performance gain over the existing state-of-the-art pipeline. The code of this project can be found at https://github.com/TencentYoutuResearch/ActionDetection-LSTC

[109]  arXiv:2110.09823 [pdf, other]
Title: Extensive Deep Temporal Point Process
Comments: 21 pages
Subjects: Machine Learning (cs.LG); Applications (stat.AP); Methodology (stat.ME)

Temporal point processes, as stochastic processes on the continuous time domain, are commonly used to model asynchronous event sequences characterized by occurrence timestamps. With the rise of deep learning, deep neural networks are emerging as a promising choice for capturing the patterns in asynchronous sequences in the setting of temporal point processes, owing to their strong expressivity. In this paper, we first review recent research emphasis and difficulties in modeling asynchronous event sequences with deep temporal point processes, which can be grouped into four areas: encoding of history sequence, formulation of conditional intensity function, relational discovery of events and learning approaches for optimization. We introduce most of the recently proposed models by decomposing them into these four parts, and conduct experiments by remodularizing the first three parts with the same learning strategy for a fair empirical evaluation. Besides, we extend the history encoders and conditional intensity function family, and propose a Granger causality discovery framework for exploiting the relations among multiple types of events. Discrete graph structure learning in the framework of variational inference is employed to reveal latent structures of the Granger causality graph, and further experiments show that the proposed framework with a learned latent graph can both capture the relations and achieve improved fitting and prediction performance.

[110]  arXiv:2110.09826 [pdf, ps, other]
Title: Distributed order estimation of ARX model under cooperative excitation condition
Authors: Die Gan, Zhixin Liu
Comments: 24 pages; This manuscript is submitted to SIAM Journal on Control and Optimization
Subjects: Systems and Control (eess.SY)

In this paper, we consider the distributed estimation problem of a linear stochastic system described by an autoregressive model with exogenous inputs (ARX) when both the system orders and parameters are unknown. We design distributed algorithms to estimate the unknown orders and parameters by combining the proposed local information criterion (LIC) with the distributed least squares method. The simultaneous estimation of both the system orders and parameters brings challenges for the theoretical analysis. Some analysis techniques, such as double array martingale limit theory, stochastic Lyapunov functions, and martingale convergence theorems, are employed. For the case where the upper bounds of the true orders are available, we introduce a cooperative excitation condition, under which the strong consistency of the estimation for the orders and parameters is established. Moreover, for the case where the upper bounds of the true orders are unknown, a similar distributed algorithm is proposed to estimate both the orders and parameters, and the corresponding convergence analysis for the proposed algorithm is provided. We remark that our results are obtained without relying on independence or stationarity assumptions on the regression vectors, and the cooperative excitation conditions show that all sensors can cooperate to fulfill the estimation task even though no individual sensor can do so alone.

[111]  arXiv:2110.09829 [pdf, other]
Title: Towards Social Situation Awareness in Support Agents
Comments: 9 pages, 2 figures
Subjects: Artificial Intelligence (cs.AI)

Artificial agents that support people in their daily activities (e.g., virtual coaches and personal assistants) are increasingly prevalent. Since many daily activities are social in nature, support agents should understand a user's social situation to offer comprehensive support. However, there are no systematic approaches for developing support agents that are social situation aware. We identify key requirements for a support agent to be social situation aware and propose steps to realize those requirements. These steps are presented through a conceptual architecture that centers around two key ideas: (1) conceptualizing social situation awareness as an instantiation of 'general' situation awareness, and (2) using situation taxonomies as the key element of such instantiation. This enables support agents to represent a user's social situation, comprehend its meaning, and assess its impact on the user's behavior. We discuss empirical results supporting that the proposed approach can be effective and illustrate how the architecture can be used in support agents through a use case.

[112]  arXiv:2110.09832 [pdf, other]
Title: The Impact of User Location on Cookie Notices (Inside and Outside of the European Union)
Comments: Peer-reviewed and presented at IEEE Workshop on Technology and Consumer Protection 2019 (ConPro '19)
Subjects: Computers and Society (cs.CY)

The web is global, but privacy laws differ by country. Which set of privacy rules do websites follow? We empirically study this question by detecting and analyzing cookie notices in an automated way. We crawl 1,500 European, American, and Canadian websites from each of 18 countries. We detect cookie notices on 40 percent of websites in our sample. We treat the presence or absence of cookie notices, as well as visual differences, as proxies for differences in privacy rules. Using a series of regression models, we find that the website's Top Level Domain explains a substantial portion of the variance in cookie notice metrics, but the user's vantage point does not. This suggests that websites follow one set of privacy rules for all their users. There is one exception to this finding: cookie notices differ when accessing .com domains from inside versus outside of the EU. We highlight ways in which future research could build on our preliminary findings.

[113]  arXiv:2110.09839 [pdf, other]
Title: Measuring Hidden Bias within Face Recognition via Racial Phenotypes
Comments: published in IEEE Winter Conference on Applications of Computer Vision, WACV, 2022
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)

Recent work reports disparate performance for intersectional racial groups across face recognition tasks: face verification and identification. However, the definition of those racial groups has a significant impact on the underlying findings of such racial bias analysis. Previous studies define these groups based on either demographic information (e.g. African, Asian etc.) or skin tone (e.g. lighter or darker skins). The use of such sensitive or broad group definitions has disadvantages for bias investigation and subsequent counter-bias solutions design. By contrast, this study introduces an alternative racial bias analysis methodology via facial phenotype attributes for face recognition. We use the set of observable characteristics of an individual face where a race-related facial phenotype is hence specific to the human face and correlated to the racial profile of the subject. We propose categorical test cases to investigate the individual influence of those attributes on bias within face recognition tasks. We compare our phenotype-based grouping methodology with previous grouping strategies and show that phenotype-based groupings uncover hidden bias without reliance upon any potentially protected attributes or ill-defined grouping strategies. Furthermore, we contribute corresponding phenotype attribute category labels for two face recognition tasks: RFW for face verification and VGGFace2 (test set) for face identification.

[114]  arXiv:2110.09843 [pdf, other]
Title: AequeVox: Automated Fairness Testing of Speech Recognition Systems
Authors: Sai Sathiesh Rajan (1), Sakshi Udeshi (1), Sudipta Chattopadhyay (1) ((1) Singapore University of Technology and Design)
Comments: 31 pages, 9 figures, submitted to 25th International Conference on Fundamental Approaches to Software Engineering, for associated code, see this https URL
Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL)

Automatic Speech Recognition (ASR) systems have become ubiquitous. They can be found in a variety of form factors and are increasingly important in our daily lives. As such, ensuring that these systems are equitable to different subgroups of the population is crucial. In this paper, we introduce AequeVox, an automated testing framework for evaluating the fairness of ASR systems. AequeVox simulates different environments to assess the effectiveness of ASR systems for different populations. In addition, we investigate whether the chosen simulations are comprehensible to humans. We further propose a fault localization technique capable of identifying words that are not robust to these varying environments. Both components of AequeVox are able to operate in the absence of ground truth data.
We evaluated AequeVox on speech from four different datasets using three different commercial ASRs. Our experiments reveal that non-native English, female and Nigerian English speakers generate 109%, 528.5% and 156.9% more errors, on average than native English, male and UK Midlands speakers, respectively. Our user study also reveals that 82.9% of the simulations (employed through speech transformations) had a comprehensibility rating above seven (out of ten), with the lowest rating being 6.78. This further validates the fairness violations discovered by AequeVox. Finally, we show that the non-robust words, as predicted by the fault localization technique embodied in AequeVox, show 223.8% more errors than the predicted robust words across all ASRs.

[115]  arXiv:2110.09844 [pdf, other]
Title: Comonadic semantics for hybrid logic and bounded fragments
Subjects: Logic in Computer Science (cs.LO); Category Theory (math.CT)

In recent work, comonads and associated structures have been used to analyse a range of important notions in finite model theory, descriptive complexity and combinatorics. We extend this analysis to Hybrid logic, a widely-studied extension of basic modal logic, which corresponds to the bounded fragment of first-order logic. In addition to characterising the various resource-indexed equivalences induced by Hybrid logic and the bounded fragment, and the associated combinatorial decompositions of structures, we also give model-theoretic characterisations of bounded formulas in terms of invariance under generated substructures, in both the finite and infinite cases.

[116]  arXiv:2110.09846 [pdf]
Title: A Novel Recurrent Adaptive Backstepping Optimal Control Strategy for a Single Inverted Pendulum System
Authors: Mohammad Sarbaz
Subjects: Systems and Control (eess.SY)

In this paper, a novel recurrent adaptive backstepping optimal control strategy for a single inverted pendulum system is studied. With this method, an inverted pendulum is stabilized using projection recurrent neural network-based adaptive backstepping control (PRNN-ABC). The inverted pendulum is a popular nonlinear system that is used in both industry and academia, and various control approaches have been applied to it since it has many applications. Here, first of all, the backstepping control laws are investigated based on the nonlinear dynamic model of the system. Second, by considering control constraints and a performance index, the constrained optimization problem is formulated. Then, the optimization problem is converted to a constrained quadratic programming (QP) problem. To study the recurrent neural network (RNN), the dynamic model of the RNN is derived according to the Karush-Kuhn-Tucker (KKT) optimality conditions and the variational inequality. Finally, the stability of the system is analyzed using a Lyapunov function.

[117]  arXiv:2110.09848 [pdf, other]
Title: Self-Supervised Object Detection via Generative Image Synthesis
Subjects: Computer Vision and Pattern Recognition (cs.CV)

We present SSOD, the first end-to-end analysis-by-synthesis framework with controllable GANs for the task of self-supervised object detection. We use collections of real-world images without bounding box annotations to learn to synthesize and detect objects. We leverage controllable GANs to synthesize images with pre-defined object properties and use them to train object detectors. We propose a tight end-to-end coupling of the synthesis and detection networks to optimally train our system. Finally, we also propose a method to optimally adapt SSOD to intended target data without requiring labels for it. For the task of car detection, on the challenging KITTI and Cityscapes datasets, we show that SSOD outperforms the prior state-of-the-art purely image-based self-supervised object detection method Wetectron. Even without requiring any 3D CAD assets, it also surpasses the state-of-the-art rendering-based method Meta-Sim2. Our work advances the field of self-supervised object detection by introducing a successful new paradigm of using controllable GAN-based image synthesis for it and by significantly improving the baseline accuracy of the task. We open-source our code at https://github.com/NVlabs/SSOD.

[118]  arXiv:2110.09849 [pdf, other]
Title: Holistic Hardware Security Assessment Framework: A Microarchitectural Perspective
Comments: Appeared in the program of Energy-Secure System Architectures (ESSA) Workshop
Subjects: Hardware Architecture (cs.AR); Cryptography and Security (cs.CR)

Our goal is to enable holistic hardware security evaluation from the microarchitectural point of view. To achieve this, we propose a framework that categorizes threat models based on the microarchitectural components being targeted, and provides a generic security metric that can be used to assess the vulnerability of components, as well as the system as a whole.

[119]  arXiv:2110.09856 [pdf, other]
Title: Network Science Predicts Who Dies Next in Game of Thrones
Authors: Milan Janosov
Comments: 14 pages, 7 figures
Subjects: Social and Information Networks (cs.SI); Physics and Society (physics.soc-ph)

Social network analysis and machine learning have found countless applications in recent years. As an example, this short project was carried out in 2017 and was followed by some media attention, with the following goal: to bring network science and predictive modeling together on the subject of the popular TV and book series, Game of Thrones, and predict which key characters are likely to meet their ends.

[120]  arXiv:2110.09857 [pdf, other]
Title: Irrationality, Extortion, or Trusted Third-parties: Why it is Impossible to Buy and Sell Physical Goods Securely on the Blockchain
Comments: To appear in the IEEE International Conference on Blockchain (Blockchain 2021), Melbourne, Australia
Subjects: Cryptography and Security (cs.CR); Discrete Mathematics (cs.DM); Computer Science and Game Theory (cs.GT); Software Engineering (cs.SE)

Suppose that Alice plans to buy a physical good from Bob over a programmable Blockchain. Alice does not trust Bob, so she is not willing to pay before the good is delivered off-chain. Similarly, Bob does not trust Alice, so he is not willing to deliver the good before getting paid on-chain. Moreover, they are not inclined to use the services of a trusted third-party. Traditionally, such scenarios are handled by game-theoretic escrow smart contracts, such as BitHalo. In this work, we first show that the common method for this problem suffers from a major flaw which can be exploited by Bob in order to extort Alice. We also show that, unlike the case of auctions, this flaw cannot be addressed by a commitment-scheme-based approach. We then provide a much more general result: assuming that the two sides are rational actors and the smart contract language is Turing-complete, there is no escrow smart contract that can facilitate this exchange without either relying on third parties or enabling at least one side to extort the other.

[121]  arXiv:2110.09866 [pdf, other]
Title: Learning a self-supervised tone mapping operator via feature contrast masking loss
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)

High Dynamic Range (HDR) content is becoming ubiquitous due to the rapid development of capture technologies. Nevertheless, the dynamic range of common display devices is still limited; therefore, tone mapping (TM) remains a key challenge for image visualization. Recent work has demonstrated that neural networks can achieve remarkable performance in this task when compared to traditional methods; however, the quality of the results of these learning-based methods is limited by the training data. Most existing works use as a training set a curated selection of best-performing results from existing traditional tone mapping operators (often guided by a quality metric); therefore, the quality of newly generated results is fundamentally limited by the performance of such operators. This quality might be even further limited by the pool of HDR content that is used for training. In this work we propose a learning-based self-supervised tone mapping operator that is trained at test time specifically for each HDR image and does not need any data labeling. The key novelty of our approach is a carefully designed loss function built upon fundamental knowledge of contrast perception that allows for directly comparing the content in the HDR and tone mapped images. We achieve this goal by reformulating classic VGG feature maps into feature contrast maps that normalize local feature differences by their average magnitude in a local neighborhood, allowing our loss to account for contrast masking effects. We perform extensive ablation studies and exploration of parameters and demonstrate that our solution outperforms existing approaches with a single set of fixed parameters, as confirmed by both objective and subjective metrics.

[122]  arXiv:2110.09868 [pdf, other]
Title: Designing A Clinically Applicable Deep Recurrent Model to Identify Neuropsychiatric Symptoms in People Living with Dementia Using In-Home Monitoring Data
Comments: 13 pages, accepted to Research2Clinics WS @ NeurIPS 2021
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)

Agitation is one of the neuropsychiatric symptoms with high prevalence in dementia, and it can negatively impact the Activities of Daily Living (ADL) and the independence of individuals. Detecting agitation episodes can assist in providing People Living with Dementia (PLWD) with early and timely interventions. Analysing agitation episodes will also help identify modifiable factors such as ambient temperature and sleep as possible components causing agitation in an individual. This preliminary study presents a supervised learning model to analyse the risk of agitation in PLWD using in-home monitoring data. The in-home monitoring data includes motion sensors, physiological measurements, and the use of kitchen appliances from 46 homes of PLWD between April 2019 and June 2021. We apply a recurrent deep learning model to identify agitation episodes validated and recorded by a clinical monitoring team. We present the experiments to assess the efficacy of the proposed model. The proposed model achieves an average of 79.78% recall, 27.66% precision and a 37.64% F1 score when employing the optimal parameters, suggesting a good ability to recognise agitation events. We also discuss using machine learning models for analysing behavioural patterns using continuous monitoring data and explore clinical applicability and the choices between sensitivity and specificity in in-home monitoring applications.
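
As a reading aid for the reported metrics, the following minimal Python sketch computes F1 as the harmonic mean of precision and recall. Plugging in the averaged precision and recall quoted above gives roughly 41%, not 37.64%; one possible explanation (an assumption, the abstract does not say) is that F1 was computed per run or fold and then averaged, which generally differs from the F1 of the averaged precision and recall. The two "folds" below are made-up numbers used only to show that effect.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# F1 implied by the averaged precision/recall quoted in the abstract.
f1_from_averages = f1_score(0.2766, 0.7978)

# Hypothetical per-fold (precision, recall) pairs, NOT taken from the paper.
folds = [(0.2, 0.9), (0.4, 0.6)]
mean_of_f1 = sum(f1_score(p, r) for p, r in folds) / len(folds)
mean_p = sum(p for p, _ in folds) / len(folds)
mean_r = sum(r for _, r in folds) / len(folds)
f1_of_means = f1_score(mean_p, mean_r)

print(f"F1 from averaged P/R in the abstract: {f1_from_averages:.4f}")
print(f"mean of per-fold F1: {mean_of_f1:.4f}, F1 of mean P/R: {f1_of_means:.4f}")
```

The gap between the two aggregation orders is why a reported average F1 need not equal the harmonic mean of the reported average precision and recall.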

[123]  arXiv:2110.09869 [pdf, other]
Title: User-Centric Federated Learning
Comments: Accepted in Workshop on Wireless Communications For Distributed Intelligence, GLOBECOM 2021
Subjects: Machine Learning (cs.LG); Information Theory (cs.IT); Signal Processing (eess.SP)

Data heterogeneity across participating devices poses one of the main challenges in federated learning as it has been shown to greatly hamper its convergence time and generalization capabilities. In this work, we address this limitation by enabling personalization using multiple user-centric aggregation rules at the parameter server. Our approach potentially produces a personalized model for each user at the cost of some extra downlink communication overhead. To strike a trade-off between personalization and communication efficiency, we propose a broadcast protocol that limits the number of personalized streams while retaining the essential advantages of our learning scheme. Through simulation results, our approach is shown to enjoy higher personalization capabilities, faster convergence, and better communication efficiency compared to other competing baseline solutions.

[124]  arXiv:2110.09872 [pdf, other]
Title: Presentation and Publication: Loss and Slippage in Networks of Automated Market Makers
Comments: To appear in the Proceedings of **Tokenomics 2021** (3rd International Conference on Blockchain Economics, Security and Protocols)
Subjects: Other Computer Science (cs.OH)

Automated market makers (AMMs) are smart contracts that automatically trade electronic assets according to a mathematical formula. This paper investigates how an AMM's formula affects the interests of liquidity providers, who endow the AMM with assets, and traders, who exchange one asset for another at the AMM's rates. *Linear slippage* measures how a trade's size affects the trader's return, *angular slippage* measures how a trade's size affects the subsequent market price, *divergence loss* measures the opportunity cost of providers' investments, and *load* balances the costs to traders and providers. We give formal definitions for these costs and show that they obey certain conservation laws: these costs can be shifted around but never fully eliminated. We analyze how these costs behave under *composition*, when simple individual AMMs are linked to form more complex networks of AMMs.
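
To make the idea of trade size affecting both the trader's return and the subsequent price concrete, here is a small Python sketch. It assumes a fee-free constant-product pool (x * y = k), which is a common AMM formula but is not named in the abstract; the function names and pool sizes are illustrative, and the quantities printed are informal stand-ins, not the paper's formal definitions of linear or angular slippage.

```python
def swap_output(x_reserve: float, y_reserve: float, dx: float) -> float:
    """Amount of Y paid out when dx of X is sold to a constant-product pool (no fees)."""
    k = x_reserve * y_reserve
    return y_reserve - k / (x_reserve + dx)


def effective_rate(x_reserve: float, y_reserve: float, dx: float) -> float:
    """Average Y received per unit of X for a trade of size dx."""
    return swap_output(x_reserve, y_reserve, dx) / dx


def price_after(x_reserve: float, y_reserve: float, dx: float) -> float:
    """Marginal price of X (in units of Y) quoted by the pool after the trade."""
    new_x = x_reserve + dx
    new_y = y_reserve - swap_output(x_reserve, y_reserve, dx)
    return new_y / new_x


# Assumed pool: 1000 X and 1000 Y, so the initial marginal price is 1.0.
sizes = (1.0, 10.0, 100.0)
rates = [effective_rate(1000.0, 1000.0, dx) for dx in sizes]
for dx, rate in zip(sizes, rates):
    print(f"trade {dx:>6.1f} X -> rate {rate:.4f}, "
          f"price afterwards {price_after(1000.0, 1000.0, dx):.4f}")
```

Larger trades receive a worse average rate and leave the pool quoting a lower price for X, which is the kind of cost the abstract says can be shifted but not eliminated.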

[125]  arXiv:2110.09877 [pdf, other]
Title: Two-stage Voice Application Recommender System for Unhandled Utterances in Intelligent Personal Assistant
Comments: 9 pages, IRS KDD workshop 2021
Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL); Information Retrieval (cs.IR)

Intelligent personal assistants (IPA) enable voice applications that facilitate people's daily tasks. However, due to the complexity and ambiguity of voice requests, some requests may not be handled properly by the standard natural language understanding (NLU) component. In such cases, a simple reply like "Sorry, I don't know" hurts the user's experience and limits the functionality of IPA. In this paper, we propose a two-stage shortlister-reranker recommender system to match third-party voice applications (skills) to unhandled utterances. In this approach, a skill shortlister is proposed to retrieve candidate skills from the skill catalog by calculating both lexical and semantic similarity between skills and user requests. We also illustrate how to build a new system by using observed data collected from a baseline rule-based system, and how exposure biases can generate discrepancies between offline and human metrics. Lastly, we present two relabeling methods that can handle the incomplete ground truth and mitigate exposure bias. We demonstrate the effectiveness of our proposed system through extensive offline experiments. Furthermore, we present online A/B testing results that show a significant boost in user experience satisfaction.

[126]  arXiv:2110.09881 [pdf, other]
Title: HM-Net: A Regression Network for Object Center Detection and Tracking on Wide Area Motion Imagery
Comments: 11 pages, 11 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Wide Area Motion Imagery (WAMI) yields high resolution images with a large number of extremely small objects. Target objects have large spatial displacements throughout consecutive frames. This nature of WAMI images makes object tracking and detection challenging. In this paper, we present our deep neural network-based combined object detection and tracking model, namely, Heat Map Network (HM-Net). HM-Net is significantly faster than state-of-the-art frame differencing and background subtraction-based methods, without compromising detection and tracking performance. HM-Net follows the object center-based joint detection and tracking paradigm. Simple heat map-based predictions support an unlimited number of simultaneous detections. The proposed method uses two consecutive frames and the object detection heat map obtained from the previous frame as input, which helps HM-Net monitor spatio-temporal changes between frames and keep track of previously predicted objects. Although reuse of the prior object detection heat map acts as a vital feedback-based memory element, it can lead to an unintended surge of false positive detections. To increase the robustness of the method against false positives and to eliminate low confidence detections, HM-Net employs novel feedback filters and advanced data augmentations. HM-Net outperforms state-of-the-art WAMI moving object detection and tracking methods on the WPAFB dataset with its 96.2% F1 and 94.4% mAP detection scores, while achieving a 61.8% mAP tracking score on the same dataset.

[127]  arXiv:2110.09886 [pdf, other]
Title: The Footprint of Campaign Strategies in Farsi Twitter: A case for 2021 Iranian presidential election
Comments: 11 pages, 6 figures
Subjects: Social and Information Networks (cs.SI)

The rise of social media accompanied by the Covid-19 pandemic has instigated a paradigm shift in the presidential campaigns in Iran from the real world to social media. Unlike previous presidential elections, there was a decrease in physical events and advertisements for the candidates; in turn, the online presence of presidential candidates increased significantly. Farsi Twitter played a specific role in this matter, as it became the platform for creating political content. In this study, we found traces of organizational activities in Farsi Twitter. Our investigations reveal that the discussion network of the 2021 election is heterogeneous and highly polarized. However, unlike other elections, candidates' supporters are very close, and "Anti-voters" who endorse boycotting the election are at the opposite end of the discussion. Furthermore, a high presence of bot activity is observed among the most influential users in all of the involved communities.

[128]  arXiv:2110.09887 [pdf, other]
Title: Time Series Analysis via Network Science: Concepts and Algorithms
Subjects: Social and Information Networks (cs.SI); Machine Learning (cs.LG)

There is nowadays a constant flux of data being generated and collected in all types of real-world systems. These data sets are often indexed by time, space or both, requiring appropriate approaches to analyze the data. In univariate settings, time series analysis is a mature and solid field. However, in multivariate contexts, time series analysis still presents many limitations. In order to address these issues, the last decade has brought approaches based on network science. These methods involve transforming an initial time series data set into one or more networks, which can be analyzed in depth to provide insight into the original time series. This review provides a comprehensive overview of existing mapping methods for transforming time series into networks for a wide audience of researchers and practitioners in machine learning, data mining and time series. Our main contribution is a structured review of existing methodologies, identifying their main characteristics and their differences. We describe the main conceptual approaches, provide authoritative references and give insight into their advantages and limitations in a unified notation and language. We first describe the case of univariate time series, which can be mapped to single layer networks, and we divide the current mappings based on the underlying concept: visibility, transition and proximity. We then proceed with multivariate time series, discussing both single layer and multiple layer approaches. Although still very recent, this research area has much potential and with this survey we intend to pave the way for future research on the topic.

[129]  arXiv:2110.09888 [pdf, other]
Title: Novel Features for Time Series Analysis: A Complex Networks Approach
Subjects: Social and Information Networks (cs.SI); Machine Learning (cs.LG)

Time series data are ubiquitous in several domains such as climate, economics and health care. Mining features from these time series is a crucial task with a multidisciplinary impact. Usually, these features are obtained from structural characteristics of time series, such as trend, seasonality and autocorrelation, sometimes requiring data transformations and parametric models. A recent conceptual approach relies on mapping time series to complex networks, where network science methodologies can help characterize time series. In this paper, we consider two mapping concepts, visibility and transition probability, and propose network topological measures as a new set of time series features. To evaluate the usefulness of the proposed features, we address the problem of time series clustering. More specifically, we propose a clustering method that consists of mapping the time series into visibility graphs and quantile graphs, calculating global topological metrics of the resulting networks, and using data mining techniques to form clusters. We apply this method to a data set of synthetic and empirical time series. The results indicate that network-based features capture the information encoded in each of the time series models, resulting in high accuracy in a clustering task. Our results are promising and show that network analysis can be used to characterize different types of time series and that different mapping methods capture different characteristics of the time series.
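
For readers unfamiliar with the visibility mapping mentioned above, the following Python sketch builds a natural visibility graph from a short series: two observations are linked when the straight line between them passes above every observation in between. This is a plain quadratic-time illustration with evenly spaced time stamps assumed; the quantile-graph mapping and the specific topological metrics used in the paper are not shown, and the function names are illustrative.

```python
def visibility_graph(series):
    """Return the edge set of the natural visibility graph of a sequence.

    Nodes are the indices 0..n-1. Indices a < b are connected when every
    intermediate point c lies strictly below the line joining (a, y_a) and (b, y_b).
    """
    n = len(series)
    edges = set()
    for a in range(n - 1):
        for b in range(a + 1, n):
            ya, yb = series[a], series[b]
            visible = all(
                series[c] < yb + (ya - yb) * (b - c) / (b - a)
                for c in range(a + 1, b)
            )
            if visible:
                edges.add((a, b))
    return edges


def node_degrees(n, edges):
    """Degree of each node, a simple example of a topological feature."""
    deg = [0] * n
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg


toy_series = [1, 3, 2, 4]
toy_edges = visibility_graph(toy_series)
print(sorted(toy_edges))
print(node_degrees(len(toy_series), toy_edges))
```

Summary statistics of such a graph (for example the degree sequence) can then serve as features of the original series.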

[130]  arXiv:2110.09893 [pdf]
Title: Visualizing Collective Idea Generation and Innovation Processes in Social Networks
Subjects: Social and Information Networks (cs.SI)

Collective idea generation and innovation processes are complex and dynamic, involving a large amount of qualitative narrative information that is difficult to monitor, analyze and visualize using traditional methods. In this study, we developed three new visualization methods for collective idea generation and innovation processes and applied them to data from online collaboration experiments. The first visualization is the Idea Cloud, which helps monitor collective idea posting activity and intuitively tracks idea clustering and transition. The second visualization is the Idea Geography, which helps understand how the idea space and its utility landscape are structured and how collaboration was performed in that space. The third visualization is the Idea Network, which connects idea dynamics with the social structure of the people who generated them, displaying how social influence among neighbors may have affected collaborative activities and where innovative ideas arose and spread in the social network.

[131]  arXiv:2110.09898 [pdf, other]
Title: Dynamics and control of clustered tensegrity systems
Comments: 15 pages, 24 figures
Subjects: Systems and Control (eess.SY); Applied Physics (physics.app-ph)

This paper presents the formulations of nonlinear and linearized statics, dynamics, and control for any clustered tensegrity system (CTS). Based on the Lagrangian method and FEM assumptions, the nonlinear clustered tensegrity dynamics with and without constraints are first derived. It is shown that the traditional tensegrity system (TTS), whose node-to-node strings are individual ones, is a special case of the CTS. Then, equilibrium equations of the CTS in three standard forms (in terms of nodal coordinate, force density, and force vector) and the compatibility equation are given. Moreover, the linearized dynamics and modal analysis of the CTS with and without constraints are also derived. We also present a nonlinear shape control law for the control of any CTS. The control turns out to be a linear algebra problem in terms of the control variable, namely the force densities in the strings. The statics, dynamics, and control examples are carefully selected to demonstrate the developed principles. The presented approaches can boost comprehensive studies of the statics, dynamics, and control of any CTS, as well as promote the integration of structure and control design.

[132]  arXiv:2110.09899 [pdf, other]
Title: POLE: Polarized Embedding for Signed Networks
Comments: Accepted to WSDM 2022
Subjects: Social and Information Networks (cs.SI); Machine Learning (cs.LG)

From the 2016 U.S. presidential election to the 2021 Capitol riots to the spread of misinformation related to COVID-19, many have blamed social media for today's deeply divided society. Recent advances in machine learning for signed networks hold the promise to guide small interventions with the goal of reducing polarization in social media. However, existing models are especially ineffective in predicting conflicts (or negative links) among users. This is due to a strong correlation between link signs and the network structure, where negative links between polarized communities are too sparse to be predicted even by state-of-the-art approaches. To address this problem, we first design a partition-agnostic polarization measure for signed graphs based on the signed random-walk and show that many real-world graphs are highly polarized. Then, we propose POLE (POLarized Embedding for signed networks), a signed embedding method for polarized graphs that captures both topological and signed similarities jointly via signed autocovariance. Through extensive experiments, we show that POLE significantly outperforms state-of-the-art methods in signed link prediction, particularly for negative links with gains of up to one order of magnitude.

[133]  arXiv:2110.09902 [pdf, other]
Title: Understanding Convolutional Neural Networks from Theoretical Perspective via Volterra Convolution
Subjects: Machine Learning (cs.LG)

This study proposes a general and unified perspective of convolutional neural networks by exploring the relationship between (deep) convolutional neural networks and finite Volterra convolutions. It provides a novel approach to explain and study the overall characteristics of neural networks without being disturbed by complex network architectures. Concretely, we examine the basic structures of finite-term Volterra convolutions and convolutional neural networks. Our results show that a convolutional neural network is an approximation of the finite-term Volterra convolution, whose order increases exponentially with the number of layers and whose kernel size increases exponentially with the strides. With this perspective, specialized perturbations are directly obtained from the approximated kernels rather than from iteratively generated adversarial examples. Extensive experiments on synthetic and real-world data sets show the correctness and effectiveness of our results.
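
As background, a finite Volterra series truncated at second order can be written for a one-dimensional discrete signal as below. This is the textbook form, given here only as orientation; the symbols (kernels h_0, h_1, h_2 and memory length M) are notation chosen for this note, and the paper's own definition may differ in indexing or dimensionality.

```latex
y(n) = h_0
     + \sum_{i=0}^{M-1} h_1(i)\, x(n-i)
     + \sum_{i=0}^{M-1} \sum_{j=0}^{M-1} h_2(i,j)\, x(n-i)\, x(n-j)
```

The first sum is an ordinary linear convolution; higher-order terms add products of shifted inputs, which is how the series models nonlinear behaviour.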

[134]  arXiv:2110.09903 [pdf, other]
Title: Unrestricted Adversarial Attacks on ImageNet Competition
Comments: CVPR-2021 AIC Phase VI Track2: Unrestricted Adversarial Attacks on ImageNet
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Many works have investigated adversarial attacks or defenses under the setting where a bounded and imperceptible perturbation can be added to the input. However, in the real world, the attacker does not need to comply with this restriction. In fact, more threats to deep models come from unrestricted adversarial examples, that is, the attacker makes large and visible modifications to the image that cause the model to misclassify it but do not affect normal observation from a human perspective. Unrestricted adversarial attack is a popular and practical direction but has not been studied thoroughly. We organize this competition with the purpose of exploring more effective unrestricted adversarial attack algorithms, so as to accelerate academic research on model robustness under stronger unbounded attacks. The competition is held on the TianChi platform (https://tianchi.aliyun.com/competition/entrance/531853/introduction) as part of the AI Security Challengers Program series.

[135]  arXiv:2110.09904 [pdf, other]
Title: Learning Robotic Manipulation Skills Using an Adaptive Force-Impedance Action Space
Subjects: Robotics (cs.RO); Machine Learning (cs.LG)

Intelligent agents must be able to think fast and slow to perform elaborate manipulation tasks. Reinforcement Learning (RL) has led to many promising results on a range of challenging decision-making tasks. However, in real-world robotics, these methods still struggle, as they require large amounts of expensive interactions and have slow feedback loops. On the other hand, fast human-like adaptive control methods can optimize complex robotic interactions, yet fail to integrate multimodal feedback needed for unstructured tasks. In this work, we propose to factor the learning problem in a hierarchical learning and adaption architecture to get the best of both worlds. The framework consists of two components, a slow reinforcement learning policy optimizing the task strategy given multimodal observations, and a fast, real-time adaptive control policy continuously optimizing the motion, stability, and effort of the manipulator. We combine these components through a bio-inspired action space that we call AFORCE. We demonstrate the new action space on a contact-rich manipulation task on real hardware and evaluate its performance on three simulated manipulation tasks. Our experiments show that AFORCE drastically improves sample efficiency while reducing energy consumption and improving safety.

[136]  arXiv:2110.09905 [pdf, other]
Title: Show Me the Whole World: Towards Entire Item Space Exploration for Interactive Personalized Recommendations
Comments: WSDM 2022
Subjects: Information Retrieval (cs.IR)

User interest exploration is an important and challenging topic in recommender systems, as it alleviates the closed-loop effects between recommendation models and user-item interactions. Contextual bandit (CB) algorithms strive to make a good trade-off between exploration and exploitation so that users' potential interests have a chance to be exposed. However, classical CB algorithms can only be applied to a small, sampled item set (usually hundreds), which limits typical applications in recommender systems to candidate post-ranking, homepage top item ranking, ad creative selection, or online model selection (A/B test).
In this paper, we introduce two simple but effective hierarchical CB algorithms that make a classical CB model (such as LinUCB and Thompson Sampling) capable of exploring users' interest in the entire item space without limiting it to a small item set. We first construct a hierarchy item tree via a bottom-up clustering algorithm to organize items in a coarse-to-fine manner. Then we propose a hierarchical CB (HCB) algorithm to explore users' interest in the hierarchy tree. HCB treats the exploration problem as a series of decision-making processes, where the goal is to find a path from the root to a leaf node, and the feedback is back-propagated to all the nodes in the path. We further propose a progressive hierarchical CB (pHCB) algorithm, which progressively extends visible nodes that reach a confidence level for exploration, to avoid misleading actions on upper-level nodes in the sequential decision-making process. Extensive experiments on two public recommendation datasets demonstrate the effectiveness and flexibility of our methods.
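
The root-to-leaf selection with feedback propagated along the path can be sketched as follows. This is a simplified illustration, not the paper's algorithm: it swaps the contextual models named in the abstract (LinUCB, Thompson Sampling) for a context-free UCB1 score at every tree level, the tree is hand-built rather than produced by clustering, and all class and function names are hypothetical.

```python
import math


class Node:
    """A node of the item tree; leaves stand for items, inner nodes for clusters."""

    def __init__(self, name, children=None):
        self.name = name
        self.children = children or []
        self.pulls = 0
        self.total_reward = 0.0

    def ucb(self, parent_pulls, c=1.0):
        """UCB1 score; unvisited nodes are tried first."""
        if self.pulls == 0:
            return float("inf")
        mean = self.total_reward / self.pulls
        return mean + c * math.sqrt(math.log(parent_pulls) / self.pulls)


def select_path(root, c=1.0):
    """Walk from the root to a leaf, picking the highest-scoring child at each level."""
    path = [root]
    node = root
    while node.children:
        parent = node
        node = max(parent.children, key=lambda ch: ch.ucb(parent.pulls, c))
        path.append(node)
    return path


def update_path(path, reward):
    """Propagate the observed feedback to every node on the chosen path."""
    for node in path:
        node.pulls += 1
        node.total_reward += reward


def run(root, reward_fn, rounds):
    """Repeat select/observe/update; reward_fn maps the chosen leaf to a reward."""
    for _ in range(rounds):
        path = select_path(root)
        update_path(path, reward_fn(path[-1]))
```

A caller would build the tree, supply a reward function (for example, observed clicks on the recommended leaf item), and call `run`; over time the pull counts concentrate on the branch and leaf with the best feedback.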

[137]  arXiv:2110.09910 [pdf, other]
Title: FedHe: Heterogeneous Models and Communication-Efficient Federated Learning
Subjects: Machine Learning (cs.LG); Distributed, Parallel, and Cluster Computing (cs.DC)

Federated learning (FL) is able to manage edge devices to cooperatively train a model while keeping the training data local and private. One common assumption in FL is that all edge devices share the same machine learning model in training, for example, an identical neural network architecture. However, the computation and storage capabilities of different devices may not be the same. Moreover, reducing communication overheads can improve training efficiency, though it remains a challenging problem in FL. In this paper, we propose a novel FL method, called FedHe, inspired by knowledge distillation, which can train heterogeneous models and support asynchronous training processes with significantly reduced communication overheads. Our analysis and experimental results demonstrate that our proposed method performs better than state-of-the-art algorithms in terms of communication overheads and model accuracy.

[138]  arXiv:2110.09911 [pdf, other]
Title: Coalgebraic modal logic and games for coalgebras with side effects
Subjects: Logic in Computer Science (cs.LO)

We study coalgebraic modal logic and games to characterise behavioural equivalence in the presence of side effects, i.e., when coalgebras live in a (co)Kleisli or an Eilenberg-Moore category. Our aim is to develop a general framework based on indexed categories/fibrations that is common, at least, to the aforementioned categories. In particular, we show how the coalgebraic notion of behavioural equivalence arises from a relation lifting (a special kind of indexed morphism) and we give a general recipe to construct such liftings in the above three cases. Lastly, we apply this framework to derive games and logical characterisations for (weighted) language equivalence and conditional bisimilarity.

[139]  arXiv:2110.09912 [pdf, other]
Title: Optimal control using flux potentials: A way to construct bound-preserving finite element schemes for conservation laws
Subjects: Numerical Analysis (math.NA)

To ensure preservation of local or global bounds for numerical solutions of conservation laws, we constrain a baseline finite element discretization using optimization-based (OB) flux correction. The main novelty of the proposed methodology lies in the use of flux potentials as control variables and targets of inequality-constrained optimization problems for numerical fluxes. In contrast to optimal control via general source terms, the discrete conservation property of flux-corrected finite element approximations is guaranteed without the need to impose additional equality constraints. Since the number of flux potentials is less than the number of fluxes in the multidimensional case, the potential-based version of optimal flux control involves fewer unknowns than direct calculation of optimal fluxes. We show that the feasible set of a potential-state potential-target (PP) optimization problem is nonempty and choose a primal-dual Newton method for calculating the optimal flux potentials. The results of numerical studies for linear advection and anisotropic diffusion problems in 2D demonstrate the superiority of the new OB-PP algorithms to closed-form flux limiting under worst-case assumptions.

[140]  arXiv:2110.09913 [pdf, other]
Title: Prepartition: Load Balancing Approach for Virtual Machine Reservations in a Cloud Data Center
Comments: 10 figures, 5 tables, 21 pages, accepted with minor revisions in Journal of Computer Science and Technology
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

Load balancing is vital for the efficient and long-term operation of cloud data centers. With virtualization, post (reactive) migration of virtual machines after allocation is the traditional way to achieve load balancing and consolidation. However, it is not easy to meet predefined load balancing objectives with reactive migration, which may also interrupt services and cause instability. Therefore, we provide a new approach, called Prepartition, for load balancing. It partitions a VM request sequentially into a few sub-requests with start time, end time and capacity demands, and treats each sub-request as a regular VM request. In this way, it can proactively set a bound for each VM request on each physical machine and prepare the scheduler before VM migration to reach the predefined load balancing goal, which supports fine-grained resource allocation. Simulations with real-world traces and synthetic data show that Prepartition for offline (PrepartitionOff) scheduling performs 10%-20% better than existing load balancing algorithms under several metrics, including average utilization, imbalance degree, makespan and Capacity_makespan. We also extend Prepartition to online load balancing. Evaluation results show that our proposed approach also outperforms existing online algorithms.
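
The partitioning step described above can be illustrated with a short Python sketch. It assumes integer time slots and an even split of the request's duration into consecutive sub-requests that each keep the full capacity demand; how the paper actually chooses the number and length of sub-requests is not stated in the abstract, so `VMRequest`, `prepartition`, and the `parts` parameter are hypothetical names for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VMRequest:
    start: int     # first time slot (inclusive)
    end: int       # last time slot (exclusive)
    capacity: int  # capacity demand, e.g. number of CPU units


def prepartition(request: VMRequest, parts: int) -> list:
    """Split one request into up to `parts` back-to-back sub-requests.

    Each sub-request keeps the original capacity demand and can be handed to
    the scheduler as if it were a regular VM request.
    """
    duration = request.end - request.start
    if parts < 1 or duration < 1:
        raise ValueError("need parts >= 1 and a positive duration")
    parts = min(parts, duration)  # never create empty sub-requests
    base, extra = divmod(duration, parts)
    subs = []
    t = request.start
    for i in range(parts):
        length = base + (1 if i < extra else 0)
        subs.append(VMRequest(t, t + length, request.capacity))
        t += length
    return subs


for sub in prepartition(VMRequest(start=0, end=10, capacity=4), parts=3):
    print(sub)
```

Because every sub-request is an ordinary request, a scheduler can place each one on a different physical machine, which bounds how long any single machine carries the load of one long-running VM.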

[141]  arXiv:2110.09915 [pdf, other]
Title: Entity Relation Extraction as Dependency Parsing in Visually Rich Documents
Comments: Accepted to EMNLP 2021 (main conference)
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)

Previous works on key information extraction from visually rich documents (VRDs) mainly focus on labeling the text within each bounding box (i.e., semantic entity), while the relations in-between are largely unexplored. In this paper, we adapt the popular dependency parsing model, the biaffine parser, to this entity relation extraction task. Unlike the original dependency parsing model, which recognizes dependency relations between words, we identify relations between groups of words with layout information. We have compared different representations of the semantic entity, different VRD encoders, and different relation decoders. The results demonstrate that our proposed model achieves a 65.96% F1 score on the FUNSD dataset. In a real-world application, our model has been applied to in-house customs data, achieving reliable performance in the production setting.

[142]  arXiv:2110.09918 [pdf, other]
Title: Using RDMA for Efficient Index Replication in LSM Key-Value Stores
Subjects: Databases (cs.DB); Distributed, Parallel, and Cluster Computing (cs.DC)

Log-Structured Merge tree (LSM tree) Key-Value (KV) stores have become a foundational layer in the storage stacks of datacenter and cloud services. Current approaches for achieving reliability and availability avoid replication at the KV store level and instead perform these operations at higher layers, e.g., the DB layer that runs on top of the KV store. The main reason is that past designs for replicated KV stores favor reducing network traffic and increasing I/O size. Therefore, they perform costly compactions to reorganize data in both the primary and backup nodes, which hurts overall system performance.
In this paper, we design and implement Talos, an efficient rack-scale LSM-based KV store that aims to significantly reduce the I/O amplification and CPU overhead in backup nodes and make replication in the KV store practical. We rely on two observations: (a) the increased use of RDMA in the datacenter, which reduces CPU overhead for communication, and (b) the use of KV separation that is becoming prevalent in modern KV stores. We use a primary-backup replication scheme that performs compactions only on the primary nodes and sends the pre-built index to the backup nodes of the region, avoiding all compactions in backups. Our approach includes an efficient mechanism to deal with pointer translation across nodes in the region index. Our results show that, in the backup nodes, Talos reduces I/O amplification by up to $3\times$, CPU overhead by up to $1.6\times$, and the memory size needed for the write path by up to $2\times$, without increasing network bandwidth excessively (by up to $1.3\times$). Overall, we show that our approach has benefits even when small KV pairs dominate in a workload (80%-90%). Finally, it enables KV stores to operate with larger growth factors (from 10 to 16) to reduce space amplification without sacrificing precious CPU cycles.

[143]  arXiv:2110.09919 [pdf, other]
Title: ToFFi -- Toolbox for Frequency-based Fingerprinting of Brain Signals
Comments: 21 pages, 10 figures
Subjects: Machine Learning (cs.LG); Neurons and Cognition (q-bio.NC)

Spectral fingerprints (SFs) are unique power spectra signatures of human brain regions of interest (ROIs, Keitel & Gross, 2016). SFs allow for accurate ROI identification and can serve as biomarkers of differences exhibited by non-neurotypical groups. At present, there are no open-source, versatile tools to calculate spectral fingerprints. We have filled this gap by creating a modular, highly-configurable MATLAB Toolbox for Frequency-based Fingerprinting (ToFFi). It can transform MEG/EEG signals into unique spectral representations using ROIs provided by anatomical (AAL, Desikan-Killiany), functional (Schaefer), or other custom volumetric brain parcellations. Toolbox design supports reproducibility and parallel computations.

[144]  arXiv:2110.09929 [pdf, other]
Title: Minimal Multi-Layer Modifications of Deep Neural Networks
Subjects: Machine Learning (cs.LG); Logic in Computer Science (cs.LO)

Deep neural networks (DNNs) have become increasingly popular in recent years. However, despite their many successes, DNNs may also err and produce incorrect and potentially fatal outputs in safety-critical settings, such as autonomous driving, medical diagnosis, and airborne collision avoidance systems. Much work has been put into detecting such erroneous behavior in DNNs, e.g., via testing or verification, but removing these errors after their detection has received less attention. We present here a new tool, called \textsc{3M-DNN}, for \emph{repairing} a given DNN, which is known to err on some set of inputs. The novel repair procedure implemented in \textsc{3M-DNN} computes a modification to the network's weights that corrects its behavior, and attempts to minimize this change via a sequence of calls to a backend, black-box DNN verification engine. To the best of our knowledge, our method is the first one that allows repairing the network by simultaneously modifying multiple layers. This is achieved by splitting the network into sub-networks, and applying a single-layer repairing technique to each component. We evaluated the \textsc{3M-DNN} tool on an extensive set of benchmarks, obtaining promising results. Data Availability Statement: An artifact will be submitted to the AEC under EasyChair ID 60.

[145]  arXiv:2110.09934 [pdf, other]
Title: Elevating the future of mobility: UAV-enabled Intelligent Transportation Systems
Subjects: Networking and Internet Architecture (cs.NI)

Intelligent Transportation Systems (ITS) improve traffic efficiency, traffic management, driver comfort, and safety. They consist of a broad range of components, including vehicles, sensors, Base Stations, Road Side Units, and road infrastructure (e.g., traffic signals). ITS of the near future will need to support multi-modal transportation schemes (including ground and aerial vehicles, so-called Urban Air Mobility). ITS will have to be integrated with Unmanned Aerial Systems Traffic Management (UTM) and rely on three-dimensional (3D) connectivity provided by Integrated Aerial-Terrestrial 6G networks to achieve this support. In other words, various types of Unmanned Aerial Vehicles (UAVs) will become integral parts of future ITS due to their mobility, autonomous operation, and communication/processing capabilities. This article presents our view on the future integration of ITS and UTM systems, the enabling wireless technologies, and open research questions. We also present how UAVs can be used to enhance the performance of the currently available ITS.

[146]  arXiv:2110.09935 [pdf, ps, other]
Title: Random Feature Approximation for Online Nonlinear Graph Topology Identification
Subjects: Machine Learning (cs.LG); Signal Processing (eess.SP)

Online topology estimation of graph-connected time series is challenging, especially since the causal dependencies in many real-world networks are nonlinear. In this paper, we propose a kernel-based algorithm for graph topology estimation. The algorithm uses a Fourier-based random feature approximation to tackle the curse of dimensionality associated with the kernel representations. Exploiting the fact that real-world networks often exhibit sparse topologies, we propose a group-lasso-based optimization framework, which is solved using an iterative composite objective mirror descent method, yielding an online algorithm with fixed computational complexity per iteration. The experiments conducted on real and synthetic data show that the proposed method outperforms its competitors.
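
As an illustration of the random feature idea mentioned in this abstract, the sketch below approximates a Gaussian kernel with random Fourier features in plain Python. The choice of a Gaussian kernel, the function names (`make_rff`, `gaussian_kernel`) and all parameter values are assumptions made for this example; the abstract does not specify the kernel or the feature construction, so this is not the authors' implementation.

```python
import math
import random


def make_rff(dim, num_features, sigma, seed=0):
    """Return a feature map z with z(x).z(y) ~= exp(-||x - y||^2 / (2 sigma^2)).

    Hypothetical helper: frequencies are drawn from N(0, 1/sigma^2) and
    phases uniformly from [0, 2*pi), the standard construction for a
    Gaussian kernel.
    """
    rng = random.Random(seed)
    ws = [[rng.gauss(0.0, 1.0 / sigma) for _ in range(dim)]
          for _ in range(num_features)]
    bs = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    scale = math.sqrt(2.0 / num_features)

    def features(x):
        # One cosine feature per random (frequency, phase) pair.
        return [scale * math.cos(sum(wi * xi for wi, xi in zip(w, x)) + b)
                for w, b in zip(ws, bs)]

    return features


def gaussian_kernel(x, y, sigma):
    """Exact Gaussian kernel, used only to check the approximation."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq / (2.0 * sigma ** 2))


features = make_rff(dim=2, num_features=4000, sigma=1.0)
x, y = [0.3, -0.2], [0.1, 0.4]
approx = sum(a * b for a, b in zip(features(x), features(y)))
exact = gaussian_kernel(x, y, 1.0)
print(f"approx={approx:.3f} exact={exact:.3f}")
```

The point of the construction is that the feature dimension (`num_features`) is fixed in advance, so the cost per update does not grow with the number of observed samples, which is consistent with the fixed per-iteration complexity the abstract claims.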

[147]  arXiv:2110.09936 [pdf, other]
Title: NeuralDiff: Segmenting 3D objects that move in egocentric videos
Comments: 3DV2021. Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)

Given a raw video sequence taken from a freely-moving camera, we study the problem of decomposing the observed 3D scene into a static background and a dynamic foreground containing the objects that move in the video sequence. This task is reminiscent of the classic background subtraction problem, but is significantly harder because all parts of the scene, static and dynamic, generate a large apparent motion due to the camera's large viewpoint change. In particular, we consider egocentric videos and further separate the dynamic component into objects and the actor that observes and moves them. We achieve this factorization by reconstructing the video via a triple-stream neural rendering network that explains the different motions based on corresponding inductive biases. We demonstrate that our method can successfully separate the different types of motion, outperforming recent neural rendering baselines at this task, and can accurately segment moving objects. We do so by assessing the method empirically on challenging videos from the EPIC-KITCHENS dataset which we augment with appropriate annotations to create a new benchmark for the task of dynamic object segmentation on unconstrained video sequences, for complex 3D environments.

[148]  arXiv:2110.09937 [pdf, other]
Title: Collective Shortest Paths for Minimizing Congestion on Temporal Load-Aware Road Networks
Comments: 10 pages, to appear at the IWCTS Workshop at SIGSPATIAL 2021
Subjects: Databases (cs.DB); Data Structures and Algorithms (cs.DS)

Shortest path queries over graphs are usually considered as isolated tasks, where the goal is to return the shortest path for each individual query. In practice, however, such queries are typically part of a system (e.g., a road network) and their execution dynamically affects other queries and network parameters, such as the loads on edges, which in turn affect the shortest paths. We study the problem of collectively processing shortest path queries, where the goal is to optimize a collective objective, such as minimizing the overall cost. We define a temporal load-aware network that dynamically tracks expected loads while satisfying the desirable `first in, first out' property. We develop temporal load-aware extensions of widely used shortest path algorithms, and a scalable collective routing solution that seeks to reduce system-wide congestion through dynamic path reassignment. Experiments illustrate that our collective approach to this NP-hard problem achieves improvements in a variety of performance measures, such as i) reducing average travel times by up to 63%, ii) producing fairer suggestions across queries, and iii) distributing load across up to 97% of a city's road network capacity. The proposed approach is generalizable, which allows it to be adapted for other concurrent query processing tasks over networks.

[149]  arXiv:2110.09940 [pdf, other]
Title: Learning Representations that Support Robust Transfer of Predictors
Subjects: Machine Learning (cs.LG)

Ensuring generalization to unseen environments remains a challenge. Domain shift can lead to substantially degraded performance unless shifts are well-exercised within the available training environments. We introduce a simple robust estimation criterion -- transfer risk -- that is specifically geared towards optimizing transfer to new environments. Effectively, the criterion amounts to finding a representation that minimizes the risk of applying any optimal predictor trained on one environment to another. The transfer risk essentially decomposes into two terms, a direct transfer term and a weighted gradient-matching term arising from the optimality of per-environment predictors. Although inspired by IRM, we show that transfer risk serves as a better out-of-distribution generalization criterion, both theoretically and empirically. We further demonstrate the impact of optimizing such transfer risk on two controlled settings, each representing a different pattern of environment shift, as well as on two real-world datasets. Experimentally, the approach outperforms baselines across various out-of-distribution generalization tasks. Code is available at \url{https://github.com/Newbeeer/TRM}.

[150]  arXiv:2110.09943 [pdf, other]
Title: BAMLD: Bayesian Active Meta-Learning by Disagreement
Comments: submitted for publication
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)

Data-efficient learning algorithms are essential in many practical applications for which data collection and labeling are expensive or infeasible, e.g., for autonomous cars. To address this problem, meta-learning infers an inductive bias from a set of meta-training tasks in order to learn a new, but related, task using a small number of samples. Most studies assume the meta-learner to have access to labeled data sets from a large number of tasks. In practice, one may have available only unlabeled data sets from the tasks, requiring a costly labeling procedure to be carried out before use in standard meta-learning schemes. To decrease the number of labeling requests for meta-training tasks, this paper introduces an information-theoretic active task selection mechanism which quantifies the epistemic uncertainty via disagreements among the predictions obtained under different inductive biases. We detail an instantiation for nonparametric methods based on Gaussian Process Regression, and report empirical results that compare favourably against existing heuristic acquisition mechanisms.

[151]  arXiv:2110.09947 [pdf, other]
Title: Using Program Synthesis and Inductive Logic Programming to solve Bongard Problems
Comments: Equal contribution from first two authors. Accepted at the 10th International Workshop on Approaches and Applications of Inductive Programming as a Work In Progress Report
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Programming Languages (cs.PL)

The ability to recognise and make analogies is often used as a measure or test of human intelligence. The ability to solve Bongard problems is an example of such a test. It has also been postulated that the ability to rapidly construct novel abstractions is critical to being able to solve analogical problems. Given an image, the ability to construct a program that would generate that image is one form of abstraction, as exemplified in the Dreamcoder project. In this paper, we present a preliminary examination of whether programs constructed by Dreamcoder can be used for analogical reasoning to solve certain Bongard problems. We use Dreamcoder to discover programs that generate the images in a Bongard problem and represent each of these as a sequence of state transitions. We decorate the states using positional information in an automated manner and then encode the resulting sequence into logical facts in Prolog. We use inductive logic programming (ILP) to learn an (interpretable) theory for the abstract concept involved in an instance of a Bongard problem. Experiments on synthetically created Bongard problems for concepts such as 'above/below' and 'clockwise/counterclockwise' demonstrate that our end-to-end system can solve such problems. We study the importance and completeness of each component of our approach, highlighting its current limitations and pointing to directions for improvement in our formulation as well as in elements of any Dreamcoder-like program synthesis system used for such an approach.

[152]  arXiv:2110.09951 [pdf, other]
Title: Talking Head Generation with Audio and Speech Related Facial Action Units
Comments: Accepted by BMVC 2021
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)

The task of talking head generation is to synthesize a lip-synchronized talking head video from an arbitrary face image and audio clips. Most existing methods ignore the local driving information of the mouth muscles. In this paper, we propose a novel recurrent generative network that uses both audio and speech-related facial action units (AUs) as the driving information. AU information related to the mouth can guide the movement of the mouth more accurately. Since speech is highly correlated with speech-related AUs, we propose an Audio-to-AU module in our system to predict the speech-related AU information from speech. In addition, we use an AU classifier to ensure that the generated images contain correct AU information. A frame discriminator is also constructed for adversarial training to improve the realism of the generated face. We verify the effectiveness of our model on the GRID dataset and TCD-TIMIT dataset. We also conduct an ablation study to verify the contribution of each component in our model. Quantitative and qualitative experiments demonstrate that our method outperforms existing methods in both image quality and lip-sync accuracy.

[153]  arXiv:2110.09962 [pdf, other]
Title: PR-CIM: a Variation-Aware Binary-Neural-Network Framework for Process-Resilient Computation-in-memory
Comments: 8 pages, 11 figures
Subjects: Machine Learning (cs.LG)

Binary neural networks (BNNs) that use 1-bit weights and activations have garnered interest as extreme quantization provides low power dissipation. By implementing BNNs as computing-in-memory (CIM), which computes multiplications and accumulations on memory arrays in an analog fashion, namely analog CIM, we can further improve the energy efficiency of processing neural networks. However, analog CIMs suffer from the potential problem that process variation degrades the accuracy of BNNs. Our Monte-Carlo simulations show that in an SRAM-based analog CIM of VGG-9, the classification accuracy on CIFAR-10 degrades to below 20% under process variations of 65nm CMOS. To overcome this problem, we present a variation-aware BNN framework. The proposed framework is developed for SRAM-based BNN CIMs since SRAM is most widely used as on-chip memory, but it is easily extensible to BNN CIMs based on other memories. Our extensive experimental results show that under process variation of 65nm CMOS, our framework significantly improves the CIFAR-10 accuracies of SRAM-based BNN CIMs, from 10% and 10.1% to 87.76% and 77.74% for VGG-9 and RESNET-18 respectively.
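
For readers unfamiliar with the multiply-accumulate that a BNN performs, the sketch below shows the usual digital formulation: with weights and activations in {-1, +1} packed as bits, the dot product reduces to an XNOR followed by a population count. This is a generic software illustration with hypothetical helper names (`encode`, `binary_dot`); it does not model the analog SRAM-based CIM or the variation-aware framework described in the abstract.

```python
def encode(vec):
    """Pack a list of +1/-1 values into an int (bit i set means vec[i] == +1)."""
    bits = 0
    for i, v in enumerate(vec):
        if v == 1:
            bits |= 1 << i
    return bits


def binary_dot(w_bits, a_bits, n):
    """Dot product of two {-1, +1}^n vectors given in packed form.

    XNOR marks the positions where the signs agree (product +1); every
    other position contributes -1, so the sum is agree - (n - agree).
    """
    mask = (1 << n) - 1
    agree = bin(~(w_bits ^ a_bits) & mask).count("1")
    return 2 * agree - n


weights = [1, -1, 1, 1]
activations = [1, 1, -1, 1]
result = binary_dot(encode(weights), encode(activations), len(weights))
reference = sum(w * a for w, a in zip(weights, activations))
print(result, reference)  # both are 0
```

In an analog CIM the accumulation is carried out on the memory array rather than by a popcount, which is where, according to the abstract, process variation can corrupt the result.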

[154]  arXiv:2110.09972 [pdf, ps, other]
Title: Exploring the Gap between Tolerant and Non-tolerant Distribution Testing
Comments: 22 pages
Subjects: Data Structures and Algorithms (cs.DS)

The framework of distribution testing is currently ubiquitous in the field of property testing. In this model, the input is a probability distribution accessible via independently drawn samples from an oracle. The testing task is to distinguish a distribution that satisfies some property from a distribution that is far from satisfying it in the $\ell_1$ distance. The task of tolerant testing imposes a further restriction, that distributions close to satisfying the property are also accepted. This work focuses on the connection between the sample complexities of non-tolerant ("traditional") testing of distributions and tolerant testing thereof. When limiting our scope to label-invariant (symmetric) properties of distributions, we prove that the gap is at most quadratic. Conversely, the property of being the uniform distribution is indeed known to have an almost-quadratic gap. When moving to general, not necessarily label-invariant properties, the situation is more complicated, and we show some partial results. We show that if a property requires the distributions to be non-concentrated, then it cannot be non-tolerantly tested with $o(\sqrt{n})$ many samples, where $n$ denotes the universe size. Clearly, this implies at most a quadratic gap, because a distribution can be learned (and hence tolerantly tested against any property) using $\mathcal{O}(n)$ many samples. Being non-concentrated is a strong requirement on the property, as we also prove a close to linear lower bound against their tolerant tests. To provide evidence for other general cases (where the properties are not necessarily label-invariant), we show that if an input distribution is very concentrated, in the sense that it is mostly supported on a subset of size $s$ of the universe, then it can be learned using only $\mathcal{O}(s)$ many samples. The learning procedure adapts to the input, and works without knowing $s$ in advance.
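
To make the quantities in this abstract concrete, the sketch below computes the $\ell_1$ distance between two discrete distributions and learns a distribution from samples by taking empirical frequencies. The function names, the toy distributions and the sample count are assumptions for illustration; this is the naive plug-in learner, not the adaptive procedure the paper describes.

```python
import random
from collections import Counter


def l1_distance(p, q):
    """l1 distance between two distributions given as {element: probability}."""
    keys = set(p) | set(q)
    return sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def empirical(samples):
    """Plug-in estimate: the observed frequency of each element."""
    counts = Counter(samples)
    total = len(samples)
    return {k: c / total for k, c in counts.items()}


uniform = {k: 0.25 for k in range(4)}
concentrated = {0: 0.5, 1: 0.5}
print(l1_distance(uniform, concentrated))  # 1.0

# Learn the uniform distribution on a support of size 4 from samples.
rng = random.Random(0)
samples = rng.choices([0, 1, 2, 3], k=20000)
learned = empirical(samples)
error = l1_distance(learned, uniform)
print(f"l1 error of the learned distribution: {error:.4f}")
```

A learner of this kind yields a tolerant tester for any property, since distances to the property can be estimated on the learned distribution; that is the reasoning behind the abstract's remark that learning with $\mathcal{O}(n)$ samples bounds the gap.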

[155]  arXiv:2110.09974 [pdf, other]
Title: TsmoBN: Interventional Generalization for Unseen Clients in Federated Learning
Subjects: Machine Learning (cs.LG)

Generalizing federated learning (FL) models to unseen clients with non-iid data is a crucial topic, yet unsolved so far. In this work, we propose to tackle this problem from a novel causal perspective. Specifically, we form a training structural causal model (SCM) to explain the challenges of model generalization in a distributed learning paradigm. Based on this, we present a simple yet effective method using test-specific and momentum tracked batch normalization (TsmoBN) to generalize FL models to testing clients. We give a causal analysis by formulating another testing SCM and demonstrate that the key factor in TsmoBN is the test-specific statistics (i.e., mean and variance) of features. Such statistics can be seen as a surrogate variable for causal intervention. In addition, by considering generalization bounds in FL, we show that our TsmoBN method can reduce divergence between training and testing feature distributions, which achieves a lower generalization gap than standard model testing. Our extensive experimental evaluations demonstrate significant improvements for unseen client generalization on three datasets with various types of feature distributions and numbers of clients. It is worth noting that our proposed approach can be flexibly applied to different state-of-the-art federated learning algorithms and is orthogonal to existing domain generalization methods.

[156]  arXiv:2110.09977 [pdf, other]
Title: An Ultra-Reliable Low-Latency Non-Binary Polar Coded SCMA Scheme
Subjects: Information Theory (cs.IT); Networking and Internet Architecture (cs.NI)

The joint transmission scheme of polar codes and sparse code multiple access (SCMA) has been regarded as a promising technology for future wireless communication systems. However, most of the existing polar-coded SCMA (PC-SCMA) systems suffer from high latency caused by the feedback iteration and list decoding. In addition, the error performance of PC-SCMA systems is unsatisfactory for ultra-reliable transmission. Inspired by the compelling benefits of non-binary polar codes, in this paper, we design a non-binary polar-coded SCMA (NB-PC-SCMA) system with a free order matching strategy to address the issues of delay and reliability. Specifically, we first formulate a joint factor graph for NB-PC-SCMA and propose a non-binary successive cancellation list (NB-SCL) and damping based joint iterative detection and decoding (NSD-JIDD) multiuser receiver to improve the BER and latency performance. Then, a lazy-search based NB-SCL (L-NB-SCL) decoding is proposed to reduce the computational complexity by modifying the path search pattern of the list decoder. After that, we optimize the update of user nodes for SCMA detection to improve the convergence error and finally propose the optimized NSD-JIDD (OSD-JIDD) algorithm, which can avoid redundant operations by exploiting L-NB-SCL decoding. Simulation results show that the proposed NB-PC-SCMA system achieves better bit error rate (BER) performance and considerable latency gain when compared to its counterparts. In particular, the proposed OSD-JIDD can achieve BER performance similar to that of NSD-JIDD with less complexity.

[157]  arXiv:2110.09978 [pdf, other]
Title: What is Learned in Knowledge Graph Embeddings?
Comments: 16 pages
Subjects: Artificial Intelligence (cs.AI)

A knowledge graph (KG) is a data structure which represents entities and relations as the vertices and edges of a directed graph with edge types. KGs are an important primitive in modern machine learning and artificial intelligence. Embedding-based models, such as the seminal TransE [Bordes et al., 2013] and the recent PairRE [Chao et al., 2020] are among the most popular and successful approaches for representing KGs and inferring missing edges (link completion). Their relative success is often credited in the literature to their ability to learn logical rules between the relations.
In this work, we investigate whether learning rules between relations is indeed what drives the performance of embedding-based methods. We define motif learning and two alternative mechanisms, network learning (based only on the connectivity of the KG, ignoring the relation types), and unstructured statistical learning (ignoring the connectivity of the graph). Using experiments on synthetic KGs, we show that KG models can learn motifs and how this ability is degraded by non-motif (noise) edges. We propose tests to distinguish the contributions of the three mechanisms to performance, and apply them to popular KG benchmarks. We also discuss an issue with the standard performance testing protocol and suggest an improvement.
To appear in the proceedings of Complex Networks 2021.

[158]  arXiv:2110.09987 [pdf, other]
Title: Energy-based Accounting Model for Heterogeneous Supercomputers
Comments: 9 pages, 1 figure
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

In this paper we present a new accounting model for heterogeneous supercomputers. An increasing number of supercomputing centres adopt heterogeneous architectures consisting of CPUs and hardware accelerators for their systems. Accounting models using the core hour as unit of measure are redefined to provide an appropriate charging rate based on the computing performance of different processing elements, as well as their energy efficiency and purchase price. In this paper we provide an overview of existing models and define a new model that, while retaining the core hour as a fundamental concept, takes into account the interplay among resources such as CPUs and RAM, and that bases the GPU charging rate on energy consumption. We believe that this model, designed for Pawsey Supercomputing Research Centre's next supercomputer Setonix, has a lot of advantages compared to other models, introducing carbon footprint as a primary driver in determining the allocation of computational workflow on heterogeneous resources.

[159]  arXiv:2110.09991 [pdf, other]
Title: Towards Optimal Correlational Object Search
Comments: 10 pages, 4 figures, 3 tables
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)

In realistic applications of object search, robots will need to locate target objects in complex environments while coping with unreliable sensors, especially for small or hard-to-detect objects. In such settings, correlational information can be valuable for planning efficiently: when looking for a fork, the robot could start by locating the easier-to-detect refrigerator, since forks would probably be found nearby. Previous approaches to object search with correlational information typically resort to ad-hoc or greedy search strategies. In this paper, we propose the Correlational Object Search POMDP (COS-POMDP), which can be solved to produce search strategies that use correlational information. COS-POMDPs contain a correlation-based observation model that allows us to avoid the exponential blow-up of maintaining a joint belief about all objects, while preserving the optimal solution to this naive, exponential POMDP formulation. We propose a hierarchical planning algorithm to scale up COS-POMDP for practical domains. We conduct experiments using AI2-THOR, a realistic simulator of household environments, as well as YOLOv5, a widely-used object detector. Our results show that, particularly for hard-to-detect objects, such as scrub brush and remote control, our method offers the most robust performance compared to baselines that ignore correlations as well as a greedy, next-best view approach.

[160]  arXiv:2110.09993 [pdf, other]
Title: A Unified and Refined Convergence Analysis for Non-Convex Decentralized Learning
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Optimization and Control (math.OC)

We study the consensus decentralized optimization problem where the objective function is the average of $n$ agents' private non-convex cost functions; moreover, the agents can only communicate with their neighbors on a given network topology. The stochastic online setting is considered in this paper where each agent can only access a noisy estimate of its gradient. Many decentralized methods can solve such problems including EXTRA, Exact-Diffusion/D$^2$, and gradient-tracking. Unlike the famed $\small \text{DSGD}$ algorithm, these methods have been shown to be robust to the heterogeneity of the local cost functions. However, the established convergence rates for these methods indicate that their sensitivity to the network topology is worse than that of $\small \text{DSGD}$. Such theoretical results imply that these methods can perform much worse than $\small \text{DSGD}$ over sparse networks, which, however, contradicts empirical experiments where $\small \text{DSGD}$ is observed to be more sensitive to the network topology.
In this work, we study a general stochastic unified decentralized algorithm ($\small\textbf{SUDA}$) that includes the above methods as special cases. We establish the convergence of $\small\textbf{SUDA}$ under both non-convex and the Polyak-Lojasiewicz condition settings. Our results provide improved network topology dependent bounds for these methods (such as Exact-Diffusion/D$^2$ and gradient-tracking) compared with existing literature. Moreover, our results show that these methods are less sensitive to the network topology compared to $\small \text{DSGD}$, which agrees with numerical experiments.

[161]  arXiv:2110.09994 [pdf, other]
Title: DPFM: Deep Partial Functional Maps
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)

We consider the problem of computing dense correspondences between non-rigid shapes with potentially significant partiality. Existing formulations tackle this problem through heavy manifold optimization in the spectral domain, given hand-crafted shape descriptors. In this paper, we propose the first learning method aimed directly at partial non-rigid shape correspondence. Our approach uses the functional map framework, can be trained in a supervised or unsupervised manner, and learns descriptors directly from the data, thus both improving robustness and accuracy in challenging cases. Furthermore, unlike existing techniques, our method is also applicable to partial-to-partial non-rigid matching, in which the common regions on both shapes are unknown a priori. We demonstrate that the resulting method is data-efficient, and achieves state-of-the-art results on several benchmark datasets. Our code and data can be found online: https://github.com/pvnieo/DPFM

[162]  arXiv:2110.09995 [pdf, other]
Title: Design of AoI-Aware 5G Uplink Scheduler Using Reinforcement Learning
Subjects: Networking and Internet Architecture (cs.NI)

Age of Information (AoI) reflects the time that has elapsed from the generation of a packet by a 5G user equipment (UE) to the reception of the packet by a controller. A design of an AoI-aware radio resource scheduler for UEs via reinforcement learning is proposed in this paper. We consider a remote control environment in which a number of UEs are transmitting time-sensitive measurements to a remote controller. We consider the AoI minimization problem and formulate the problem as a trade-off between minimizing the sum of the expected AoI of all UEs and maximizing the throughput of the network. Inspired by the success of machine learning in solving large networking problems at low complexity, we develop a reinforcement learning-based method to solve the formulated problem. We use the state-of-the-art proximal policy optimization algorithm to solve this problem. Our simulation results show that the proposed algorithm outperforms the considered baselines in terms of minimizing the expected AoI while maintaining the network throughput.
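
The AoI notion used in this abstract can be made concrete with a small sketch. It assumes the common convention that the age at time t is t minus the generation time of the freshest packet the controller has received so far; the function name and the sample timestamps are hypothetical, and the scheduler itself is not modelled.

```python
def age_of_information(deliveries, t):
    """AoI at time t for one UE.

    deliveries: list of (generation_time, reception_time) pairs.
    Returns None if the controller has not received any packet by time t.
    """
    received = [gen for gen, rx in deliveries if rx <= t]
    if not received:
        return None
    # The freshest received packet determines the age.
    return t - max(received)


# Three packets: generated at 0.0, 2.0 and 4.0, received at 1.0, 2.5 and 6.0.
deliveries = [(0.0, 1.0), (2.0, 2.5), (4.0, 6.0)]
for t in (0.5, 3.0, 5.0, 6.0):
    print(t, age_of_information(deliveries, t))
```

The age grows linearly between receptions and drops when a fresher packet arrives, so a scheduler that delays a UE's uplink grant lets that UE's age keep growing; this is the quantity the abstract trades off against throughput.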

[163]  arXiv:2110.09996 [pdf, other]
Title: Switched Control Applied to a Totem-Pole Bridgeless Rectifier for Power Factor Correction
Comments: 7 pages, 6 figures, conference
Subjects: Systems and Control (eess.SY)

The wide range of operation of bridgeless rectifiers requires a control technique that guarantees robustness. Linear Power Factor Correction (PFC) control techniques, although effective, cannot guarantee such robustness. Nonlinear techniques such as one cycle control are more robust, but other options should be explored. In this work, an affine model is obtained for a Totem-Pole Bridgeless Rectifier (TPBR). An extension to an existing switched control design technique is presented in order to achieve PFC in a robust fashion for the TPBR. Simulations with nonideal components and distorted grid voltage show a precise, fast and robust performance of the switched controller. The effective reference following of the proposed method allows the user to define a current reference waveform that prioritizes THD or power factor, depending on the application and norm requirements.

[164]  arXiv:2110.09998 [pdf, other]
Title: Watch out for the risky actors: Assessing risk in dynamic environments for safe driving
Comments: preprint version
Subjects: Artificial Intelligence (cs.AI); Robotics (cs.RO)

Driving in a dynamic environment that consists of other actors is inherently a risky task, as each actor influences the driving decision and may significantly limit the number of choices in terms of navigation and safety plan. The risk encountered by the Ego actor depends on the driving scenario and the uncertainty associated with predicting the future trajectories of the other actors in the driving scenario. However, not all objects pose a similar risk. Depending on the object's type, trajectory, position, and the uncertainty associated with these quantities, some objects pose a much higher risk than others. The higher the risk associated with an actor, the more attention must be directed towards that actor in terms of resources and safety planning. In this paper, we propose a novel risk metric to calculate the importance of each actor in the world and demonstrate its usefulness through a case study.

[165]  arXiv:2110.10004 [pdf, other]
Title: Scalable Learning Environments for Teaching Cybersecurity Hands-on
Comments: 9 pages, 6 figures, 1 table
Subjects: Computers and Society (cs.CY)

This Innovative Practice full paper describes a technical innovation for scalable teaching of cybersecurity hands-on classes using interactive learning environments. Hands-on experience significantly improves the practical skills of learners. However, the preparation and delivery of hands-on classes usually do not scale. Teaching even small groups of students requires a substantial effort to prepare the class environment and practical assignments. Further issues are associated with teaching large classes, providing feedback, and analyzing learning gains. We present our research effort and practical experience in designing and using learning environments that scale up hands-on cybersecurity classes. The environments support virtual networks with full-fledged operating systems and devices that emulate real-world systems.
(...)
Using the presented environments KYPO Cyber Range Platform and Cyber Sandbox Creator, we delivered the classes on-site or remotely for various target groups of learners (K-12, university students, and professional learners). The learners value the realistic nature of the environments that enable exercising theoretical concepts and tools. The instructors value time-efficiency when preparing and deploying the hands-on activities. Engineering and computing educators can freely use our software, which we have released under an open-source license. We also provide detailed documentation and exemplary hands-on training to help other educators adopt our teaching innovations and enable sharing of reusable components within the community.

[166]  arXiv:2110.10007 [pdf, other]
Title: Gradient-Based Mixed Planning with Discrete and Continuous Actions
Comments: 36 pages, 20 figures
Subjects: Artificial Intelligence (cs.AI)

Dealing with planning problems with both discrete logical relations and continuous numeric changes in real-world dynamic environments is challenging. Existing numeric planning systems for the problem often discretize numeric variables or impose convex quadratic constraints on numeric variables, which harms the performance when solving the problem. In this paper, we propose a novel algorithm framework to solve the numeric planning problems mixed with discrete and continuous actions based on gradient descent. We cast the numeric planning with discrete and continuous actions as an optimization problem by integrating a heuristic function based on discrete effects. Specifically, we propose a gradient-based framework to simultaneously optimize continuous parameters and actions of candidate plans. The framework is combined with a heuristic module that uses relaxation to estimate the best candidate plan for transitioning from the initial state to the goal. We repeatedly update the numeric parameters and compute candidate plans until the procedure converges to a valid plan for the planning problem. In the empirical study, we show that our algorithm framework is both effective and efficient, especially when solving non-convex planning problems.

[167]  arXiv:2110.10009 [pdf, other]
Title: EEGminer: Discovering Interpretable Features of Brain Activity with Learnable Filters
Comments: 13 pages, 8 figures
Subjects: Machine Learning (cs.LG); Human-Computer Interaction (cs.HC)

Patterns of brain activity are associated with different brain processes and can be used to identify different brain states and make behavioral predictions. However, the relevant features are not readily apparent and accessible. To mine informative latent representations from multichannel EEG recordings, we propose a novel differentiable EEG decoding pipeline consisting of learnable filters and a pre-determined feature extraction module. Specifically, we introduce filters parameterized by generalized Gaussian functions that offer a smooth derivative for stable end-to-end model training and allow for learning interpretable features. For the feature module, we use signal magnitude and functional connectivity. We demonstrate the utility of our model towards emotion recognition from EEG signals on the SEED dataset, as well as on a new EEG dataset of unprecedented size (i.e., 763 subjects), where we identify consistent trends of music perception and related individual differences. The discovered features align with previous neuroscience studies and offer new insights, such as marked differences in the functional connectivity profile between left and right temporal areas during music listening. This agrees with the respective specialisation of the temporal lobes regarding music perception proposed in the literature.

[168]  arXiv:2110.10010 [pdf, other]
Title: Temporal separation of whale vocalizations from background oceanic noise using a power calculation
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)

The process of analyzing audio signals in search of cetacean vocalizations is in many cases a very arduous task, requiring many complex computations, a plethora of digital processing techniques and the scrutinization of an audio signal with a fine comb to determine where the vocalizations are located. To ease this process, a computationally efficient and noise-resistant method for determining whether an audio segment contains a potential cetacean call is developed here with the help of a robust power calculation for stationary Gaussian noise signals and a recursive method for determining the mean and variance of a given sample frame. The resulting detector is tested on audio recordings containing Southern Right whale sounds and its performance compared to that of an existing contemporary energy detector. The detector exhibits good performance at moderate-to-high signal-to-noise ratio values. The detector succeeds in being easy to implement, computationally efficient to use and robust enough to accurately detect whale vocalizations in a noisy underwater environment.
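
The abstract does not spell out its recursive method for the mean and variance of a sample frame. As a minimal sketch of what such a recursion can look like, the snippet below uses Welford's online algorithm, a standard single-pass update that is swapped in here for illustration and is not claimed to be the authors' method. The class name `RunningStats` is hypothetical.

```python
class RunningStats:
    """Single-pass mean and variance via Welford's online algorithm.

    Illustrative stand-in: the recursion used in the paper is not
    specified in the abstract.
    """

    def __init__(self) -> None:
        self.n = 0        # number of samples seen so far
        self.mean = 0.0   # running mean
        self._m2 = 0.0    # running sum of squared deviations from the mean

    def update(self, x: float) -> None:
        """Fold one new sample into the running statistics."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self) -> float:
        """Population variance of the samples seen so far (0.0 if none)."""
        return self._m2 / self.n if self.n else 0.0


stats = RunningStats()
for sample in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    stats.update(sample)
```

Each update costs a constant number of operations and needs no buffer of past samples, which fits the stated goal of a computationally efficient detector.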

[169]  arXiv:2110.10011 [pdf, other]
Title: Riemannian classification of EEG signals with missing values
Subjects: Human-Computer Interaction (cs.HC); Signal Processing (eess.SP); Machine Learning (stat.ML)

This paper proposes two strategies to handle missing data for the classification of electroencephalograms using covariance matrices. The first approach estimates the covariance from imputed data with the $k$-nearest neighbors algorithm; the second relies on the observed data by leveraging the observed-data likelihood within an expectation-maximization algorithm. Both approaches are combined with the minimum distance to Riemannian mean classifier and applied to a classification task of event-related potentials, a widely known brain-computer interface paradigm. As the results show, the proposed strategies perform better than the classification based on observed data and maintain a high accuracy even when the missing data ratio increases.

[170]  arXiv:2110.10015 [pdf, other]
Title: DEEPAGÉ: Answering Questions in Portuguese about the Brazilian Environment
Comments: Accepted at BRACIS 2021
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG)

The challenge of climate change and biome conservation is one of the most pressing issues of our time - particularly in Brazil, where key environmental reserves are located. Given the availability of large textual databases on ecological themes, it is natural to resort to question answering (QA) systems to increase social awareness and understanding about these topics. In this work, we introduce multiple QA systems that combine in novel ways the BM25 algorithm, a sparse retrieval technique, with PTT5, a pre-trained state-of-the-art language model. Our QA systems focus on the Portuguese language, thus offering resources not found elsewhere in the literature. As training data, we collected questions from open-domain datasets, as well as content from the Portuguese Wikipedia and news from the press. We thus contribute with innovative architectures and novel applications, attaining an F1-score of 36.2 with our best model.
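
For readers unfamiliar with the retrieval half of these systems, the following is a minimal sketch of BM25 scoring over tokenized documents. It uses the non-negative IDF variant and the commonly used defaults `k1=1.5` and `b=0.75`; these choices, the function name, and the toy corpus are assumptions for illustration and are not taken from the paper.

```python
import math
from collections import Counter
from typing import List


def bm25_scores(query: List[str], docs: List[List[str]],
                k1: float = 1.5, b: float = 0.75) -> List[float]:
    """Score each tokenized document in a non-empty corpus against the query."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n                # average document length
    df = Counter(term for d in docs for term in set(d))  # document frequencies
    scores = []
    for d in docs:
        tf = Counter(d)                                  # term frequencies in this document
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores


# Toy corpus (hypothetical tokens); only the first document mentions "rainforest".
docs = [["amazon", "rainforest", "biome"],
        ["news", "press", "economy"],
        ["amazon", "river"]]
scores = bm25_scores(["rainforest"], docs)
best = max(range(len(docs)), key=scores.__getitem__)
```

In a retrieve-then-read pipeline, the top-scoring passages would be handed to the language model as context for answer generation.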

[171]  arXiv:2110.10017 [pdf, other]
Title: Neural Network Compatible Off-Policy Natural Actor-Critic Algorithm
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)

Learning optimal behavior from existing data is one of the most important problems in Reinforcement Learning (RL). This is known as "off-policy control" in RL where an agent's objective is to compute an optimal policy based on the data obtained from the given policy (known as the behavior policy). As the optimal policy can be very different from the behavior policy, learning optimal behavior is very hard in the "off-policy" setting compared to the "on-policy" setting where new data from the policy updates will be utilized in learning. This work proposes an off-policy natural actor-critic algorithm that utilizes state-action distribution correction for handling the off-policy behavior and the natural policy gradient for sample efficiency. The existing natural gradient-based actor-critic algorithms with convergence guarantees require fixed features for approximating both policy and value functions. This often leads to sub-optimal learning in many RL applications. On the other hand, our proposed algorithm utilizes compatible features that enable one to use arbitrary neural networks to approximate the policy and the value function and guarantee convergence to a locally optimal policy. We illustrate the benefit of the proposed off-policy natural gradient algorithm by comparing it with the vanilla gradient actor-critic algorithm on benchmark RL tasks.

[172]  arXiv:2110.10018 [pdf, ps, other]
Title: Dynamic pricing and assortment under a contextual MNL demand
Subjects: Machine Learning (cs.LG)

We consider dynamic multi-product pricing and assortment problems under an unknown demand over T periods, where in each period, the seller decides on the price for each product or the assortment of products to offer to a customer who chooses according to an unknown Multinomial Logit Model (MNL). Such problems arise in many applications, including online retail and advertising. We propose a randomized dynamic pricing policy based on a variant of the Online Newton Step algorithm (ONS) that achieves an $O(d\sqrt{T}\log(T))$ regret guarantee under an adversarial arrival model. We also present a new optimistic algorithm for the adversarial MNL contextual bandits problem, which achieves a better dependency than the state-of-the-art algorithms in a problem-dependent constant $\kappa$ (potentially exponentially small). Our regret upper bounds scale as $\tilde{O}(d\sqrt{\kappa T}+ \log(T)/\kappa)$, which gives a significantly stronger bound than the existing $\tilde{O}(d\sqrt{T}/\kappa)$ guarantees.
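
A short calculation makes the improvement concrete. The comparison below assumes $0 < \kappa \le 1$, consistent with the abstract's description of $\kappa$ as potentially exponentially small, and ignores the polylogarithmic factors hidden by $\tilde{O}$:

```latex
\[
\frac{d\sqrt{\kappa T}}{d\sqrt{T}/\kappa} = \kappa^{3/2} \le 1
\qquad \text{for } 0 < \kappa \le 1,
\]
\[
\frac{\log(T)/\kappa}{d\sqrt{T}/\kappa} = \frac{\log T}{d\sqrt{T}} \longrightarrow 0
\qquad \text{as } T \to \infty .
\]
```

Under these assumptions, the leading term improves by a factor of $\kappa^{3/2}$, while the remaining $1/\kappa$ dependence is confined to a term that grows only logarithmically in $T$.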

[173]  arXiv:2110.10022 [pdf, other]
Title: Robust Control of a Multi-Axis Shape Memory Alloy-Driven Soft Manipulator
Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
Subjects: Robotics (cs.RO)

Control of soft robotic manipulators remains a challenge for designs with advanced capabilities and novel actuation. Two significant limitations are multi-axis, three-dimensional motion of soft bodies alongside actuator dynamics and constraints, both of which are present in shape-memory-alloy (SMA)-powered soft robots. This article addresses both concerns with a robust feedback control scheme, demonstrating state tracking control for a soft robot manipulator of this type. Our controller uses a static beam bending model to approximate the soft limb as an LTI system, alongside a singular-value-decomposition compensator approach to decouple the multi-axial motion and an anti-windup element for the actuator saturation. We prove stability and verify robustness of our controller, with robustness intended to account for the unmodeled dynamics. Our implementation is verified in hardware tests of a soft SMA-powered limb, showing low tracking error, with promising results for future multi-limbed robots.

[174]  arXiv:2110.10024 [pdf, other]
Title: Risks of AI Foundation Models in Education
Subjects: Computers and Society (cs.CY); Artificial Intelligence (cs.AI)

If the authors of a recent Stanford report (Bommasani et al., 2021) on the opportunities and risks of "foundation models" are to be believed, these models represent a paradigm shift for AI and for the domains in which they will supposedly be used, including education. Although the name is new (and contested (Field, 2021)), the term describes existing types of algorithmic models that are "trained on broad data at scale" and "fine-tuned" (i.e., adapted) for particular downstream tasks, and is intended to encompass large language models such as BERT or GPT-3 and computer vision models such as CLIP. Such technologies have the potential for harm broadly speaking (e.g., Bender et al., 2021), but their use in the educational domain is particularly fraught, despite the potential benefits for learners claimed by the authors. In section 3.3 of the Stanford report, Malik et al. argue that achieving the goal of providing education for all learners requires more efficient computational approaches that can rapidly scale across educational domains and across educational contexts, for which they argue foundation models are uniquely well-suited. However, evidence suggests that not only are foundation models not likely to achieve the stated benefits for learners, but their use may also introduce new risks for harm.

[175]  arXiv:2110.10030 [pdf, other]
Title: Accelerating Framework of Transformer by Hardware Design and Model Compression Co-Optimization
Subjects: Machine Learning (cs.LG)

State-of-the-art Transformer-based models, with their very large parameter counts, are difficult to accommodate on resource-constrained embedded devices. Moreover, with the development of technology, more and more embedded devices are available to run a Transformer model. A Transformer model with different constraints (tight or loose) can be deployed onto devices with different computing power. However, in previous work, designers did not choose the best device among multiple devices. Instead, they just used an existing device to deploy the model, which was not necessarily the best fit and may lead to underutilization of resources. To address the deployment challenge of Transformers and the problem of selecting the best device, we propose an algorithm & hardware closed-loop acceleration framework. Given a dataset, a model, a latency constraint LC and an accuracy constraint AC, our framework can provide the best device satisfying both constraints. In order to generate a compressed model with a high sparsity ratio, we propose a novel pruning technique, hierarchical pruning (HP). We optimize the sparse matrix storage format for the HP matrix to further reduce memory usage for FPGA implementation. We design an accelerator that takes advantage of HP to solve the problem of concurrent random access. Experiments on Transformer and TinyBert models show that our framework can find different devices for various LC and AC, ranging from low-end to high-end devices. Our HP can achieve a higher sparsity ratio and is more flexible than other sparsity patterns. Our framework can achieve 37x, 1.9x, 1.7x speedup compared to CPU, GPU and FPGA, respectively.

[176]  arXiv:2110.10031 [pdf, other]
Title: Online Continual Learning on Class Incremental Blurry Task Configuration with Anytime Inference
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)

Despite rapid advances in continual learning, a large body of research is devoted to improving performance in the existing setups. While a handful of works do propose new continual learning setups, they still lack practicality in certain aspects. For better practicality, we first propose a novel continual learning setup that is online, task-free, class-incremental, has blurry task boundaries and is subject to inference queries at any moment. We additionally propose a new metric to better measure the performance of continual learning methods subject to inference queries at any moment. To address the challenging setup and evaluation protocol, we propose an effective method that employs a new memory management scheme and novel learning techniques. Our empirical validation demonstrates that the proposed method outperforms prior art by large margins.

[177]  arXiv:2110.10034 [pdf, ps, other]
Title: Formal Power Series Approach to Nonlinear Systems with Additive Static Feedback
Comments: Submitted to International Journal of Control
Subjects: Systems and Control (eess.SY)

The goal of this paper is to compute the generating series of a closed-loop system when the plant is described in terms of a Chen-Fliess series and an additive static output feedback is applied. The first step is to consider the so-called Wiener-Fliess connection consisting of a Chen-Fliess series followed by a memoryless function. Of particular importance will be the contractive nature of this map, which is needed to show that the closed-loop system has a Chen-Fliess series representation. To explicitly compute the generating series, two Hopf algebras are needed: the existing output feedback Hopf algebra used to describe dynamic output feedback, and the Hopf algebra of the shuffle group. These two combinatorial structures are combined to compute what will be called the Wiener-Fliess feedback product. It will be shown that this product has a natural interpretation as a transformation group acting on the plant and preserves the relative degree of the plant. The convergence of the Wiener-Fliess composition product and the additive static feedback product are completely characterized.

[178]  arXiv:2110.10035 [pdf]
Title: A Soft-Rigid Hybrid Gripper with Lateral Compliance and Dexterous In-hand Manipulation
Subjects: Robotics (cs.RO)

Soft grippers are receiving growing attention due to their compliance-based interactive safety and dexterity. Hybrid grippers (soft actuators enhanced by rigid constraints) are a new trend in soft gripper design. With the right structural components actuated by soft actuators, they could achieve excellent grasping adaptability and payload, while also being easy to model and control with conventional kinematics. However, existing works were mostly focused on achieving superior payload and perception with simple planar workspaces, resulting in far less dexterity compared with conventional grippers. In this work, we took inspiration from the human Metacarpophalangeal (MCP) joint and proposed a new hybrid gripper design with 8 independent muscles. It was shown that adding the MCP complexity was critical in enabling a range of novel features in the hybrid gripper, including in-hand manipulation, lateral passive compliance, as well as new control modes. A prototype gripper was fabricated and tested on our proprietary dual-arm robot platform with vision-guided grasping. With very lightweight pneumatic bellows soft actuators, the gripper could grasp objects over 25 times its own weight with lateral compliance. Using the dual-arm platform, highly anthropomorphic dexterous manipulations were demonstrated using two hybrid grippers, from tug-of-war on a rigid rod to passing a soft towel between two grippers using in-hand manipulation. Matching the novel features and performance specifications of the proposed hybrid gripper, the underlying modeling, actuation, control, and experimental validation details were also presented, offering a promising approach to achieving enhanced dexterity, strength, and compliance in robotic grippers.

[179]  arXiv:2110.10037 [pdf, ps, other]
Title: Java Card Virtual Machine Memory Organization: a Design Proposal
Subjects: Cryptography and Security (cs.CR)

The Java Card Virtual Machine (JCVM) platform is widely deployed on security-oriented components. JCVM implementations are mainly evaluated under security schemes. However, existing implementations are closed-source, with no details available. We believe that studying how to design a JCVM will improve them, and that the results can be reused by the community to improve Java Card security.
In 2018, Bouffard et al. [6] introduced an Operating System (OS) which aims at running JCVM-compatible implementations. This OS is compatible with several Commercially available Off-The-Shelf (COTS) components. This is a first step towards designing a secure JCVM platform.
However, some important details are missing to design a security-oriented Java Card platform. In this article, we focus on the JCVM memory. This memory contains everything required to run the JCVM and applets. Currently, JCVM memory is outside the scope of the Java Card specification, and each JCVM developer uses their own approach. Based on the existing tools and documentation, we explain how to extract from the Java Card toolchain all the data required by applets and the JCVM. Once the data to store in memory are identified, this article describes how to organize them in JCVM memory.

[180]  arXiv:2110.10038 [pdf, other]
Title: Coalitional Bayesian Autoencoders -- Towards explainable unsupervised deep learning
Comments: Preprint submitted to Journal of Applied Soft Computing
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)

This paper aims to improve the explainability of Autoencoder (AE) predictions by proposing two explanation methods based on the mean and epistemic uncertainty of the log-likelihood estimate, which naturally arise from the probabilistic formulation of the AE called Bayesian Autoencoders (BAE). To quantitatively evaluate the performance of explanation methods, we test them in sensor network applications, and propose three metrics based on covariate shift of sensors: (1) G-mean of Spearman drift coefficients, (2) G-mean of sensitivity-specificity of explanation ranking and (3) sensor explanation quality index (SEQI), which combines the two aforementioned metrics. Surprisingly, we find that explanations of BAE's predictions suffer from high correlation resulting in misleading explanations. To alleviate this, a "Coalitional BAE" is proposed, which is inspired by agent-based system theory. Our comprehensive experiments on publicly available condition monitoring datasets demonstrate the improved quality of explanations using the Coalitional BAE.

[181]  arXiv:2110.10041 [pdf, other]
Title: Learning-based Fast Path Planning in Complex Environments
Comments: Accepted by ROBIO2021
Subjects: Robotics (cs.RO)

In this paper, we present a novel path planning algorithm to achieve fast path planning in complex environments. Most existing path planning algorithms struggle to find a feasible path quickly in complex environments, or even fail. Our proposed framework overcomes this difficulty by using a learning-based prediction module and a sampling-based path planning module. The prediction module utilizes an auto-encoder-decoder-like convolutional neural network (CNN) to output a promising region in which the feasible path probably lies. In this process, the environment is treated as an RGB image that is fed into our designed CNN module, and the output is also an RGB image. No extra computation is required, so we can maintain a high processing speed of 60 frames-per-second (FPS). Using a sampling-based path planner, we can extract a feasible path from the output image so that the robot can track it from start to goal. To demonstrate the advantage of the proposed algorithm, we compare it with conventional path planning algorithms in a series of simulation experiments. The results reveal that the proposed algorithm can achieve much better performance in terms of planning time, success rate, and path length.

[182]  arXiv:2110.10048 [pdf, other]
Title: Improving Tail-Class Representation with Centroid Contrastive Learning
Subjects: Computer Vision and Pattern Recognition (cs.CV)

In the vision domain, large-scale natural datasets typically exhibit a long-tailed distribution with a large class imbalance between head and tail classes. This distribution poses difficulty in learning good representations for tail classes. Recent developments have shown that a good long-tailed model can be learnt by decoupling the training into representation learning and classifier balancing. However, these works pay insufficient attention to the long-tailed effect on representation learning. In this work, we propose interpolative centroid contrastive learning (ICCL) to improve long-tailed representation learning. ICCL interpolates two images from a class-agnostic sampler and a class-aware sampler, and trains the model such that the representation of the interpolative image can be used to retrieve the centroids for both source classes. We demonstrate the effectiveness of our approach on multiple long-tailed image classification benchmarks. Our result shows a significant accuracy gain of 2.8% on the iNaturalist 2018 dataset with a real-world long-tailed distribution.

[183]  arXiv:2110.10049 [pdf, other]
Title: Boosting Graph Embedding on a Single GPU
Comments: 12 pages, 11 tables, 6 figures, submitted for publication at Special Section on Parallel and Distributed Computing Techniques for AI, ML, and DL
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Machine Learning (cs.LG)

Graphs are ubiquitous, and they can model unique characteristics and complex relations of real-life systems. Although using machine learning (ML) on graphs is promising, their raw representation is not suitable for ML algorithms. Graph embedding represents each node of a graph as a d-dimensional vector which is more suitable for ML tasks. However, the embedding process is expensive, and CPU-based tools do not scale to real-world graphs. In this work, we present GOSH, a GPU-based tool for embedding large-scale graphs with minimum hardware constraints. GOSH employs a novel graph coarsening algorithm to enhance the impact of updates and minimize the work for embedding. It also incorporates a decomposition schema that enables any arbitrarily large graph to be embedded with a single GPU. As a result, GOSH sets a new state-of-the-art in link prediction both in accuracy and speed, and delivers high-quality embeddings for node classification at a fraction of the time compared to the state-of-the-art. For instance, it can embed a graph with over 65 million vertices and 1.8 billion edges in less than 30 minutes on a single GPU.

[184]  arXiv:2110.10052 [pdf, other]
Title: Optimal Grid-Forming Control of Battery Energy Storage Systems Providing Multiple Services: Modelling and Experimental Validation
Comments: 10 pages, 5 figures
Subjects: Systems and Control (eess.SY)

This paper proposes and experimentally validates a joint control and scheduling framework for a grid-forming converter-interfaced BESS providing multiple services to the electrical grid. The framework is designed to dispatch the operation of a distribution feeder hosting heterogeneous prosumers according to a dispatch plan and provide frequency containment reserve and voltage control as additional services. The framework consists of three phases. In the day-ahead scheduling phase, a robust optimization problem is solved to compute the optimal dispatch plan and frequency droop coefficient, accounting for the uncertainty of the aggregated prosumption. In the intra-day phase, a model predictive control algorithm is used to compute the power set-point for the BESS to achieve the tracking of the dispatch plan. Finally, in a real-time stage, the power setpoint originated by the dispatch tracking is converted into a feasible frequency set-point for the grid forming converter by means of a convex optimisation problem accounting for the capability curve of the power converter. The proposed framework is experimentally validated by using a grid-scale 720 kVA/560 kWh BESS connected to a 20 kV distribution feeder of the EPFL hosting stochastic prosumption and PV generation.

[185]  arXiv:2110.10053 [pdf, other]
Title: Energy Management System for Resilience-Oriented Operation of Ship Power Systems
Subjects: Systems and Control (eess.SY)

This paper proposes an original energy management methodology for enhancing the resilience of ship power systems considering multiple types of energy storage systems, including battery energy storage systems (BESS) and supercapacitor energy storage systems (SCESS). The primary function of the proposed EMS is to maximize the load operability while taking ramp-rate characteristics of energy storage systems (ESS) and generators into account innovatively. Balancing state-of-charge (SoC) of BESS and prioritizing the SoC level of SCESS are two additional objectives of the proposed EMS to manage energy storage systems. The receding horizon optimization (RHO) technique is proposed to reduce the computational burden, making the proposed method feasible for real-time applications. An all-electric MVDC ship power system is used to evaluate the performance of the proposed methodology. Simulation studies and results demonstrate the effectiveness of the proposed method in managing the ESS to ensure the system resilience under generation power shortage. In addition, the proposed RHO technique significantly reduces the computation burden seen in the FHO technique while maintaining an acceptable resilience performance.

[186]  arXiv:2110.10054 [pdf, other]
Title: Generating Symbolic Reasoning Problems with Transformer GANs
Subjects: Machine Learning (cs.LG)

Constructing training data for symbolic reasoning domains is challenging: Existing instances are typically hand-crafted and too few to be trained on directly and synthetically generated instances are often hard to evaluate in terms of their meaningfulness. We study the capabilities of GANs and Wasserstein GANs equipped with Transformer encoders to generate sensible and challenging training data for symbolic reasoning domains. We conduct experiments on two problem domains where Transformers have been successfully applied recently: symbolic mathematics and temporal specifications in verification. Even without autoregression, our GAN models produce syntactically correct instances. We show that the generated data can be used as a substitute for real training data when training a classifier, and, especially, that training data can be generated from a real dataset that is too small to be trained on directly. Using a GAN setting also allows us to alter the target distribution: We show that by adding a classifier uncertainty part to the generator objective, we obtain a dataset that is even harder to solve for a classifier than our original dataset.

[187]  arXiv:2110.10060 [pdf, ps, other]
Title: Hermite multiwavelets for manifold-valued data
Comments: 19 pages
Subjects: Numerical Analysis (math.NA)

In this paper we present a construction of interpolatory Hermite multiwavelets for functions that take values in nonlinear geometries such as Riemannian manifolds or Lie groups. We rely on the strong connection between wavelets and subdivision schemes to define a prediction-correction approach based on Hermite subdivision schemes that operate on manifold-valued data. The main result concerns the decay of the wavelet coefficients: We show that our manifold-valued construction essentially admits the same coefficient decay as linear Hermite wavelets, which also generalizes results on manifold-valued scalar wavelets.

[188]  arXiv:2110.10064 [pdf, ps, other]
Title: Idiomatic Expression Identification using Semantic Compatibility
Comments: Accepted at Transactions of the Association for Computational Linguistics (TACL)
Subjects: Computation and Language (cs.CL)

Idiomatic expressions are an integral part of natural language and are constantly being added to a language. Owing to their non-compositionality and their ability to take on a figurative or literal meaning depending on the sentential context, they have been a classical challenge for NLP systems. To address this challenge, we study the task of detecting whether a sentence has an idiomatic expression and localizing it. Prior art for this task has studied specific classes of idiomatic expressions, offering limited views of their generalizability to new idioms. We propose a multi-stage neural architecture with the attention flow mechanism for identifying these expressions. The network effectively fuses contextual and lexical information at different levels using word and sub-word representations. Empirical evaluations on three of the largest benchmark datasets with idiomatic expressions of varied syntactic patterns and degrees of non-compositionality show that our proposed model achieves new state-of-the-art results. A salient feature of the model is its ability to identify idioms unseen during training with gains from 1.4% to 30.8% over competitive baselines on the largest dataset.

[189]  arXiv:2110.10067 [pdf, other]
Title: CORA: Benchmarks, Baselines, and Metrics as a Platform for Continual Reinforcement Learning Agents
Comments: Repository available at this https URL
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)

Progress in continual reinforcement learning has been limited due to several barriers to entry: missing code, high compute requirements, and a lack of suitable benchmarks. In this work, we present CORA, a platform for Continual Reinforcement Learning Agents that provides benchmarks, baselines, and metrics in a single code package. The benchmarks we provide are designed to evaluate different aspects of the continual RL challenge, such as catastrophic forgetting, plasticity, ability to generalize, and sample-efficient learning. Three of the benchmarks utilize video game environments (Atari, Procgen, NetHack). The fourth benchmark, CHORES, consists of four different task sequences in a visually realistic home simulator, drawn from a diverse set of task and scene parameters. To compare continual RL methods on these benchmarks, we prepare three metrics in CORA: continual evaluation, forgetting, and zero-shot forward transfer. Finally, CORA includes a set of performant, open-source baselines of existing algorithms for researchers to use and expand on. We release CORA and hope that the continual RL community can benefit from our contributions, to accelerate the development of new continual RL algorithms.

[190]  arXiv:2110.10075 [pdf, other]
Title: Improving the Accuracy-Memory Trade-Off of Random Forests Via Leaf-Refinement
Comments: 2 algorithms, 6 tables, 4 plots and a very long appendix
Subjects: Machine Learning (cs.LG); Hardware Architecture (cs.AR)

Random Forests (RF) are among the state-of-the-art in many machine learning applications. With the ongoing integration of ML models into everyday life, the deployment and continuous application of models is becoming an increasingly important issue. Hence, small models which offer good predictive performance but use small amounts of memory are required. Ensemble pruning is a standard technique to remove unnecessary classifiers from an ensemble to reduce the overall resource consumption and sometimes even improve the performance of the original ensemble. In this paper, we revisit ensemble pruning in the context of 'modernly' trained Random Forests where trees are very large. We show that the improvement effects of pruning diminish for ensembles of large trees but that pruning has an overall better accuracy-memory trade-off than RF. However, pruning does not offer fine-grained control over this trade-off because it removes entire trees from the ensemble. To further improve the accuracy-memory trade-off we present a simple, yet surprisingly effective algorithm that refines the predictions in the leaf nodes in the forest via stochastic gradient descent. We evaluate our method against 7 state-of-the-art pruning methods and show that our method outperforms the other methods on 11 of 16 datasets with a statistically significant better accuracy-memory trade-off compared to most methods. We conclude our experimental evaluation with a case study showing that our method can be applied in a real-world setting.

[191]  arXiv:2110.10081 [pdf, other]
Title: Stateful Offline Contextual Policy Evaluation and Learning
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)

We study off-policy evaluation and learning from sequential data in a structured class of Markov decision processes that arise from repeated interactions with an exogenous sequence of arrivals with contexts, which generate unknown individual-level responses to agent actions. This model can be thought of as an offline generalization of contextual bandits with resource constraints. We formalize the relevant causal structure of problems such as dynamic personalized pricing and other operations management problems in the presence of potentially high-dimensional user types. The key insight is that an individual-level response is often not causally affected by the state variable and can therefore easily be generalized across timesteps and states. When this is true, we study implications for (doubly robust) off-policy evaluation and learning by instead leveraging single time-step evaluation, estimating the expectation over a single arrival via data from a population, for fitted-value iteration in a marginal MDP. We study sample complexity and analyze error amplification that leads to the persistence, rather than attenuation, of confounding error over time. In simulations of dynamic and capacitated pricing, we show improved out-of-sample policy performance in this class of relevant problems.

[192]  arXiv:2110.10083 [pdf, other]
Title: Contrastive Active Inference
Comments: Accepted as a conference paper at 35th Conference on Neural Information Processing Systems (NeurIPS 2021)
Subjects: Machine Learning (cs.LG)

Active inference is a unifying theory for perception and action resting upon the idea that the brain maintains an internal model of the world by minimizing free energy. From a behavioral perspective, active inference agents can be seen as self-evidencing beings that act to fulfill their optimistic predictions, namely preferred outcomes or goals. In contrast, reinforcement learning requires human-designed rewards to accomplish any desired outcome. Although active inference could provide a more natural self-supervised objective for control, its applicability has been limited because of the shortcomings in scaling the approach to complex environments. In this work, we propose a contrastive objective for active inference that strongly reduces the computational burden in learning the agent's generative model and planning future actions. Our method performs notably better than likelihood-based active inference in image-based tasks, while also being computationally cheaper and easier to train. We compare to reinforcement learning agents that have access to human-designed reward functions, showing that our approach closely matches their performance. Finally, we also show that contrastive methods perform significantly better in the case of distractors in the environment and that our method is able to generalize goals to variations in the background.

[193]  arXiv:2110.10086 [pdf, ps, other]
Title: Three Attacks on Proof-of-Stake Ethereum
Subjects: Cryptography and Security (cs.CR)

Recently, two attacks were presented against Proof-of-Stake (PoS) Ethereum: one where short-range reorganizations of the underlying consensus chain are used to increase individual validators' profits and delay consensus decisions, and one where adversarial network delay is leveraged to stall consensus decisions indefinitely. We provide refined variants of these attacks, considerably relaxing the requirements on adversarial stake and network timing, and thus rendering the attacks more severe. Combining techniques from both refined attacks, we obtain a third attack which allows an adversary with a vanishingly small fraction of stake and no control over network message propagation (assuming instead probabilistic message propagation) to cause even long-range consensus chain reorganizations. Honest-but-rational or ideologically motivated validators could use this attack to increase their profits or stall the protocol, threatening incentive alignment and security of PoS Ethereum. The attack can also lead to destabilization of consensus from congestion in vote processing.

[194]  arXiv:2110.10090 [pdf, other]
Title: Inductive Biases and Variable Creation in Self-Attention Mechanisms
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)

Self-attention, an architectural motif designed to model long-range interactions in sequential data, has driven numerous recent breakthroughs in natural language processing and beyond. This work provides a theoretical analysis of the inductive biases of self-attention modules, where our focus is to rigorously establish which functions and long-range dependencies self-attention blocks prefer to represent. Our main result shows that bounded-norm Transformer layers create sparse variables: they can represent sparse functions of the input sequence, with sample complexity scaling only logarithmically with the context length. Furthermore, we propose new experimental protocols to support this analysis and to guide the practice of training Transformers, built around the large body of work on provably learning sparse Boolean functions.

[195]  arXiv:2110.10091 [pdf, ps, other]
Title: Approximating Local Graph Structure in Almost Random Order Streams
Subjects: Data Structures and Algorithms (cs.DS)

The random order streaming model has been very fruitful for graph streams, allowing for polylogarithmic or even constant space estimators for fundamental graph problems such as matching size estimation, counting the number of connected components and more. However, the assumption that there are no correlations between the order of arrival of edges in the stream is quite strong. In this paper we introduce (hidden) batch random order streams, where edges are grouped in "batches" (which are unknown to the algorithm) that arrive in a random order, as a natural framework for modelling hidden correlations in the arrival order of edges, and present algorithms and lower bounds for this model.
On the algorithmic side, we show how known techniques for connected component counting in constant space due to Peng and Sohler [SODA `18] easily translate from random order streams to our model with only a small loss in parameters. Our algorithm obtains an additive $\varepsilon n$ approximation to the number of connected components in the input graph using space $(1/\varepsilon)^{O(1/\varepsilon)}$ by building a representative sample of vertices in the graph that belong to $O(1/\varepsilon)$-size components to estimate the count. On the lower bound side, we show that $(1/\varepsilon)^{\Omega(1/\varepsilon)}$ space is necessary for finding a connected component of size $O(1/\varepsilon)$ even in graphs where most vertices reside in such components -- this makes progress towards an open problem of Peng and Sohler [SODA `18] and constitutes our main technical contribution. The lower bound uses Fourier analytic techniques inspired by the Boolean Hidden Matching problem. Our main innovation here is the first framework for applying such a Fourier analytic approach to a communication game with a polynomial number of players.
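
The sampling idea behind an additive $\varepsilon n$ estimate of the component count can be illustrated offline. The sketch below is not the streaming algorithm of this entry: it is the classical sample-and-explore estimator applied to a graph held fully in memory as adjacency lists, and the names `truncated_component_size`, `estimate_components`, and the cap of $2/\varepsilon$ vertices are assumptions made for illustration. Each sampled vertex contributes $1/|C(v)|$ if its component is small and 0 otherwise, so components larger than the cap are ignored at a cost of at most $\varepsilon n / 2$ in the count.

```python
import random
from collections import deque


def truncated_component_size(adj, start, cap):
    """BFS from start; return the component size, or None if it exceeds cap."""
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                if len(seen) > cap:
                    return None  # component is too large to matter
                queue.append(w)
    return len(seen)


def estimate_components(adj, eps, samples, rng):
    """Estimate the number of connected components from sampled vertices."""
    n = len(adj)
    cap = max(1, int(2 / eps))  # hypothetical truncation threshold
    total = 0.0
    for _ in range(samples):
        v = rng.randrange(n)
        size = truncated_component_size(adj, v, cap)
        if size is not None:
            total += 1.0 / size  # each vertex "owns" 1/|C(v)| of its component
    return n * total / samples
```

Summing $1/|C(v)|$ over all vertices gives exactly the number of components, which is why the sample mean scaled by $n$ is a reasonable estimate.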

[196]  arXiv:2110.10097 [pdf, other]
Title: Data-Driven Predictive Control for Connected and Autonomous Vehicles in Mixed Traffic
Comments: 8 figures, 3 figures
Subjects: Systems and Control (eess.SY)

Cooperative control of Connected and Autonomous Vehicles (CAVs) promises great benefits for mixed traffic. Most existing research focuses on model-based control strategies, assuming that car-following dynamics of human-driven vehicles (HDVs) are explicitly known. In this paper, instead of relying on a parametric car-following model, we introduce a data-driven predictive control strategy to achieve safe and optimal control for CAVs in mixed traffic. We first present a linearized dynamical model for mixed traffic systems, and investigate its controllability and observability. Based on these control-theoretic properties, we then propose a novel DeeP-LCC (Data-EnablEd Predictive Leading Cruise Control) strategy for CAVs based on measurable driving data to smooth mixed traffic. Our method is implemented in a receding horizon manner, in which input/output constraints are incorporated to achieve collision-free guarantees. Nonlinear traffic simulations show that DeeP-LCC can save up to 24.96% fuel consumption during a braking scenario of Extra-Urban Driving Cycle while ensuring safety.

[197]  arXiv:2110.10099 [pdf, other]
Title: Matrix Discrepancy from Quantum Communication
Subjects: Data Structures and Algorithms (cs.DS)

We develop a novel connection between discrepancy minimization and (quantum) communication complexity. As an application, we resolve a substantial special case of the Matrix Spencer conjecture. In particular, we show that for every collection of symmetric $n \times n$ matrices $A_1,\ldots,A_n$ with $\|A_i\| \leq 1$ and $\|A_i\|_F \leq n^{1/4}$ there exist signs $x \in \{ \pm 1\}^n$ such that the maximum eigenvalue of $\sum_{i \leq n} x_i A_i$ is at most $O(\sqrt n)$. We give a polynomial-time algorithm based on partial coloring and semidefinite programming to find such $x$.
Our techniques open a new avenue to use tools from communication complexity and information theory to study discrepancy. The proof of our main result combines a simple compression scheme for transcripts of repeated (quantum) communication protocols with quantum state purification, the Holevo bound from quantum information, and tools from sketching and dimensionality reduction. Our approach also offers a promising avenue to resolve the Matrix Spencer conjecture completely -- we show it is implied by a natural conjecture in quantum communication complexity.
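
To make the quantity in the main result concrete, the sketch below brute-forces the sign vector for a tiny instance and checks the two norm hypotheses. It is only an illustration of the statement, assuming NumPy is available; it is exponential in the number of matrices and is not the partial-coloring and semidefinite-programming algorithm of the entry. The function names `satisfies_norm_bounds` and `min_max_eigenvalue` are hypothetical.

```python
import itertools

import numpy as np


def satisfies_norm_bounds(mats):
    """Check ||A_i|| <= 1 (spectral) and ||A_i||_F <= n^(1/4) for n x n inputs."""
    n = mats[0].shape[0]
    tol = 1e-12
    return all(
        np.linalg.norm(a, 2) <= 1 + tol
        and np.linalg.norm(a, "fro") <= n ** 0.25 + tol
        for a in mats
    )


def min_max_eigenvalue(mats):
    """Brute-force the signs x minimizing the top eigenvalue of sum_i x_i A_i."""
    best_value, best_signs = float("inf"), None
    for signs in itertools.product((-1, 1), repeat=len(mats)):
        total = sum(x * a for x, a in zip(signs, mats))
        top = float(np.linalg.eigvalsh(total)[-1])  # eigenvalues are ascending
        if top < best_value:
            best_value, best_signs = top, signs
    return best_value, best_signs
```

For example, two copies of the same symmetric matrix cancel under opposite signs, so the best achievable maximum eigenvalue is 0.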

[198]  arXiv:2110.10101 [pdf, other]
Title: Domain Generalization through Audio-Visual Relative Norm Alignment in First Person Action Recognition
Comments: Accepted at WACV 2022
Subjects: Computer Vision and Pattern Recognition (cs.CV)

First person action recognition is becoming an increasingly researched area thanks to the rising popularity of wearable cameras. This is bringing to light cross-domain issues that are yet to be addressed in this context. Indeed, the information extracted from learned representations suffers from an intrinsic "environmental bias". This strongly affects the ability to generalize to unseen scenarios, limiting the application of current methods to real settings where labeled data are not available during training. In this work, we introduce the first domain generalization approach for egocentric activity recognition, by proposing a new audio-visual loss, called Relative Norm Alignment loss. It re-balances the contributions from the two modalities during training, over different domains, by aligning their feature norm representations. Our approach leads to strong results in domain generalization on both EPIC-Kitchens-55 and EPIC-Kitchens-100, as demonstrated by extensive experiments, and can be extended to work also on domain adaptation settings with competitive results.

[199]  arXiv:2110.10103 [pdf, other]
Title: Continual self-training with bootstrapped remixing for speech enhancement
Comments: Submitted to ICASSP 2022
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)

We propose RemixIT, a simple and novel self-supervised training method for speech enhancement. The proposed method is based on a continuous self-training scheme that overcomes limitations from previous studies including assumptions for the in-domain noise distribution and having access to clean target signals. Specifically, a separation teacher model is pre-trained on an out-of-domain dataset and is used to infer estimated target signals for a batch of in-domain mixtures. Next, we bootstrap the mixing process by generating artificial mixtures using permuted estimated clean and noise signals. Finally, the student model is trained using the permuted estimated sources as targets while we periodically update the teacher's weights using the latest student model. Our experiments show that RemixIT outperforms several previous state-of-the-art self-supervised methods under multiple speech enhancement tasks. Additionally, RemixIT provides a seamless alternative for semi-supervised and unsupervised domain adaptation for speech enhancement tasks, while being general enough to be applied to any separation task and paired with any separation model.
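
The remixing step described above can be sketched in a few lines. This is a minimal illustration, assuming the teacher's outputs are already available as lists of equal-length sample sequences; the function name `bootstrapped_remix` and the choice to permute only the noise estimates within the batch are assumptions, and the teacher and student models themselves are omitted.

```python
import random


def bootstrapped_remix(est_speech, est_noise, rng):
    """Pair each speech estimate with a randomly permuted noise estimate.

    Returns the new mixtures plus the (speech, noise) targets for the student.
    """
    perm = list(range(len(est_noise)))
    rng.shuffle(perm)  # permute noise estimates across the batch
    permuted_noise = [est_noise[j] for j in perm]
    mixtures = [
        [s + n for s, n in zip(speech, noise)]  # sample-wise sum
        for speech, noise in zip(est_speech, permuted_noise)
    ]
    return mixtures, est_speech, permuted_noise
```

A student model would then be trained to recover the returned targets from the returned mixtures, with the teacher periodically refreshed from the student.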

[200]  arXiv:2110.10106 [pdf, other]
Title: Subframework-Based Rigidity Control in Multirobot Networks
Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
Subjects: Systems and Control (eess.SY)

This paper presents an alternative approach for analyzing distance-based rigidity in networks of mobile agents, based on a subframework scheme. The advantage of this point of view lies in expressing framework rigidity, which is inherently global, as a localized property. Also, we show that a framework's normalized rigidity eigenvalue degrades as its graph diameter increases. Thus, the rigidity eigenvalues associated to the subframeworks arise naturally as localized rigidity metrics. A decentralized subframework-based controller for maintaining rigidity using only range measurements is developed, which is also aimed to minimize the network's communication load. Finally, we show that the information exchange required by the controller is completed in a finite number of iterations, showing the convenience of the proposed scheme.

[201]  arXiv:2110.10107 [pdf, other]
Title: An Unconstrained Convex Formulation of Compliant Contact
Comments: 20 pages with 22 figures. Submitted to IEEE Transactions on Robotics (T-RO). The supplemental video is available publicly at this https URL
Subjects: Computational Engineering, Finance, and Science (cs.CE)

We present a convex formulation of compliant frictional contact and a robust, performant method to solve it in practice. By analytically eliminating contact constraints, we obtain an unconstrained convex problem. Our solver has proven global convergence and warm-starts effectively, enabling simulation at interactive rates. We develop compact analytical expressions of contact forces allowing us to describe our model in clear physical terms and to rigorously characterize our approximations. Moreover, this enables us not only to model point contact, but also to incorporate sophisticated models of compliant contact patches. Our time stepping scheme includes the midpoint rule, which we demonstrate achieves second order accuracy even with frictional contact. We introduce a number of accuracy metrics and show our method outperforms existing commercial and open source alternatives without sacrificing accuracy. Finally, we demonstrate robust simulation of robotic manipulation tasks at interactive rates, with accurately resolved stiction and contact transitions, as required for meaningful sim-to-real transfer. Our method is implemented in the open source robotics toolkit Drake.

[202]  arXiv:2110.10108 [pdf, other]
Title: TESSERACT: Gradient Flip Score to Secure Federated Learning Against Model Poisoning Attacks
Comments: 12 pages
Subjects: Machine Learning (cs.LG)

Federated learning (multi-party, distributed learning in a decentralized environment) is vulnerable to model poisoning attacks, even more so than centralized learning approaches. This is because malicious clients can collude and send in carefully tailored model updates to make the global model inaccurate. This motivated the development of Byzantine-resilient federated learning algorithms, such as Krum, Bulyan, FABA, and FoolsGold. However, a recently developed untargeted model poisoning attack showed that all prior defenses can be bypassed. The attack uses the intuition that simply by changing the sign of the gradient updates that the optimizer is computing, for a set of malicious clients, a model can be diverted from the optima to increase the test error rate. In this work, we develop TESSERACT, a defense against this directed deviation attack, a state-of-the-art model poisoning attack. TESSERACT is based on a simple intuition that in a federated learning setting, certain patterns of gradient flips are indicative of an attack. This intuition is remarkably stable across different learning algorithms, models, and datasets. TESSERACT assigns reputation scores to the participating clients based on their behavior during the training phase and then takes a weighted contribution of the clients. We show that TESSERACT provides robustness against even a white-box version of the attack.

[203]  arXiv:2110.10110 [pdf, ps, other]
Title: Scheduling Improves the Performance of Belief Propagation for Noisy Group Testing
Subjects: Information Theory (cs.IT)

This paper considers the noisy group testing problem where among a large population of items some are defective. The goal is to identify all defective items by testing groups of items, with the minimum possible number of tests. The focus of this work is on the practical settings with a limited number of items rather than the asymptotic regime. In the current literature, belief propagation (BP) has been shown to be effective in recovering defective items from the test results. In this work, we adopt two variants of the belief propagation algorithm for the noisy group testing problem. These algorithms have been used successfully in the decoding of low-density parity-check codes. We perform an experimental study and show, using extensive simulations, that these algorithms achieve a higher success probability and lower false-negative and false-positive rates than the traditional belief propagation algorithm. For instance, our results show that the proposed algorithms can reduce the false-negative rate by about $50\%$ (or more) when compared to the traditional BP algorithm, under the combinatorial model. Moreover, under the probabilistic model, this reduction in the false-negative rate increases to about $80\%$ for the tested cases.
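
The problem setup can be simulated with a short sketch. It covers only the generation of noisy test outcomes, not belief propagation decoding, and it assumes a specific noise model that the abstract does not state: each item joins a pool independently with probability `pool_prob`, and each outcome is flipped independently with probability `flip_prob`. The function name and all parameters are hypothetical.

```python
import random


def run_noisy_tests(n_items, defectives, n_tests, pool_prob, flip_prob, rng):
    """Simulate pooled tests whose outcomes are flipped with some probability."""
    pools, outcomes = [], []
    for _ in range(n_tests):
        # Random pool design: each item is included independently.
        pool = [i for i in range(n_items) if rng.random() < pool_prob]
        # Noiseless outcome: positive iff the pool contains a defective item.
        positive = any(i in defectives for i in pool)
        # Assumed symmetric noise: flip the outcome with probability flip_prob.
        if rng.random() < flip_prob:
            positive = not positive
        pools.append(pool)
        outcomes.append(positive)
    return pools, outcomes
```

A decoder would take `pools` and `outcomes` as input and try to recover `defectives`; comparing its output with the true set gives the false-negative and false-positive rates discussed above.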

[204]  arXiv:2110.10116 [pdf, ps, other]
Title: On the Global Convergence of Momentum-based Policy Gradient
Subjects: Machine Learning (cs.LG); Optimization and Control (math.OC)

Policy gradient (PG) methods are popular and efficient for large-scale reinforcement learning due to their relative stability and incremental nature. In recent years, the empirical success of PG methods has led to the development of a theoretical foundation for these methods. In this work, we generalize this line of research by studying the global convergence of stochastic PG methods with momentum terms, which have been demonstrated to be efficient recipes for improving PG methods. We study both the soft-max and the Fisher-non-degenerate policy parametrizations, and show that adding a momentum term improves the global optimality sample complexity of vanilla PG methods by $\tilde{\mathcal{O}}(\epsilon^{-1.5})$ and $\tilde{\mathcal{O}}(\epsilon^{-1})$, respectively, where $\epsilon>0$ is the target tolerance. Our work is the first one that obtains global convergence results for the momentum-based PG methods. For the generic Fisher-non-degenerate policy parametrizations, our result is the first single-loop and finite-batch PG algorithm achieving $\tilde{O}(\epsilon^{-3})$ global optimality sample complexity. Finally, as a by-product, our methods also provide a general framework for analyzing the global convergence rates of stochastic PG methods, which can be easily applied and extended to different PG estimators.

[205]  arXiv:2110.10117 [pdf, other]
Title: Beyond Exact Gradients: Convergence of Stochastic Soft-Max Policy Gradient Methods with Entropy Regularization
Subjects: Machine Learning (cs.LG)

Entropy regularization is an efficient technique for encouraging exploration and preventing a premature convergence of (vanilla) policy gradient methods in reinforcement learning (RL). However, the theoretical understanding of entropy regularized RL algorithms has been limited. In this paper, we revisit the classical entropy regularized policy gradient methods with the soft-max policy parametrization, whose convergence has so far only been established assuming access to exact gradient oracles. To go beyond this scenario, we propose the first set of (nearly) unbiased stochastic policy gradient estimators with trajectory-level entropy regularization, with one being an unbiased visitation measure-based estimator and the other one being a nearly unbiased yet more practical trajectory-based estimator. We prove that although the estimators themselves are unbounded in general due to the additional logarithmic policy rewards introduced by the entropy term, the variances are uniformly bounded. This enables the development of the first set of convergence results for stochastic entropy regularized policy gradient methods to both stationary points and globally optimal policies. We also develop some improved sample complexity results under a good initialization.

[206]  arXiv:2110.10122 [pdf, other]
Title: Electricity Tariff Design via Lens of Energy Justice
Subjects: Systems and Control (eess.SY)

Distributed Energy Resources (DERs) can significantly affect the net social benefit in power systems, raising concerns pertaining to distributive justice, equity, and fairness. Electricity tariff and DERs share a symbiotic relationship whereby the design of the former directly impacts the economic efficiency and equity in the system. Current tariff design approaches suffer from opaque efficiency-equity trade-offs and are also agnostic of the externalities that affect both economic efficiency and equity. Therefore, this paper develops a justice-cognizant tariff design framework that improves the economic efficiency of tariff without sacrificing its distributional equity, and encompasses economic welfare, social costs of environmental and public health impacts, and socio-economic and demographic characteristics of electricity consumers. The proposed framework is based on a Single Leader Single Follower (SLSF) game incorporating a multi-objective optimization problem, and is evaluated on four different tariff structures. The SLSF game is reformulated as a Multi-Objective Problem with Equilibrium Constraints (MOPEC) and is solved by integrating the objective sum method for multi-objective optimization and Scholtes's relaxation technique for equilibrium constraints. We compare the economic efficiency and equity of the proposed framework using the 11-zone New York ISO and 7-bus Manhattan power networks. The results demonstrate that spatially- and temporally-granular tariffs ensure equity and economic efficiency at a lower energy burden to consumers.

[207]  arXiv:2110.10123 [pdf, other]
Title: BlockIoT: Blockchain-based Health Data Integration using IoT Devices
Subjects: Human-Computer Interaction (cs.HC)

The development and adoption of Electronic Health Records (EHR) and health monitoring Internet of Things (IoT) Devices have enabled digitization of patient records and have also substantially transformed the healthcare delivery system in aspects such as remote patient monitoring, healthcare decision making, and medical research. However, data tends to be fragmented among health infrastructures and prevents interoperability of medical data at the point of care. In order to address this gap, we introduce BlockIoT that uses blockchain technology to transfer previously inaccessible and centralized data from medical devices to EHR systems, which provides greater insight to providers who can, in turn, provide better outcomes for patients. This notion of interoperability of medical device data is possible through an Application Programming Interface (API), which serves as a versatile endpoint for all incoming medical device data, a distributed file system that ensures data resilience, and knowledge templates that analyze, identify, and represent medical device data to providers. Our participatory design survey on BlockIoT demonstrates that BlockIoT is a suitable system to supplement physicians' clinical practice and increases efficiency in most healthcare specialties, including cardiology, pulmonology, endocrinology, and primary care.

[208]  arXiv:2110.10124 [pdf, other]
Title: A Numerical Scheme for Wave Turbulence: 3-Wave Kinetic Equations
Subjects: Numerical Analysis (math.NA); Analysis of PDEs (math.AP)

We introduce a finite volume scheme to solve isotropic 3-wave kinetic equations. We test our numerical solution against theoretical results concerning the long time behavior of the energy and observe that our solutions verify the energy cascade phenomenon. To our knowledge, this is the first numerical scheme that could capture the long time asymptotic behavior of solutions to isotropic 3-wave kinetic equations, where the energy cascade can be observed. Our numerical energy cascade rates are in good agreement with the theoretical one obtained by Soffer and Tran. Our finite volume algorithm relies on a new identity that allows one to reduce the number of terms needed to be approximated in the collision operators.

[209]  arXiv:2110.10128 [pdf, other]
Title: Robust Event Classification Using Imperfect Real-world PMU Data
Subjects: Machine Learning (cs.LG)

This paper studies robust event classification using imperfect real-world phasor measurement unit (PMU) data. By analyzing the real-world PMU data, we find it is challenging to directly use this dataset for event classifiers due to the low data quality observed in PMU measurements and event logs. To address these challenges, we develop a novel machine learning framework for training robust event classifiers, which consists of three main steps: data preprocessing, fine-grained event data extraction, and feature engineering. Specifically, the data preprocessing step addresses the data quality issues of PMU measurements (e.g., bad data and missing data); in the fine-grained event data extraction step, a model-free event detection method is developed to accurately localize the events from the inaccurate event timestamps in the event logs; and the feature engineering step constructs the event features based on the patterns of different event types, in order to improve the performance and the interpretability of the event classifiers. Based on the proposed framework, we develop a workflow for event classification using the real-world PMU data streaming into the system in real-time. Using the proposed framework, robust event classifiers can be efficiently trained based on many off-the-shelf lightweight machine learning models. Numerical experiments using the real-world dataset from the Western Interconnection of the U.S. power transmission grid show that the event classifiers trained under the proposed framework can achieve high classification accuracy while being robust against low-quality data.

[210]  arXiv:2110.10129 [pdf, other]
Title: Gummy Browsers: Targeted Browser Spoofing against State-of-the-Art Fingerprinting Techniques
Subjects: Cryptography and Security (cs.CR)

We present a simple yet potentially devastating and hard-to-detect threat, called Gummy Browsers, whereby the browser fingerprinting information can be collected and spoofed without the victim's awareness, thereby compromising the privacy and security of any application that uses browser fingerprinting. The idea is that attacker A first makes the user U connect to his website (or to a well-known site the attacker controls) and transparently collects the information from U that is used for fingerprinting purposes. Then, A orchestrates a browser on his own machine to replicate and transmit the same fingerprinting information when connecting to a website W, fooling W into thinking that U is the one requesting the service rather than A. This will allow the attacker to profile U and compromise U's privacy. We design and implement the Gummy Browsers attack using three orchestration methods based on script injection, browser settings and debugging tools, and script modification, that can successfully spoof a wide variety of fingerprinting features to mimic many different browsers (including mobile browsers and the Tor browser). We then evaluate the attack against two state-of-the-art browser fingerprinting systems, FPStalker and Panopticlick. Our results show that A can accurately match his own manipulated browser fingerprint with that of any targeted victim user U's fingerprint for a long period of time, without significantly affecting the tracking of U and while collecting U's fingerprinting information only once. The TPR (true positive rate) for the tracking of the benign user in the presence of the attack is larger than 0.9 in most cases. The FPR (false positive rate) for the tracking of the attacker is also high, larger than 0.9 in all cases. We also argue that the attack can remain completely oblivious to the user and the website, thus making it extremely difficult to thwart in practice.
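
The two rates quoted in this abstract follow the standard confusion-matrix definitions, TPR = TP / (TP + FN) and FPR = FP / (FP + TN). The following is a minimal sketch of that arithmetic only; the function name `rates` and the counts are hypothetical and are not taken from the paper.

```python
def rates(tp, fn, fp, tn):
    """Return (TPR, FPR) computed from confusion-matrix counts."""
    # TPR: fraction of genuinely matching sessions that the tracker links correctly.
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    # FPR: fraction of non-matching sessions that the tracker links anyway.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


# Hypothetical counts chosen for illustration (not results from the paper).
tpr, fpr = rates(tp=95, fn=5, fp=92, tn=8)
print(f"TPR={tpr:.2f} FPR={fpr:.2f}")
```

In this reading, a high FPR for the attacker means that the tracker wrongly attributes the attacker's spoofed sessions to the victim, which is the outcome the attack aims for.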

[211]  arXiv:2110.10131 [pdf, other]
Title: Personal Health Knowledge Graph for Clinically Relevant Diet Recommendations
Subjects: Human-Computer Interaction (cs.HC)

We propose a knowledge model for capturing dietary preferences and personal context to provide personalized dietary recommendations. We develop a knowledge model called the Personal Health Ontology, which is grounded in semantic technologies, and represents a patient's combined medical information, social determinants of health, and observations of daily living elicited from interviews with diabetic patients. We then generate a personal health knowledge graph that captures temporal patterns from synthetic food logs, annotated with concepts from the Personal Health Ontology. We further discuss how lifestyle guidelines grounded in semantic technologies can be reasoned with the generated personal health knowledge graph to provide appropriate dietary recommendations that satisfy the user's medical and other lifestyle needs.

[212]  arXiv:2110.10132 [pdf, other]
Title: FriendlyCore: Practical Differentially Private Aggregation
Subjects: Machine Learning (cs.LG); Data Structures and Algorithms (cs.DS)

Differentially private algorithms for common metric aggregation tasks, such as clustering or averaging, often have limited practicality due to their complexity or a large number of data points that is required for accurate results. We propose a simple and practical tool $\mathsf{FriendlyCore}$ that takes a set of points ${\cal D}$ from an unrestricted (pseudo) metric space as input. When ${\cal D}$ has effective diameter $r$, $\mathsf{FriendlyCore}$ returns a "stable" subset ${\cal D}_G\subseteq {\cal D}$ that includes all points, except possibly few outliers, and is {\em certified} to have diameter $r$. $\mathsf{FriendlyCore}$ can be used to preprocess the input before privately aggregating it, potentially simplifying the aggregation or boosting its accuracy. Surprisingly, $\mathsf{FriendlyCore}$ is light-weight with no dependence on the dimension. We empirically demonstrate its advantages in boosting the accuracy of mean estimation, outperforming tailored methods.

[213]  arXiv:2110.10133 [pdf, other]
Title: Locally Differentially Private Reinforcement Learning for Linear Mixture Markov Decision Processes
Comments: 25 pages, 2 figures
Subjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR); Optimization and Control (math.OC); Machine Learning (stat.ML)

Reinforcement learning (RL) algorithms can be used to provide personalized services, which rely on users' private and sensitive data. To protect the users' privacy, privacy-preserving RL algorithms are in demand. In this paper, we study RL with linear function approximation and local differential privacy (LDP) guarantees. We propose a novel $(\varepsilon, \delta)$-LDP algorithm for learning a class of Markov decision processes (MDPs) dubbed linear mixture MDPs, which obtains an $\tilde{\mathcal{O}}( d^{5/4}H^{7/4}T^{3/4}\left(\log(1/\delta)\right)^{1/4}\sqrt{1/\varepsilon})$ regret, where $d$ is the dimension of feature mapping, $H$ is the length of the planning horizon, and $T$ is the number of interactions with the environment. We also prove a lower bound $\Omega(dH\sqrt{T}/\left(e^{\varepsilon}(e^{\varepsilon}-1)\right))$ for learning linear mixture MDPs under $\varepsilon$-LDP constraint. Experiments on synthetic datasets verify the effectiveness of our algorithm. To the best of our knowledge, this is the first provable privacy-preserving RL algorithm with linear function approximation.

[214]  arXiv:2110.10136 [pdf, other]
Title: Activation Landscapes as a Topological Summary of Neural Network Performance
Comments: 4 pages, 5 figures
Subjects: Machine Learning (cs.LG); Algebraic Topology (math.AT)

We use topological data analysis (TDA) to study how data transforms as it passes through successive layers of a deep neural network (DNN). We compute the persistent homology of the activation data for each layer of the network and summarize this information using persistence landscapes. The resulting feature map provides both an informative visualization of the network and a kernel for statistical analysis and machine learning. We observe that the topological complexity often increases with training and that the topological complexity does not decrease with each layer.

[215]  arXiv:2110.10142 [pdf, ps, other]
Title: Model Predictive Control for Automotive Climate Control Systems via Value Function Approximation
Comments: 6 pages, 5 figures, Submitted to both L-CSS IEEE and ACC 2022 invited session (currently under review)
Subjects: Systems and Control (eess.SY)

Among the auxiliary loads in light-duty vehicles, the air conditioning system is the single largest energy consumer. For electrified vehicles, the impact of heating and cooling loads becomes even more significant, as they compete with the powertrain for battery energy use and can significantly reduce the range or performance. While considerable work has been done in the field of optimal energy management for electrified vehicles and optimization of vehicle velocity for eco-driving, few contributions have addressed the application of energy-optimal control for heating and cooling loads.
This paper proposes an energy management strategy for the thermal management system of an electrified powertrain, based on Model Predictive Control. Starting from a nonlinear model of the vapor compression refrigeration system that captures the dynamics of the refrigerant in the heat exchangers and the power consumption of the system, a constrained multi-objective optimal control problem is formulated to reduce energy consumption while tracking a desired thermal set point. An efficient implementation of MPC is proposed for real-time applications by introducing a terminal cost obtained from the approximation of the global optimal solution.

[216]  arXiv:2110.10144 [pdf, other]
Title: FaxPlainAC: A Fact-Checking Tool Based on EXPLAINable Models with HumAn Correction in the Loop
Comments: 5 pages, 4 figures, accepted as a DEMO paper in CIKM 2021
Journal-ref: CIKM 2021
Subjects: Artificial Intelligence (cs.AI)

Fact-checking on the Web has become the main mechanism through which we detect the credibility of the news or information. Existing fact-checkers verify the authenticity of the information (support or refute the claim) based on secondary sources of information. However, existing approaches do not consider the problem of updating models as the training data constantly grows through user feedback. It is therefore important to conduct user studies to correct models' inference biases and improve the model in a life-long learning manner in the future according to the user feedback. In this paper, we present FaxPlainAC, a tool that gathers user feedback on the output of explainable fact-checking models. FaxPlainAC outputs both the model decision, i.e., whether the input fact is true or not, and the supporting/refuting evidence considered by the model. Additionally, FaxPlainAC allows for accepting user feedback both on the prediction and explanation. Developed in Python, FaxPlainAC is designed as a modular and easily deployable tool. It can be integrated with other downstream tasks and allows for gathering human annotations for fact-checking and for life-long learning.

[217]  arXiv:2110.10149 [pdf, other]
Title: Continuous Control with Action Quantization from Demonstrations
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Robotics (cs.RO)

In Reinforcement Learning (RL), discrete actions, as opposed to continuous actions, result in less complex exploration problems and the immediate computation of the maximum of the action-value function which is central to dynamic programming-based methods. In this paper, we propose a novel method: Action Quantization from Demonstrations (AQuaDem) to learn a discretization of continuous action spaces by leveraging the priors of demonstrations. This dramatically reduces the exploration problem, since the actions faced by the agent are not only finite in number but also plausible in light of the demonstrator's behavior. By discretizing the action space we can apply any discrete action deep RL algorithm to the continuous control problem. We evaluate the proposed method on three different setups: RL with demonstrations, RL with play data --demonstrations of a human playing in an environment but not solving any specific task-- and Imitation Learning. For all three setups, we only consider human data, which is more challenging than synthetic data. We found that AQuaDem consistently outperforms state-of-the-art continuous control methods, both in terms of performance and sample efficiency. We provide visualizations and videos on the paper's website: https://google-research.github.io/aquadem.

Cross-lists for Wed, 20 Oct 21

[218]  arXiv:2109.08930 (cross-list from cs.DB) [pdf, other]
Title: Regular Sequential Serializability and Regular Sequential Consistency
Comments: 35 pages
Subjects: Databases (cs.DB); Distributed, Parallel, and Cluster Computing (cs.DC)

Strictly serializable (linearizable) services appear to execute transactions (operations) sequentially, in an order consistent with real time. This restricts a transaction's (operation's) possible return values and in turn, simplifies application programming. In exchange, strictly serializable (linearizable) services perform worse than those with weaker consistency. But switching to such services can break applications.
This work introduces two new consistency models to ease this trade-off: regular sequential serializability (RSS) and regular sequential consistency (RSC). They are just as strong for applications: we prove any application invariant that holds when using a strictly serializable (linearizable) service also holds when using an RSS (RSC) service. Yet they relax the constraints on services -- they allow new, better-performing designs. To demonstrate this, we design, implement, and evaluate variants of two systems, Spanner and Gryff, relaxing their consistency to RSS and RSC, respectively. The new variants achieve better read-only transaction and read tail latency than their counterparts.

[219]  arXiv:2110.09516 (cross-list from stat.ML) [pdf, ps, other]
Title: Kernel Minimum Divergence Portfolios
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Portfolio Management (q-fin.PM)

Portfolio optimization is a key challenge in finance with the aim of creating portfolios matching the investors' preference. The target distribution approach relying on the Kullback-Leibler or the $f$-divergence represents one of the most effective forms of achieving this goal. In this paper, we propose to use kernel and optimal transport (KOT) based divergences to tackle the task, which relax the assumptions and the optimization constraints of the previous approaches. In case of the kernel-based maximum mean discrepancy (MMD) we (i) prove the analytic computability of the underlying mean embedding for various target distribution-kernel pairs, (ii) show that such analytic knowledge can lead to faster convergence of MMD estimators, and (iii) extend the results to the unbounded exponential kernel with minimax lower bounds. Numerical experiments demonstrate the improved performance of our KOT estimators both on synthetic and real-world examples.

[220]  arXiv:2110.09541 (cross-list from eess.SP) [pdf, other]
Title: Wideband and Entropy-Aware Deep Soft Bit Quantization
Subjects: Signal Processing (eess.SP); Information Theory (cs.IT); Machine Learning (cs.LG)

Deep learning has been recently applied to physical layer processing in digital communication systems in order to improve end-to-end performance. In this work, we introduce a novel deep learning solution for soft bit quantization across wideband channels. Our method is trained end-to-end with quantization- and entropy-aware augmentations to the loss function and is used at inference in conjunction with source coding to achieve near-optimal compression gains over wideband channels. To efficiently train our method, we prove and verify that a fixed feature space quantization scheme is sufficient for efficient learning. When tested on channel distributions never seen during training, the proposed method achieves a compression gain of up to $10 \%$ in the high SNR regime versus previous state-of-the-art methods. To encourage reproducible research, our implementation is publicly available at https://github.com/utcsilab/wideband-llr-deep.

[221]  arXiv:2110.09594 (cross-list from econ.TH) [pdf, other]
Title: Bayesian Persuasion in Sequential Trials
Subjects: Theoretical Economics (econ.TH); Computer Science and Game Theory (cs.GT)

We consider a Bayesian persuasion or information design problem where the sender tries to persuade the receiver to take a particular action via a sequence of signals. This we model by considering multi-phase trials with different experiments conducted based on the outcomes of prior experiments. In contrast to most of the literature, we consider the problem with constraints on signals imposed on the sender. This we achieve by fixing some of the experiments in an exogenous manner; these are called determined experiments. This modeling helps us understand real-world situations where this occurs: e.g., multi-phase drug trials where the FDA determines some of the experiments, funding of a startup by a venture capital firm, start-up acquisition by big firms where late-stage assessments are determined by the potential acquirer, multi-round job interviews where the candidates signal initially by presenting their qualifications but the rest of the screening procedures are determined by the interviewer. The non-determined experiments (signals) in the multi-phase trial are to be chosen by the sender in order to persuade the receiver best. With a binary state of the world, we start by deriving the optimal signaling policy in the only non-trivial configuration of a two-phase trial with binary-outcome experiments. We then generalize to multi-phase trials with binary-outcome experiments where the determined experiments can be placed at any chosen node in the trial tree. Here we present a dynamic programming algorithm to derive the optimal signaling policy that uses the two-phase trial solution's structural insights. We also contrast the optimal signaling policy structure with classical Bayesian persuasion strategies to highlight the impact of the signaling constraints on the sender.

[222]  arXiv:2110.09616 (cross-list from eess.SP) [pdf, ps, other]
Title: Model Order Estimation for A Sum of Complex Exponentials
Comments: Submitted for possible publication
Subjects: Signal Processing (eess.SP); Information Theory (cs.IT)

In this paper, we present a new method for estimating the number of terms in a sum of exponentially damped sinusoids embedded in noise. In particular, we propose to combine the shift-invariance property of the Hankel matrix associated with the signal with a constraint over its singular values to penalize small order estimations. With this new methodology, the algebraic and statistical structures of the Hankel matrix are considered. The new order estimation technique shows significant improvements over subspace-based methods. In particular, when a good separation between the noise and the signal subspaces is not possible, the new methodology outperforms known techniques. We evaluate the performance of our method using numerical experiments and comparing its performance with previous results found in the literature.

[223]  arXiv:2110.09618 (cross-list from stat.ML) [pdf, other]
Title: Interpolating between sampling and variational inference with infinite stochastic mixtures
Comments: 8 pages, 4 figures. Submitted to AISTATS 2022; under double-blind review. Code available at this https URL
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Statistics Theory (math.ST); Computation (stat.CO)

Sampling and Variational Inference (VI) are two large families of methods for approximate inference with complementary strengths. Sampling methods excel at approximating arbitrary probability distributions, but can be inefficient. VI methods are efficient, but can fail when probability distributions are complex. Here, we develop a framework for constructing intermediate algorithms that balance the strengths of both sampling and VI. Both approximate a probability distribution using a mixture of simple component distributions: in sampling, each component is a delta-function and is chosen stochastically, while in standard VI a single component is chosen to minimize divergence. We show that sampling and VI emerge as special cases of an optimization problem over a mixing distribution, and intermediate approximations arise by varying a single parameter. We then derive closed-form sampling dynamics over variational parameters that stochastically build a mixture. Finally, we discuss how to select the optimal compromise between sampling and VI given a computational budget. This work is a first step towards a highly flexible yet simple family of inference methods that combines the complementary strengths of sampling and VI.

[224]  arXiv:2110.09620 (cross-list from stat.ME) [pdf, ps, other]
Title: Sufficient Dimension Reduction for High-Dimensional Regression and Low-Dimensional Embedding: Tutorial and Survey
Comments: To appear as a part of an upcoming textbook on dimensionality reduction and manifold learning
Subjects: Methodology (stat.ME); Machine Learning (cs.LG); Statistics Theory (math.ST); Machine Learning (stat.ML)

This is a tutorial and survey paper on various methods for Sufficient Dimension Reduction (SDR). We cover these methods from both the statistical high-dimensional regression perspective and the machine learning approach to dimensionality reduction. We start by introducing inverse regression methods including Sliced Inverse Regression (SIR), Sliced Average Variance Estimation (SAVE), contour regression, directional regression, Principal Fitted Components (PFC), Likelihood Acquired Direction (LAD), and graphical regression. Then, we introduce forward regression methods including Principal Hessian Directions (pHd), Minimum Average Variance Estimation (MAVE), Conditional Variance Estimation (CVE), and deep SDR methods. Finally, we explain Kernel Dimension Reduction (KDR) both for supervised and unsupervised learning. We also show that supervised KDR and supervised PCA are equivalent.

[225]  arXiv:2110.09625 (cross-list from eess.AS) [pdf, other]
Title: Personalized Speech Enhancement: New Models and Comprehensive Evaluation
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)

Personalized speech enhancement (PSE) models utilize additional cues, such as speaker embeddings like d-vectors, to remove background noise and interfering speech in real-time and thus improve the speech quality of online video conferencing systems for various acoustic scenarios. In this work, we propose two neural networks for PSE that achieve superior performance to the previously proposed VoiceFilter. In addition, we create test sets that capture a variety of scenarios that users can encounter during video conferencing. Furthermore, we propose a new metric to measure the target speaker over-suppression (TSOS) problem, which was not sufficiently investigated before despite its critical importance in deployment. Besides, we propose multi-task training with a speech recognition back-end. Our results show that the proposed models can yield better speech recognition accuracy, speech intelligibility, and perceptual quality than the baseline models, and the multi-task training can alleviate the TSOS issue in addition to improving the speech recognition accuracy.

[226]  arXiv:2110.09626 (cross-list from stat.ML) [pdf, other]
Title: A cautionary tale on fitting decision trees to data from additive models: generalization lower bounds
Subjects: Machine Learning (stat.ML); Information Theory (cs.IT); Machine Learning (cs.LG)

Decision trees are important both as interpretable models amenable to high-stakes decision-making, and as building blocks of ensemble methods such as random forests and gradient boosting. Their statistical properties, however, are not well understood. The most cited prior works have focused on deriving pointwise consistency guarantees for CART in a classical nonparametric regression setting. We take a different approach, and advocate studying the generalization performance of decision trees with respect to different generative regression models. This allows us to elicit their inductive bias, that is, the assumptions the algorithms make (or do not make) to generalize to new data, thereby guiding practitioners on when and how to apply these methods. In this paper, we focus on sparse additive generative models, which have both low statistical complexity and some nonparametric flexibility. We prove a sharp squared error generalization lower bound for a large class of decision tree algorithms fitted to sparse additive models with $C^1$ component functions. This bound is surprisingly much worse than the minimax rate for estimating such sparse additive models. The inefficiency is due not to greediness, but to the loss in power for detecting global structure when we average responses solely over each leaf, an observation that suggests opportunities to improve tree-based algorithms, for example, by hierarchical shrinkage. To prove these bounds, we develop new technical machinery, establishing a novel connection between decision tree estimation and rate-distortion theory, a sub-field of information theory.

[227]  arXiv:2110.09648 (cross-list from math.PR) [pdf, ps, other]
Title: Data Flow Dissemination in a Network
Comments: 14 pages, 1 figure
Subjects: Probability (math.PR); Networking and Internet Architecture (cs.NI)

We consider the following network model motivated, in particular, by blockchains and peer-to-peer live streaming. Data packet flows originate at the network nodes and need to be disseminated to all other nodes. Packets are relayed through the network via links of limited capacity. A packet leaves the network when it is disseminated to all nodes. The network is stable when it is positive recurrent; and when it is, the age of the oldest packet, referred to as Age-of-Information (AoI), is stochastically bounded. Under the Random-Useful (RU) discipline a node $u$ communicates on link $(u,v)$ a randomly chosen available packet not present at $v$. The RU discipline is known to have the maximum stability region for a single flow; we show that this extends to an arbitrary number of flows. Our main results concern the Oldest-Useful (OU) discipline, under which a node $u$ communicates on link $(u,v)$ the oldest available packet not present at $v$. The OU discipline is a natural candidate for reducing the AoI. We show that, surprisingly, OU \emph{does not} provide the maximum stability region. As the main result of this paper, we prove that OU \emph{does} have the maximum stability region in the important special case of a complete graph network with equal capacities on all links and equal flow rates originating in all nodes. Simulation results show that, in the latter special case, OU outperforms RU in terms of AoI.
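
The two selection rules described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's model: packets are represented by hypothetical integer ids, each node's buffer is a set of ids, and `birth_time` is an assumed mapping from packet id to creation time. Link capacities and the stochastic dynamics are not modeled.

```python
import random


def useful_packets(at_u, at_v):
    """Packets held by node u that node v does not yet have."""
    return [p for p in at_u if p not in at_v]


def pick_random_useful(at_u, at_v, rng=random):
    """Random-Useful (RU): a uniformly random useful packet, or None."""
    candidates = useful_packets(at_u, at_v)
    return rng.choice(candidates) if candidates else None


def pick_oldest_useful(at_u, at_v, birth_time):
    """Oldest-Useful (OU): the useful packet with the earliest birth time, or None."""
    candidates = useful_packets(at_u, at_v)
    return min(candidates, key=lambda p: birth_time[p]) if candidates else None


# Hypothetical state: u holds packets 1, 2, 3 and v already holds packet 1.
at_u, at_v = {1, 2, 3}, {1}
birth_time = {1: 0.0, 2: 5.0, 3: 2.0}
print(pick_oldest_useful(at_u, at_v, birth_time))  # packet 3 is older than packet 2
print(pick_random_useful(at_u, at_v))              # either 2 or 3
```

Both rules restrict the choice to packets that are useful on the link $(u,v)$ and differ only in how they pick among them.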

[228]  arXiv:2110.09662 (cross-list from eess.IV) [pdf, other]
Title: Osteoporosis Prescreening using Panoramic Radiographs through a Deep Convolutional Neural Network with Attention Mechanism
Comments: 9 pages
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)

Objectives. The aim of this study was to investigate whether a deep convolutional neural network (CNN) with an attention module can detect osteoporosis on panoramic radiographs.
Study Design. A dataset of 70 panoramic radiographs (PRs) from 70 different subjects aged between 49 and 60 was used, including 49 subjects with osteoporosis and 21 normal subjects. We utilized the leave-one-out cross-validation approach to generate 70 training and test splits. Specifically, for each split, one image was used for testing and the remaining 69 images were used for training. A deep convolutional neural network (CNN) using the Siamese architecture was implemented through a fine-tuning process to classify a PR image using patches extracted from eight representative trabecular bone areas (Figure 1). In order to automatically learn the importance of different PR patches, an attention module was integrated into the deep CNN. Three metrics, including osteoporosis accuracy (OPA), non-osteoporosis accuracy (NOPA) and overall accuracy (OA), were utilized for performance evaluation.
Results. The proposed baseline CNN approach achieved the OPA, NOPA and OA scores of 0.667, 0.878 and 0.814, respectively. With the help of the attention module, the OPA, NOPA and OA scores were further improved to 0.714, 0.939 and 0.871, respectively.
Conclusions. The proposed method obtained promising results using deep CNN with an attention module, which might be applied to osteoporosis prescreening.
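
The evaluation protocol described above can be sketched as follows. This is a minimal illustration that assumes OPA and NOPA denote per-class accuracy on the osteoporosis and non-osteoporosis subjects respectively, which the abstract does not state explicitly. The function names and the toy labels are hypothetical, and no model is trained here.

```python
def leave_one_out_splits(n):
    """Yield (train_indices, test_index) pairs, one per sample."""
    for i in range(n):
        yield [j for j in range(n) if j != i], i


def class_accuracies(y_true, y_pred, positive=1):
    """Return (OPA, NOPA, OA), assuming both classes occur in y_true."""
    pos_hits = [p == t for t, p in zip(y_true, y_pred) if t == positive]
    neg_hits = [p == t for t, p in zip(y_true, y_pred) if t != positive]
    opa = sum(pos_hits) / len(pos_hits)    # accuracy on the positive class
    nopa = sum(neg_hits) / len(neg_hits)   # accuracy on the negative class
    oa = (sum(pos_hits) + sum(neg_hits)) / len(y_true)
    return opa, nopa, oa


# 70 images give 70 splits, each with 69 training images and 1 test image.
splits = list(leave_one_out_splits(70))
print(len(splits), len(splits[0][0]))

# Toy labels for illustration only (1 = osteoporosis, 0 = normal).
print(class_accuracies([1, 1, 1, 0, 0], [1, 0, 1, 0, 0]))
```

With leave-one-out cross-validation each image is predicted exactly once, so the three scores are computed over the 70 pooled test predictions.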

[229]  arXiv:2110.09671 (cross-list from eess.SP) [pdf, other]
Title: Coordinated Beamforming in Quantized Massive MIMO Systems with Per-Antenna Constraints
Comments: 6 pages, 3 figures, 1 table, submitted to WCNC 2022
Subjects: Signal Processing (eess.SP); Information Theory (cs.IT)

In this work, we present a solution for coordinated beamforming for large-scale downlink (DL) communication systems with low-resolution data converters when employing a per-antenna power constraint that limits the maximum antenna power to alleviate hardware cost. To this end, we formulate and solve the antenna power minimax problem for the coarsely quantized DL system with target signal-to-interference-plus-noise ratio requirements. We show that the associated Lagrangian dual with uncertain noise covariance matrices achieves zero duality gap and that the dual solution can be used to obtain the primal DL solution. Using strong duality, we propose an iterative algorithm to determine the optimal dual solution, which is used to compute the optimal DL beamformer. We further update the noise covariance matrices using the optimal DL solution with an associated subgradient and perform projection onto the feasible domain. Through simulation, we evaluate the proposed method in maximum antenna power consumption and peak-to-average power ratio which are directly related to hardware efficiency.

[230]  arXiv:2110.09678 (cross-list from math.OC) [pdf, ps, other]
Title: Convergence Rate of Accelerated Average Consensus with Local Node Memory: Optimization and Analytic Solutions
Comments: 30 pages, 2 figures
Subjects: Optimization and Control (math.OC); Systems and Control (eess.SY)

Previous research has shown that adding local memory can accelerate consensus. It is natural to ask what the fastest rate achievable by $M$-tap memory acceleration is, and what the corresponding control parameters are. This paper introduces a set of effective and previously unused techniques to analyze the convergence rate of accelerated consensus with $M$-tap memory of local nodes and to design the control protocols. These effective techniques, including the Kharitonov stability theorem, the Routh stability criterion and the robust stability margin, have led to the following new results: 1) the direct link between the convergence rate and the control parameters; 2) explicit formulas of the optimal convergence rate and the corresponding optimal control parameters for $M \leq 2$ on a given graph; 3) the optimal worst-case convergence rate and the corresponding optimal control parameters for the memory $M \geq 1$ on a set of uncertain graphs. We show that the acceleration with the memory $M = 1$ provides the optimal convergence rate in the sense of worst-case performance. Several numerical examples are given to demonstrate the validity and performance of the theoretical results.

[231]  arXiv:2110.09680 (cross-list from stat.ML) [pdf, other]
Title: Multilevel Stochastic Optimization for Imputation in Massive Medical Data Records
Comments: 18 pages, 2 figures
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Applications (stat.AP)

Exploration and analysis of massive datasets has recently generated increasing interest in the research and development communities. It has long been a recognized problem that many datasets contain significant levels of missing numerical data. We introduce a mathematically principled stochastic optimization imputation method based on the theory of Kriging. This is shown to be a powerful method for imputation. However, its computational effort and potential numerical instabilities produce costly and/or unreliable predictions, potentially limiting its use on large scale datasets. In this paper, we apply a recently developed multi-level stochastic optimization approach to the problem of imputation in massive medical records. The approach is based on computational applied mathematics techniques and is highly accurate. In particular, for the Best Linear Unbiased Predictor (BLUP) this multi-level formulation is exact, and is also significantly faster and more numerically stable. This permits practical application of Kriging methods to data imputation problems for massive datasets. We test this approach on data from the National Inpatient Sample (NIS) data records, Healthcare Cost and Utilization Project (HCUP), Agency for Healthcare Research and Quality. Numerical results show the multi-level method significantly outperforms current approaches and is numerically robust. In particular, it has superior accuracy as compared with methods recommended in the recent report from HCUP on the important problem of missing data, which could lead to sub-optimal and poorly based funding policy decisions. In comparative benchmark tests it is shown that the multilevel stochastic method is significantly superior to recommended methods in the report, including Predictive Mean Matching (PMM) and Predicted Posterior Distribution (PPD), with up to 75% reductions in error.

[232]  arXiv:2110.09687 (cross-list from eess.SP) [pdf]
Title: Data Driven Prediction of Battery Cycle Life Before Capacity Degradation
Comments: 28 pages, 11 figures, 1 table, 12 equations
Subjects: Signal Processing (eess.SP); Machine Learning (cs.LG)

Ubiquitous use of lithium-ion batteries across multiple industries presents an opportunity to explore cost saving initiatives as the price to performance ratio continually decreases in a competitive environment. Manufacturers using lithium-ion batteries ranging in applications from mobile phones to electric vehicles need to know how long batteries will last for a given service life. To understand this, expensive testing is required.
This paper uses the data and methods of Kristen A. Severson et al. to explore the methodologies that the research team used, and presents another method to compare predicted results with actual test data for battery capacity fade. The fundamental question is whether machine learning techniques can be trained on early life cycle data to accurately predict battery capacity over the battery life cycle. The results compare Gaussian Process Regression (GPR) and Elastic Net Regression (ENR) and highlight the key data features used from the extensive dataset found in the work of Severson et al.

[233]  arXiv:2110.09693 (cross-list from eess.IV) [pdf, other]
Title: Cross-Vendor CT Image Data Harmonization Using CVH-CT
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)

While remarkable advances have been made in Computed Tomography (CT), most of the existing efforts focus on imaging enhancement while reducing radiation dose. How to harmonize CT image data captured using different scanners is vital in cross-center large-scale radiomics studies but remains largely unexplored. Furthermore, the lack of paired training images makes it computationally challenging to adopt existing deep learning models. We propose a novel deep learning approach called CVH-CT for harmonizing CT images captured using scanners from different vendors. The generator of CVH-CT uses a self-attention mechanism to learn the scanner-related information. We also propose a VGG feature-based domain loss to effectively extract texture properties from unpaired image data to learn the scanner-based texture distributions. The experimental results show that CVH-CT is clearly better than the baselines because of the use of the proposed domain loss, and CVH-CT can effectively reduce the scanner-related variability in terms of radiomic features.

[234]  arXiv:2110.09697 (cross-list from stat.ML) [pdf, other]
Title: abess: A Fast Best Subset Selection Library in Python and R
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Computation (stat.CO)

We introduce a new library named abess that implements a unified framework of best-subset selection for solving diverse machine learning problems, e.g., linear regression, classification, and principal component analysis. In particular, abess certifiably attains the optimal solution in polynomial time under the linear model. Our efficient implementation allows abess to attain the solution of best-subset selection problems as fast as or even 100x faster than existing competing variable (model) selection toolboxes. Furthermore, it supports common variants like best group subset selection and $\ell_2$ regularized best-subset selection. The core of the library is programmed in C++. For ease of use, a Python library is designed to integrate conveniently with scikit-learn, and it can be installed from the Python Package Index. In addition, a user-friendly R library is available at the Comprehensive R Archive Network. The source code is available at: https://github.com/abess-team/abess.

[235]  arXiv:2110.09704 (cross-list from stat.ME) [pdf, other]
Title: Hybrid variable monitoring: An unsupervised process monitoring framework
Comments: This paper has been submitted to Automatica for potential publication
Subjects: Methodology (stat.ME); Systems and Control (eess.SY)

Traditional process monitoring methods, such as PCA, PLS, ICA, MD, etc., are strongly dependent on continuous variables because most of them inevitably involve Euclidean or Mahalanobis distance. With industrial processes becoming more and more complex and integrated, binary variables also appear among the monitored variables besides continuous variables, which makes process monitoring more challenging. The aforementioned traditional approaches are unable to mine the information of binary variables, so the useful information contained in them is usually discarded during data preprocessing. To solve the problem, this paper focuses on the issue of hybrid variable monitoring (HVM) and proposes a novel unsupervised framework of process monitoring with hybrid variables. HVM is addressed in the probabilistic framework, which can effectively exploit the process information implicit in both continuous and binary variables at the same time. In HVM, the statistics and the monitoring strategy suitable for hybrid variables with only healthy state data are defined and the physical explanation behind the framework is elaborated. In addition, the estimation of parameters required in HVM is derived in detail and the detectable condition of the proposed method is analyzed. Finally, the superiority of HVM is fully demonstrated first on a numerical simulation and then on an actual case of a thermal power plant.

[236]  arXiv:2110.09732 (cross-list from math.CO) [pdf, other]
Title: Eternal Domination and Clique Covering
Comments: 20 pages, submitted for publication
Subjects: Combinatorics (math.CO); Discrete Mathematics (cs.DM)

We study the relationship between the eternal domination number of a graph and its clique covering number. Using computational methods, we show that the smallest graph having its eternal domination number less than its clique covering number has $10$ vertices. This answers a question of Klostermeyer and Mynhardt [Protecting a graph with mobile guards, Appl. Anal. Discrete Math. $10$ $(2016)$, no. $1$, $1-29$]. We also determine the complete set of $10$-vertex and $11$-vertex graphs having eternal domination numbers less than their clique covering numbers. In addition, we study the problem on triangle-free graphs, circulant graphs, planar graphs and cubic graphs. Our computations show that all triangle-free graphs and all circulant graphs of order $12$ or less have eternal domination numbers equal to their clique covering numbers, and exhibit $13$ triangle-free graphs and $2$ circulant graphs of order $13$ which do not have this property. Using these graphs, we describe a method to generate an infinite family of triangle-free graphs and an infinite family of circulant graphs with eternal domination numbers less than their clique covering numbers. Our computations also show that all planar graphs of order $11$ or less, all $3$-connected planar graphs of order $13$ or less and all cubic graphs of order less than $18$ have eternal domination numbers equal to their clique covering numbers. Finally, we show that for any integer $k \geq 2$ there exist infinitely many graphs having domination number and eternal domination number equal to $k$ containing dominating sets which are not eternal dominating sets. This answers another question of Klostermeyer and Mynhardt [Eternal and Secure Domination in Graphs, Topics in domination in graphs, Dev. Math. $64$ $(2020)$, $445-478$, Springer, Cham].

[237]  arXiv:2110.09738 (cross-list from math.OC) [pdf, other]
Title: Faster Rates for the Frank-Wolfe Algorithm Using Jacobi Polynomials
Subjects: Optimization and Control (math.OC); Machine Learning (cs.LG)

The Frank-Wolfe algorithm (FW) is a popular projection-free alternative for solving large-scale constrained optimization problems. However, the FW algorithm suffers from a sublinear convergence rate when minimizing a smooth convex function over a compact convex set. Thus, exploring techniques that yield a faster convergence rate becomes crucial. A classic approach to obtain faster rates is to combine previous iterates to obtain the next iterate. In this work, we extend this approach to the FW setting and show that the optimal way to combine the past iterates is using a set of orthogonal Jacobi polynomials. We also propose a polynomial-based acceleration technique, referred to as Jacobi polynomial accelerated FW, which combines the current iterate with the past iterate using combining weights related to the Jacobi recursion. By carefully choosing parameters of the Jacobi polynomials, we obtain a faster sublinear convergence rate. We provide numerical experiments on real datasets to demonstrate the efficacy of the proposed algorithm.
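
For orientation, the projection-free step that the abstract builds on can be sketched as follows. This is a minimal illustration of plain (unaccelerated) Frank-Wolfe with the standard 2/(k+2) step size, not the Jacobi polynomial accelerated variant proposed in the paper. The function name `frank_wolfe_simplex`, the choice of the probability simplex as feasible set, and the quadratic test objective are all assumptions made for the example.

```python
def frank_wolfe_simplex(grad, x0, num_iters=200):
    """Plain Frank-Wolfe over the probability simplex (illustrative sketch)."""
    x = list(x0)
    for k in range(num_iters):
        g = grad(x)
        # Linear minimization oracle: over the simplex, <g, s> is minimized
        # at the vertex e_i with the smallest gradient coordinate.
        i = min(range(len(x)), key=lambda j: g[j])
        gamma = 2.0 / (k + 2.0)  # standard open-loop step size
        # A convex combination of feasible points stays feasible,
        # so no projection is needed.
        x = [(1.0 - gamma) * xj for xj in x]
        x[i] += gamma
    return x


# Hypothetical test problem: minimize 0.5 * ||x - b||^2 over the simplex.
# Because b already lies in the simplex, the minimizer is b itself.
b = [0.1, 0.6, 0.3]
grad = lambda x: [xj - bj for xj, bj in zip(x, b)]
x_fw = frank_wolfe_simplex(grad, [1.0, 0.0, 0.0], num_iters=2000)
```

Each iterate is a convex combination of simplex vertices, which is why the method needs only a linear minimization oracle rather than a projection.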

[238]  arXiv:2110.09744 (cross-list from eess.IV) [pdf, ps, other]
Title: Spectral Variability Augmented Sparse Unmixing of Hyperspectral Images
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Signal Processing (eess.SP)

Spectral unmixing (SU), which expresses the mixed pixels in hyperspectral images as the product of endmembers and abundances, has been widely used in hyperspectral imagery analysis. However, the influence of light, acquisition conditions and the inherent properties of materials means that the identified endmembers can vary spectrally within a given image (construed as spectral variability). To address this issue, recent methods usually use an a priori obtained spectral library to represent multiple characteristic spectra of the same object, but few of them extract the spectral variability explicitly. In this paper, a spectral variability augmented sparse unmixing model (SVASU) is proposed, in which the spectral variability is extracted for the first time. The variable spectra are divided into two parts, intrinsic spectrum and spectral variability, for spectral reconstruction, and are modeled simultaneously in the SU model with regularization terms restricting the sparsity of abundance and the generalization of the variability coefficient. Note that the spectral variability library and the intrinsic spectral library are both constructed from the in-situ observed image. Experimental results over both synthetic and real-world data sets demonstrate that the augmented decomposition by spectral variability significantly improves the unmixing performance over decomposition by the spectral library alone, as well as over state-of-the-art algorithms.

[239]  arXiv:2110.09807 (cross-list from stat.ML) [pdf, other]
Title: Learning to Learn Graph Topologies
Comments: Accepted at NeurIPS 2021
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Social and Information Networks (cs.SI); Signal Processing (eess.SP)

Learning a graph topology to reveal the underlying relationship between data entities plays an important role in various machine learning and data analysis tasks. Under the assumption that structured data vary smoothly over a graph, the problem can be formulated as a regularised convex optimisation over a positive semidefinite cone and solved by iterative algorithms. Classic methods require an explicit convex function to reflect generic topological priors, e.g. the $\ell_1$ penalty for enforcing sparsity, which limits the flexibility and expressiveness in learning rich topological structures. We propose to learn a mapping from node data to the graph structure based on the idea of learning to optimise (L2O). Specifically, our model first unrolls an iterative primal-dual splitting algorithm into a neural network. The key structural proximal projection is replaced with a variational autoencoder that refines the estimated graph with enhanced topological properties. The model is trained in an end-to-end fashion with pairs of node data and graph samples. Experiments on both synthetic and real-world data demonstrate that our model is more efficient than classic iterative algorithms in learning a graph with specific topological properties.

[240]  arXiv:2110.09815 (cross-list from cond-mat.mtrl-sci) [pdf, other]
Title: Microstructure reconstruction via artificial neural networks: A combination of causal and non-causal approach
Comments: 6 pages, 4 figures, and 7 tables
Subjects: Materials Science (cond-mat.mtrl-sci); Computer Vision and Pattern Recognition (cs.CV)

We investigate the applicability of artificial neural networks (ANNs) in reconstructing a sample image of a sponge-like microstructure. We propose to reconstruct the image by predicting the phase of the current pixel based on its causal neighbourhood, and subsequently, use a non-causal ANN model to smooth out the reconstructed image as a form of post-processing. We also consider the impacts of different configurations of the ANN model (e.g. number of densely connected layers, number of neurons in each layer, the size of both the causal and non-causal neighbourhood) on the models' predictive abilities quantified by the discrepancy between the spatial statistics of the reference and the reconstructed sample.

[241]  arXiv:2110.09835 (cross-list from math.CA) [pdf, other]
Title: Generalised Wendland functions for the sphere
Subjects: Classical Analysis and ODEs (math.CA); Numerical Analysis (math.NA)

In this paper we compute the spherical Fourier expansion coefficients for the restriction of the generalised Wendland functions from $d$-dimensional Euclidean space to the $(d-1)$-dimensional unit sphere. The development required to derive these coefficients relies heavily upon known asymptotic results for hypergeometric functions and the final result shows that they can be expressed in closed form as a multiple of a certain $_{3}F_{2}$ hypergeometric function. Using the closed form expressions we are able to provide the precise asymptotic rates of decay for the spherical Fourier coefficients which we observe have a close connection to the asymptotic decay rate of the corresponding Euclidean Fourier transform.

[242]  arXiv:2110.09840 (cross-list from math.PR) [pdf, other]
Title: Stability analysis of two-class retrial systems with constant retrial rates and general service times
Subjects: Probability (math.PR); Networking and Internet Architecture (cs.NI); Performance (cs.PF)

We establish a stability criterion for a two-class retrial system with Poisson inputs, general class-dependent service times and class-dependent constant retrial rates. We also characterise an interesting phenomenon of partial stability when one orbit is tight but the other orbit goes to infinity in probability. All theoretical results are illustrated by numerical experiments.

[243]  arXiv:2110.09841 (cross-list from eess.IV) [pdf, other]
Title: Cutting Voxel Projector a New Approach to Construct 3D Cone Beam CT Operator
Authors: Vojtěch Kulvait (1), Georg Rose (1) ((1) Institute for Medical Engineering and Research Campus STIMULATE, University of Magdeburg, Magdeburg, Germany)
Comments: 10 pages, 4 figures
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Numerical Analysis (math.NA); Medical Physics (physics.med-ph)

In this paper, we introduce a new class of projectors for 3D cone beam tomographic reconstruction. We find analytical formulas for the relationship between the voxel volume projected onto a given detector pixel and its contribution to the extinction value detected on that pixel. Using this approach, we construct a near-exact projector and backprojector that can be used especially for algebraic reconstruction techniques. We have implemented this cutting voxel projector and a less accurate, speed-optimized version of it together with two established projectors, a ray tracing projector based on Siddon's algorithm and a TT footprint projector. We show that the cutting voxel projector achieves, especially for large cone beam angles, noticeably higher accuracy than the TT projector. Moreover, our implementation of the relaxed version of the cutting voxel projector is significantly faster than current footprint projector implementations. We further show that Siddon's algorithm with comparable accuracy would be much slower than the cutting voxel projector. All algorithms are implemented within an open source framework for algebraic reconstruction in OpenCL 1.2 and C++ and are optimized for GPU computation. They are published as open-source software under the GNU GPL 3 license, see https://github.com/kulvait/KCT_cbct.

[244]  arXiv:2110.09854 (cross-list from math.OC) [pdf, ps, other]
Title: A New Extension of Chubanov's Method to Symmetric Cones
Comments: 47 pages
Subjects: Optimization and Control (math.OC); Numerical Analysis (math.NA)

We propose a new variant of Chubanov's method for solving the feasibility problem over the symmetric cone by extending Roos's method (2018) for the feasibility problem over the nonnegative orthant. The proposed method considers a feasibility problem associated with a norm induced by the maximum eigenvalue of an element and uses a rescaling focusing on the upper bound of the sum of eigenvalues of any feasible solution to the problem. Its computational bound is (i) equivalent to Roos's original method (2018) and superior to Lourenço et al.'s method (2019) when the symmetric cone is the nonnegative orthant, (ii) superior to Lourenço et al.'s method (2019) when the symmetric cone is a Cartesian product of second-order cones, and (iii) equivalent to Lourenço et al.'s method (2019) when the symmetric cone is the simple positive semidefinite cone, under the assumption that the costs of computing the spectral decomposition and the minimum eigenvalue are of the same order for any given symmetric matrix.
We also conduct numerical experiments that compare the performance of our method with existing methods by generating instances of three types: (i) strongly (but ill-conditioned) feasible instances, (ii) weakly feasible instances, and (iii) infeasible instances. For any of these instances, the proposed method is rather more efficient than the existing methods in terms of accuracy and execution time.

[245]  arXiv:2110.09860 (cross-list from eess.IV) [pdf, other]
Title: Bilateral-ViT for Robust Fovea Localization
Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)

The fovea is an important anatomical landmark of the retina. Detecting the location of the fovea is essential for the analysis of many retinal diseases. However, robust fovea localization remains a challenging problem, as the fovea region often appears fuzzy, and retinal diseases may further obscure its appearance. This paper proposes a novel vision transformer (ViT) approach that integrates information both inside and outside the fovea region to achieve robust fovea localization. Our proposed network named Bilateral-Vision-Transformer (Bilateral-ViT) consists of two network branches: a transformer-based main network branch for integrating global context across the entire fundus image and a vessel branch for explicitly incorporating the structure of blood vessels. The encoded features from both network branches are subsequently merged with a customized multi-scale feature fusion (MFF) module. Our comprehensive experiments demonstrate that the proposed approach is significantly more robust for diseased images and establishes the new state of the art on both Messidor and PALM datasets.

[246]  arXiv:2110.09864 (cross-list from stat.ML) [pdf, other]
Title: Learning Pareto-Efficient Decisions with Confidence
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)

The paper considers the problem of multi-objective decision support when outcomes are uncertain. We extend the concept of Pareto-efficient decisions to take into account the uncertainty of decision outcomes across varying contexts. This enables quantifying trade-offs between decisions in terms of tail outcomes that are relevant in safety-critical applications. We propose a method for learning efficient decisions with statistical confidence, building on results from the conformal prediction literature. The method adapts to weak or nonexistent context covariate overlap and its statistical guarantees are evaluated using both synthetic and real data.
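
As background for the conformal prediction results the abstract refers to, the basic split conformal calibration step can be sketched as below. This is a generic textbook construction, not the paper's Pareto-efficiency method. The function name `conformal_radius` and the sample residuals are assumptions made for illustration.

```python
import math


def conformal_radius(residuals, alpha):
    """Half-width q such that prediction +/- q covers a new outcome with
    probability at least 1 - alpha, assuming exchangeable data.

    `residuals` are absolute errors |y_i - f(x_i)| on a held-out
    calibration set (illustrative input; any nonconformity score works).
    """
    n = len(residuals)
    # Finite-sample corrected rank: the ceil((n + 1)(1 - alpha))-th smallest score.
    k = math.ceil((n + 1) * (1.0 - alpha))
    if k > n:
        # Too few calibration points for this confidence level:
        # only the trivial (infinite) interval is valid.
        return math.inf
    return sorted(residuals)[k - 1]


# Hypothetical calibration residuals.
calibration = [0.5, 0.1, 0.3, 0.2, 0.4, 0.9, 0.7]
radius = conformal_radius(calibration, alpha=0.25)
```

With seven calibration points and alpha = 0.25, the corrected rank is ceil(8 * 0.75) = 6, so the radius is the sixth smallest residual.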

[247]  arXiv:2110.09890 (cross-list from eess.AS) [pdf, other]
Title: Multi-Modal Pre-Training for Automated Speech Recognition
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)

Traditionally, research in automated speech recognition has focused on local-first encoding of audio representations to predict the spoken phonemes in an utterance. Unfortunately, approaches relying on such hyper-local information tend to be vulnerable to both local-level corruption (such as audio-frame drops, or loud noises) and global-level noise (such as environmental noise, or background noise) that has not been seen during training. In this work, we introduce a novel approach which leverages a self-supervised learning technique based on masked language modeling to compute a global, multi-modal encoding of the environment in which the utterance occurs. We then use a new deep-fusion framework to integrate this global context into a traditional ASR method, and demonstrate that the resulting method can outperform baseline methods by up to 7% on Librispeech; gains on internal datasets range from 6% (on larger models) to 45% (on smaller models).

[248]  arXiv:2110.09891 (cross-list from quant-ph) [pdf]
Title: Development of Quantum Circuits for Perceptron Neural Network Training, Based on the Principles of Grover's Algorithm
Subjects: Quantum Physics (quant-ph); Emerging Technologies (cs.ET)

This paper presents a practical study of the possibility of constructing quantum circuits for training neural networks. The demonstrated quantum circuits were based on the principles of Grover's Search Algorithm. The perceptron was chosen as the architecture for the example neural network. The multilayer perceptron is a popular neural network architecture due to its scalability and applicability for solving a wide range of problems.
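
The principle of Grover's Search Algorithm mentioned above can be illustrated with a small classical state-vector simulation. This sketch shows only the generic oracle and diffusion steps; it is not the paper's perceptron training circuit, and the function name `grover_probabilities` and the marked indices are assumptions made for the example.

```python
import math


def grover_probabilities(num_qubits, marked, iterations):
    """Simulate Grover iterations on real amplitudes and return the
    measurement probability of every basis state."""
    n = 2 ** num_qubits
    # Start in the uniform superposition over all n basis states.
    amp = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        # Oracle: flip the sign of the marked state's amplitude.
        amp[marked] = -amp[marked]
        # Diffusion: reflect every amplitude about the mean amplitude.
        mean = sum(amp) / n
        amp = [2.0 * mean - a for a in amp]
    return [a * a for a in amp]


# Two qubits (4 states): a single iteration finds the marked state.
probs_2q = grover_probabilities(2, marked=3, iterations=1)

# Three qubits (8 states): about (pi / 4) * sqrt(8), i.e. 2 iterations.
probs_3q = grover_probabilities(3, marked=5, iterations=2)
```

For four states the first iteration moves all probability onto the marked state; for eight states two iterations raise it to roughly 0.945.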

[249]  arXiv:2110.09916 (cross-list from physics.plasm-ph) [pdf, ps, other]
Title: Identification of high order closure terms from fully kinetic simulations using machine learning
Comments: 36 pages, 9 figures, 8 tables, submitted in AIP Physics of Plasma
Subjects: Plasma Physics (physics.plasm-ph); Machine Learning (cs.LG); Computational Physics (physics.comp-ph)

Simulations of large-scale plasma systems are typically based on fluid approximations. However, these methods do not capture the small-scale physical processes available to fully kinetic models. Traditionally, empirical closure terms are used to express high order moments of the Boltzmann equation, e.g. the pressure tensor and heat flux. In this paper, we propose different closure terms extracted using machine learning techniques as an alternative. We show in this work how two different machine learning models, a multi-layer perceptron and a gradient boosting regressor, can synthesize higher-order moments extracted from a fully kinetic simulation. The accuracy of the models and their ability to generalize are evaluated and compared to a baseline model. When trained from more extreme simulations, the models showed better extrapolation in comparison to traditional simulations, indicating the importance of outliers. We learn that both models can capture heat flux and pressure tensor very well, with the gradient boosting regressor being the most stable of the two models in terms of the accuracy. The performance of the tested models in the regression task opens the way for new experiments in multi-scale modelling.

[250]  arXiv:2110.09917 (cross-list from math.OC) [pdf, ps, other]
Title: Planning for Package Deliveries in Risky Environments Over Multiple Epochs
Subjects: Optimization and Control (math.OC); Discrete Mathematics (cs.DM); Probability (math.PR)

We study a risk-aware robot planning problem where a dispatcher must construct a package delivery plan that maximizes the expected reward for a robot delivering packages across multiple epochs. Each package has an associated reward for delivery and a risk of failure. If the robot fails while delivering a package, no future packages can be delivered and the cost of replacing the robot is incurred. The package delivery plan takes place over the course of either a finite or an infinite number of epochs, denoted as the finite horizon problem and infinite horizon problem, respectively. The dispatcher has to weigh the risk and reward of delivering packages during any given epoch against the potential loss of any future epoch's reward. By using the ratio between a package's reward and its risk of failure, we prove an optimal, greedy solution to both the infinite and finite horizon problems. The finite horizon problem can be solved optimally in $O(K n\log n)$ time where $K$ is the number of epochs and $n$ is the number of packages. We show an isomorphism between the infinite horizon problem and Markov Decision Processes to prove an optimal $O(n)$ time algorithm for the infinite horizon problem.
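
To make the reward-to-risk idea concrete, here is a sketch under a deliberately simplified single-epoch model that is an assumption of this example, not the paper's formulation: packages are attempted in sequence, each attempt fails independently with its own probability, and a failure ends the run and incurs a replacement cost. In this toy model an adjacent-exchange argument gives the sorting key reward * (1 - risk) / risk; the exact ratio and model used in the paper may differ. The names `expected_reward` and `greedy_order` and the sample data are hypothetical.

```python
from itertools import permutations


def expected_reward(order, replacement_cost):
    """Expected net reward of attempting (reward, failure_prob) pairs in order."""
    survive = 1.0  # probability the robot is still operating
    total = 0.0
    for reward, p_fail in order:
        # The robot is lost on this leg with probability survive * p_fail.
        total -= survive * p_fail * replacement_cost
        survive *= 1.0 - p_fail
        total += survive * reward
    return total


def greedy_order(packages):
    """Sort by reward * (1 - risk) / risk, highest first (requires risk > 0)."""
    return sorted(packages, key=lambda rp: rp[0] * (1.0 - rp[1]) / rp[1], reverse=True)


# Hypothetical packages: (reward, probability of failure).
packages = [(10.0, 0.1), (4.0, 0.02), (7.0, 0.3), (1.0, 0.01)]
greedy_value = expected_reward(greedy_order(packages), replacement_cost=5.0)
best_value = max(expected_reward(p, replacement_cost=5.0) for p in permutations(packages))
```

Comparing `greedy_value` against the brute-force `best_value` over all orderings is a quick sanity check that the greedy rule is optimal in this toy setting.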

[251]  arXiv:2110.09921 (cross-list from math.LO) [pdf, ps, other]
Title: Normalisation and subformula property for a system of intuitionistic logic with general introduction and elimination rules
Authors: Nils Kürbis
Comments: arXiv admin note: substantial text overlap with arXiv:2108.03939
Journal-ref: Synthese 2021
Subjects: Logic (math.LO); Logic in Computer Science (cs.LO)

This paper studies a formalisation of intuitionistic logic by Negri and von Plato which has general introduction and elimination rules. The philosophical importance of the system is expounded. Definitions of `maximal formula', `segment' and `maximal segment' suitable to the system are formulated and corresponding reduction procedures for maximal formulas and permutative reduction procedures for maximal segments given. Alternatives to the main method used are also considered. It is shown that deductions in the system convert into normal form and that deductions in normal form have the subformula property.

[252]  arXiv:2110.09923 (cross-list from eess.AS) [pdf, other]
Title: Speech Enhancement-assisted Stargan Voice Conversion in Noisy Environments
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)

Numerous voice conversion (VC) techniques have been proposed for the conversion of voices among different speakers. Although converted speech of decent quality can be obtained when VC is applied in a clean environment, the quality drops sharply when the system runs under noisy conditions. To address this issue, we propose a novel enhancement-based StarGAN (E-StarGAN) VC system, which leverages a speech enhancement (SE) technique for signal pre-processing. SE systems are generally used to reduce noise components in noisy speech and to generate enhanced speech for downstream application tasks. Therefore, we investigated the effectiveness of E-StarGAN, which combines VC and SE, and demonstrated the robustness of the proposed approach in various noisy environments. The results of VC experiments conducted on a Mandarin dataset show that when combined with SE, the proposed E-StarGAN VC model is robust to unseen noises. In addition, the subjective listening test results show that the proposed E-StarGAN model can improve the sound quality of speech signals converted from noise-corrupted source utterances.

[253]  arXiv:2110.09924 (cross-list from eess.AS) [pdf, ps, other]
Title: Speech Enhancement Based on Cyclegan with Noise-informed Training
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)

Speech enhancement (SE) approaches can be classified into supervised and unsupervised categories. For unsupervised SE, a well-known cycle-consistent generative adversarial network (CycleGAN) model, which comprises two generators and two discriminators, has been shown to provide a powerful nonlinear mapping ability and thus achieve a promising noise-suppression capability. However, a low-efficiency training process along with insufficient knowledge between noisy and clean speech may limit the enhancement performance of the CycleGAN SE at runtime. In this study, we propose a novel noise-informed-training CycleGAN approach that incorporates additional inputs into the generators and discriminators to assist the CycleGAN in learning a more accurate transformation of speech signals between the noise and clean domains. The additional input feature serves as an indicator that provides more information during the CycleGAN training stage. Experiment results confirm that the proposed approach can improve the CycleGAN SE model while achieving a better sound quality and fewer signal distortions.

[254]  arXiv:2110.09927 (cross-list from eess.IV) [pdf, other]
Title: Conditional De-Identification of 3D Magnetic Resonance Images
Subjects: Image and Video Processing (eess.IV); Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

Privacy protection of medical image data is challenging. Even if metadata is removed, brain scans are vulnerable to attacks that match renderings of the face to facial image databases. Solutions have been developed to de-identify diagnostic scans by obfuscating or removing parts of the face. However, these solutions either fail to reliably hide the patient's identity or are so aggressive that they impair further analyses. We propose a new class of de-identification techniques that, instead of removing facial features, remodels them. Our solution relies on a conditional multi-scale GAN architecture. It takes a patient's MRI scan as input and generates a 3D volume conditioned on the patient's brain, which is preserved exactly, but where the face has been de-identified through remodeling. We demonstrate that our approach preserves privacy far better than existing techniques, without compromising downstream medical analyses. Analyses were run on the OASIS-3 and ADNI corpora.

[255]  arXiv:2110.09928 (cross-list from eess.AS) [pdf, other]
Title: CycleFlow: Purify Information Factors by Cycle Loss
Comments: Submitted to ICASSP 2022
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)

SpeechFlow is a powerful factorization model based on information bottleneck (IB), and its effectiveness has been reported by several studies. A potential problem of SpeechFlow, however, is that if the IB channels are not well designed, the resultant factors cannot be well disentangled. In this study, we propose a CycleFlow model that combines random factor substitution and cycle loss to solve this problem. Experiments on voice conversion tasks demonstrate that this simple technique can effectively reduce mutual information among individual factors, and produce clearly better conversion than the IB-based SpeechFlow. CycleFlow can also be used as a powerful tool for speech editing. We demonstrate this usage by an emotion perception experiment.

[256]  arXiv:2110.09930 (cross-list from eess.AS) [pdf, other]
Title: Speech Representation Learning Through Self-supervised Pretraining And Multi-task Finetuning
Subjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)

Speech representation learning plays a vital role in speech processing. Among them, self-supervised learning (SSL) has become an important research direction. It has been shown that an SSL pretraining model can achieve excellent performance in various downstream tasks of speech processing. On the other hand, supervised multi-task learning (MTL) is another representation learning paradigm, which has been proven effective in computer vision (CV) and natural language processing (NLP). However, there is no systematic research on the general representation learning model trained by supervised MTL in speech processing. In this paper, we show that MTL finetuning can further improve SSL pretraining. We analyze the generalizability of supervised MTL finetuning to examine if the speech representation learned by MTL finetuning can generalize to unseen new tasks.

[257]  arXiv:2110.09932 (cross-list from eess.SP) [pdf, other]
Title: Multipath-based Localization and Tracking considering Off-Body Channel Effects
Comments: 5 pages, 4 figures. Submitted to IEEE EuCAP-22
Subjects: Signal Processing (eess.SP); Systems and Control (eess.SY)

This paper deals with multipath-based positioning and tracking in off-body channels. An analysis of the effects introduced by the human body and the implications on positioning and tracking is presented based on channel measurements obtained in an indoor scenario. It shows the influence of the radio signal bandwidth on the human body induced field of view (FOV) and the number of multipath components (MPCs) detected and estimated by a deterministic maximum likelihood (ML) algorithm. A multipath-based positioning and tracking algorithm is proposed that associates these estimated MPC parameters with floor plan features and exploits a human body-dependent FOV function. The proposed algorithm is able to provide accurate position estimates even for an off-body radio channel in a multipath-prone environment with the signal bandwidth found to be a limiting factor.

[258]  arXiv:2110.09948 (cross-list from eess.SP) [pdf]
Title: Analysis of False Data Injection Impact on AI based Solar Photovoltaic Power Generation Forecasting
Comments: 6 pages, 3 figures, 3 tables, 2021 International Conference & Exposition on Modern Energy and Power Systems (ICMEPS2021)
Subjects: Signal Processing (eess.SP); Machine Learning (cs.LG)

The use of solar photovoltaics (PV) energy provides additional resources to the electric power grid. The downside of this integration is that the solar power supply is unreliable and highly dependent on weather conditions. The predictability and stability of forecasting are critical for the full utilization of solar power. This study reviews and evaluates various machine learning-based models for solar PV power generation forecasting using a public dataset. Furthermore, the root mean squared error (RMSE), mean squared error (MSE), and mean absolute error (MAE) metrics are used to evaluate the results. Linear Regression, Gaussian Process Regression, K-Nearest Neighbor, Decision Trees, Gradient Boosting Regression Trees, Multi-layer Perceptron, and Support Vector Regression algorithms are assessed. Their responses against false data injection attacks are also investigated. The Multi-layer Perceptron Regression method shows robust prediction on both regular and noise injected datasets over other methods.
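
The three error metrics named in this abstract can be sketched as follows. This is a minimal illustration using the standard textbook definitions, not the authors' evaluation code; the function names and the sample values are assumptions chosen for the example.

```python
import math


def mse(y_true, y_pred):
    """Mean squared error: average of the squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: square root of the MSE, in the units of the target."""
    return math.sqrt(mse(y_true, y_pred))


def mae(y_true, y_pred):
    """Mean absolute error: average of the absolute residuals."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical measured vs. forecast PV power values (illustrative only).
measured = [1.0, 2.0, 3.0]
forecast = [1.0, 2.0, 5.0]
print(mse(measured, forecast), rmse(measured, forecast), mae(measured, forecast))
```

Because MSE and RMSE square the residuals, a single large forecast error (such as one caused by an injected false reading) affects them more strongly than it affects MAE.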

[259]  arXiv:2110.09955 (cross-list from eess.SP) [pdf, other]
Title: Positional-Spectral-Temporal Attention in 3D Convolutional Neural Networks for EEG Emotion Recognition
Subjects: Signal Processing (eess.SP); Computer Vision and Pattern Recognition (cs.CV)

Recognizing the feelings of human beings plays a critical role in our daily communication. Neuroscience has demonstrated that different emotion states present different degrees of activation in different brain regions, EEG frequency bands and temporal stamps. In this paper, we propose a novel structure to explore the informative EEG features for emotion recognition. The proposed module, denoted by PST-Attention, consists of Positional, Spectral and Temporal Attention modules to explore more discriminative EEG features. Specifically, the Positional Attention module captures the activated regions stimulated by different emotions in the spatial dimension. The Spectral and Temporal Attention modules assign the weights of different frequency bands and temporal slices respectively. Our method is adaptive as well as efficient, and can be fit into 3D Convolutional Neural Networks (3D-CNN) as a plug-in module. We conduct experiments on two real-world datasets. 3D-CNN combined with our module achieves promising results and demonstrates that the PST-Attention is able to capture stable patterns for emotion recognition from EEG.

[260]  arXiv:2110.09956 (cross-list from eess.SP) [pdf, other]
Title: Food Odor Recognition via Multi-step Classification
Subjects: Signal Processing (eess.SP); Computers and Society (cs.CY)

Predicting food labels and freshness from its odor remains a decades-old task that requires a complicated algorithm combined with high sensitivity sensors. In this paper, we initiate a multi-step classifier, which firstly clusters food into four categories, then classifies the food label concerning the predicted category, and finally identifies the freshness. We use BME688 gas sensors packed with BME AI studio for data collection and feature extraction. The normalized dataset was preprocessed with PCA and LDA. We evaluated the effectiveness of algorithms such as tree methods, MLP, and CNN through assessment indexes at each stage. We also carried out an ablation experiment to show the necessity and feasibility of the multi-step classifier. The results demonstrated the robustness and adaptability of the multi-step classifier.

[261]  arXiv:2110.09958 (cross-list from eess.AS) [pdf, other]
Title: The Cocktail Fork Problem: Three-Stem Audio Separation for Real-World Soundtracks
Comments: Submitted to ICASSP2022. For resources and examples, see this https URL
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)

The cocktail party problem aims at isolating any source of interest within a complex acoustic scene, and has long inspired audio source separation research. Recent efforts have mainly focused on separating speech from noise, speech from speech, musical instruments from each other, or sound events from each other. However, separating an audio mixture (e.g., movie soundtrack) into the three broad categories of speech, music, and sound effects (here understood to include ambient noise and natural sound events) has been left largely unexplored, despite a wide range of potential applications. This paper formalizes this task as the cocktail fork problem, and presents the Divide and Remaster (DnR) dataset to foster research on this topic. DnR is built from three well-established audio datasets (LibriVox, FMA, FSD50k), taking care to reproduce conditions similar to professionally produced content in terms of source overlap and relative loudness, and made available at CD quality. We benchmark standard source separation algorithms on DnR, and further introduce a new mixed-STFT-resolution model to better address the variety of acoustic characteristics of the three source types. Our best model produces SI-SDR improvements over the mixture of 11.3 dB for music, 11.8 dB for speech, and 10.9 dB for sound effects.
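
The results in this abstract are reported as SI-SDR improvements. The sketch below shows the commonly used definition of scale-invariant signal-to-distortion ratio for a single-channel signal. It is an illustration under that assumption, not the authors' evaluation code, and the function name and toy signals are hypothetical.

```python
import math


def si_sdr(reference, estimate):
    """Scale-invariant SDR in dB between a reference and an estimated signal.

    The estimate is projected onto the reference, so that rescaling the
    estimate does not change the score. Assumes the reference is not all
    zeros and the estimate is not a perfect scaled copy of the reference
    (otherwise the residual energy is zero).
    """
    dot = sum(r * e for r, e in zip(reference, estimate))
    ref_energy = sum(r * r for r in reference)
    alpha = dot / ref_energy                      # optimal scaling factor
    target = [alpha * r for r in reference]       # part explained by the reference
    residual = [e - t for e, t in zip(estimate, target)]
    target_energy = sum(t * t for t in target)
    residual_energy = sum(n * n for n in residual)
    return 10.0 * math.log10(target_energy / residual_energy)


# Toy signals (illustrative only).
reference = [1.0, 0.0, -1.0, 0.0]
estimate = [1.0, 0.1, -1.0, -0.1]
print(si_sdr(reference, estimate))
```

An SI-SDR improvement is then the difference between `si_sdr(reference, separated)` and `si_sdr(reference, mixture)`, where the mixture itself is used as the baseline estimate.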

[262]  arXiv:2110.09966 (cross-list from eess.SP) [pdf, other]
Title: SleepPriorCL: Contrastive Representation Learning with Prior Knowledge-based Positive Mining and Adaptive Temperature for Sleep Staging
Subjects: Signal Processing (eess.SP); Machine Learning (cs.LG)

The objective of this paper is to learn semantic representations for sleep stage classification from raw physiological time series. Although supervised methods have gained remarkable performance, they are limited in clinical situations due to the requirement of fully labeled data. Self-supervised learning (SSL) based on contrasting semantically similar (positive) and dissimilar (negative) pairs of samples has achieved promising success. However, existing SSL methods suffer from the problem that many semantically similar positives are still uncovered and even treated as negatives. In this paper, we propose a novel SSL approach named SleepPriorCL to alleviate the above problem. Advances of our approach over existing SSL methods are two-fold: 1) by incorporating prior domain knowledge into the training regime of SSL, more semantically similar positives are discovered without accessing ground-truth labels; 2) via investigating the influence of the temperature in contrastive loss, an adaptive temperature mechanism for each sample according to prior domain knowledge is further proposed, leading to better performance. Extensive experiments demonstrate that our method achieves state-of-the-art performance and consistently outperforms baselines.

[263]  arXiv:2110.09968 (cross-list from eess.SP) [pdf, ps, other]
Title: Can Dynamic TDD Enabled Half-Duplex Cell-Free Massive MIMO Outperform Full-Duplex Cellular Massive MIMO?
Subjects: Signal Processing (eess.SP); Information Theory (cs.IT)

We consider a dynamic time division duplex (DTDD) enabled cell-free massive multiple-input multiple-output (CF-mMIMO) system, where each half-duplex (HD) access point (AP) is scheduled to operate in the uplink (UL) or downlink (DL) mode based on the data demands of the user equipments (UEs). The goal is to maximize the sum UL-DL spectral efficiency (SE). We theoretically establish the sub-modularity of the sum SE, which allows us to develop a new, low complexity, greedy algorithm for the combinatorial AP scheduling problem, with guaranteed optimality properties. We also consider pilot sequence reuse among the UEs to limit the channel estimation overhead. In CF systems, all the APs estimate the channel from every UE, making the pilot allocation problem different from the cellular case. We develop a novel algorithm that iteratively minimizes the maximum pilot contamination across the UEs. We compare our solutions, both theoretically and via simulations, against a full duplex (FD) multi-cell mMIMO system. Our results show that, due to the joint processing of the signals at the central processing unit, CF-mMIMO with dynamic HD AP-scheduling significantly outperforms cellular FD-mMIMO in terms of the sum SE and 90% likely SE. Thus, DTDD enabled HD CF-mMIMO is a promising alternative to cellular FD-mMIMO, without the cost of hardware for self-interference suppression.

[264]  arXiv:2110.09971 (cross-list from stat.ME) [pdf, other]
Title: Fully Three-dimensional Radial Visualization
Comments: 10 pages, 7 figures, 1 table
Subjects: Methodology (stat.ME); Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Applications (stat.AP); Machine Learning (stat.ML)

We develop methodology for three-dimensional (3D) radial visualization (RadViz) of multidimensional datasets. The classical two-dimensional (2D) RadViz visualizes multivariate data in the 2D plane by mapping every observation to a point inside the unit circle. Our tool, RadViz3D, distributes anchor points uniformly on the 3D unit sphere. We show that this uniform distribution provides the best visualization with minimal artificial visual correlation for data with uncorrelated variables. However, anchor points can be placed exactly equi-distant from each other only for the five Platonic solids, so we provide equi-distant anchor points for these five settings, and approximately equi-distant anchor points via a Fibonacci grid for the other cases. Our methodology, implemented in the R package $radviz3d$, makes fully 3D RadViz possible and is shown to improve the ability of this nonlinear technique in more faithfully displaying simulated data as well as the crabs, olive oils and wine datasets. Additionally, because radial visualization is naturally suited for compositional data, we use RadViz3D to illustrate (i) the chemical composition of Longquan celadon ceramics and their Jingdezhen imitation over centuries, and (ii) US regional SARS-CoV-2 variants' prevalence in the Covid-19 pandemic during the summer 2021 surge of the Delta variant.
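
The anchor placement described here can be sketched with a standard Fibonacci (golden-angle) lattice on the unit sphere, followed by the classical RadViz weighted-average mapping. This is an illustrative sketch, not the `radviz3d` implementation: the function names are hypothetical, the exact lattice used by the package may differ, and the projection assumes non-negative (for example min-max scaled) variable values.

```python
import math


def fibonacci_sphere(n):
    """Return n approximately equi-distant points on the unit sphere."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        z = 1.0 - (2.0 * i + 1.0) / n            # evenly spaced heights in (-1, 1)
        r = math.sqrt(max(0.0, 1.0 - z * z))     # radius of the circle at height z
        phi = i * golden_angle                   # rotate by the golden angle each step
        points.append((r * math.cos(phi), r * math.sin(phi), z))
    return points


def radviz_project(values, anchors):
    """Map one observation to the weighted average of the anchor points.

    Each variable pulls the point toward its anchor in proportion to its
    (non-negative) value, as in classical RadViz.
    """
    total = sum(values)
    if total == 0:
        return (0.0, 0.0, 0.0)
    return tuple(
        sum(v * a[k] for v, a in zip(values, anchors)) / total for k in range(3)
    )


anchors = fibonacci_sphere(7)                     # one anchor per variable
print(radviz_project([0.2, 0.9, 0.1, 0.0, 0.5, 0.3, 0.7], anchors))
```

Every projected point is a convex combination of the anchors, so it always lies inside the unit ball, which mirrors the 2D case where observations fall inside the unit circle.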

[265]  arXiv:2110.09983 (cross-list from eess.SP) [pdf, other]
Title: ECG-ATK-GAN: Robustness against Adversarial Attacks on ECG using Conditional Generative Adversarial Networks
Comments: 5 pages, 2 figures, 2 tables
Subjects: Signal Processing (eess.SP); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)

Recently, deep learning has reached human-level performance in classifying arrhythmia from Electrocardiogram (ECG). However, deep neural networks (DNN) are vulnerable to adversarial attacks, which can cause ECG signals to be misclassified, decreasing the model's precision. Adversarial attacks are crafted perturbations injected into data that cause conventional DNN models to misclassify the correct class. Thus, safety concerns arise as it becomes challenging to establish the system's reliability, given that clinical applications require high levels of trust. To mitigate this problem and make DNN models more robust in clinical and real-life settings, we introduce a novel Conditional Generative Adversarial Network (GAN), robust against adversarially attacked ECG signals and retaining high accuracy. Furthermore, we compared it with other state-of-the-art models to detect cardiac abnormalities from indistinguishable adversarially attacked ECGs. The experiment confirms that our model is more robust against adversarial attacks compared to other architectures.

[266]  arXiv:2110.09992 (cross-list from eess.IV) [pdf, other]
Title: ERQA: Edge-Restoration Quality Assessment for Video Super-Resolution
Comments: preprint
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)

Despite the growing popularity of video super-resolution (VSR), there is still no good way to assess the quality of the restored details in upscaled frames. Some SR methods may produce the wrong digit or an entirely different face. Whether a method's results are trustworthy depends on how well it restores truthful details. Image super-resolution can use natural distributions to produce a high-resolution image that is only somewhat similar to the real one. VSR enables exploration of additional information in neighboring frames to restore details from the original scene. The ERQA metric, which we propose in this paper, aims to estimate a model's ability to restore real details using VSR. On the assumption that edges are significant for detail and character recognition, we chose edge fidelity as the foundation for this metric. Experimental validation of our work is based on the MSU Video Super-Resolution Benchmark, which includes the most difficult patterns for detail restoration and verifies the fidelity of details from the original frame. Code for the proposed metric is publicly available at https://github.com/msu-video-group/ERQA.

[267]  arXiv:2110.09997 (cross-list from eess.SP) [pdf, ps, other]
Title: Hybrid-Layers Neural Network Architectures for Modeling the Self-Interference in Full-Duplex Systems
Comments: 37 pages, 10 figures, to appear in the IEEE transactions on vehicular technology
Subjects: Signal Processing (eess.SP); Information Theory (cs.IT); Machine Learning (cs.LG)

Full-duplex (FD) systems have been introduced to provide high data rates for beyond fifth-generation wireless networks through simultaneous transmission of information over the same frequency resources. However, the operation of FD systems is practically limited by the self-interference (SI), and efficient SI cancelers are sought to make the FD systems realizable. Typically, polynomial-based cancelers are employed to mitigate the SI; nevertheless, they suffer from high complexity. This article proposes two novel hybrid-layers neural network (NN) architectures to cancel the SI with low complexity. The first architecture is referred to as hybrid-convolutional recurrent NN (HCRNN), whereas the second is termed as hybrid-convolutional recurrent dense NN (HCRDNN). In contrast to the state-of-the-art NNs that employ dense or recurrent layers for SI modeling, the proposed NNs exploit, in a novel manner, a combination of different hidden layers (e.g., convolutional, recurrent, and/or dense) in order to model the SI with lower computational complexity than the polynomial and the state-of-the-art NN-based cancelers. The key idea behind using hybrid layers is to build an NN model, which makes use of the characteristics of the different layers employed in its architecture. More specifically, in the HCRNN, a convolutional layer is employed to extract the input data features using a reduced network scale. Moreover, a recurrent layer is then applied to assist in learning the temporal behavior of the input signal from the localized feature map of the convolutional layer. In the HCRDNN, an additional dense layer is exploited to add another degree of freedom for adapting the NN settings in order to achieve the best compromise between the cancellation performance and computational complexity. Complexity analysis and numerical simulations are provided to prove the superiority of the proposed architectures.

[268]  arXiv:2110.10005 (cross-list from eess.SP) [pdf, other]
Title: Data-driven and Automatic Surface Texture Analysis Using Persistent Homology
Subjects: Signal Processing (eess.SP); Computational Geometry (cs.CG); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

Surface roughness plays an important role in analyzing engineering surfaces. It quantifies the surface topography and can be used to determine whether the resulting surface finish is acceptable or not. Nevertheless, while several existing tools and standards are available for computing surface roughness, these methods rely heavily on user input thus slowing down the analysis and increasing manufacturing costs. Therefore, fast and automatic determination of the roughness level is essential to avoid costs resulting from surfaces with unacceptable finish, and user-intensive analysis. In this study, we propose a Topological Data Analysis (TDA) based approach to classify the roughness level of synthetic surfaces using both their areal images and profiles. We utilize persistent homology from TDA to generate persistence diagrams that encapsulate information on the shape of the surface. We then obtain feature matrices for each surface or profile using Carlsson coordinates, persistence images, and template functions. We compare our results to two widely used methods in the literature: Fast Fourier Transform (FFT) and Gaussian filtering. The results show that our approach yields mean accuracies as high as 97%. We also show that, in contrast to existing surface analysis tools, our TDA-based approach is fully automatable and provides adaptive feature extraction.

[269]  arXiv:2110.10026 (cross-list from eess.AS) [pdf, ps, other]
Title: Private Language Model Adaptation for Speech Recognition
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)

Speech model adaptation is crucial to handle the discrepancy between server-side proxy training data and actual data received on users' local devices. With the use of federated learning (FL), we introduce an efficient approach on continuously adapting neural network language models (NNLMs) on private devices with applications on automatic speech recognition (ASR). To address the potential speech transcription errors in the on-device training corpus, we perform empirical studies on comparing various strategies of leveraging token-level confidence scores to improve the NNLM quality in the FL settings. Experiments show that compared with no model adaptation, the proposed method achieves relative 2.6% and 10.8% word error rate (WER) reductions on two speech evaluation datasets, respectively. We also provide analysis in evaluating privacy guarantees of our presented procedure.

[270]  arXiv:2110.10027 (cross-list from q-bio.QM) [pdf]
Title: Clinical Trial Information Extraction with BERT
Comments: HealthNLP 2021, IEEE International Conference on Healthcare Informatics (ICHI 2021)
Subjects: Quantitative Methods (q-bio.QM); Computation and Language (cs.CL); Machine Learning (cs.LG)

Natural language processing (NLP) of clinical trial documents can be useful in new trial design. Here we identify entity types relevant to clinical trial design and propose a framework called CT-BERT for information extraction from clinical trial text. We trained named entity recognition (NER) models to extract eligibility criteria entities by fine-tuning a set of pre-trained BERT models. We then compared the performance of CT-BERT with recent baseline methods including attention-based BiLSTM and Criteria2Query. The results demonstrate the superiority of CT-BERT in clinical trial NLP.

[271]  arXiv:2110.10036 (cross-list from physics.soc-ph) [pdf, other]
Title: Coordination and equilibrium selection in games: the role of local effects
Comments: 17 pages, 9 figures
Subjects: Physics and Society (physics.soc-ph); Computers and Society (cs.CY); Computer Science and Game Theory (cs.GT)

We study the role of local effects and finite size effects in reaching coordination and in equilibrium selection in different types of two-player coordination games. We investigate three update rules -- the replicator dynamics (RD), the best response (BR), and the unconditional imitation (UI) -- for coordination games on random graphs. Local effects turn out to be significantly more important for the UI update rule. For the pure coordination game with two equivalent strategies we find a transition from a disordered state to a state of full coordination for a critical value of the network connectivity. The transition is system-size-independent for the BR and RD update rules. For the UI update rule it is system-size-dependent, but coordination can always be reached below the connectivity of a complete graph. We also consider the general coordination game which covers a range of games, such as the stag hunt. For these games there is a payoff-dominant strategy and a risk-dominant strategy with associated states of equilibrium coordination. We analyse equilibrium selection analytically and numerically. For the RD and BR update rules mean-field predictions agree with simulations and the risk-dominant strategy is evolutionarily favoured independently of local effects. When players use the unconditional imitation, however, we observe coordination in the payoff-dominant strategy. Surprisingly, the selection of the payoff-dominant equilibrium only occurs below a critical value of the network connectivity and it disappears in complete graphs. As we show, it is a combination of local effects and update rule that allows for coordination on the payoff-dominant strategy.
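
Two of the update rules named in this abstract, best response (BR) and unconditional imitation (UI), can be sketched for a pure coordination game on a graph as follows. The sketch uses the definitions commonly found in the literature and is not taken from the paper: the payoff matrix, the tie-breaking choices, the function names, and the small example graph are all assumptions.

```python
# Pure coordination game (assumed payoffs): 1 if both players pick the same strategy.
PAYOFF = [[1, 0], [0, 1]]


def payoff(node, strategies, graph):
    """Total payoff a node collects from playing against all of its neighbours."""
    return sum(PAYOFF[strategies[node]][strategies[nb]] for nb in graph[node])


def best_response(node, strategies, graph):
    """BR: pick the strategy that maximizes payoff against the neighbours' current play."""
    totals = [
        sum(PAYOFF[s][strategies[nb]] for nb in graph[node]) for s in (0, 1)
    ]
    if totals[0] == totals[1]:
        return strategies[node]          # assumption: keep the current strategy on a tie
    return 0 if totals[0] > totals[1] else 1


def unconditional_imitation(node, strategies, graph):
    """UI: copy the most successful neighbour if it earns strictly more than the node."""
    best = max(graph[node], key=lambda nb: payoff(nb, strategies, graph), default=node)
    if payoff(best, strategies, graph) > payoff(node, strategies, graph):
        return strategies[best]
    return strategies[node]


# Hypothetical star graph: node 0 is connected to nodes 1, 2 and 3.
graph = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
strategies = {0: 0, 1: 1, 2: 1, 3: 0}
print(best_response(0, strategies, graph))            # hub follows the local majority
print(unconditional_imitation(1, strategies, graph))  # leaf copies the better-earning hub
```

BR uses only the neighbours' strategies, whereas UI also depends on the neighbours' payoffs, which in turn depend on their own neighbourhoods. This is one way to see why local network structure could matter more for UI.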

[272]  arXiv:2110.10040 (cross-list from q-bio.NC) [pdf, other]
Title: Spatial and color hallucinations in a mathematical model of primary visual cortex
Comments: 30 pages, 12 figures
Subjects: Neurons and Cognition (q-bio.NC); Dynamical Systems (math.DS); Numerical Analysis (math.NA); Pattern Formation and Solitons (nlin.PS)

We study a simplified model of the representation of colors in the primate primary cortical visual area V1. The model is described by an initial value problem related to a Hammerstein equation. The solutions to this problem represent the variation of the activity of populations of neurons in V1 as a function of space and color. The two space variables describe the spatial extent of the cortex while the two color variables describe the hue and the saturation represented at every location in the cortex. We prove the well-posedness of the initial value problem. We focus on its stationary, i.e. independent of time, and periodic in space solutions. We show that the model equation is equivariant with respect to the direct product G of the group of the Euclidean transformations of the planar lattice determined by the spatial periodicity and the group of color transformations, isomorphic to O(2), and study the equivariant bifurcations of its stationary solutions when some parameters in the model vary. Their variations may be caused by the consumption of drugs and the bifurcated solutions may represent visual hallucinations in space and color. Some of the bifurcated solutions can be determined by applying the Equivariant Branching Lemma (EBL) by determining the axial subgroups of G. These define bifurcated solutions which are invariant under the action of the corresponding axial subgroup. We compute analytically these solutions and illustrate them as color images. Using advanced methods of numerical bifurcation analysis we then explore the persistence and stability of these solutions when varying some parameters in the model. We conjecture that we can rely on the EBL to predict the existence of patterns that survive in large parameter domains but not to predict their stability. On our way we discover the existence of spatially localized stable patterns through the phenomenon of "snaking".

[273]  arXiv:2110.10055 (cross-list from math.GR) [pdf, other]
Title: The fully compressed subgroup membership problem
Authors: Marco Linton
Comments: 11 pages, 2 figures
Subjects: Group Theory (math.GR); Formal Languages and Automata Theory (cs.FL)

Suppose that $F$ is a free group and $k$ is a natural number. We show that the fully compressed membership problem for $k$-generated subgroups of $F$ is solvable in polynomial time. In order to do this, we adapt the theory of Stallings' foldings to handle edges with compressed labels. This partially answers a question of Markus Lohrey.

[274]  arXiv:2110.10059 (cross-list from stat.ML) [pdf, other]
Title: On Clustering Categories of Categorical Predictors in Generalized Linear Models
Journal-ref: CARRIZOSA, Emilio; GALVIS RESTREPO, Marcela; ROMERO MORALES, Dolores. On clustering categories of categorical predictors in generalized linear models. Expert Systems with Applications, 2021, p. 115245
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)

We propose a method to reduce the complexity of Generalized Linear Models in the presence of categorical predictors. The traditional one-hot encoding, where each category is represented by a dummy variable, can be wasteful, difficult to interpret, and prone to overfitting, especially when dealing with high-cardinality categorical predictors. This paper addresses these challenges by finding a reduced representation of the categorical predictors by clustering their categories. This is done through a numerical method which aims to preserve (or even, improve) accuracy, while reducing the number of coefficients to be estimated for the categorical predictors. Thanks to its design, we are able to derive a proximity measure between categories of a categorical predictor that can be easily visualized. We illustrate the performance of our approach in real-world classification and count-data datasets where we see that clustering the categorical predictors reduces complexity substantially without harming accuracy.

[275]  arXiv:2110.10069 (cross-list from nlin.PS) [pdf, other]
Title: Soliton propagation in lossy optical fibers
Comments: 11 pages, 16 figures
Journal-ref: Semina: Exact and Technological Sciences, v.40, n.2, p.97-106, July/Dec.2019
Subjects: Pattern Formation and Solitons (nlin.PS); Numerical Analysis (math.NA)

In this work, we study the propagation of solitons in lossy optical fibers. The main objective of this work is to study the loss of energy of the soliton wave during propagation and then to evaluate the impact of this loss on the transmission of the soliton signal. In this context, a numerical scheme was developed to solve a system of complex partial differential equations (CPDE) that describes the propagation of solitons in optical fibers with loss and nonlinear amplification mechanisms. The numerical procedure is based on the mathematical theory of Taylor series of complex functions. We adapted the Finite Difference Method (FDM) to approximate derivatives of complex functions. Then, we solve the algebraic system resulting from the discretization, implicitly, through the relaxation Gauss-Seidel method (RGSM). The numerical study of CPDE system with linear and cubic attenuation showed that soliton waves undergo attenuation, dispersion, and oscillation effects. On the other hand, we find that by considering the nonlinear term (cubic term) as an optical amplification, it is possible to partially compensate for the attenuation of the optical signal. Finally, we show that a gain of 9% triples the propagation distance of the fundamental soliton wave, when the dissipation rate is 1%.
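
The relaxation Gauss-Seidel step mentioned in this abstract can be illustrated on a small complex-valued linear system. This is a generic sketch of Gauss-Seidel with a relaxation factor, not the authors' scheme: the matrix, right-hand side, relaxation factor `omega`, and stopping rule are assumptions chosen for the example, and the real discretized system would come from the finite-difference equations.

```python
def relaxed_gauss_seidel(A, b, omega=1.0, tol=1e-12, max_iter=1000):
    """Solve A x = b iteratively with Gauss-Seidel sweeps and relaxation factor omega.

    omega = 1 gives plain Gauss-Seidel; other values blend the old iterate
    with the Gauss-Seidel update. Entries may be complex.
    """
    n = len(b)
    x = [0j] * n
    for _ in range(max_iter):
        max_change = 0.0
        for i in range(n):
            # Use the newest available values of the other unknowns.
            sigma = sum(A[i][j] * x[j] for j in range(n) if j != i)
            gs_value = (b[i] - sigma) / A[i][i]
            new_value = (1.0 - omega) * x[i] + omega * gs_value
            max_change = max(max_change, abs(new_value - x[i]))
            x[i] = new_value
        if max_change < tol:
            break
    return x


# Hypothetical diagonally dominant system with a complex right-hand side.
A = [[4, -1, 0], [-1, 4, -1], [0, -1, 4]]
b = [2 + 1j, 4, 10 - 2j]
x = relaxed_gauss_seidel(A, b, omega=1.1)
residual = max(abs(sum(A[i][j] * x[j] for j in range(3)) - b[i]) for i in range(3))
print(x, residual)
```

Python's built-in complex type keeps the sketch dependency-free. For the large sparse systems that a discretized PDE produces, a sparse-matrix library would be the more practical choice.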

[276]  arXiv:2110.10077 (cross-list from physics.geo-ph) [pdf, other]
Title: Deep Learning to Estimate Permeability using Geophysical Data
Comments: 22 pages
Subjects: Geophysics (physics.geo-ph); Machine Learning (cs.LG)

Time-lapse electrical resistivity tomography (ERT) is a popular geophysical method to estimate three-dimensional (3D) permeability fields from electrical potential difference measurements. Traditional inversion and data assimilation methods are used to ingest this ERT data into hydrogeophysical models to estimate permeability. Due to ill-posedness and the curse of dimensionality, existing inversion strategies provide poor estimates and low resolution of the 3D permeability field. Recent advances in deep learning provide us with powerful algorithms to overcome this challenge. This paper presents a deep learning (DL) framework to estimate the 3D subsurface permeability from time-lapse ERT data. To test the feasibility of the proposed framework, we train DL-enabled inverse models on simulation data. Subsurface process models based on hydrogeophysics are used to generate this synthetic data for deep learning analyses. Results show that proposed weak supervised learning can capture salient spatial features in the 3D permeability field. Quantitatively, the average mean squared error (in terms of the natural log) on the strongly labeled training, validation, and test datasets is less than 0.5. The $R^2$-score (global metric) is greater than 0.75, and the percent error in each cell (local metric) is less than 10%. Finally, an added benefit in terms of computational cost is that the proposed DL-based inverse model is at least $O(10^4)$ times faster than running a forward model. Note that traditional inversion may require multiple forward model simulations (e.g., in the order of 10 to 1000), which are very expensive. This computational savings ($O(10^5)$ - $O(10^7)$) makes the proposed DL-based inverse model attractive for subsurface imaging and real-time ERT monitoring applications due to fast and yet reasonably accurate estimations of the permeability field.

[277]  arXiv:2110.10080 (cross-list from physics.geo-ph) [pdf]
Title: Surrogate and inverse modeling for two-phase flow in porous media via theory-guided convolutional neural network
Comments: 35 pages, 20 figures
Subjects: Geophysics (physics.geo-ph); Machine Learning (cs.LG); Computational Physics (physics.comp-ph); Fluid Dynamics (physics.flu-dyn)

The theory-guided convolutional neural network (TgCNN) framework, which can incorporate discretized governing equation residuals into the training of convolutional neural networks (CNNs), is extended to two-phase porous media flow problems in this work. The two principal variables of the considered problem, pressure and saturation, are approximated simultaneously with two CNNs, respectively. Pressure and saturation are coupled with each other in the governing equations, and thus the two networks are also mutually conditioned in the training process by the discretized governing equations, which also increases the difficulty of model training. The coupled and discretized equations can provide valuable information in the training process. With the assistance of theory-guidance, the TgCNN surrogates can achieve better accuracy than ordinary CNN surrogates in two-phase flow problems. Moreover, a piecewise training strategy is proposed for the scenario with varying well controls, in which the TgCNN surrogates are constructed for different segments on the time dimension and stacked together to predict solutions for the whole time-span. For scenarios with larger variance of the formation property field, the TgCNN surrogates can also achieve satisfactory performance. The constructed TgCNN surrogates are further used for inversion of permeability fields by combining them with the iterative ensemble smoother (IES) algorithm, and sufficient inversion accuracy is obtained with improved efficiency.

[278]  arXiv:2110.10082 (cross-list from stat.ML) [pdf, other]
Title: Nonparametric Sparse Tensor Factorization with Hierarchical Gamma Processes
Comments: 15 pages, 4 figures
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)

We propose a nonparametric factorization approach for sparsely observed tensors. Here, sparsity does not mean that zero-valued entries are numerous or dominant. Rather, it implies the observed entries are very few, and even fewer with the growth of the tensor; this is ubiquitous in practice. Compared with existing work, our model not only leverages the structural information underlying the observed entry indices, but also provides extra interpretability and flexibility -- it can simultaneously estimate a set of location factors about the intrinsic properties of the tensor nodes, and another set of sociability factors reflecting their extrovert activity in interacting with others; users are free to choose a trade-off between the two types of factors. Specifically, we use hierarchical Gamma processes and Poisson random measures to construct a tensor-valued process, which can freely sample the two types of factors to generate tensors and always guarantees asymptotic sparsity. We then normalize the tensor process to obtain hierarchical Dirichlet processes to sample each observed entry index, and use a Gaussian process to sample the entry value as a nonlinear function of the factors, so as to capture both the sparse structure properties and complex node relationships. For efficient inference, we use Dirichlet process properties over finite sample partitions, density transformations, and random features to develop a stochastic variational estimation algorithm. We demonstrate the advantage of our method on several benchmark datasets.

[279]  arXiv:2110.10093 (cross-list from eess.IV) [pdf, other]
Title: Stochastic Primal-Dual Deep Unrolling Networks for Imaging Inverse Problems
Authors: Junqi Tang
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Optimization and Control (math.OC)

In this work we present a new type of efficient deep-unrolling network for solving imaging inverse problems. Classical deep-unrolling methods require the full forward operator and its adjoint in each layer, and hence can be computationally more expensive than other end-to-end methods such as FBP-ConvNet, especially in 3D image reconstruction tasks. We propose a stochastic (ordered-subsets) extension of Learned Primal-Dual (LPD), which is the state-of-the-art unrolling network. In our unrolling network, we use only a subset of the forward and adjoint operators to achieve computational efficiency. We consider three ways of training the proposed network to cope with different scenarios of training-data availability: (1) supervised training on paired data, (2) unsupervised adversarial training, which enables us to train the network without paired ground-truth data, and (3) an equivariant self-supervised training approach, which exploits the equivariant structure prevalent in many imaging applications and requires only measurement data. Our numerical results demonstrate the effectiveness of our approach in an X-ray CT imaging task, showing that our networks achieve reconstruction accuracy similar to that of the full-batch LPD while requiring only a fraction of the computation.
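The ordered-subsets idea mentioned in this abstract, using only part of the forward operator and its adjoint at a time, can be sketched with a small linear operator. This is an illustration of the operator splitting only, under the assumption that the forward operator is a matrix whose measurement rows are partitioned into interleaved subsets; it does not reproduce the learned primal-dual updates, and the helper names and numbers are hypothetical.

```python
def matvec(rows, x):
    """Apply the forward operator (given as a list of rows) to x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in rows]


def adjoint(rows, y, n):
    """Apply the adjoint (transpose) of the operator to y."""
    out = [0.0] * n
    for row, yi in zip(rows, y):
        for j, a in enumerate(row):
            out[j] += a * yi
    return out


def split_rows(rows, data, num_subsets):
    """Interleaved partition of measurement rows and data into ordered subsets."""
    return [(rows[s::num_subsets], data[s::num_subsets]) for s in range(num_subsets)]


def subset_gradient(rows_s, data_s, x):
    """Compute A_s^T (A_s x - y_s), touching only the rows in one subset."""
    residual = [r - d for r, d in zip(matvec(rows_s, x), data_s)]
    return adjoint(rows_s, residual, len(x))


# Made-up operator with four measurements of a two-pixel image.
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]]
y = [1.0, 2.0, 3.0, -1.0]
x = [0.0, 0.0]

full_gradient = subset_gradient(A, y, x)
subset_gradients = [subset_gradient(r, d, x) for r, d in split_rows(A, y, 2)]
summed_gradient = [sum(g[j] for g in subset_gradients) for j in range(len(x))]
```

Each subset gradient costs about half as many row operations as the full one here, and the subset gradients add up to the full gradient, which is why cycling or sampling over subsets can stand in for full operator evaluations.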

[280]  arXiv:2110.10139 (cross-list from eess.AS) [pdf, other]
Title: Chunked Autoregressive GAN for Conditional Waveform Synthesis
Comments: Under review as a conference paper at ICLR 2022
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)

Conditional waveform synthesis models learn a distribution of audio waveforms given conditioning such as text, mel-spectrograms, or MIDI. These systems employ deep generative models that model the waveform via either sequential (autoregressive) or parallel (non-autoregressive) sampling. Generative adversarial networks (GANs) have become a common choice for non-autoregressive waveform synthesis. However, state-of-the-art GAN-based models produce artifacts when performing mel-spectrogram inversion. In this paper, we demonstrate that these artifacts correspond to an inability of the generator to learn accurate pitch and periodicity. We show that simple pitch and periodicity conditioning is insufficient for reducing this error relative to using autoregression. We discuss the inductive bias that autoregression provides for learning the relationship between instantaneous frequency and phase, and show that this inductive bias holds even when autoregressively sampling large chunks of the waveform during each forward pass. Relative to prior state-of-the-art GAN-based models, our proposed model, Chunked Autoregressive GAN (CARGAN), reduces pitch error by 40-60%, reduces training time by 58%, maintains a fast generation speed suitable for real-time or interactive applications, and maintains or improves subjective quality.

Replacements for Wed, 20 Oct 21

[281]  arXiv:1610.05882 (replaced) [pdf, ps, other]
Title: Cognitive Indoor Positioning and Tracking using Multipath Channel Information
Comments: 11 pages, 9 figures
Subjects: Robotics (cs.RO)
[282]  arXiv:1801.03182 (replaced) [pdf]
Title: BUSIS: A Benchmark for Breast Ultrasound Image Segmentation
Comments: 27 pages, 4 figures, 3 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[283]  arXiv:1801.04819 (replaced) [pdf]
Title: Robots as Powerful Allies for the Study of Embodied Cognition from the Bottom Up
Comments: 21 pages, 3 figures
Journal-ref: in A. Newen, L. de Bruin & S. Gallagher, eds., 'The Oxford Handbook of 4E Cognition', Oxford University Press, pp. 841-861 (2018)
Subjects: Artificial Intelligence (cs.AI); Robotics (cs.RO); Neurons and Cognition (q-bio.NC)
[284]  arXiv:1808.10480 (replaced) [pdf, other]
Title: The number of crossings in multigraphs with no empty lens
Comments: Appears in the Proceedings of the 26th International Symposium on Graph Drawing and Network Visualization (GD 2018)
Subjects: Combinatorics (math.CO); Computational Geometry (cs.CG)
[285]  arXiv:1809.10756 (replaced) [pdf, other]
Title: An Introduction to Probabilistic Programming
Comments: Under review at Foundations and Trends in Machine Learning
Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Programming Languages (cs.PL)
[286]  arXiv:1812.09691 (replaced) [pdf, ps, other]
Title: Lower bounds on the chromatic number of random graphs
Subjects: Combinatorics (math.CO); Discrete Mathematics (cs.DM)
[287]  arXiv:1908.09757 (replaced) [pdf, other]
Title: API Beauty is in the eye of the Clients: 2.2 Million Maven Dependencies reveal the Spectrum of Client-API Usages
Comments: 15 pages, 10 figures, 3 tables, 2 listings
Journal-ref: Journal of Systems and Software 2021
Subjects: Software Engineering (cs.SE)
[288]  arXiv:1911.08593 (replaced) [pdf, other]
Title: Multi-attribute community detection in International Trade Network
Journal-ref: Netw Spat Econ 21, 707-733 (2021)
Subjects: Physics and Society (physics.soc-ph); Social and Information Networks (cs.SI)
[289]  arXiv:1912.09632 (replaced) [pdf, other]
Title: AutoScale: Learning to Scale for Crowd Counting and Localization
Comments: This work is accepted by IJCV. Code is available at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[290]  arXiv:2003.00951 (replaced) [pdf, ps, other]
Title: DriverMHG: A Multi-Modal Dataset for Dynamic Recognition of Driver Micro Hand Gestures and a Real-Time Recognition Framework
Comments: Accepted to IEEE International Conference on Automatic Face and Gesture Recognition (FG 2020)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[291]  arXiv:2006.04101 (replaced) [src]
Title: Hybrid Model for Anomaly Detection on Call Detail Records by Time Series Forecasting
Comments: The authors have changed and I am no longer one of the authors of this manuscript
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
[292]  arXiv:2006.04845 (replaced) [pdf, ps, other]
Title: Adaptive Gradient Coding
Subjects: Information Theory (cs.IT)
[293]  arXiv:2006.06880 (replaced) [pdf, other]
Title: Reintroducing Straight-Through Estimators as Principled Methods for Stochastic Binary Networks
Comments: 33 pages, DAGM 2021 version (presented, to be published)
Subjects: Machine Learning (stat.ML); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)
[294]  arXiv:2006.08161 (replaced) [pdf, other]
Title: Optimal Transport for Conditional Domain Matching and Label Shift
Authors: Alain Rakotomamonjy (Criteo AI Lab), Rémi Flamary (CMAP), Gilles Gasso (DocApp - LITIS), Mokhtar Z. Alaya (LMAC, Compiègne), Maxime Berar (DocApp - LITIS), Nicolas Courty (OBELIX)
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
[295]  arXiv:2006.12994 (replaced) [pdf, ps, other]
Title: On the flip graphs on perfect matchings of complete graphs and signed reversal graphs
Comments: 15 pages, 6 figures, 2 tables
Subjects: Combinatorics (math.CO); Discrete Mathematics (cs.DM)
[296]  arXiv:2006.13431 (replaced) [pdf, other]
Title: Multiscale Simulations of Complex Systems by Learning their Effective Dynamics
Comments: 39 pages (Appendix included)
Subjects: Computational Physics (physics.comp-ph); Machine Learning (cs.LG); Chaotic Dynamics (nlin.CD)
[297]  arXiv:2006.16433 (replaced) [pdf, ps, other]
Title: Fast OSCAR and OWL Regression via Safe Screening Rules
Comments: Correct the error of the optimality conditions
Subjects: Machine Learning (cs.LG); Methodology (stat.ME); Machine Learning (stat.ML)
[298]  arXiv:2007.06491 (replaced) [pdf, other]
Title: Mismatched Data Detection in Massive MU-MIMO
Comments: to appear in the IEEE Transactions on Signal Processing. arXiv admin note: text overlap with arXiv:1605.02324
Subjects: Information Theory (cs.IT); Signal Processing (eess.SP)
[299]  arXiv:2007.06968 (replaced) [pdf, other]
Title: Deep composition of tensor-trains using squared inverse Rosenblatt transports
Comments: Found Comput Math (2021)
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Numerical Analysis (math.NA); Computation (stat.CO)
[300]  arXiv:2008.02742 (replaced) [pdf, other]
Title: Compositional Networks Enable Systematic Generalization for Grounded Language Understanding
Comments: Accepted in Findings of EMNLP 2021
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Robotics (cs.RO)
[301]  arXiv:2008.04443 (replaced) [pdf, other]
Title: Some of Entity Resolution
Comments: 67 pages, includes supplementary materials
Subjects: Methodology (stat.ME); Databases (cs.DB); Machine Learning (stat.ML)
[302]  arXiv:2008.07575 (replaced) [pdf, other]
Title: Superconvergence of time invariants for the Gross-Pitaevskii equation
Subjects: Numerical Analysis (math.NA)
[303]  arXiv:2008.08219 (replaced) [pdf, ps, other]
Title: Monte Carlo construction of cubature on Wiener space
Comments: 25 pages
Subjects: Probability (math.PR); Numerical Analysis (math.NA)
[304]  arXiv:2008.09767 (replaced) [pdf, other]
Title: A Modified Orthogonal Matching Pursuit for Construction of Sparse Probabilistic Boolean Networks
Comments: 18 pages, 14 figures
Subjects: Numerical Analysis (math.NA)
[305]  arXiv:2008.10764 (replaced) [pdf, other]
Title: Work, entropy production, and thermodynamics of information under protocol constraints
Subjects: Statistical Mechanics (cond-mat.stat-mech); Information Theory (cs.IT)
[306]  arXiv:2008.11901 (replaced) [pdf, other]
Title: Multi-View Fusion of Sensor Data for Improved Perception and Prediction in Autonomous Driving
Comments: Accepted for publication at IEEE Winter Conference on Applications of Computer Vision (WACV) 2022
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[307]  arXiv:2008.12146 (replaced) [pdf, other]
Title: Avoiding Negative Side Effects due to Incomplete Knowledge of AI Systems
Comments: 9 pages
Subjects: Computers and Society (cs.CY); Artificial Intelligence (cs.AI)
[308]  arXiv:2009.00470 (replaced) [pdf, other]
Title: Data Anomaly Detection for Structural Health Monitoring of Bridges using Shapelet Transform
Comments: arXiv admin note: text overlap with arXiv:2004.11243
Subjects: Machine Learning (cs.LG); Signal Processing (eess.SP)
[309]  arXiv:2010.02709 (replaced) [pdf, other]
Title: An Infinite-Feature Extension for Bayesian ReLU Nets That Fixes Their Asymptotic Overconfidence
Comments: NeurIPS 2021
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
[310]  arXiv:2010.10216 (replaced) [pdf, other]
Title: Simulated Chats for Building Dialog Systems: Learning to Generate Conversations from Instructions
Journal-ref: Findings of EMNLP 2021
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
[311]  arXiv:2010.10923 (replaced) [pdf, other]
Title: Attention-based scaling adaptation for target speech extraction
Comments: 5 pages, 2 figures. Accepted by ASRU 2021
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[312]  arXiv:2010.15027 (replaced) [pdf, other]
Title: Solving generalized eigenvalue problems by ordinary differential equations on a quantum computer
Comments: 26 pages
Subjects: Quantum Physics (quant-ph); Numerical Analysis (math.NA)
[313]  arXiv:2011.01921 (replaced) [pdf, other]
Title: Optimizing Molecules using Efficient Queries from Property Evaluations
Comments: Preprint version to be published in Nature Machine Intelligence; GitHub: this https URL
Subjects: Machine Learning (cs.LG); Biomolecules (q-bio.BM)
[314]  arXiv:2011.07143 (replaced) [pdf, ps, other]
Title: Adaptive Learning of Compressible Strings
Comments: Accepted for publication in Theoretical Computer Science
Subjects: Data Structures and Algorithms (cs.DS)
[315]  arXiv:2011.08001 (replaced) [pdf, other]
Title: Deep-LIBRA: Artificial intelligence method for robust quantification of breast density with independent validation in breast cancer risk assessment
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[316]  arXiv:2011.10896 (replaced) [pdf, other]
Title: HALO 1.0: A Hardware-agnostic Accelerator Orchestration Framework for Enabling Hardware-agnostic Programming with True Performance Portability for Heterogeneous HPC
Comments: 21 pages
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Computation and Language (cs.CL); Performance (cs.PF)
[317]  arXiv:2011.14078 (replaced) [pdf, other]
Title: Unsupervised Constrained Community Detection via Self-Expressive Graph Neural Network
Comments: This paper has been accepted as a full research paper at UAI 2021
Subjects: Social and Information Networks (cs.SI); Machine Learning (cs.LG)
[318]  arXiv:2011.14399 (replaced) [pdf, other]
Title: Socioeconomic Impact of Emerging Mobility Markets and Implementation Strategies
Subjects: Systems and Control (eess.SY); Computer Science and Game Theory (cs.GT)
[319]  arXiv:2012.01768 (replaced) [pdf, other]
Title: Beyond Cats and Dogs: Semi-supervised Classification of fuzzy labels with overclustering
Comments: Reworked version available at arXiv:2110.06630, Published in Sensors 2021 (see DOI link)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[320]  arXiv:2012.02127 (replaced) [pdf, other]
Title: Security Proof Against Collective Attacks for an Experimentally Feasible Semi-Quantum Key Distribution Protocol
Comments: 15 pages; 3 figures
Subjects: Quantum Physics (quant-ph); Cryptography and Security (cs.CR)
[321]  arXiv:2012.05412 (replaced) [pdf, other]
Title: LaSeSOM: A Latent and Semantic Representation Framework for Soft Object Manipulation
Comments: 12 pages, 14 figures, 2 tables
Subjects: Robotics (cs.RO)
[322]  arXiv:2012.11532 (replaced) [pdf, other]
Title: Dual-CyCon Net: A Cycle Consistent Dual-Domain Convolutional Neural Network Framework for Detection of Partial Discharge
Subjects: Machine Learning (cs.LG); Signal Processing (eess.SP)
[323]  arXiv:2101.05549 (replaced) [pdf, ps, other]
Title: Spectral Clustering Oracles in Sublinear Time
Comments: Proceedings of the 2021 ACM-SIAM Symposium on Discrete Algorithms (SODA). Society for Industrial and Applied Mathematics, 2021
Subjects: Data Structures and Algorithms (cs.DS)
[324]  arXiv:2102.03613 (replaced) [pdf, ps, other]
Title: Linear Matrix Inequality Approaches to Koopman Operator Approximation
Comments: 13 pages
Subjects: Systems and Control (eess.SY); Machine Learning (cs.LG); Dynamical Systems (math.DS)
[325]  arXiv:2102.03783 (replaced) [pdf, other]
Title: Rotating shallow water flow under location uncertainty with a structure-preserving discretization
Subjects: Fluid Dynamics (physics.flu-dyn); Numerical Analysis (math.NA)
[326]  arXiv:2102.04022 (replaced) [pdf, other]
Title: Towards Hierarchical Task Decomposition using Deep Reinforcement Learning for Pick and Place Subtasks
Comments: This work has been accepted to the IEEE International Conference on Advanced Robotics (ICAR) 2021
Subjects: Robotics (cs.RO)
[327]  arXiv:2102.04170 (replaced) [pdf, other]
Title: Learning Task-Oriented Communication for Edge Inference: An Information Bottleneck Approach
Comments: This paper was accepted to the IEEE JSAC Series on Machine Learning for Communications and Networks and will be published in Jan 2022
Subjects: Signal Processing (eess.SP); Machine Learning (cs.LG)
[328]  arXiv:2102.04237 (replaced) [pdf, ps, other]
Title: Interval Analysis of Worst-case Stationary Moments for Stochastic Chemical Reactions with Uncertain Parameters
Subjects: Systems and Control (eess.SY); Quantitative Methods (q-bio.QM)
[329]  arXiv:2102.04523 (replaced) [pdf, other]
Title: Multi-Objective Learning to Predict Pareto Fronts Using Hypervolume Maximization
Comments: T.M.D. and M.G. contributed equally. Changes in new version (v2): improved method, added comparison to method described in old manuscript (v1), added experiments, added appendix, revised text
Subjects: Machine Learning (cs.LG)
[330]  arXiv:2102.04776 (replaced) [pdf, other]
Title: Generative Models as Distributions of Functions
Comments: Added experiments for learning distributions of functions on manifolds. Added more 3D experiments and comparisons to baselines
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
[331]  arXiv:2102.07280 (replaced) [pdf, other]
Title: 3D Fully Convolutional Neural Networks with Intersection Over Union Loss for Crop Mapping from Multi-Temporal Satellite Images
Comments: Accepted by IGARSS 2021
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[332]  arXiv:2102.09750 (replaced) [pdf, other]
Title: Symplectic Adjoint Method for Exact Gradient of Neural ODE with Minimal Memory
Comments: 19 pages
Journal-ref: 35th Conference on Neural Information Processing Systems (NeurIPS 2021)
Subjects: Machine Learning (cs.LG)
[333]  arXiv:2102.11991 (replaced) [pdf, other]
Title: Being correct is not enough: efficient verification using robust linear temporal logic
Comments: arXiv admin note: text overlap with arXiv:1510.08970. v2 notes: Proof on the complexity of translating rLTL formulae to LTL formulae via the rewriting approach. New case study on the scalability of rLTL formulae in the proposed fragment. Accepted to appear in ACM Transactions on Computational Logic
Subjects: Logic in Computer Science (cs.LO); Computational Complexity (cs.CC); Systems and Control (eess.SY)
[334]  arXiv:2102.12353 (replaced) [pdf, other]
Title: Nonlinear Invariant Risk Minimization: A Causal Approach
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
[335]  arXiv:2102.13566 (replaced) [pdf, other]
Title: Sparse approximation in learning via neural ODEs
Comments: 24 pages, 5 figures
Subjects: Machine Learning (cs.LG); Optimization and Control (math.OC); Machine Learning (stat.ML)
[336]  arXiv:2103.00370 (replaced) [pdf, other]
Title: Axiomatic Explanations for Visual Search, Retrieval, and Similarity Learning
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC); Information Retrieval (cs.IR)
[337]  arXiv:2103.00632 (replaced) [pdf, other]
Title: A weighted POD-reduction approach for parametrized PDE-constrained Optimal Control Problems with random inputs and applications to environmental sciences
Subjects: Numerical Analysis (math.NA)
[338]  arXiv:2103.02835 (replaced) [pdf, other]
Title: A Novel Application of Image-to-Image Translation: Chromosome Straightening Framework by Learning from a Single Image
Comments: This work has been accepted by CISP-BMEI2021
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[339]  arXiv:2103.07051 (replaced) [pdf, other]
Title: Dual Attention-in-Attention Model for Joint Rain Streak and Raindrop Removal
Comments: To appear in IEEE Transactions on Image Processing (TIP)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[340]  arXiv:2103.07368 (replaced) [pdf, other]
Title: Information Maximization Clustering via Multi-View Self-Labelling
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[341]  arXiv:2103.08908 (replaced) [pdf]
Title: Blockchain-assisted Undisclosed IIoT Vulnerabilities Trusted Sharing Protection with Dynamic Token
Comments: 10 pages, 12 figures
Subjects: Cryptography and Security (cs.CR)
[342]  arXiv:2103.13279 (replaced) [pdf, other]
Title: FakeMix Augmentation Improves Transparent Object Detection
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[343]  arXiv:2103.13389 (replaced) [pdf, other]
Title: Generating Novel Scene Compositions from Single Images and Videos
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[344]  arXiv:2103.14381 (replaced) [pdf, other]
Title: GNSS-denied geolocalization of UAVs by visual matching of onboard camera images with orthophotos
Comments: Accepted for publication at 20th International Conference on Advanced Robotics (ICAR 2021)
Subjects: Robotics (cs.RO)
[345]  arXiv:2103.16559 (replaced) [pdf, other]
Title: Broaden Your Views for Self-Supervised Video Learning
Comments: This paper is an extended version of our ICCV-21 paper. It includes more results as well as a minor architectural variation which improves results
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[346]  arXiv:2104.00552 (replaced) [pdf, ps, other]
Title: Using Graph Theory to Derive Inequalities for the Bell Numbers
Subjects: Discrete Mathematics (cs.DM); Combinatorics (math.CO)
[347]  arXiv:2104.04977 (replaced) [pdf, other]
Title: A tight negative example for MMS fair allocations
Subjects: Computer Science and Game Theory (cs.GT)
[348]  arXiv:2104.05702 (replaced) [pdf, other]
Title: Image-Level or Object-Level? A Tale of Two Resampling Strategies for Long-Tailed Detection
Comments: Accepted to ICML 2021
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[349]  arXiv:2104.06703 (replaced) [pdf, other]
Title: Deep Permutation Equivariant Structure from Motion
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[350]  arXiv:2104.07293 (replaced) [pdf, other]
Title: Sized Types with Usages for Parallel Complexity of Pi-Calculus Processes
Authors: Patrick Baillot (LIP), Alexis Ghyselen (LIP), Naoki Kobayashi (UTokyo)
Subjects: Computational Complexity (cs.CC); Distributed, Parallel, and Cluster Computing (cs.DC)
[351]  arXiv:2104.08013 (replaced) [pdf, other]
Title: Data-Driven 3D Reconstruction of Dressed Humans From Sparse Views
Comments: 3DV 2021
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[352]  arXiv:2104.09946 (replaced) [pdf, other]
Title: A cappella: Audio-visual Singing Voice Separation
Comments: Paper accepted at The 32nd British Machine Vision Conference, BMVC 2021
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[353]  arXiv:2104.09987 (replaced) [pdf, other]
Title: Differentiable Model Compression via Pseudo Quantization Noise
Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[354]  arXiv:2104.10873 (replaced) [pdf, other]
Title: Mosaic Flows: A Transferable Deep Learning Framework for Solving PDEs on Unseen Domains
Comments: 23 pages, 10 figures
Subjects: Machine Learning (cs.LG); Performance (cs.PF); Computational Physics (physics.comp-ph)
[355]  arXiv:2104.11061 (replaced) [pdf, other]
Title: Chasing Collective Variables using Autoencoders and biased trajectories
Comments: 57 pages, 15 figures
Subjects: Biological Physics (physics.bio-ph); Machine Learning (cs.LG); Computational Physics (physics.comp-ph); Machine Learning (stat.ML)
[356]  arXiv:2104.15099 (replaced) [pdf, other]
Title: Achieving Causality with Physical Clocks
Comments: 11 pages, preprint version of submission to ICDCN with Appendix
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
[357]  arXiv:2105.01593 (replaced) [pdf, ps, other]
Title: Regret Bounds for Stochastic Shortest Path Problems with Linear Function Approximation
Comments: This version removes most assumptions of the prior one
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
[358]  arXiv:2105.01648 (replaced) [pdf, other]
Title: On Lottery Tickets and Minimal Task Representations in Deep Reinforcement Learning
Comments: 20 pages, 15 figures
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
[359]  arXiv:2105.01922 (replaced) [pdf, other]
Title: SeaDronesSee: A Maritime Benchmark for Detecting Humans in Open Water
Comments: Leon Amadeus Varga, Benjamin Kiefer, Martin Messmer contributed equally to this work. The order of names is determined by coin flipping. Accepted at WACV 2022
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[360]  arXiv:2105.02320 (replaced) [pdf, other]
Title: Iterative Human and Automated Identification of Wildlife Images
Comments: This preprint has not undergone peer review (when applicable) or any post-submission improvements or corrections. It is published in Nature Machine Intelligence: this https URL
Journal-ref: Nat Mach Intell 3, 885-895 (2021)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[361]  arXiv:2105.03655 (replaced) [pdf, other]
Title: FlingBot: The Unreasonable Effectiveness of Dynamic Manipulation for Cloth Unfolding
Authors: Huy Ha, Shuran Song
Comments: 11 pages, 6 figures. Code, data, and simulation environment publicly available at this https URL
Journal-ref: Conference on Robot Learning (CoRL 2021)
Subjects: Robotics (cs.RO)
[362]  arXiv:2105.06024 (replaced) [pdf, ps, other]
Title: Type-Based Termination for Futures
Comments: 21 pages. Extended version
Subjects: Programming Languages (cs.PL); Logic in Computer Science (cs.LO)
[363]  arXiv:2105.07237 (replaced) [pdf]
Title: Brain Inspired Face Recognition: A Computational Framework
Comments: 26 Pages, 16 Tables, 10 Figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[364]  arXiv:2105.08058 (replaced) [pdf, other]
Title: A parameter refinement method for Ptychography based on Deep Learning concepts
Journal-ref: Condens. Matter 2021, 6, 36
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Numerical Analysis (math.NA)
[365]  arXiv:2105.08990 (replaced) [pdf, other]
Title: Improved Exploring Starts by Kernel Density Estimation-Based State-Space Coverage Acceleration in Reinforcement Learning
Subjects: Machine Learning (cs.LG); Systems and Control (eess.SY)
[366]  arXiv:2105.10594 (replaced) [pdf, other]
Title: Privacy Amplification Via Bernoulli Sampling
Comments: 11 pages, 3 figures. Appeared in TPDP Workshop @ ICML 2021
Subjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR); Information Theory (cs.IT)
[367]  arXiv:2105.11724 (replaced) [pdf, other]
Title: SHAFF: Fast and consistent SHApley eFfect estimates via random Forests
Authors: Clément Bénard (LPSM (UMR_8001)), Gérard Biau (LPSM (UMR_8001)), Sébastien da Veiga, Erwan Scornet (CMAP)
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)
[368]  arXiv:2105.13077 (replaced) [pdf, other]
Title: Blind Motion Deblurring Super-Resolution: When Dynamic Spatio-Temporal Learning Meets Static Image Understanding
Comments: To appear in IEEE Transactions on Image Processing (TIP)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[369]  arXiv:2105.13493 (replaced) [pdf, other]
Title: Efficient and Accurate Gradients for Neural SDEs
Comments: Accepted at NeurIPS 2021
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Dynamical Systems (math.DS); Machine Learning (stat.ML)
[370]  arXiv:2105.13801 (replaced) [pdf, other]
Title: A Probabilistic Forecast-Driven Strategy for a Risk-Aware Participation in the Capacity Firming Market: extended version
Comments: Extended version of the paper accepted for publication in IEEE Transactions on Sustainable Energy
Subjects: Applications (stat.AP); Artificial Intelligence (cs.AI); Systems and Control (eess.SY)
[371]  arXiv:2105.15196 (replaced) [pdf, ps, other]
Title: A novel second-order nonstandard finite difference method for solving one-dimensional autonomous dynamical systems
Authors: Manh Tuan Hoang
Comments: 20 pages, 2 figures
Subjects: Numerical Analysis (math.NA)
[372]  arXiv:2106.02740 (replaced) [pdf, other]
Title: ZeroWaste Dataset: Towards Deformable Object Segmentation in Extreme Clutter
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[373]  arXiv:2106.02770 (replaced) [pdf, other]
Title: Accelerating Stochastic Simulation with Interactive Neural Processes
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
[374]  arXiv:2106.02811 (replaced) [pdf, other]
Title: Full-Dimensional Rate Enhancement for UAV-Enabled Communications via Intelligent Omni-Surface
Comments: 6 pages, 5 figures
Subjects: Systems and Control (eess.SY); Signal Processing (eess.SP)
[375]  arXiv:2106.06959 (replaced) [pdf, other]
Title: Do Not Escape From the Manifold: Discovering the Local Coordinates on the Latent Space of GANs
Comments: 23 pages, 19 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[376]  arXiv:2106.09305 (replaced) [pdf, other]
Title: Time Series is a Special Sequence: Forecasting with Sample Convolution and Interaction
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
[377]  arXiv:2106.10314 (replaced) [pdf, other]
Title: Differentiable Particle Filtering without Modifying the Forward Pass
Comments: 24 pages, 3 figures
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)
[378]  arXiv:2106.12535 (replaced) [pdf, other]
Title: Learning Stochastic Majority Votes by Minimizing a PAC-Bayes Generalization Bound
Journal-ref: Proceedings of the 35th Conference on Neural Information Processing Systems (NeurIPS 2021)
Subjects: Machine Learning (cs.LG); Methodology (stat.ME); Machine Learning (stat.ML)
[379]  arXiv:2106.15531 (replaced) [pdf, other]
Title: The Power of Word-Frequency Based Alignment-Free Functions: a Comprehensive Large-scale Experimental Analysis -- Version 3
Subjects: Genomics (q-bio.GN); Distributed, Parallel, and Cluster Computing (cs.DC)
[380]  arXiv:2107.02072 (replaced) [pdf, other]
Title: Selective decay for the rotating shallow-water equations with a structure-preserving discretization
Subjects: Numerical Analysis (math.NA)
[381]  arXiv:2107.02108 (replaced) [pdf, other]
Title: Can Super Resolution be used to improve Human Pose Estimation in Low Resolution Scenarios?
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[382]  arXiv:2107.02221 (replaced) [pdf, other]
Title: Nobody of the Crowd: An Empirical Investigation of Worker Communities in TopCoder
Comments: 10 pages, 7 figures, 4 tables
Subjects: Software Engineering (cs.SE); Human-Computer Interaction (cs.HC); Social and Information Networks (cs.SI)
[383]  arXiv:2107.02681 (replaced) [pdf, other]
Title: VidLanKD: Improving Language Understanding via Video-Distilled Knowledge Transfer
Comments: NeurIPS 2021 (19 pages)
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[384]  arXiv:2107.04174 (replaced) [pdf, other]
Title: EasyCom: An Augmented Reality Dataset to Support Algorithms for Easy Communication in Noisy Environments
Comments: Dataset is available at: this https URL
Subjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Signal Processing (eess.SP)
[385]  arXiv:2107.05612 (replaced) [pdf, other]
Title: A Persistent Spatial Semantic Representation for High-level Natural Language Instruction Execution
Comments: Presented at CoRL 2021
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[386]  arXiv:2107.07677 (replaced) [pdf, other]
Title: ECG-Adv-GAN: Detecting ECG Adversarial Examples with Conditional Generative Adversarial Networks
Comments: Accepted to ICMLA 2021
Subjects: Machine Learning (cs.LG)
[387]  arXiv:2107.11921 (replaced) [pdf, other]
Title: Compensation Learning
Subjects: Machine Learning (cs.LG)
[388]  arXiv:2107.13935 (replaced) [pdf, other]
Title: Break, Perturb, Build: Automatic Perturbation of Reasoning Paths Through Question Decomposition
Comments: Accepted for publication in Transactions of the Association for Computational Linguistics (TACL), 2021. Author's final version
Subjects: Computation and Language (cs.CL)
[389]  arXiv:2108.01595 (replaced) [pdf, other]
Title: Extending a Physics-Based Constitutive Model using Genetic Programming
Comments: Preprint submitted to Applications in Engineering Sciences
Subjects: Neural and Evolutionary Computing (cs.NE)
[390]  arXiv:2108.02707 (replaced) [pdf, other]
Title: Fairness Properties of Face Recognition and Obfuscation Systems
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[391]  arXiv:2108.03900 (replaced) [pdf, other]
Title: Multi-View TRGRU: Transformer based Spatiotemporal Model for Short-Term Metro Origin-Destination Matrix Prediction
Comments: 10 pages, 7 figures
Subjects: Artificial Intelligence (cs.AI)
[392]  arXiv:2108.04049 (replaced) [pdf, other]
Title: Multi-modal Retrieval of Tables and Texts Using Tri-encoder Models
Comments: Accepted at the 3rd Workshop on Machine Reading for Question Answering (MRQA) at EMNLP 2021
Subjects: Computation and Language (cs.CL); Information Retrieval (cs.IR)
[393]  arXiv:2108.04884 (replaced) [pdf, other]
Title: Retiring Adult: New Datasets for Fair Machine Learning
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
[394]  arXiv:2108.06230 (replaced) [pdf, other]
Title: Generative Zero-Shot Learning for Semantic Segmentation of 3D Point Cloud
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[395]  arXiv:2108.06743 (replaced) [pdf, other]
Title: Exploring Generalization Ability of Pretrained Language Models on Arithmetic and Logical Reasoning
Comments: Accepted by NLPCC2021
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
[396]  arXiv:2108.06959 (replaced) [pdf, other]
Title: WikiChurches: A Fine-Grained Dataset of Architectural Styles with Real-World Challenges
Comments: NeurIPS 2021 Track on Datasets and Benchmarks
Journal-ref: Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track, 2021
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[397]  arXiv:2108.08423 (replaced) [pdf, ps, other]
Title: Second-Order Specifications and Quantifier Elimination for Consistent Query Answering in Databases
Comments: A couple of minor mistakes corrected, and some explanations added
Subjects: Databases (cs.DB); Artificial Intelligence (cs.AI); Logic in Computer Science (cs.LO)
[398]  arXiv:2108.08597 (replaced) [pdf, other]
Title: Beyond NED: Fast and Effective Search Space Reduction for Complex Question Answering over Knowledge Bases
Comments: WSDM 2022
Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL)
[399]  arXiv:2108.11645 (replaced) [pdf, other]
Title: Robust Model-based Reinforcement Learning for Autonomous Greenhouse Control
Subjects: Artificial Intelligence (cs.AI)
[400]  arXiv:2108.12284 (replaced) [pdf, other]
Title: The Devil is in the Detail: Simple Tricks Improve Systematic Generalization of Transformers
Comments: Accepted to EMNLP 2021
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)
[401]  arXiv:2108.12334 (replaced) [pdf, ps, other]
Title: Explicit Good Subspace-metric Codes and Subset-metric Codes
Authors: Hao Chen
Comments: 26 pages, corrected version
Subjects: Information Theory (cs.IT)
[402]  arXiv:2108.12547 (replaced) [pdf, other]
Title: Self-fulfilling Bandits: Dynamic Selection in Algorithmic Decision-making
Comments: Main Body: 30 pages, 6 figures; Supplemental Material: 26 pages
Subjects: Econometrics (econ.EM); Machine Learning (cs.LG); Optimization and Control (math.OC); Methodology (stat.ME); Machine Learning (stat.ML)
[403]  arXiv:2108.13808 (replaced) [pdf, ps, other]
Title: On the formulation of fractional Adams-Bashforth method with Atangana-Baleanu-Caputo derivative to model chaotic problems
Comments: 11 figures, fractional Adams-Bashforth method and corrections
Journal-ref: Chaos, 29 023111 (2019)
Subjects: Numerical Analysis (math.NA)
[404]  arXiv:2108.13838 (replaced) [pdf, other]
Title: The Interaction Flow Editor: A New Human-Robot Interaction Rapid Prototyping Interface
Comments: 8 pages, 4 figures
Subjects: Robotics (cs.RO)
[405]  arXiv:2109.00957 (replaced) [pdf, other]
Title: Sk-Unet Model with Fourier Domain for Mitosis Detection
Comments: Won 1st place in the MICCAI2021 MIDOG Challenge
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[406]  arXiv:2109.01137 (replaced) [pdf, other]
Title: The Power of Points for Modeling Humans in Clothing
Comments: In ICCV 2021. Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[407]  arXiv:2109.01329 (replaced) [pdf, other]
Title: Achieving near native runtime performance and cross-platform performance portability for random number generation through SYCL interoperability
Comments: 24 pages, 5 figures, conference
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Mathematical Software (cs.MS); High Energy Physics - Experiment (hep-ex)
[408]  arXiv:2109.01352 (replaced) [pdf, ps, other]
Title: Betwixt Turing and Kleene
Comments: To appear in the LNCS proceedings of LFCS22 (Deerfield Beach, Florida)
Subjects: Logic (math.LO); Logic in Computer Science (cs.LO)
[409]  arXiv:2109.01545 (replaced) [pdf, other]
Title: Large-Scale Learning with Fourier Features and Tensor Decompositions
Comments: 9 pages, 6 figures. Reviewed version after peer-review. To be published in the proceedings of the Thirty-fifth Conference on Neural Information Processing Systems (NeurIPS 2021)
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
[410]  arXiv:2109.01745 (replaced) [pdf, other]
Title: A realistic approach to generate masked faces applied on two novel masked face recognition data sets
Comments: Accepted at NeurIPS 2021
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[411]  arXiv:2109.02818 (replaced) [pdf, ps, other]
Title: List-decodable Codes and Covering Codes
Authors: Hao Chen
Comments: 43 pages, extended to other metrics, a generalized Singleton upper bound for average-radius list-decodable codes added, McEliece-Rodemich-Rumsey-Welch bound compared, non-list-decodability result for AG codes added
Subjects: Information Theory (cs.IT)
[412]  arXiv:2109.03457 (replaced) [pdf, other]
Title: Uncertainty Quantification and Experimental Design for large-scale linear Inverse Problems under Gaussian Process Priors
Comments: under review
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Applications (stat.AP); Computation (stat.CO); Methodology (stat.ME)
[413]  arXiv:2109.03627 (replaced) [pdf, other]
Title: An Online Framework for Cognitive Load Assessment in Assembly Tasks
Subjects: Robotics (cs.RO)
[414]  arXiv:2109.04993 (replaced) [pdf, other]
Title: LAViTeR: Learning Aligned Visual and Textual Representations Assisted by Image and Caption Generation
Comments: 14 pages, 10 Figures, 5 Tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[415]  arXiv:2109.05019 (replaced) [pdf, other]
Title: Spike2Vec: An Efficient and Scalable Embedding Approach for COVID-19 Spike Sequences
Subjects: Genomics (q-bio.GN); Machine Learning (cs.LG)
[416]  arXiv:2109.05546 (replaced) [pdf, ps, other]
Title: Improved Algorithms for Misspecified Linear Markov Decision Processes
Comments: This version adds an intuitive explanation in Section 3
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
[417]  arXiv:2109.05948 (replaced) [pdf, other]
Title: A deep learning guided memetic framework for graph coloring problems
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
[418]  arXiv:2109.06777 (replaced) [pdf, other]
Title: PETGEN: Personalized Text Generation Attack on Deep Sequence Embedding-based Classification Models
Comments: Accepted for publication at: 2021 ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD'2021). Code and data at: this https URL
Subjects: Machine Learning (cs.LG); Social and Information Networks (cs.SI)
[419]  arXiv:2109.08930 (replaced) [pdf, other]
Title: Regular Sequential Serializability and Regular Sequential Consistency
Comments: 35 pages
Subjects: Databases (cs.DB); Distributed, Parallel, and Cluster Computing (cs.DC)
[420]  arXiv:2109.09304 (replaced) [pdf, other]
Title: Deformed semicircle law and concentration of nonlinear random matrices for ultra-wide neural networks
Comments: 46 pages, 5 figures
Subjects: Statistics Theory (math.ST); Machine Learning (cs.LG); Probability (math.PR); Machine Learning (stat.ML)
[421]  arXiv:2109.09977 (replaced) [pdf, ps, other]
Title: On Net Energy Metering X: Optimal Prosumer Decisions, Social Welfare, and Cross-subsidies
Comments: 14 pages, 6 figures, 3 tables
Subjects: Systems and Control (eess.SY); Theoretical Economics (econ.TH)
[422]  arXiv:2109.10601 (replaced) [pdf, other]
Title: Efficient Context-Aware Network for Abdominal Multi-organ Segmentation
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[423]  arXiv:2109.12021 (replaced) [pdf, other]
Title: Pythia: A Customizable Hardware Prefetching Framework Using Online Reinforcement Learning
Subjects: Hardware Architecture (cs.AR); Machine Learning (cs.LG)
[424]  arXiv:2109.12750 (replaced) [pdf, other]
Title: Learning Multimodal Rewards from Rankings
Comments: 17 pages, 12 figures, 2 tables. Published at Conference on Robot Learning (CoRL) 2021
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Robotics (cs.RO)
[425]  arXiv:2110.00239 (replaced) [pdf, other]
Title: Substructural fixed-point theorems and the diagonal argument: theme and variations
Comments: v1 20 pages; v2 22 pages, added additional final section on fixed-point operators
Subjects: Category Theory (math.CT); Logic in Computer Science (cs.LO); Logic (math.LO)
[426]  arXiv:2110.00809 (replaced) [pdf, other]
Title: Classifying COVID-19 Spike Sequences from Geographic Location Using Deep Learning
Subjects: Machine Learning (cs.LG); Quantitative Methods (q-bio.QM)
[427]  arXiv:2110.02388 (replaced) [pdf, other]
Title: Fast and Interpretable Consensus Clustering via Minipatch Learning
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Methodology (stat.ME)
[428]  arXiv:2110.03618 (replaced) [pdf, other]
Title: Causal Direction of Data Collection Matters: Implications of Causal and Anticausal Learning for NLP
Comments: EMNLP 2021 (Oral)
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[429]  arXiv:2110.04068 (replaced) [pdf]
Title: Measurement of In-Circuit Common-Mode Impedance at the AC Input of a Motor Drive System
Comments: This is a modified/final version of arXiv:2110.04068
Subjects: Systems and Control (eess.SY); Signal Processing (eess.SP); Classical Physics (physics.class-ph); Instrumentation and Detectors (physics.ins-det)
[430]  arXiv:2110.04507 (replaced) [pdf, other]
Title: TiKick: Towards Playing Multi-agent Football Full Games from Single-agent Demonstrations
Subjects: Artificial Intelligence (cs.AI)
[431]  arXiv:2110.05319 (replaced) [pdf, other]
Title: Efficient Training of 3D Seismic Image Fault Segmentation Network under Sparse Labels by Weakening Anomaly Annotation
Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV); Geophysics (physics.geo-ph)
[432]  arXiv:2110.05436 (replaced) [pdf, other]
Title: A differential approach to detecting projective equivalences and symmetries of rational 3D curves
Subjects: Algebraic Geometry (math.AG); Computational Geometry (cs.CG); Differential Geometry (math.DG)
[433]  arXiv:2110.05695 (replaced) [pdf, ps, other]
Title: The Mirrornet : Learning Audio Synthesizer Controls Inspired by Sensorimotor Interaction
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[434]  arXiv:2110.05994 (replaced) [pdf, other]
Title: Word Order Does Not Matter For Speech Recognition
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[435]  arXiv:2110.06475 (replaced) [pdf, other]
Title: SAR-Net: A Scenario-Aware Ranking Network for Personalized Fair Recommendation in Hundreds of Travel Scenarios
Comments: Accepted by CIKM 2021
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Information Retrieval (cs.IR)
[436]  arXiv:2110.06990 (replaced) [pdf, other]
Title: Scaling Laws for the Few-Shot Adaptation of Pre-trained Image Classifiers
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[437]  arXiv:2110.07034 (replaced) [pdf, other]
Title: How Does Momentum Benefit Deep Neural Networks Architecture Design? A Few Case Studies
Comments: 40 pages, 15 figures. arXiv admin note: substantial text overlap with arXiv:2006.06919, arXiv:2110.04840
Subjects: Machine Learning (cs.LG); Dynamical Systems (math.DS); Numerical Analysis (math.NA)
[438]  arXiv:2110.07342 (replaced) [pdf, other]
Title: FILM: Following Instructions in Language with Modular Methods
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG)
[439]  arXiv:2110.08190 (replaced) [pdf, other]
Title: Sparse Progressive Distillation: Resolving Overfitting under Pretrain-and-Finetune Paradigm
Subjects: Computation and Language (cs.CL)
[440]  arXiv:2110.08270 (replaced) [pdf, other]
Title: From Multimodal to Unimodal Attention in Transformers using Knowledge Distillation
Comments: Preprint. Final paper accepted at the 17th IEEE International Conference on Advanced Video and Signal-based Surveillance, AVSS 2021, Virtual, November 16-19, 2021. 10 pages
Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL)
[441]  arXiv:2110.08436 (replaced) [pdf, other]
Title: Reactive Task Allocation and Planning of A Heterogeneous Multi-Robot System
Subjects: Robotics (cs.RO)
[442]  arXiv:2110.08440 (replaced) [pdf, other]
Title: Online Target Q-learning with Reverse Experience Replay: Efficiently finding the Optimal Policy for Linear MDPs
Comments: Under Review, V2 has updated acknowledgements
Subjects: Machine Learning (cs.LG); Optimization and Control (math.OC)
[443]  arXiv:2110.08471 (replaced) [pdf, other]
Title: Fast Projection onto the Capped Simplex with Applications to Sparse Regression in Bioinformatics
Comments: 12 pages, 5 figures
Subjects: Optimization and Control (math.OC); Machine Learning (cs.LG); Genomics (q-bio.GN)
[444]  arXiv:2110.08517 (replaced) [pdf, other]
Title: Characterizing Improper Input Validation Vulnerabilities of Mobile Crowdsourcing Services
Journal-ref: Annual Computer Security Applications Conference (ACSAC '21), December 6--10, 2021, USA
Subjects: Cryptography and Security (cs.CR)
[445]  arXiv:2110.08557 (replaced) [pdf, other]
Title: DPNAS: Neural Architecture Search for Deep Learning with Differential Privacy
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR)
[446]  arXiv:2110.08619 (replaced) [pdf, other]
Title: SAGAN: Adversarial Spatial-asymmetric Attention for Noisy Nona-Bayer Reconstruction
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[447]  arXiv:2110.08710 (replaced) [pdf, ps, other]
Title: NeuralArTS: Structuring Neural Architecture Search with Type Theory
Subjects: Machine Learning (cs.LG); Logic in Computer Science (cs.LO); Programming Languages (cs.PL); Machine Learning (stat.ML)
[448]  arXiv:2110.08733 (replaced) [pdf, other]
Title: LoveDA: A Remote Sensing Land-Cover Dataset for Domain Adaptive Semantic Segmentation
Comments: Accepted by NeurIPS 2021 Datasets and Benchmarks Track
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[449]  arXiv:2110.08820 (replaced) [pdf, other]
Title: On-board Fault Diagnosis of a Laboratory Mini SR-30 Gas Turbine Engine
Authors: Richa Singh
Subjects: Machine Learning (cs.LG); Systems and Control (eess.SY)
[450]  arXiv:2110.08866 (replaced) [pdf, other]
Title: Alleviating Noisy-label Effects in Image Classification via Probability Transition Matrix
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[451]  arXiv:2110.08956 (replaced) [pdf, other]
Title: Improving Robustness of Reinforcement Learning for Power System Control with Adversarial Training
Comments: Published at 2021 ICML RL4RL Workshop; Submitted to 2022 PSCC
Subjects: Systems and Control (eess.SY); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[452]  arXiv:2110.08991 (replaced) [pdf, other]
Title: Dimensionality Reduction for Wasserstein Barycenter
Comments: Published as a conference paper in NeurIPS 2021
Subjects: Data Structures and Algorithms (cs.DS); Machine Learning (cs.LG); Probability (math.PR)
[453]  arXiv:2110.09069 (replaced) [pdf, other]
Title: Diameter constrained Steiner tree and related problems
Comments: 13 pages, 4 figures
Subjects: Data Structures and Algorithms (cs.DS); Discrete Mathematics (cs.DM)
[454]  arXiv:2110.09086 (replaced) [pdf, other]
Title: ViraPart: A Text Refinement Framework for ASR and NLP Tasks in Persian
Subjects: Computation and Language (cs.CL)
[455]  arXiv:2110.09113 (replaced) [pdf]
Title: Salt and pepper noise removal method based on stationary Framelet transform with non-convex sparsity regularization
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[456]  arXiv:2110.09188 (replaced) [pdf, ps, other]
Title: Ride Sharing & Data Privacy: An Analysis of the State of Practice
Subjects: Computers and Society (cs.CY); Social and Information Networks (cs.SI)
[457]  arXiv:2110.09197 (replaced) [pdf, ps, other]
Title: On the Completeness and Complexity of the Lifted Dynamic Junction Tree Algorithm
Authors: Marcel Gehrke
Comments: StaRAI 2021
Subjects: Artificial Intelligence (cs.AI)
[458]  arXiv:2110.09248 (replaced) [pdf, ps, other]
Title: Demographic Biases of Crowd Workers in Key Opinion Leaders Finding
Comments: 3 pages, CSCW 2021 Workshop - Investigating and Mitigating Biases in Crowdsourced Data
Subjects: Information Retrieval (cs.IR)
[459]  arXiv:2110.09304 (replaced) [pdf, other]
Title: Prediction of Occurrence of Extreme Events using Machine Learning
Subjects: Machine Learning (cs.LG); Chaotic Dynamics (nlin.CD)
[460]  arXiv:2110.09365 (replaced) [pdf, other]
Title: Optical Front/Mid-haul with Open Access-Edge Server Deployment Framework for Sliced O-RAN
Comments: 16 pages
Subjects: Networking and Internet Architecture (cs.NI)
[461]  arXiv:2110.09380 (replaced) [pdf, other]
Title: Learning multiplane images from single views with self-supervision
Comments: To appear at BMVC 2021
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[462]  arXiv:2110.09436 (replaced) [pdf]
Title: Early Diagnostic Prediction of Covid-19 using Gradient-Boosting Machine Model
Authors: Satvik Tripathi
Comments: Presented at the Drexel Society of Artificial Intelligence Research Conference, 2021 (arXiv:2110.05263)
Subjects: Machine Learning (cs.LG)