
Computer Science

New submissions

[ total of 351 entries: 1-351 ]

New submissions for Fri, 15 Jan 21

[1]  arXiv:2101.05272 [pdf, other]
Title: Real or Virtual? Using Brain Activity Patterns to differentiate Attended Targets during Augmented Reality Scenarios
Subjects: Human-Computer Interaction (cs.HC); Machine Learning (cs.LG)

Augmented Reality is the fusion of virtual components and our real surroundings. The simultaneous visibility of generated and natural objects often requires users to direct their selective attention to a specific target that is either real or virtual. In this study, we investigated whether this target is real or virtual by using machine learning techniques to classify electroencephalographic (EEG) data collected in Augmented Reality scenarios. A shallow convolutional neural net classified 3-second data windows from 20 participants in a person-dependent manner with an average accuracy above 70% if the testing data and training data came from different trials. Person-independent classification was possible above chance level for 6 out of 20 participants. Thus, the reliability of such a Brain-Computer Interface is high enough for it to be treated as a useful input mechanism for Augmented Reality applications.

[2]  arXiv:2101.05273 [pdf, other]
Title: AutoDS: Towards Human-Centered Automation of Data Science
Subjects: Human-Computer Interaction (cs.HC); Machine Learning (cs.LG)

Data science (DS) projects often follow a lifecycle that consists of laborious tasks for data scientists and domain experts (e.g., data exploration, model training, etc.). Only recently have machine learning (ML) researchers developed promising automation techniques to aid data workers in these tasks. This paper introduces AutoDS, an automated machine learning (AutoML) system that aims to leverage the latest ML automation techniques to support data science projects. Data workers only need to upload their dataset; the system can then automatically suggest ML configurations, preprocess data, select algorithms, and train the model. These suggestions are presented to the user via a web-based graphical user interface and a notebook-based programming user interface.
We studied AutoDS with 30 professional data scientists, where one group used AutoDS, and the other did not, to complete a data science project. As expected, AutoDS improves productivity; yet surprisingly, we find that the models produced by the AutoDS group have higher quality and fewer errors, but lower human confidence scores. We reflect on the findings by presenting design implications for incorporating automation techniques into human work in the data science lifecycle.

[3]  arXiv:2101.05278 [pdf, other]
Title: GAN Inversion: A Survey
Comments: papers on generative modeling: this https URL awesome gan-inversion papers: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)

GAN inversion aims to invert a given image back into the latent space of a pretrained GAN model, for the image to be faithfully reconstructed from the inverted code by the generator. As an emerging technique to bridge the real and fake image domains, GAN inversion plays an essential role in enabling the pretrained GAN models such as StyleGAN and BigGAN to be used for real image editing applications. Meanwhile, GAN inversion also provides insights on the interpretation of GAN's latent space and how the realistic images can be generated. In this paper, we provide an overview of GAN inversion with a focus on its recent algorithms and applications. We cover important techniques of GAN inversion and their applications to image restoration and image manipulation. We further elaborate on some trends and challenges for future directions.

[4]  arXiv:2101.05300 [pdf, other]
Title: Proxemics and Social Interactions in an Instrumented Virtual Reality Workshop
Comments: 20 pages, 9 figures, ACM CHI 2021
Subjects: Human-Computer Interaction (cs.HC)

Virtual environments (VEs) can create collaborative and social spaces, which are increasingly important in the face of remote work and travel reduction. Recent advances, such as more open and widely available platforms, create new possibilities to observe and analyse interaction in VEs. Using a custom instrumented build of Mozilla Hubs to measure position and orientation, we conducted an academic workshop to facilitate a range of typical workshop activities. We analysed social interactions during a keynote, small group breakouts, and informal networking/hallway conversations. Our mixed-methods approach combined environment logging, observations, and semi-structured interviews. The results demonstrate how small and large spaces influenced group formation, shared attention, and personal space, where smaller rooms facilitated more cohesive groups while larger rooms made small group formation challenging but personal space more flexible. Beyond our findings, we show how the combination of data and insights can fuel collaborative spaces' design and deliver more effective virtual workshops.

[5]  arXiv:2101.05303 [pdf, other]
Title: Understanding the Effect of Out-of-distribution Examples and Interactive Explanations on Human-AI Decision Making
Comments: 42 pages, 22 figures
Subjects: Artificial Intelligence (cs.AI); Computers and Society (cs.CY); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG)

Although AI holds promise for improving human decision making in societally critical domains, it remains an open question how human-AI teams can reliably outperform AI alone and humans alone in challenging prediction tasks (also known as complementary performance). We explore two directions to understand the gaps in achieving complementary performance. First, we argue that the typical experimental setup limits the potential of human-AI teams. To account for lower AI performance out-of-distribution than in-distribution because of distribution shift, we design experiments with different distribution types and investigate human performance for both in-distribution and out-of-distribution examples. Second, we develop novel interfaces to support interactive explanations so that humans can actively engage with AI assistance. Using an in-person user study and large-scale randomized experiments across three tasks, we demonstrate a clear difference between in-distribution and out-of-distribution examples, and observe mixed results for interactive explanations: while interactive explanations improve human perception of AI assistance's usefulness, they may magnify human biases and lead to limited performance improvement. Overall, our work points out critical challenges and future directions towards complementary performance.

[6]  arXiv:2101.05304 [pdf, other]
Title: Spatial-Temporal Convolutional Network for Spread Prediction of COVID-19
Comments: IEEE BigData 2020
Subjects: Machine Learning (cs.LG)

In this work we present a spatial-temporal convolutional neural network for predicting the future severity of COVID-19 related symptoms among a population, per region, given its past reported symptoms. This can help approximate the number of future COVID-19 patients in each region, thus enabling a faster response, e.g., preparing the local hospital or declaring a local lockdown where necessary. Our model is based on a national symptom survey distributed in Israel and can predict symptom severity for different regions daily. The model includes two main parts: (1) learned region-based survey responder profiles used for aggregating questionnaire data into features, and (2) a spatial-temporal 3D convolutional neural network which uses these features to predict symptom progression.

[7]  arXiv:2101.05307 [pdf, other]
Title: Explainability of vision-based autonomous driving systems: Review and challenges
Comments: submitted to IJCV
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Robotics (cs.RO)

This survey reviews explainability methods for vision-based self-driving systems. The concept of explainability has several facets and the need for explainability is strong in driving, a safety-critical application. Gathering contributions from several research fields, namely computer vision, deep learning, autonomous driving, and explainable AI (X-AI), this survey tackles several points. First, it discusses definitions, context, and motivation for gaining more interpretability and explainability from self-driving systems. Second, major recent state-of-the-art approaches to develop self-driving systems are briefly presented. Third, methods providing explanations to a black-box self-driving system in a post-hoc fashion are comprehensively organized and detailed. Fourth, approaches from the literature that aim at building more interpretable self-driving systems by design are presented and discussed in detail. Finally, remaining open challenges and potential future research directions are identified and examined.

[8]  arXiv:2101.05308 [pdf, other]
Title: Toward Data Cleaning with a Target Accuracy: A Case Study for Value Normalization
Subjects: Databases (cs.DB)

Many applications need to clean data with a target accuracy. As far as we know, this problem has not been studied in depth. In this paper we take the first step toward solving it. We focus on value normalization (VN), the problem of replacing all strings that refer to the same entity with a unique string. VN is ubiquitous, and we often want to do VN with 100% accuracy. This is typically done today in industry by automatically clustering the strings and then asking a user to verify and clean the clusters, until reaching 100% accuracy. This solution has significant limitations. It does not tell the users how to verify and clean the clusters. This part also often takes a lot of time, e.g., days. Further, there is no effective way for multiple users to collaboratively verify and clean. In this paper we address these challenges. Overall, our work advances the state of the art in data cleaning by introducing a novel cleaning problem and describing a promising solution template.

[9]  arXiv:2101.05314 [pdf, ps, other]
Title: EXMA: A Genomics Accelerator for Exact-Matching
Comments: IEEE International Symposium on High-Performance Computer Architecture, 2021
Subjects: Hardware Architecture (cs.AR)

Genomics is the foundation of precision medicine, global food security and virus surveillance. Exact-match is one of the most essential operations, widely used in almost every step of genomics such as alignment, assembly, annotation, and compression. Modern genomics adopts the Ferragina-Manzini Index (FM-Index), which augments the space-efficient Burrows-Wheeler transform (BWT) with additional data structures to permit ultra-fast exact-match operations. However, FM-Index is notorious for its poor spatial locality and random memory access pattern. Prior works create GPU-, FPGA-, ASIC- and even process-in-memory (PIM)-based accelerators to boost FM-Index search throughput. Though they achieve state-of-the-art FM-Index search throughput, FM-Index PIMs, like all prior conventional accelerators, process only one DNA symbol after each DRAM row activation, thereby suffering from poor memory bandwidth utilization.
In this paper, we propose a hardware accelerator, EXMA, to enhance FM-Index search throughput. We first create a novel EXMA table with a multi-task-learning (MTL)-based index to process multiple DNA symbols with each DRAM row activation. We then build an accelerator to search over an EXMA table. We propose 2-stage scheduling to increase the cache hit rate of our accelerator. We introduce dynamic page policy to improve the row buffer hit rate of DRAM main memory. We also present CHAIN compression to reduce the data structure size of EXMA tables. Compared to state-of-the-art FM-Index PIMs, EXMA improves search throughput by $4.9\times$, and enhances search throughput per Watt by $4.8\times$.
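To make the memory-access problem concrete, the following is a minimal sketch of the textbook FM-Index backward search that the abstract builds on. It is not the EXMA design: the function names (`bwt`, `build_fm_index`, `count_matches`), the naive suffix sorting, and the uncompressed occurrence table are all assumptions chosen for readability.

```python
from collections import Counter


def bwt(text: str) -> str:
    """Burrows-Wheeler transform of text + '$' via naive suffix sorting."""
    s = text + "$"  # '$' is assumed to sort before every symbol in text
    suffix_array = sorted(range(len(s)), key=lambda i: s[i:])
    # The BWT symbol is the one preceding each sorted suffix (wraps for i == 0).
    return "".join(s[i - 1] for i in suffix_array)


def build_fm_index(bwt_str: str):
    """Build the two tables used by backward search.

    c_table[c]: number of symbols in the text strictly smaller than c.
    occ[c][i]:  number of occurrences of c in bwt_str[:i].
    """
    counts = Counter(bwt_str)
    c_table, total = {}, 0
    for ch in sorted(counts):
        c_table[ch] = total
        total += counts[ch]
    occ = {ch: [0] * (len(bwt_str) + 1) for ch in counts}
    for i, ch in enumerate(bwt_str):
        for k in occ:
            occ[k][i + 1] = occ[k][i] + (1 if k == ch else 0)
    return c_table, occ


def count_matches(pattern: str, c_table, occ, n: int) -> int:
    """Count exact occurrences of pattern, consuming one symbol per step."""
    lo, hi = 0, n  # half-open interval of matching suffix-array rows
    for ch in reversed(pattern):
        if ch not in c_table:
            return 0
        # Two table lookups at unrelated positions lo and hi per symbol.
        lo = c_table[ch] + occ[ch][lo]
        hi = c_table[ch] + occ[ch][hi]
        if lo >= hi:
            return 0
    return hi - lo


index_text = bwt("GATTACAGATTACA")
c_table, occ = build_fm_index(index_text)
print(count_matches("ATTA", c_table, occ, len(index_text)))
```

Each loop iteration handles exactly one symbol and reads the occurrence table at two positions that depend on the previous step, which is the one-symbol-per-access pattern the abstract describes as the bottleneck.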

[10]  arXiv:2101.05317 [pdf, other]
Title: Learning and Fast Adaptation for Grid Emergency Control via Deep Meta Reinforcement Learning
Subjects: Machine Learning (cs.LG); Systems and Control (eess.SY)

As power systems are undergoing a significant transformation with more uncertainties, less inertia, and operation closer to limits, there is an increasing risk of large outages. Thus, there is an imperative need to enhance grid emergency control to maintain system reliability and security. Towards this end, great progress has been made in developing deep reinforcement learning (DRL) based grid control solutions in recent years. However, existing DRL-based solutions have two main limitations: 1) they cannot cope well with a wide range of grid operation conditions, system parameters, and contingencies; 2) they generally lack the ability to adapt quickly to new grid operation conditions, system parameters, and contingencies, limiting their applicability for real-world applications. In this paper, we mitigate these limitations by developing a novel deep meta reinforcement learning (DMRL) algorithm. The DMRL combines meta strategy optimization with DRL, and trains policies modulated by a latent space that can quickly adapt to new scenarios. We test the developed DMRL algorithm on the IEEE 300-bus system. We demonstrate fast adaptation of the meta-trained DRL policies with latent variables to new operating conditions and scenarios using the proposed method and achieve superior performance compared to the state-of-the-art DRL and model predictive control (MPC) methods.

[11]  arXiv:2101.05323 [pdf, other]
Title: ZipLine: In-Network Compression at Line Speed
Journal-ref: 2020. Proceedings of the 16th International Conference on emerging Networking EXperiments and Technologies. Association for Computing Machinery, New York, NY, USA, 399-405
Subjects: Networking and Internet Architecture (cs.NI); Performance (cs.PF)

Network appliances continue to offer novel opportunities to offload processing from computing nodes directly into the data plane. One popular concern of network operators and their customers is to move data increasingly faster. A common technique to increase data throughput is to compress it before its transmission. However, this requires compression of the data -- a time and energy demanding pre-processing phase -- and decompression upon reception -- a similarly resource consuming operation. Moreover, if multiple nodes transfer similar data chunks across the network hop (e.g., a given pair of switches), each node effectively wastes resources by executing similar steps. This paper proposes ZipLine, an approach to design and implement (de)compression at line speed leveraging the Tofino hardware platform which is programmable using the P4_16 language. We report on lessons learned while building the system and show throughput, latency and compression measurements on synthetic and real-world traces, showcasing the benefits and trade-offs of our design.

[12]  arXiv:2101.05324 [pdf, ps, other]
Title: Multi-robot Symmetric Rendezvous Search on the Line
Subjects: Robotics (cs.RO); Discrete Mathematics (cs.DM)

We study the Symmetric Rendezvous Search Problem for a multi-robot system. There are $n>2$ robots arbitrarily located on a line. Their goal is to meet somewhere on the line as quickly as possible. The robots do not know the initial location of any of the other robots or their own positions on the line. The symmetric version of the problem requires the robots to execute the same search strategy to achieve rendezvous. Therefore, we solve the problem in an online fashion with a randomized strategy. In this paper, we present a symmetric rendezvous algorithm which achieves a constant competitive ratio for the total distance traveled by the robots. We validate our theoretical results through simulations.

[13]  arXiv:2101.05325 [pdf, other]
Title: Learning Kinematic Feasibility for Mobile Manipulation through Deep Reinforcement Learning
Comments: Code and Models: this http URL
Subjects: Robotics (cs.RO)

Mobile manipulation tasks remain one of the critical challenges for the widespread adoption of autonomous robots in both service and industrial scenarios. While planning approaches are good at generating feasible whole-body robot trajectories, they struggle with dynamic environments as well as the incorporation of constraints given by the task and the environment. On the other hand, dynamic motion models in the action space struggle with generating kinematically feasible trajectories for mobile manipulation actions. We propose a deep reinforcement learning approach to learn feasible dynamic motions for a mobile base while the end-effector follows a trajectory in task space generated by an arbitrary system to fulfill the task at hand. This modular formulation has several benefits: it enables us to readily transform a broad range of end-effector motions into mobile applications, it allows us to use the kinematic feasibility of the end-effector trajectory as a dense reward signal and its modular formulation allows it to generalise to unseen end-effector motions at test time. We demonstrate the capabilities of our approach on multiple mobile robot platforms with different kinematic abilities and different types of wheeled platforms in extensive simulated as well as real-world experiments.

[14]  arXiv:2101.05328 [pdf, other]
Title: Uniform Error and Posterior Variance Bounds for Gaussian Process Regression with Application to Safe Control
Subjects: Machine Learning (cs.LG); Systems and Control (eess.SY); Machine Learning (stat.ML)

In application areas where data generation is expensive, Gaussian processes are a preferred supervised learning model due to their high data-efficiency. Particularly in model-based control, Gaussian processes allow the derivation of performance guarantees using probabilistic model error bounds. To make these approaches applicable in practice, two open challenges must be solved: (i) existing error bounds rely on prior knowledge, which might not be available for many real-world tasks; (ii) the relationship between training data and the posterior variance, which mainly drives the error bound, is not well understood, which prevents asymptotic analysis. This article addresses these issues by presenting a novel uniform error bound using Lipschitz continuity and an analysis of the posterior variance function for a large class of kernels. Additionally, we show how these results can be used to guarantee safe control of an unknown dynamical system and provide illustrative numerical examples.

[15]  arXiv:2101.05329 [pdf, other]
Title: Improving Run Length Encoding by Preprocessing
Subjects: Data Structures and Algorithms (cs.DS)

The Run Length Encoding (RLE) compression method is a long-standing, simple lossless compression scheme which is easy to implement and achieves good compression on input data which contains repeating consecutive symbols. In its pure form RLE is not applicable to natural text or other input data with short sequences of identical symbols. We present a combination of preprocessing steps that turn arbitrary input data in a byte-wise encoding into a bit-string which is highly suitable for RLE compression. The main idea is to first read all most significant bits of the input byte-string, followed by the second most significant bit, and so on. We combine this approach with a dynamic byte remapping as well as a Burrows-Wheeler-Scott transform on a byte level. Finally, we apply a Huffman Encoding on the output of the bit-wise RLE encoding to allow for more dynamic lengths of code words encoding runs of the RLE. With our technique we can achieve lossless compression which is better than standard RLE compression by a factor of 8 on average.
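The main idea stated above, reading all most significant bits first and then each lower bit position in turn before run-length encoding, can be sketched as follows. This is only an illustration of that one step: the byte remapping, the Burrows-Wheeler-Scott transform, and the Huffman stage are omitted, and the function names and the (first bit, run lengths) output format are assumptions, not the authors' implementation.

```python
def bit_planes(data: bytes) -> list[int]:
    """Emit the most significant bit of every byte, then the next bit, and so on."""
    return [(byte >> shift) & 1 for shift in range(7, -1, -1) for byte in data]


def run_lengths(bits: list[int]) -> tuple[int, list[int]]:
    """Bitwise RLE: return the first bit and the lengths of alternating runs."""
    if not bits:
        return 0, []
    runs = []
    current, length = bits[0], 0
    for bit in bits:
        if bit == current:
            length += 1
        else:
            runs.append(length)
            current, length = bit, 1
    runs.append(length)
    return bits[0], runs


def decode(first_bit: int, runs: list[int]) -> bytes:
    """Invert run_lengths and bit_planes to recover the original bytes."""
    bits = []
    bit = first_bit
    for run in runs:
        bits.extend([bit] * run)
        bit ^= 1  # runs alternate between 0 and 1
    n = len(bits) // 8  # number of original bytes
    out = bytearray(n)
    for plane in range(8):
        shift = 7 - plane
        for i in range(n):
            out[i] |= bits[plane * n + i] << shift
    return bytes(out)


data = b"abcabcabc"
first_bit, runs = run_lengths(bit_planes(data))
print(len(runs), decode(first_bit, runs) == data)
```

For ASCII text the high bit planes are nearly constant across bytes, so they collapse into a few long runs, whereas the same bits read byte by byte change value every few positions.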

[16]  arXiv:2101.05337 [pdf, other]
Title: A Survey on Simulators for Testing Self-Driving Cars
Subjects: Robotics (cs.RO)

Rigorous and comprehensive testing plays a key role in training self-driving cars to handle the variety of situations that they are expected to see on public roads. Physical testing on public roads is unsafe, costly, and not always reproducible. This is where testing in simulation helps fill the gap; however, simulation testing is only as good as the simulator used and how representative the simulated scenarios are of the real environment. In this paper, we identify key requirements that a good simulator must have. Further, we provide a comparison of commonly used simulators. Our analysis shows that CARLA and LGSVL simulators are the current state-of-the-art simulators for end-to-end testing of self-driving cars for the reasons mentioned in this paper. Finally, we also present current challenges that simulation testing continues to face as we march towards building fully autonomous cars.

[17]  arXiv:2101.05346 [pdf, other]
Title: X-CAL: Explicit Calibration for Survival Analysis
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)

Survival analysis models the distribution of time until an event of interest, such as discharge from the hospital or admission to the ICU. When a model's predicted number of events within any time interval is similar to the observed number, it is called well-calibrated. A survival model's calibration can be measured using, for instance, distributional calibration (D-CALIBRATION) [Haider et al., 2020] which computes the squared difference between the observed and predicted number of events within different time intervals. Classically, calibration is addressed in post-training analysis. We develop explicit calibration (X-CAL), which turns D-CALIBRATION into a differentiable objective that can be used in survival modeling alongside maximum likelihood estimation and other objectives. X-CAL allows practitioners to directly optimize calibration and strike a desired balance between predictive power and calibration. In our experiments, we fit a variety of shallow and deep models on simulated data, a survival dataset based on MNIST, on length-of-stay prediction using MIMIC-III data, and on brain cancer data from The Cancer Genome Atlas. We show that the models we study can be miscalibrated. We give experimental evidence on these datasets that X-CAL improves D-CALIBRATION without a large decrease in concordance or likelihood.
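As a rough illustration of the calibration measure described above, the sketch below computes a D-CALIBRATION-style score under two simplifying assumptions that are not taken from the abstract: every event time is observed (no censoring), and the model supplies its predicted CDF value at each observed event time. Under a well-calibrated model those CDF values are spread uniformly over [0, 1], so the score compares the observed fraction in each equal-width bin with the expected fraction. The function name and the bin count are hypothetical.

```python
def d_calibration(cdf_at_event_times: list[float], num_bins: int = 10) -> float:
    """Sum of squared gaps between observed and expected bin fractions.

    cdf_at_event_times: predicted CDF evaluated at each subject's event time,
    assumed uncensored for this sketch.
    """
    n = len(cdf_at_event_times)
    counts = [0] * num_bins
    for u in cdf_at_event_times:
        # Hard bin assignment; a value of exactly 1.0 goes into the last bin.
        index = min(int(u * num_bins), num_bins - 1)
        counts[index] += 1
    expected = 1.0 / num_bins
    return sum((count / n - expected) ** 2 for count in counts)


# Evenly spread CDF values score close to 0; values piled into one bin do not.
well_spread = [(i + 0.5) / 100 for i in range(100)]
print(d_calibration(well_spread), d_calibration([0.5] * 100))
```

The hard bin assignment here is not differentiable. The abstract states that X-CAL turns the measure into a differentiable objective, which this sketch does not attempt.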

[18]  arXiv:2101.05348 [pdf, other]
Title: Gaussian Mixture Graphical Lasso with Application to Edge Detection in Brain Networks
Subjects: Machine Learning (cs.LG)

Sparse inverse covariance estimation (i.e., edge detection) has been an important research problem in recent years, where the goal is to discover the direct connections between a set of nodes in a networked system based upon the observed node activities. Existing works mainly focus on unimodal distributions, where it is usually assumed that the observed activities are generated from a single Gaussian distribution (i.e., one graph). However, this assumption is too strong for many real-world applications. In many real-world applications (e.g., brain networks), the node activities usually exhibit much more complex patterns that are difficult to capture with one single Gaussian distribution. In this work, we are inspired by Latent Dirichlet Allocation (LDA) [4] and consider modeling the edge detection problem as estimating a mixture of multiple Gaussian distributions, where each corresponds to a separate sub-network. To address this problem, we propose a novel model called Gaussian Mixture Graphical Lasso (MGL). It learns the proportions of signals generated by each mixture component and their parameters iteratively via an EM framework. To obtain more interpretable networks, MGL imposes a special regularization, called Mutual Exclusivity Regularization (MER), to minimize the overlap between different sub-networks. MER also addresses the common issues in real-world data sets, i.e., noisy observations and small sample size. Through extensive experiments on synthetic and real brain data sets, the results demonstrate that MGL can effectively discover multiple connectivity structures from the observed node activities.

[19]  arXiv:2101.05349 [pdf, ps, other]
Title: On the Identification of Electrical Equivalent Circuit Models Based on Noisy Measurements
Subjects: Systems and Control (eess.SY)

Real-time identification of electrical equivalent circuit models is a critical requirement in many practical systems, such as batteries and electric motors. Significant work has been done in the past developing different types of algorithms for system identification using reduced equivalent circuit models. However, little work has been done in analyzing the theoretical performance bounds of these approaches. Proper understanding of theoretical bounds will help in designing a system that is economical in cost and robust in performance. In this paper, we analyze the performance of a linear recursive least squares approach to equivalent circuit model identification and show that the least squares approach is both unbiased and efficient when the signal-to-noise ratio is high enough. However, we show that, when the signal-to-noise ratio is low - resembling the case in many practical applications - the least squares estimator becomes significantly biased. Consequently, we develop a parameter estimation approach based on the total least squares method and show it to be asymptotically unbiased and efficient at practically low signal-to-noise ratio regions. Further, we develop a recursive implementation of the total least squares algorithm and find it to be slow to converge; for this, we employ a Kalman filter to improve the convergence speed of the total least squares method. The resulting total Kalman filter is shown to be both unbiased and efficient in equivalent circuit model parameter identification. The performance of this filter is analyzed using a real-world current profile under fluctuating signal-to-noise ratios. Finally, the applicability of the algorithms and analysis in this paper in identifying higher order electrical equivalent circuit models is explained.
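The contrast between ordinary and total least squares can be illustrated with a single-parameter straight-line fit, for example voltage against current. The sketch below is a batch, closed-form version under the assumption that noise of equal variance affects both measured quantities; it is not the recursive or Kalman-filter formulation from the abstract, and the function names and sample values are made up for illustration.

```python
import math


def centered_moments(xs: list[float], ys: list[float]):
    """Return the centered sums of squares and cross-products (sxx, syy, sxy)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxx, syy, sxy


def ols_slope(xs: list[float], ys: list[float]) -> float:
    """Ordinary least squares: treats xs as noise-free."""
    sxx, _, sxy = centered_moments(xs, ys)
    return sxy / sxx


def tls_slope(xs: list[float], ys: list[float]) -> float:
    """Total least squares (orthogonal regression) slope; requires sxy != 0."""
    sxx, syy, sxy = centered_moments(xs, ys)
    return (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)


# Hypothetical measurements of a line with true slope 2, noisy in both axes.
noisy_x = [1.6, 1.5, 3.8, 3.3, 5.4]
noisy_y = [2.1, 3.9, 6.2, 7.8, 10.1]
print(ols_slope(noisy_x, noisy_y), tls_slope(noisy_x, noisy_y))
```

When the regressor itself is noisy, the ordinary least squares slope is pulled toward zero, while the orthogonal fit accounts for errors in both variables, which is the intuition behind the bias result reported in the abstract.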

[20]  arXiv:2101.05356 [pdf, other]
Title: Practical Face Reconstruction via Differentiable Ray Tracing
Comments: 16 pages, 14 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)

We present a differentiable ray-tracing based novel face reconstruction approach where scene attributes - 3D geometry, reflectance (diffuse, specular and roughness), pose, camera parameters, and scene illumination - are estimated from unconstrained monocular images. The proposed method models scene illumination via a novel, parameterized virtual light stage, which, in conjunction with differentiable ray-tracing, introduces a coarse-to-fine optimization formulation for face reconstruction. Our method can not only handle unconstrained illumination and self-shadow conditions, but also estimates diffuse and specular albedos. To estimate the face attributes consistently and with practical semantics, a two-stage optimization strategy systematically uses a subset of parametric attributes, where subsequent attribute estimations factor in those previously estimated. For example, self-shadows estimated during the first stage later prevent their baking into the personalized diffuse and specular albedos in the second stage. We show the efficacy of our approach in several real-world scenarios, where face attributes can be estimated even under extreme illumination conditions. Ablation studies, analyses and comparisons against several recent state-of-the-art methods show improved accuracy and versatility of our approach. With consistent face attributes reconstruction, our method leads to several style -- illumination, albedo, self-shadow -- edit and transfer applications, as discussed in the paper.

[21]  arXiv:2101.05357 [pdf, other]
Title: Towards Creating a Deployable Grasp Type Probability Estimator for a Prosthetic Hand
Journal-ref: CyPhy 2019, WESE 2019. Lecture Notes in Computer Science, vol 11971. Springer, Cham
Subjects: Machine Learning (cs.LG); Hardware Architecture (cs.AR)

For lower arm amputees, prosthetic hands promise to restore most physical interaction capabilities. This requires accurately predicting hand gestures capable of grabbing varying objects and executing them in a timely manner as intended by the user. Current approaches often rely on physiological signal inputs such as Electromyography (EMG) signals from residual limb muscles to infer the intended motion. However, limited signal quality, user diversity and high variability adversely affect the system robustness. Instead of solely relying on EMG signals, our work enables augmenting EMG intent inference with physical state probability through machine learning and computer vision methods. To this end, we: (1) study state-of-the-art deep neural network architectures to select a performant source of knowledge transfer for the prosthetic hand, (2) use a dataset containing object images and probability distribution of grasp types as a new form of labeling where instead of using absolute values of zero and one as the conventional classification labels, our labels are a set of probabilities whose sum is 1. The proposed method generates probabilistic predictions which could be fused with EMG prediction of probabilities over grasps by using the visual information from the palm camera of a prosthetic hand. Our results demonstrate that InceptionV3 achieves the highest accuracy with 0.95 angular similarity, followed by 1.4 MobileNetV2 with 0.93 at ~20% the amount of operations.
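The abstract does not define angular similarity, so the following sketch uses one standard definition for non-negative vectors such as probability distributions: one minus the angle between the two vectors divided by its maximum of pi/2. The function name and the example grasp distributions are hypothetical and are not taken from the paper's dataset.

```python
import math


def angular_similarity(p: list[float], q: list[float]) -> float:
    """Angular similarity in [0, 1] for non-negative, non-zero vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cosine = max(-1.0, min(1.0, dot / norm))
    return 1.0 - 2.0 * math.acos(cosine) / math.pi


# Hypothetical label and prediction over four grasp types; each sums to 1.
label = [0.6, 0.3, 0.1, 0.0]
prediction = [0.5, 0.35, 0.1, 0.05]
print(angular_similarity(label, prediction))
```

Because the labels are distributions rather than one-hot vectors, a similarity of this kind rewards predictions that rank and weight all grasp types sensibly, not only the top choice.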

[22]  arXiv:2101.05360 [pdf, other]
Title: Preferential Mixture-of-Experts: Interpretable Models that Rely on Human Expertise as much as Possible
Comments: 10 pages, 5 figures, 4 tables, AMIA 2021 Virtual Informatics Summit
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)

We propose Preferential MoE, a novel human-ML mixture-of-experts model that augments human expertise in decision making with a data-based classifier only when necessary for predictive performance. Our model exhibits an interpretable gating function that provides information on when human rules should be followed or avoided. The gating function is maximized for using human-based rules, and classification errors are minimized. We propose solving a coupled multi-objective problem with convex subproblems. We develop approximate algorithms and study their performance and convergence. Finally, we demonstrate the utility of Preferential MoE on two clinical applications for the treatment of Human Immunodeficiency Virus (HIV) and management of Major Depressive Disorder (MDD).

[23]  arXiv:2101.05361 [pdf, other]
Title: Random Shadows and Highlights: A new data augmentation method for Extreme Lighting Conditions
Subjects: Computer Vision and Pattern Recognition (cs.CV)

In this paper, we propose a new data augmentation method, Random Shadows and Highlights (RSH) to acquire robustness against lighting perturbations. Our method creates random shadows and highlights on images, thus challenging the neural network during the learning process such that it acquires immunity against such input corruptions in real world applications. It is a parameter-learning free method which can be integrated into most vision related learning applications effortlessly. With extensive experimentation, we demonstrate that RSH not only increases the robustness of the models against lighting perturbations, but also reduces over-fitting significantly. Thus RSH should be considered essential for all vision related learning systems. Code is available at: https://github.com/OsamaMazhar/Random-Shadows-Highlights.

[24]  arXiv:2101.05362 [pdf, other]
Title: White-Box Analysis over Machine Learning: Modeling Performance of Configurable Systems
Comments: Accepted for publication at ICSE'21
Subjects: Software Engineering (cs.SE)

Performance-influence models can help stakeholders understand how and where configuration options and their interactions influence the performance of a system. With this understanding, stakeholders can debug performance behavior and make deliberate configuration decisions. Current black-box techniques to build such models combine various sampling and learning strategies, resulting in tradeoffs between measurement effort, accuracy, and interpretability. We present Comprex, a white-box approach to build performance-influence models for configurable systems, combining insights of local measurements, dynamic taint analysis to track options in the implementation, compositionality, and compression of the configuration space, without relying on machine learning to extrapolate incomplete samples. Our evaluation on 4 widely-used, open-source projects demonstrates that Comprex builds similarly accurate performance-influence models to the most accurate and expensive black-box approach, but at a reduced cost and with additional benefits from interpretable and local models.

[25]  arXiv:2101.05363 [pdf, other]
Title: NetCut: Real-Time DNN Inference Using Layer Removal
Subjects: Machine Learning (cs.LG); Performance (cs.PF)

Deep Learning plays a significant role in assisting humans in many aspects of their lives. As these networks tend to get deeper over time, they extract more features to increase accuracy at the cost of additional inference latency. This accuracy-performance trade-off makes it more challenging for Embedded Systems, as resource-constrained processors with strict deadlines, to deploy them efficiently. This can lead to selection of networks that can prematurely meet a specified deadline with excess slack time that could have potentially contributed to increased accuracy.
In this work, we propose: (i) the concept of layer removal as a means of constructing TRimmed Networks (TRNs) that are based on removing problem-specific features of a pretrained network used in transfer learning, and (ii) NetCut, a methodology based on an empirical or an analytical latency estimator, which only proposes and retrains TRNs that can meet the application's deadline, hence reducing the exploration time significantly.
We demonstrate that TRNs can expand the Pareto frontier that trades off latency and accuracy to provide networks that can meet arbitrary deadlines with potential accuracy improvement over off-the-shelf networks. Our experimental results show that such utilization of TRNs, while transferring to a simpler dataset, in combination with NetCut, can lead to the proposal of networks that can achieve relative accuracy improvement of up to 10.43% among existing off-the-shelf neural architectures while meeting a specific deadline, and 27x speedup in exploration time.

[26]  arXiv:2101.05371 [pdf]
Title: Anomaly Detection Support Using Process Classification
Comments: 14 pages, 6 figures
Journal-ref: Proceedings of the 5th International Conference on Software Security and Assurance (ICSSA 2019), 2019, 27-40
Subjects: Machine Learning (cs.LG)

Anomaly detection systems need to consider a lot of information when scanning for anomalies. One example is the context of the process in which an anomaly might occur, because anomalies for one process might not be anomalies for a different one. Therefore data -- such as system events -- need to be assigned to the program they originate from. This paper investigates whether it is possible to infer from a list of system events the program whose behavior caused the occurrence of these system events. To that end, we model transition probabilities between non-equivalent events and apply the $k$-nearest neighbors algorithm. This system is evaluated on non-malicious, real-world data using four different evaluation scores. Our results suggest that the approach proposed in this paper is capable of correctly inferring program names from system events.
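The pipeline described above can be illustrated with a small sketch. It assumes that each trace is a list of event names, that "non-equivalent events" means consecutive events with different names, and that traces are compared by Euclidean distance between flattened transition-probability matrices. These choices, the event vocabulary, and all function names are illustrative assumptions, not details taken from the paper.

```python
from collections import Counter
import math

VOCAB = ["open", "read", "write", "close"]  # hypothetical event vocabulary


def transition_features(events, vocab=VOCAB):
    """Flattened matrix of transition probabilities between differing consecutive events."""
    counts = Counter((a, b) for a, b in zip(events, events[1:]) if a != b)
    feats = []
    for a in vocab:
        row_total = sum(counts[(a, b)] for b in vocab)
        for b in vocab:
            # Rows with no outgoing transitions stay all-zero.
            feats.append(counts[(a, b)] / row_total if row_total else 0.0)
    return feats


def knn_predict(train, query, k=3):
    """Majority vote among the k training vectors closest to the query vector."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Made-up traces for two hypothetical programs.
train = [
    (transition_features(["open", "read", "write", "write", "close"]), "editor"),
    (transition_features(["open", "write", "read", "write", "close"]), "editor"),
    (transition_features(["open", "read", "read", "close"]), "reader"),
    (transition_features(["open", "read", "close"]), "reader"),
]

query = transition_features(["open", "read", "read", "read", "close"])
predicted = knn_predict(train, query, k=3)
```

Dropping self-transitions makes the feature vector insensitive to how often an event repeats, which is one possible reading of "non-equivalent"; other readings would change the feature construction but not the overall structure.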

[27]  arXiv:2101.05372 [pdf, other]
Title: A Deep Reinforcement Learning Framework for Eco-driving in Connected and Automated Hybrid Electric Vehicles
Comments: This work has been submitted to the IEEE for possible publication and is under review. Paper summary: 14 pages, 16 figures
Subjects: Systems and Control (eess.SY)

Connected and Automated Vehicles (CAVs), in particular those with multiple power sources, have the potential to significantly reduce fuel consumption and travel time in real-world driving conditions. In particular, the Eco-driving problem seeks to design optimal speed and power usage profiles based upon look-ahead information from connectivity and advanced mapping features, to minimize the fuel consumption over a given itinerary.
Due to the complexity of the problem and the limited on-board computational capability, the real-time implementation of many existing methods that rely on online trajectory optimization becomes infeasible. In this work, the Eco-driving problem is formulated as a Partially Observable Markov Decision Process (POMDP), which is then solved with a state-of-the-art Deep Reinforcement Learning (DRL) Actor Critic algorithm, Proximal Policy Optimization. An Eco-driving simulation environment is developed for training and testing purposes. To benchmark the performance of the DRL controller, a baseline controller representing the human driver and the wait-and-see deterministic optimal solution are presented. With minimal on-board computational requirement and comparable travel time, the DRL controller reduces the fuel consumption by more than 17% by modulating the vehicle velocity over the route and performing energy-efficient approach and departure at signalized intersections when compared against a baseline controller.

[28]  arXiv:2101.05373 [pdf, other]
Title: On a Class of Time-Varying Gaussian ISI Channels
Authors: Kamyar Moshksar
Subjects: Information Theory (cs.IT)

This paper studies a class of stochastic and time-varying Gaussian intersymbol interference (ISI) channels. The $i^{th}$ channel tap during time slot $t$ is uniformly distributed over an interval of centre $c_i$ and radius $ r_{i}$. The array of channel taps is independent along both $t$ and $i$. The channel state information is unavailable at both the transmitter and the receiver. Lower and upper bounds are derived on the White-Gaussian-Input (WGI) capacity $C_{\scriptscriptstyle{WGI}}$ for arbitrary values of the radii $ r_i$. It is shown that $C_{\scriptscriptstyle{WGI}}$ does not scale with the average input power. The proposed lower bound is achieved by a joint-typicality decoder that is tuned to a set of candidates for the channel matrix. This set forms a net that covers the range of the random channel matrix and its resolution is optimized in order to yield the largest achievable rate. Tools in matrix analysis such as Weyl's inequality on perturbation of eigenvalues of symmetric matrices are used in order to analyze the probability of error.

[29]  arXiv:2101.05385 [pdf]
Title: A Work-Centered Approach for Cyber-Physical-Social System Design: Applications in Aerospace Industrial Inspection
Subjects: Human-Computer Interaction (cs.HC)

Industrial inspection automation in aerospace presents numerous challenges due to the dynamic, information-rich and regulated aspects of the domain. To diagnose the condition of an aircraft component, expert inspectors rely on a significant amount of procedural and tacit knowledge (know-how). As systems capabilities do not match high level human cognitive functions, the role of humans in future automated work systems will remain important. A Cyber-Physical-Social System (CPSS) is a suitable solution that envisions humans and agents in a joint activity to enhance cognitive/computational capabilities and produce better outcomes. This paper investigates how a work-centred approach can support and guide the engineering process of a CPSS with an industrial use case. We present a robust methodology that combines fieldwork inquiries and model-based engineering to elicit and formalize rich mental models into exploitable design patterns. Our results exhibit how inspectors process and apply knowledge to diagnose the component's condition, how they deal with the institution's rules and operational constraints (norms, safety policies, standard operating procedures). We suggest how these patterns can be incorporated in software modules or can conceptualize Human-Agent Teaming requirements. We argue that this framework can corroborate the right fit between a system's technical and ecological validity (system fit with operating context) that enhances data reliability, productivity-related factors and system acceptance by end-users.

[30]  arXiv:2101.05388 [pdf, other]
Title: Evaluating Soccer Player: from Live Camera to Deep Reinforcement Learning
Subjects: Machine Learning (cs.LG); Computer Science and Game Theory (cs.GT)

Scientifically evaluating soccer players represents a challenging Machine Learning problem. Unfortunately, most existing answers have very opaque algorithm training procedures; relevant data are scarcely accessible and almost impossible to generate. In this paper, we will introduce a two-part solution: an open-source Player Tracking model and a new approach to evaluate these players based solely on Deep Reinforcement Learning, without human data training or guidance. Our tracking model was trained in a supervised fashion on datasets we will also release, and our Evaluation Model relies only on simulations of virtual soccer games. Combining those two architectures allows one to evaluate Soccer Players directly from a live camera without large-dataset constraints. We term our new approach Expected Discounted Goal (EDG), as it represents the number of goals a team can score or concede from a particular state. This approach leads to more meaningful results than the existing ones that are based on real-world data, and could easily be extended to other sports.

[31]  arXiv:2101.05390 [pdf, ps, other]
Title: Quantitative Rates and Fundamental Obstructions to Non-Euclidean Universal Approximation with Deep Narrow Feed-Forward Networks
Subjects: Machine Learning (cs.LG); Functional Analysis (math.FA); General Topology (math.GN); Geometric Topology (math.GT)

By incorporating structured pairs of non-trainable input and output layers, the universal approximation property of feed-forward networks has recently been extended across a broad range of non-Euclidean input spaces X and output spaces Y. We quantify the number of narrow layers required for these "deep geometric feed-forward neural networks" (DGNs) to approximate any continuous function in $C(X,Y)$, uniformly on compacts. The DGN architecture is then extended to accommodate complete Riemannian manifolds, where the input and output layers are only defined locally, and we obtain local analogs of our results. In this case, we find that both the global and local universal approximation guarantees can only coincide when approximating null-homotopic functions. Consequently, we show that if Y is a compact Riemannian manifold, then there exists a function that cannot be uniformly approximated on large compact subsets of X. Nevertheless, we obtain lower bounds on the maximum diameter of any geodesic ball in X wherein our local universal approximation results hold. Applying our results, we build universal approximators between spaces of non-degenerate Gaussian measures. We also obtain a quantitative version of the universal approximation theorem for classical deep narrow feed-forward networks with general activation functions.

[32]  arXiv:2101.05397 [pdf, other]
Title: Should Ensemble Members Be Calibrated?
Authors: Xixin Wu, Mark Gales
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)

Underlying the use of statistical approaches for a wide range of applications is the assumption that the probabilities obtained from a statistical model are representative of the "true" probability that an event, or outcome, will occur. Unfortunately, for modern deep neural networks this is not the case: they are often observed to be poorly calibrated. Additionally, these deep learning approaches make use of large numbers of model parameters, motivating the use of Bayesian, or ensemble approximation, approaches to handle issues with parameter estimation. This paper explores the application of calibration schemes to deep ensembles from both a theoretical perspective and empirically on a standard image classification task, CIFAR-100. The underlying theoretical requirements for calibration, and associated calibration criteria, are first described. It is shown that well calibrated ensemble members will not necessarily yield a well calibrated ensemble prediction, and if the ensemble prediction is well calibrated its performance cannot exceed the average performance of the calibrated ensemble members. On CIFAR-100 the impact of calibration for ensemble prediction, and associated calibration, is evaluated. Additionally, the situation where multiple different topologies are combined is discussed.

[33]  arXiv:2101.05399 [pdf, other]
Title: Act to Reason: A Dynamic Game Theoretical Model of Driving
Comments: 17 pages, 10 figures
Subjects: Multiagent Systems (cs.MA); Systems and Control (eess.SY)

The focus of this paper is to propose a driver model that incorporates human reasoning levels as actions during interactions with other drivers. Different from earlier work using game theoretical human reasoning levels, we propose a dynamic approach, where the actions are the levels themselves, instead of conventional driving actions such as accelerating or braking. This results in a dynamic behavior, where the agent adapts to its environment by exploiting different behavior models as available moves to choose from, depending on the requirements of the traffic situation. The bounded rationality assumption is preserved since the selectable strategies are designed by adhering to the fact that humans are cognitively limited in their understanding and decision making. Using a highway merging scenario, it is demonstrated that the proposed dynamic approach produces more realistic outcomes compared to the conventional method that employs fixed human reasoning levels.

[34]  arXiv:2101.05400 [pdf, other]
Title: Machine-Assisted Script Curation
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)

We describe Machine-Aided Script Curator (MASC), a system for human-machine collaborative script authoring. Scripts produced with MASC include (1) English descriptions of sub-events that comprise a larger, complex event; (2) event types for each of those events; (3) a record of entities expected to participate in multiple sub-events; and (4) temporal sequencing between the sub-events. MASC automates portions of the script creation process with suggestions for event types, links to Wikidata, and sub-events that may have been forgotten. We illustrate how these automations are useful to the script writer with a few case-study scripts.

[35]  arXiv:2101.05403 [pdf]
Title: Image deblurring based on lightweight multi-information fusion network
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

Recently, deep learning based image deblurring has been well developed. However, exploiting the detailed image features in a deep learning framework always requires a mass of parameters, which inevitably makes the network suffer from high computational burden. To solve this problem, we propose a lightweight multi-information fusion network (LMFN) for image deblurring. The proposed LMFN is designed as an encoder-decoder architecture. In the encoding stage, the image feature is reduced to various small-scale spaces for multi-scale information extraction and fusion without a large amount of information loss. Then, a distillation network is used in the decoding stage, which allows the network to benefit the most from residual learning while remaining sufficiently lightweight. Meanwhile, an information fusion strategy between distillation modules and feature channels is also carried out by an attention mechanism. Through fusing different information in the proposed approach, our network can achieve state-of-the-art image deblurring results with a smaller number of parameters and outperforms existing methods in model complexity.

[36]  arXiv:2101.05405 [pdf, other]
Title: Privacy Analysis in Language Models via Training Data Leakage Report
Subjects: Cryptography and Security (cs.CR); Computation and Language (cs.CL); Machine Learning (cs.LG)

Recent advances in neural network based language models lead to successful deployments of such models, improving user experience in various applications. It has been demonstrated that strong performance of language models may come along with the ability to memorize rare training samples, which poses serious privacy threats in case the model training is conducted on confidential user content. This necessitates privacy monitoring techniques to minimize the chance of possible privacy breaches for the models deployed in practice. In this work, we introduce a methodology that investigates identifying the user content in the training data that could be leaked under a strong and realistic threat model. We propose two metrics to quantify user-level data leakage by measuring a model's ability to produce unique sentence fragments within training data. Our metrics further enable comparing different models trained on the same data in terms of privacy. We demonstrate our approach through extensive numerical studies on real-world datasets such as email and forum conversations. We further illustrate how the proposed metrics can be utilized to investigate the efficacy of mitigations like differentially private training or API hardening.

[37]  arXiv:2101.05408 [pdf, other]
Title: The full approximation storage multigrid scheme: A 1D finite element example
Authors: Ed Bueler
Comments: 18 pages, 6 figures
Subjects: Numerical Analysis (math.NA)

This note describes the full approximation storage (FAS) multigrid scheme for an easy one-dimensional nonlinear boundary value problem. The problem is discretized by a simple finite element (FE) scheme. We apply both FAS V-cycles and F-cycles, with a nonlinear Gauss-Seidel smoother, to solve the resulting finite-dimensional problem. The mathematics of the FAS restriction and prolongation operators, in the FE case, are explained. A self-contained Python program implements the scheme, and its optimal performance is demonstrated.

[38]  arXiv:2101.05412 [pdf, other]
Title: Interval centred form for proving stability of non-linear discrete-time systems
Authors: Auguste Bourgois (ENSTA Bretagne, Brest, France), Luc Jaulin (ENSTA Bretagne, Brest, France)
Comments: In Proceedings SNR 2020, arXiv:2101.05256
Journal-ref: EPTCS 331, 2021, pp. 1-17
Subjects: Systems and Control (eess.SY)

In this paper, we propose a new approach to prove stability of non-linear discrete-time systems. After introducing the new concept of stability contractor, we show that the interval centred form plays a fundamental role in this context and makes it possible to easily prove asymptotic stability of a discrete system. Then, we illustrate the principle of our approach through theoretical examples. Finally, we provide two practical examples using our method: proving stability of a localisation system and that of the trajectory of a robot.

[39]  arXiv:2101.05414 [pdf, other]
Title: Verification and Reachability Analysis of Fractional-Order Differential Equations Using Interval Analysis
Authors: Andreas Rauh (Lab-STICC, ENSTA Bretagne, Brest, France), Julia Kersten (University of Rostock, Chair of Mechatronics, Rostock, Germany)
Comments: In Proceedings SNR 2020, arXiv:2101.05256
Journal-ref: EPTCS 331, 2021, pp. 18-32
Subjects: Systems and Control (eess.SY)

Interval approaches for the reachability analysis of initial value problems for sets of classical ordinary differential equations have been investigated and implemented by many researchers during the last decades. However, there exist numerous applications in computational science and engineering, where continuous-time system dynamics cannot be described adequately by integer-order differential equations. Especially in cases in which long-term memory effects are observed, fractional-order system representations are promising to describe the dynamics, on the one hand, with sufficient accuracy and, on the other hand, to limit the number of required state variables and parameters to a reasonable amount. Real-life applications for such fractional-order models can, among others, be found in the field of electrochemistry, where methods for impedance spectroscopy are typically used to identify fractional-order models for the charging/discharging behavior of batteries or for the dynamic relation between voltage and current in fuel cell systems if operated in a non-stationary state. This paper aims at presenting an iterative method for reachability analysis of fractional-order systems that is based on an interval arithmetic extension of Mittag-Leffler functions. An illustrating example, inspired by a low-order model of battery systems concludes this contribution.

[40]  arXiv:2101.05415 [pdf, other]
Title: Analysis of E-commerce Ranking Signals via Signal Temporal Logic
Authors: Tommaso Dreossi (Amazon Search), Giorgio Ballardin (Amazon Search), Parth Gupta (Amazon Search), Jan Bakus (Amazon Search), Yu-Hsiang Lin (Amazon Search), Vamsi Salaka (Amazon Search)
Comments: In Proceedings SNR 2020, arXiv:2101.05256
Journal-ref: EPTCS 331, 2021, pp. 33-42
Subjects: Logic in Computer Science (cs.LO); Formal Languages and Automata Theory (cs.FL); Information Retrieval (cs.IR); Machine Learning (cs.LG)

The timed position of documents retrieved by learning-to-rank models can be seen as signals. Signals carry useful information such as drop or rise of documents over time or user behaviors. In this work, we propose to use the logic formalism called Signal Temporal Logic (STL) to characterize document behaviors in ranking according to the specified formulas. Our analysis shows that interesting document behaviors can be easily formalized and detected thanks to STL formulas. We validate our idea on a dataset of 100K product signals. Through the presented framework, we uncover interesting patterns, such as cold start, warm start, and spikes, and inspect how they affect our learning-to-rank models.

[41]  arXiv:2101.05417 [pdf, other]
Title: High-order FDTD schemes for Maxwell's interface problems with discontinuous coefficients and complex interfaces based on the Correction Function Method
Comments: 27 pages, 12 figures
Subjects: Numerical Analysis (math.NA)

We propose high-order FDTD schemes based on the Correction Function Method (CFM) for Maxwell's interface problems with discontinuous coefficients and complex interfaces. The key idea of the CFM is to model the correction function near an interface to retain the order of a finite difference approximation. For this, we solve a system of PDEs based on the original problem by minimizing an energy functional. The CFM is applied to the standard Yee scheme and a fourth-order FDTD scheme. The proposed CFM-FDTD schemes are verified in 2-D using the transverse magnetic mode (TM$_z$). Numerical examples include scattering of magnetic and non-magnetic dielectric cylinders, and problems with manufactured solutions using various complex interfaces and discontinuous piecewise varying coefficients. Long-time simulations are also performed to provide numerical evidence of the stability of the proposed numerical approach. The proposed CFM-FDTD schemes achieve up to fourth-order convergence in $L^2$-norm and provide approximations devoid of spurious oscillations.

[42]  arXiv:2101.05418 [pdf, other]
Title: Enclosing the Sliding Surfaces of a Controlled Swing
Authors: Luc Jaulin (Robex, Lab-STICC), Benoît Desrochers (DGA-TN)
Comments: In Proceedings SNR 2020, arXiv:2101.05256
Journal-ref: EPTCS 331, 2021, pp. 43-55
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Logic in Computer Science (cs.LO)

When implementing a non-continuous controller for a cyber-physical system, it may happen that the evolution of the closed-loop system is no longer piecewise differentiable along the trajectory, mainly due to conditional statements inside the controller. This may lead to some unwanted chattering effects that may damage the system. This behavior is difficult to observe even in simulation. In this paper, we propose an interval approach to characterize the sliding surface, which corresponds to the set of all states such that the state trajectory may jump indefinitely between two distinct behaviors. We show that the recent notion of thick sets allows us to efficiently compute an outer approximation of the sliding surface of a given class of hybrid systems, taking into account all set-membership uncertainties. An application to the verification of the controller of a child swing is considered to illustrate the principle of the approach.

[43]  arXiv:2101.05419 [pdf, other]
Title: DAIL: Dataset-Aware and Invariant Learning for Face Recognition
Subjects: Computer Vision and Pattern Recognition (cs.CV)

To achieve good performance in face recognition, a large scale training dataset is usually required. A simple yet effective way to improve recognition performance is to use a dataset as large as possible by combining multiple datasets in the training. However, it is problematic and troublesome to naively combine different datasets due to two major issues. First, the same person can possibly appear in different datasets, leading to an identity overlapping issue between different datasets. Naively treating the same person as different classes in different datasets during training will affect back-propagation and generate non-representative embeddings. On the other hand, manually cleaning labels may take formidable human efforts, especially when there are millions of images and thousands of identities. Second, different datasets are collected in different situations and thus will lead to different domain distributions. Naively combining datasets will make it difficult to learn domain invariant embeddings across different datasets. In this paper, we propose DAIL: Dataset-Aware and Invariant Learning to resolve the above-mentioned issues. To solve the first issue of identity overlapping, we propose a dataset-aware loss for multi-dataset training by reducing the penalty when the same person appears in multiple datasets. This can be readily achieved with a modified softmax loss with a dataset-aware term. To solve the second issue, domain adaptation with gradient reversal layers is employed for dataset invariant learning. The proposed approach not only achieves state-of-the-art results on several commonly used face recognition validation sets, including LFW, CFP-FP, and AgeDB-30, but also shows great benefit for practical use.

[44]  arXiv:2101.05421 [pdf, other]
Title: Asynchronous Gathering in a Torus
Comments: 41 pages
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

We consider the gathering problem for asynchronous and oblivious robots that cannot communicate explicitly with each other, but are endowed with visibility sensors that allow them to see the positions of the other robots. Most of the investigations on the gathering problem on the discrete universe are done on ring shaped networks due to the number of symmetric configurations. We extend in this paper the study of the gathering problem on torus shaped networks assuming robots endowed with local weak multiplicity detection. That is, robots cannot distinguish nodes occupied by only one robot from those occupied by more than one robot unless it is their current node. As a consequence, solutions based on creating a single multiplicity node as a landmark for the gathering cannot be used. We present in this paper a deterministic algorithm that solves the gathering problem starting from any rigid configuration on an asymmetric unoriented torus shaped network.

[45]  arXiv:2101.05425 [pdf]
Title: A Perspective-Based Understanding of Project Success
Comments: Journal, 18 pages, 2 figures, 3 tables
Journal-ref: Project Management Journal 43 (5) (2012), pp. 68-86
Subjects: Software Engineering (cs.SE)

Answering the call for alternative approaches to researching project management, we explore the evaluation of project success from a subjectivist perspective. An in-depth, longitudinal case study of information systems development in a large manufacturing company was used to investigate how various project stakeholders subjectively perceived the project outcome and what evaluation criteria they drew on in doing so. A conceptual framework is developed for understanding and analyzing evaluations of project success, both formal and informal. The framework highlights how different stakeholder perspectives influence the perceived outcome(s) of a project, and how project evaluations may differ between stakeholders and across time.

[46]  arXiv:2101.05426 [pdf]
Title: Evaluating prediction systems in software project estimation
Comments: Journal, 10 pages, 3 figures, 6 tables
Journal-ref: Information and Software Technology 54(8) (2012), pp.820-827
Subjects: Software Engineering (cs.SE)

Context: Software engineering has a problem in that when we empirically evaluate competing prediction systems we obtain conflicting results. Objective: To reduce the inconsistency amongst validation study results and provide a more formal foundation to interpret results with a particular focus on continuous prediction systems. Method: A new framework is proposed for evaluating competing prediction systems based upon (1) an unbiased statistic, Standardised Accuracy, (2) testing the result likelihood relative to the baseline technique of random 'predictions', that is guessing, and (3) calculation of effect sizes. Results: Previously published empirical evaluations of prediction systems are re-examined and the original conclusions shown to be unsafe. Additionally, even the strongest results are shown to have no more than a medium effect size relative to random guessing. Conclusions: Biased accuracy statistics such as MMRE are deprecated. By contrast this new empirical validation framework leads to meaningful results. Such steps will assist in performing future meta-analyses and in providing more robust and usable recommendations to practitioners.

[47]  arXiv:2101.05428 [pdf, other]
Title: Federated Learning: Opportunities and Challenges
Subjects: Machine Learning (cs.LG); Distributed, Parallel, and Cluster Computing (cs.DC)

Federated Learning (FL) is a concept first introduced by Google in 2016, in which multiple devices collaboratively learn a machine learning model without sharing their private data under the supervision of a central server. This offers ample opportunities in critical domains such as healthcare, finance, etc., where it is risky to share private user information with other organisations or devices. While FL appears to be a promising Machine Learning (ML) technique to keep the local data private, it is also vulnerable to attacks like other ML models. Given the growing interest in the FL domain, this report discusses the opportunities and challenges in federated learning.
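The collaborative setup described above can be illustrated with a toy round of federated averaging, one common aggregation rule. The sketch assumes a one-parameter linear model, plain gradient descent on each client, and a server that averages the returned weights in proportion to client dataset size. The model, the data, and all names are illustrative assumptions and are not taken from the report.

```python
def local_update(w, data, lr=0.1, epochs=5):
    """Gradient descent on mean squared error for y = w * x, using only this client's data."""
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_round(w_global, clients):
    """Each client trains locally; the server only ever sees the resulting weights."""
    total = sum(len(data) for data in clients)
    return sum(len(data) / total * local_update(w_global, data) for data in clients)


# Made-up private datasets, all generated from y = 2 * x.
clients = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(3.0, 6.0)],
    [(0.5, 1.0), (1.5, 3.0), (2.5, 5.0)],
]

w = 0.0
for _ in range(20):
    w = federated_round(w, clients)
```

The raw `(x, y)` pairs never leave `local_update`; only the scalar weight is sent back. As the abstract notes, this alone does not make the scheme immune to attacks.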

[48]  arXiv:2101.05435 [pdf, ps, other]
Title: A Critical Look at Coulomb Counting Towards Improving the Kalman Filter Based State of Charge Tracking Algorithms in Rechargeable Batteries
Subjects: Systems and Control (eess.SY)

In this paper, we consider the problem of state of charge estimation for rechargeable batteries. Coulomb counting is one of the traditional approaches to state of charge estimation and it is considered reliable as long as the battery capacity and initial state of charge are known. However, the Coulomb counting method is susceptible to errors from several sources and the extent of these errors is not studied in the literature. In this paper, we formally derive and quantify the state of charge estimation error during Coulomb counting due to the following four types of error sources: (i) current measurement error; (ii) current integration approximation error; (iii) battery capacity uncertainty; and (iv) the timing oscillator error/drift. It is shown that the resulting state of charge error can either be of the time-cumulative or of the state-of-charge-proportional type. Time-cumulative errors increase with time and have the potential to completely invalidate the state of charge estimation in the long run. State-of-charge-proportional errors increase with the accumulated state of charge and reach their worst value within one charge/discharge cycle. Simulation analyses are presented to demonstrate the extent of these errors under several realistic scenarios and the paper discusses approaches to reduce the time-cumulative and state-of-charge-proportional errors.
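A toy simulation can make the two error behaviours concrete. The sketch assumes rectangular-rule integration, a sign convention in which positive current charges the battery, and made-up numbers (a 2 Ah cell, a 10 mA sensor bias, a 10% capacity overestimate). It does not reproduce the paper's derivations, and the abstract does not state which error source falls into which category.

```python
def coulomb_count(currents_a, dt_s, capacity_ah, soc0=0.5):
    """Rectangular-rule Coulomb counting; positive current charges the battery."""
    soc = soc0
    trace = [soc]
    for current in currents_a:
        soc += current * dt_s / (3600.0 * capacity_ah)
        trace.append(soc)
    return trace


# One cycle: 30 min charging at 1 A, then 30 min discharging at 1 A, sampled at 1 s.
profile = [1.0] * 1800 + [-1.0] * 1800

true_soc = coulomb_count(profile, 1.0, 2.0)
biased = coulomb_count([i + 0.01 for i in profile], 1.0, 2.0)  # constant sensor bias
wrong_cap = coulomb_count(profile, 1.0, 2.2)                   # capacity overestimated

bias_err = [b - t for b, t in zip(biased, true_soc)]
cap_err = [abs(c - t) for c, t in zip(wrong_cap, true_soc)]
```

In this setup `bias_err` keeps growing with elapsed time regardless of the current direction, whereas `cap_err` peaks at the end of the charging phase and returns to zero once the accumulated charge is back at its starting value.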

[49]  arXiv:2101.05436 [pdf, other]
Title: Learning Safe Multi-Agent Control with Decentralized Neural Barrier Certificates
Comments: Published at ICLR 2021
Subjects: Multiagent Systems (cs.MA); Artificial Intelligence (cs.AI); Systems and Control (eess.SY)

We study the multi-agent safe control problem where agents should avoid collisions with static obstacles and with each other while reaching their goals. Our core idea is to learn the multi-agent control policy jointly with the control barrier functions that serve as safety certificates. We propose a novel joint-learning framework that can be implemented in a decentralized fashion, with generalization guarantees for certain function classes. Such a decentralized framework can adapt to an arbitrarily large number of agents. Building upon this framework, we further improve the scalability by incorporating neural network architectures that are invariant to the quantity and permutation of neighboring agents. In addition, we propose a new spontaneous policy refinement method to further enforce the certificate condition during testing. We provide extensive experiments to demonstrate that our method significantly outperforms other leading multi-agent control approaches in terms of maintaining safety and completing original tasks. Our approach also shows exceptional generalization capability in that the control policy can be trained with 8 agents in one scenario, while being used on other scenarios with up to 1024 agents in complex multi-agent environments and dynamics.

[50]  arXiv:2101.05438 [pdf, ps, other]
Title: A General Method for Generating Discrete Orthogonal Matrices
Subjects: Discrete Mathematics (cs.DM)

Discrete orthogonal matrices have several applications, such as in coding and cryptography. It is often challenging to generate discrete orthogonal matrices. A widely used approach is to discretize continuous orthogonal functions that have been discovered. The need for certain continuous functions is restrictive. To simplify the process while improving the flexibility, we present a general method to generate orthogonal matrices directly through the construction of certain even and odd polynomials from a set of distinct positive values, bypassing the need for continuous orthogonal functions. We provide a constructive proof by induction that not only asserts the existence of such polynomials, but also tells how to iteratively construct them. Besides deriving the method, which is as simple as a few nested loops, we discuss two well-known discrete transforms, the Discrete Cosine Transform and the Discrete Tchebichef Transform, and how they can be achieved using our method with specific values. We also show some examples of how to generate new orthogonal matrices from arbitrarily chosen values.

[51]  arXiv:2101.05443 [pdf, other]
Title: Unsupervised heart abnormality detection based on phonocardiogram analysis with Beta Variational Auto-Encoders
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)

Heart sound (also known as phonocardiogram, PCG) analysis is a popular way to detect cardiovascular diseases (CVDs). Most PCG analysis is supervised, which demands both normal and abnormal samples. This paper proposes a method of unsupervised PCG analysis that uses a beta variational auto-encoder (β-VAE) to model normal PCG signals. The best-performing model reaches an AUC (Area Under Curve) value of 0.91 in the ROC (Receiver Operating Characteristic) test for PCG signals collected from the same source. Unlike the majority of β-VAEs, which are used as generative models, the best-performing β-VAE has a β value smaller than 1. Further experiments then find that introducing a lightly weighted KL divergence between the latent-space distribution and the normal distribution improves the performance of anomalous PCG detection based on anomaly scores derived from the reconstruction loss. This suggests that anomaly scores based on reconstruction loss may be better than anomaly scores based on the latent vectors of samples.

[52]  arXiv:2101.05444 [pdf]
Title: Application of Failure Modes and Effects Analysis in the Engineering Design Process
Subjects: Software Engineering (cs.SE)

Failure modes and effects analysis (FMEA) is one of the most practical design tools implemented in the product design to analyze the possible failures and to improve the design. The use of FMEA is diversified, and different approaches are proposed by various organizations and researchers from one application to another. The question is how to use the features of FMEA along with the design process. This research focuses on different types of FMEA in the design process, which is considered as the mapping between customer requirements, design components, and product functions. These three elements of design are the foundation of the integration model proposed in this research. The objective of this research is to understand an integrated approach of FMEA in the design process. Significantly, an integration framework is developed to integrate the design process and FMEA. Then, a step-by-step FMEA-facilitated design process is proposed to apply FMEA along with the design process.

[53]  arXiv:2101.05450 [pdf, other]
Title: Data Engagement Reconsidered: A Study of Automatic Stress Tracking Technology in Use
Comments: 13 pages, 2 figures, 1 table, Accepted at ACM 2021 CHI Conference on Human Factors in Computing Systems (CHI 2021)
Subjects: Human-Computer Interaction (cs.HC)

In today's fast-paced world, stress has become a growing health concern. While more automatic stress tracking technologies have recently become available on wearable or mobile devices, there is still a limited understanding of how they are actually used in everyday life. This paper presents an empirical study of automatic stress-tracking technologies in use in China, based on semi-structured interviews with 17 users. The study highlights three challenges of stress-tracking data engagement that prevent effective technology usage: the lack of immediate awareness, the lack of pre-required knowledge, and the lack of corresponding communal support. Drawing on the stress-tracking practices uncovered in the study, we bring these issues to the fore, and unpack assumptions embedded in related works on self-tracking and how data engagement is approached. We end by calling for a reconsideration of data engagement as part of self-tracking practices with technologies rather than simply looking at the user interface.

[54]  arXiv:2101.05451 [pdf]
Title: Finding faults: A scoping study of fault diagnostics for Industrial Cyber-Physical Systems
Comments: Journal, 19 pages, 7 figures, 6 tables
Journal-ref: 10.1016/j.jss.2020.110638
Subjects: Software Engineering (cs.SE)

Context: As Industrial Cyber-Physical Systems (ICPS) become more connected and widely-distributed, often operating in safety-critical environments, we require innovative approaches to detect and diagnose the faults that occur in them. Objective: We profile fault identification and diagnosis techniques employed in the aerospace, automotive, and industrial control domains. By examining both theoretical presentations as well as case studies from production environments, we present a profile of the current approaches being employed and identify gaps. Methodology: A scoping study was used to identify and compare fault detection and diagnosis methodologies that are presented in the current literature. Results: Fault identification and analysis studies from 127 papers published from 2004 to 2019 reveal a wide diversity of promising techniques, both emerging and in-use. These range from traditional Physics-based Models to Data-Driven Artificial Intelligence (AI) and Knowledge-Based approaches. Predictive diagnostics or prognostics featured prominently across all sectors, along with discussions of techniques including Fault trees, Petri nets and Markov approaches. We also profile some of the techniques that have reached the highest Technology Readiness Levels, showing how those methods are being applied in real-world environments beyond the laboratory. Conclusions: Our results suggest that the continuing wide use of both Model-Based and Data-Driven AI techniques across all domains, especially when they are used together in hybrid configuration, reflects the complexity of the current ICPS application space. While creating sufficiently-complete models is labor intensive, Model-free AI techniques were evidenced as a viable way of addressing aspects of this challenge, demonstrating the increasing sophistication of current machine learning systems.(Abridged)

[55]  arXiv:2101.05452 [pdf, other]
Title: Interpreting and Predicting Tactile Signals for the SynTouch BioTac
Comments: Submitted to International Journal of Robotics Research (IJRR)
Subjects: Robotics (cs.RO)

In the human hand, high-density contact information provided by afferent neurons is essential for many human grasping and manipulation capabilities. In contrast, robotic tactile sensors, including the state-of-the-art SynTouch BioTac, are typically used to provide low-density contact information, such as contact location, center of pressure, and net force. Although useful, these data do not convey or leverage the rich information content that some tactile sensors naturally measure. This research extends robotic tactile sensing beyond reduced-order models through 1) the automated creation of a precise experimental tactile dataset for the BioTac over a diverse range of physical interactions, 2) a 3D finite element (FE) model of the BioTac, which complements the experimental dataset with high-density, distributed contact data, 3) neural-network-based mappings from raw BioTac signals to not only low-dimensional experimental data, but also high-density FE deformation fields, and 4) mappings from the FE deformation fields to the raw signals themselves. The high-density data streams can provide a far greater quantity of interpretable information for grasping and manipulation algorithms than previously accessible.

[56]  arXiv:2101.05453 [pdf, ps, other]
Title: On the quantization of recurrent neural networks
Subjects: Machine Learning (cs.LG)

Integer quantization of neural networks can be defined as the approximation of the high precision computation of the canonical neural network formulation, using reduced integer precision. It plays a significant role in the efficient deployment and execution of machine learning (ML) systems, reducing memory consumption and leveraging typically faster computations. In this work, we present an integer-only quantization strategy for Long Short-Term Memory (LSTM) neural network topologies, which themselves are the foundation of many production ML systems. Our quantization strategy is accurate (e.g., works well with quantization post-training), efficient and fast to execute (utilizing 8-bit integer weights and mostly 8-bit activations), and is able to target a variety of hardware (by leveraging instruction sets available in common CPU architectures, as well as available neural accelerators).
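
To make the idea of reduced integer precision concrete, here is a minimal sketch of generic per-tensor affine quantization to signed 8-bit integers (a scale plus a zero point). It is a textbook scheme used for illustration only, not the LSTM-specific strategy of the paper; the function names and sample weights are assumptions.

```python
def quantize_int8(values):
    """Map floats to integers in [-128, 127] using a scale and a zero point."""
    lo = min(min(values), 0.0)  # widen the range so that 0.0 is representable
    hi = max(max(values), 0.0)
    scale = (hi - lo) / 255.0
    if scale == 0.0:            # all-zero input: any positive scale works
        scale = 1.0
    zero_point = round(-128 - lo / scale)
    q = [max(-128, min(127, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point


def dequantize(q, scale, zero_point):
    """Recover approximate floats from the integer representation."""
    return [(x - zero_point) * scale for x in q]


weights = [-0.8, -0.1, 0.0, 0.3, 1.2]  # hypothetical weight values
q, scale, zero_point = quantize_int8(weights)
restored = dequantize(q, scale, zero_point)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, zero_point, max_err)
```

For values inside the calibrated range, the round-trip error stays within about half a quantization step, which is the accuracy cost paid for the smaller memory footprint.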

[57]  arXiv:2101.05456 [pdf, other]
Title: Self-Supervised Learning for Segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Self-supervised learning is emerging as an effective substitute for transfer learning from large datasets. In this work, we use kidney segmentation to explore this idea. The anatomical asymmetry of kidneys is leveraged to define an effective proxy task for kidney segmentation via self-supervised learning. A siamese convolutional neural network (CNN) is used to classify a given pair of kidney sections from CT volumes as being kidneys of the same or different sides. This knowledge is then transferred to the segmentation of kidneys by another deep CNN that uses one branch of the siamese CNN as the encoder of the segmentation network. Evaluation results on a publicly available dataset containing computed tomography (CT) scans of the abdominal region show that a boost in performance and faster convergence can be achieved relative to a network trained conventionally from scratch. This is notable given that no additional data, expensive annotations, or augmentation were used in training.

[58]  arXiv:2101.05457 [pdf, other]
Title: A Multiple Classifier Approach for Concatenate-Designed Neural Networks
Subjects: Neural and Evolutionary Computing (cs.NE); Machine Learning (cs.LG)

This article introduces a multiple classifier method to improve the performance of concatenate-designed neural networks, such as ResNet and DenseNet, with the aim of alleviating the pressure on the final classifier. We give the design of the classifiers, which collect the features produced between the network sets, and present the constituent layers and the activation function of the classifiers, which are used to calculate the classification score of each classifier. We use the L2 normalization method to obtain the classifier score instead of the Softmax normalization. We also determine the conditions that can enhance convergence. As a result, the proposed classifiers significantly improve the accuracy in the experimental cases, and the method not only performs better than the original models but also converges faster. Moreover, our classifiers are general and can be applied to all classification-related concatenate-designed network models.
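
The difference between the two normalizations mentioned above can be sketched as follows. This is only a generic comparison of Softmax and L2 normalization applied to a vector of raw classifier outputs; how the paper applies L2 normalization inside its classifiers is not specified in the abstract, and the function names and sample values are assumptions.

```python
import math


def softmax_scores(logits):
    """Softmax: exponentiate and divide by the sum, so scores add up to 1."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def l2_scores(logits):
    """L2 normalization: divide by the Euclidean norm, so squared scores add up to 1."""
    norm = math.sqrt(sum(z * z for z in logits))
    if norm == 0.0:
        return [0.0 for _ in logits]
    return [z / norm for z in logits]


logits = [3.0, 4.0, 0.0]  # hypothetical raw outputs for three classes
soft = softmax_scores(logits)
l2 = l2_scores(logits)
print(soft)
print(l2)
```

Both normalizations preserve the ranking of the classes here; L2 normalization keeps the scores linear in the raw outputs, whereas Softmax exaggerates the gap between the largest output and the rest.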

[59]  arXiv:2101.05462 [pdf, other]
Title: Leader Confirmation Replication for Millisecond Consensus in Geo-distributed Private Chains
Comments: Submitted to ICDCS 2021
Subjects: Networking and Internet Architecture (cs.NI); Distributed, Parallel, and Cluster Computing (cs.DC)

Geo-distributed private chains and databases have created higher performance requirements for consistency models. However, with millisecond network latency between nodes, the widely used leader-based SMR models cause frequent retransmission of logs because they cannot learn the log replication status in time, which results in high network and computing resource costs for the leader. To address the problem, we propose a Leader Confirmation based Replication (LCR) model. First, we demonstrate the efficacy of the approach by designing the Future Log Replication model, a log in which the follower is responsible for non-transactional log replication. It reduces the leader's network load using the signal log. Second, we design a Generation Re-replication strategy, which can ensure the security and consistency of future logs when the number of nodes changes. Finally, we implement LCR-Raft and design experiments. The results show that in single-ms network latency environments, LCR-Raft can provide higher TPS (1.5X~1.9X) and reduce the network traffic of the leader by 20%-30% with acceptable network traffic and CPU cost on followers. Besides, since LCR does not change the number of leaders or the leader election process, it has good portability.

[60]  arXiv:2101.05465 [pdf, ps, other]
Title: Noise Is Useful: Exploiting Data Diversity for Edge Intelligence
Comments: 5 pages, 6 figures, to be presented at IEEE Communications Letters
Subjects: Information Theory (cs.IT)

Edge intelligence requires fast access to distributed data samples generated by edge devices. The challenge is to use limited radio resources to acquire massive data samples for training machine learning models at the edge server. In this article, we propose a new communication-efficient edge intelligence scheme in which the most useful data samples are selected to train the model. Here the usefulness or value of data samples is measured by data diversity, which is defined as the difference between data samples. We derive a closed-form expression of data diversity that combines data informativeness and channel quality. Then a joint data-and-channel diversity aware multiuser scheduling algorithm is proposed. We find that noise is useful for enhancing data diversity under some conditions.

[61]  arXiv:2101.05467 [pdf, other]
Title: Tackling Instance-Dependent Label Noise via a Universal Probabilistic Model
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)

The drastic increase in data quantity often brings a severe decrease in data quality, such as incorrect label annotations, which poses a great challenge for robustly training Deep Neural Networks (DNNs). Existing methods for learning with label noise either employ ad-hoc heuristics or are restricted to specific noise assumptions. However, more general situations, such as instance-dependent label noise, have not been fully explored, as few studies focus on their label corruption process. By categorizing instances into confusing and unconfusing instances, this paper proposes a simple yet universal probabilistic model, which explicitly relates noisy labels to their instances. The resultant model can be realized by DNNs, where the training procedure is accomplished by employing an alternating optimization algorithm. Experiments on datasets with both synthetic and real-world label noise verify that the proposed method yields significant improvements in robustness over state-of-the-art counterparts.

[62]  arXiv:2101.05469 [pdf, other]
Title: Text Augmentation in a Multi-Task View
Comments: Accepted to EACL 2021
Subjects: Computation and Language (cs.CL)

Traditional data augmentation aims to increase the coverage of the input distribution by generating augmented examples that strongly resemble original samples in an online fashion where augmented examples dominate training. In this paper, we propose an alternative perspective -- a multi-task view (MTV) of data augmentation -- in which the primary task trains on original examples and the auxiliary task trains on augmented examples. In MTV data augmentation, both original and augmented samples are weighted substantively during training, relaxing the constraint that augmented examples must resemble original data and thereby allowing us to apply stronger levels of augmentation. In empirical experiments using four common data augmentation techniques on three benchmark text classification datasets, we find that the MTV leads to higher and more robust performance improvements than traditional augmentation.

[63]  arXiv:2101.05470 [pdf, other]
Title: OrigamiSet1.0: Two New Datasets for Origami Classification and Difficulty Estimation
Comments: In Proceedings of Origami Science Maths Education, 7OSME, Oxford UK (2018)
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Origami is becoming more and more relevant to research. However, there is no public dataset yet available and there hasn't been any research on this topic in machine learning. We constructed an origami dataset using images from the multimedia commons and other databases. It consists of two subsets: one for classification of origami images and the other for difficulty estimation. We obtained 16000 images for classification (half origami, half other objects) and 1509 for difficulty estimation with 3 different categories (easy: 764, intermediate: 427, complex: 318). The data can be downloaded at: https://github.com/multimedia-berkeley/OriSet. Finally, we provide machine learning baselines.

[64]  arXiv:2101.05471 [pdf, other]
Title: Towards Practical Adam: Non-Convexity, Convergence Theory, and Mini-Batch Acceleration
Comments: 44 Pages. arXiv admin note: substantial text overlap with arXiv:1811.09358
Subjects: Machine Learning (cs.LG); Distributed, Parallel, and Cluster Computing (cs.DC); Optimization and Control (math.OC)

Adam is one of the most influential adaptive stochastic algorithms for training deep neural networks, yet it has been shown to diverge even in the simple convex setting via a few simple counterexamples. Many attempts, such as decreasing an adaptive learning rate, adopting a big batch size, incorporating a temporal decorrelation technique, seeking an analogous surrogate, etc., have been made to make Adam-type algorithms converge. In contrast with existing approaches, we introduce an alternative easy-to-check sufficient condition, which merely depends on the parameters of the base learning rate and combinations of historical second-order moments, to guarantee the global convergence of generic Adam for solving large-scale non-convex stochastic optimization. This observation coupled with this sufficient condition gives much deeper interpretations of the divergence of Adam. On the other hand, in practice, mini-Adam and distributed-Adam are widely used without theoretical guarantees. We therefore further analyze how the batch size or the number of nodes in the distributed system affects the convergence of Adam, which theoretically shows that mini-batch and distributed Adam can be linearly accelerated by using a larger mini-batch size or a larger number of nodes. Finally, we apply generic Adam and mini-batch Adam with a sufficient condition to solving the counterexample and training several different neural networks on various real-world datasets. Experimental results are exactly in accord with our theoretical analysis.

[65]  arXiv:2101.05473 [pdf, ps, other]
Title: Time-critical testing and search problems
Subjects: Discrete Mathematics (cs.DM)

This paper introduces a problem in which the state of a system needs to be determined through costly tests of its components by a limited number of testing units and before a given deadline. We also consider a closely related search problem in which there are multiple searchers to find a target before a given deadline. These natural generalizations of the classical sequential testing problem and search problem are applicable in a wide range of time-critical operations such as machine maintenance, diagnosing a patient, and new product development. We show that both problems are NP-hard, develop a pseudo-polynomial dynamic program for the special case of two time slots, and describe a partial-order-based as well as an assignment-based mixed integer program for the general case. Based on extensive computational experiments, we find that the assignment-based formulation performs better than the partial-order-based formulation for the testing variant, but that this is the other way round for the search variant. Finally, we propose a pairwise-interchange-based local search procedure and show that, empirically, it performs very well in finding near-optimal solutions.

[66]  arXiv:2101.05475 [pdf, other]
Title: EDSC: An Event-Driven Smart Contract Platform
Comments: 11 pages
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

This paper presents EDSC, a novel smart contract platform design based on the event-driven execution model as opposed to the traditionally employed transaction-driven execution model. We reason that such a design is a better fit for many emerging smart contract applications and is better positioned to address the scalability and performance challenges plaguing the smart contract ecosystem. We propose EDSC's design under the Ethereum framework, and the design can be easily adapted for other existing smart contract platforms. We implemented the design using an Ethereum client and ran experiments; performance modeling results show that the total latency of event-triggered smart contracts is reduced by a factor of 2.2 to 4.6 on average, which demonstrates its effectiveness for supporting contracts that demand timely execution based on events. In addition, we discuss example use cases to demonstrate the design's utility and comment on its potential security dynamics.

[67]  arXiv:2101.05478 [pdf, other]
Title: WER-BERT: Automatic WER Estimation with BERT in a Balanced Ordinal Classification Paradigm
Comments: Accepted Long Paper at EACL 2021
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)

Audio Speech Recognition (ASR) systems are evaluated using Word Error Rate (WER) which is calculated by comparing the number of errors between the ground truth and the ASR system's transcription. This calculation, however, requires manual transcription of the speech signal to obtain the ground truth. Since transcribing audio signals is a costly process, Automatic WER Evaluation (e-WER) methods have been developed which attempt to predict the WER of a Speech system by only relying on the transcription and the speech signal features. While WER is a continuous variable, previous works have shown that positing e-WER as a classification problem is more effective than regression. However, while converting to a classification setting, these approaches suffer from heavy class imbalance. In this paper, we propose a new balanced paradigm for e-WER in a classification setting. Within this paradigm, we also propose WER-BERT, a BERT based architecture with speech features for e-WER. Furthermore, we introduce a distance loss function to tackle the ordinal nature of e-WER classification. The proposed approach and paradigm are evaluated on the Librispeech dataset and a commercial (black box) ASR system, Google Cloud's Speech-to-Text API. The results and experiments demonstrate that WER-BERT establishes a new state-of-the-art in automatic WER estimation.
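
For reference, the conventional WER that such estimators try to predict can be computed from a ground-truth transcript as the word-level edit distance divided by the number of reference words. The sketch below shows that standard calculation only; it is not the WER-BERT model, and the function name and example sentences are assumptions.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cur.append(min(
                prev[j] + 1,             # deletion of a reference word
                cur[j - 1] + 1,          # insertion of a hypothesis word
                prev[j - 1] + (r != h),  # substitution, or match at no cost
            ))
        prev = cur
    return prev[-1] / len(ref)


# Hypothetical example: one substitution ("sat" -> "sit") and one deleted "the".
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
print(f"WER = {wer:.3f}")
```

Because this calculation needs the reference transcript, it cannot be run on unlabeled audio, which is the gap that automatic WER estimation aims to fill.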

[68]  arXiv:2101.05479 [pdf, other]
Title: Understanding the Role of Scene Graphs in Visual Question Answering
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

Visual Question Answering (VQA) is of tremendous interest to the research community with important applications such as aiding visually impaired users and image-based search. In this work, we explore the use of scene graphs for solving the VQA task. We conduct experiments on the GQA dataset which presents a challenging set of questions requiring counting, compositionality and advanced reasoning capability, and provides scene graphs for a large number of images. We adopt image + question architectures for use with scene graphs, evaluate various scene graph generation techniques for unseen images, propose a training curriculum to leverage human-annotated and auto-generated scene graphs, and build late fusion architectures to learn from multiple image representations. We present a multi-faceted study into the use of scene graphs for VQA, making this work the first of its kind.

[69]  arXiv:2101.05482 [pdf, ps, other]
Title: Iterative regularization for constrained minimization formulations of nonlinear inverse problems
Subjects: Numerical Analysis (math.NA); Optimization and Control (math.OC)

In this paper we study the formulation of inverse problems as constrained minimization problems and their iterative solution by gradient or Newton type methods. We carry out a convergence analysis in the sense of regularization methods and discuss applicability to the problem of identifying the spatially varying diffusivity in an elliptic PDE from different sets of observations. Among these is a novel hybrid imaging technology known as impedance acoustic tomography, for which we provide numerical experiments.

[70]  arXiv:2101.05484 [pdf]
Title: 4D Attention-based Neural Network for EEG Emotion Recognition
Subjects: Machine Learning (cs.LG)

Electroencephalograph (EEG) emotion recognition is a significant task in the brain-computer interface field. Although many deep learning methods have been proposed recently, it is still challenging to make full use of the information contained in different domains of EEG signals. In this paper, we present a novel method, called four-dimensional attention-based neural network (4D-aNN), for EEG emotion recognition. First, raw EEG signals are transformed into 4D spatial-spectral-temporal representations. Then, the proposed 4D-aNN adopts spectral and spatial attention mechanisms to adaptively assign the weights of different brain regions and frequency bands, and a convolutional neural network (CNN) is utilized to deal with the spectral and spatial information of the 4D representations. Moreover, a temporal attention mechanism is integrated into a bidirectional Long Short-Term Memory (LSTM) to explore temporal dependencies of the 4D representations. Our model achieves state-of-the-art performance on the SEED dataset under intra-subject splitting. The experimental results show the effectiveness of the attention mechanisms in different domains for EEG emotion recognition.

[71]  arXiv:2101.05486 [pdf, other]
Title: Label Contrastive Coding based Graph Neural Network for Graph Classification
Comments: Accepted by DASFAA'21
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)

Graph classification is a critical research problem in many applications from different domains. In order to learn a graph classification model, the most widely used supervision component is an output layer together with classification loss (e.g., cross-entropy loss together with softmax or margin loss). In fact, the discriminative information among instances is more fine-grained, which can benefit graph classification tasks. In this paper, we propose the novel Label Contrastive Coding based Graph Neural Network (LCGNN) to utilize label information more effectively and comprehensively. LCGNN still uses the classification loss to ensure the discriminability of classes. Meanwhile, LCGNN leverages the proposed Label Contrastive Loss derived from self-supervised learning to encourage instance-level intra-class compactness and inter-class separability. To power the contrastive learning, LCGNN introduces a dynamic label memory bank and a momentum updated encoder. Our extensive evaluations with eight benchmark graph datasets demonstrate that LCGNN can outperform state-of-the-art graph classification models. Experimental results also verify that LCGNN can achieve competitive performance with less training data because LCGNN exploits label information comprehensively.

[72]  arXiv:2101.05490 [pdf, other]
Title: Neural networks behave as hash encoders: An empirical study
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)

The input space of a neural network with ReLU-like activations is partitioned into multiple linear regions, each corresponding to a specific activation pattern of the included ReLU-like activations. We demonstrate that this partition exhibits the following encoding properties across a variety of deep learning models: (1) determinism: almost every linear region contains at most one training example. We can therefore represent almost every training example by a unique activation pattern, which is parameterized by a neural code; and (2) categorization: according to the neural code, simple algorithms, such as K-Means, K-NN, and logistic regression, can achieve fairly good performance on both training and test data. These encoding properties surprisingly suggest that normal neural networks well-trained for classification behave as hash encoders without any extra efforts. In addition, the encoding properties exhibit variability in different scenarios. Further experiments demonstrate that model size, training time, training sample size, regularization, and label noise contribute in shaping the encoding properties, while the impacts of the first three are dominant. We then define an activation hash phase chart to represent the space expanded by model size, training time, training sample size, and the encoding properties, which is divided into three canonical regions: under-expressive regime, critically-expressive regime, and sufficiently-expressive regime. The source code package is available at https://github.com/LeavesLei/activation-code.
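
The notion of an activation pattern can be illustrated with a toy example: for a given input, record which ReLU units are active (1) or inactive (0) and concatenate the bits. The sketch below uses a tiny two-layer network with hand-picked weights; the weights, inputs, and function name are assumptions for illustration and are unrelated to the models or the code package of the paper.

```python
def neural_code(x, layers):
    """Return the on/off pattern of every ReLU unit for input x as a bit string."""
    bits = []
    h = x
    for weights, biases in layers:
        # pre-activation of each unit: dot product with its weight row plus bias
        pre = [sum(w * v for w, v in zip(row, h)) + b
               for row, b in zip(weights, biases)]
        bits.extend("1" if p > 0 else "0" for p in pre)
        h = [max(0.0, p) for p in pre]  # ReLU output feeds the next layer
    return "".join(bits)


# Hypothetical network: 2 inputs -> 3 ReLU units -> 2 ReLU units.
layers = [
    ([[1.0, -1.0], [0.5, 0.5], [-1.0, 0.0]], [0.0, -1.0, 0.5]),
    ([[1.0, 1.0, -1.0], [-1.0, 0.5, 1.0]], [-0.5, 0.0]),
]

code_a = neural_code([2.0, 1.0], layers)
code_b = neural_code([2.1, 1.0], layers)   # a nearby input
code_c = neural_code([-1.0, 1.0], layers)  # an input far from the first two
print(code_a, code_b, code_c)
```

Inputs that fall in the same linear region share a code, while inputs in different regions get different codes; the determinism property in the abstract says that, in trained networks, almost every region holds at most one training example.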

[73]  arXiv:2101.05494 [pdf, ps, other]
Title: Hostility Detection in Hindi leveraging Pre-Trained Language Models
Subjects: Computation and Language (cs.CL)

Hostile content on social platforms is ever increasing. This has led to the need for proper detection of hostile posts so that appropriate action can be taken to tackle them. Though a lot of work has been done recently in the English language to solve the problem of hostile content online, similar works in Indian languages are quite hard to find. This paper presents a transfer learning based approach to classify social media (e.g., Twitter, Facebook) posts in Hindi Devanagari script as Hostile or Non-Hostile. Hostile posts are further analyzed to determine whether they are Hateful, Fake, Defamation, or Offensive. This paper harnesses attention-based pre-trained models fine-tuned on Hindi data, using the Hostile/Non-Hostile task as an auxiliary task and fusing its features for classification on the further sub-tasks. Through this approach, we establish a robust and consistent model without any ensembling or complex pre-processing. We have presented the results from our approach in the CONSTRAINT-2021 Shared Task on hostile post detection, where our model performs extremely well, placing as 3rd runner-up in terms of Weighted Fine-Grained F1 Score.

[74]  arXiv:2101.05495 [pdf, other]
Title: Selective Deletion in a Blockchain
Journal-ref: International Workshop on Blockchain and Mobile Applications (BlockApp 2020) during the International Conference on Distributed Computing Systems (ICDCS 2020)
Subjects: Cryptography and Security (cs.CR); Computers and Society (cs.CY); Systems and Control (eess.SY)

The constantly growing size of blockchains becomes a challenge as usage increases. The storage of unwanted data in a blockchain is a particular issue, because it cannot be removed naturally. In order to counteract this problem, we present the first concept for the selective deletion of single entries in a blockchain. For this purpose, the general consensus algorithm is extended by the functionality of regularly creating summary blocks. Previous data of the chain are summarized and stored again in a new block, leaving out unwanted information. With a shifting marker of the Genesis Block, data can be deleted from the beginning of a blockchain. In this way, the technology of the blockchain becomes fully transactional. The concept is independent of a specific block structure, network structure, or consensus algorithm. Moreover, this functionality can be adapted to current blockchains to solve multiple problems related to scalability. This approach enables the transfer of blockchain technology to further fields of application, among others in the area of Industry 4.0 and Product Life-cycle Management.
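The summary-block idea can be sketched with a toy hash chain in Python. Everything below is an assumption made for illustration (the block layout, the helper names `make_block`, `append`, `summarize`, `verify`, and the choice of SHA-256 over a JSON payload); it ignores consensus and networking entirely and is not the scheme from the paper.

```python
import hashlib
import json

def block_hash(index, prev_hash, entries):
    """Hash the block contents deterministically (toy encoding via JSON)."""
    payload = json.dumps(
        {"index": index, "prev": prev_hash, "entries": entries}, sort_keys=True
    )
    return hashlib.sha256(payload.encode()).hexdigest()

def make_block(index, prev_hash, entries):
    return {"index": index, "prev": prev_hash, "entries": entries,
            "hash": block_hash(index, prev_hash, entries)}

def append(chain, entries):
    """Add a regular block that links to the current chain tip."""
    prev = chain[-1]
    chain.append(make_block(prev["index"] + 1, prev["hash"], entries))

def summarize(chain, unwanted):
    """Re-store all earlier entries except the unwanted ones in a summary
    block, then treat that block as the new start of the chain (the shifted
    genesis marker). The old blocks are simply dropped in this sketch."""
    kept = [e for b in chain for e in b["entries"] if e not in unwanted]
    prev = chain[-1]
    summary = make_block(prev["index"] + 1, prev["hash"], kept)
    return [summary]

def verify(chain):
    """Check each block's own hash and the links between adjacent blocks."""
    for i, b in enumerate(chain):
        if b["hash"] != block_hash(b["index"], b["prev"], b["entries"]):
            return False
        if i > 0 and b["prev"] != chain[i - 1]["hash"]:
            return False
    return True

chain = [make_block(0, "0" * 64, ["genesis"])]
append(chain, ["order-1", "unwanted-blob"])
append(chain, ["order-2"])
pruned = summarize(chain, unwanted={"unwanted-blob"})
print(pruned[0]["entries"])
```

The summary block still records the hash of the last old block, so a node that kept the old chain could check that the summary was built on top of it.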

[75]  arXiv:2101.05499 [pdf, other]
Title: ECOL: Early Detection of COVID Lies Using Content, Prior Knowledge and Source Information
Comments: to be published in Constraint-2021 Workshop @ AAAI
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG)

Social media platforms are vulnerable to fake news dissemination, which causes negative consequences such as panic and wrong medication in the healthcare domain. Therefore, it is important to automatically detect fake news at an early stage, before it spreads widely. This paper analyzes the impact of incorporating content information, prior knowledge, and credibility of sources into models for the early detection of fake news. We propose a framework that models those features by using the BERT language model and external sources, namely Simple English Wikipedia and source reliability tags. The experiments conducted on CONSTRAINT datasets demonstrated the benefit of integrating these features for the early detection of fake news in the healthcare domain.

[76]  arXiv:2101.05500 [pdf, other]
Title: Joint Dimensionality Reduction for Separable Embedding Estimation
Subjects: Machine Learning (cs.LG)

Low-dimensional embeddings for data from disparate sources play critical roles in multi-modal machine learning, multimedia information retrieval, and bioinformatics. In this paper, we propose a supervised dimensionality reduction method that learns linear embeddings jointly for two feature vectors representing data of different modalities or data from distinct types of entities. We also propose an efficient feature selection method that complements, and can be applied prior to, our joint dimensionality reduction method. Assuming that there exist true linear embeddings for these features, our analysis of the error in the learned linear embeddings provides theoretical guarantees that the dimensionality reduction method accurately estimates the true embeddings when certain technical conditions are satisfied and the number of samples is sufficiently large. The derived sample complexity results are echoed by numerical experiments. We apply the proposed dimensionality reduction method to gene-disease association, and predict unknown associations using kernel regression on the dimension-reduced feature vectors. Our approach compares favorably against other dimensionality reduction methods, and against a state-of-the-art method of bilinear regression for predicting gene-disease associations.

[77]  arXiv:2101.05504 [pdf, other]
Title: Reliability Check via Weight Similarity in Privacy-Preserving Multi-Party Machine Learning
Subjects: Machine Learning (cs.LG)

Multi-party machine learning is a paradigm in which multiple participants collaboratively train a machine learning model to achieve a common learning objective without sharing their privately owned data. The paradigm has recently received a lot of attention from the research community aimed at addressing its associated privacy concerns. In this work, we focus on addressing the concerns of data privacy, model privacy, and data quality associated with privacy-preserving multi-party machine learning, i.e., we present a scheme for privacy-preserving collaborative learning that checks the participants' data quality while guaranteeing data and model privacy. In particular, we propose a novel metric called weight similarity that is securely computed and used to check whether a participant can be categorized as a reliable participant (holds good quality data) or not. The problems of model and data privacy are tackled by integrating homomorphic encryption in our scheme and uploading encrypted weights, which prevent leakages to the server and malicious participants, respectively. The analytical and experimental evaluations of our scheme demonstrate that it is accurate and ensures data and model privacy.

[78]  arXiv:2101.05507 [pdf, other]
Title: Evaluating the Robustness of Collaborative Agents
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Human-Computer Interaction (cs.HC); Multiagent Systems (cs.MA)

In order for agents trained by deep reinforcement learning to work alongside humans in realistic settings, we will need to ensure that the agents are \emph{robust}. Since the real world is very diverse, and human behavior often changes in response to agent deployment, the agent will likely encounter novel situations that have never been seen during training. This results in an evaluation challenge: if we cannot rely on the average training or validation reward as a metric, then how can we effectively evaluate robustness? We take inspiration from the practice of \emph{unit testing} in software engineering. Specifically, we suggest that when designing AI agents that collaborate with humans, designers should search for potential edge cases in \emph{possible partner behavior} and \emph{possible states encountered}, and write tests which check that the behavior of the agent in these edge cases is reasonable. We apply this methodology to build a suite of unit tests for the Overcooked-AI environment, and use this test suite to evaluate three proposals for improving robustness. We find that the test suite provides significant insight into the effects of these proposals that were generally not revealed by looking solely at the average validation reward.

[79]  arXiv:2101.05508 [pdf, other]
Title: Augmented Informative Cooperative Perception
Comments: Submitted to ICDCS'21
Subjects: Multimedia (cs.MM); Human-Computer Interaction (cs.HC)

Connected vehicles, whether equipped with advanced driver-assistance systems or fully autonomous, are currently constrained to visual information in their lines-of-sight. A cooperative perception system among vehicles increases their situational awareness by extending their perception ranges. Existing solutions imply significant network and computation load, as well as a high flow of not-always-relevant data received by vehicles. To address such issues, and thus account for the inherently diverse informativeness of the data, we present Augmented Informative Cooperative Perception (AICP) as the first fast-filtering system which optimizes the informativeness of shared data at vehicles. AICP displays the filtered data to the drivers on an augmented reality head-up display. To this end, an informativeness maximization problem is presented for vehicles to select a subset of data to display to their drivers. Specifically, we propose (i) a dedicated system design with a custom data structure and a light-weight routing protocol for convenient data encapsulation, fast interpretation and transmission, and (ii) a comprehensive problem formulation and an efficient fitness-based sorting algorithm to select the most valuable data to display at the application layer. We implement a proof-of-concept prototype of AICP with a bandwidth-hungry, latency-constrained real-life augmented reality application. The prototype realizes the informative-optimized cooperative perception with only 12.6 milliseconds of additional latency. Next, we test the networking performance of AICP at scale and show that AICP effectively filters out less relevant packets and decreases the channel busy time.

[80]  arXiv:2101.05509 [pdf, other]
Title: Transformer-based Language Model Fine-tuning Methods for COVID-19 Fake News Detection
Comments: 9 pages, 1 figure
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)

With the COVID-19 pandemic, related fake news is spreading widely across social media. Believing it indiscriminately can cause great harm to people's lives. However, universal language models may perform poorly at detecting this fake news because they lack large-scale annotated data and sufficient semantic understanding of domain-specific knowledge, while models trained on the corresponding corpora are also mediocre because of insufficient learning. In this paper, we propose a novel transformer-based language model fine-tuning approach for COVID-19 fake news detection. First, the token vocabulary of each individual model is expanded to capture the actual semantics of professional phrases. Second, we adapt the heated-up softmax loss to distinguish hard samples, which are common in fake news because of the ambiguity of short text. Then, we use adversarial training to improve the model's robustness. Last, the predicted features extracted by the universal language model RoBERTa and the domain-specific model CT-BERT are fused by a multi-layer perceptron to integrate fine-grained and high-level specific representations. Quantitative experimental results on an existing COVID-19 fake news dataset show superior performance compared to state-of-the-art methods across various evaluation metrics. Furthermore, the best weighted average F1 score reaches 99.02%.

[81]  arXiv:2101.05510 [pdf, other]
Title: Signal Processing on Higher-Order Networks: Livin' on the Edge ... and Beyond
Comments: 38 pages; 7 figures
Subjects: Social and Information Networks (cs.SI); Machine Learning (cs.LG); Physics and Society (physics.soc-ph); Machine Learning (stat.ML)

This tutorial paper presents a didactic treatment of the emerging topic of signal processing on higher-order networks. Drawing analogies from discrete and graph signal processing, we introduce the building blocks for processing data on simplicial complexes and hypergraphs, two common abstractions of higher-order networks that can incorporate polyadic relationships. We provide basic introductions to simplicial complexes and hypergraphs, with special emphasis on the concepts needed for processing signals on them. Leveraging these concepts, we discuss Fourier analysis, signal denoising, signal interpolation, node embeddings, and non-linear processing through neural networks in these two representations of polyadic relational structures. In the context of simplicial complexes, we specifically focus on signal processing using the Hodge Laplacian matrix, a multi-relational operator that leverages the special structure of simplicial complexes and generalizes desirable properties of the Laplacian matrix in graph signal processing. For hypergraphs, we present both matrix and tensor representations, and discuss the trade-offs in adopting one or the other. We also highlight limitations and potential research avenues, both to inform practitioners and to motivate the contribution of new researchers to the area.

[82]  arXiv:2101.05511 [pdf, other]
Title: Quantifying Blockchain Extractable Value: How dark is the forest?
Subjects: Cryptography and Security (cs.CR)

Permissionless blockchains such as Bitcoin have excelled at financial services. Yet, adversaries extract monetary value from the mesh of decentralized finance (DeFi) smart contracts. Some have characterized the Ethereum peer-to-peer network as a dark forest, wherein broadcast transactions represent prey, which are devoured by generalized trading bots.
While transaction (re)ordering and front-running are known to cause losses to users, we quantify how much value was sourced from blockchain extractable value (BEV). We systematize a transaction ordering taxonomy to quantify the USD extracted from sandwich attacks, liquidations, and decentralized exchange arbitrage. We estimate that over 2 years, those trading activities yielded 28.80M USD in profit, divided among 5,084 unique addresses. While arbitrage and liquidations might appear benign, traders can front-run others, causing financial losses to competitors.
To provide an example of a generalized trading bot, we show a simple yet effective automated transaction replay algorithm capable of replacing unconfirmed transactions without understanding the victim transactions' underlying logic. We estimate that our transaction replay algorithm could have yielded a profit of 51,688.33 ETH (17.60M USD) over 2 years on past blockchain data.
We also find that miners do not broadcast 1.64% of their mined transactions and instead choose to mine them privately. Privately mined transactions cannot be front-run by other traders or miners. We show that the largest Ethereum mining pool performs arbitrage and seemingly tries to cloak its private transaction mining activities. We therefore provide evidence that miners already extract Miner Extractable Value (MEV), which could destabilize the blockchain consensus security, as related work has shown.

[83]  arXiv:2101.05514 [pdf, other]
Title: Entangled Kernels -- Beyond Separability
Subjects: Machine Learning (cs.LG); Quantum Physics (quant-ph); Machine Learning (stat.ML)

We consider the problem of operator-valued kernel learning and investigate the possibility of going beyond the well-known separable kernels. Borrowing tools and concepts from the field of quantum computing, such as partial trace and entanglement, we propose a new view on operator-valued kernels and define a general family of kernels that encompasses previously known operator-valued kernels, including separable and transformable kernels. Within this framework, we introduce another novel class of operator-valued kernels called entangled kernels that are not separable. We propose an efficient two-step algorithm for this framework, where the entangled kernel is learned based on a novel extension of kernel alignment to operator-valued kernels. We illustrate our algorithm with an application to supervised dimensionality reduction, and demonstrate its effectiveness with both artificial and real data for multi-output regression.

[84]  arXiv:2101.05519 [pdf, other]
Title: BiGCN: A Bi-directional Low-Pass Filtering Graph Neural Network
Subjects: Machine Learning (cs.LG)

Graph convolutional networks have achieved great success on graph-structured data. Many graph convolutional networks can be regarded as low-pass filters for graph signals. In this paper, we propose a new model, BiGCN, which represents a graph neural network as a bi-directional low-pass filter. Specifically, we consider not only the original graph structure information but also the latent correlation between features, so BiGCN can filter the signals along both the original graph and a latent feature-connection graph. Our model outperforms previous graph neural networks in the tasks of node classification and link prediction on most of the benchmark datasets, especially when we add noise to the node features.

[85]  arXiv:2101.05536 [pdf, other]
Title: Scaling Equilibrium Propagation to Deep ConvNets by Drastically Reducing its Gradient Estimator Bias
Comments: NeurIPS 2020 Workshop : "Beyond Backpropagation Novel Ideas for Training Neural Architectures". arXiv admin note: substantial text overlap with arXiv:2006.03824
Subjects: Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

Equilibrium Propagation (EP) is a biologically-inspired counterpart of Backpropagation Through Time (BPTT) which, owing to its strong theoretical guarantees and the locality in space of its learning rule, fosters the design of energy-efficient hardware dedicated to learning. In practice, however, EP does not scale to visual tasks harder than MNIST. In this work, we show that a bias in the gradient estimate of EP, inherent in the use of finite nudging, is responsible for this phenomenon and that cancelling it allows training deep ConvNets by EP, including architectures with distinct forward and backward connections. These results highlight EP as a scalable approach to compute error gradients in deep neural networks, thereby motivating its hardware implementation.

[86]  arXiv:2101.05537 [pdf, other]
Title: Optimal Energy Shaping via Neural Approximators
Subjects: Systems and Control (eess.SY); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Dynamical Systems (math.DS)

We introduce optimal energy shaping as an enhancement of classical passivity-based control methods. A promising feature of passivity theory, alongside stability, has traditionally been claimed to be intuitive performance tuning along the execution of a given task. However, a systematic approach to adjust performance within a passive control framework has yet to be developed, as each method relies on few and problem-specific practical insights. Here, we cast the classic energy-shaping control design process in an optimal control framework; once a task-dependent performance metric is defined, an optimal solution is systematically obtained through an iterative procedure relying on neural networks and gradient-based optimization. The proposed method is validated on state-regulation tasks.

[87]  arXiv:2101.05538 [pdf, other]
Title: Cyber Taxi: A Taxonomy of Interactive Cyber Training and Education Systems
Journal-ref: Model-driven Simulation and Training Environments for Cybersecurity (MSTEC 2020)
Subjects: Cryptography and Security (cs.CR); Computers and Society (cs.CY); Systems and Control (eess.SY)

The lack of guided exercises and practical opportunities to learn about cybersecurity in a practical way makes it difficult for security experts to improve their proficiency. Capture the Flag events and Cyber Ranges are ideal for cybersecurity training. In these formats, the participants usually compete in teams against each other, or have to defend themselves in a specific scenario. As organizers of yearly events, we present a taxonomy for interactive cyber training and education. The proposed taxonomy includes different factors of the technical setup, audience, training environment, and training setup. Through this comprehensive taxonomy, different aspects of interactive training are considered. This can help training programs to improve and to be established successfully. The provided taxonomy is extensible and can be used in further application areas, such as research on new security technologies.

[88]  arXiv:2101.05543 [pdf, other]
Title: On the Synchronization Power of Token Smart Contracts
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Cryptography and Security (cs.CR)

Modern blockchains support a variety of distributed applications beyond cryptocurrencies, including smart contracts -- which let users execute arbitrary code in a distributed and decentralized fashion. Regardless of their intended application, blockchain platforms implicitly assume consensus for the correct execution of a smart contract, thus requiring that all transactions are totally ordered. It was only recently recognized that consensus is not necessary to prevent double-spending in a cryptocurrency (Guerraoui et al., PODC'19), contrary to common belief. This result suggests that current implementations may be sacrificing efficiency and scalability because they synchronize transactions much more tightly than actually needed. In this work, we study the synchronization requirements of Ethereum's ERC20 token contract, one of the most widely adopted smart contracts. Namely, we model a smart-contract token as a concurrent object and analyze its consensus number as a measure of synchronization power. We show that the richer set of methods supported by ERC20 tokens, compared to standard cryptocurrencies, results in strictly stronger synchronization requirements. More surprisingly, the synchronization power of ERC20 tokens depends on the object's state and can thus be modified by method invocations. To prove this result, we develop a dedicated framework to express how the object's state affects the needed synchronization level. Our findings indicate that ERC20 tokens, as well as other token standards, are more powerful and versatile than plain cryptocurrencies, and are subject to dynamic requirements. Developing specific synchronization protocols that exploit these dynamic requirements will pave the way towards more robust and scalable blockchain platforms.
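To make the idea of a token as an object with state-dependent behavior more concrete, here is a minimal sequential Python sketch of an ERC20-style token. The class name `Token`, the snake_case method names, and the helper `spenders_of` are assumptions made for this example; the sketch only tracks which accounts can currently move funds out of a given balance and does not compute a consensus number or model concurrency.

```python
class Token:
    """Sequential model of an ERC20-style token (no events, no gas)."""

    def __init__(self, balances):
        self.balances = dict(balances)
        self.allowances = {}  # (owner, spender) -> remaining allowance

    def transfer(self, sender, to, amount):
        """Move funds from the sender's own balance."""
        if amount < 0 or self.balances.get(sender, 0) < amount:
            return False
        self.balances[sender] -= amount
        self.balances[to] = self.balances.get(to, 0) + amount
        return True

    def approve(self, owner, spender, amount):
        """Let spender withdraw up to amount from owner's balance."""
        self.allowances[(owner, spender)] = amount
        return True

    def transfer_from(self, spender, owner, to, amount):
        """Spender moves funds out of owner's balance, within the allowance."""
        if amount < 0:
            return False
        if self.allowances.get((owner, spender), 0) < amount:
            return False
        if self.balances.get(owner, 0) < amount:
            return False
        self.allowances[(owner, spender)] -= amount
        self.balances[owner] -= amount
        self.balances[to] = self.balances.get(to, 0) + amount
        return True

    def spenders_of(self, owner):
        """Accounts that can currently spend from owner's balance."""
        approved = {s for (o, s), a in self.allowances.items()
                    if o == owner and a > 0}
        return {owner} | approved

token = Token({"alice": 10})
print(token.spenders_of("alice"))   # only alice can spend at first
token.approve("alice", "bob", 5)
print(token.spenders_of("alice"))   # now alice and bob contend on one balance
```

In a plain cryptocurrency only the owner can spend from an account, whereas here a call to `approve` changes how many parties may race to withdraw from the same balance, which is one intuition for why the required synchronization can depend on the object's state.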

[89]  arXiv:2101.05544 [pdf, other]
Title: DICE: Diversity in Deep Ensembles via Conditional Redundancy Adversarial Estimation
Comments: Published as a conference paper at ICLR 2021. 9 main pages, 13 figures, 12 tables
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Information Theory (cs.IT)

Deep ensembles perform better than a single network thanks to the diversity among their members. Recent approaches regularize predictions to increase diversity; however, they also drastically decrease individual members' performances. In this paper, we argue that learning strategies for deep ensembles need to tackle the trade-off between ensemble diversity and individual accuracies. Motivated by arguments from information theory and leveraging recent advances in neural estimation of conditional mutual information, we introduce a novel training criterion called DICE: it increases diversity by reducing spurious correlations among features. The main idea is that features extracted from pairs of members should only share information useful for target class prediction without being conditionally redundant. Therefore, besides the classification loss with information bottleneck, we adversarially prevent features from being conditionally predictable from each other. We manage to reduce simultaneous errors while protecting class information. We obtain state-of-the-art accuracy results on CIFAR-10/100: for example, an ensemble of 5 networks trained with DICE matches an ensemble of 7 networks trained independently. We further analyze the consequences on calibration, uncertainty estimation, out-of-distribution detection and online co-distillation.

[90]  arXiv:2101.05548 [pdf, other]
Title: An enhanced VEM formulation for plane elasticity
Comments: 27 pages, 6 figures
Subjects: Numerical Analysis (math.NA)

In this paper, an enhanced Virtual Element Method (VEM) formulation is proposed for plane elasticity. It is based on the improvement of the strain representation within the element, without altering the degree of the displacement interpolating functions on the element boundary. The idea is to fully exploit polygonal elements with a high number of sides, a peculiar VEM feature, characterized by many displacement degrees of freedom on the element boundary, even if a low interpolation order is assumed over each side. The proposed approach is framed within a generalization of the classic VEM formulation, obtained by introducing an energy norm in the projection operator definition. Although such a generalization may appear to have mainly formal value, it makes it possible to point out effectively the mechanical meaning of the quantities involved in the projection operator definition and to drive the selection of the enhanced representations. Various enhancements are proposed and tested through several numerical examples. Numerical results successfully show the capability of the enhanced VEM formulation to (i) considerably increase accuracy (with respect to standard VEM) while keeping the optimal convergence rate, (ii) bypass the need for stabilization terms in many practical cases, (iii) obtain natural serendipity elements in many practical cases, and (iv) effectively treat nearly incompressible materials as well.

[91]  arXiv:2101.05549 [pdf, ps, other]
Title: Spectral Clustering Oracles in Sublinear Time
Comments: Proceedings of the 2021 ACM-SIAM Symposium on Discrete Algorithms (SODA). Society for Industrial and Applied Mathematics, 2021
Subjects: Data Structures and Algorithms (cs.DS)

Given a graph $G$ that can be partitioned into $k$ disjoint expanders with outer conductance upper bounded by $\epsilon\ll 1$, can we efficiently construct a small space data structure that allows quickly classifying vertices of $G$ according to the expander (cluster) they belong to? Formally, we would like an efficient local computation algorithm that misclassifies at most an $O(\epsilon)$ fraction of vertices in every expander. We refer to such a data structure as a \textit{spectral clustering oracle}. Our main result is a spectral clustering oracle with query time $O^*(n^{1/2+O(\epsilon)})$ and preprocessing time $2^{O(\frac{1}{\epsilon} k^4 \log^2(k))} n^{1/2+O(\epsilon)}$ that provides misclassification error $O(\epsilon \log k)$ per cluster for any $\epsilon \ll 1/\log k$. More generally, query time can be reduced at the expense of increasing the preprocessing time appropriately (as long as the product is about $n^{1+O(\epsilon)}$) -- this in particular gives a nearly linear time spectral clustering primitive. The main technical contribution is a sublinear time oracle that provides dot product access to the spectral embedding of $G$ by estimating distributions of short random walks from vertices in $G$. The distributions themselves provide a poor approximation to the spectral embedding, but we show that an appropriate linear transformation can be used to achieve high precision dot product access. We then show that dot product access to the spectral embedding is sufficient to design a clustering oracle. At a high level our approach amounts to hyperplane partitioning in the spectral embedding of $G$, but crucially operates on a nested sequence of carefully defined subspaces in the spectral embedding to achieve per cluster recovery guarantees.

[92]  arXiv:2101.05555 [pdf, other]
Title: Non-intrusive surrogate modeling for parametrized time-dependent PDEs using convolutional autoencoders
Subjects: Numerical Analysis (math.NA); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)

This work presents a non-intrusive surrogate modeling scheme based on machine learning technology for predictive modeling of complex systems, described by parametrized time-dependent PDEs. For these problems, typical finite element approaches involve the spatiotemporal discretization of the PDE and the solution of the corresponding linear system of equations at each time step. Instead, the proposed method utilizes a convolutional autoencoder in conjunction with a feed-forward neural network to establish a low-cost and accurate mapping from the problem's parametric space to its solution space. For this purpose, time history response data are collected by solving the high-fidelity model via FEM for a reduced set of parameter values. Then, by applying the convolutional autoencoder to this data set, a low-dimensional representation of the high-dimensional solution matrices is provided by the encoder, while the reconstruction map is obtained by the decoder. Using the latent representation given by the encoder, a feed-forward neural network is efficiently trained to map points from the problem's parametric space to the compressed version of the respective solution matrices. This way, the encoded response of the system at new parameter values is given by the neural network, while the entire response is delivered by the decoder. This approach effectively bypasses the need to serially formulate and solve the system's governing equations at each time increment, thus resulting in a significant cost reduction and rendering the method ideal for problems requiring repeated model evaluations or 'real-time' computations. The elaborated methodology is demonstrated on the stochastic analysis of time-dependent PDEs solved with the Monte Carlo method; however, it can be straightforwardly applied to other similar-type problems, such as sensitivity analysis, design optimization, etc.

[93]  arXiv:2101.05564 [pdf, other]
Title: FabricNet: A Fiber Recognition Architecture Using Ensemble ConvNets
Comments: Accepted in IEEE Access
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

Fabric is a planar material composed of textile fibers. Textile fibers come from many natural sources, including plants, animals, and minerals, and can also be synthetic. A particular fabric may contain different types of fibers that pass through a complex production process. Fiber identification is usually carried out through chemical tests and microscopic tests. However, these testing processes are complicated as well as time-consuming. We propose FabricNet, a pioneering approach for image-based textile fiber recognition, which may have a revolutionary impact on fiber recognition processes from the individual to the industrial level. FabricNet can recognize a wide range of fibers using only a surface image of the fabric. The recognition system is constructed using a distinct category of class-based ensemble convolutional neural network (CNN) architecture. The experiment is conducted on recognizing 50 different types of textile fibers. To the best of our knowledge, this experiment includes a significantly larger number of unique textile fibers than previous research endeavors. We experiment with popular CNN architectures that include Inception, ResNet, VGG, MobileNet, DenseNet, and Xception. Finally, the experimental results demonstrate that FabricNet outperforms the state-of-the-art popular CNN architectures by reaching an accuracy of 84% and F1-score of 90%.

[94]  arXiv:2101.05567 [pdf, other]
Title: Design of false data injection attack on distributed process estimation
Comments: arXiv admin note: substantial text overlap with arXiv:2002.01545
Subjects: Systems and Control (eess.SY); Multiagent Systems (cs.MA)

Herein, the design of a false data injection attack on a distributed cyber-physical system is considered. A stochastic process with linear dynamics and Gaussian noise is measured by multiple agent nodes, each equipped with multiple sensors. The agent nodes form a multi-hop network among themselves. Each agent node computes an estimate of the process by using its sensor observation and messages obtained from neighboring nodes, via Kalman-consensus filtering. An external attacker, capable of arbitrarily manipulating the sensor observations of some or all agent nodes, injects errors into those sensor observations. The goal of the attacker is to steer the estimates at the agent nodes as close as possible to a pre-specified value, while respecting a constraint on the attack detection probability. To this end, a constrained optimization problem is formulated to find the optimal parameter values of a certain class of linear attacks. The parameters of the linear attack are learnt on-line via a combination of a stochastic-approximation-based update of a Lagrange multiplier and an optimization technique involving either the Karush-Kuhn-Tucker (KKT) conditions or online stochastic gradient descent. The problem turns out to be convex for some special cases. The desired convergence of the proposed algorithms is proved by exploiting the convexity and properties of stochastic approximation algorithms. Finally, numerical results demonstrate the efficacy of the attack.

[95]  arXiv:2101.05570 [pdf, other]
Title: TypeNet: Deep Learning Keystroke Biometrics
Comments: arXiv admin note: substantial text overlap with arXiv:2004.03627
Subjects: Computer Vision and Pattern Recognition (cs.CV)

We study the performance of Long Short-Term Memory networks for keystroke biometric authentication at large scale in free-text scenarios. For this we introduce TypeNet, a Recurrent Neural Network (RNN) trained with a moderate number of keystrokes per identity. We evaluate different learning approaches depending on the loss function (softmax, contrastive, and triplet loss), number of gallery samples, length of the keystroke sequences, and device type (physical vs touchscreen keyboard). With 5 gallery sequences and test sequences of length 50, TypeNet achieves state-of-the-art keystroke biometric authentication performance with an Equal Error Rate of 2.2% and 9.2% for physical and touchscreen keyboards, respectively, significantly outperforming previous approaches. Our experiments demonstrate a moderate increase in error with up to 100,000 subjects, demonstrating the potential of TypeNet to operate at an Internet scale. We utilize two Aalto University keystroke databases, one captured on physical keyboards and the second on mobile devices (touchscreen keyboards). To the best of our knowledge, both databases are the largest existing free-text keystroke databases available for research with more than 136 million keystrokes from 168,000 subjects in physical keyboards, and 60,000 subjects with more than 63 million keystrokes acquired on mobile touchscreens.
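
The Equal Error Rate quoted above is the operating point at which the false acceptance rate and the false rejection rate coincide. The following is a minimal sketch of how such a value could be approximated from two lists of comparison scores. It assumes the scores are distances (lower means more similar); the function name `equal_error_rate` and the toy score lists are hypothetical and are not taken from the paper.

```python
def equal_error_rate(genuine, impostor):
    """Approximate the EER from distance scores (lower = more similar).

    genuine:  distances between samples of the same subject.
    impostor: distances between samples of different subjects.
    """
    best_gap, eer = float("inf"), None
    # Try every observed score as a decision threshold.
    for t in sorted(set(genuine) | set(impostor)):
        # False rejection: a genuine pair whose distance exceeds the threshold.
        frr = sum(d > t for d in genuine) / len(genuine)
        # False acceptance: an impostor pair whose distance is within the threshold.
        far = sum(d <= t for d in impostor) / len(impostor)
        # Keep the threshold where the two error rates are closest.
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer


# Toy scores, invented purely for illustration.
genuine_scores = [0.1, 0.2, 0.3, 0.6]
impostor_scores = [0.4, 0.7, 0.8, 0.9]
eer = equal_error_rate(genuine_scores, impostor_scores)
```

With real data the two error curves rarely cross exactly at an observed score, so the sketch reports the midpoint of the two rates at the closest threshold.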

[96]  arXiv:2101.05577 [pdf, other]
Title: All-at-once formulation meets the Bayesian approach: A study of two prototypical linear inverse problems
Subjects: Numerical Analysis (math.NA)

In this work, the Bayesian approach to inverse problems is formulated in an all-at-once setting. The advantages of the all-at-once formulation are known to include the avoidance of a parameter-to-state map as well as numerical improvements, especially when considering nonlinear problems. In the Bayesian approach, prior knowledge is taken into account with the help of a prior distribution. In addition, the error in the observation equation is formulated by means of a distribution. This method naturally results in a whole posterior distribution for the unknown target, not just point estimates. This allows for further statistical analysis including the computation of credible intervals. We combine the Bayesian setting with the all-at-once formulation, resulting in a novel approach for investigating inverse problems. With this combination we are able to choose a prior not only for the parameter, but also for the state variable, which directly influences the parameter. Furthermore, errors not only in the observation equation, but additionally in the model, can be taken into account. We analyze this approach with the help of two linear standard examples, namely the inverse source problem for the Poisson equation and the backwards heat equation, i.e. a stationary and a time dependent problem. Appropriate function spaces and derivation of adjoint operators are investigated. To assess the degree of ill-posedness, we analyze the singular values of the corresponding all-at-once forward operators. Finally, joint priors are designed and numerically tested.

[97]  arXiv:2101.05589 [pdf, other]
Title: DeFi-ning DeFi: Challenges & Pathway
Authors: Hendrik Amler (1), Lisa Eckey (1), Sebastian Faust (1), Marcel Kaiser (2), Philipp Sandner (2), Benjamin Schlosser (1) ((1) Technical University of Darmstadt, (2) Frankfurt School of Finance and Management)
Subjects: Cryptography and Security (cs.CR)

The decentralized and trustless nature of cryptocurrencies and blockchain technology leads to a shift in the digital world. The possibility to execute small programs, called smart contracts, on cryptocurrencies like Ethereum opened doors to countless new applications. One particularly exciting use case is decentralized finance (DeFi), which aims to revolutionize traditional financial services by founding them on a decentralized infrastructure. We show the potential of DeFi by analyzing its advantages compared to traditional finance. Additionally, we survey the state of the art of DeFi products and categorize existing services. Since DeFi is still in its infancy, there are countless hurdles for mass adoption. We discuss the most prominent challenges and point out possible solutions. Finally, we analyze the economics behind DeFi products. By carefully analyzing the state of the art and discussing current challenges, we give a perspective on how the DeFi space might develop in the near future.

[98]  arXiv:2101.05591 [pdf, other]
Title: ANDROMEDA: An FPGA Based RISC-V MPSoC Exploration Framework
Comments: Accepted in VLSI Design 2021
Subjects: Hardware Architecture (cs.AR)

With the growing demands of consumer electronic products, the computational requirements are increasing exponentially. Due to the applications' computational needs, the computer architects are trying to pack as many cores as possible on a single die for accelerated execution of the application program codes. In a multiprocessor system-on-chip (MPSoC), striking a balance among the number of cores, memory subsystems, and network-on-chip parameters is essential to attain the desired performance. In this paper, we present ANDROMEDA, a RISC-V based framework that allows us to explore the different configurations of an MPSoC and observe the performance penalties and gains. We emulate the various configurations of MPSoC on the Synopsys HAPS-80D Dual FPGA platform. Using STREAM, matrix multiply, and N-body simulations as benchmarks, we demonstrate our framework's efficacy in quickly identifying the right parameters for efficient execution of these benchmarks.

[99]  arXiv:2101.05592 [pdf, other]
Title: Dynamic network analysis of a target defense differential game with limited observations
Comments: 9 figures
Subjects: Systems and Control (eess.SY)

In this paper, we study a Target-Attacker-Defender (TAD) differential game involving one attacker and one target, both with unlimited visibility range, and multiple defenders with limited visibility capabilities. We assume that the target and attacker are unaware of the defenders' visibility constraints, which results in a dynamic game with asymmetric information. We seek to obtain the strategies that are likely to be used by the players. The visibility constraints of the defenders induce a visibility network which encapsulates the visibility information during the evolution of the game. Based on this observation, we introduce network adapted feedback or implementable strategies for the defenders. We construct a class of parametric performance indices for which the defenders' strategies, along with standard feedback Nash strategies of the attacker and target, provide a network adapted feedback Nash equilibrium. We introduce a consistency criterion for selecting a subset (or refinement) of network adapted feedback Nash strategies, and provide an optimization based approach for computing them. Finally, we illustrate our results with numerical experiments.

[100]  arXiv:2101.05593 [pdf, ps, other]
Title: On the Temporality of Priors in Entity Linking
Journal-ref: 2020 European Conference on Information Retrieval
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG)

Entity linking is a fundamental task in natural language processing which deals with the lexical ambiguity in texts. An important component in entity linking approaches is the mention-to-entity prior probability. Even though there is a large number of works in entity linking, the existing approaches do not explicitly consider the time aspect, specifically the temporality of an entity's prior probability. We posit that this prior probability is temporal in nature and affects the performance of entity linking systems. In this paper we systematically study the effect of the prior on the entity linking performance over the temporal validity of both texts and KBs.

[101]  arXiv:2101.05597 [pdf, ps, other]
Title: Necessary and Sufficient Condition for Satisfiability of a Boolean Formula in CNF and its Implications on P versus NP problem
Authors: Manoj Kumar
Comments: 16 Pages
Subjects: Computational Complexity (cs.CC)

In this paper, a necessary and sufficient condition for the satisfiability of a Boolean formula in CNF is determined. It is found that the maximum cardinality of a satisfiable Boolean formula increases exponentially with the number of variables. As a result, any algorithm requires exponential time in the worst case, depending on the number of variables in the formula, to check the satisfiability of a given Boolean formula, which proves the non-existence of a polynomial-time algorithm for the satisfiability problem. As satisfiability is an NP-complete problem, the non-existence of a polynomial-time algorithm for it proves the exclusion of satisfiability from class P, which implies that P is not equal to NP. Further, the necessary and sufficient condition can be used to optimize existing algorithms; in some cases, the unsatisfiability of a given Boolean function can be determined in polynomial time. For this purpose, a novel function is defined that can be used to determine the cardinality of a given Boolean formula, and the occurrences of a literal in the given formula, in polynomial time.

[102]  arXiv:2101.05604 [pdf, ps, other]
Title: Decoding of Interleaved Linearized Reed-Solomon Codes with Applications to Network Coding
Comments: 6 pages, 2 figures, submitted to ISIT 2021
Subjects: Information Theory (cs.IT)

Recently, Martinez-Penas and Kschischang (IEEE Trans. Inf. Theory, 2019) showed that lifted linearized Reed-Solomon codes are suitable codes for error control in multi-shot network coding. We show how to construct and decode lifted interleaved linearized Reed-Solomon codes. Compared to the construction by Martinez-Penas and Kschischang, interleaving allows the decoding region to be increased significantly (especially w.r.t. the number of insertions) and decreases the overhead due to the lifting (i.e., increases the code rate), at the cost of an increased packet size. The proposed decoder is a list decoder that can also be interpreted as a probabilistic unique decoder. Although our best upper bound on the list size is exponential, we present a heuristic argument and simulation results that indicate that the list size is in fact one for most channel realizations up to the maximal decoding radius.

[103]  arXiv:2101.05605 [pdf, other]
Title: A Physics-Informed Machine Learning Model for Porosity Analysis in Laser Powder Bed Fusion Additive Manufacturing
Comments: 14 pages
Subjects: Machine Learning (cs.LG)

To control part quality, it is critical to analyze pore generation mechanisms, laying a theoretical foundation for future porosity control. Current porosity analysis models use machine setting parameters, such as laser angle and part pose. However, these setting-based models are machine dependent, hence they often do not transfer to the analysis of porosity for a different machine. To address this problem, a physics-informed, data-driven model (PIM) is proposed which, instead of directly using machine setting parameters to predict porosity levels of printed parts, first interprets machine settings into physical effects, such as laser energy density and laser radiation pressure. Then, these physical, machine independent effects are used to predict porosity levels according to pass, flag, fail categories instead of focusing on quantitative pore size prediction. Evaluated with six learning methods, PIM achieved good performance, with prediction errors of 10-26%. Finally, pore-encouraging and pore-suppressing influences were analyzed for quality analysis.

[104]  arXiv:2101.05607 [pdf, other]
Title: Intelligent Reflecting Surfaces for Compute-and-Forward
Subjects: Information Theory (cs.IT); Networking and Internet Architecture (cs.NI)

In this paper, we show that using Intelligent Reflecting Surfaces (IRS) can enhance the computing capability of a wireless network scenario. We consider a Multiple Access Channel (MAC) where a number of users aim to send data to a Base Station (BS). The BS is interested in decoding a linear combination of the data from different users in the corresponding finite field. By focusing on the Compute-and-Forward framework, we show that through carefully choosing the IRS parameters, such a scenario's computation rate will be significantly improved. More specifically, we formulate an optimization problem to maximize the computation rate and tackle the problem via an alternating optimization (AO) approach. Our results confirm the usefulness of IRS technology for future wireless networks -- such as 6G -- with massive computation requirements.

[105]  arXiv:2101.05608 [pdf]
Title: Deep Cellular Recurrent Network for Efficient Analysis of Time-Series Data with Spatial Information
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Signal Processing (eess.SP)

Efficient processing of large-scale time series data is an intricate problem in machine learning. Conventional sensor signal processing pipelines with hand engineered feature extraction often involve huge computational cost with high dimensional data. Deep recurrent neural networks have shown promise in automated feature learning for improved time-series processing. However, generic deep recurrent models grow in scale and depth with increased complexity of the data. This is particularly challenging in the presence of high dimensional data with temporal and spatial characteristics. Consequently, this work proposes a novel deep cellular recurrent neural network (DCRNN) architecture to efficiently process complex multi-dimensional time series data with spatial information. The cellular recurrent architecture in the proposed model allows for location-aware synchronous processing of time series data from spatially distributed sensor signal sources. Extensive trainable parameter sharing due to cellularity in the proposed architecture ensures efficiency in the use of recurrent processing units with high-dimensional inputs. This study also investigates the versatility of the proposed DCRNN model for classification of multi-class time series data from different application domains. Consequently, the proposed DCRNN architecture is evaluated using two time-series datasets: a multichannel scalp EEG dataset for seizure detection, and a machine fault detection dataset obtained in-house. The results suggest that the proposed architecture achieves state-of-the-art performance while utilizing substantially fewer trainable parameters than comparable methods in the literature.

[106]  arXiv:2101.05611 [pdf, other]
Title: TrNews: Heterogeneous User-Interest Transfer Learning for News Recommendation
Comments: EACL 2021
Subjects: Information Retrieval (cs.IR); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)

We investigate how to solve cross-corpus news recommendation for unseen users in the future. This is a problem where traditional content-based recommendation techniques often fail. Luckily, in real-world recommendation services, some publishers (e.g., Daily news) may have accumulated a large corpus with lots of consumers which can be used for a newly deployed publisher (e.g., Political news). To take advantage of the existing corpus, we propose a transfer learning model (dubbed TrNews) for news recommendation to transfer the knowledge from a source corpus to a target corpus. To tackle the heterogeneity of different user interests and of different word distributions across corpora, we design a translator-based transfer-learning strategy to learn a representation mapping between source and target corpora. The learned translator can be used to generate representations for unseen users in the future. We show through experiments on real-world datasets that TrNews is better than various baselines in terms of four metrics. We also show that our translator is effective among existing transfer strategies.

[107]  arXiv:2101.05612 [pdf, other]
Title: A SOM-based Gradient-Free Deep Learning Method with Convergence Analysis
Subjects: Machine Learning (cs.LG)

As the gradient descent method in deep learning raises a series of issues, this paper proposes a novel gradient-free deep learning structure. By adding a new module to the traditional Self-Organizing Map and introducing residuals into the map, a Deep Valued Self-Organizing Map network is constructed. A convergence analysis of this Deep Valued Self-Organizing Map network is also given, which yields an inequality relating the designed parameters to the dimension of the inputs and the prediction loss.

[108]  arXiv:2101.05614 [pdf]
Title: Review on the Security Threats of Internet of Things
Comments: 9 Pages, 9 figures
Journal-ref: International Journal of Computer Applications (IJCA), 2020
Subjects: Cryptography and Security (cs.CR); Signal Processing (eess.SP)

The Internet of Things (IoT) is considered the growth engine of industrial revolution 4.0. The combination of IoT, cloud computing and healthcare can contribute to ensuring the well-being of people. One important challenge of IoT networks is maintaining privacy and overcoming security threats. This paper provides a systematic review of the security aspects of IoT. Firstly, the applications of IoT in industrial and medical service scenarios are described, and the security threats are discussed for the different layers of the IoT healthcare architecture. Secondly, different types of existing malware including spyware, viruses, worms, keyloggers, and trojan horses are described in the context of IoT. Thirdly, some of the recent malware attacks such as Mirai, echobot and reaper are discussed. Next, a comparative discussion is presented on the effectiveness of different machine learning algorithms in mitigating the security threats. It is found that the k-nearest neighbor (kNN) machine learning algorithm exhibits excellent accuracy in detecting malware. This paper also reviews different tools for ransomware detection, classification and analysis. Finally, a discussion is presented on the existing security issues, open challenges and possible future scopes in ensuring IoT security.

[109]  arXiv:2101.05615 [pdf, other]
Title: FBGEMM: Enabling High-Performance Low-Precision Deep Learning Inference
Subjects: Machine Learning (cs.LG); Performance (cs.PF)

Deep learning models typically use single-precision (FP32) floating point data types for representing activations and weights, but a slew of recent research work has shown that computations with reduced-precision data types (FP16, 16-bit integers, 8-bit integers or even 4- or 2-bit integers) are enough to achieve the same accuracy as FP32 and are much more efficient. Therefore, we designed fbgemm, a high-performance kernel library, from the ground up to perform high-performance quantized inference on current generation CPUs. fbgemm achieves efficiency by fusing common quantization operations with a high-performance gemm implementation and by shape- and size-specific kernel code generation at runtime. The library has been deployed at Facebook, where it delivers greater than 2x performance gains with respect to our current production baseline.
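
To make the idea of quantized matrix multiplication concrete, here is a minimal pure-Python sketch. It is not fbgemm's API or implementation; it assumes symmetric per-tensor 8-bit quantization, and the helper names `quantize` and `quantized_matmul` are hypothetical. The operands are mapped to integers in [-127, 127], the dot products are accumulated as integers, and the rescaling back to floating point is applied in the same pass that produces each output element.

```python
def quantize(matrix):
    """Symmetric per-tensor quantization of a float matrix to the int8 range."""
    max_abs = max(abs(v) for row in matrix for v in row)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [[max(-127, min(127, round(v / scale))) for v in row] for row in matrix]
    return q, scale


def quantized_matmul(a, b):
    """Multiply two float matrices via integer operands and accumulation."""
    qa, sa = quantize(a)
    qb, sb = quantize(b)
    cols_b = list(zip(*qb))
    # Integer dot products; the dequantization factor (sa * sb) is applied
    # as each output element is produced, not in a separate pass.
    return [[sum(x * y for x, y in zip(row, col)) * sa * sb for col in cols_b]
            for row in qa]


# Toy matrices, invented for illustration.
a = [[1.0, -2.0], [0.5, 4.0]]
b = [[3.0, 0.0], [-1.0, 2.0]]
approx = quantized_matmul(a, b)
exact = [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]
```

On these small inputs the quantized result stays close to the floating-point product; how much accuracy is lost in practice depends on the value distribution and the quantization scheme.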

[110]  arXiv:2101.05616 [pdf]
Title: Road Surface Translation Under Snow-covered and Semantic Segmentation for Snow Hazard Index
Comments: 8 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)

In 2020, record heavy snowfall occurred owing to climate change. In such conditions, 2,000 vehicles on a highway can become stuck for three days, and freezing of the road surface can cause a chain-reaction ("billiard") accident involving 10 vehicles. Road managers are required to provide snow hazard information immediately in order to alert drivers to snow cover at hazardous locations. This paper proposes a deep learning application with CCTV image post-processing to automatically calculate a snow hazard indicator. First, the road surface hidden under snow cover is translated using a generative adversarial network, pix2pix. Second, snow-covered and road surface classes are detected using semantic segmentation with DeepLabv3+ and a MobileNet backbone. Based on these trained networks, we can automatically compute the road-to-snow rate hazard index, which indicates how much of the road surface is covered by snow. We demonstrate results on 1,000 CCTV snow images from the Hokkaido and North Tohoku areas of Japan, and discuss the usefulness and practical robustness of the approach.

[111]  arXiv:2101.05617 [pdf, other]
Title: High-order numerical solutions to the shallow-water equations on the rotated cubed-sphere grid
Subjects: Numerical Analysis (math.NA)

High-order numerical methods are applied to the shallow-water equations on the sphere. A space-time tensor formalism is used to express the equations of motion covariantly and to describe the geometry of the rotated cubed-sphere grid. The spatial discretization is done with the direct flux reconstruction method, which is an alternative formulation to the Discontinuous Galerkin approach. The equations of motion are solved in differential form and the resulting discretization is free from quadrature rules. It is well known that the time step of traditional explicit methods is limited by the phase speed of the fastest waves. Exponential time integration schemes remove this stability restriction and allow larger time steps. New multistep exponential propagation iterative methods of orders 4, 5 and 6 are introduced. The complex-step approximation of the Jacobian is applied to the Krylov-based KIOPS (Krylov with incomplete orthogonalization procedure solver) algorithm for computing matrix-vector products with φ-functions. Results are evaluated using standard benchmarks.
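
The complex-step approximation mentioned above can be illustrated in a few lines. For an analytic function f, the Jacobian-vector product satisfies J(x) v ≈ Im(f(x + i h v)) / h, and because no subtraction of nearby values is involved, the step h can be made extremely small. The sketch below is a generic illustration under that assumption, not the paper's implementation; the function `complex_step_jvp` and the toy function `f` are hypothetical.

```python
import cmath


def complex_step_jvp(f, x, v, h=1e-20):
    """Approximate the Jacobian-vector product J(x) v as Im(f(x + i*h*v)) / h."""
    perturbed = [xi + 1j * h * vi for xi, vi in zip(x, v)]
    return [fi.imag / h for fi in f(perturbed)]


def f(u):
    # Toy analytic map R^2 -> R^2, chosen only for illustration.
    return [u[0] * u[1], cmath.exp(u[0]) + u[1] * u[1]]


x, v = [1.0, 2.0], [1.0, 0.5]
jv = complex_step_jvp(f, x, v)
# Analytic Jacobian at x is [[2, 1], [e, 4]], so J v = [2.5, e + 2].
```

A Krylov method only needs the action of the Jacobian on vectors, so a routine of this shape can stand in for an explicitly assembled matrix, provided the right-hand side is implemented with complex-safe operations.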

[112]  arXiv:2101.05620 [pdf]
Title: A Framework for Assurance of Medication Safety using Machine Learning
Subjects: Machine Learning (cs.LG); Computers and Society (cs.CY)

Medication errors continue to be the leading cause of avoidable patient harm in hospitals. This paper sets out a framework to assure medication safety that combines machine learning and safety engineering methods. It uses safety analysis to proactively identify potential causes of medication error, based on expert opinion. As healthcare is now data rich, it is possible to augment safety analysis with machine learning to discover actual causes of medication error from the data, and to identify where they deviate from what was predicted in the safety analysis. Combining these two views has the potential to enable the risk of medication errors to be managed proactively and dynamically. We apply the framework to a case study involving thoracic surgery, e.g. oesophagectomy, where errors in giving beta-blockers can be critical to control atrial fibrillation. This case study combines a HAZOP-based safety analysis method known as SHARD with Bayesian network structure learning and process mining to produce the analysis results, showing the potential of the framework for ensuring patient safety, and for transforming the way that safety is managed in complex healthcare environments.

[113]  arXiv:2101.05623 [pdf, other]
Title: Design of borehole resistivity measurement acquisition systems using deep learning
Subjects: Machine Learning (cs.LG); Signal Processing (eess.SP); Numerical Analysis (math.NA)

Borehole resistivity measurements recorded with logging-while-drilling (LWD) instruments are widely used for characterizing the earth's subsurface properties. They facilitate the extraction of natural resources such as oil and gas. LWD instruments require real-time inversions of electromagnetic measurements to estimate the electrical properties of the earth's subsurface near the well and possibly correct the well trajectory. Deep Neural Network (DNN)-based methods are suitable for the rapid inversion of borehole resistivity measurements as they approximate the forward and inverse problem offline during the training phase and they only require a fraction of a second for the evaluation (aka prediction). However, the inverse problem generally admits multiple solutions. DNNs with traditional loss functions based on data misfit are ill-equipped for solving an inverse problem. This can be partially overcome by adding regularization terms to a loss function specifically designed for encoder-decoder architectures. But adding regularization seriously limits the number of possible solutions to a set of a priori desirable physical solutions. To avoid this, we use a two-step loss function without any regularization. In addition, to guarantee an inverse solution, we need a carefully selected measurement acquisition system with a sufficient number of measurements. In this work, we propose a DNN-based iterative algorithm for designing such a measurement acquisition system. We illustrate our DNN-based iterative algorithm via several synthetic examples. Numerical results show that the obtained measurement acquisition system is sufficient to identify and characterize both resistive and conductive layers above and below the logging instrument. Numerical results are promising, although further improvements are required to make our method amenable for industrial purposes.

[114]  arXiv:2101.05624 [pdf, other]
Title: Adversarially robust and explainable model compression with on-device personalization for NLP applications
Subjects: Machine Learning (cs.LG)

On-device Deep Neural Networks (DNNs) have recently gained more attention due to the increasing computing power of mobile devices and the number of applications in Computer Vision (CV), Natural Language Processing (NLP), and the Internet of Things (IoT). Unfortunately, the existing efficient convolutional neural network (CNN) architectures designed for CV tasks are not directly applicable to NLP tasks, and the tiny Recurrent Neural Network (RNN) architectures have been designed primarily for IoT applications. In NLP applications, although model compression has seen initial success in on-device text classification, there are at least three major challenges yet to be addressed: adversarial robustness, explainability, and personalization. Here we attempt to tackle these challenges by designing a new training scheme for model compression and adversarial robustness, including the optimization of an explainable feature mapping objective, a knowledge distillation objective, and an adversarial robustness objective. The resulting compressed model is personalized using on-device private training data via fine-tuning. We perform extensive experiments to compare our approach with both compact RNN (e.g., FastGRNN) and compressed RNN (e.g., PRADO) architectures in both natural and adversarial NLP test settings.

[115]  arXiv:2101.05625 [pdf, other]
Title: Learning Student Interest Trajectory for MOOCThread Recommendation
Subjects: Information Retrieval (cs.IR); Computers and Society (cs.CY); Machine Learning (cs.LG)

In recent years, Massive Open Online Courses (MOOCs) have witnessed immense growth in popularity. Now, due to the recent COVID-19 pandemic, it is important to push the limits of online education. Discussion forums are a primary means of interaction among learners and instructors. However, with growing class size, students face the challenge of finding useful and informative discussion forums. This problem can be solved by matching the interests of students with thread contents. The fundamental challenge is that student interests drift as they progress through the course, and forum contents evolve as students or instructors update them. In our paper, we propose to predict future interest trajectories of students. Our model consists of two key operations: 1) Update operation and 2) Projection operation. The update operation models the inter-dependency between the evolution of student and thread using coupled Recurrent Neural Networks when the student posts on the thread. The projection operation learns to estimate future embeddings of students and threads. For students, the projection operation learns the drift in their interests caused by the change in the course topic they study. The projection operation for threads exploits how different posts induce varying interest levels in a student according to the thread structure. Extensive experimentation on three real-world MOOC datasets shows that our model significantly outperforms other baselines for thread recommendation.

[116]  arXiv:2101.05626 [pdf, other]
Title: Eating Garlic Prevents COVID-19 Infection: Detecting Misinformation on the Arabic Content of Twitter
Comments: 18 pages, 4 figures
Subjects: Information Retrieval (cs.IR); Computers and Society (cs.CY); Machine Learning (cs.LG); Social and Information Networks (cs.SI)

The rapid growth of social media content during the current pandemic provides useful tools for disseminating information, but it has also become a source of misinformation. Therefore, there is an urgent need for fact-checking and effective techniques for detecting misinformation in social media. In this work, we study misinformation in the Arabic content of Twitter. We construct a large Arabic dataset related to COVID-19 misinformation and gold-annotate the tweets into two categories: misinformation or not. Then, we apply eight different traditional and deep machine learning models, with different features including word embeddings and word frequency. The word embedding models (FastText and word2vec) exploit more than two million Arabic tweets related to COVID-19. Experiments show that optimizing the area under the curve (AUC) improves the models' performance and that Extreme Gradient Boosting (XGBoost) achieves the highest accuracy in detecting COVID-19 misinformation online.

[117]  arXiv:2101.05628 [pdf, other]
Title: Game-based Pricing and Task Offloading in Mobile Edge Computing Enabled Edge-Cloud Systems
Comments: 12 pages, 9 figures
Subjects: Computer Science and Game Theory (cs.GT)

As an important enabler of the Internet of Things (IoT), mobile edge computing (MEC) provides IoT mobile devices (MD) with powerful external computing and storage resources. However, a mechanism addressing distributed task offloading and price competition for the open exchange marketplace has not been established properly, which has become a huge obstacle to MEC's application in the IoT market. In this paper, we formulate a distributed mechanism to analyze the interaction between OSPs and IoT MDs in the MEC enabled edge-cloud system by applying multi-leader multi-follower two-tier Stackelberg game theory. We first prove the existence of the Stackelberg equilibrium, and then we propose two distributed algorithms, namely the iterative proximal offloading algorithm (IPOA) and the iterative Stackelberg game pricing algorithm (ISPA). The IPOA solves the follower non-cooperative game among IoT MDs, and the ISPA uses backward induction to deal with the price competition among OSPs. Experimental results show that IPOA can markedly reduce the disutility of IoT MDs compared with other traditional task offloading schemes and that the price of anarchy is always less than 150%. Besides, results also demonstrate that ISPA is reliable in boosting the revenue of OSPs.

[118]  arXiv:2101.05629 [pdf, ps, other]
Title: Off-grid Channel Estimation with Sparse Bayesian Learning for OTFS Systems
Comments: 30 pages, 9 figures, submitted to IEEE Transactions on Wireless Communications
Subjects: Information Theory (cs.IT)

This paper proposes an off-grid channel estimation scheme for orthogonal time-frequency space (OTFS) systems adopting the sparse Bayesian learning (SBL) framework. To avoid channel spreading caused by the fractional delay and Doppler shifts and to fully exploit the channel sparsity in the delay-Doppler (DD) domain, we estimate the original DD domain channel response rather than the effective DD domain channel response as commonly adopted in the literature. OTFS channel estimation is first formulated as a one-dimensional (1D) off-grid sparse signal recovery (SSR) problem based on a virtual sampling grid defined in the DD space, where the on-grid and off-grid components of the delay and Doppler shifts are separated for estimation. In particular, the on-grid components of the delay and Doppler shifts are jointly determined by the entry indices with significant values in the recovered sparse vector. Then, the corresponding off-grid components are modeled as hyper-parameters in the proposed SBL framework, which can be estimated via the expectation-maximization method. To strike a balance between channel estimation performance and computational complexity, we further propose a two-dimensional (2D) off-grid SSR problem via decoupling the delay and Doppler shift estimations. In our developed 1D and 2D off-grid SBL-based channel estimation algorithms, the hyper-parameters are updated alternately for computing the conditional posterior distribution of channels, which can be exploited to reconstruct the effective DD domain channel. Compared with the 1D method, the proposed 2D method enjoys a much lower computational complexity while suffering only slight performance degradation. Simulation results verify the superior performance of the proposed channel estimation schemes over state-of-the-art schemes.

[119]  arXiv:2101.05631 [pdf]
Title: Parkinson's Disease Diagnosis Using Deep Learning
Authors: Mohamad Alissa
Comments: Master Research Project
Subjects: Machine Learning (cs.LG); Image and Video Processing (eess.IV)

Parkinson's Disease (PD) is a chronic, degenerative disorder which leads to a range of motor and cognitive symptoms. PD diagnosis is a challenging task since its symptoms are very similar to other diseases such as normal ageing and essential tremor. Much research has been applied to diagnosing this disease. This project aims to automate the PD diagnosis process using deep learning, Recursive Neural Networks (RNN) and Convolutional Neural Networks (CNN), to differentiate between healthy and PD patients. Besides that, since different datasets may capture different aspects of this disease, this project aims to explore which PD test is more effective in the discrimination process by analysing different imaging and movement datasets (notably cube and spiral pentagon datasets). In addition, this project evaluates which dataset type, imaging or time series, is more effective in diagnosing PD.

[120]  arXiv:2101.05633 [pdf, other]
Title: Enhanced Audit Techniques Empowered by the Reinforcement Learning Pertaining to IFRS 16 Lease
Authors: Byungryul Choi
Comments: for codes, please refer to this https URL
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)

The purpose of an accounting audit is to gain a clear understanding of the financial activities of a company, which can be enhanced by machine learning or reinforcement learning, since numerical analysis can then outperform manual analysis. Assessing the relevance, completeness and accuracy of the information produced by an entity pertaining to the newly implemented International Financial Reporting Standard 16 Lease (IFRS 16) is one such candidate, as it requires understanding the nature of contracts and analysing them completely, starting from a list compiled without omission. This can be enhanced by digitalizing contracts for the purpose of creating the lists, but it still leaves the need to audit the cash flows of companies for possible omissions due to potential errors at the data collection stage, especially for entities with various short- or middle-term business sites and related leases, such as construction entities.
Reinforcement learning and its well-known code are implemented in order to demonstrate the possibility and utility of interpreters from domain knowledge to a numerical system, which can also be called a 'gamification interpreter' or 'numericalization interpreter'. These can be compared to extrapolation with nondimensional numbers in physics, such as the Froude number, which was a source of inspiration for this study. Studies on such interpreters may help empower the use of artificial general intelligence in domain-specific and commercial areas.

[121]  arXiv:2101.05634 [pdf, other]
Title: Better Together -- An Ensemble Learner for Combining the Results of Ready-made Entity Linking Systems
Comments: SAC '20: Proceedings of the 35th Annual ACM Symposium on Applied Computing
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG)

Entity linking (EL) is the task of automatically identifying entity mentions in text and resolving them to a corresponding entity in a reference knowledge base like Wikipedia. Throughout the past decade, a plethora of EL systems and pipelines have become available, where performance of individual systems varies heavily across corpora, languages or domains. Linking performance varies even between different mentions in the same text corpus, where, for instance, some EL approaches are better able to deal with short surface forms while others may perform better when more context information is available. To this end, we argue that performance may be optimised by exploiting results from distinct EL systems on the same corpus, thereby leveraging their individual strengths on a per-mention basis. In this paper, we introduce a supervised approach which exploits the output of multiple ready-made EL systems by predicting the correct link on a per-mention basis. Experimental results obtained on existing ground truth datasets and exploiting three state-of-the-art EL systems show the effectiveness of our approach and its capacity to significantly outperform the individual EL systems as well as a set of baseline methods.

[122]  arXiv:2101.05636 [pdf]
Title: To what extent is researchers' data-sharing motivated by formal mechanisms of recognition and credit?
Comments: 26 pages, 4 figures, 7 tables
Subjects: Digital Libraries (cs.DL); Applications (stat.AP)

Data sharing by researchers is a centerpiece of Open Science principles and scientific progress. For a sample of 6019 researchers, we analyze the extent/frequency of their data sharing. Specifically, we examine its relationship with the following four variables: how much they value data citations, the extent to which their data-sharing activities are formally recognized, their perceptions of whether sufficient credit is awarded for data sharing, and the reported extent to which data citations motivate their data sharing. In addition, we analyze the extent to which researchers have reused openly accessible data, as well as how data sharing varies by professional age-cohort, and its relationship to the value they place on data citations. Furthermore, we consider most of the explanatory variables simultaneously by estimating a multiple linear regression that predicts the extent/frequency of their data sharing. We use the dataset of the State of Open Data Survey 2019 by Springer Nature and Digital Science. Results do allow us to conclude that a desire for recognition/credit is a major incentive for data sharing. Thus, the possibility of receiving data citations is highly valued when sharing data, especially among younger researchers, irrespective of the frequency with which it is practiced. Finally, the practice of data sharing was found to be more prevalent at late research career stages, despite this being when citations are less valued and have a lower motivational impact. This could be due to the fact that later-career researchers may benefit less from keeping their data private.

[123]  arXiv:2101.05639 [pdf]
Title: Untargeted, Targeted and Universal Adversarial Attacks and Defenses on Time Series
Comments: Published at IJCNN 2020
Subjects: Machine Learning (cs.LG)

Deep learning based models are vulnerable to adversarial attacks. These attacks can be much more harmful in the case of targeted attacks, where an attacker tries not only to fool the deep learning model, but also to misguide the model to predict a specific class. Such targeted and untargeted attacks are specifically tailored for an individual sample and require the addition of imperceptible noise to the sample. In contrast, a universal adversarial attack calculates a special imperceptible noise which can be added to any sample of the given dataset so that the deep learning model is forced to predict a wrong class. To the best of our knowledge, these targeted and universal attacks on time series data have not been studied in any of the previous works. In this work, we have performed untargeted, targeted and universal adversarial attacks on UCR time series datasets. Our results show that deep learning based time series classification models are vulnerable to these attacks. We also show that universal adversarial attacks have a good generalization property, as they need only a fraction of the training data. We have also performed adversarial training based adversarial defense. Our results show that models trained adversarially using Fast gradient sign method (FGSM), a single step attack, are able to defend against FGSM as well as Basic iterative method (BIM), a popular iterative attack.
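For readers unfamiliar with FGSM, the single-step attack named above can be sketched as follows. This is an illustrative sketch only, not the authors' code: it assumes a tiny logistic-regression classifier as a stand-in for a deep time series model, and the weights, input series, and `eps` value are all hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(w, b, x):
    # Probability of class 1 under a linear stand-in model (assumption).
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def sign(v):
    return (v > 0) - (v < 0)

def fgsm(w, b, x, y, eps):
    # For binary cross-entropy, the gradient of the loss w.r.t. the input
    # of a logistic model is (p - y) * w. FGSM takes one step of size eps
    # in the direction of the sign of that gradient.
    p = predict_proba(w, b, x)
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]

# Hypothetical model parameters and a short "time series" of true class 1.
w = [0.5, -1.0, 0.25, 2.0]
b = 0.0
x = [0.2, -0.1, 0.4, 0.3]
y = 1
x_adv = fgsm(w, b, x, y, eps=0.5)
```

Each point of the series moves by at most `eps`, which is what bounds the size of the perturbation. With these toy values the step is enough to push the predicted probability across the 0.5 decision boundary.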

[124]  arXiv:2101.05640 [pdf, other]
Title: Continuous Deep Q-Learning with Simulator for Stabilization of Uncertain Discrete-Time Systems
Subjects: Machine Learning (cs.LG); Systems and Control (eess.SY); Machine Learning (stat.ML)

Applications of reinforcement learning (RL) to stabilization problems of real systems are restricted since an agent needs many experiences to learn an optimal policy and may determine dangerous actions during its exploration. If we know a mathematical model of a real system, a simulator is useful because it predicts behaviors of the real system using the mathematical model with a given system parameter vector. We can collect many experiences more efficiently than interactions with the real system. However, it is difficult to identify the system parameter vector accurately. If we have an identification error, experiences obtained by the simulator may degrade the performance of the learned policy. Thus, we propose a practical RL algorithm that consists of two stages. At the first stage, we choose multiple system parameter vectors. Then, we have a mathematical model for each system parameter vector, which is called a virtual system. We obtain optimal Q-functions for multiple virtual systems using the continuous deep Q-learning algorithm. At the second stage, we represent a Q-function for the real system by a linear approximated function whose basis functions are optimal Q-functions learned at the first stage. The agent learns the Q-function through interactions with the real system online. By numerical simulations, we show the usefulness of our proposed method.
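The second stage described above, a Q-function expressed as a weighted sum of previously learned Q-functions, can be sketched roughly as follows. This is a minimal sketch under stated assumptions: the two basis functions are hypothetical closed-form stand-ins for the Q-functions learned on virtual systems, the update is a standard semi-gradient Q-learning step (the abstract does not specify the authors' update rule), and the maximization runs over a small hypothetical set of candidate actions rather than a continuous action space.

```python
def combined_q(weights, basis, s, a):
    # Q(s, a) = sum_i w_i * Q_i(s, a), with Q_i the stage-one basis functions.
    return sum(w * q(s, a) for w, q in zip(weights, basis))

def td_update(weights, basis, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    # Semi-gradient Q-learning: the features are the basis Q-values at (s, a).
    phi = [q(s, a) for q in basis]
    best_next = max(combined_q(weights, basis, s_next, an) for an in actions)
    delta = r + gamma * best_next - combined_q(weights, basis, s, a)
    return [w + alpha * delta * f for w, f in zip(weights, phi)]

# Hypothetical stand-ins for Q-functions learned on two virtual systems.
basis = [lambda s, a: -(a - s) ** 2,
         lambda s, a: -(a + s) ** 2]
weights = [0.5, 0.5]

# One online transition observed on the "real" system (made-up numbers).
new_weights = td_update(weights, basis, s=1.0, a=0.0, r=-2.0,
                        s_next=0.0, actions=[-1.0, 0.0, 1.0])
```

Only the weight vector is learned online, so the number of parameters updated on the real system equals the number of virtual systems rather than the size of a full network.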

[125]  arXiv:2101.05641 [pdf, other]
Title: $C^3DRec$: Cloud-Client Cooperative Deep Learning for Temporal Recommendation in the Post-GDPR Era
Authors: Jialiang Han, Yun Ma
Subjects: Information Retrieval (cs.IR); Cryptography and Security (cs.CR); Machine Learning (cs.LG)

Mobile devices enable users to retrieve information at any time and any place. Considering the occasional requirements and fragmented usage patterns of mobile users, temporal recommendation techniques are proposed to improve the efficiency of information retrieval on mobile devices by means of accurately recommending items via learning temporal interests with short-term user interaction behaviors. However, the enforcement of privacy-preserving laws and regulations, such as GDPR, may overshadow the successful practice of temporal recommendation. The reason is that state-of-the-art recommendation systems need to gather and process the user data in centralized servers, but the interaction behavior data used for temporal recommendation are usually non-transactional data that may not be gathered without the explicit permission of users according to GDPR. As a result, if users do not permit services to gather their interaction behavior data, the temporal recommendation fails to work. To realize the temporal recommendation in the post-GDPR era, this paper proposes $C^3DRec$, a cloud-client cooperative deep learning framework of mining interaction behaviors for recommendation while preserving user privacy. $C^3DRec$ constructs a global recommendation model on centralized servers using data collected before GDPR and fine-tunes the model directly on individual local devices using data collected after GDPR. We design two modes to accomplish the recommendation, i.e. pull mode where candidate items are pulled down onto the devices and fed into the local model to get recommended items, and push mode where the output of the local model is pushed onto the server and combined with candidate items to get recommended ones. Evaluation results show that $C^3DRec$ achieves comparable recommendation accuracy to the centralized approaches, with minimal privacy concern.

[126]  arXiv:2101.05645 [pdf, other]
Title: Ensemble of LSTMs and feature selection for human action prediction
Subjects: Robotics (cs.RO)

As robots are becoming more and more ubiquitous in human environments, it will be necessary for robotic systems to better understand and predict human actions. However, this is not an easy task, at times not even for us humans, but based on a relatively structured set of possible actions, appropriate cues, and the right model, this problem can be computationally tackled. In this paper, we propose to use an ensemble of long-short term memory (LSTM) networks for human action prediction. To train and evaluate models, we used the MoGaze dataset - currently the most comprehensive dataset capturing poses of human joints and the human gaze. We have thoroughly analyzed the MoGaze dataset and selected a reduced set of cues for this task. Our model can predict (i) which of the labeled objects the human is going to grasp, and (ii) which of the macro locations the human is going to visit (such as table or shelf). We have exhaustively evaluated the proposed method and compared it to individual cue baselines. The results suggest that our LSTM model slightly outperforms the gaze baseline in single object picking accuracy, but achieves better accuracy in macro object prediction. Furthermore, we have also analyzed the prediction accuracy when the gaze is not used, and in this case, the LSTM model considerably outperformed the best single cue baseline.

[127]  arXiv:2101.05646 [pdf]
Title: Malicious Code Detection: Run Trace Output Analysis by LSTM
Comments: 11 pages, 5 figures, 5 tables, accepted to IEEE Access
Subjects: Cryptography and Security (cs.CR); Computation and Language (cs.CL); Machine Learning (cs.LG)

Malicious software threats and their detection have been gaining importance as a subdomain of information security due to the expansion of ICT applications in daily settings. A major challenge in designing and developing anti-malware systems is the coverage of the detection, particularly the development of dynamic analysis methods that can detect polymorphic and metamorphic malware efficiently. In the present study, we propose a methodological framework for detecting malicious code by analyzing run trace outputs by Long Short-Term Memory (LSTM). We developed models of run traces of malicious and benign Portable Executable (PE) files. We created our dataset from run trace outputs obtained from dynamic analysis of PE files. The obtained dataset was in the instruction format as a sequence and was called Instruction as a Sequence Model (ISM). By splitting the first dataset into basic blocks, we obtained the second one called Basic Block as a Sequence Model (BSM). The experiments showed that the ISM achieved an accuracy of 87.51% and a false positive rate of 18.34%, while BSM achieved an accuracy of 99.26% and a false positive rate of 2.62%.
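As a reminder of how the two reported metrics are defined, accuracy and false positive rate can be computed from confusion-matrix counts as sketched below. The counts are hypothetical and chosen only for easy arithmetic; they are not taken from the paper.

```python
def accuracy(tp, tn, fp, fn):
    # Fraction of all samples that were classified correctly.
    return (tp + tn) / (tp + tn + fp + fn)

def false_positive_rate(fp, tn):
    # Fraction of benign samples that were wrongly flagged as malicious.
    return fp / (fp + tn)

# Hypothetical counts: 100 malicious and 100 benign samples.
tp, fn = 90, 10   # malicious files detected / missed
tn, fp = 80, 20   # benign files passed / wrongly flagged

acc = accuracy(tp, tn, fp, fn)
fpr = false_positive_rate(fp, tn)
```

The two numbers are not redundant: a detector can reach high accuracy on an imbalanced dataset while still flagging a large share of benign files, which is why both are reported.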

[128]  arXiv:2101.05650 [pdf, other]
Title: Rescaling CNN through Learnable Repetition of Network Parameters
Comments: Under Review at ICIP 2021
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Deeper and wider CNNs are known to provide improved performance for deep learning tasks. However, most such networks have poor performance gain per parameter increase. In this paper, we investigate whether the gain observed in deeper models is purely due to the addition of more optimization parameters or whether the physical size of the network also plays a role. Further, we present a novel rescaling strategy for CNNs based on learnable repetition of their parameters. Based on this strategy, we rescale CNNs without changing their parameter count, and show that learnable sharing of weights itself can provide a significant boost in the performance of any given model without changing its parameter count. We show that small base networks, when rescaled, can provide performance comparable to deeper networks with as few as 6% of the optimization parameters of the deeper one.
The relevance of weight sharing is further highlighted through the example of group-equivariant CNNs. We show that the significant improvements obtained with group-equivariant CNNs over the regular CNNs on classification problems are only partly due to the added equivariance property, and part of it comes from the learnable repetition of network weights. For rot-MNIST dataset, we show that up to 40% of the relative gain reported by state-of-the-art methods for rotation equivariance could actually be due to just the learnt repetition of weights.

[129]  arXiv:2101.05652 [pdf, other]
Title: A Nature-Inspired Feature Selection Approach based on Hypercomplex Information
Comments: 17 pages, 7 figures
Journal-ref: APPLIED SOFT COMPUTING; v. 94, SEP 2020
Subjects: Neural and Evolutionary Computing (cs.NE); Machine Learning (cs.LG)

Feature selection for a given model can be transformed into an optimization task. The essential idea behind it is to find the most suitable subset of features according to some criterion. Nature-inspired optimization can mitigate this problem by producing compelling yet straightforward solutions when dealing with complicated fitness functions. Additionally, new mathematical representations, such as quaternions and octonions, are being used to handle higher-dimensional spaces. In this context, we are introducing a meta-heuristic optimization framework in a hypercomplex-based feature selection, where hypercomplex numbers are mapped to real-valued solutions and then transferred onto a boolean hypercube by a sigmoid function. The intended hypercomplex feature selection is tested for several meta-heuristic algorithms and hypercomplex representations, achieving results comparable to some state-of-the-art approaches. The good results achieved by the proposed approach make it a promising tool amongst feature selection research.
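The mapping described above (hypercomplex number to real value to boolean hypercube) can be sketched as follows. This is a sketch under assumptions not stated in the abstract: each feature is encoded by a quaternion with components in [0, 1], the quaternion is reduced to a real value via its normalized Euclidean norm rescaled to a hypothetical range [-1, 1], and a feature is kept when the sigmoid output exceeds a fixed 0.5 threshold.

```python
import math

def quaternion_to_real(q, lower=-1.0, upper=1.0):
    # Normalized norm lies in [0, 1] when every component lies in [0, 1];
    # it is then rescaled to [lower, upper] (assumed bounds).
    norm = math.sqrt(sum(c * c for c in q))
    unit = norm / math.sqrt(len(q))
    return lower + unit * (upper - lower)

def select_features(position, threshold=0.5):
    # One quaternion per feature; the sigmoid squashes the real value
    # into (0, 1) and the threshold turns it into a keep/drop decision.
    mask = []
    for q in position:
        x = quaternion_to_real(q)
        mask.append(1.0 / (1.0 + math.exp(-x)) > threshold)
    return mask

# Hypothetical candidate solution for a three-feature problem.
position = [(0.9, 0.8, 0.9, 1.0),
            (0.1, 0.0, 0.2, 0.1),
            (0.6, 0.6, 0.6, 0.6)]
mask = select_features(position)
```

A meta-heuristic would move the quaternion components through the search space and score each resulting mask with the fitness function. Only the decoding step is shown here.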

[130]  arXiv:2101.05656 [pdf, other]
Title: On Informative Tweet Identification For Tracking Mass Events
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG)

Twitter has been heavily used as an important channel for communicating and discussing events in real-time. In such major events, many uninformative tweets are also published rapidly by many users, making it hard to follow the events. In this paper, we address this problem by investigating machine learning methods for automatically identifying informative tweets among those that are relevant to a target event. We examine both traditional approaches with a rich set of handcrafted features and state-of-the-art approaches with automatically learned features. We further propose a hybrid model that leverages both the handcrafted features and the automatically learned ones. Our experiments on several large datasets of real-world events show that the latter approaches significantly outperform the former and our proposed model performs the best, suggesting highly effective mechanisms for tracking mass events.

[131]  arXiv:2101.05661 [pdf, other]
Title: A Pipeline for Vision-Based On-Orbit Proximity Operations Using Deep Learning and Synthetic Imagery
Comments: Accepted to IEEE Aerospace Conference 2021. 14 pages, 11 figures
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Software Engineering (cs.SE)

Deep learning has become the gold standard for image processing over the past decade. Simultaneously, we have seen growing interest in orbital activities such as satellite servicing and debris removal that depend on proximity operations between spacecraft. However, two key challenges currently pose a major barrier to the use of deep learning for vision-based on-orbit proximity operations. Firstly, efficient implementation of these techniques relies on an effective system for model development that streamlines data curation, training, and evaluation. Secondly, a scarcity of labeled training data (images of a target spacecraft) hinders creation of robust deep learning models. This paper presents an open-source deep learning pipeline, developed specifically for on-orbit visual navigation applications, that addresses these challenges. The core of our work consists of two custom software tools built on top of a cloud architecture that interconnects all stages of the model development process. The first tool leverages Blender, an open-source 3D graphics toolset, to generate labeled synthetic training data with configurable model poses (positions and orientations), lighting conditions, backgrounds, and commonly observed in-space image aberrations. The second tool is a plugin-based framework for effective dataset curation and model training; it provides common functionality like metadata generation and remote storage access to all projects while giving complete independence to project-specific code. Time-consuming, graphics-intensive processes such as synthetic image generation and model training run on cloud-based computational resources which scale to any scope and budget and allow development of even the largest datasets and models from any machine. The presented system has been used in the Texas Spacecraft Laboratory with marked benefits in development speed and quality.

[132]  arXiv:2101.05665 [pdf]
Title: Exploring the socio-technical interplay of Industry 4.0: a single case study of an Italian manufacturing organisation
Comments: Proceedings of the 6th International Workshop on Socio-Technical Perspective in IS Development (STPIS 2020), June 8-9, 2020 this http URL
Journal-ref: CEUR Workshop Proceedings, Vol-2789,pages 121-126, 2020
Subjects: Computers and Society (cs.CY)

In this position paper, we explore the socio-technical interplay of Industry 4.0. Industry 4.0 is an industrial plan that aims at automating the production process by the adoption of advanced leading-edge technologies down the assembly line. Most of the studies employ a technical perspective that is focused on studying how to integrate various technologies and the resulting benefits for organisations. In contrast, few studies use a socio-technical perspective of Industry 4.0. We close this gap by employing a socio-technical lens on an in-depth single case study of a manufacturing organisation that effectively adopted Industry 4.0 technologies. The findings of our study shed light both on the socio-technical interplay between workers and technologies and on the novel role of workers. We conclude by proposing a socio-technical framework for an Industry 4.0 context.

[133]  arXiv:2101.05667 [pdf, other]
Title: The Expando-Mono-Duo Design Pattern for Text Ranking with Pretrained Sequence-to-Sequence Models
Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL)

We propose a design pattern for tackling text ranking problems, dubbed "Expando-Mono-Duo", that has been empirically validated for a number of ad hoc retrieval tasks in different domains. At the core, our design relies on pretrained sequence-to-sequence models within a standard multi-stage ranking architecture. "Expando" refers to the use of document expansion techniques to enrich keyword representations of texts prior to inverted indexing. "Mono" and "Duo" refer to components in a reranking pipeline based on a pointwise model and a pairwise model that rerank initial candidates retrieved using keyword search. We present experimental results from the MS MARCO passage and document ranking tasks, the TREC 2020 Deep Learning Track, and the TREC-COVID challenge that validate our design. In all these tasks, we achieve effectiveness that is at or near the state of the art, in some cases using a zero-shot approach that does not exploit any training data from the target task. To support replicability, implementations of our design pattern are open-sourced in the Pyserini IR toolkit and PyGaggle neural reranking library.
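The pointwise-then-pairwise reranking structure described above can be sketched as below. This is only a structural sketch: the two scoring functions are hypothetical term-overlap stand-ins for the pretrained sequence-to-sequence models, the pairwise scores are aggregated by simple summation (an assumption), and none of the names correspond to the Pyserini or PyGaggle APIs.

```python
def mono_rerank(query, docs, mono_score, k):
    # Pointwise stage: score each candidate independently, keep the top k.
    return sorted(docs, key=lambda d: mono_score(query, d), reverse=True)[:k]

def duo_rerank(query, docs, duo_prob):
    # Pairwise stage: sum, over all other candidates, the estimated
    # probability that this document is more relevant than the other one.
    totals = {d: 0.0 for d in docs}
    for di in docs:
        for dj in docs:
            if di != dj:
                totals[di] += duo_prob(query, di, dj)
    return sorted(docs, key=lambda d: totals[d], reverse=True)

def overlap(query, doc):
    # Toy pointwise scorer: number of query terms present in the document.
    return len(set(query.split()) & set(doc.split()))

def density_pref(query, di, dj):
    # Toy pairwise scorer: prefer the document with denser term overlap.
    a = overlap(query, di) / len(di.split())
    b = overlap(query, dj) / len(dj.split())
    return 1.0 if a > b else 0.5 if a == b else 0.0

query = "deep learning ranking"
candidates = [  # pretend these came from first-stage keyword search
    "deep learning for image classification",
    "ranking with deep learning",
    "a survey of deep learning based text ranking methods",
    "cooking recipes",
]
top = mono_rerank(query, candidates, overlap, k=3)
final = duo_rerank(query, top, density_pref)
```

The pairwise stage costs a quadratic number of comparisons, which is why it is applied only to the short list produced by the pointwise stage.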

[134]  arXiv:2101.05670 [pdf]
Title: Exploring the Smart City Adoption Process: Evidence from the Belgian urban context
Comments: Proceedings of the 6th International Workshop on Socio-Technical Perspective in IS Development (STPIS 2020), June 8-9, 2020 this http URL
Journal-ref: CEUR Workshop Proceedings, Vol-2789, pages 1-7, 2020
Subjects: Computers and Society (cs.CY)

In this position paper, we explore the adoption of a Smart City with a socio-technical perspective. A Smart City is a transformational technological process leading to profound modifications of existing urban regimes and infrastructure components. In this study, we consider a Smart City as a socio-technical system where the interplay between technologies and users ensures the sustainable development of smart city initiatives that improve the quality of life and solve important socio-economic problems. The adoption of a Smart City requires a participative approach where users are involved during the adoption process to jointly optimise both systems. Thus, we contribute to socio-technical research by showing how a participative approach based on press relationships to facilitate information exchange between municipal actors and citizens worked as a success factor for the smart city adoption. We also discuss the limitations of this approach.

[135]  arXiv:2101.05673 [pdf, other]
Title: Analysis of hidden feedback loops in continuous machine learning systems
Authors: Anton Khritankov
Comments: 6 pages, 5 figures
Journal-ref: Soft. Qual.: Fut. Persp. on Soft. Eng. Q. SWQD 2021. LNBIP, V. 404
Subjects: Machine Learning (cs.LG); Software Engineering (cs.SE)

In this concept paper, we discuss the intricacies of specifying and verifying the quality of continuous and lifelong learning artificial intelligence systems as they interact with and influence their environment, causing a so-called concept drift. We identify a problem of implicit feedback loops and demonstrate how they interfere with user behavior on an exemplary housing prices prediction system. Based on a preliminary model, we highlight conditions when such feedback loops arise and discuss possible solution approaches.

[136]  arXiv:2101.05678 [pdf, other]
Title: Lebesgue integration. Detailed proofs to be formalized in Coq
Authors: François Clément (SERENA, CERMICS), Vincent Martin (LMAC)
Subjects: Logic in Computer Science (cs.LO); Classical Analysis and ODEs (math.CA); Functional Analysis (math.FA)

To obtain the highest confidence in the correctness of numerical simulation programs implementing the finite element method, one has to formalize the mathematical notions and results that allow one to establish the soundness of the method. Sobolev spaces are the correct framework in which most partial differential equations may be stated and solved. These functional spaces are built on integration and measure theory. Hence, this chapter in functional analysis is a mandatory theoretical cornerstone for the definition of the finite element method. The purpose of this document is to provide the formal proof community with very detailed pen-and-paper proofs of the main results from integration and measure theory.

[137]  arXiv:2101.05681 [pdf, ps, other]
Title: Adaptive Private Distributed Matrix Multiplication
Comments: arXiv admin note: text overlap with arXiv:2004.12925
Subjects: Information Theory (cs.IT)

We consider the problem of designing codes with flexible rate (referred to as rateless codes) for private distributed matrix-matrix multiplication. A master server owns two private matrices $\mathbf{A}$ and $\mathbf{B}$ and hires worker nodes to help compute their product. The matrices should remain information-theoretically private from the workers. Codes with fixed rate require the master to assign tasks to the workers and then wait for a predetermined number of workers to finish their assigned tasks. The size of the tasks, hence the rate of the scheme, depends on the number of workers that the master waits for. We design a rateless private matrix-matrix multiplication scheme, called RPM3. In contrast to fixed-rate schemes, our scheme fixes the size of the tasks and allows the master to send multiple tasks to the workers. The master keeps sending tasks and receiving results until it can decode the multiplication, rendering the scheme flexible and adaptive to heterogeneous environments. Despite resulting in a smaller rate than known straggler-tolerant schemes, RPM3 provides a smaller mean waiting time of the master by leveraging the heterogeneity of the workers. The waiting time is studied under two different models for the workers' service time. We provide upper bounds for the mean waiting time under both models. In addition, we provide lower bounds on the mean waiting time under the worker-dependent fixed service time model.

[138]  arXiv:2101.05682 [pdf, other]
Title: AVGCN: Trajectory Prediction using Graph Convolutional Networks Guided by Human Attention
Comments: 7 pages, 4 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)

Pedestrian trajectory prediction is a critical yet challenging task, especially for crowded scenes. We suggest that introducing an attention mechanism to infer the importance of different neighbors is critical for accurate trajectory prediction in scenes with varying crowd size. In this work, we propose a novel method, AVGCN, for trajectory prediction utilizing graph convolutional networks (GCN) based on human attention (A denotes attention, V denotes visual field constraints). First, we train an attention network that estimates the importance of neighboring pedestrians, using gaze data collected as subjects perform a bird's eye view crowd navigation task. Then, we incorporate the learned attention weights modulated by constraints on the pedestrian's visual field into a trajectory prediction network that uses a GCN to aggregate information from neighbors efficiently. AVGCN also considers the stochastic nature of pedestrian trajectories by taking advantage of variational trajectory prediction. Our approach achieves state-of-the-art performance on several trajectory prediction benchmarks, and the lowest average prediction error over all considered benchmarks.

[139]  arXiv:2101.05684 [pdf, other]
Title: Generating coherent spontaneous speech and gesture from text
Comments: 3 pages, 2 figures, published at the ACM International Conference on Intelligent Virtual Agents (IVA) 2020
Journal-ref: Proceedings of the 20th ACM International Conference on Intelligent Virtual Agents (IVA '20), 2020, 3 pages
Subjects: Machine Learning (cs.LG); Graphics (cs.GR); Sound (cs.SD); Audio and Speech Processing (eess.AS)

Embodied human communication encompasses both verbal (speech) and non-verbal information (e.g., gesture and head movements). Recent advances in machine learning have substantially improved the technologies for generating synthetic versions of both of these types of data: On the speech side, text-to-speech systems are now able to generate highly convincing, spontaneous-sounding speech using unscripted speech audio as the source material. On the motion side, probabilistic motion-generation methods can now synthesise vivid and lifelike speech-driven 3D gesticulation. In this paper, we put these two state-of-the-art technologies together in a coherent fashion for the first time. Concretely, we demonstrate a proof-of-concept system trained on a single-speaker audio and motion-capture dataset, that is able to generate both speech and full-body gestures together from text input. In contrast to previous approaches for joint speech-and-gesture generation, we generate full-body gestures from speech synthesis trained on recordings of spontaneous speech from the same person as the motion-capture data. We illustrate our results by visualising gesture spaces and text-speech-gesture alignments, and through a demonstration video at https://simonalexanderson.github.io/IVA2020 .

[140]  arXiv:2101.05687 [pdf, other]
Title: Towards Accurate Camouflaged Object Detection with Mixture Convolution and Interactive Fusion
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Camouflaged object detection (COD), which aims to identify objects that conceal themselves in their surroundings, has recently drawn increasing research efforts in the field of computer vision. In practice, the success of deep learning based COD is mainly determined by two key factors: (i) a significantly large receptive field, which provides rich context information, and (ii) an effective fusion strategy, which aggregates the rich multi-level features for accurate COD. Motivated by these observations, in this paper, we propose a novel deep learning based COD approach, which integrates the large receptive field and effective feature fusion into a unified framework. Specifically, we first extract multi-level features from a backbone network. The resulting features are then fed to the proposed dual-branch mixture convolution modules, each of which utilizes multiple asymmetric convolutional layers and two dilated convolutional layers to extract rich context features from a large receptive field. Finally, we fuse the features using specially-designed multi-level interactive fusion modules, each of which employs an attention mechanism along with feature interaction for effective feature fusion. Our method detects camouflaged objects with an effective fusion strategy, which aggregates the rich context information from a large receptive field. All of these designs meet the requirements of COD well, allowing the accurate detection of camouflaged objects. Extensive experiments on widely-used benchmark datasets demonstrate that our method is capable of accurately detecting camouflaged objects and outperforms the state-of-the-art methods.

[141]  arXiv:2101.05694 [pdf, other]
Title: Temporal Logic Task Allocation in Heterogeneous Multi-Robot Systems
Comments: 47 pages, 17 figures
Subjects: Robotics (cs.RO)

In this paper, we consider the problem of optimally allocating tasks, expressed as global Linear Temporal Logic (LTL) specifications, to teams of heterogeneous mobile robots. The robots are classified into different types that capture their different capabilities, and each task may require robots of multiple types. The specific robots assigned to each task are immaterial, as long as they are of the desired type. Given a discrete workspace, our goal is to design paths, i.e., sequences of discrete states, for the robots so that the LTL specification is satisfied. To obtain a scalable solution to this complex temporal logic task allocation problem, we propose a hierarchical approach that first allocates specific robots to tasks using the information about the tasks contained in the Nondeterministic Buchi Automaton (NBA) that captures the LTL specification, and then designs low-level executable plans for the robots that respect the high-level assignment. Specifically, we first prune and relax the NBA by removing all negative atomic propositions. This step is motivated by "lazy collision checking" methods in robotics and allows us to simplify the planning problem by checking constraint satisfaction only when needed. Then, we extract sequences of subtasks from the relaxed NBA along with their temporal orders, and formulate a Mixed Integer Linear Program (MILP) to allocate these subtasks to the robots. Finally, we define generalized multi-robot path planning problems to obtain low-level executable robot plans that satisfy both the high-level task allocation and the temporal constraints captured by the negative atomic propositions in the original NBA. We provide theoretical results showing completeness and soundness of our proposed method and present numerical simulations demonstrating that our method can generate robot paths with lower cost, considerably faster than existing methods.

[142]  arXiv:2101.05697 [pdf, other]
Title: Multi-Fidelity Digital Twins: a Means for Better Cyber-Physical Systems Testing?
Authors: Aitor Arrieta
Subjects: Software Engineering (cs.SE)

Cyber-Physical Systems (CPSs) combine software and physical components. These systems are widely applied in society within many domains, including automotive, aerospace, and railway. Testing these systems is extremely challenging and has therefore attracted significant attention from the research community. A driving CPS testing technique in industry is simulation-based testing. However, this poses significant challenges. In this new-idea paper we present a novel approach to enhance the testing processes of CPSs. This novel approach is motivated with examples and open questions.

[143]  arXiv:2101.05700 [pdf, other]
Title: Spillover Algorithm: A Decentralized Coordination Approach for Multi-Robot Production Planning in Open Shared Factories
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Discrete Mathematics (cs.DM); Multiagent Systems (cs.MA)

Open and shared manufacturing factories typically have a limited number of robots at their disposal, which should be properly allocated to tasks in time and space for effective and efficient system performance. In particular, we deal with the dynamic capacitated production planning problem with sequence-independent setup costs, where quantities of products to manufacture and locations of robots need to be determined at consecutive periods within a given time horizon and products can be anticipated or backordered relative to the demand period. We consider a decentralized multi-agent variant of this problem in an open factory setting with multiple owners of robots as well as different owners of the items to be produced, both considered self-interested and individually rational. Existing solution approaches to the classic constrained lot-sizing problem are centralized exact methods that require sharing of global knowledge of all the participants' private and sensitive information and are not applicable in the described multi-agent context. Therefore, we propose a computationally efficient decentralized approach based on the spillover effect that solves this NP-hard problem by distributing decisions in an intrinsically decentralized multi-agent system environment while protecting private and sensitive information. To the best of our knowledge, this is the first decentralized algorithm for the solution of the studied problem in intrinsically decentralized environments where production resources and/or products are owned by multiple stakeholders with possibly conflicting objectives. To show its efficiency, the performance of the Spillover Algorithm is benchmarked against the state-of-the-art commercial solver CPLEX 12.8.

[144]  arXiv:2101.05701 [pdf, other]
Title: TUDublin team at Constraint@AAAI2021 -- COVID19 Fake News Detection
Comments: 8 pages
Subjects: Computation and Language (cs.CL)

The paper is devoted to the participation of the TUDublin team in the Constraint@AAAI2021 - COVID19 Fake News Detection Challenge. Today, the problem of fake news detection is more acute than ever in connection with the pandemic. The amount of fake news is increasing rapidly, and AI tools that identify and prevent the spread of false information about COVID-19 are urgently needed. The main goal of the work was to create a model that would carry out a binary classification of messages from social media as real or fake news in the context of COVID-19. Our team constructed an ensemble consisting of Bidirectional Long Short Term Memory, Support Vector Machine, Logistic Regression, Naive Bayes and a combination of Logistic Regression and Naive Bayes. The model allowed us to achieve an F1-score of 0.94, which is within 5\% of the best result.
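The abstract does not say how the five ensemble members are combined. As a minimal sketch, assuming simple hard majority voting over per-model labels (the function name `majority_vote` and the tie-breaking rule are illustrative assumptions, not the team's method), the combination step could look like this:

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model label lists into one label per message.

    predictions: list of lists, one inner list per model, all the same length.
    Assumption: ties go to the label predicted by the earliest model in the list.
    """
    if not predictions:
        raise ValueError("need at least one model")
    n = len(predictions[0])
    if any(len(p) != n for p in predictions):
        raise ValueError("all models must label the same messages")

    combined = []
    for votes in zip(*predictions):  # votes = one label per model for one message
        counts = Counter(votes)
        best = max(counts.values())
        # The first label (in model order) that reaches the top count wins.
        combined.append(next(v for v in votes if counts[v] == best))
    return combined


# Hypothetical outputs of five models on three messages.
model_outputs = [
    ["fake", "real", "real"],
    ["fake", "fake", "real"],
    ["real", "fake", "real"],
    ["fake", "real", "real"],
    ["real", "real", "fake"],
]
ensemble_labels = majority_vote(model_outputs)
```

With an odd number of binary classifiers, as here, ties cannot occur, so the tie-breaking rule only matters for other configurations.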

[145]  arXiv:2101.05702 [pdf, other]
Title: Structural Analysis of Multimode DAE Systems: summary of results
Authors: Albert Benveniste (HYCOMES), Benoît Caillaud (HYCOMES), Mathias Malandain (HYCOMES)
Comments: arXiv admin note: text overlap with arXiv:2008.05166
Subjects: Programming Languages (cs.PL)

Modern modeling languages for general physical systems, such as Modelica, Amesim, or Simscape, rely on Differential Algebraic Equations (DAEs), i.e., constraints of the form f(\dot{x},x,u)=0. This drastically facilitates modeling from first principles of the physics, as well as model reuse. In recent works [RR9334], we presented the mathematical theory needed to establish the development of compilers and tools for DAE-based physical modeling languages on solid mathematical grounds. At the core of this analysis sits the so-called structural analysis, whose purpose, at compile time, is to either identify under- and over-specified subsystems (if any), or to rewrite the model in a form amenable to existing DAE solvers, including the handling of mode change events. The notion of "structure" collects, for each mode and mode change event, the variables and equations involved, as well as the latent equations (additional equations redundant with the system), needed to prepare the code submitted to the solver. The notion of DAE index (the minimal number of times any equation has to be possibly differentiated) is part of this structural analysis. This report complements [RR9334] by collecting all the needed background on structural analysis. The body of knowledge on structural analysis is large and scattered, which also motivated us to collect it in a single report. We first explain the primary meaning of structural analysis of systems of equations, namely the study of their regularity or singularity in some generic sense. We then briefly review the body of graph theory used in this context. We develop some extensions, for which we are not aware of any reference, namely the structural analysis of systems of equations with existential quantifiers. For the structural analysis of DAE systems, we focus on John Pryce's Sigma-method, which we both summarize and extend to non-square systems. The uses of these tools and methods in [RR9334] are highlighted in this report.

[146]  arXiv:2101.05703 [pdf, other]
Title: Exploring Asymmetric Roles in Mixed-Ability Gaming
Comments: 21 pages, 1 figure. Manuscript submitted to ACM Conference on Human Factors in Computing Systems (CHI 21)
Subjects: Human-Computer Interaction (cs.HC)

The landscape of digital games is segregated by player ability. For example, sighted players have a multitude of highly visual games at their disposal, while blind players may choose from a variety of audio games. Attempts at improving cross-ability access to any of those are often limited in the experience they provide, or disregard multiplayer experiences. We explore ability-based asymmetric roles as a design approach to create engaging and challenging mixed-ability play. Our team designed and developed two collaborative testbed games exploring asymmetric interdependent roles. In a remote study with 13 mixed-visual-ability pairs we assessed how roles affected perceptions of engagement, competence, and autonomy, using a mixed-methods approach. The games provided an engaging and challenging experience, in which differences in visual ability were not limiting. Our results underline how experiences unequal by design can give rise to an equitable joint experience.

[147]  arXiv:2101.05706 [pdf]
Title: Environmental Variable Monitoring with IoT Technology
Comments: 14 pages, in Spanish. 12 figures. Work presented at the TechFest 2019 Conference, organized by Universidad de San Buenaventura, Bogota, Colombia. this https URL Proceedings in press
Subjects: Networking and Internet Architecture (cs.NI)

This article describes the design of a flexible and low-cost platform for monitoring environmental variables applied to agriculture. The platform was built with technologies based on the Wi-Fi, Bluetooth, and Zigbee communication protocols, using the Raspberry Pi 3 B+ embedded system, sensors to quantify different environmental variables, and various open source hardware and software tools. The network is made up of a central node (gateway), implemented on Samsung's Artik 1020 card, and two nodes where the sensors for reading environmental variables are connected. Finally, the data is collected by the gateway, which is in charge of processing and storing it in a database so that in the future the user can access the information in real time from anywhere.

[148]  arXiv:2101.05709 [pdf, other]
Title: Rule-based Optimal Control for Autonomous Driving
Comments: accepted in ICCPS2021
Subjects: Robotics (cs.RO); Systems and Control (eess.SY)

We develop optimal control strategies for Autonomous Vehicles (AVs) that are required to meet complex specifications imposed by traffic laws and cultural expectations of reasonable driving behavior. We formulate these specifications as rules, and specify their priorities by constructing a priority structure. We propose a recursive framework, in which the satisfaction of the rules in the priority structure is iteratively relaxed based on their priorities. Central to this framework is an optimal control problem, where convergence to desired states is achieved using Control Lyapunov Functions (CLFs), and safety is enforced through Control Barrier Functions (CBFs). We also show how the proposed framework can be used for after-the-fact, pass / fail evaluation of trajectories - a given trajectory is rejected if we can find a controller producing a trajectory that leads to less violation of the rule priority structure. We present case studies with multiple driving scenarios to demonstrate the effectiveness of the proposed framework.

[149]  arXiv:2101.05716 [pdf, other]
Title: SICKNL: A Dataset for Dutch Natural Language Inference
Comments: To appear at EACL 2021
Subjects: Computation and Language (cs.CL)

We present SICK-NL (read: signal), a dataset targeting Natural Language Inference in Dutch. SICK-NL is obtained by translating the SICK dataset of Marelli et al. (2014) from English into Dutch. Having a parallel inference dataset allows us to compare both monolingual and multilingual NLP models for English and Dutch on the two tasks. In the paper, we motivate and detail the translation process, perform a baseline evaluation on both the original SICK dataset and its Dutch incarnation SICK-NL, taking inspiration from Dutch skipgram embeddings and contextualised embedding models. In addition, we encapsulate two phenomena encountered in the translation to formulate stress tests and verify how well the Dutch models capture syntactic restructurings that do not affect semantics. Our main finding is that all models perform worse on SICK-NL than on SICK, indicating that the Dutch dataset is more challenging than the English original. Results on the stress tests show that models don't fully capture word order freedom in Dutch, warranting future systematic studies.

[150]  arXiv:2101.05717 [pdf]
Title: Adaptive Frequency Response Reserve based on Real-time System Inertia
Authors: Shutang You
Comments: 6 pages, 17 figures, 3 tables
Subjects: Systems and Control (eess.SY)

To ensure adequate and economic reserve for primary frequency response in the current and future power system, this paper proposes a real-time frequency response reserve (FRR) requirement based on system inertia. This minimum FRR will help power system operators adjust the current frequency response requirement and accommodate more renewable generation while achieving a saving of both energy and facility costs. Most importantly, the ability to adaptively vary the FRR will provide additional agility, resiliency, and reliability to the grid.

[151]  arXiv:2101.05718 [pdf, other]
Title: Automating Gamification Personalization: To the User and Beyond
Comments: Submitted to IEEE Transactions on Learning Technologies. 14 pages, 2 figures, 8 tables
Subjects: Human-Computer Interaction (cs.HC); Artificial Intelligence (cs.AI)

Personalized gamification explores knowledge about the users to tailor gamification designs to improve one-size-fits-all gamification. The tailoring process should simultaneously consider user and contextual characteristics (e.g., activity to be done and geographic location), which leads to several occasions to tailor. Consequently, tools for automating gamification personalization are needed. Two problems emerge: which of those characteristics are relevant and how to perform such tailoring remain open questions, and the required automation tools are lacking. We tackled these problems in two steps. First, we conducted an exploratory study, collecting participants' opinions on the game elements they consider the most useful for different learning activity types (LAT) via survey. Then, we modeled opinions through conditional decision trees to address the aforementioned tailoring process. Second, as a product from the first step, we implemented a recommender system that suggests personalized gamification designs (which game elements to use), addressing the problem of automating gamification personalization. Our findings i) present empirical evidence that LAT, geographic locations, and other user characteristics affect users' preferences, ii) enable defining gamification designs tailored to user and contextual features simultaneously, and iii) provide technological aid for those interested in designing personalized gamification. The main implications are that demographics, game-related characteristics, geographic location, and LAT to be done, as well as the interaction between different kinds of information (user and contextual characteristics), should be considered in defining gamification designs and that personalizing gamification designs can be improved with aid from our recommender system.

[152]  arXiv:2101.05719 [pdf, ps, other]
Title: Minimum Cost Flows, MDPs, and $\ell_1$-Regression in Nearly Linear Time for Dense Instances
Subjects: Data Structures and Algorithms (cs.DS); Optimization and Control (math.OC)

In this paper we provide new randomized algorithms with improved runtimes for solving linear programs with two-sided constraints. In the special case of the minimum cost flow problem on $n$-vertex $m$-edge graphs with integer polynomially-bounded costs and capacities we obtain a randomized method which solves the problem in $\tilde{O}(m+n^{1.5})$ time. This improves upon the previous best runtime of $\tilde{O}(m\sqrt{n})$ (Lee-Sidford 2014) and, in the special case of unit-capacity maximum flow, improves upon the previous best runtimes of $m^{4/3+o(1)}$ (Liu-Sidford 2020, Kathuria 2020) and $\tilde{O}(m\sqrt{n})$ (Lee-Sidford 2014) for sufficiently dense graphs.
For $\ell_1$-regression in a matrix with $n$-columns and $m$-rows we obtain a randomized method which computes an $\epsilon$-approximate solution in $\tilde{O}(mn+n^{2.5})$ time. This yields a randomized method which computes an $\epsilon$-optimal policy of a discounted Markov Decision Process with $S$ states and $A$ actions per state in time $\tilde{O}(S^2A+S^{2.5})$. These methods improve upon the previous best runtimes of methods which depend polylogarithmically on problem parameters, which were $\tilde{O}(mn^{1.5})$ (Lee-Sidford 2015) and $\tilde{O}(S^{2.5}A)$ (Lee-Sidford 2014, Sidford-Wang-Wu-Ye 2018).
To obtain this result we introduce two new algorithmic tools of independent interest. First, we design a new general interior point method for solving linear programs with two sided constraints which combines techniques from (Lee-Song-Zhang 2019, Brand et al. 2020) to obtain a robust stochastic method with iteration count nearly the square root of the smaller dimension. Second, to implement this method we provide dynamic data structures for efficiently maintaining approximations to variants of Lewis-weights, a fundamental importance measure for matrices which generalize leverage scores and effective resistances.
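As a rough reading aid for "sufficiently dense graphs", the following back-of-the-envelope comparison ignores polylogarithmic factors and the $o(1)$ term in the exponent; the density thresholds are derived here from the stated bounds and are not quoted from the paper.

```latex
\begin{align*}
\text{Against } m^{4/3}:\quad
  & m \le m^{4/3} \text{ for } m \ge 1, \text{ and }
    n^{1.5} \le m^{4/3} \iff m \ge n^{9/8}, \\
  & \text{so } m + n^{1.5} \le 2\,m^{4/3} \text{ whenever } m \ge n^{9/8}. \\[4pt]
\text{Against } m\sqrt{n}:\quad
  & \frac{m + n^{1.5}}{m\sqrt{n}} = \frac{1}{\sqrt{n}} + \frac{n}{m},
    \text{ which tends to } 0 \text{ as } n \to \infty \text{ when } m/n \to \infty. \\[4pt]
\text{Example } (m = n^{2}):\quad
  & m + n^{1.5} = n^{2} + n^{1.5} = O(n^{2}), \qquad
    m\sqrt{n} = n^{2.5}, \qquad
    m^{4/3} = n^{8/3}.
\end{align*}
```

In the fully dense example the new bound is therefore smaller than both earlier bounds by a polynomial factor, again up to the hidden logarithmic terms.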

[153]  arXiv:2101.05725 [pdf, other]
Title: Stereo camera system calibration: the need of two sets of parameters
Subjects: Computer Vision and Pattern Recognition (cs.CV)

The reconstruction of a scene via a stereo-camera system is a two-step process, where at first images from different cameras are matched to identify the set of point-to-point correspondences that then will actually be reconstructed in the three-dimensional real world. The performance of the system strongly relies on the calibration procedure, which has to be carefully designed to guarantee optimal results. We implemented three different calibration methods and we compared their performance over 19 datasets. We present the experimental evidence that, due to the image noise, a single set of parameters is not sufficient to achieve high accuracy in the identification of the correspondences and in the 3D reconstruction at the same time. We propose to calibrate the system twice to estimate two different sets of parameters: the one obtained by minimizing the reprojection error that will be used when dealing with quantities defined in the 2D space of the cameras, and the one obtained by minimizing the reconstruction error that will be used when dealing with quantities defined in the real 3D world.
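To make the two error measures concrete, here is a minimal sketch under strong simplifying assumptions that are not taken from the paper: an ideal rectified stereo pair, a shared focal length `f` and principal point `(cx, cy)`, and a second camera shifted by `baseline` along the x axis. All function names are hypothetical. The reprojection error is measured in pixels in the image plane, while the reconstruction error is measured in world units after triangulation.

```python
import math


def project(point, f, cx, cy, tx=0.0):
    """Pinhole projection of a 3D point for a camera shifted by tx along x."""
    x, y, z = point
    return (f * (x - tx) / z + cx, f * y / z + cy)


def triangulate(left_px, right_px, f, cx, cy, baseline):
    """Rectified-stereo triangulation from the horizontal disparity."""
    disparity = left_px[0] - right_px[0]
    z = f * baseline / disparity
    return ((left_px[0] - cx) * z / f, (left_px[1] - cy) * z / f, z)


def reprojection_error(points, left_obs, right_obs, f, cx, cy, baseline):
    """Mean 2D distance (pixels) between observed and re-projected points."""
    total = 0.0
    for p, l, r in zip(points, left_obs, right_obs):
        total += math.dist(project(p, f, cx, cy), l)
        total += math.dist(project(p, f, cx, cy, tx=baseline), r)
    return total / (2 * len(points))


def reconstruction_error(points, left_obs, right_obs, f, cx, cy, baseline):
    """Mean 3D distance (world units) between true and triangulated points."""
    total = 0.0
    for p, l, r in zip(points, left_obs, right_obs):
        total += math.dist(triangulate(l, r, f, cx, cy, baseline), p)
    return total / len(points)
```

In this toy model, the same one-pixel perturbation of an image observation changes the triangulated depth more for a distant point than for a near one, which gives some intuition for why parameters tuned for one error need not be the best for the other.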

[154]  arXiv:2101.05730 [pdf, other]
Title: Towards Understanding and Evaluating Structural Node Embeddings
Comments: A shorter version of this paper was presented in the Mining and Learning with Graphs workshop at KDD 2020
Subjects: Social and Information Networks (cs.SI)

While most network embedding techniques model the proximity between nodes in a network, recently there has been significant interest in structural embeddings that are based on node equivalences, a notion rooted in sociology: equivalences or positions are collections of nodes that have similar roles--i.e., similar functions, ties or interactions with nodes in other positions--irrespective of their distance or reachability in the network. Unlike the proximity-based methods that are rigorously evaluated in the literature, the evaluation of structural embeddings is less mature. It relies on small synthetic or real networks with labels that are not perfectly defined, and its connection to sociological equivalences has hitherto been vague and tenuous. With new node embedding methods being developed at a breakneck pace, proper evaluation and systematic characterization of existing approaches will be essential to progress. To fill in this gap, we set out to understand what types of equivalences structural embeddings capture. We are the first to contribute rigorous intrinsic and extrinsic evaluation methodology for structural embeddings, along with carefully-designed, diverse datasets of varying sizes. We observe a number of different evaluation variables that can lead to different results (e.g., choice of similarity measure, classifier, label definitions). We find that degree distributions within nodes' local neighborhoods can lead to simple yet effective baselines in their own right and guide the future development of structural embedding. We hope that our findings can influence the design of further node embedding methods and also pave the way for more comprehensive and fair evaluation of structural embedding methods.
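One way to picture a degree-distribution baseline is sketched below. It is a hypothetical reading that uses only the one-hop neighbourhood and a fixed number of histogram bins; the function name and the binning are assumptions, and the paper's actual baseline may differ.

```python
def degree_histogram_features(adjacency, max_degree):
    """Map each node to a histogram of its neighbours' degrees.

    adjacency: dict node -> set of neighbouring nodes (undirected graph).
    max_degree: degrees above this value share the last bin (assumption).
    """
    degree = {node: len(nbrs) for node, nbrs in adjacency.items()}
    features = {}
    for node, nbrs in adjacency.items():
        hist = [0] * (max_degree + 1)
        for nbr in nbrs:
            hist[min(degree[nbr], max_degree)] += 1
        features[node] = hist
    return features


# Two small stars whose hubs are joined by an edge.
graph = {
    "h1": {"l1", "l2", "h2"},
    "h2": {"l3", "l4", "h1"},
    "l1": {"h1"},
    "l2": {"h1"},
    "l3": {"h2"},
    "l4": {"h2"},
}
features = degree_histogram_features(graph, max_degree=3)
```

Here both hubs receive the same feature vector, and so do all four leaves, even though `l1` and `l4` are three hops apart: the features depend on role rather than proximity.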

[155]  arXiv:2101.05734 [pdf, other]
Title: Phase-bounded finite element method for two-fluid incompressible flow systems
Journal-ref: International Journal of Multiphase Flow 117 (2019) 1-13
Subjects: Numerical Analysis (math.NA); Fluid Dynamics (physics.flu-dyn)

An understanding of the hydrodynamics of multiphase processes is essential for their design and operation. Multiphase computational fluid dynamics (CFD) simulations enable researchers to gain insight which is inaccessible experimentally. The model frequently used to simulate these processes is the two-fluid (Euler-Euler) model where fluids are treated as inter-penetrating continua. It is formulated for the multiphase flow regime where one phase is dispersed within another and enables simulation on experimentally relevant scales. Phase fractions are used to describe the composition of the mixture and are bounded quantities. Consequently, numerical solution methods used in simulations must preserve boundedness for accuracy and physical fidelity. In this work, a numerical method for the two-fluid model is developed in which phase fraction constraints are imposed through the use of a nonlinear variational inequality solver which implicitly imposes inequality constraints. The numerical method is verified and compared to an established explicit numerical method.

[156]  arXiv:2101.05735 [pdf, other]
Title: The Good, the Bad and the Ugly: Pitfalls and Best Practices in Automated Sound Static Analysis of Ethereum Smart Contracts
Subjects: Cryptography and Security (cs.CR)

Ethereum smart contracts are distributed programs running on top of the Ethereum blockchain. Since program flaws can cause significant monetary losses and can hardly be fixed due to the immutable nature of the blockchain, there is a strong need for automated analysis tools which provide formal security guarantees. Designing such analyzers, however, has proved to be challenging and error-prone. We review the existing approaches to automated, sound, static analysis of Ethereum smart contracts and highlight prevalent issues in the state of the art. Finally, we overview eThor, a recent static analysis tool that we developed following a principled design and implementation approach based on rigorous semantic foundations to overcome the problems of past works.

[157]  arXiv:2101.05738 [pdf, other]
Title: A Pragmatic Approach for Hyper-Parameter Tuning in Search-based Test Case Generation
Subjects: Software Engineering (cs.SE)

Search-based test case generation, which is the application of meta-heuristic search for generating test cases, has been studied extensively in the recent literature. Since, in theory, the performance of meta-heuristic search methods is highly dependent on their hyper-parameters, there is a need to study hyper-parameter tuning in this domain. In this paper, we propose a new metric ("Tuning Gain"), which estimates how cost-effective tuning a particular class is. We then predict "Tuning Gain" using static features of source code classes. Finally, we prioritize classes for tuning, based on the estimated "Tuning Gains" and spend the tuning budget only on the highly-ranked classes. To evaluate our approach, we exhaustively analyze 1,200 hyper-parameter configurations of a well-known search-based test generation tool (EvoSuite) for 250 classes of 19 projects from benchmarks such as SF110 and SBST2018 tool competition. We used a tuning approach called Meta-GA and compared the tuning results with and without the proposed class prioritization. The results show that for a low tuning budget, prioritizing classes outperforms the alternatives in terms of extra covered branches (10 times more than a traditional global tuning). In addition, we report the impact of different features of our approach such as search space size, tuning budgets, tuning algorithms, and the number of classes to tune, on the final results.
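The prioritization step can be pictured with a small sketch. It assumes that a predicted "Tuning Gain" is already available for each class and that the budget is expressed as a number of classes to tune; the function name, the class names, and the gain values are hypothetical and do not come from the paper.

```python
def select_classes_for_tuning(predicted_gain, budget):
    """Return the `budget` class names with the highest predicted gain.

    predicted_gain: dict class name -> predicted tuning gain (higher is better).
    Assumption: ties are broken alphabetically so the result is deterministic.
    """
    ranked = sorted(predicted_gain, key=lambda name: (-predicted_gain[name], name))
    return ranked[:max(budget, 0)]


# Hypothetical predicted gains for three classes and a budget of two.
gains = {"Parser": 0.8, "Cache": 0.1, "Lexer": 0.5}
to_tune = select_classes_for_tuning(gains, budget=2)
```

Classes outside the selected set would simply keep the default hyper-parameters in this reading.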

[158]  arXiv:2101.05754 [pdf, ps, other]
Title: A Strong Bisimulation for Control Operators by Means of Multiplicative and Exponential Reduction
Comments: arXiv admin note: text overlap with arXiv:1906.09370
Subjects: Logic in Computer Science (cs.LO)

The purpose of this paper is to identify programs with control operators whose reduction semantics are in exact correspondence. This is achieved by introducing a relation $\simeq$, defined over a revised presentation of Parigot's $\lambda\mu$-calculus we dub $\Lambda M$.
Our result builds on three main ingredients which guide our semantical development: (1) factorization of Parigot's $\lambda\mu$-reduction into multiplicative and exponential steps by means of explicit operators, (2) adaptation of Laurent's original $\simeq_\sigma$-equivalence to $\Lambda M$, and (3) interpretation of $\Lambda M$ into Laurent's polarized proof-nets (PPN). More precisely, we first give a translation of $\Lambda M$-terms into PPN which simulates the reduction relation of our calculus via cut elimination of PPN. Second, we establish a precise correspondence between our relation $\simeq$ and Laurent's $\simeq_\sigma$-equivalence for $\lambda\mu$-terms. Moreover, $\simeq$-equivalent terms translate to structurally equivalent PPN. Most notably, $\simeq$ is shown to be a strong bisimulation with respect to reduction in $\Lambda M$, i.e. two $\simeq$-equivalent terms have the exact same reduction semantics, a result which fails for Regnier's $\simeq_\sigma$-equivalence in $\lambda$-calculus as well as for Laurent's $\simeq_\sigma$-equivalence in $\lambda\mu$.

[159]  arXiv:2101.05766 [pdf, other]
Title: Ajalon: Simplifying the Authoring of Wearable Cognitive Assistants
Subjects: Human-Computer Interaction (cs.HC)

Wearable Cognitive Assistance (WCA) amplifies human cognition in real time through a wearable device and low-latency wireless access to edge computing infrastructure. It is inspired by, and broadens, the metaphor of GPS navigation tools that provide real-time step-by-step guidance, with prompt error detection and correction. WCA applications are likely to be transformative in education, health care, industrial troubleshooting, manufacturing, and many other areas. Today, WCA application development is difficult and slow, requiring skills in areas such as machine learning and computer vision that are not widespread among software developers. This paper describes Ajalon, an authoring toolchain for WCA applications that reduces the skill and effort needed at each step of the development pipeline. Our evaluation shows that Ajalon significantly reduces the effort needed to create new WCA applications.

[160]  arXiv:2101.05768 [pdf, other]
Title: How to Attack and Defend 5G Radio Access Network Slicing with Reinforcement Learning
Subjects: Networking and Internet Architecture (cs.NI)

Reinforcement learning (RL) for network slicing is considered in the 5G radio access network, where the base station, gNodeB, allocates resource blocks (RBs) to the requests of user equipments and maximizes the total reward of accepted requests over time. Based on adversarial machine learning, a novel over-the-air attack is introduced to manipulate the RL algorithm and disrupt 5G network slicing. Subject to an energy budget, the adversary observes the spectrum and builds its own RL-based surrogate model that selects which RBs to jam with the objective of maximizing the number of failed network slicing requests due to jammed RBs. By jamming the RBs, the adversary reduces the RL algorithm's reward. As this reward is used as the input to update the RL algorithm, the performance does not recover even after the adversary stops jamming. This attack is evaluated in terms of the recovery time and the (maximum and total) reward loss, and it is shown to be much more effective than benchmark (random and myopic) jamming attacks. Different reactive and proactive defense mechanisms (protecting the RL algorithm's updates or misleading the adversary's learning process) are introduced to show that it is viable to defend 5G network slicing against this attack.

[161]  arXiv:2101.05775 [pdf, other]
Title: $\text{O}^2$PF: Oversampling via Optimum-Path Forest for Breast Cancer Detection
Comments: 6 pages, 3 figures. 2020 IEEE 33rd International Symposium on Computer-Based Medical Systems (CBMS)
Subjects: Machine Learning (cs.LG)

Breast cancer is among the most deadly diseases, distressing mostly women worldwide. Although traditional detection methods have proven valid for the task, they still commonly yield low accuracy and demand considerable time and effort from professionals. Therefore, a computer-aided diagnosis (CAD) system capable of providing early detection is highly desirable. In the last decade, machine learning-based techniques have been of paramount importance in this context, since they are capable of extracting essential information from data and reasoning about it. However, such approaches still suffer from imbalanced data, specifically in medical settings, where the number of samples from healthy people is, in general, considerably higher than the number from patients. Therefore, this paper proposes $\text{O}^2$PF, a data oversampling method based on the unsupervised Optimum-Path Forest algorithm. Experiments conducted over the full oversampling scenario demonstrate the robustness of the model, which is compared against three well-established oversampling methods on three breast cancer datasets and three general-purpose medical datasets.

[162]  arXiv:2101.05778 [pdf, other]
Title: Topological Deep Learning
Comments: 28 pages, 14 figures
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)

This work introduces the Topological CNN (TCNN), which encompasses several topologically defined convolutional methods. Manifolds with important relationships to the natural image space are used to parameterize image filters which are used as convolutional weights in a TCNN. These manifolds also parameterize slices in layers of a TCNN across which the weights are localized. We show evidence that TCNNs learn faster, on less data, with fewer learned parameters, and with greater generalizability and interpretability than conventional CNNs. We introduce and explore TCNN layers for both image and video data. We propose extensions to 3D images and 3D video.

[163]  arXiv:2101.05779 [pdf, other]
Title: Structured Prediction as Translation between Augmented Natural Languages
Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL)

We propose a new framework, Translation between Augmented Natural Languages (TANL), to solve many structured prediction language tasks including joint entity and relation extraction, nested named entity recognition, relation classification, semantic role labeling, event extraction, coreference resolution, and dialogue state tracking. Instead of tackling the problem by training task-specific discriminative classifiers, we frame it as a translation task between augmented natural languages, from which the task-relevant information can be easily extracted. Our approach can match or outperform task-specific models on all tasks, and in particular, achieves new state-of-the-art results on joint entity and relation extraction (CoNLL04, ADE, NYT, and ACE2005 datasets), relation classification (FewRel and TACRED), and semantic role labeling (CoNLL-2005 and CoNLL-2012). We accomplish this while using the same architecture and hyperparameters for all tasks and even when training a single model to solve all tasks at the same time (multi-task learning). Finally, we show that our framework can also significantly improve the performance in a low-resource regime, thanks to better use of label semantics.

[164]  arXiv:2101.05781 [pdf, other]
Title: Time-Based CAN Intrusion Detection Benchmark
Authors: Deborah H. Blevins (1), Pablo Moriano (2), Robert A. Bridges (2), Miki E. Verma (2), Michael D. Iannacone (2), Samuel C Hollifield (2) ((1) University of Kentucky, (2) Oak Ridge National Laboratory)
Comments: 7 pages, 2 figures
Subjects: Cryptography and Security (cs.CR); Machine Learning (cs.LG)

Modern vehicles are complex cyber-physical systems made of hundreds of electronic control units (ECUs) that communicate over controller area networks (CANs). This inherited complexity has expanded the CAN attack surface, which is vulnerable to message injection attacks. These injections change the overall timing characteristics of messages on the bus, and thus, to detect these malicious messages, time-based intrusion detection systems (IDSs) have been proposed. However, time-based IDSs are usually trained and tested on low-fidelity datasets with unrealistic, labeled attacks. This makes it difficult to evaluate, compare, and validate IDSs. Here we detail and benchmark four time-based IDSs against the newly published ROAD dataset, the first open CAN IDS dataset with real (non-simulated) stealthy attacks with physically verified effects. We found that methods that perform hypothesis testing by explicitly estimating message timing distributions have lower performance than methods that seek anomalies in a distribution-related statistic. In particular, these "distribution-agnostic" methods outperform "distribution-based" methods by at least 55% in area under the precision-recall curve (AUC-PR). Our results expand the body of knowledge of CAN time-based IDSs by providing details of these methods and reporting their results when tested on datasets with real advanced attacks. Finally, we develop an after-market plug-in detector using lightweight hardware, which can be used to deploy the best performing IDS method on nearly any vehicle.
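
As a rough illustration of what a time-based detector looks at, the sketch below learns the typical gap between consecutive messages of one CAN ID from clean traffic and flags test windows whose mean gap deviates strongly. It is a generic z-score check and not one of the four IDSs benchmarked in the paper; the function names, the window size, the threshold, and the synthetic 10 ms message period are all assumptions made for this example.

```python
from statistics import mean, stdev
from typing import List, Sequence


def timestamps_from_gaps(gaps: Sequence[float]) -> List[float]:
    """Build message timestamps (seconds) from a list of gaps."""
    ts = [0.0]
    for gap in gaps:
        ts.append(ts[-1] + gap)
    return ts


def inter_arrival_times(timestamps: Sequence[float]) -> List[float]:
    """Gaps between consecutive messages of a single CAN ID."""
    return [b - a for a, b in zip(timestamps, timestamps[1:])]


def flag_windows(
    train_ts: Sequence[float],
    test_ts: Sequence[float],
    window: int = 5,        # hypothetical window size (gaps per window)
    threshold: float = 3.0,  # hypothetical z-score cut-off
) -> List[bool]:
    """Flag each test window whose mean gap is far from the training mean."""
    train_gaps = inter_arrival_times(train_ts)
    mu = mean(train_gaps)
    sigma = max(stdev(train_gaps), 1e-9)  # guard against zero jitter
    test_gaps = inter_arrival_times(test_ts)
    flags = []
    for start in range(0, len(test_gaps) - window + 1, window):
        window_mean = mean(test_gaps[start:start + window])
        # z-score of the window mean under the training statistics
        z = abs(window_mean - mu) / (sigma / window ** 0.5)
        flags.append(z > threshold)
    return flags


# Synthetic traffic: nominal 10 ms period with +/- 1 ms jitter.
train = timestamps_from_gaps([0.009, 0.011] * 10)
# Five normal gaps, then five gaps halved by injected messages.
test = timestamps_from_gaps([0.009, 0.011, 0.009, 0.011, 0.009] + [0.005] * 5)
flags = flag_windows(train, test)
print(flags)  # [False, True]
```

Injected messages shorten the observed gaps, so only the second window is flagged in this toy trace.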

[165]  arXiv:2101.05783 [pdf, other]
Title: Persistent Anti-Muslim Bias in Large Language Models
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG)

It has been observed that large-scale language models capture undesirable societal biases, e.g. relating to race and gender; yet religious bias has been relatively unexplored. We demonstrate that GPT-3, a state-of-the-art contextual language model, captures persistent Muslim-violence bias. We probe GPT-3 in various ways, including prompt completion, analogical reasoning, and story generation, to understand this anti-Muslim bias, demonstrating that it appears consistently and creatively in different uses of the model and that it is severe even compared to biases about other religious groups. For instance, "Muslim" is analogized to "terrorist" in 23% of test cases, while "Jewish" is mapped to "money" in 5% of test cases. We quantify the positive distraction needed to overcome this bias with adversarial text prompts, and find that use of the most positive 6 adjectives reduces violent completions for "Muslims" from 66% to 20%, which is still higher than for other religious groups.

[166]  arXiv:2101.05784 [pdf, other]
Title: Quantitative View of the Structure of Institutional Scientific Collaborations Using the Examples of Halle, Jena and Leipzig
Comments: 18 pages, 5 figures, 5 tables
Subjects: Digital Libraries (cs.DL); Social and Information Networks (cs.SI)

Examining the effectiveness of institutional scientific coalitions can inform future policies. This is a study on the structure of scientific collaborations in three cities in central Germany. Since 1995, the three universities of this region have formed and maintained a coalition which led to the establishment of an interdisciplinary center in 2012, i.e., the German Center for Integrative Biodiversity Research (iDiv). We investigate whether the impact of the former coalition is evident in the region's structure of scientific collaborations and the scientific output of the new center. Using publications data from 1996-2018, we build co-authorship networks and identify the most cohesive communities in terms of collaboration, and compare them with communities identified based on publications presented as the scientific outcome of the coalition and new center on their website. Our results show that despite the highly cohesive structure of collaborations presented on the coalition website, there is still much potential to be realized. The newly established center has bridged the member institutions but not to a particularly strong level. We see that geographical proximity, collaboration policies, funding, and organizational structure alone do not ensure prosperous scientific collaboration structures. When the new center's scientific output is compared with its regional context, observed trends become less conspicuous. Nevertheless, the level of success the coalition achieved could inform policy makers regarding other regions' scientific development plans.

[167]  arXiv:2101.05786 [pdf]
Title: Persuasive Natural Language Generation -- A Literature Review
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)

This literature review focuses on the use of Natural Language Generation (NLG) to automatically detect and generate persuasive texts. Extending previous research on automatic identification of persuasion in text, we concentrate on generative aspects through conceptualizing determinants of persuasion in five business-focused categories: benevolence, linguistic appropriacy, logical argumentation, trustworthiness, tools and datasets. These allow NLG to increase an existing message's persuasiveness. Previous research illustrates key aspects in each of the above-mentioned five categories. A research agenda to further study persuasive NLG is developed. The review includes analysis of seventy-seven articles, outlining the existing body of knowledge and showing the steady progress in this research field.

[168]  arXiv:2101.05787 [pdf, other]
Title: A Novel Physics-Based and Data-Supported Microstructure Model for Part-Scale Simulation of Ti-6Al-4V Selective Laser Melting
Comments: 32 pages, 14 figures
Subjects: Computational Engineering, Finance, and Science (cs.CE)

The elasto-plastic material behavior, material strength and failure modes of metals fabricated by additive manufacturing technologies are significantly determined by the underlying process-specific microstructure evolution. In this work a novel physics-based and data-supported phenomenological microstructure model for Ti-6Al-4V is proposed that is suitable for the part-scale simulation of selective laser melting processes. The model predicts spatially homogenized phase fractions of the most relevant microstructural species, namely the stable $\beta$-phase, the stable $\alpha_{\text{s}}$-phase as well as the metastable Martensite $\alpha_{\text{m}}$-phase, in a physically consistent manner. In particular, the modeled microstructure evolution, in the form of diffusion-based and non-diffusional transformations, is a pure consequence of energy and mobility competitions among the different species, without the need for heuristic transformation criteria as often applied in existing models. The mathematically consistent formulation of the evolution equations in rate form renders the model suitable for the practically relevant scenario of temperature- or time-dependent diffusion coefficients, arbitrary temperature profiles, and multiple coexisting phases. Due to its physically motivated foundation, the proposed model requires only a minimal number of free parameters, which are determined in an inverse identification process considering a broad experimental data basis in the form of time-temperature transformation diagrams. Subsequently, the predictive ability of the model is demonstrated by means of continuous cooling transformation diagrams, showing that experimentally observed characteristics such as critical cooling rates emerge naturally from the proposed microstructure model, instead of being enforced as heuristic transformation criteria.

[169]  arXiv:2101.05791 [pdf, other]
Title: U-Noise: Learnable Noise Masks for Interpretable Image Segmentation
Comments: Submitted to ICIP
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Deep Neural Networks (DNNs) are widely used for decision making in a myriad of critical applications, ranging from medical to societal and even judicial. Given the importance of these decisions, it is crucial for us to be able to interpret these models. We introduce a new method for interpreting image segmentation models by learning regions of images in which noise can be applied without hindering downstream model performance. We apply this method to segmentation of the pancreas in CT scans, and qualitatively compare the quality of the method to existing explainability techniques, such as Grad-CAM and occlusion sensitivity. Additionally we show that, unlike other methods, our interpretability model can be quantitatively evaluated based on the downstream performance over obscured images.

[170]  arXiv:2101.05792 [pdf, ps, other]
Title: Group Testing with a Graph Infection Spread Model
Subjects: Information Theory (cs.IT); Computers and Society (cs.CY); Data Structures and Algorithms (cs.DS); Networking and Internet Architecture (cs.NI); Signal Processing (eess.SP)

We propose a novel infection spread model based on a random connection graph which represents connections between $n$ individuals. Infection spreads via connections between individuals and this results in a probabilistic cluster formation structure as well as a non-i.i.d. (correlated) infection status for individuals. We propose a class of two-step sampled group testing algorithms where we exploit the known probabilistic infection spread model. We investigate the metrics associated with two-step sampled group testing algorithms. To demonstrate our results, for analytically tractable exponentially split cluster formation trees, we calculate the required number of tests and the expected number of false classifications in terms of the system parameters, and identify the trade-off between them. For such exponentially split cluster formation trees, for zero-error construction, we prove that the required number of tests is $O(\log_2n)$. Thus, for such cluster formation trees, our algorithm outperforms any zero-error non-adaptive group test, binary splitting algorithm, and Hwang's generalized binary splitting algorithm. Our results imply that, by exploiting probabilistic information on the connections of individuals, group testing can be used to reduce the number of required tests significantly even when infection rate is high, contrasting the prevalent belief that group testing is useful only when infection rate is low.
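
For context on the baselines named above, the classical binary splitting procedure can be sketched as follows. This is the textbook adaptive baseline, not the two-step sampled algorithm proposed in the paper, and it assumes noiseless pooled tests and exactly one infected individual; the function name `binary_splitting` and the 16-person population are hypothetical.

```python
from typing import Callable, Sequence, Set, Tuple


def binary_splitting(
    individuals: Sequence[int],
    pool_is_positive: Callable[[Sequence[int]], bool],
) -> Tuple[int, int]:
    """Locate one infected individual, assuming exactly one exists.

    Returns (infected_id, number_of_pooled_tests_used).
    """
    candidates = list(individuals)
    tests_used = 0
    while len(candidates) > 1:
        # Pool the first half; a negative pool clears everyone in it.
        half = candidates[: len(candidates) // 2]
        tests_used += 1
        if pool_is_positive(half):
            candidates = half
        else:
            candidates = candidates[len(half):]
    return candidates[0], tests_used


# Hypothetical population of 16 people in which person 11 is infected.
infected: Set[int] = {11}
found, tests = binary_splitting(
    range(16), lambda pool: any(p in infected for p in pool)
)
print(found, tests)  # 11 4
```

Each pooled test halves the candidate set, so 16 individuals need 4 tests, in line with the logarithmic test counts discussed in the abstract.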

[171]  arXiv:2101.05795 [pdf, ps, other]
Title: A Metaheuristic-Driven Approach to Fine-Tune Deep Boltzmann Machines
Comments: 30 pages, 7 figures
Journal-ref: Applied Soft Computing 97 (2020): 105717
Subjects: Machine Learning (cs.LG)

Deep learning techniques, such as Deep Boltzmann Machines (DBMs), have received considerable attention over the past years due to their outstanding results across a varied range of domains. One of the main shortcomings of these techniques involves the choice of their hyperparameters, since they have a significant impact on the final results. This work addresses the issue of fine-tuning hyperparameters of Deep Boltzmann Machines using metaheuristic optimization techniques with different backgrounds, such as swarm intelligence, memory- and evolutionary-based approaches. Experiments conducted on three public datasets for binary image reconstruction showed that metaheuristic techniques can obtain reasonable results.

[172]  arXiv:2101.05796 [pdf, other]
Title: DeFlow: Learning Complex Image Degradations from Unpaired Data with Conditional Flows
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)

The difficulty of obtaining paired data remains a major bottleneck for learning image restoration and enhancement models for real-world applications. Current strategies aim to synthesize realistic training data by modeling noise and degradations that appear in real-world settings. We propose DeFlow, a method for learning stochastic image degradations from unpaired data. Our approach is based on a novel unpaired learning formulation for conditional normalizing flows. We model the degradation process in the latent space of a shared flow encoder-decoder network. This allows us to learn the conditional distribution of a noisy image given the clean input by solely minimizing the negative log-likelihood of the marginal distributions. We validate our DeFlow formulation on the task of joint image restoration and super-resolution. The models trained with the synthetic data generated by DeFlow outperform previous learnable approaches on all three datasets.

Cross-lists for Fri, 15 Jan 21

[173]  arXiv:2009.01678 (cross-list from math.PR) [pdf, ps, other]
Title: Hamilton-Jacobi equations for inference of matrix tensor products
Comments: 44 pages
Subjects: Probability (math.PR); Information Theory (cs.IT)

We study the high-dimensional limit of the free energy associated with the inference problem of finite-rank matrix tensor products. In general, we bound the limit from above by the unique solution to a certain Hamilton-Jacobi equation. Under additional assumptions on the nonlinearity in the equation which is determined explicitly by the model, we identify the limit with the solution. Two notions of solutions, weak solutions and viscosity solutions, are considered, each of which has its own advantages and requires different treatments. For concreteness, we apply our results to a model with i.i.d. entries and symmetric interactions. In particular, for the first order and even order tensor products, we identify the limit and obtain estimates on convergence rates; for other odd orders, upper bounds are obtained.

[174]  arXiv:2101.05313 (cross-list from eess.AS) [pdf, other]
Title: Whispered and Lombard Neural Speech Synthesis
Comments: To appear in SLT 2021
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)

It is desirable for a text-to-speech system to take into account the environment where synthetic speech is presented, and provide appropriate context-dependent output to the user. In this paper, we present and compare various approaches for generating different speaking styles, namely, normal, Lombard, and whisper speech, using only limited data. The following systems are proposed and assessed: 1) Pre-training and fine-tuning a model for each style. 2) Lombard and whisper speech conversion through a signal processing based approach. 3) Multi-style generation using a single model based on a speaker verification model. Our mean opinion score and AB preference listening tests show that 1) we can generate high quality speech through the pre-training/fine-tuning approach for all speaking styles. 2) Although our speaker verification (SV) model is not explicitly trained to discriminate different speaking styles, and no Lombard and whisper voice is used for pre-training this system, the SV model can be used as a style encoder for generating different style embeddings as input for the Tacotron system. We also show that the resulting synthetic Lombard speech has a significant positive impact on intelligibility gain.

[175]  arXiv:2101.05326 (cross-list from eess.IV) [pdf, other]
Title: Advancing Eosinophilic Esophagitis Diagnosis and Phenotype Assessment with Deep Learning Computer Vision
Comments: This paper contains 12 pages, 9 figures, and 7 tables
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)

Eosinophilic Esophagitis (EoE) is an inflammatory esophageal disease which is increasing in prevalence. The diagnostic gold-standard involves manual review of a patient's biopsy tissue sample by a clinical pathologist for the presence of 15 or more eosinophils within a single high-power field (400x magnification). Diagnosing EoE can be a cumbersome process with added difficulty for assessing the severity and progression of disease. We propose an automated approach for quantifying eosinophils using deep image segmentation. A U-Net model and post-processing system are applied to generate eosinophil-based statistics that can diagnose EoE as well as describe disease severity and progression. These statistics are captured in biopsies at the initial EoE diagnosis and are then compared with patient metadata: clinical and treatment phenotypes. The goal is to find linkages that could potentially guide treatment plans for new patients at their initial disease diagnosis. A deep image classification model is further applied to discover features other than eosinophils that can be used to diagnose EoE. This is the first study to utilize a deep learning computer vision approach for EoE diagnosis and to provide an automated process for tracking disease severity and progression.
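
The counting criterion quoted above is simple to express once per-field eosinophil counts are available. The sketch below assumes those counts have already been produced by some segmentation and post-processing step; the function name, the returned statistics, and the sample counts are hypothetical and are not taken from the paper.

```python
from typing import Dict, Sequence

EOS_PER_HPF_THRESHOLD = 15  # criterion quoted in the abstract


def eosinophil_statistics(counts_per_hpf: Sequence[int]) -> Dict[str, object]:
    """Summarise eosinophil counts across high-power fields (HPFs)."""
    peak = max(counts_per_hpf, default=0)
    return {
        "peak_count": peak,
        "fields_at_or_above_threshold": sum(
            c >= EOS_PER_HPF_THRESHOLD for c in counts_per_hpf
        ),
        # The criterion is met if any single field reaches the threshold.
        "meets_criterion": peak >= EOS_PER_HPF_THRESHOLD,
    }


# Hypothetical counts for five fields of one biopsy.
stats = eosinophil_statistics([3, 0, 17, 22, 9])
print(stats)
```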

[176]  arXiv:2101.05339 (cross-list from cond-mat.mtrl-sci) [pdf, other]
Title: Accelerating the screening of amorphous polymer electrolytes by learning to reduce random and systematic errors in molecular dynamics simulations
Comments: 25 pages, 5 figures + supplementary information
Subjects: Materials Science (cond-mat.mtrl-sci); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)

Machine learning has been widely adopted to accelerate the screening of materials. Most existing studies implicitly assume that the training data are generated through a deterministic, unbiased process, but this assumption might not hold for the simulation of some complex materials. In this work, we aim to screen amorphous polymer electrolytes which are promising candidates for the next-generation lithium-ion battery technology but extremely expensive to simulate due to their structural complexity. We demonstrate that a multi-task graph neural network can learn from a large amount of noisy, biased data and a small amount of unbiased data and reduce both random and systematic errors in predicting the transport properties of polymer electrolytes. This observation allows us to achieve accurate predictions on the properties of complex materials by learning to reduce errors in the training data, instead of running repetitive, expensive simulations, which are conventionally used to reduce simulation errors. With this approach, we screen a space of 6247 polymer electrolytes, orders of magnitude larger than previous computational studies. We also find a good extrapolation performance to the top polymers from a larger space of 53362 polymers and 31 experimentally-realized polymers. The strategy employed in this work may be applicable to a broad class of material discovery problems that involve the simulation of complex, amorphous materials.

[177]  arXiv:2101.05365 (cross-list from econ.GN) [pdf]
Title: Scared into Action: How Partisanship and Fear are Associated with Reactions to Public Health Directives
Comments: 54 pages, 11 figures
Subjects: General Economics (econ.GN); Computation and Language (cs.CL); Computation (stat.CO)

Differences in political ideology are increasingly appearing as an impediment to successful bipartisan communication from local leadership. For example, recent empirical findings have shown that conservatives are less likely to adhere to COVID-19 health directives. This behavior is in direct contradiction to past research which indicates that conservatives are more rule abiding, prefer to avoid loss, and are more prevention-motivated than liberals. We reconcile this disconnect between recent empirical findings and past research by using insights gathered from press releases, millions of tweets, and mobility data capturing local movement in retail, grocery, workplace, parks, and transit domains during COVID-19 shelter-in-place orders. We find that conservatives adhere to health directives when they express more fear of the virus. In order to better understand this phenomenon, we analyze both official and citizen communications and find that press releases from local and federal government, along with the number of confirmed COVID-19 cases, lead to an increase in expressions of fear on Twitter.

[178]  arXiv:2101.05402 (cross-list from math.ST) [pdf, other]
Title: Optimal Clustering in Anisotropic Gaussian Mixture Models
Subjects: Statistics Theory (math.ST); Machine Learning (cs.LG); Machine Learning (stat.ML)

We study the clustering task under anisotropic Gaussian Mixture Models where the covariance matrices from different clusters are unknown and are not necessarily the identity matrix. We characterize the dependence of signal-to-noise ratios on the cluster centers and covariance matrices and obtain the minimax lower bound for the clustering problem. In addition, we propose a computationally feasible procedure and prove it achieves the optimal rate within a few iterations. The proposed procedure is a hard EM type algorithm, and it can also be seen as a variant of Lloyd's algorithm that is adjusted to the anisotropic covariance matrices.
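
To make the idea of a Lloyd-type iteration adjusted to anisotropic covariance concrete, here is a minimal hard-EM sketch. It is a simplification under stated assumptions, not the procedure analysed in the paper: points are two-dimensional, all clusters share one covariance matrix (so no log-determinant term appears in the assignment step), and the initial centers are assumed to be reasonably good. All names and the toy data are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Matrix = Tuple[Tuple[float, float], Tuple[float, float]]


def inverse_2x2(m: Matrix) -> Matrix:
    (a, b), (c, d) = m
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))


def mahalanobis_sq(x: Point, center: Point, prec: Matrix) -> float:
    """Squared Mahalanobis distance under precision matrix `prec`."""
    dx, dy = x[0] - center[0], x[1] - center[1]
    (a, b), (c, d) = prec
    return dx * (a * dx + b * dy) + dy * (c * dx + d * dy)


def anisotropic_lloyd(
    points: Sequence[Point],
    centers: Sequence[Point],
    iterations: int = 10,
    ridge: float = 1e-6,  # keeps the covariance estimate invertible
) -> Tuple[List[int], List[Point]]:
    centers = list(centers)
    cov: Matrix = ((1.0, 0.0), (0.0, 1.0))  # start from the isotropic case
    labels: List[int] = []
    for _ in range(iterations):
        prec = inverse_2x2(cov)
        # Assignment step: nearest center in Mahalanobis distance.
        labels = [
            min(range(len(centers)),
                key=lambda k: mahalanobis_sq(p, centers[k], prec))
            for p in points
        ]
        # Update step 1: recompute each center as its cluster mean.
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centers[k] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
        # Update step 2: pooled within-cluster covariance.
        sxx = sxy = syy = 0.0
        for p, lab in zip(points, labels):
            dx, dy = p[0] - centers[lab][0], p[1] - centers[lab][1]
            sxx, sxy, syy = sxx + dx * dx, sxy + dx * dy, syy + dy * dy
        n = len(points)
        cov = ((sxx / n + ridge, sxy / n), (sxy / n, syy / n + ridge))
    return labels, centers


# Two clusters stretched along x and separated only slightly along y.
xs = [-4.0, -2.0, 0.0, 2.0, 4.0]
cluster_a = list(zip(xs, [0.1, -0.1, 0.0, 0.1, -0.1]))
cluster_b = list(zip(xs, [1.1, 0.9, 1.0, 0.9, 1.1]))
labels, centers = anisotropic_lloyd(
    cluster_a + cluster_b, [(0.0, 0.2), (0.0, 0.8)]
)
print(labels)
```

With the identity matrix in place of the estimated covariance, the assignment step reduces to the usual Euclidean rule of Lloyd's algorithm.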

[179]  arXiv:2101.05404 (cross-list from cond-mat.quant-gas) [pdf, other]
Title: Machine-learning enhanced dark soliton detection in Bose-Einstein condensates
Comments: 17 pages, 5 figures
Subjects: Quantum Gases (cond-mat.quant-gas); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Quantum Physics (quant-ph)

Most data in cold-atom experiments comes from images, the analysis of which is limited by our preconceptions of the patterns that could be present in the data. We focus on the well-defined case of detecting dark solitons -- appearing as local density depletions in a Bose-Einstein condensate (BEC) -- using a methodology that is extensible to the general task of pattern recognition in images of cold atoms. Studying soliton dynamics over a wide range of parameters requires the analysis of large datasets, making the existing human-inspection-based methodology a significant bottleneck. Here we describe an automated classification and positioning system for identifying localized excitations in atomic BECs utilizing deep convolutional neural networks to eliminate the need for human image examination. Furthermore, we openly publish our labeled dataset of dark solitons, the first of its kind, for further machine learning research.

[180]  arXiv:2101.05410 (cross-list from eess.IV) [pdf, other]
Title: A Multi-Stage Attentive Transfer Learning Framework for Improving COVID-19 Diagnosis
Authors: Yi Liu, Shuiwang Ji
Comments: 12 pages, 4 figures, 6 tables
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)

Computed tomography (CT) imaging is a promising approach to diagnosing COVID-19. Machine learning methods can be employed to train models from labeled CT images and predict whether a case is positive or negative. However, no large-scale, publicly available CT data exist to train accurate models. In this work, we propose a multi-stage attentive transfer learning framework for improving COVID-19 diagnosis. Our proposed framework consists of three stages to train accurate diagnosis models through learning knowledge from multiple source tasks and data of different domains. Importantly, we propose a novel self-supervised learning method to learn multi-scale representations for lung CT images. Our method captures semantic information from the whole lung and highlights the functionality of each lung region for better representation learning. The method is then integrated into the last stage of the proposed transfer learning framework to reuse the complex patterns learned from the same CT images. We use a base model integrating self-attention (ATTNs) and convolutional operations. Experimental results show that networks with ATTNs induce greater performance improvement through transfer learning than networks without ATTNs. This indicates attention exhibits higher transferability than convolution. Our results also show that the proposed self-supervised learning method outperforms several baseline methods.

[181]  arXiv:2101.05434 (cross-list from eess.IV) [pdf, other]
Title: A Unified Conditional Disentanglement Framework for Multimodal Brain MR Image Translation
Comments: Accepted to IEEE International Symposium on Biomedical Imaging (ISBI) 2021
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)

Multimodal MRI provides complementary and clinically relevant information to probe tissue condition and to characterize various diseases. However, it is often difficult to acquire sufficiently many modalities from the same subject due to limitations in study plans, while quantitative analysis is still demanded. In this work, we propose a unified conditional disentanglement framework to synthesize an arbitrary modality from an input modality. Our framework hinges on a cycle-constrained conditional adversarial training approach, where it can extract a modality-invariant anatomical feature with a modality-agnostic encoder and generate a target modality with a conditioned decoder. We validate our framework on four MRI modalities, including T1-weighted, T1 contrast enhanced, T2-weighted, and FLAIR MRI, from the BraTS'18 database, showing superior performance on synthesis quality over the comparison methods. In addition, we report results from experiments on a tumor segmentation task carried out with synthesized data.

[182]  arXiv:2101.05439 (cross-list from eess.IV) [pdf, other]
Title: Dual-cycle Constrained Bijective VAE-GAN For Tagged-to-Cine Magnetic Resonance Image Synthesis
Comments: Accepted to IEEE International Symposium on Biomedical Imaging (ISBI) 2021
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)

Tagged magnetic resonance imaging (MRI) is a widely used imaging technique for measuring tissue deformation in moving organs. Due to tagged MRI's intrinsic low anatomical resolution, another matching set of cine MRI with higher resolution is sometimes acquired in the same scanning session to facilitate tissue segmentation, thus adding extra time and cost. To mitigate this, in this work, we propose a novel dual-cycle constrained bijective VAE-GAN approach to carry out tagged-to-cine MR image synthesis. Our method is based on a variational autoencoder backbone with cycle reconstruction constrained adversarial training to yield accurate and realistic cine MR images given tagged MR images. Our framework has been trained, validated, and tested using 1,768, 416, and 1,560 subject-independent paired slices of tagged and cine MRI from twenty healthy subjects, respectively, demonstrating superior performance over the comparison methods. Our method can potentially be used to reduce the extra acquisition time and cost, while maintaining the same workflow for further motion analyses.

[183]  arXiv:2101.05442 (cross-list from eess.IV) [pdf, other]
Title: Automated Model Design and Benchmarking of 3D Deep Learning Models for COVID-19 Detection with Chest CT Scans
Comments: Accepted by AAAI 2021, COVID-19, Neural Architecture Search, AutoML
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

The COVID-19 pandemic has spread globally for several months. Because its transmissibility and high pathogenicity seriously threaten people's lives, it is crucial to accurately and quickly detect COVID-19 infection. Many recent studies have shown that deep learning (DL) based solutions can help detect COVID-19 based on chest CT scans. However, most existing work focuses on 2D datasets, which may result in low quality models as the real CT scans are 3D images. Besides, the reported results span a broad spectrum on different datasets with a relatively unfair comparison. In this paper, we first use three state-of-the-art 3D models (ResNet3D101, DenseNet3D121, and MC3\_18) to establish the baseline performance on the three publicly available chest CT scan datasets. Then we propose a differentiable neural architecture search (DNAS) framework to automatically search for 3D DL models for 3D chest CT scan classification, using the Gumbel Softmax technique to improve the searching efficiency. We further exploit the Class Activation Mapping (CAM) technique on our models to provide the interpretability of the results. The experimental results show that our automatically searched models (CovidNet3D) outperform the baseline human-designed models on the three datasets with tens of times smaller model size and higher accuracy. Furthermore, the results also verify that CAM can be well applied in CovidNet3D for COVID-19 datasets to provide interpretability for medical diagnosis.

[184]  arXiv:2101.05477 (cross-list from math.ST) [pdf, other]
Title: Optimal network online change point localisation
Subjects: Statistics Theory (math.ST); Machine Learning (cs.LG)

We study the problem of online network change point detection. In this setting, a collection of independent Bernoulli networks is collected sequentially, and the underlying distributions change when a change point occurs. The goal is to detect the change point as quickly as possible, if it exists, subject to a constraint on the number or probability of false alarms. In this paper, on the detection delay, we establish a minimax lower bound and two upper bounds based on NP-hard algorithms and polynomial-time algorithms, i.e., \[ \mbox{detection delay} \begin{cases} \gtrsim \log(1/\alpha) \frac{\max\{r^2/n, \, 1\}}{\kappa_0^2 n \rho},\\ \lesssim \log(\Delta/\alpha) \frac{\max\{r^2/n, \, \log(r)\}}{\kappa_0^2 n \rho}, & \mbox{with NP-hard algorithms},\\ \lesssim \log(\Delta/\alpha) \frac{r}{\kappa_0^2 n \rho}, & \mbox{with polynomial-time algorithms}, \end{cases} \] where $\kappa_0, n, \rho, r$ and $\alpha$ are the normalised jump size, network size, entrywise sparsity, rank sparsity and the overall Type-I error upper bound. All the model parameters are allowed to vary as $\Delta$, the location of the change point, diverges. The polynomial-time algorithms are novel procedures that we propose in this paper, designed for quick detection under two different forms of Type-I error control. The first is based on controlling the overall probability of a false alarm when there are no change points, and the second is based on specifying a lower bound on the expected time of the first false alarm. Extensive experiments show that, under different scenarios and the aforementioned forms of Type-I error control, our proposed approaches outperform state-of-the-art methods.

[185]  arXiv:2101.05516 (cross-list from eess.AS) [pdf, other]
Title: Speaker activity driven neural speech extraction
Comments: Submitted to ICASSP'21
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)

Target speech extraction, which extracts the speech of a target speaker in a mixture given auxiliary speaker clues, has recently received increased interest. Various clues have been investigated, such as pre-recorded enrollment utterances, direction information, or video of the target speaker. In this paper, we explore the use of speaker activity information as an auxiliary clue for single-channel neural network-based speech extraction. We propose a speaker activity driven speech extraction neural network (ADEnet) and show that it can achieve performance levels competitive with enrollment-based approaches, without the need for pre-recordings. We further demonstrate the potential of the proposed approach for processing meeting-like recordings, where speaker activity obtained from a diarization system is used as a speaker clue for ADEnet. We show that this simple yet practical approach can successfully extract speakers after diarization, which leads to improved ASR performance when using a single microphone, especially in highly overlapping conditions, with a relative word error rate reduction of up to 25%.

[186]  arXiv:2101.05525 (cross-list from eess.AS) [pdf, other]
Title: An evaluation of word-level confidence estimation for end-to-end automatic speech recognition
Comments: Accepted at SLT 2021
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)

Quantifying the confidence (or conversely the uncertainty) of a prediction is a highly desirable trait of an automatic system, as it improves the robustness and usefulness in downstream tasks. In this paper we investigate confidence estimation for end-to-end automatic speech recognition (ASR). Previous work has addressed confidence measures for lattice-based ASR, while current machine learning research mostly focuses on confidence measures for unstructured deep learning. However, as the ASR systems are increasingly being built upon deep end-to-end methods, there is little work that tries to develop confidence measures in this context. We fill this gap by providing an extensive benchmark of popular confidence methods on four well-known speech datasets. There are two challenges we overcome in adapting existing methods: working on structured data (sequences) and obtaining confidences at a coarser level than the predictions (words instead of tokens). Our results suggest that a strong baseline can be obtained by scaling the logits by a learnt temperature, followed by estimating the confidence as the negative entropy of the predictive distribution and, finally, sum pooling to aggregate at word level.
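
The baseline described at the end of this abstract (temperature-scaled logits, negative entropy, sum pooling to word level) can be sketched as follows. This is a minimal standard-library illustration under stated assumptions, not the authors' code: the function names, the token-to-word mapping `word_ids`, and the fixed temperature value are hypothetical, and in the paper the temperature is learnt rather than set by hand.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax of logits divided by a temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def negative_entropy(probs):
    """Negative Shannon entropy: 0 for a one-hot distribution, lower when flatter."""
    return sum(p * math.log(p) for p in probs if p > 0.0)


def word_confidences(token_logits, word_ids, temperature=1.0):
    """Sum-pool token-level negative entropies into one score per word."""
    scores = {}
    for logits, word in zip(token_logits, word_ids):
        conf = negative_entropy(softmax(logits, temperature))
        scores[word] = scores.get(word, 0.0) + conf
    return [scores[w] for w in sorted(scores)]


# Hypothetical example: three tokens over a 3-symbol vocabulary;
# the first two tokens belong to word 0, the last one to word 1.
token_logits = [[4.0, 0.0, 0.0], [3.0, 0.5, 0.0], [0.0, 0.0, 0.0]]
word_ids = [0, 0, 1]
confidences = word_confidences(token_logits, word_ids, temperature=1.5)
```

Each token contributes a non-positive term, so with sum pooling a word's score can only decrease as more tokens are added to it.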

[187]  arXiv:2101.05546 (cross-list from q-bio.GN) [pdf]
Title: Feature reduction for machine learning on molecular features: The GeneScore
Comments: 11 pages, 9 figures, 4 tables
Subjects: Genomics (q-bio.GN); Machine Learning (cs.LG)

We present the GeneScore, a concept of feature reduction for Machine Learning analysis of biomedical data. Using expert knowledge, the GeneScore integrates different molecular data types into a single score. We show that the GeneScore is superior to a binary matrix in the classification of cancer entities from SNV, Indel, CNV, gene fusion and gene expression data. The GeneScore is a straightforward way to facilitate state-of-the-art analysis, while making use of the available scientific knowledge on the nature of molecular data features used.

[188]  arXiv:2101.05560 (cross-list from quant-ph) [pdf, ps, other]
Title: Secure Multi-Party Quantum Conference and Xor Computation
Comments: Accepted in Quantum Information and Computation
Subjects: Quantum Physics (quant-ph); Cryptography and Security (cs.CR)

Quantum conference is a process of securely exchanging messages between three or more parties, using quantum resources. A Measurement Device Independent Quantum Dialogue (MDI-QD) protocol proposed in 2017 (Quantum Information Processing 16.12 (2017): 305), which is secure against information leakage, is proven to be insecure against the intercept-and-resend attack strategy. We first modify this protocol and generalize this MDI-QD to a three-party quantum conference and then to a multi-party quantum conference. We also propose a protocol for quantum multi-party XOR computation. None of the three protocols proposed here uses entanglement as a resource, and we prove the correctness and security of our proposed protocols.

[189]  arXiv:2101.05573 (cross-list from math.OC) [pdf, ps, other]
Title: On the design of terminal ingredients for data-driven MPC
Subjects: Optimization and Control (math.OC); Systems and Control (eess.SY)

We present a model predictive control (MPC) scheme to control unknown linear time-invariant systems using only measured input-output data and no model knowledge. The scheme includes a terminal cost and a terminal set constraint on an extended state containing past input-output values. We provide an explicit design procedure for the corresponding terminal ingredients that only uses measured input-output data. Further, we prove that the MPC scheme based on these terminal ingredients exponentially stabilizes the desired setpoint in closed loop. Finally, we illustrate the advantages over existing methods with a numerical example.

[190]  arXiv:2101.05580 (cross-list from physics.soc-ph) [pdf, other]
Title: Should the government reward cooperation? Insights from an agent-based model of wealth redistribution
Subjects: Physics and Society (physics.soc-ph); Multiagent Systems (cs.MA); General Economics (econ.GN); Adaptation and Self-Organizing Systems (nlin.AO)

In our multi-agent model, agents generate wealth from repeated interactions for which a prisoner's dilemma payoff matrix is assumed. Their gains are taxed by a government at a rate $\alpha$. The resulting budget is spent to cover administrative costs and to pay a bonus to cooperative agents, which can be identified correctly only with a probability $p$. Agents decide at each time step to choose either cooperation or defection based on different information. In the local scenario, they compare their potential gains from both strategies. In the global scenario, they compare the gains of the cooperative and defective subpopulations. We derive analytical expressions for the critical bonus needed to make cooperation as attractive as defection. We show that for the local scenario the government can establish only a medium level of cooperation, because the critical bonus increases with the level of cooperation. In the global scenario, by contrast, full cooperation can be achieved once the cold-start problem is solved, because the critical bonus decreases with the level of cooperation. This allows the tax rate to be lowered while maintaining high cooperation.

[191]  arXiv:2101.05581 (cross-list from math.DS) [pdf, ps, other]
Title: Uncertainty Quantification of Bifurcations in Random Ordinary Differential Equations
Subjects: Dynamical Systems (math.DS); Numerical Analysis (math.NA); Probability (math.PR)

We are concerned with random ordinary differential equations (RODEs). Our main question of interest is how uncertainties in system parameters propagate through the possibly highly nonlinear dynamical system and affect the system's bifurcation behavior. We come up with a methodology to determine the probability of the occurrence of different types of bifurcations based on the probability distribution of the input parameters. In a first step, we reduce the system's behavior to the dynamics on its center manifold. We thereby still capture the major qualitative behavior of the RODEs. In a second step, we analyze the reduced RODEs and quantify the probability of the occurrence of different types of bifurcations based on the (nonlinear) functional appearance of uncertain parameters. To realize this major step, we present three approaches: an analytical one, where the probability can be calculated explicitly based on Mellin transformation and inversion, a semi-analytical one consisting of a combination of the analytical approach with a moment-based numerical estimation procedure, and a particular sampling-based approach using unscented transformation. We complement our new methodology with various numerical examples.

[192]  arXiv:2101.05657 (cross-list from math.OC) [pdf, other]
Title: No-go Theorem for Acceleration in the Hyperbolic Plane
Comments: 12 pages
Subjects: Optimization and Control (math.OC); Data Structures and Algorithms (cs.DS); Machine Learning (cs.LG); Machine Learning (stat.ML)

In recent years there has been significant effort to adapt the key tools and ideas in convex optimization to the Riemannian setting. One key challenge has remained: Is there a Nesterov-like accelerated gradient method for geodesically convex functions on a Riemannian manifold? Recent work has given partial answers and the hope was that this ought to be possible.
Here we dash these hopes. We prove that in a noisy setting, there is no analogue of accelerated gradient descent for geodesically convex functions on the hyperbolic plane. Our results apply even when the noise is exponentially small. The key intuition behind our proof is short and simple: In negatively curved spaces, the volume of a ball grows so fast that information about the past gradients is not useful in the future.
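
To make the volume-growth intuition concrete, the sketch below compares the area of a disk of radius r in the Euclidean plane, pi r^2, with the area of a disk of the same radius in the hyperbolic plane of constant curvature -1, 2 pi (cosh r - 1). It only illustrates the growth rates and is not part of the paper's proof; the function names are chosen for this example.

```python
import math


def euclidean_disk_area(r):
    """Area of a radius-r disk in the flat plane."""
    return math.pi * r * r


def hyperbolic_disk_area(r):
    """Area of a radius-r disk in the hyperbolic plane with curvature -1."""
    return 2.0 * math.pi * (math.cosh(r) - 1.0)


def area_ratio(r):
    """How many times larger the hyperbolic disk is than the Euclidean one."""
    return hyperbolic_disk_area(r) / euclidean_disk_area(r)


for radius in (0.1, 1.0, 5.0, 10.0):
    print(f"r = {radius:4.1f}   hyperbolic / Euclidean area = {area_ratio(radius):.4g}")
```

For small radii the two areas nearly coincide, while for large radii the hyperbolic area grows exponentially in r and the Euclidean area only quadratically.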

[193]  arXiv:2101.05677 (cross-list from stat.OT) [pdf, other]
Title: Improving non-deterministic uncertainty modelling in Industry 4.0 scheduling
Subjects: Other Statistics (stat.OT); Artificial Intelligence (cs.AI); Applications (stat.AP)

The latest industrial revolution has helped industries achieve very high rates of productivity and efficiency. It has introduced data aggregation and cyber-physical systems to optimize planning and scheduling. However, uncertainty in the environment and the imprecise nature of human operators are not accurately accounted for in the decision-making process. This leads to delays in consignments and imprecise budget estimations. This widespread practice in industrial models is flawed and requires rectification. Various other articles have attempted to solve this problem through stochastic or fuzzy set model methods. This paper presents a comprehensive method to logically and realistically quantify the non-deterministic uncertainty through probabilistic uncertainty modelling. This method is applicable to virtually all industrial data sets, as the model is self-adjusting and uses epsilon-contamination to cater to limited or incomplete data sets. The results are numerically validated on an industrial data set from Flanders, Belgium. The data-driven results achieved through this robust scheduling method illustrate the improvement in performance.

[194]  arXiv:2101.05679 (cross-list from stat.ML) [pdf, other]
Title: Convex Smoothed Autoencoder-Optimal Transport model
Authors: Aratrika Mustafi
Comments: 26 pages
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)

Generative modelling is a key tool in unsupervised machine learning which has achieved stellar success in recent years. Despite this huge success, even the best generative models such as Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs) come with their own shortcomings, mode collapse and mode mixture being the two most prominent problems. In this paper we develop a new generative model capable of generating samples which resemble the observed data, and is free from mode collapse and mode mixture. Our model is inspired by the recently proposed Autoencoder-Optimal Transport (AE-OT) model and tries to improve on it by addressing the problems faced by the AE-OT model itself, specifically with respect to the sample generation algorithm. Theoretical results concerning the bound on the error in approximating the non-smooth Brenier potential by its smoothed estimate, and approximating the discontinuous optimal transport map by a smoothed optimal transport map estimate have also been established in this paper.

[195]  arXiv:2101.05695 (cross-list from eess.AS) [pdf, other]
Title: EmoCat: Language-agnostic Emotional Voice Conversion
Comments: Submitted to IEEE ICASSP 2021
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)

Emotional voice conversion models adapt the emotion in speech without changing the speaker identity or linguistic content. They are less data hungry than text-to-speech models and make it possible to generate large amounts of emotional data for downstream tasks. In this work we propose EmoCat, a language-agnostic emotional voice conversion model. It achieves high-quality emotion conversion in German with less than 45 minutes of German emotional recordings by exploiting large amounts of emotional data in US English. EmoCat is an encoder-decoder model based on CopyCat, a voice conversion system which transfers prosody. We use adversarial training to remove emotion leakage from the encoder to the decoder. The adversarial training is improved by a novel contribution to gradient reversal to truly reverse gradients. This makes it possible to remove only the leaking information and to converge to better optima with higher conversion performance. Evaluations show that EmoCat can convert to different emotions but falls short on emotion intensity compared to the recordings, especially for very expressive emotions. EmoCat is able to achieve audio quality on par with the recordings for five out of six tested emotion intensities.

[196]  arXiv:2101.05726 (cross-list from math.AP) [pdf, other]
Title: A degenerate elliptic-parabolic system arising in competitive contaminant transport
Comments: old paper accepted in 2017 but not uploaded to arxiv
Journal-ref: Journal of Mathematical Analysis and Applications, 457 (2018), pp. 77-103
Subjects: Analysis of PDEs (math.AP); Numerical Analysis (math.NA)

In this work we investigate a coupled system of degenerate and nonlinear partial differential equations governing the transport of reactive solutes in groundwater. We show that the system admits a unique weak solution provided the nonlinear adsorption isotherm associated with the reaction process satisfies certain physically reasonable structural conditions. We conclude, moreover, that the solute concentrations stay non-negative if the source term is componentwise non-negative and investigate numerically the finite speed of propagation of compactly supported initial concentrations, in a two-component test case.

[197]  arXiv:2101.05744 (cross-list from stat.OT) [pdf, ps, other]
Title: A comparative study of scoring systems by simulations
Authors: László Csató
Comments: 14 pages, 4 figures, 5 tables
Subjects: Other Statistics (stat.OT); Computer Science and Game Theory (cs.GT); General Economics (econ.GN)

Scoring rules aggregate individual rankings by assigning some points to each position in each ranking such that the total sum of points provides the overall ranking of the alternatives. They are widely used in sports competitions consisting of multiple contests. We study the tradeoff between two risks in this setting: (1) the threat of an early clinch, when the title has been clinched before the last contest(s) of the competition take place; (2) the danger of winning the competition without finishing first in any contest. In particular, four historical points scoring systems of the Formula One World Championship are compared with the family of geometric scoring rules that have favourable axiomatic properties. The former are found to be competitive or even better. The current scheme seems to be a reasonable compromise in optimising the above goals. Our results shed more light on the evolution of the Formula One points scoring systems and contribute to the issue of choosing the set of point values.
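
The aggregation step described at the start of this abstract, together with the second risk (a champion who never finishes first), can be sketched as below. This is a toy illustration, not the paper's simulation: the contest results are invented, the alphabetical tie-break is an assumption, and the points vector is the 25-18-15-12-10-8-6-4-2-1 top-ten scale, used here purely as an example.

```python
def total_points(results, points):
    """Sum points per competitor; each contest is a list ordered winner first."""
    totals = {}
    for ranking in results:
        for position, competitor in enumerate(ranking):
            score = points[position] if position < len(points) else 0
            totals[competitor] = totals.get(competitor, 0) + score
    return totals


def overall_ranking(totals):
    """Order competitors by total points (ties broken alphabetically, an assumption)."""
    return sorted(totals, key=lambda c: (-totals[c], c))


def wins_without_first_place(results, points):
    """True if the overall winner never finished first in any single contest."""
    champion = overall_ranking(total_points(results, points))[0]
    return all(ranking[0] != champion for ranking in results)


# Illustrative points vector and three invented contests among four competitors.
POINTS = [25, 18, 15, 12, 10, 8, 6, 4, 2, 1]
contests = [
    ["B", "A", "C", "D"],
    ["C", "A", "D", "B"],
    ["D", "A", "B", "C"],
]
totals = total_points(contests, POINTS)
champion = overall_ranking(totals)[0]
never_won = wins_without_first_place(contests, POINTS)
```

In this invented season, competitor A is second in every contest and collects 54 points, while each of the three contest winners ends on 52, so A takes the title without a single win.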

[198]  arXiv:2101.05782 (cross-list from astro-ph.IM) [pdf, other]
Title: Checkpoint, Restore, and Live Migration for Science Platforms
Comments: 4 pages, 2 figures, to appear in the Proceedings of ADASS XXX
Subjects: Instrumentation and Methods for Astrophysics (astro-ph.IM); Distributed, Parallel, and Cluster Computing (cs.DC)

We demonstrate a fully functional implementation of (per-user) checkpoint, restore, and live migration capabilities for JupyterHub platforms. Checkpointing -- the ability to freeze and suspend to disk the running state (contents of memory, registers, open files, etc.) of a set of processes -- enables the system to snapshot a user's Jupyter session to permanent storage. The restore functionality brings a checkpointed session back to a running state, to continue where it left off at a later time and potentially on a different machine. Finally, live migration enables moving running Jupyter notebook servers between different machines, transparently to the analysis code and without disconnecting the user. Our implementation of these capabilities works at the system level, with few limitations, and typical checkpoint/restore times of O(10s) with a pathway to O(1s) live migrations. It opens a myriad of interesting use cases, especially for cloud-based deployments: from checkpointing idle sessions without interruption of the user's work (achieving cost reductions of 4x or more), to execution on spot instances with transparent migration on eviction (with additional cost reductions of up to 3x), to automated migration of workloads to ideally suited instances (e.g. moving an analysis to a machine with more or less RAM or cores based on observed resource utilization). The capabilities we demonstrate can make science platforms fully elastic while retaining excellent user experience.

Replacements for Fri, 15 Jan 21

[199]  arXiv:1202.0302 (replaced) [pdf, other]
Title: Kernels on Sample Sets via Nonparametric Divergence Estimates
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
[200]  arXiv:1502.01633 (replaced) [pdf, other]
Title: A Concurrency-Optimal List-Based Set
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
[201]  arXiv:1509.07553 (replaced) [pdf, other]
Title: Linear-time Learning on Distributions with Approximate Kernel Embeddings
Journal-ref: AAAI'16: Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, February 2016, 2073-2079
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)
[202]  arXiv:1511.04150 (replaced) [pdf, other]
Title: Deep Mean Maps
Subjects: Machine Learning (stat.ML); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[203]  arXiv:1608.06212 (replaced) [pdf, ps, other]
Title: Datatype defining rewrite systems for naturals and integers
Comments: 32 pages; 15 tables. Changes wrt v4 and v3: the webarchive with TRSs and termination proofs is now hosted at arXiv; (S.1) a note on models for natural and integer numbers and ground-completeness has been added; (S.2) Def.2.1 on a DDRS has been added; (S.3) Table 4 and a remark on natural number arithmetic have been added; the main contents of arXiv:1406.3280v4 is subsumed in this paper
Subjects: Logic in Computer Science (cs.LO)
[204]  arXiv:1702.02982 (replaced) [pdf, other]
Title: Fixing an error in Caponnetto and de Vito (2007)
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Statistics Theory (math.ST)
[205]  arXiv:1702.07409 (replaced) [pdf, other]
Title: Founsure 1.0: An Erasure Code Library with Efficient Repair and Update Features
Comments: Accepted to Elsevier SoftwareX, 2021
Subjects: Information Theory (cs.IT)
[206]  arXiv:1712.01497 (replaced) [pdf, other]
Title: Atomic Norm Based Localization of Far-Field and Near-Field Signals with Generalized Symmetric Arrays
Subjects: Information Theory (cs.IT); Signal Processing (eess.SP)
[207]  arXiv:1801.01401 (replaced) [pdf, other]
Title: Demystifying MMD GANs
Comments: Published at ICLR 2018: this https URL
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)
[208]  arXiv:1804.07796 (replaced) [pdf, other]
Title: A second-order numerical method for the aggregation equations
Comments: Improved manuscript
Subjects: Numerical Analysis (math.NA)
[209]  arXiv:1805.11565 (replaced) [pdf, other]
Title: On gradient regularizers for MMD GANs
Comments: Code at this https URL
Journal-ref: Advances in Neural Information Processing Systems 31 (NeurIPS 2018), 6700-6710
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)
[210]  arXiv:1811.08357 (replaced) [pdf, other]
Title: Learning deep kernels for exponential family densities
Journal-ref: Proceedings of the 36th International Conference on Machine Learning (ICML 2019), PMLR 97:6737-6746
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Methodology (stat.ME)
[211]  arXiv:1812.01995 (replaced) [pdf, other]
Title: Deep Learning Model for Finding New Superconductors
Comments: 10 pages in main text. Deep learning, Machine learning, Material search, Superconductors
Journal-ref: Phys. Rev. B 103, 014509, (2021)
Subjects: Machine Learning (cs.LG); Materials Science (cond-mat.mtrl-sci); Superconductivity (cond-mat.supr-con); Computation and Language (cs.CL); Computational Physics (physics.comp-ph)
[212]  arXiv:1812.03894 (replaced) [pdf, other]
Title: Physics-Based Learning for Robotic Environmental Sensing
Comments: 20 pages, 26 figures
Subjects: Robotics (cs.RO); Machine Learning (cs.LG); Machine Learning (stat.ML)
[213]  arXiv:1901.02874 (replaced) [pdf, other]
Title: DUNEuro -- A software toolbox for forward modeling in bioelectromagnetism
Subjects: Mathematical Software (cs.MS); Neurons and Cognition (q-bio.NC)
[214]  arXiv:1904.12054 (replaced) [pdf, other]
Title: Benchmark and Survey of Automated Machine Learning Frameworks
Comments: Revised version accepted for publication at Journal of Artificial Intelligence Research (JAIR)
Journal-ref: Journal of Artificial Intelligence Research 70 (2021) 411-474
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
[215]  arXiv:1905.11762 (replaced) [pdf, other]
Title: On Mixing Eventual and Strong Consistency: Acute Cloud Types
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
[216]  arXiv:1906.02104 (replaced) [pdf, ps, other]
Title: Unbiased estimators for the variance of MMD estimators
Comments: Fixes and extends the appendices of arXiv:1611.04488 and arXiv:1511.04581
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)
[217]  arXiv:1906.05497 (replaced) [pdf, other]
Title: Deep Network Approximation Characterized by Number of Neurons
Journal-ref: Communications in Computational Physics, Volume 28, Issue 5, November 2020, Pages 1768-1811
Subjects: Numerical Analysis (math.NA); Machine Learning (cs.LG)
[218]  arXiv:1906.08189 (replaced) [pdf, other]
Title: Reward Prediction Error as an Exploration Objective in Deep RL
Comments: Published at IJCAI 2020, camera-ready version
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
[219]  arXiv:1906.10207 (replaced) [pdf, other]
Title: State estimation under attack in partially-observed discrete event systems
Comments: This work has been submitted to the journal "Automatica"
Subjects: Cryptography and Security (cs.CR); Formal Languages and Automata Theory (cs.FL)
[220]  arXiv:1906.10655 (replaced) [pdf, ps, other]
Title: Complexity of Highly Parallel Non-Smooth Convex Optimization
Subjects: Optimization and Control (math.OC); Data Structures and Algorithms (cs.DS); Machine Learning (cs.LG)
[221]  arXiv:1908.02503 (replaced) [pdf, other]
Title: Computational approaches to non-convex, sparsity-inducing multi-penalty regularization
Comments: 20 pages, 2 figures
Subjects: Information Theory (cs.IT)
[222]  arXiv:1909.05690 (replaced) [pdf, other]
Title: In Defense of LSTMs for Addressing Multiple Instance Learning Problems
Comments: accepted in ACCV 2020 (oral)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[223]  arXiv:1909.13377 (replaced) [pdf, other]
Title: Lane Attention: Predicting Vehicles' Moving Trajectories by Learning Their Attention over Lanes
Comments: IROS 2020
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO); Machine Learning (stat.ML)
[224]  arXiv:1910.04366 (replaced) [pdf, ps, other]
Title: Understanding Limitation of Two Symmetrized Orders by Worst-case Complexity
Comments: 31 pages, 9 tables
Subjects: Optimization and Control (math.OC); Machine Learning (cs.LG)
[225]  arXiv:1910.05177 (replaced) [pdf, other]
Title: IdBench: Evaluating Semantic Representations of Identifier Names in Source Code
Comments: Accepted as full research paper at International Conference on Software Engineering (ICSE) 2021
Subjects: Machine Learning (cs.LG); Programming Languages (cs.PL); Software Engineering (cs.SE); Machine Learning (stat.ML)
[226]  arXiv:1912.04215 (replaced) [pdf, other]
Title: Existence, uniqueness, and approximation of solutions of jump-diffusion SDEs with discontinuous drift
Subjects: Numerical Analysis (math.NA); Probability (math.PR)
[227]  arXiv:1912.08516 (replaced) [pdf, other]
Title: PCPATCH: software for the topological construction of multigrid relaxation methods
Comments: 22 pages, updated in response to reviews
Subjects: Mathematical Software (cs.MS); Numerical Analysis (math.NA)
[228]  arXiv:1912.11662 (replaced) [pdf, other]
Title: Terahertz Multi-User Massive MIMO with Intelligent Reflecting Surface: Beam Training and Hybrid Beamforming
Subjects: Information Theory (cs.IT); Signal Processing (eess.SP)
[229]  arXiv:2001.00137 (replaced) [pdf, other]
Title: Stacked DeBERT: All Attention in Incomplete Data for Text Classification
Comments: Published (this https URL), Code (this https URL)
Journal-ref: Neural Networks 136 (2021) 87-96
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG)
[230]  arXiv:2001.04875 (replaced) [pdf, ps, other]
Title: Scalable distributed and decentralized $\mathscr{H}_2$ controller synthesis for interconnected linear discrete-time systems
Comments: Changes in this version include: overview of dissipativity-based results for interconnected systems is removed, result on the existence of a decentralized controller is included and a simulation example for the illustration of scalability replaces the previous simulation example
Subjects: Systems and Control (eess.SY)
[231]  arXiv:2001.09326 (replaced) [pdf, other]
Title: Gesticulator: A framework for semantically-aware speech-driven gesture generation
Comments: ICMI 2020 Best Paper Award. Code is available. 9 pages, 6 figures
Journal-ref: Proceedings of the 2020 International Conference on Multimodal Interaction (ICMI '20)
Subjects: Human-Computer Interaction (cs.HC); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[232]  arXiv:2002.04149 (replaced) [pdf, other]
Title: Maximizing Products of Linear Forms, and The Permanent of Positive Semidefinite Matrices
Comments: 12 pages, 2 figures
Subjects: Optimization and Control (math.OC); Data Structures and Algorithms (cs.DS); Combinatorics (math.CO)
[233]  arXiv:2002.09049 (replaced) [pdf, other]
Title: Post-training Quantization with Multiple Points: Mixed Precision without Mixed Precision
Comments: Accepted by AAAI2021
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
[234]  arXiv:2002.09116 (replaced) [pdf, other]
Title: Learning Deep Kernels for Non-Parametric Two-Sample Tests
Journal-ref: Proceedings of the 37th International Conference on Machine Learning (ICML 2020), PMLR 119:6316-6326
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Methodology (stat.ME)
[235]  arXiv:2003.03351 (replaced) [pdf, ps, other]
Title: Tighter Bound Estimation of Sensitivity Analysis for Incremental and Decremental Data Modification
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
[236]  arXiv:2003.07289 (replaced) [pdf, other]
Title: VMLoc: Variational Fusion For Learning-Based Multimodal Camera Localization
Journal-ref: The Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI-2021)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[237]  arXiv:2003.08526 (replaced) [pdf, other]
Title: Pose Augmentation: Class-agnostic Object Pose Transformation for Object Recognition
Comments: ECCV 2020, with supplementary materials
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[238]  arXiv:2004.02133 (replaced) [pdf, other]
Title: Neuron Linear Transformation: Modeling the Domain Shift for Crowd Counting
Comments: accepted by IEEE T-NNLS
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[239]  arXiv:2004.05916 (replaced) [pdf, other]
Title: Telling BERT's full story: from Local Attention to Global Aggregation
Comments: Accepted at EACL 2021
Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL)
[240]  arXiv:2004.07234 (replaced) [pdf, other]
Title: LOCA: LOcal Conformal Autoencoder for standardized data coordinates
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
[241]  arXiv:2004.10474 (replaced) [pdf, other]
Title: Assurance 2.0: A Manifesto
Authors: Robin Bloomfield (1), John Rushby (2) ((1) Adelard LLP and City, University of London (2) SRI International)
Subjects: Software Engineering (cs.SE); Systems and Control (eess.SY)
[242]  arXiv:2004.11969 (replaced) [pdf, other]
Title: Leveraging Planar Regularities for Point Line Visual-Inertial Odometry
Comments: Accepted to IROS 2020
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[243]  arXiv:2004.12141 (replaced) [pdf, other]
Title: Church Synthesis on Register Automata over Linearly Ordered Data Domains
Comments: version accepted for STACS2021 with added appendix
Subjects: Formal Languages and Automata Theory (cs.FL)
[244]  arXiv:2005.02171 (replaced) [pdf]
Title: Neural Computing for Online Arabic Handwriting Character Recognition using Hard Stroke Features Mining
Authors: Amjad Rehman
Comments: 16 pages
Journal-ref: IJICIC 2021
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[245]  arXiv:2005.03448 (replaced) [pdf, other]
Title: Physics-informed learning of governing equations from scarce data
Comments: 46 pages; 1 table, 6 figures and 3 extended data figures in main text; 2 tables and 12 figures in supplementary information
Subjects: Machine Learning (cs.LG); Computational Physics (physics.comp-ph); Data Analysis, Statistics and Probability (physics.data-an); Machine Learning (stat.ML)
[246]  arXiv:2005.08795 (replaced) [pdf, other]
Title: From Symmetric to Asymmetric Asynchronous Byzantine Consensus
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
[247]  arXiv:2005.10400 (replaced) [pdf, other]
Title: Principal Fairness for Human and Algorithmic Decision-Making
Subjects: Computers and Society (cs.CY); Machine Learning (cs.LG); Machine Learning (stat.ML)
[248]  arXiv:2005.10881 (replaced) [pdf, other]
Title: Revisiting Membership Inference Under Realistic Assumptions
Subjects: Cryptography and Security (cs.CR); Machine Learning (cs.LG); Machine Learning (stat.ML)
[249]  arXiv:2005.11580 (replaced) [src]
Title: Evolution of Cooperative Hunting in Artificial Multi-layered Societies
Comments: Conflict of interest with my previous collaborators. Thus, we retract the preprint. We retract all previous versions of the paper as well, but due to the arXiv policy, previous versions cannot be removed. We ask that you ignore earlier versions and do not refer to or distribute them further. Thanks
Subjects: Computers and Society (cs.CY); Neural and Evolutionary Computing (cs.NE); Adaptation and Self-Organizing Systems (nlin.AO); Physics and Society (physics.soc-ph)
[250]  arXiv:2005.11882 (replaced) [pdf, other]
Title: Sentiment Analysis: Automatically Detecting Valence, Emotions, and Other Affectual States from Text
Authors: Saif M. Mohammad
Comments: This is the author's manuscript of what is slated to appear in the Second Edition of Emotion Measurement, 2021
Journal-ref: Second Edition of Emotion Measurement, 2021
Subjects: Computation and Language (cs.CL)
[251]  arXiv:2005.12366 (replaced) [pdf, other]
Title: Robust exact differentiators with predefined convergence time
Subjects: Systems and Control (eess.SY); Dynamical Systems (math.DS)
[252]  arXiv:2005.12469 (replaced) [pdf, other]
Title: CARPe Posterum: A Convolutional Approach for Real-time Pedestrian Path Prediction
Comments: AAAI 2021
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[253]  arXiv:2006.00916 (replaced) [pdf, other]
Title: Renewable Power Trades and Network Congestion Externalities
Subjects: Systems and Control (eess.SY); General Economics (econ.GN); Physics and Society (physics.soc-ph)
[254]  arXiv:2006.01067 (replaced) [pdf, other]
Title: Aligning Faithful Interpretations with their Social Attribution
Comments: Accepted as a journal paper to TACL
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
[255]  arXiv:2006.03680 (replaced) [pdf, other]
Title: Evaluating the Disentanglement of Deep Generative Models through Manifold Topology
Comments: Published at ICLR 2021
Subjects: Machine Learning (stat.ML); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[256]  arXiv:2006.04581 (replaced) [pdf, other]
Title: Precoder Design and Power Allocation for Downlink MIMO-NOMA via Simultaneous Triangularization
Comments: Accepted for presentation at the 2021 IEEE Wireless Communications and Networking Conference. This paper is the conference version of arXiv:2006.06471, with 6 pages and 2 figures. For code, see this https URL
Subjects: Information Theory (cs.IT)
[257]  arXiv:2006.04767 (replaced) [pdf, other]
Title: Motion Prediction using Trajectory Sets and Self-Driving Domain Knowledge
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO); Machine Learning (stat.ML)
[258]  arXiv:2006.05942 (replaced) [pdf, ps, other]
Title: On Uniform Convergence and Low-Norm Interpolation Learning
Comments: v3: No content changes to this final version, as published at NeurIPS 2020: this https URL
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)
[259]  arXiv:2006.06471 (replaced) [pdf, other]
Title: Uplink and Downlink MIMO-NOMA with Simultaneous Triangularization
Comments: Accepted by the IEEE Transactions on Wireless Communications. This is the journal version of the submission arXiv:2006.04581 with 33 pages, 10 figures, and 2 tables. For associated code see this https URL
Journal-ref: IEEE Transactions on Wireless Communications, 2021
Subjects: Information Theory (cs.IT)
[260]  arXiv:2006.07676 (replaced) [pdf, other]
Title: EchoIA: Implicit Authentication System Based on User Feedback
Comments: 6 pages
Subjects: Cryptography and Security (cs.CR)
[261]  arXiv:2006.08475 (replaced) [pdf, other]
Title: Comparing Alternative Route Planning Techniques: A Comparative User Study on Melbourne, Dhaka and Copenhagen Road Networks
Comments: Extended the user study to also include the road networks of Dhaka and Copenhagen (the previous version only had Melbourne road network)
Subjects: Databases (cs.DB); Artificial Intelligence (cs.AI)
[262]  arXiv:2006.14154 (replaced) [pdf, other]
Title: Strictly Batch Imitation Learning by Energy-based Distribution Matching
Comments: In Proc. 34th International Conference on Neural Information Processing Systems (NeurIPS 2020)
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)
[263]  arXiv:2006.16801 (replaced) [pdf, other]
Title: Random Partitioning Forest for Point-Wise and Collective Anomaly Detection -- Application to Intrusion Detection
Comments: arXiv admin note: text overlap with arXiv:1705.03800
Journal-ref: IEEE Transactions on Information Forensics and Security, pp1-16, 2021
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
[264]  arXiv:2007.05214 (replaced) [pdf, other]
Title: Gated Recurrent Context: Softmax-free Attention for Online Encoder-Decoder Speech Recognition
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[265]  arXiv:2007.05646 (replaced) [pdf, other]
Title: Transformations between deep neural networks
Comments: 14 pages, 10 figures
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
[266]  arXiv:2007.07206 (replaced) [pdf, other]
Title: Learning Robust State Abstractions for Hidden-Parameter Block MDPs
Comments: Accepted at the 9th International Conference on Learning Representations. 22 pages, 14 figures
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
[267]  arXiv:2007.08158 (replaced) [pdf, other]
Title: Channel Estimation for RIS-Aided mmWave MIMO Systems via Atomic Norm Minimization
Comments: 30 pages, 10 figures, Submitted to IEEE TWC, under second round review
Subjects: Signal Processing (eess.SP); Information Theory (cs.IT)
[268]  arXiv:2007.11091 (replaced) [pdf, other]
Title: EMaQ: Expected-Max Q-Learning Operator for Simple Yet Effective Offline and Online RL
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
[269]  arXiv:2007.15385 (replaced) [pdf, other]
Title: A Novel Point Inclusion Test for Convex Polygons Based on Voronoi Tessellations
Authors: Rahman Salim Zengin (1), Volkan Sezer (1) ((1) Istanbul Technical University)
Comments: 8 pages, 6 figures, "for the source code, see this https URL"
Subjects: Computational Geometry (cs.CG)
[270]  arXiv:2008.00181 (replaced) [pdf, other]
Title: Relation-aware Meta-learning for Market Segment Demand Prediction with Limited Records
Comments: First two authors contributed equally; Accepted by WSDM 2021
Subjects: Machine Learning (cs.LG); Information Retrieval (cs.IR)
[271]  arXiv:2008.01553 (replaced) [pdf, other]
Title: E-Tree Learning: A Novel Decentralized Model Learning Framework for Edge AI
Comments: IEEE Internet of Things Journal, 2020
Subjects: Networking and Internet Architecture (cs.NI); Machine Learning (cs.LG)
[272]  arXiv:2008.07045 (replaced) [pdf, other]
Title: Population-Scale Study of Human Needs During the COVID-19 Pandemic: Analysis and Implications
Subjects: Computers and Society (cs.CY); Information Retrieval (cs.IR)
[273]  arXiv:2008.12949 (replaced) [pdf, other]
Title: VR-Caps: A Virtual Environment for Capsule Endoscopy
Comments: 18 pages, 14 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[274]  arXiv:2009.00726 (replaced) [pdf, other]
Title: SPAN: Spatial Pyramid Attention Network for Image Manipulation Localization
Comments: Accepted at ECCV 2020 (this https URL) Code Available at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[275]  arXiv:2009.02296 (replaced) [pdf, other]
Title: Variational Deep Learning for the Identification and Reconstruction of Chaotic and Stochastic Dynamical Systems from Noisy and Partial Observations
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
[276]  arXiv:2009.05169 (replaced) [pdf, other]
Title: Sparsifying Transformer Models with Differentiable Representation Pooling
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG)
[277]  arXiv:2009.07517 (replaced) [pdf, other]
Title: MATS: An Interpretable Trajectory Forecasting Representation for Planning and Control
Comments: 14 pages, 6 figures, 1 table. All code, models, and data can be found at this https URL . Conference on Robot Learning (CoRL) 2020
Subjects: Robotics (cs.RO); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG); Systems and Control (eess.SY)
[278]  arXiv:2009.07827 (replaced) [pdf, other]
Title: Multiple Exemplars-based Hallucination for Face Super-resolution and Editing
Comments: accepted in ACCV 2020
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[279]  arXiv:2009.08232 (replaced) [pdf, other]
Title: Broadband Finite-Element Impedance Computation for Parasitic Extraction
Subjects: Computational Engineering, Finance, and Science (cs.CE)
[280]  arXiv:2009.09315 (replaced) [pdf, ps, other]
Title: Randomized Subspace Newton Convex Method Applied to Data-Driven Sensor Selection Problem
Journal-ref: IEEE Signal Processing Letters, 2021
Subjects: Systems and Control (eess.SY)
[281]  arXiv:2009.09508 (replaced) [pdf, ps, other]
Title: Achieving Proportionality up to the Maximin Item with Indivisible Goods
Comments: Changes to wording throughout and changes to framing of section 8
Subjects: Computer Science and Game Theory (cs.GT); Artificial Intelligence (cs.AI)
[282]  arXiv:2009.09641 (replaced) [pdf, other]
Title: A conservative fully-discrete numerical method for the regularised shallow water wave equations
Subjects: Numerical Analysis (math.NA)
[283]  arXiv:2010.02824 (replaced) [pdf, other]
Title: Support-set bottlenecks for video-text representation learning
Comments: Accepted as spotlight paper at the International Conference on Learning Representations (ICLR) 2021
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[284]  arXiv:2010.03409 (replaced) [pdf, other]
Title: Learning Mesh-Based Simulation with Graph Networks
Journal-ref: International Conference on Learning Representations (ICLR), 2021
Subjects: Machine Learning (cs.LG); Computational Engineering, Finance, and Science (cs.CE)
[285]  arXiv:2010.03658 (replaced) [pdf, other]
Title: Robust Semi-Supervised Learning with Out of Distribution Data
Comments: Preprint
Subjects: Machine Learning (cs.LG)
[286]  arXiv:2010.03934 (replaced) [pdf, other]
Title: Prioritized Level Replay
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
[287]  arXiv:2010.07892 (replaced) [pdf, other]
Title: Robotic Pick-and-Place With Uncertain Object Instance Segmentation and Shape Completion
Comments: Supplementary material available for download: source code (this https URL), supplemental results (this https URL), and video (this https URL)
Subjects: Robotics (cs.RO)
[288]  arXiv:2010.08710 (replaced) [pdf, other]
Title: Causal Transfer Random Forest: Combining Logged Data and Randomized Experiments for Robust Prediction
Comments: 9 pages, 7 figures, 2 tables, accepted to WSDM 2021
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
[289]  arXiv:2010.11377 (replaced) [pdf, other]
Title: A New Block Preconditioner for Implicit Runge-Kutta Methods for Parabolic PDE Problems
Comments: 20 pages
Subjects: Numerical Analysis (math.NA)
[290]  arXiv:2010.12615 (replaced) [pdf, other]
Title: A Graph Theoretical Approach for Testing Binomiality of Reversible Chemical Reaction Networks
Subjects: Symbolic Computation (cs.SC); Commutative Algebra (math.AC)
[291]  arXiv:2010.15549 (replaced) [pdf, other]
Title: Multi-Constitutive Neural Network for Large Deformation Poromechanics Problem
Comments: Camera-ready (final) paper of the Third Workshop on Machine Learning and the Physical Sciences (NeurIPS 2020), Vancouver. More figures were added although the workshop has closed
Subjects: Machine Learning (cs.LG); Geophysics (physics.geo-ph)
[292]  arXiv:2011.01366 (replaced) [pdf, ps, other]
Title: Recent Advances on the Graph Isomorphism Problem
Comments: arXiv admin note: text overlap with arXiv:2002.06997
Subjects: Data Structures and Algorithms (cs.DS); Discrete Mathematics (cs.DM); Combinatorics (math.CO)
[293]  arXiv:2011.03512 (replaced) [pdf, other]
Title: Do We Need to Compensate for Motion Distortion and Doppler Effects in Spinning Radar Navigation?
Comments: Accepted for publication in ICRA/RA-L 2021. Version 3
Subjects: Robotics (cs.RO); Signal Processing (eess.SP)
[294]  arXiv:2011.05277 (replaced) [pdf, other]
Title: Qualities, challenges and future of genetic algorithms: a literature review
Authors: Aymeric Vie
Subjects: Neural and Evolutionary Computing (cs.NE); Distributed, Parallel, and Cluster Computing (cs.DC); Optimization and Control (math.OC)
[295]  arXiv:2011.05562 (replaced) [pdf, other]
Title: Stability of Gradient Learning Dynamics in Continuous Games: Vector Action Spaces
Comments: extension of arXiv:2011.03650 to vector action spaces. Submitted to IEEE L-CSS
Subjects: Computer Science and Game Theory (cs.GT); Systems and Control (eess.SY)
[296]  arXiv:2011.08126 (replaced) [pdf, ps, other]
Title: Threaded Gröbner Bases: a Macaulay2 package
Comments: 5 pages, package in revision
Subjects: Commutative Algebra (math.AC); Mathematical Software (cs.MS)
[297]  arXiv:2011.08581 (replaced) [pdf, other]
Title: Demonstrations of Cooperative Perception: Safety and Robustness in Connected and Automated Vehicle Operations
Subjects: Robotics (cs.RO)
[298]  arXiv:2011.08828 (replaced) [pdf, other]
Title: Uncertainty estimation for molecular dynamics and sampling
Comments: 17 pages, 9 figures
Subjects: Chemical Physics (physics.chem-ph); Machine Learning (cs.LG); Computational Physics (physics.comp-ph)
[299]  arXiv:2011.09946 (replaced) [pdf]
Title: Data Driven Modeling of Interfacial Traction Separation Relations using a Thermodynamically Consistent Neural Network
Subjects: Computational Engineering, Finance, and Science (cs.CE)
[300]  arXiv:2011.10996 (replaced) [pdf, other]
Title: Time series classification for predictive maintenance on event logs
Comments: 19 pages, 9 figures, submitted to ECMLPKDD 2021 Journal Track
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
[301]  arXiv:2011.11194 (replaced) [pdf, other]
Title: V3H: View Variation and View Heredity for Incomplete Multi-view Clustering
Comments: Accepted by IEEE Transactions on Artificial Intelligence
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)
[302]  arXiv:2011.13228 (replaced) [pdf, other]
Title: MultiStar: Instance Segmentation of Overlapping Objects with Star-Convex Polygons
Comments: Accepted for ISBI 2021
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[303]  arXiv:2011.13388 (replaced) [pdf, other]
Title: 3DSNet: Unsupervised Shape-to-Shape 3D Style Transfer
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[304]  arXiv:2011.14076 (replaced) [pdf, other]
Title: OpenKBP: The open-access knowledge-based planning grand challenge
Comments: 26 pages, 6 figures, 5 tables
Subjects: Medical Physics (physics.med-ph); Computer Vision and Pattern Recognition (cs.CV)
[305]  arXiv:2011.14295 (replaced) [pdf, other]
Title: A comparison of handcrafted, parameterized, and learnable features for speech separation
Subjects: Sound (cs.SD)
[306]  arXiv:2011.14741 (replaced) [pdf, ps, other]
Title: Minimax Converse for Identification via Channels
Authors: Shun Watanabe
Comments: 18 pages, no figure
Subjects: Information Theory (cs.IT)
[307]  arXiv:2011.15101 (replaced) [pdf, ps, other]
Title: Vertex Sparsification for Edge Connectivity in Polynomial Time
Authors: Yang P. Liu
Comments: 16 pages, changed license
Subjects: Data Structures and Algorithms (cs.DS)
[308]  arXiv:2012.02405 (replaced) [pdf, other]
Title: Applying Chebyshev-Tau spectral method to solve the parabolic equation model of wide-angle rational approximation in ocean acoustics
Comments: 16 pages, 5 figures, 1 table
Subjects: Computational Engineering, Finance, and Science (cs.CE); Numerical Analysis (math.NA); Fluid Dynamics (physics.flu-dyn)
[309]  arXiv:2012.02409 (replaced) [pdf, other]
Title: When does gradient descent with logistic loss find interpolating two-layer networks?
Comments: 43 pages, 4 figures
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Optimization and Control (math.OC)
[310]  arXiv:2012.05037 (replaced) [pdf, other]
Title: Rainbow and monochromatic circuits and cuts in binary matroids
Comments: 15 pages, 1 figure
Subjects: Combinatorics (math.CO); Discrete Mathematics (cs.DM)
[311]  arXiv:2012.05199 (replaced) [pdf, other]
Title: A Riemannian Block Coordinate Descent Method for Computing the Projection Robust Wasserstein Distance
Subjects: Machine Learning (cs.LG); Optimization and Control (math.OC); Machine Learning (stat.ML)
[312]  arXiv:2012.06185 (replaced) [pdf, ps, other]
Title: Exploring wav2vec 2.0 on speaker verification and language identification
Comments: Self-supervised, speaker verification, language identification, multi-task learning, wav2vec 2.0
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[313]  arXiv:2012.08023 (replaced) [pdf, other]
Title: Friedrichs Learning: Weak Solutions of Partial Differential Equations via Deep Learning
Subjects: Numerical Analysis (math.NA); Machine Learning (cs.LG)
[314]  arXiv:2012.09550 (replaced) [pdf, other]
Title: Learned Block-based Hybrid Image Compression
Comments: 9 pages, 11 figures
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[315]  arXiv:2012.09850 (replaced) [pdf, other]
Title: Social Distancing and the Internet: What Can Network Performance Measurements Tell Us?
Comments: 12 pages, submitted to TPRC48
Subjects: Networking and Internet Architecture (cs.NI)
[316]  arXiv:2012.10203 (replaced) [pdf, other]
Title: Classification with Strategically Withheld Data
Subjects: Machine Learning (cs.LG); Computer Science and Game Theory (cs.GT)
[317]  arXiv:2012.11213 (replaced) [pdf, ps, other]
Title: Self-Supervised Learning for Visual Summary Identification in Scientific Publications
Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL)
[318]  arXiv:2012.11390 (replaced) [pdf, other]
Title: Adversarial Training for a Continuous Robustness Control Problem in Power Systems
Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[319]  arXiv:2012.12215 (replaced) [pdf, other]
Title: Robust Kernel-based Feature Representation for 3D Point Cloud Analysis via Circular Graph Convolutional Network
Comments: 10 pages, 9 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[320]  arXiv:2012.12544 (replaced) [pdf, other]
Title: BaPipe: Exploration of Balanced Pipeline Parallelism for DNN Training
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Artificial Intelligence (cs.AI)
[321]  arXiv:2012.12820 (replaced) [pdf]
Title: Multiclass Spinal Cord Tumor Segmentation on MRI with Deep Learning
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[322]  arXiv:2012.13635 (replaced) [pdf, other]
Title: Logic Tensor Networks
Comments: 68 pages, 28 figures, 6 tables
Subjects: Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[323]  arXiv:2012.14474 (replaced) [pdf, other]
Title: Paraconsistent Foundations for Probabilistic Reasoning, Programming and Concept Formation
Authors: Ben Goertzel
Subjects: Artificial Intelligence (cs.AI)
[324]  arXiv:2101.00401 (replaced) [pdf, other]
Title: Border Basis Computation with Gradient-Weighted Norm
Authors: Hiroshi Kera
Comments: 19 pages, 1 figure
Subjects: Symbolic Computation (cs.SC); Machine Learning (cs.LG); Commutative Algebra (math.AC)
[325]  arXiv:2101.01152 (replaced) [pdf, other]
Title: Provable Generalization of SGD-trained Neural Networks of Any Width in the Presence of Adversarial Label Noise
Comments: 29 pages, 9 figures
Subjects: Machine Learning (cs.LG); Optimization and Control (math.OC); Machine Learning (stat.ML)
[326]  arXiv:2101.01628 (replaced) [pdf]
Title: Local Translation Services for Neglected Languages
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG)
[327]  arXiv:2101.01719 (replaced) [pdf, ps, other]
Title: Split block Bloom filters
Authors: Jim Apple
Comments: 3 pages, 1 figure
Subjects: Data Structures and Algorithms (cs.DS)
[328]  arXiv:2101.01867 (replaced) [pdf, other]
Title: dame-flame: A Python Library Providing Fast Interpretable Matching for Causal Inference
Authors: Neha R. Gupta (1), Vittorio Orlandi (1), Chia-Rui Chang (2), Tianyu Wang (1), Marco Morucci (1), Pritam Dey (1), Thomas J. Howell (1), Xian Sun (1), Angikar Ghosal (1), Sudeepa Roy (1), Cynthia Rudin (1), Alexander Volfovsky (1) ((1) Duke University, (2) Harvard University)
Comments: 5 pages, 1 figure; Reference and discussion of CEM corrected
Subjects: Machine Learning (cs.LG); Mathematical Software (cs.MS)
[329]  arXiv:2101.02888 (replaced) [pdf, other]
Title: Predicting Semen Motility using three-dimensional Convolutional Neural Networks
Comments: Corrected typos. Made slight changes as per the comments
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[330]  arXiv:2101.03225 (replaced) [pdf, ps, other]
Title: The extended binary quadratic residue code of length 42 holds a 3-design
Comments: 6 pages. Second version
Subjects: Information Theory (cs.IT); Combinatorics (math.CO)
[331]  arXiv:2101.03577 (replaced) [pdf, other]
Title: Quantum Secure Direct Communication with Mutual Authentication using a Single Basis
Subjects: Quantum Physics (quant-ph); Cryptography and Security (cs.CR)
[332]  arXiv:2101.03641 (replaced) [pdf, other]
Title: Learning Augmented Index Policy for Optimal Service Placement at the Network Edge
Subjects: Networking and Internet Architecture (cs.NI); Machine Learning (cs.LG)
[333]  arXiv:2101.03706 (replaced) [pdf, other]
Title: #StayHome #WithMe: How Do YouTubers Help with COVID-19 Loneliness?
Comments: CHI Conference on Human Factors in Computing Systems (CHI '21), May 8--13, 2021, Yokohama, Japan
Subjects: Human-Computer Interaction (cs.HC)
[334]  arXiv:2101.04223 (replaced) [pdf, other]
Title: Exploiting Multiple Timescales in Hierarchical Echo State Networks
Subjects: Machine Learning (cs.LG)
[335]  arXiv:2101.04271 (replaced) [pdf]
Title: What Do We Mean by "Accessibility Research"? A Literature Survey of Accessibility Papers in CHI and ASSETS from 1994 to 2019
Subjects: Human-Computer Interaction (cs.HC)
[336]  arXiv:2101.04540 (replaced) [pdf]
Title: Capturing social media expressions during the COVID-19 pandemic in Argentina and forecasting mental health and emotions
Comments: 12 pages, 2 figures, 3 tables
Subjects: Computers and Society (cs.CY); Social and Information Networks (cs.SI)
[337]  arXiv:2101.04547 (replaced) [pdf, other]
Title: Of Non-Linearity and Commutativity in BERT
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
[338]  arXiv:2101.04562 (replaced) [pdf, other]
Title: Hyperbolic Deep Neural Networks: A Survey
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[339]  arXiv:2101.04741 (replaced) [pdf, other]
Title: CityFlow-NL: Tracking and Retrieval of Vehicles at City Scale by Natural Language Descriptions
Comments: The code and data we use in this paper are available at: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[340]  arXiv:2101.04798 (replaced) [pdf, ps, other]
Title: Convergence analysis of some tent-based schemes for linear hyperbolic systems
Subjects: Numerical Analysis (math.NA)
[341]  arXiv:2101.04807 (replaced) [pdf, other]
Title: Sparse Sampling Kaczmarz-Motzkin Method with Linear Convergence
Subjects: Numerical Analysis (math.NA); Optimization and Control (math.OC)
[342]  arXiv:2101.04888 (replaced) [pdf, other]
Title: Crooked Indifferentiability Revisited
Subjects: Cryptography and Security (cs.CR)
[343]  arXiv:2101.04899 (replaced) [pdf, ps, other]
Title: Experimental Evaluation of Deep Learning models for Marathi Text Classification
Comments: Accepted at ICMISC 2021
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG)
[344]  arXiv:2101.04904 (replaced) [pdf, other]
Title: EEC: Learning to Encode and Regenerate Images for Continual Learning
Comments: Accepted at ICLR 2021. A preliminary version of this work was presented at ICML 2020 Workshop on Lifelong Machine Learning: arXiv:2007.06637
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[345]  arXiv:2101.04954 (replaced) [pdf, other]
Title: EventAnchor: Reducing Human Interactions in Event Annotation of Racket Sports Videos
Subjects: Human-Computer Interaction (cs.HC); Computer Vision and Pattern Recognition (cs.CV)
[346]  arXiv:2101.05032 (replaced) [pdf, other]
Title: Round-Competitive Algorithms for Uncertainty Problems with Parallel Queries
Authors: Thomas Erlebach, Michael Hoffmann, Murilo S. de Lima (School of Informatics, University of Leicester, United Kingdom)
Comments: An extended abstract is to appear in the proceedings of the 38th International Symposium on Theoretical Aspects of Computer Science (STACS 2021); [v2] minor fixes and typesetting
Subjects: Data Structures and Algorithms (cs.DS)
[347]  arXiv:2101.05044 (replaced) [pdf, other]
Title: Publishing patterns reflect political polarization in news media
Subjects: Social and Information Networks (cs.SI); Computers and Society (cs.CY)
[348]  arXiv:2101.05091 (replaced) [pdf]
Title: MRI Images, Brain Lesions and Deep Learning
Comments: Submitted to: Computer Programs and Methods in Biomedicine update (2021)
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Neurons and Cognition (q-bio.NC)
[349]  arXiv:2101.05111 (replaced) [pdf, other]
Title: Mutation Analysis for Cyber-Physical Systems: Scalable Solutions and Results in the Space Domain
Subjects: Software Engineering (cs.SE)
[350]  arXiv:2101.05141 (replaced) [pdf, other]
Title: Approximation of the spectral fractional powers of the Laplace-Beltrami Operator
Comments: 21 pages, 5 figures
Subjects: Numerical Analysis (math.NA)
[351]  arXiv:2101.05217 (replaced) [pdf, other]
Title: Similarity-based prediction for channel mapping and user positioning
Authors: Luc Le Magoarou (IRT b-com, Hypermedia)
Comments: IEEE Communications Letters, Institute of Electrical and Electronics Engineers, In press
Subjects: Signal Processing (eess.SP); Machine Learning (cs.LG)