New submissions for Tue, 4 Aug 20

[1]  arXiv:2008.00002 [pdf, other]
Title: TA-Dash: An Interactive Dashboard for Spatial-Temporal Traffic Analytics -- Demo Paper
Subjects: Human-Computer Interaction (cs.HC); Social and Information Networks (cs.SI)

In recent years, a large number of research efforts have aimed at the development of machine learning models to predict complex spatial-temporal mobility patterns and their impact on road traffic and infrastructure. However, the utility of these models is often diminished due to the lack of accessible user interfaces to view and analyse prediction results. In this paper, we present the Traffic Analytics Dashboard (TA-Dash), an interactive dashboard that enables the visualisation of complex spatial-temporal urban traffic patterns. We demonstrate the utility of TA-Dash using the example of two recently proposed spatial-temporal models for urban traffic and urban road infrastructure analysis. In particular, the use cases include the analysis, prediction and visualisation of the impact of planned special events on urban road traffic as well as the analysis and visualisation of structural dependencies within urban road networks. The lightweight TA-Dash dashboard is aimed at non-expert users involved in urban traffic management and mobility service planning. TA-Dash builds on a flexible layer-based architecture that is easily adaptable to the visualisation of new models.

[2]  arXiv:2008.00016 [pdf]
Title: Western ideological homogeneity in entrepreneurial finance research: Evidence from highly cited publications
Comments: 27 pages, 8 figures
Subjects: Computers and Society (cs.CY); Digital Libraries (cs.DL)

Entrepreneurs play crucial roles in global sustainable development, but limited financial resources constrain their performance and survival rate. The discipline of entrepreneurial finance therefore emerged to explore the connection between finance and entrepreneurship. Despite the global presence of entrepreneurship, the literature of entrepreneurial finance is suspected to be ideologically homogeneous and Western-dominated. Thus, the objective of this study is to examine the existence of Western ideological homogeneity in entrepreneurial finance literature. Employing the mindsponge mechanism and bibliometric analyses (Y-index and social structure), we analyze 412 highly cited publications extracted from the Web of Science database and find Western ideological dominance as well as weak tolerance towards heterogeneity in the set of core ideologies of entrepreneurial finance. These results are consistent across the author, institution, and country levels, which provides strong evidence for the existence of Western ideological homogeneity in the field. We recommend that editors, reviewers, and authors take proactive action to diversify research topics and enhance knowledge exchange to avoid the shortfalls of ideological homogeneity. Moreover, the synthesis of the mindsponge mechanism and bibliometric analyses is suggested as a possible way to evaluate the state of ideological diversity in other scientific disciplines.

[3]  arXiv:2008.00017 [pdf]
Title: Safety, Security, and Privacy Threats Posed by Accelerating Trends in the Internet of Things
Comments: A Computing Community Consortium (CCC) white paper, 9 pages
Subjects: Computers and Society (cs.CY); Cryptography and Security (cs.CR)

The Internet of Things (IoT) is already transforming industries, cities, and homes. The economic value of this transformation across all industries is estimated to be trillions of dollars and the societal impact on energy efficiency, health, and productivity is enormous. Alongside potential benefits of interconnected smart devices comes increased risk and potential for abuse when embedding sensing and intelligence into every device. One of the core problems with the increasing number of IoT devices is the increased complexity that is required to operate them safely and securely. This increased complexity creates new safety, security, privacy, and usability challenges far beyond the difficult challenges individuals face just securing a single device. We highlight some of the negative trends that smart devices and collections of devices cause and we argue that issues related to security, physical safety, privacy, and usability are tightly interconnected and solutions that address all four simultaneously are needed. Tight safety and security standards for individual devices based on existing technology are needed. Likewise, research that determines the best way for individuals to confidently manage collections of devices must guide the future deployments of such systems.

[4]  arXiv:2008.00018 [pdf, ps, other]
Title: Process of Efficiently Parallelizing a Protein Structure Determination Algorithm
Comments: 7 pages published in PDPA2006
Journal-ref: PDPTA 2006: 320-326
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Computational Engineering, Finance, and Science (cs.CE); Numerical Analysis (math.NA); Biomolecules (q-bio.BM)

Computational protein structure determination involves optimization in a problem space much too large to exhaustively search. Existing approaches include optimization algorithms such as gradient descent and simulated annealing, but these typically only find local minima. One novel approach, implemented in REDcRAFT, is to fold a protein residue by residue instead of all at once. This simulates a protein folding as each residue exits from the generating ribosome. While REDcRAFT exponentially reduces the problem space so it can be explored in polynomial time, it is still extremely computationally demanding. This algorithm does have the advantage that most of the execution time is spent in inherently parallelizable code. However, preliminary results from parallel execution indicate that approximately two-thirds of execution time is dedicated to system overhead. Additionally, by carefully analyzing and timing the structure of the program, the major bottlenecks can be identified. After addressing these issues, REDcRAFT becomes a scalable parallel application with nearly two orders of magnitude improvement.

[5]  arXiv:2008.00021 [pdf, other]
Title: Beartooth Relay Protocol: Supporting Real-Time Application Streams over LoRa
Comments: 12 pages, submitted to AdHoc-Now 2020, for associated video, see this https URL
Subjects: Networking and Internet Architecture (cs.NI)

The near-ubiquitous availability of wireless connectivity lets users take advantage of a large variety of mobile applications. This connectivity predominantly comes as cellular and WiFi, limiting users to available infrastructure. At the same time, commercial efforts for infrastructure-less connectivity do not support mobile application traffic. In this paper, we present a new LoRa radio and a relay protocol capable of supporting real-time application traffic on point-to-point and multihop connections. Our solution has the potential to extend mobile application functionality beyond infrastructure coverage areas.

[6]  arXiv:2008.00023 [pdf]
Title: Opportunities and Challenges for Next Generation Computing
Comments: A Computing Community Consortium (CCC) white paper, 7 pages
Subjects: Computers and Society (cs.CY); Hardware Architecture (cs.AR)

Computing has dramatically changed nearly every aspect of our lives, from business and agriculture to communication and entertainment. As a nation, we rely on computing in the design of systems for energy, transportation and defense; and computing fuels scientific discoveries that will improve our fundamental understanding of the world and help develop solutions to major challenges in health and the environment. Computing has changed our world, in part, because our innovations can run on computers whose performance and cost-performance have improved a million-fold over the last few decades. A driving force behind this has been a repeated doubling of the transistors per chip, dubbed Moore's Law. A concomitant enabler has been Dennard Scaling that has permitted these performance doublings at roughly constant power, but, as we will see, both trends face challenges. Consider for a moment the impact of these two trends over the past 30 years. A 1980s supercomputer (e.g. a Cray 2) was rated at nearly 2 Gflops and consumed nearly 200 kW of power. At the time, it was used for high performance and national-scale applications ranging from weather forecasting to nuclear weapons research. A computer of similar performance now fits in our pocket and consumes less than 10 watts. What would be the implications of a similar computing/power reduction over the next 30 years - that is, taking a petaflop-scale machine (e.g. the Cray XK7, which requires about 500 kW for 1 Pflop (= $10^{15}$ operations/sec) performance) and repeating that process? What is possible with such a computer in your pocket? How would it change the landscape of high capacity computing? In the remainder of this paper, we articulate some opportunities and challenges for dramatic performance improvements from personal to national scale computing, and discuss some "out of the box" possibilities for achieving computing at this scale.
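The scaling comparison in this abstract can be checked with a few lines of arithmetic. The sketch below uses the rounded figures quoted above (the abstract says "nearly" and "less than", so the results are order-of-magnitude estimates only). The variable names are illustrative, and the sketch assumes the pocket device delivers the same 2 Gflops as the Cray 2.

```python
# Rounded figures quoted in the abstract (order-of-magnitude estimates only).
cray2_flops = 2e9      # "nearly 2 Gflops"
cray2_watts = 200e3    # "nearly 200 kW"
pocket_watts = 10.0    # "less than 10 watts"; assumed to deliver the same 2 Gflops

xk7_flops = 1e15       # 1 Pflop
xk7_watts = 500e3      # "about 500 kW"

# Power reduction at constant performance over the past ~30 years.
power_reduction = cray2_watts / pocket_watts

# Energy efficiency in flops per watt for each machine.
cray2_efficiency = cray2_flops / cray2_watts
pocket_efficiency = cray2_flops / pocket_watts
xk7_efficiency = xk7_flops / xk7_watts

# Hypothetical: repeat the same power reduction for the petaflop machine.
future_pocket_watts = xk7_watts / power_reduction

print(f"power reduction at constant performance: {power_reduction:,.0f}x")
print(f"Cray 2 efficiency:  {cray2_efficiency:.1e} flops/W")
print(f"pocket efficiency:  {pocket_efficiency:.1e} flops/W")
print(f"Cray XK7 efficiency: {xk7_efficiency:.1e} flops/W")
print(f"petaflop machine after the same reduction: {future_pocket_watts:.0f} W")
```

With these rounded inputs the reduction factor is 20,000, so repeating it would bring a petaflop-scale machine down to roughly 25 W.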

[7]  arXiv:2008.00025 [pdf, other]
Title: Rethinking Default Values: a Low Cost and Efficient Strategy to Define Hyperparameters
Comments: 35 pages, 15 figures
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)

Machine Learning (ML) algorithms have been successfully employed by a vast range of practitioners with different backgrounds. One of the reasons for the popularity of ML is its capability to consistently deliver accurate results, which can be further boosted by adjusting hyperparameters (HP). However, some practitioners have limited knowledge about the algorithms and do not take advantage of suitable HP settings. In general, HP values are defined by trial and error, tuning, or by using default values. Trial and error is very subjective, time-consuming and dependent on the user's experience. Tuning techniques search for HP values able to maximize the predictive performance of induced models for a given dataset, but with the drawback of a high computational cost and target specificity. To avoid tuning costs, practitioners use default values suggested by the algorithm developer or by tools implementing the algorithm. Although default values usually result in models with acceptable predictive performance, different implementations of the same algorithm can suggest distinct default values. To maintain a balance between tuning and using default values, we propose a strategy to generate new optimized default values. Our approach is grounded on a small set of optimized values able to obtain predictive performance values better than default settings provided by popular tools. The HP candidates are estimated through a pool of promising values tuned from a small and informative set of datasets. After performing a large experiment and a careful analysis of the results, we concluded that our approach delivers better default values. Besides, it leads to competitive solutions when compared with the use of tuned values, being easier to use and having a lower cost. Based on our results, we also extracted simple rules to guide practitioners in deciding whether to use our new methodology or a tuning approach.

[8]  arXiv:2008.00030 [pdf, other]
Title: Chance Constrained Policy Optimization for Process Control and Optimization
Comments: arXiv admin note: text overlap with arXiv:2006.02750
Subjects: Systems and Control (eess.SY); Machine Learning (cs.LG)

Chemical process optimization and control are affected by 1) plant-model mismatch, 2) process disturbances, and 3) constraints for safe operation. Reinforcement learning by policy optimization would be a natural way to solve this due to its ability to address stochasticity, plant-model mismatch, and directly account for the effect of future uncertainty and its feedback in a proper closed-loop manner; all without the need of an inner optimization loop. One of the main reasons why reinforcement learning has not been considered for industrial processes (or almost any engineering application) is that it lacks a framework to deal with safety critical constraints. Present algorithms for policy optimization use difficult-to-tune penalty parameters, fail to reliably satisfy state constraints or present guarantees only in expectation. We propose a chance constrained policy optimization (CCPO) algorithm which guarantees the satisfaction of joint chance constraints with a high probability - which is crucial for safety critical tasks. This is achieved by the introduction of constraint tightening (backoffs), which are computed simultaneously with the feedback policy. Backoffs are adjusted with Bayesian optimization using the empirical cumulative distribution function of the probabilistic constraints, and are therefore self-tuned. This results in a general methodology that can be imbued into present policy optimization algorithms to enable them to satisfy joint chance constraints with high probability. We present case studies that analyze the performance of the proposed approach.
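The role of the empirical cumulative distribution function in a joint chance constraint can be illustrated with a small sketch. It assumes each simulated closed-loop rollout is summarised by a list of constraint values $g_j$, with $g_j \le 0$ meaning constraint $j$ is satisfied, and that the joint constraint must hold with probability at least $1 - \alpha$. The function names and the sample data are hypothetical, and the backoff adjustment itself (done with Bayesian optimization according to the abstract) is not shown.

```python
from typing import Sequence


def empirical_joint_satisfaction(rollouts: Sequence[Sequence[float]]) -> float:
    """Fraction of rollouts in which every constraint value g_j is <= 0.

    This is the empirical CDF of max_j g_j evaluated at 0.
    """
    if not rollouts:
        raise ValueError("need at least one rollout")
    satisfied = sum(1 for g in rollouts if max(g) <= 0.0)
    return satisfied / len(rollouts)


def meets_chance_constraint(rollouts: Sequence[Sequence[float]], alpha: float) -> bool:
    """Check the empirical estimate against the target probability 1 - alpha."""
    return empirical_joint_satisfaction(rollouts) >= 1.0 - alpha


# Hypothetical data: four rollouts, two constraints each.
samples = [[-1.0, -0.5], [-0.2, -0.1], [0.3, -0.4], [-0.6, -0.9]]
print(empirical_joint_satisfaction(samples))        # one rollout violates a constraint
print(meets_chance_constraint(samples, alpha=0.1))  # target probability 0.9
```

When the estimate falls short of the target, a larger backoff would tighten the constraints seen by the policy optimizer; when it overshoots, the backoff can be relaxed.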

[9]  arXiv:2008.00032 [pdf, ps, other]
Title: Sentiment Analysis based Multi-person Multi-criteria Decision Making Methodology: Using Natural Language Processing and Deep Learning for Decision Aid
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)

Decision making models are constrained by taking the expert evaluations with pre-defined numerical or linguistic terms. We claim that the use of sentiment analysis will allow decision making models to consider expert evaluations in natural language. Accordingly, we propose the Sentiment Analysis based Multi-person Multi-criteria Decision Making (SA-MpMcDM) methodology, which builds the expert evaluations from their natural language reviews, and even from their numerical ratings if they are available. The SA-MpMcDM methodology incorporates an end-to-end multi-task deep learning model for aspect based sentiment analysis, named DMuABSA model, able to identify the aspect categories mentioned in an expert review, and to distill their opinions and criteria. The individual expert evaluations are aggregated via a criteria weighting through the attention of the experts. We evaluate the methodology in a restaurant decision problem, hence we build the TripR-2020 dataset of restaurant reviews, which we manually annotate and release. We analyze the SA-MpMcDM methodology in different scenarios using and not using natural language and numerical evaluations. The analysis shows that the combination of both sources of information results in a higher quality preference vector.

[10]  arXiv:2008.00036 [pdf, other]
Title: TweepFake: about Detecting Deepfake Tweets
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG)

The threat of deepfakes, i.e., synthetic or manipulated media, is becoming increasingly alarming, especially for social media platforms that have already been accused of manipulating public opinion. Even the cheapest text generation techniques (e.g. the search-and-replace method) can deceive humans, as the Net Neutrality scandal proved in 2017. Meanwhile, more powerful generative models have been released, from RNN-based methods to the GPT-2 language model. State-of-the-art language models, transformer-based in particular, can generate synthetic text in response to the model being primed with arbitrary input. Therefore, it is crucial to develop tools that help to assess media authenticity.
To help the research in this field, we collected a dataset of real deepfake tweets. It is real in the sense that each deepfake tweet was actually posted on Twitter. We collected tweets from a total of 23 bots, imitating 17 human accounts. The bots are based on various generation techniques, i.e., Markov Chains, RNN, RNN+Markov, LSTM, GPT-2. We also randomly selected tweets from the humans imitated by the bots to have an overall balanced dataset of 25,836 tweets (half human-written and half bot-generated). The dataset is publicly available on Kaggle.
In order to create a solid baseline for detection techniques on the proposed dataset, we tested 13 detection methods based on various state-of-the-art approaches. The baseline detection results confirm that the newest and most sophisticated generative methods based on the transformer architecture (e.g., GPT-2) can produce high-quality short texts that are difficult to detect.

[11]  arXiv:2008.00044 [pdf, other]
Title: On the Computational Complexity of Linear Discrepancy
Comments: ESA 2020
Subjects: Data Structures and Algorithms (cs.DS); Computational Complexity (cs.CC); Computational Geometry (cs.CG)

Many problems in computer science and applied mathematics require rounding a vector $\mathbf{w}$ of fractional values lying in the interval $[0,1]$ to a binary vector $\mathbf{x}$ so that, for a given matrix $\mathbf{A}$, $\mathbf{A}\mathbf{x}$ is as close to $\mathbf{A}\mathbf{w}$ as possible. For example, this problem arises in LP rounding algorithms used to approximate $\mathsf{NP}$-hard optimization problems and in the design of uniformly distributed point sets for numerical integration. For a given matrix $\mathbf{A}$, the worst-case error over all choices of $\mathbf{w}$ incurred by the best possible rounding is measured by the linear discrepancy of $\mathbf{A}$, a quantity studied in discrepancy theory, and introduced by Lovasz, Spencer, and Vesztergombi (EJC, 1986).
We initiate the study of the computational complexity of linear discrepancy. Our investigation proceeds in two directions: (1) proving hardness results and (2) finding both exact and approximate algorithms to evaluate the linear discrepancy of certain matrices. For (1), we show that linear discrepancy is $\mathsf{NP}$-hard. Thus we do not expect to find an efficient exact algorithm for the general case. Restricting our attention to matrices with a constant number of rows, we present a poly-time exact algorithm for matrices consisting of a single row and matrices with a constant number of rows and entries of bounded magnitude. We also present an exponential-time approximation algorithm for general matrices, and an algorithm that approximates linear discrepancy to within an exponential factor.
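The rounding error behind linear discrepancy can be made concrete for tiny matrices. The sketch below assumes the standard definition $\mathrm{lindisc}(\mathbf{A}) = \max_{\mathbf{w} \in [0,1]^n} \min_{\mathbf{x} \in \{0,1\}^n} \|\mathbf{A}(\mathbf{x} - \mathbf{w})\|_\infty$ (the abstract does not spell out the norm) and brute-forces only the inner minimum for one given $\mathbf{w}$. The function name is illustrative, and the enumeration takes $2^n$ steps, so this is not one of the algorithms from the paper.

```python
from itertools import product
from typing import Sequence, Tuple


def best_rounding(
    A: Sequence[Sequence[float]], w: Sequence[float]
) -> Tuple[float, Tuple[int, ...]]:
    """Return (error, x) minimising max_i |(A(x - w))_i| over all binary x.

    A is a non-empty list of rows, each of length len(w).
    Ties are broken in favour of the first x in enumeration order.
    """
    n = len(w)
    best_err = float("inf")
    best_x: Tuple[int, ...] = ()
    for x in product((0, 1), repeat=n):
        # Infinity norm of A(x - w): largest absolute row sum.
        err = max(abs(sum(row[j] * (x[j] - w[j]) for j in range(n))) for row in A)
        if err < best_err:
            best_err, best_x = err, x
    return best_err, best_x


# A single row can be rounded with no error for this w ...
print(best_rounding([[1.0, 1.0]], [0.5, 0.5]))
# ... but adding a second row forces an error for the same w.
print(best_rounding([[1.0, 1.0], [1.0, -1.0]], [0.5, 0.5]))
```

The linear discrepancy itself is the worst case of this error over all $\mathbf{w}$, which is the quantity whose computational complexity the abstract addresses.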

[12]  arXiv:2008.00045 [pdf]
Title: From Data to Knowledge to Action: A Global Enabler for the 21st Century
Comments: A Computing Community Consortium (CCC) white paper, 8 pages
Subjects: Computers and Society (cs.CY)

A confluence of advances in the computer and mathematical sciences has unleashed unprecedented capabilities for enabling true evidence-based decision making. These capabilities are making possible the large-scale capture of data and the transformation of that data into insights and recommendations in support of decisions about challenging problems in science, society, and government. Key advances include jumps in the availability of rich streams of data, precipitous drops in the cost of storing and retrieving massive amounts of data, exponential increases in computing power and memory, and jumps in the prowess of methods for performing machine learning and reasoning. These advances have come together to create an inflection point in our ability to harness large amounts of data for generating insights and guiding decision making. The shift of commerce, science, education, art, and entertainment to the web makes available unprecedented quantities of structured and unstructured databases about human activities - much of it available to anyone who wishes to mine it for insights. In the sciences, new evidential paradigms and sensing technologies are making available great quantities of data, via use of fundamentally new kinds of low-cost sensors (e.g., genomic microarrays) or through viewers that provide unprecedented scope and resolution. The data pose a huge opportunity for data-centric analyses. To date, we have only scratched the surface of the potential for learning from these large-scale data sets. Opportunities abound for tapping our new capabilities more broadly to provide insights to decision makers and to enhance the quality of their actions and policies.

[13]  arXiv:2008.00047 [pdf, other]
Title: Class-Oriented Poisoning Attack
Subjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR); Machine Learning (stat.ML)

Poisoning attacks on machine learning systems compromise the model performance by deliberately injecting malicious samples in the training dataset to influence the training process. Prior works focus on either availability attacks (i.e., lowering the overall model accuracy) or integrity attacks (i.e., enabling specific instance based backdoor). In this paper, we advance the adversarial objectives of the availability attacks to a per-class basis, which we refer to as class-oriented poisoning attacks. We demonstrate that the proposed attack is capable of forcing the corrupted model to predict in two specific ways: (i) classify unseen new images to a targeted "supplanter" class, and (ii) misclassify images from a "victim" class while maintaining the classification accuracy on other non-victim classes. To maximize the adversarial effect, we propose a gradient-based framework that manipulates the logits to retain/eliminate the desired/undesired feature information in the generated poisoning images. Using newly defined metrics at the class level, we illustrate the effectiveness of the proposed class-oriented poisoning attacks on various models (e.g., LeNet-5, Vgg-9, and ResNet-50) over a wide range of datasets (e.g., MNIST, CIFAR-10, and ImageNet-ILSVRC2012).

[14]  arXiv:2008.00049 [pdf, other]
Title: Mitigating the Backfire Effect Using Pacing and Leading
Subjects: Social and Information Networks (cs.SI); Applications (stat.AP)

Online social networks create echo-chambers where people are infrequently exposed to opposing opinions. Even if such exposure occurs, the persuasive effect may be minimal or nonexistent. Recent studies have shown that exposure to opposing opinions causes a backfire effect, where people become more steadfast in their original beliefs. We conducted a longitudinal field experiment on Twitter to test methods that mitigate the backfire effect while exposing people to opposing opinions. Our subjects were Twitter users with anti-immigration sentiment. The backfire effect was defined as an increase in the usage frequency of extreme anti-immigration language in the subjects' posts. We used automated Twitter accounts, or bots, to apply different treatments to the subjects. One bot posted only pro-immigration content, which we refer to as arguing. Another bot initially posted anti-immigration content, then gradually posted more pro-immigration content, which we refer to as pacing and leading. We also applied a contact treatment in conjunction with the messaging based methods, where the bots liked the subjects' posts. We found that the most effective treatment was a combination of pacing and leading with contact. The least effective treatment was arguing with contact. In fact, arguing with contact consistently showed a backfire effect relative to a control group. These findings have many limitations, but they still have important implications for the study of political polarization, the backfire effect, and persuasion in online social networks.

[15]  arXiv:2008.00051 [pdf, ps, other]
Title: Analysis of SGD with Biased Gradient Estimators
Comments: Accepted to ICML 2020 Workshop "Beyond First Order Methods in ML Systems"
Subjects: Machine Learning (cs.LG); Optimization and Control (math.OC); Machine Learning (stat.ML)

We analyze the complexity of biased stochastic gradient methods (SGD), where individual updates are corrupted by deterministic, i.e., biased, error terms. We derive convergence results for smooth (non-convex) functions and give improved rates under the Polyak-Lojasiewicz condition. We quantify how the magnitude of the bias impacts the attainable accuracy and convergence rates.
Our framework covers many applications where biased gradient updates are either the only ones available or are preferred over unbiased ones for performance reasons. For instance, in the domain of distributed learning, biased gradient compression techniques such as top-k compression have been proposed as a tool to alleviate the communication bottleneck, and in derivative-free optimization, only biased gradient estimators can be queried. We discuss a few guiding examples that show the broad applicability of our analysis.
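Top-k compression, mentioned above as an example of a biased gradient estimator, can be sketched in a few lines. The operator keeps the k largest-magnitude coordinates and zeroes the rest; because it is a deterministic function of the gradient, the dropped mass does not average out, which is why it is biased. The function names, step size, and toy objective are illustrative assumptions, not taken from the paper.

```python
from typing import List


def top_k(grad: List[float], k: int) -> List[float]:
    """Keep the k largest-magnitude entries of grad and set the others to zero."""
    if not 0 <= k <= len(grad):
        raise ValueError("k must be between 0 and len(grad)")
    order = sorted(range(len(grad)), key=lambda i: abs(grad[i]), reverse=True)
    keep = set(order[:k])
    return [g if i in keep else 0.0 for i, g in enumerate(grad)]


def compressed_sgd_step(x: List[float], grad: List[float], lr: float, k: int) -> List[float]:
    """One gradient step that uses the top-k compressed gradient."""
    compressed = top_k(grad, k)
    return [xi - lr * gi for xi, gi in zip(x, compressed)]


# Toy objective f(x) = 0.5 * ||x||^2, whose gradient at x is x itself.
x = [4.0, -1.0, 0.5]
print(top_k(x, k=1))                           # only the largest coordinate survives
print(compressed_sgd_step(x, x, lr=0.5, k=1))  # only that coordinate is updated
```

In a distributed setting only the k surviving entries and their indices would need to be communicated, which is the bottleneck relief the abstract refers to.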

[16]  arXiv:2008.00054 [pdf, other]
Title: Securing CNN Model and Biometric Template using Blockchain
Comments: Published in IEEE BTAS 2019
Subjects: Cryptography and Security (cs.CR)

Blockchain has emerged as a leading technology that ensures security in a distributed framework. Recently, it has been shown that blockchain can be used to convert traditional blocks of any deep learning models into secure systems. In this research, we model a trained biometric recognition system in an architecture which leverages blockchain technology to provide fault-tolerant access in a distributed environment. The advantage of the proposed approach is that tampering in one particular component alerts the whole system and helps in easy identification of 'any' possible alteration. Experimentally, with different biometric modalities, we have shown that the proposed approach provides security to both the deep learning model and the biometric template.

[17]  arXiv:2008.00055 [pdf]
Title: From Data to Knowledge to Action: Enabling the Smart Grid
Comments: A Computing Community Consortium (CCC) white paper, 8 pages
Subjects: Computers and Society (cs.CY)

Our nation's infrastructure for generating, transmitting, and distributing electricity - "The Grid" - is a relic based in many respects on century-old technology. It consists of expensive, centralized generation via large plants, and a massive transmission and distribution system. It strives to deliver high-quality power to all subscribers simultaneously - no matter what their demand - and must therefore be sized to the peak aggregate demand at each distribution point. Ultimately, the system demands end-to-end synchronization, and it lacks a mechanism for storing ("buffering") energy, thus complicating sharing among grids or independent operation during an "upstream" outage. Recent blackouts demonstrate the existing grid's problems - failures are rare but spectacular. Moreover, the structure cannot accommodate the highly variable nature of renewable energy sources such as solar and wind. Many people are pinning their hopes on the "smart grid" - i.e., a more distributed, adaptive, and market-based infrastructure for the generation, distribution, and consumption of electrical energy. This new approach is designed to yield greater efficiency and resilience, while reducing environmental impact, compared to the existing electricity distribution system. Initial plans for the smart grid suggest it will make extensive use of existing information technology. In particular, recent advances in data analytics - i.e., data mining, machine learning, etc. - have the potential to greatly enhance the smart grid and, ultimately, amplify its impact, by helping us make sense of an increasing wealth of data about how we use energy and the kinds of demands that we are placing upon the current energy grid. Here we describe what the electricity grid could look like in 10 years, and specifically how Federal investment in data analytics approaches is critical to realizing this vision.

[18]  arXiv:2008.00058 [pdf, other]
Title: A Bayesian cognition approach for belief updating of correlation judgement through uncertainty visualizations
Comments: 9 pages, 8 figures, accepted at IEEE Information Visualization 2020
Subjects: Human-Computer Interaction (cs.HC)

Understanding correlation judgement is important to designing effective visualizations of bivariate data. Prior work on correlation perception has not considered how factors including prior beliefs and uncertainty representation impact such judgements. The present work focuses on the impact of uncertainty communication when judging bivariate visualizations. Specifically, we model how users update their beliefs about variable relationships after seeing a scatterplot with and without uncertainty representation. To model and evaluate the belief updating, we present three studies. Study 1 focuses on a proposed "Line + Cone" visual elicitation method for capturing users' beliefs in an accurate and intuitive fashion. The findings reveal that our proposed method of belief solicitation reduces complexity and accurately captures the users' uncertainty about a range of bivariate relationships. Study 2 leverages the "Line + Cone" elicitation method to measure belief updating on the relationship between different sets of variables when seeing correlation visualization with and without uncertainty representation. We compare changes in users' beliefs to the predictions of Bayesian cognitive models, which provide normative benchmarks for how users should update their prior beliefs about a relationship in light of observed data. The findings from Study 2 revealed that one of the visualization conditions with uncertainty communication led to users being slightly more confident about their judgement compared to visualization without uncertainty information. Study 3 builds on findings from Study 2 and explores differences in belief update when the bivariate visualization is congruent or incongruent with users' prior belief. Our results highlight the effects of incorporating uncertainty representation, and the potential of measuring belief updating on correlation judgement with Bayesian cognitive models.

[19]  arXiv:2008.00062 [pdf, other]
Title: Partial Reconfiguration for Design Optimization
Subjects: Hardware Architecture (cs.AR)

FPGA designers have traditionally shared a similar design methodology with ASIC designers. Most notably, at design time, FPGA designers commit to a fixed allocation of logic resources to modules in a design. At runtime, some of the occupied resources could be left idle or under-utilized due to hard-to-avoid sources of inefficiencies (e.g., operation dependencies). With partial reconfiguration (PR), FPGA resources can be re-allocated over time. Therefore, using PR, a designer can attempt to reduce idleness and under-utilization with better area-time scheduling.
In this paper, we explain when, how, and why PR-style designs can improve over the performance-area Pareto front of ASIC-style designs (without PR). We first introduce the concept of area-time volume to explain why PR-style designs can improve upon ASIC-style designs. We identify resource under-utilization as an opportunity that can be exploited by PR-style designs. We then present a first-order analytical model to help a designer decide if a PR-style design can be beneficial. When it is the case, the model points to the most suitable PR execution strategy and provides an estimate of the improvement. The model is validated in three case studies.

[20]  arXiv:2008.00067 [pdf, other]
Title: Infusing Reachability-Based Safety into Planning and Control for Multi-agent Interactions
Comments: To appear in IEEE/RSJ International Conference on Intelligent Robots and Systems 2020
Subjects: Robotics (cs.RO); Systems and Control (eess.SY)

Within a robot autonomy stack, the planner and controller are typically designed separately, and serve different purposes. As such, there is often a diffusion of responsibilities when it comes to ensuring safety for the robot. We propose that a planner and controller should share the same interpretation of safety but apply this knowledge in a different yet complementary way. To achieve this, we use Hamilton-Jacobi (HJ) reachability theory at the planning level to provide the robot planner with the foresight to avoid entering regions with possible inevitable collision. However, this alone does not guarantee safety. In conjunction with this HJ reachability-infused planner, we propose a minimally-interventional multi-agent safety-preserving controller also derived via HJ-reachability theory. The safety controller maintains safety for the robot without unduly impacting planner performance. We demonstrate the benefits of our proposed approach in a multi-agent highway scenario where a robot car is rewarded for navigating through traffic as fast as possible, and we show that our approach provides strong safety assurances yet achieves the highest performance compared to other safety controllers.

[21]  arXiv:2008.00072 [pdf, other]
Title: Dynamic Object Tracking and Masking for Visual SLAM
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)

In dynamic environments, the performance of visual SLAM techniques can be impaired by visual features taken from moving objects. One solution is to identify those objects so that their visual features can be removed for localization and mapping. This paper presents a simple and fast pipeline (around 14 fps on a GTX 1080) that uses deep neural networks, extended Kalman filters and visual SLAM to improve both localization and mapping in dynamic environments. Results on the dynamic sequences from the TUM dataset using RTAB-Map as visual SLAM suggest that the approach achieves similar localization performance compared to other state-of-the-art methods, while also providing the position of the tracked dynamic objects, a 3D map free of those dynamic objects, and better loop closure detection, with the whole pipeline able to run on a robot moving at moderate speed.
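
The feature-removal step described above can be illustrated with a minimal sketch. It assumes a per-pixel boolean mask of dynamic objects is already available from some detector; the function name `filter_static_keypoints` and the data layout are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

Keypoint = Tuple[int, int]  # (x, y) pixel coordinates


def filter_static_keypoints(
    keypoints: List[Keypoint], dynamic_mask: List[List[bool]]
) -> List[Keypoint]:
    """Keep only keypoints that do not fall on pixels flagged as dynamic.

    dynamic_mask[y][x] is True where a (hypothetical) detector marked a
    moving object. Keypoints outside the mask bounds are kept unchanged.
    """
    height = len(dynamic_mask)
    width = len(dynamic_mask[0]) if height else 0
    kept = []
    for x, y in keypoints:
        inside = 0 <= x < width and 0 <= y < height
        if inside and dynamic_mask[y][x]:
            continue  # feature lies on a moving object, so drop it
        kept.append((x, y))
    return kept
```

Only the surviving keypoints would then be passed on to localization and mapping.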

[22]  arXiv:2008.00075 [pdf, other]
Title: The Multiplicative-Additive Lambek Calculus with Subexponential and Bracket Modalities
Comments: Submitted to the Journal of Logic, Language, and Information
Subjects: Logic in Computer Science (cs.LO); Logic (math.LO)

We give a proof-theoretic and algorithmic complexity analysis for systems introduced by Morrill to serve as the core of the CatLog categorial grammar parser. We consider two recent versions of Morrill's calculi, and focus on their fragments including multiplicative (Lambek) connectives, additive conjunction and disjunction, brackets and bracket modalities, and the ! subexponential modality. For both systems, we resolve issues connected with the cut rule and provide necessary modifications, after which we prove admissibility of cut (cut elimination theorem). We also prove algorithmic undecidability for both calculi, and show that categorial grammars based on them can generate arbitrary recursively enumerable languages.

[23]  arXiv:2008.00076 [pdf, other]
Title: Posibility conditions for Open Access
Authors: Jacinto Davila
Comments: 13 pages, 2 figures, 6 tables and accompanying prolog source code
Subjects: Computer Science and Game Theory (cs.GT); Multiagent Systems (cs.MA)

This is an attempt to formalize the conditions of possibility for free, libre, open access to scientific knowledge within a game. The challenge is to enunciate the terms under which agents participating in the Grand conversation of science would be willing to openly share, exchange, negotiate or surrender their contributions, considering their corresponding intentions, goals, beliefs and expected utilities. Many conclusions can be drawn from the game described here. Of course, we have made many simplifying decisions during the modelling process that must be taken into account as a determining context for those conclusions. It can be safely stated, however, that under the current conditions of the game, Editors will keep betting on Toll Access knowledge distribution models even if all the other Academic agents go for Open Access.

[24]  arXiv:2008.00077 [pdf, other]
Title: Neural Architecture Search in Graph Neural Networks
Subjects: Neural and Evolutionary Computing (cs.NE); Machine Learning (cs.LG)

Performing analytical tasks over graph data has become increasingly interesting due to the ubiquity and large availability of relational information. However, unlike images or sentences, there is no notion of sequence in networks. Nodes (and edges) follow no absolute order, and it is hard for traditional machine learning (ML) algorithms to recognize a pattern and generalize their predictions on this type of data. Graph Neural Networks (GNN) successfully tackled this problem. They became popular after the generalization of the convolution concept to the graph domain. However, they possess a large number of hyperparameters and their design and optimization is currently hand-made, based on heuristics or empirical intuition. Neural Architecture Search (NAS) methods appear as an interesting solution to this problem. In this direction, this paper compares two NAS methods for optimizing GNN: one based on reinforcement learning and a second based on evolutionary algorithms. Results consider 7 datasets over two search spaces and show that both methods obtain similar accuracies to a random search, raising the question of how many of the search space dimensions are actually relevant to the problem.

[25]  arXiv:2008.00078 [pdf, other]
Title: Learning to Rank for Active Learning: A Listwise Approach
Comments: Accepted at ICPR 2020
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Active learning emerged as an alternative to alleviate the effort of labeling huge amounts of data for data-hungry applications (such as image/video indexing and retrieval, autonomous driving, etc.). The goal of active learning is to automatically select a number of unlabeled samples for annotation (according to a budget), based on an acquisition function, which indicates how valuable a sample is for training the model. The learning loss method is a task-agnostic approach which attaches a module that learns to predict the target loss of unlabeled data, and selects the data with the highest loss for labeling. In this work, we follow this strategy but we define the acquisition function as a learning-to-rank problem and rethink the structure of the loss prediction module, using a simple but effective listwise approach. Experimental results on four datasets demonstrate that our method outperforms recent state-of-the-art active learning approaches for both image classification and regression tasks.
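
The selection step of the learning loss strategy (label the samples with the highest predicted loss, up to a budget) can be sketched as follows. The function name `select_for_labeling` is hypothetical, and the predicted losses are assumed to come from a separately trained loss prediction module that is not shown here.

```python
from typing import List, Sequence


def select_for_labeling(predicted_losses: Sequence[float], budget: int) -> List[int]:
    """Return indices of the `budget` samples with the highest predicted loss."""
    ranked = sorted(
        range(len(predicted_losses)),
        key=lambda i: predicted_losses[i],
        reverse=True,
    )
    return ranked[:budget]
```

Because only the ordering of the samples matters for this step, a module that ranks samples correctly is sufficient even if its absolute loss values are inaccurate.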

[26]  arXiv:2008.00083 [pdf, other]
Title: MiabNET: Message-in-a-bottle Protocol for MANET
Authors: Dongning Ma
Subjects: Networking and Internet Architecture (cs.NI)

In this short paper, we propose MiabNET, a reactive protocol for Mobile Ad-hoc Networks (MANET). This protocol leverages the concept of "message-in-a-bottle" to spread the routing information through the entire network. The idea of the protocol is briefly described as follows: if a node would like to find a route to a destination node that is not in its routing table, it will initialize a bottle and send this bottle to a random one of its neighbors. If this neighbor does not have the route to the destination, it will in turn send the bottle to a random one of its own neighbors, and so on until the bottle reaches the destination node.
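
The forwarding idea above amounts to a random walk over the network graph. The sketch below is a simplified illustration under stated assumptions, not the protocol itself: the adjacency-dictionary representation, the function name `send_bottle`, and the `max_hops` cutoff are all hypothetical, and intermediate routing tables are ignored.

```python
import random
from typing import Dict, Hashable, List, Optional


def send_bottle(
    neighbors: Dict[Hashable, List[Hashable]],
    source: Hashable,
    destination: Hashable,
    rng: random.Random,
    max_hops: int = 10_000,
) -> Optional[List[Hashable]]:
    """Forward a bottle to uniformly random neighbors until it arrives.

    Returns the list of visited nodes (the recorded route), or None if the
    bottle gets stuck or exceeds max_hops (an assumed safety cutoff).
    """
    path = [source]
    current = source
    for _ in range(max_hops):
        if current == destination:
            return path
        if not neighbors[current]:
            return None  # isolated node: the bottle cannot move on
        current = rng.choice(neighbors[current])
        path.append(current)
    return path if current == destination else None
```

For example, `send_bottle({"A": ["B"], "B": ["A", "C"], "C": ["B"]}, "A", "C", random.Random(0))` returns a route that starts at `"A"` and ends at `"C"`; the route may revisit nodes, since nothing in this sketch prevents loops.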

[27]  arXiv:2008.00084 [pdf]
Title: Survey of Spectrum Regulation for Intelligent Transportation Systems
Subjects: Networking and Internet Architecture (cs.NI); Systems and Control (eess.SY)

As 5G communications technology develops, vehicular communications that require high reliability, low latency, and massive connectivity are attracting increasing interest from academia and industry. Owing to these technologies, vehicular communications are no longer limited to vehicle components in the form of Vehicle-to-Vehicle (V2V) or Vehicle-to-Infrastructure (V2I) networks, but also extend to other users such as pedestrians and cellular users. Dedicated Short-Range Communications (DSRC) is the conventional vehicular communication standard for Intelligent Transportation Systems (ITS). More recently, 3GPP introduced Cellular-Vehicle-to-Everything (C-V2X), which has arisen as a competitor to DSRC. Meanwhile, the Federal Communications Commission (FCC) issued a Notice of Proposed Rulemaking (NPRM) to consider deploying Unlicensed National Information Infrastructure (U-NII) devices in the ITS band with two interference mitigation approaches: Detect and Vacate (DAV) and Re-channelization (Re-CH). With multiple standard options and interference mitigation approaches, numerous regulatory taxonomies can be identified and relevant technical challenges arise. However, these challenges are much broader than the current and future regulatory taxonomies pursued by the different countries. Because of different plans, the technical and regulatory challenges vary. This paper presents a literature survey about the technical challenges, the current and future ITS band usage plans for the U.S., Europe, China, Korea, and Japan, and the major research testbeds and plans. This survey shows that the most likely deployment taxonomies are (1) DSRC, C-V2X, Wi-Fi with Re-CH, (2) DSRC and C-V2X with interoperation, (3) C-V2X only, whereas the most difficult technical challenge is the interoperability between the Wi-Fi-like DSRC and 4G LTE-like C-V2X.

[28]  arXiv:2008.00085 [pdf]
Title: Performance Evaluation of Orchestra Scheduling in Time Slotted Channel Hopping Networks
Subjects: Networking and Internet Architecture (cs.NI)

In this paper, we evaluate the performance of networks that use RPL (Routing Protocol for Low-Power and Lossy Networks) with TSCH (Time Slotted Channel Hopping) and Orchestra (an autonomous method for building the TSCH schedule). We measure the performance in the transient state when a node dies (i.e., is removed from the network) and determine how long it takes for the network to return to a stable RPL tree, as well as the impact on energy consumption. Our analysis shows that Orchestra reduces the energy consumption when RPL is in a transient state, such as when one of the nodes dies. Furthermore, we calculate the energy consumption in the transient state without using Orchestra and compare both outcomes. We show that Orchestra reduces energy consumption by up to one-third compared to not using Orchestra.

[29]  arXiv:2008.00086 [pdf, other]
Title: LearningCC: An online learning approach for congestion control
Authors: Songyang Zhang
Comments: 5 figures
Subjects: Networking and Internet Architecture (cs.NI)

Recently, much effort has been devoted by researchers from both academia and industry to developing novel congestion control methods. This letter presents LearningCC, in which the congestion control problem is solved by a reinforcement learning approach. Instead of adjusting the congestion window with a fixed policy, an endpoint has several options to choose from. Predicting the best option is a hard task. Each option is mapped to an arm of a bandit machine. The endpoint can learn to determine the optimal choice through trial and error. Experiments are performed on the ns3 platform to verify the effectiveness of LearningCC by comparing it with other benchmark algorithms. Results indicate that it can achieve lower transmission delay than loss-based algorithms. In particular, we found that LearningCC makes a significant improvement on links suffering from random loss.
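
The mapping of control options to bandit arms can be illustrated with a generic epsilon-greedy learner. This is a sketch under assumptions, not LearningCC itself: the abstract does not name the bandit algorithm, the option set, or the reward signal, so the class `EpsilonGreedyBandit`, the exploration rate, and the reward values in the usage note are all hypothetical.

```python
import random


class EpsilonGreedyBandit:
    """Generic epsilon-greedy multi-armed bandit (illustrative only)."""

    def __init__(self, n_arms: int, epsilon: float = 0.1, seed: int = 0):
        self.counts = [0] * n_arms    # times each arm was played
        self.values = [0.0] * n_arms  # running mean reward per arm
        self.epsilon = epsilon
        self.rng = random.Random(seed)

    def best(self) -> int:
        """Arm with the highest estimated mean reward."""
        return max(range(len(self.values)), key=lambda a: self.values[a])

    def select(self) -> int:
        """Try every arm once, then explore with probability epsilon."""
        for arm, count in enumerate(self.counts):
            if count == 0:
                return arm
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))
        return self.best()

    def update(self, arm: int, reward: float) -> None:
        """Incrementally update the mean reward of the played arm."""
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]
```

In use, an endpoint would call `select()` to pick an option (for instance, one of several window-adjustment rules), observe a reward such as a throughput or delay score, and feed it back through `update()`.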

[30]  arXiv:2008.00087 [pdf, other]
Title: Adaptive Bitrate Video Streaming for Wireless nodes: A Survey
Subjects: Networking and Internet Architecture (cs.NI); Distributed, Parallel, and Cluster Computing (cs.DC)

In today's Internet, video is the dominant application, and wireless networks such as WiFi, Cellular, and Bluetooth have become ubiquitous. Hence, most Internet traffic is video over wireless nodes. There is a plethora of research on improving video streaming to achieve high Quality of Experience (QoE) over the Internet, much of it focused on wireless nodes. Recent measurement studies often show that the QoE of video suffers for many wireless clients over the Internet. Recently, many research papers have presented models and schemes to optimize Adaptive BitRate (ABR) based video streaming for wireless and mobile users. In this survey, we present a comprehensive overview of recent work in the area of Internet video specifically designed for wireless networks. Recent research has suggested that connecting clients through wireless links adds new challenges, and that these challenges become more difficult to handle when the nodes are mobile. This survey also discusses new potential areas of future research due to the increasing scarcity of wireless spectrum.

[31]  arXiv:2008.00088 [pdf]
Title: A Comparative Study of AI-based Intrusion Detection Techniques in Critical Infrastructures
Comments: ACM Transactions on Internet Technology, 2020. 22 pages, 11 figures, 3 tables
Subjects: Networking and Internet Architecture (cs.NI); Distributed, Parallel, and Cluster Computing (cs.DC); Machine Learning (cs.LG)

Volunteer computing, which uses Internet-connected devices (laptops, PCs, smart devices, etc.) whose owners volunteer them as storage and computing power resources, has become an essential mechanism for resource management in numerous applications. The growth of the volume and variety of data traffic in the Internet leads to concerns about the robustness of cyber-physical systems, especially for critical infrastructures. Therefore, the implementation of an efficient Intrusion Detection System (IDS) for gathering such sensory data has gained vital importance. In this paper, we present a comparative study of Artificial Intelligence (AI)-driven intrusion detection systems for wirelessly connected sensors that track crucial applications. Specifically, we present an in-depth analysis of the use of machine learning, deep learning and reinforcement learning solutions to recognize intrusive behavior in the collected traffic. We evaluate the proposed mechanisms by using KDD'99 as a real attack dataset in our simulations. Results present the performance metrics for three different IDSs, namely the Adaptively Supervised and Clustered Hybrid IDS (ASCH-IDS), Restricted Boltzmann Machine-based Clustered IDS (RBC-IDS) and Q-learning based IDS (QL-IDS), in detecting malicious behaviors. We also present the performance of different reinforcement learning techniques such as State-Action-Reward-State-Action Learning (SARSA) and Temporal Difference learning (TD). Through simulations, we show that QL-IDS achieves a 100% detection rate while SARSA-IDS and TD-IDS achieve detection rates on the order of 99.5%.
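
For readers unfamiliar with the reinforcement learning variants compared above, the textbook tabular update rules for Q-learning and SARSA are sketched below. These are the standard formulations, not the IDS implementations evaluated in the paper; the function names, the table layout, and the default step size and discount factor are assumptions chosen for illustration.

```python
from typing import Dict, List

QTable = Dict[int, List[float]]  # state -> list of action values


def q_learning_update(q: QTable, s: int, a: int, r: float, s_next: int,
                      alpha: float = 0.5, gamma: float = 0.9) -> None:
    """Off-policy update: bootstrap from the best action in the next state."""
    target = r + gamma * max(q[s_next])
    q[s][a] += alpha * (target - q[s][a])


def sarsa_update(q: QTable, s: int, a: int, r: float, s_next: int, a_next: int,
                 alpha: float = 0.5, gamma: float = 0.9) -> None:
    """On-policy update: bootstrap from the action actually taken next."""
    target = r + gamma * q[s_next][a_next]
    q[s][a] += alpha * (target - q[s][a])
```

The only difference is the bootstrap term: Q-learning uses the maximum next-state value, while SARSA uses the value of the next action chosen by the current policy.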

[32]  arXiv:2008.00089 [pdf]
Title: A Centralized Channel Allocation Method in Clustered Ad Hoc Networks
Comments: 8 pages, 10 figures, one table, submitted to LCN 2013
Subjects: Networking and Internet Architecture (cs.NI)

Cognitive radio networks (CRNs) are the next generation of wireless communication. This type of network requires efficient spectrum allocation methods. This paper presents a new meta-heuristic evolutionary method for solving the channel allocation problem in an ad hoc network context. The suggested method is based on a graph-theoretic model and seeks a solution for the spectrum allocation problem in a clustered ad hoc network topology. The method, referred to as the imperialist competitive algorithm (ICA), provides a scheme for allocating the available channels to cluster heads that maximizes spectrum efficiency and minimizes co-channel interference. The suggested method is tested in several scenarios, and the performance of the ICA-based scheme is compared with that of a genetic-algorithm-based scheme.

[33]  arXiv:2008.00092 [pdf, other]
Title: Deep Depth Estimation from Visual-Inertial SLAM
Comments: 9 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV)

This paper addresses the problem of learning to complete a scene's depth from sparse depth points and images of indoor scenes. Specifically, we study the case in which the sparse depth is computed from a visual-inertial simultaneous localization and mapping (VI-SLAM) system. The resulting point cloud has low density, is noisy, and has a non-uniform spatial distribution, as compared to the input from active depth sensors, e.g., LiDAR or Kinect. Since VI-SLAM produces point clouds only over textured areas, we compensate for the missing depth of the low-texture surfaces by leveraging their planar structures and their surface normals, which are an important intermediate representation. The pre-trained surface normal network, however, suffers from large performance degradation when there is a significant difference in the viewing direction (especially the roll angle) of the test image as compared to the training images. To address this limitation, we use the available gravity estimate from the VI-SLAM to warp the input image to the orientation prevailing in the training dataset. This results in a significant performance gain for the surface normal estimate, and thus the dense depth estimates. Finally, we show that our method outperforms other state-of-the-art approaches both on training (ScanNet and NYUv2) and testing (collected with Azure Kinect) datasets.

[34]  arXiv:2008.00095 [pdf, other]
Title: Intelligent Management of Mobile Systems through Computational Self-Awareness
Subjects: Hardware Architecture (cs.AR); Systems and Control (eess.SY)

Runtime resource management for many-core systems is increasingly complex. The complexity can be due to diverse workload characteristics with conflicting demands, or limited shared resources such as memory bandwidth and power. Resource management strategies for many-core systems must distribute shared resource(s) appropriately across workloads, while coordinating the high-level system goals at runtime in a scalable and robust manner.
To address the complexity of dynamic resource management in many-core systems, state-of-the-art techniques that use heuristics have been proposed. These methods lack the formalism in providing robustness against unexpected runtime behavior. One of the common solutions for this problem is to deploy classical control approaches with bounds and formal guarantees. Traditional control theoretic methods lack the ability to adapt to (1) changing goals at runtime (i.e., self-adaptivity), and (2) changing dynamics of the modeled system (i.e., self-optimization).
In this chapter, we explore adaptive resource management techniques that provide self-optimization and self-adaptivity by employing principles of computational self-awareness, specifically reflection. By supporting these self-awareness properties, the system can reason about the actions it takes by considering the significance of competing objectives, user requirements, and operating conditions while executing unpredictable workloads.

[35]  arXiv:2008.00096 [pdf, other]
Title: KAPLAN: A 3D Point Descriptor for Shape Completion
Comments: 18 pages, 15 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)

We present a novel 3D shape completion method that operates directly on unstructured point clouds, thus avoiding resource-intensive data structures like voxel grids. To this end, we introduce KAPLAN, a 3D point descriptor that aggregates local shape information via a series of 2D convolutions. The key idea is to project the points in a local neighborhood onto multiple planes with different orientations. In each of those planes, point properties like normals or point-to-plane distances are aggregated into a 2D grid and abstracted into a feature representation with an efficient 2D convolutional encoder. Since all planes are encoded jointly, the resulting representation nevertheless can capture their correlations and retains knowledge about the underlying 3D shape, without expensive 3D convolutions. Experiments on public datasets show that KAPLAN achieves state-of-the-art performance for 3D shape completion.

[36]  arXiv:2008.00097 [pdf, other]
Title: Back-propagation through Signal Temporal Logic Specifications: Infusing Logical Structure into Gradient-Based Methods
Comments: Published in the Workshop on Algorithmic Foundations of Robotics 2020
Subjects: Systems and Control (eess.SY); Computation and Language (cs.CL); Logic in Computer Science (cs.LO)

This paper presents a technique, named STLCG, to compute the quantitative semantics of Signal Temporal Logic (STL) formulas using computation graphs. STLCG provides a platform which enables the incorporation of logical specifications into robotics problems that benefit from gradient-based solutions. Specifically, STL is a powerful and expressive formal language that can specify spatial and temporal properties of signals generated by both continuous and hybrid systems. The quantitative semantics of STL provide a robustness metric, i.e., how much a signal satisfies or violates an STL specification. In this work, we devise a systematic methodology for translating STL robustness formulas into computation graphs. With this representation, and by leveraging off-the-shelf automatic differentiation tools, we are able to back-propagate through STL robustness formulas and hence enable a natural and easy-to-use integration with many gradient-based approaches used in robotics. We demonstrate, through examples stemming from various robotics applications, that STLCG is versatile, computationally efficient, and capable of injecting human-domain knowledge into the problem formulation.
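
The robustness metric mentioned above can be made concrete with the standard quantitative semantics of a few STL operators over a finite, discretely sampled signal. This plain-Python sketch only evaluates robustness values; it does not build computation graphs or compute gradients as STLCG does, and the function names are hypothetical.

```python
from typing import List, Sequence


def rho_greater_than(signal: Sequence[float], c: float) -> List[float]:
    """Per-time-step robustness of the predicate `x > c`: positive iff satisfied."""
    return [x - c for x in signal]


def rho_and(rho_a: Sequence[float], rho_b: Sequence[float]) -> List[float]:
    """Conjunction: the weaker of the two robustness values at each step."""
    return [min(a, b) for a, b in zip(rho_a, rho_b)]


def rho_always(rho: Sequence[float]) -> float:
    """`Always phi` over the whole trace: the worst-case robustness."""
    return min(rho)


def rho_eventually(rho: Sequence[float]) -> float:
    """`Eventually phi` over the whole trace: the best-case robustness."""
    return max(rho)
```

For the signal `[3.0, 2.0, 5.0]`, the specification "always x > 1" has robustness `1.0` (satisfied with margin 1), while "always x > 2.5" has robustness `-0.5` (violated by 0.5). Because these operators are compositions of subtraction, `min`, and `max`, they can be expressed in an automatic differentiation framework, which is the property the abstract relies on.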

[37]  arXiv:2008.00101 [pdf, other]
Title: Telemanipulation with Chopsticks: Analyzing Human Factors in User Demonstrations
Comments: IROS 2020
Subjects: Robotics (cs.RO); Human-Computer Interaction (cs.HC)

Chopsticks constitute a simple yet versatile tool that humans have used for thousands of years to perform a variety of challenging tasks ranging from food manipulation to surgery. Applying such a simple tool in a diverse repertoire of scenarios requires significant adaptability. Towards developing autonomous manipulators with comparable adaptability to humans, we study chopsticks-based manipulation to gain insights into human manipulation strategies. We conduct a within-subjects user study with 25 participants, evaluating three different data-collection methods: normal chopsticks, motion-captured chopsticks, and a novel chopstick telemanipulation interface. We analyze factors governing human performance across a variety of challenging chopstick-based grasping tasks. Although participants rated teleoperation as the least comfortable and most difficult-to-use method, teleoperation enabled users to achieve the highest success rates on three out of five objects considered. Further, we notice that subjects quickly learned and adapted to the teleoperation interface. Finally, while motion-captured chopsticks could provide a better reflection of how humans use chopsticks, the teleoperation interface can produce quality on-hardware demonstrations from which the robot can directly learn.

[38]  arXiv:2008.00103 [pdf, ps, other]
Title: F*: An Interpretable Transformation of the F-measure
Comments: 4 pages
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Information Retrieval (cs.IR); Machine Learning (stat.ML)

The F-measure is widely used to assess the performance of classification algorithms. However, some researchers find it lacking in intuitive interpretation, questioning the appropriateness of combining two aspects of performance as conceptually distinct as precision and recall, and also questioning whether the harmonic mean is the best way to combine them. To ease this concern, we describe a simple transformation of the F-measure, which we call F* (F-star), which has an immediate practical interpretation.
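
The abstract does not spell out the transformation. The sketch below assumes the form F* = F / (2 - F), which is algebraically equal to TP / (TP + FP + FN), i.e., the proportion of correctly identified positives among all cases that are either actually positive or predicted positive. The function names are hypothetical, and the code assumes at least one true positive so that no denominator is zero.

```python
def f_measure(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall (assumes tp > 0)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def f_star(tp: int, fp: int, fn: int) -> float:
    """Assumed transformation F* = F / (2 - F) of the F-measure."""
    f = f_measure(tp, fp, fn)
    return f / (2 - f)
```

With 6 true positives, 2 false positives and 4 false negatives, the F-measure is 2/3 and F* is 0.5, matching 6 / (6 + 2 + 4). Since F / (2 - F) is monotonically increasing in F on [0, 1], ranking classifiers by F* gives the same order as ranking them by F.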

[39]  arXiv:2008.00104 [pdf, other]
Title: Optimizing Long-term Social Welfare in Recommender Systems: A Constrained Matching Approach
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Information Retrieval (cs.IR); Machine Learning (stat.ML)

Most recommender systems (RS) research assumes that a user's utility can be maximized independently of the utility of the other agents (e.g., other users, content providers). In realistic settings, this is often not true---the dynamics of an RS ecosystem couple the long-term utility of all agents. In this work, we explore settings in which content providers cannot remain viable unless they receive a certain level of user engagement. We formulate the recommendation problem in this setting as one of equilibrium selection in the induced dynamical system, and show that it can be solved as an optimal constrained matching problem. Our model ensures the system reaches an equilibrium with maximal social welfare supported by a sufficiently diverse set of viable providers. We demonstrate that even in a simple, stylized dynamical RS model, the standard myopic approach to recommendation---always matching a user to the best provider---performs poorly. We develop several scalable techniques to solve the matching problem, and also draw connections to various notions of user regret and fairness, arguing that these outcomes are fairer in a utilitarian sense.

[40]  arXiv:2008.00106 [pdf, other]
Title: Utilising Visual Attention Cues for Vehicle Detection and Tracking
Comments: Accepted in ICPR2020
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Advanced Driver-Assistance Systems (ADAS) have been attracting attention from many researchers. Vision-based sensors are the closest way to emulate human driver visual behavior while driving. In this paper, we explore possible ways to use visual attention (saliency) for object detection and tracking. We investigate: 1) How a visual attention map such as a subjectness attention or saliency map and an objectness attention map can facilitate region proposal generation in a 2-stage object detector; 2) How a visual attention map can be used for tracking multiple objects. We propose a neural network that can simultaneously detect objects and generate objectness and subjectness maps to save computational power. We further exploit the visual attention map during tracking using a sequential Monte Carlo probability hypothesis density (PHD) filter. The experiments are conducted on the KITTI and DETRAC datasets. The use of visual attention and hierarchical features yields a considerable improvement of approximately 8% in object detection, which in turn increases tracking performance by approximately 4% on the KITTI dataset.

[41]  arXiv:2008.00109 [pdf, other]
Title: Characterization of Assistive Robot Arm Teleoperation: A Preliminary Study to Inform Shared Control
Comments: 10 pages, 7 figures, 3 tables
Subjects: Robotics (cs.RO)

Assistive robotic devices can increase the independence of individuals with motor impairments. However, each person is unique in their level of injury, preferences, and skills, which moreover can change over time. Further, the amount of assistance required can vary throughout the day due to pain or fatigue, or over longer periods due to rehabilitation, debilitating conditions, or aging. Therefore, in order to become an effective team member, the assistive machine should be able to learn from and adapt to the human user. To do so, we need to be able to characterize the user's control commands to determine when and how autonomy should change to best assist the user. We perform a 20-person pilot study in order to establish a set of meaningful performance measures which can be used to characterize the user's control signals and as cues for the autonomy to modify the level and amount of assistance. Our study includes 8 spinal cord injured and 12 uninjured individuals. The results unveil a set of objective, runtime-computable metrics that are correlated with user-perceived task difficulty, and thus could be used by an autonomy system when deciding whether assistance is required. The results further show that metrics which evaluate the user interaction with the robotic device, robot execution, and the perceived task difficulty differ between the spinal cord injured and uninjured groups, and are affected by the type of control interface used. The results will be used to develop adaptable, user-centered, and individually customized shared-control algorithms.

[42]  arXiv:2008.00113 [pdf, other]
Title: Multi-officer Routing for Patrolling High Risk Areas Jointly Learned from Check-ins, Crime and Incident Response Data
Comments: 21 pages, 7 figures
Subjects: Social and Information Networks (cs.SI); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

A well-crafted police patrol route design is vital in providing community safety and security in society. Previous works have largely focused on predicting crime events with historical crime data. The usage of large-scale mobility data collected from Location-Based Social Networks, or check-ins, and Point of Interest (POI) data for designing an effective police patrol is largely understudied. The presence of multiple police officers on duty in real-life situations makes the problem more complex to solve. In this paper, we formulate the dynamic crime patrol planning problem for multiple police officers using check-ins, crime, incident response data, and POI information. We propose a joint learning and non-random optimisation method for the representation of possible solutions, in which multiple police officers simultaneously patrol the high crime risk areas first rather than the low crime risk areas. Then, the meta-heuristic Genetic Algorithm (GA) and Cuckoo Search (CS) are implemented to find the optimal routes. The performance of the proposed solution is verified and compared with several state-of-the-art methods using real-world datasets.

[43]  arXiv:2008.00115 [pdf, other]
Title: DeepCOVIDNet: An Interpretable Deep Learning Model for Predictive Surveillance of COVID-19 Using Heterogeneous Features and their Interactions
Comments: 16 pages, 7 figures, 2 tables
Subjects: Computers and Society (cs.CY); Machine Learning (cs.LG)

In this paper, we propose a deep learning model to forecast the range of increase in COVID-19 infected cases in future days and we present a novel method to compute equidimensional representations of multivariate time series and multivariate spatial time series data. Using this novel method, the proposed model can both take in a large number of heterogeneous features, such as census data, intra-county mobility, inter-county mobility, social distancing data, past growth of infection, among others, and learn complex interactions between these features. Using data collected from various sources, we estimate the range of increase in infected cases seven days into the future for all U.S. counties. In addition, we use the model to identify the most influential features for prediction of the growth of infection. We also analyze pairs of features and estimate the amount of observed second-order interaction between them. Experiments show that the proposed model obtains satisfactory predictive performance and fairly interpretable feature analysis results; hence, the proposed model could complement the standard epidemiological models for national-level surveillance of pandemics, such as COVID-19. The results and findings obtained from the deep learning model could potentially inform policymakers and researchers in devising effective mitigation and response strategies. To fast-track further development and experimentation, the code used to implement the proposed model has been made fully open source.

[44]  arXiv:2008.00120 [pdf, other]
Title: The Tactician (extended version): A Seamless, Interactive Tactic Learner and Prover for Coq
Comments: 19 pages, 2 figures. This is an extended version of a paper published in CICM-2020. For the project website, see this https URL
Journal-ref: In CICM. volume 12236 of Lecture Notes in Computer Science, pages 271-277. Springer, 2020
Subjects: Artificial Intelligence (cs.AI); Logic in Computer Science (cs.LO)

We present Tactician, a tactic learner and prover for the Coq Proof Assistant. Tactician helps users make tactical proof decisions while they retain control over the general proof strategy. To this end, Tactician learns from previously written tactic scripts and gives users either suggestions about the next tactic to be executed or altogether takes over the burden of proof synthesis. Tactician's goal is to provide users with a seamless, interactive, and intuitive experience together with robust and adaptive proof automation. In this paper, we give an overview of Tactician from the user's point of view, regarding both day-to-day usage and issues of package dependency management while learning in the large. Finally, we give a peek into Tactician's implementation as a Coq plugin and machine learning platform.

[45]  arXiv:2008.00123 [pdf, other]
Title: Noise-response Analysis for Rapid Detection of Backdoors in Deep Neural Networks
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)

The pervasiveness of deep neural networks (DNNs) in technology, matched with the ubiquity of cloud-based training and transfer learning, is giving rise to a new frontier for cybersecurity whereby 'structural malware' is manifest as compromised weights and activation pathways for insecure DNNs. In particular, DNNs can be designed to have backdoors in which an adversary can easily and reliably fool a classifier by adding to any image a pattern of pixels called a trigger. Since DNNs are black-box algorithms, it is generally difficult to detect a backdoor or any other type of structural malware. To efficiently provide a reliable signal for the absence/presence of backdoors, we propose a rapid feature-generation step in which we study how DNNs respond to noise-infused images with varying noise intensity. This results in titration curves, which are a type of 'fingerprinting' for DNNs. We find that DNNs with backdoors are more sensitive to input noise and respond in a characteristic way that reveals the backdoor and where it leads (i.e., its target). Our empirical results demonstrate that we can accurately detect a backdoor with high confidence orders-of-magnitude faster than existing approaches (i.e., seconds versus hours). Our method also yields a titration-score that can automate the detection of compromised DNNs, whereas existing backdoor-detection strategies are not automated.

[46]  arXiv:2008.00128 [pdf, other]
Title: White-Box Evaluation of Fingerprint Recognition Systems
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Typical evaluations of fingerprint recognition systems consist of end-to-end black-box evaluations, which assess performance in terms of overall identification or authentication accuracy. However, these black-box tests of system performance do not reveal insights into the performance of the individual modules, including image acquisition, feature extraction, and matching. On the other hand, white-box evaluations, the topic of this paper, measure the individual performance of each constituent module in isolation. While a few studies have conducted white-box evaluations of the fingerprint reader, feature extractor, and matching components, no existing study has provided a full system, white-box analysis of the uncertainty introduced at each stage of a fingerprint recognition system. In this work, we extend previous white-box evaluations of fingerprint recognition system components and provide a unified, in-depth analysis of fingerprint recognition system performance based on the aggregated white-box evaluation results. In particular, we analyze the uncertainty introduced at each stage of the fingerprint recognition system due to adverse capture conditions (i.e., varying illumination, moisture, and pressure) at the time of acquisition. Our experiments show that a system that performs better overall, in terms of black-box recognition performance, does not necessarily perform best at each module in the fingerprint recognition system pipeline, which can only be seen with white-box analysis of each sub-module. Findings such as these enable researchers to better focus their efforts in improving fingerprint recognition systems.

[47]  arXiv:2008.00129 [pdf, other]
Title: GraphQL Live Querying with DynamoDB
Authors: Austin Silveria
Subjects: Databases (cs.DB)

We present a method of implementing GraphQL live queries at the database level. Our DynamoDB simulation in Go mimics a distributed key-value store and implements live queries to expose possible pitfalls. Two key components for implementing live queries are storing the fields selected in a live query and determining which object fields have been updated in each database write. A stream(key, fields) request to the system contains the fields to include in the live query stream. On subsequent put(key, object) operations, the database asynchronously determines which fields were updated and pushes a new query view to the stream if those fields overlap with the stream() request. Following a discussion of our implementation, we explore motivations for using live queries, such as simplifying software communication, minimizing data transfer, and enabling real-time data, and describe an architecture for building software with GraphQL and live queries.
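The stream/put interaction described in this abstract can be illustrated with a minimal sketch. It is written in Python rather than the authors' Go, runs synchronously rather than asynchronously, and the class name `LiveStore`, the listener callback, and the sentinel `_MISSING` are hypothetical names introduced here, not taken from the paper.

```python
from typing import Any, Callable, Dict, List, Set, Tuple

Listener = Callable[[str, Dict[str, Any]], None]
_MISSING = object()  # sentinel so that a newly added field counts as updated


class LiveStore:
    """Toy in-memory key-value store with field-level live queries (illustrative only)."""

    def __init__(self) -> None:
        self._objects: Dict[str, Dict[str, Any]] = {}
        # key -> list of (selected fields, callback) registered through stream()
        self._streams: Dict[str, List[Tuple[Set[str], Listener]]] = {}

    def stream(self, key: str, fields: Set[str], listener: Listener) -> None:
        """Register a live query on `key`, restricted to `fields`."""
        self._streams.setdefault(key, []).append((set(fields), listener))

    def put(self, key: str, obj: Dict[str, Any]) -> None:
        """Write `obj`, then notify every stream whose fields overlap the update."""
        old = self._objects.get(key, {})
        # A field is "updated" if it was added, removed, or changed value.
        updated = {
            f for f in old.keys() | obj.keys()
            if old.get(f, _MISSING) != obj.get(f, _MISSING)
        }
        self._objects[key] = dict(obj)
        for fields, listener in self._streams.get(key, []):
            if fields & updated:
                # Push a view holding only the fields the live query selected.
                listener(key, {f: obj[f] for f in fields if f in obj})
```

In this sketch, a stream registered on the field `name` is notified when a put changes `name`, and stays silent when a put changes only other fields of the same object.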

[48]  arXiv:2008.00135 [pdf]
Title: Dissipating with Relations: Implication for the Entity-Relationship Model
Authors: Sabah Al-Fedaghi
Comments: 10 pages, 20 figures
Journal-ref: International Journal of Computer Science and Information Security, Vol. 18, No. 6, June 2020
Subjects: Software Engineering (cs.SE)

Difficulties arise when conceptual modeling lacks ontological clarity and rigorous definitions, which is especially the case in the relationship construct. Evidence shows that use of relationships is often problematic when it comes to communicating the form of meaning of an application domain. Research on this topic is important because relationships are central to a number of approaches and commonly used by practitioners. In this paper, we study the notion of relation or relationship in the context of conceptual modeling. Specifically, we focus on the notion of relationship used in the entity-relationship (ER) model. The ER model is scrutinized through a new form of conceptual modeling called the thinging machine (TM) to pursue further understanding of the semantics of the relationship concept. The ER model is composed of three fundamental categories (i.e., entity, relationship and attribute), whereas TM is built from one ontological category called the thing/machine (thimac). Several ER diagrams are recast as TM diagrams, creating a categorical collision with interesting implications regarding the status of the conception of relationship in a conceptual model. The re-modeling shows that the relational construct is dissipated into TM flows of things and chronology of events.

[49]  arXiv:2008.00136 [pdf, other]
Title: BatNet: Data transmission between smartphones over ultrasound
Subjects: Cryptography and Security (cs.CR); Computers and Society (cs.CY); Networking and Internet Architecture (cs.NI); Sound (cs.SD)

In this paper, we present BatNet, a data transmission mechanism using ultrasound signals over the built-in speakers and microphones of smartphones. Using phase shift keying with an 8-point constellation and frequencies between 20 and 24 kHz, it can transmit data at over 600 bit/s up to 6 m. The target application is a censorship-resistant mesh network. We also evaluated it for Covid contact tracing but concluded that in this application ultrasonic communications do not appear to offer enough advantage over Bluetooth Low Energy to be worth further development.
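As a rough illustration of phase shift keying with an 8-point constellation, the sketch below maps each group of three bits to one of eight phases and recovers the bits by choosing the nearest phase. It works at baseband only; the natural binary labelling (as opposed to Gray coding), the function names, and the absence of any carrier, framing, or error correction are assumptions made here and are not taken from BatNet.

```python
import cmath
import math
from typing import List

BITS_PER_SYMBOL = 3            # log2(8) for an 8-point constellation
POINTS = 1 << BITS_PER_SYMBOL  # 8 equally spaced phases on the unit circle


def modulate(bits: List[int]) -> List[complex]:
    """Map each group of 3 bits to a unit-magnitude phasor (natural binary labelling)."""
    if len(bits) % BITS_PER_SYMBOL:
        raise ValueError("bit count must be a multiple of 3")
    symbols = []
    for i in range(0, len(bits), BITS_PER_SYMBOL):
        index = bits[i] << 2 | bits[i + 1] << 1 | bits[i + 2]
        symbols.append(cmath.exp(2j * math.pi * index / POINTS))
    return symbols


def demodulate(symbols: List[complex]) -> List[int]:
    """Decide each symbol by rounding its phase to the nearest constellation point."""
    bits = []
    for s in symbols:
        index = round(cmath.phase(s) / (2 * math.pi / POINTS)) % POINTS
        bits.extend([(index >> 2) & 1, (index >> 1) & 1, index & 1])
    return bits
```

Because adjacent constellation points are 45 degrees apart, this decision rule tolerates a phase error of a little under 22.5 degrees per symbol before a symbol is misread.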

[50]  arXiv:2008.00137 [pdf, other]
Title: MementoEmbed and Raintale for Web Archive Storytelling
Comments: 54 pages, 5 tables, 46 figures
Subjects: Digital Libraries (cs.DL); Human-Computer Interaction (cs.HC); Information Retrieval (cs.IR)

For traditional library collections, archivists can select a representative sample from a collection and display it in a featured physical or digital library space. Web archive collections may consist of thousands of archived pages, or mementos. How should an archivist display this sample to drive visitors to their collection? Search engines and social media platforms often represent web pages as cards consisting of text snippets, titles, and images. Web storytelling is a popular method for grouping these cards in order to summarize a topic. Unfortunately, social media platforms are not archive-aware and fail to consistently create a good experience for mementos. They also allow no UI alterations for their cards. Thus, we created MementoEmbed to generate cards for individual mementos and Raintale for creating entire stories that archivists can export to a variety of formats.

[51]  arXiv:2008.00138 [pdf, other]
Title: Vulnerability Under Adversarial Machine Learning: Bias or Variance?
Comments: 18 pages
Subjects: Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)

Prior studies have unveiled the vulnerability of deep neural networks in the context of adversarial machine learning, which has recently drawn considerable attention to this area. One interesting question that has yet to be fully explored is the bias-variance relationship of adversarial machine learning, which can potentially provide deeper insights into this behaviour. The notion of bias and variance is one of the main approaches to analyze and evaluate the generalization and reliability of a machine learning model. Although it has been extensively used in other machine learning models, it is not well explored in the field of deep learning and it is even less explored in the area of adversarial machine learning.
In this study, we investigate the effect of adversarial machine learning on the bias and variance of a trained deep neural network and analyze how adversarial perturbations can affect the generalization of a network. We derive the bias-variance trade-off for both classification and regression applications based on two main loss functions: (i) mean squared error (MSE), and (ii) cross-entropy. Furthermore, we perform quantitative analysis with both simulated and real data to empirically evaluate consistency with the derived bias-variance trade-offs. Our analysis sheds light on why deep neural networks perform poorly under adversarial perturbation from a bias-variance point of view and how this type of perturbation would change the performance of a network. Moreover, given these new theoretical findings, we introduce a new adversarial machine learning algorithm with lower computational complexity than well-known adversarial machine learning strategies (e.g., PGD) while providing a high success rate in fooling deep neural networks at lower perturbation magnitudes.
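For orientation, the classical bias-variance identity for squared error can be checked numerically. The sketch below is the textbook noise-free decomposition for a single target value, not the adversarial or cross-entropy derivation of the paper; the function name and the sample numbers are illustrative assumptions.

```python
from statistics import fmean, pvariance
from typing import Sequence, Tuple


def decompose_mse(predictions: Sequence[float], target: float) -> Tuple[float, float, float]:
    """Return (mse, bias_squared, variance) for repeated predictions of one target.

    `predictions` stands for the outputs of models trained on different
    (hypothetical) training sets and evaluated at the same input.
    """
    mse = fmean((p - target) ** 2 for p in predictions)
    bias_squared = (fmean(predictions) - target) ** 2
    variance = pvariance(predictions)  # population variance around the mean prediction
    return mse, bias_squared, variance


mse, bias_squared, variance = decompose_mse([0.9, 1.1, 1.4, 1.0], target=1.0)
# With a noise-free target, mse equals bias_squared + variance up to rounding.
```

With a noisy target, an irreducible noise term is added to the right-hand side of the identity.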

[52]  arXiv:2008.00139 [pdf, other]
Title: SHARI -- An Integration of Tools to Visualize the Story of the Day
Comments: 19 pages, 16 figures, 1 Table
Subjects: Digital Libraries (cs.DL); Human-Computer Interaction (cs.HC); Information Retrieval (cs.IR)

Tools such as Google News and Flipboard exist to convey daily news, but what about the past? In this paper, we describe how to combine several existing tools with web archive holdings to perform news analysis and visualization of the "biggest story" for a given date. StoryGraph clusters news articles together to identify a common news story. Hypercane leverages ArchiveNow to store URLs produced by StoryGraph in web archives. Hypercane analyzes these URLs to identify the most common terms, entities, and highest quality images for social media storytelling. Raintale then uses the output of these tools to produce a visualization of the news story for a given day. We name this process SHARI (StoryGraph Hypercane ArchiveNow Raintale Integration).

[53]  arXiv:2008.00140 [pdf, other]
Title: Experiments in Extractive Summarization: Integer Linear Programming, Term/Sentence Scoring, and Title-driven Models
Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL)

In this paper, we revisit the challenging problem of unsupervised single-document summarization and study the following aspects: integer linear programming (ILP) based algorithms, parameterized normalization of term and sentence scores, and title-driven approaches for summarization. We describe a new framework, NewsSumm, that includes many existing and new approaches for summarization, including ILP and title-driven approaches. NewsSumm's flexibility allows different algorithms and sentence scoring schemes to be combined seamlessly. Our results combining sentence scoring with ILP and normalization are in contrast to previous work on this topic, showing the importance of a broader search for optimal parameters. We also show that the new title-driven reduction idea leads to improvement in performance for both unsupervised and supervised approaches considered.

[54]  arXiv:2008.00141 [pdf, other]
Title: Actor-Action Video Classification CSC 249/449 Spring 2020 Challenge Report
Subjects: Computer Vision and Pattern Recognition (cs.CV)

This technical report summarizes and compiles the submissions to the Actor-Action video classification challenge, held as the final project of the CSC 249/449 Machine Vision course (Spring 2020) at the University of Rochester.

[55]  arXiv:2008.00142 [pdf, other]
Title: Bayesian-Assisted Inference from Visualized Data
Subjects: Human-Computer Interaction (cs.HC); Methodology (stat.ME)

A Bayesian view of data interpretation suggests that a visualization user should update their existing beliefs about a parameter's value in accordance with the amount of information about the parameter value captured by the new observations. Extending recent work applying Bayesian models to understand and evaluate belief updating from visualizations, we show how the predictions of Bayesian inference can be used to guide more rational belief updating. We design a Bayesian inference-assisted uncertainty analogy that numerically relates uncertainty in observed data to the user's subjective uncertainty, and a posterior visualization that prescribes how a user should update their beliefs given their prior beliefs and the observed data. In a pre-registered experiment on 4,800 people, we find that when a newly observed data sample is relatively small (N=158), both techniques reliably improve people's Bayesian updating on average compared to the current best practice of visualizing uncertainty in the observed data. For large data samples (N=5208), where people's updated beliefs tend to deviate more strongly from the prescriptions of a Bayesian model, we find evidence that the effectiveness of the two forms of Bayesian assistance may depend on people's proclivity toward trusting the source of the data. We discuss how our results provide insight into individual processes of belief updating and subjective uncertainty, and how understanding these aspects of interpretation paves the way for more sophisticated interactive visualizations for analysis and communication.
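The idea that a larger sample should move beliefs further can be illustrated with a conjugate Beta-Binomial update. This is a generic sketch, not the model used in the study; the prior Beta(8, 2) and the observed proportions are hypothetical, and only the two sample sizes are borrowed from the abstract.

```python
from typing import Tuple


def beta_binomial_update(prior_a: float, prior_b: float,
                         successes: int, trials: int) -> Tuple[float, float]:
    """Return the posterior Beta parameters after observing `successes` out of `trials`."""
    if not 0 <= successes <= trials:
        raise ValueError("successes must lie between 0 and trials")
    return prior_a + successes, prior_b + (trials - successes)


def beta_mean(a: float, b: float) -> float:
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


# Hypothetical prior belief centred on 0.8; both samples show a proportion of 0.5.
small = beta_binomial_update(8, 2, successes=79, trials=158)
large = beta_binomial_update(8, 2, successes=2604, trials=5208)
```

Under these assumed numbers the posterior mean is about 0.52 after the small sample and about 0.50 after the large one, so the prescribed belief sits closer to the data as the sample grows.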

[56]  arXiv:2008.00143 [pdf]
Title: Efficient Independent Vector Extraction of Dominant Target Speech
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)

The complete decomposition performed by blind source separation is computationally demanding and superfluous when only the speech of one specific target speaker is desired. In this paper, we propose a computationally efficient blind speech extraction method based on a proper modification of the commonly utilized independent vector analysis algorithm, under the mild assumption that the average power of the signal of interest outweighs that of the interfering speech sources. Considering that the minimum distortion principle cannot be implemented since the full demixing matrix is not available, we also design a one-unit scaling operation to solve the scaling ambiguity. Simulations validate the efficacy of the proposed method in extracting the dominant speech.

[57]  arXiv:2008.00144 [pdf, other]
Title: Solving Elliptic Equations with Brownian Motion: Bias Reduction and Temporal Difference Learning
Comments: 18 pages, 6 figures
Subjects: Numerical Analysis (math.NA); Probability (math.PR)

The Feynman-Kac formula provides a way to understand solutions to elliptic partial differential equations in terms of expectations of continuous time Markov processes. This connection allows for the creation of numerical schemes for solutions based on samples of these Markov processes which have advantages over traditional numerical methods in some cases. However, naïve numerical implementations suffer from statistical bias and sampling error. We present methods to discretize the stochastic process appearing in the Feynman-Kac formula that reduce the bias of the numerical scheme. We also propose using temporal difference learning to assemble information from random samples in a way that is more efficient than the traditional Monte Carlo method.
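A minimal sketch of the kind of naive scheme this abstract refers to: for the Laplace equation on the unit disk, the solution at a point is the expected boundary value at the place where a Brownian path started there first leaves the disk. The code below uses fixed Euler time steps and plain Monte Carlo averaging; the choice of domain, the projection of overshooting paths back onto the circle, and all names and parameters are assumptions made for illustration, and none of the paper's bias-reduction or temporal difference methods are shown.

```python
import math
import random
from typing import Callable


def estimate_harmonic(x0: float, y0: float,
                      boundary_value: Callable[[float, float], float],
                      samples: int = 2000, dt: float = 0.01, seed: int = 0) -> float:
    """Estimate u(x0, y0) for Laplace(u) = 0 in the unit disk with u = boundary_value on the circle."""
    rng = random.Random(seed)
    step = math.sqrt(dt)  # standard deviation of each Brownian increment
    total = 0.0
    for _ in range(samples):
        x, y = x0, y0
        # Naive discretisation: take Gaussian steps until the path leaves the disk.
        while x * x + y * y < 1.0:
            x += step * rng.gauss(0.0, 1.0)
            y += step * rng.gauss(0.0, 1.0)
        # The path overshoots the boundary; projecting it back is one source of bias.
        r = math.hypot(x, y)
        total += boundary_value(x / r, y / r)
    return total / samples


# u(x, y) = x is harmonic, so the exact value at (0.5, 0) is 0.5.
approx = estimate_harmonic(0.5, 0.0, lambda x, y: x)
```

The estimate carries both sampling error, which shrinks with more samples, and discretisation bias, which shrinks with smaller `dt`.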

[58]  arXiv:2008.00146 [pdf, ps, other]
Title: CROSSLINE: Breaking "Security-by-Crash" based Memory Isolation in AMD SEV
Comments: 14 pages, 5 figures, security
Subjects: Cryptography and Security (cs.CR)

AMD's Secure Encrypted Virtualization (SEV) is an emerging security feature on AMD processors that allows virtual machines to run on encrypted memory and perform confidential computing even with an untrusted hypervisor. This paper first demystifies SEV's improper use of the address space identifier (ASID) for controlling accesses of a VM to encrypted memory pages, cache lines, and TLB entries. We then present the CROSSLINE attacks, a novel class of attacks against SEV that allow the adversary to launch an attacker VM and change its ASID to that of the victim VM to impersonate the victim. We present two variants of CROSSLINE attacks: CROSSLINE V1 decrypts the victim's page tables or memory blocks following the format of a page table entry; CROSSLINE V2 constructs encryption and decryption oracles by executing instructions of the victim VM. We have successfully performed CROSSLINE attacks on SEV and SEV-ES processors.

[59]  arXiv:2008.00147 [pdf, other]
Title: Achieving Covertness and Secrecy: A New Paradigm for Secure Wireless Communication
Subjects: Information Theory (cs.IT)

This paper explores a novel secure wireless communication paradigm where the physical layer security technology is applied to counteract both detection and eavesdropping attacks, such that the critical covertness and secrecy properties of the communication are jointly guaranteed. To understand the fundamental security performance under this paradigm, we first define a new metric, the covert secrecy rate (CSR), to represent the maximum transmission rate subject to the constraints of both covertness and secrecy, and then provide theoretical modeling for covertness outage probability and secrecy outage probability to depict the covertness and secrecy performances of the paradigm. We further conduct detailed theoretical analysis to identify the CSR under various scenarios characterized by the detector-eavesdropper relationships and the secure transmission schemes adopted in transmitters. Finally, numerical results are provided to illustrate the achievable performances under the new paradigm.

[60]  arXiv:2008.00149 [pdf, other]
Title: Hybridization and postprocessing in finite element exterior calculus
Comments: 40 pages
Subjects: Numerical Analysis (math.NA)

We hybridize the methods of finite element exterior calculus for the Hodge-Laplace problem on differential $k$-forms in $\mathbb{R}^n$. In the cases $k = 0$ and $k = n$, we recover well-known primal and mixed hybrid methods for the scalar Poisson equation, while for $0 < k < n$, we obtain new hybrid finite element methods, including methods for the vector Poisson equation in $n = 2$ and $n = 3$ dimensions. We also generalize Stenberg postprocessing from $k = n$ to arbitrary $k$, proving new superconvergence estimates. Finally, we discuss how this hybridization framework may be extended to include nonconforming and hybridizable discontinuous Galerkin methods.

[61]  arXiv:2008.00150 [pdf]
Title: Cluster-Based Information Retrieval by using (K-means)- Hierarchical Parallel Genetic Algorithms Approach
Subjects: Artificial Intelligence (cs.AI); Information Retrieval (cs.IR)

Cluster-based information retrieval is one of the information retrieval (IR) tools that organize web documents, extract their features, and categorize them according to their similarity. Unlike traditional approaches, cluster-based IR is fast at processing large document datasets. To improve the quality of retrieved documents, increase the efficiency of IR, and reduce the number of irrelevant documents returned for a user search, in this paper we propose a (K-means)-Hierarchical Parallel Genetic Algorithms approach (HPGA) that combines the K-means clustering algorithm with a hybrid PG algorithm built from multi-deme and master/slave PG algorithms. K-means is used to cluster the population into k subpopulations; the clusters most relevant to the query are then processed in parallel by the two levels of genetic parallelism, so that irrelevant documents are not included in the subpopulations, which improves the quality of the results. Three common datasets (NLP, CISI, and CACM) are used to compute the recall, precision, and F-measure averages. Finally, we compare the precision values on the three datasets with Genetic-IR and Classic-IR. The precision improvements of the proposed approach over IR-GA were 45% on CACM, 27% on CISI, and 25% on NLP. In comparison with Classic-IR, (K-means)-HPGA obtained 47% on CACM, 28% on CISI, and 34% on NLP.

[62]  arXiv:2008.00151 [pdf, other]
Title: A Visual Analytics Framework for Contrastive Network Analysis
Comments: This manuscript is currently under review
Subjects: Social and Information Networks (cs.SI); Graphics (cs.GR); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG)

A common network analysis task is comparison of two networks to identify unique characteristics in one network with respect to the other. For example, when comparing protein interaction networks derived from normal and cancer tissues, one essential task is to discover protein-protein interactions unique to cancer tissues. However, this task is challenging when the networks contain complex structural (and semantic) relations. To address this problem, we design ContraNA, a visual analytics framework leveraging both the power of machine learning for uncovering unique characteristics in networks and also the effectiveness of visualization for understanding such uniqueness. The basis of ContraNA is cNRL, which integrates two machine learning schemes, network representation learning (NRL) and contrastive learning (CL), to generate a low-dimensional embedding that reveals the uniqueness of one network when compared to another. ContraNA provides an interactive visualization interface to help analyze the uniqueness by relating embedding results and network structures as well as explaining the learned features by cNRL. We demonstrate the usefulness of ContraNA with two case studies using real-world datasets. We also evaluate ContraNA through a controlled user study with 12 participants on network comparison tasks. The results show that participants were able to both effectively identify unique characteristics from complex networks and interpret the results obtained from cNRL.

[63]  arXiv:2008.00152 [pdf, other]
Title: Cyber-Resilient Transactive Energy System Design over Insecure Communication Links
Comments: 11 pages, 8 figures, journal submission
Subjects: Cryptography and Security (cs.CR); Systems and Control (eess.SY)

In this paper, the privacy and security issues associated with transactive energy systems over insecure communications are addressed. In particular, it is ensured that, during market-based interactions: (1) each agent's bidding information remains private; and (2) any extraneous data injection attack can be easily detected. A unified cryptography-based approach that can simultaneously achieve both objectives is developed, where privacy preservation is realized by the Paillier encryption scheme, and attack detection is achieved by the Paillier digital signature scheme. Simulation results verify the effectiveness of the proposed cyber-resilient design for transactive energy systems.
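To make the role of Paillier encryption concrete, the sketch below implements the textbook scheme with toy parameters and shows its additive property: multiplying two ciphertexts yields an encryption of the sum of the plaintexts. The tiny primes, the generator choice g = n + 1, and the reading of the plaintexts as bid quantities are assumptions made here for illustration; the paper's protocol and its signature scheme are not reproduced, and keys of this size offer no security.

```python
import math
import random
from typing import Tuple

PrivateKey = Tuple[int, int]  # (lambda, mu)


def keygen(p: int, q: int) -> Tuple[int, PrivateKey]:
    """Build a textbook Paillier key pair from two distinct primes (toy sizes only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    # With g = n + 1, L(g^lam mod n^2) equals lam mod n, so mu is its inverse mod n.
    mu = pow(lam, -1, n)  # modular inverse; needs Python 3.8 or later
    return n, (lam, mu)


def encrypt(n: int, message: int, rng: random.Random) -> int:
    """Encrypt 0 <= message < n as (n + 1)^message * r^n mod n^2."""
    n_squared = n * n
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, message, n_squared) * pow(r, n, n_squared)) % n_squared


def decrypt(n: int, private_key: PrivateKey, ciphertext: int) -> int:
    """Recover the message as L(c^lambda mod n^2) * mu mod n, where L(x) = (x - 1) // n."""
    lam, mu = private_key
    l_value = (pow(ciphertext, lam, n * n) - 1) // n
    return (l_value * mu) % n


def add_encrypted(n: int, c1: int, c2: int) -> int:
    """Combine two ciphertexts so that the result decrypts to the sum of the plaintexts."""
    return (c1 * c2) % (n * n)
```

For example, encrypting the hypothetical quantities 120 and 305 under the same public key and combining the ciphertexts with `add_encrypted` gives a ciphertext that decrypts to 425, without either value being revealed to the party doing the addition.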

[64]  arXiv:2008.00155 [pdf, other]
Title: Step-truncation integrators for evolution equations on low-rank tensor manifolds
Comments: 20 pages, 9 figures
Subjects: Numerical Analysis (math.NA); Computational Physics (physics.comp-ph)

We develop a new class of algorithms, which we call step-truncation methods, to integrate in time an initial value problem for an ODE or a PDE on a low-rank tensor manifold. The new methods are based on performing a time step with a conventional time-stepping scheme followed by a truncation operation into a tensor manifold with prescribed rank. By considering such a truncation operation as a nonlinear operator in the space of tensors, we prove various consistency results and error estimates for a wide range of step-truncation algorithms. In particular, we establish consistency between the best step-truncation method and the best tangent space projection integrator via perturbation analysis. Numerical applications are presented and discussed for a Fokker-Planck equation on a torus of dimension two and four.

[65]  arXiv:2008.00156 [pdf, other]
Title: MIPS: Instance Placement for Stream Processing Systems based on Monte Carlo Tree Search
Subjects: Networking and Internet Architecture (cs.NI)

Stream processing engines enable modern systems to conduct large-scale analytics over unbounded data streams in real time. They often view an application as a directed acyclic graph with streams flowing through pipelined instances of various processing units. One key challenge that emerges is instance placement, i.e., deciding the placement of instances across servers with minimum traffic across servers and maximum resource utilization. The challenge is rooted not only in its intrinsic complexity but also in the impact between successive application deployments. The most up-to-date engines, such as Apache Heron, exploit a more modularized scheduler design that decomposes the task into two stages: one decides the instance-to-container mapping, while the other focuses on the container-to-server mapping, which is delegated to standalone resource managers. The unaligned objectives and scheduler designs in the two stages may lead to long response times or low utilization. However, little work has so far addressed this challenge. Inspired by the recent success of Monte Carlo Tree Search (MCTS) methods in various fields, we develop a novel model to characterize such systems, formulate the problem, and cast each stage of mapping into a sequential decision process. By adopting MCTS methods, we propose MIPS, an MCTS-based Instance Placement Scheme that decides the two-staged mapping in a timely yet efficient manner. In addition, we discuss practical issues and refine MIPS to further improve its performance. Results from extensive simulations show that, given a modest number of samples, MIPS outperforms existing schemes with a significant traffic reduction and utilization improvement. To the best of our knowledge, this paper is the first to study the two-staged mapping problem and to apply MCTS to solving it.

[66]  arXiv:2008.00157 [pdf, other]
Title: L-CNN: A Lattice cross-fusion strategy for multistream convolutional neural networks
Comments: 5 pages, 3 figures
Journal-ref: Electronics Letters, vol. 55, no. 22, pp. 1180-1182, 2019
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

This paper proposes a fusion strategy for multistream convolutional networks, the Lattice Cross Fusion. This approach crosses signals from convolution layers, performing mathematical operation-based fusions right before pooling layers. Results on a purposely worsened CIFAR-10, a popular image classification data set, with a modified AlexNet-LCNN version show that this novel method outperforms the baseline single-stream network by 46%, with faster convergence, stability, and robustness.

[67]  arXiv:2008.00158 [pdf, ps, other]
Title: TexMesh: Reconstructing Detailed Human Texture and Geometry from RGB-D Video
Comments: 22 pages, 16 figures, to be published in ECCV 2020
Subjects: Computer Vision and Pattern Recognition (cs.CV)

We present TexMesh, a novel approach to reconstruct detailed human meshes with high-resolution full-body texture from RGB-D video. TexMesh enables high quality free-viewpoint rendering of humans. Given the RGB frames, the captured environment map, and the coarse per-frame human mesh from RGB-D tracking, our method reconstructs spatiotemporally consistent and detailed per-frame meshes along with a high-resolution albedo texture. By using the incident illumination we are able to accurately estimate local surface geometry and albedo, which allows us to further use photometric constraints to adapt a synthetically trained model to real-world sequences in a self-supervised manner for detailed surface geometry and high-resolution texture estimation. In practice, we train our models on a short example sequence for self-adaptation and the model runs at interactive framerate afterwards. We validate TexMesh on synthetic and real-world data, and show it outperforms the state of the art quantitatively and qualitatively.

[68]  arXiv:2008.00159 [pdf, ps, other]
Title: POTUS: Predictive Online Tuple Scheduling for Data Stream Processing Systems
Subjects: Networking and Internet Architecture (cs.NI)

Most online service providers deploy their own data stream processing systems in the cloud to conduct large-scale and real-time data analytics. However, such systems, e.g., Apache Heron, often adopt naive scheduling schemes to distribute data streams (in the units of tuples) among processing instances, which may result in workload imbalance and system disruption. Hence, there still exists a mismatch between the temporal variations of data streams and such inflexible scheduling scheme designs. Besides, the fundamental benefits of predictive scheduling to data stream processing systems also remain unexplored. In this paper, we focus on the problem of tuple scheduling with predictive service in Apache Heron. With a careful choice in the granularity of system modeling and decision making, we formulate the problem as a stochastic network optimization problem and propose POTUS, an online predictive scheduling scheme that aims to minimize the response time of data stream processing by steering data streams in a distributed fashion. Theoretical analysis and simulation results show that POTUS achieves an ultra-low response time with a queue stability guarantee. Moreover, POTUS requires only a modest amount of future information to effectively reduce the response time, even with mis-prediction.

[69]  arXiv:2008.00161 [pdf, other]
Title: Online User-AP Association with Predictive Scheduling in Wireless Caching Networks
Subjects: Networking and Internet Architecture (cs.NI)

For wireless caching networks, the scheme design for content delivery is non-trivial in the face of the following tradeoff. On one hand, to optimize overall throughput, users can associate with nearby APs that have high channel capacities; however, this may lead to unstable queue backlogs on APs and prolong request delays. On the other hand, to ensure queue stability, some users may have to associate with APs that have inferior channel states, which would incur throughput loss. Moreover, for such systems, how to conduct predictive scheduling to reduce delays and the fundamental limits of its benefits remain unexplored. In this paper, we formulate the problem of online user-AP association and resource allocation for content delivery with predictive scheduling under a fixed content placement as a stochastic network optimization problem. By exploiting its unique structure, we transform the problem into a series of modular maximization sub-problems with matroid constraints. Then we devise PUARA, a Predictive User-AP Association and Resource Allocation scheme which achieves a provably near-optimal throughput with queue stability. Our theoretical analysis and simulation results show that PUARA can not only perform a tunable control between throughput maximization and queue stability but also incur a notable delay reduction with predicted information.

[70]  arXiv:2008.00164 [pdf, other]
Title: Byzantine-Resilient Distributed Hypothesis Testing With Time-Varying Network Topology
Subjects: Systems and Control (eess.SY)

We study the problem of distributed hypothesis testing over a network of mobile agents. Each agent follows a planned trajectory and makes noisy local observations whose distribution is conditioned on the unknown true hypothesis (out of a finite set of candidate hypotheses) and the agent's current location. Due to the limited communication and sensing ranges, the mobile agent team induces a communication graph with a time-varying topology and needs to collaboratively detect the true hypothesis. In particular, we consider a scenario where there exists an unknown subset of compromised agents that may deliberately share altered information to undermine the team objective. We propose two distributed algorithms where each agent maintains and updates two sets of beliefs, namely local and actual beliefs. In both algorithms, at every time step, each agent shares its actual belief with other agents within its communication range, makes a local observation, and updates its local belief as a function of its local observation and local belief. Then both algorithms can use the shared information to update actual beliefs under certain conditions. One requires receiving a certain number of shared beliefs at each time instant; the other accumulates shared beliefs over time and updates after the number of shared beliefs exceeds a prescribed threshold. Otherwise, both algorithms rely on the agent's current local belief and actual beliefs to update the new actual belief. We prove under mild assumptions that the actual belief for every non-compromised agent converges almost surely to the true hypothesis. We guarantee this convergence without requiring that the underlying time-varying network topology is connected. We illustrate and compare the proposed algorithms with a simulation of a team of unmanned aerial vehicles aiming to classify adversarial agents among themselves.
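The local belief update mentioned in this abstract can be pictured with a plain Bayesian step over a finite hypothesis set. Treating the update as Bayes' rule is an assumption made here; the hypothesis labels and likelihood values are hypothetical, and the paper's handling of shared beliefs and compromised agents is not shown.

```python
from typing import Dict


def update_local_belief(belief: Dict[str, float],
                        likelihoods: Dict[str, float]) -> Dict[str, float]:
    """Return the posterior over hypotheses after one local observation.

    `likelihoods[h]` is the probability of the observation if hypothesis h
    were true (at the agent's current location).
    """
    unnormalised = {h: belief[h] * likelihoods[h] for h in belief}
    total = sum(unnormalised.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under every hypothesis")
    return {h: value / total for h, value in unnormalised.items()}


# Hypothetical two-hypothesis example: the observation is four times likelier under "A".
belief = {"A": 0.5, "B": 0.5}
for _ in range(5):
    belief = update_local_belief(belief, {"A": 0.8, "B": 0.2})
```

Repeating the update with observations that favour one hypothesis concentrates the local belief on it, which is the behaviour the convergence result formalises for non-compromised agents.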

[71]  arXiv:2008.00168 [pdf]
Title: Land Cover Classification from Remote Sensing Images Based on Multi-Scale Fully Convolutional Network
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)

In this paper, a Multi-Scale Fully Convolutional Network (MSFCN) with multi-scale convolutional kernel is proposed to exploit discriminative representations from two-dimensional (2D) satellite images.

[72]  arXiv:2008.00170 [pdf]
Title: Impact and Implementation of Reserved Lanes for Automated Driving on Signalized Urban Arterials
Authors: Slobodan Gutesa
Comments: 19 pages, 16 figures
Subjects: Systems and Control (eess.SY)

An automated vehicle refers to a vehicle that can move safely on a roadway facility without the influence of a human driver. With the emerging trend of the connected vehicle concept over the past decade, numerous state-of-the-art applications focusing on automated vehicle-based intersection control have been proposed. The main purpose of this study is to estimate and evaluate the impact of designated lanes for automated vehicles and to recommend viable lane configuration scenarios for signalized urban arterials. Automated driving was simulated in PTV Vissim using a trajectory-driven control strategy. The concept evaluation through microsimulation reveals significant mobility improvements compared to an operational scenario without lane reservation. Findings imply that, for the signalized corridors observed in this study, total travel time reductions range from 5.1% to 19.4% depending on C/AV market penetration and test-bed configuration parameters.

[73]  arXiv:2008.00171 [pdf, other]
Title: DeACT: Architecture-Aware Virtual Memory Support for Fabric Attached Memory Systems
Subjects: Hardware Architecture (cs.AR)

The exponential growth of data has driven technology providers to develop new protocols, such as cache coherent interconnects and memory semantic fabrics, to help users and facilities leverage advances in memory technologies to satisfy these growing memory and storage demands. Using these new protocols, fabric-attached memories (FAM) can be directly attached to a system interconnect and be easily integrated with a variety of processing elements (PEs). Moreover, systems that support FAM can be smoothly upgraded and allow multiple PEs to share the FAM memory pools using well-defined protocols. The sharing of FAM between PEs allows efficient data sharing, improves memory utilization, reduces cost by allowing flexible integration of different PEs and memory modules from several vendors, and makes it easier to upgrade the system. One promising use-case for FAMs is in High-Performance Compute (HPC) systems, where the underutilization of memory is a major challenge. However, adopting FAMs in HPC systems brings new challenges. In addition to cost, flexibility, and efficiency, one particular problem that requires rethinking is virtual memory support for security and performance. To address these challenges, this paper presents decoupled access control and address translation (DeACT), a novel virtual memory implementation that supports HPC systems equipped with FAM. Compared to the state-of-the-art two-level translation approach, DeACT achieves speedup of up to 4.59x (1.8x on average) without compromising security.

[74]  arXiv:2008.00175 [pdf, ps, other]
Title: State-of-The-Art Fuzzy Active Contour Models for Image Segmentation
Journal-ref: Soft Computing, 1-17 (2020)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)

Image segmentation is the initial step for every image analysis task. A large variety of segmentation algorithms has been proposed in the literature over several decades, with mixed success. Among them, fuzzy energy based active contour models have attracted the attention of researchers during the last decade, which has resulted in the development of various methods. A good segmentation algorithm should perform well on a large number of images containing noise, blur, low contrast, region inhomogeneity, etc. However, the performance of most existing fuzzy energy based active contour models has typically been evaluated on a limited number of images. In this article, our aim is to review the existing fuzzy active contour models from a theoretical point of view and also to evaluate them experimentally on a large set of images under various conditions. The analysis on a large variety of images provides objective insight into the strengths and weaknesses of various fuzzy active contour models. Finally, we discuss several issues and future research directions on this topic.

[75]  arXiv:2008.00176 [pdf, other]
Title: Custom Tailored Suite of Random Forests for Prefetcher Adaptation
Comments: 4 pages, 4 figures
Subjects: Hardware Architecture (cs.AR); Machine Learning (cs.LG); Performance (cs.PF)

To close the gap between memory and processors, and in turn improve performance, there has been an abundance of work in the area of data/instruction prefetcher designs. Prefetchers are deployed in each level of the memory hierarchy, but typically, each prefetcher gets designed without comprehensively accounting for other prefetchers in the system. As a result, these individual prefetcher designs do not always complement each other, and that leads to low average performance gains and/or many negative outliers. In this work, we propose SuitAP (Suite of random forests for Adaptation of Prefetcher system configuration), which is a hardware prefetcher adapter that uses a suite of random forests to determine at runtime which prefetcher should be ON at each memory level, such that they complement each other. Compared to a design with no prefetchers, using SuitAP we improve IPC by 46% on average across traces generated from SPEC2017 suite with 12KB overhead. Moreover, we also reduce negative outliers using SuitAP.

[76]  arXiv:2008.00177 [pdf, other]
Title: Multi-node Bert-pretraining: Cost-efficient Approach
Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL); Machine Learning (stat.ML)

Recently, large scale Transformer-based language models such as BERT, GPT-2, and XLNet have brought about exciting leaps in state-of-the-art results for many Natural Language Processing (NLP) tasks. One of the common trends in these recent models is a significant increase in model complexity, which introduces both more weights and computation. Moreover, with the advent of large-scale unsupervised datasets, training time is further extended due to the increased amount of data samples within a single training epoch. As a result, to train these models within a reasonable time, machine learning (ML) programmers often require advanced hardware setups such as the premium GPU-enabled NVIDIA DGX workstations or specialized accelerators such as Google's TPU Pods. Our work addresses this limitation and demonstrates that the BERT pre-trained model can be trained within 2 weeks on an academic-size cluster of widely available GPUs through careful algorithmic and software optimizations. In this paper, we present these optimizations on how to improve single device training throughput, distribute the training workload over multiple nodes and GPUs, and overcome the communication bottleneck introduced by the large data exchanges over the network. We show that we are able to perform pre-training on BERT within a reasonable time budget (12 days) in an academic setting, but with a much less expensive and less aggressive hardware resource requirement than in previously demonstrated industrial settings based on NVIDIA DGX machines or Google's TPU Pods.

[77]  arXiv:2008.00178 [pdf, other]
Title: Contrastive Explanations in Neural Networks
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)

Visual explanations are logical arguments based on visual features that justify the predictions made by neural networks. Current modes of visual explanations answer questions of the form $`Why \text{ } P?'$. These $Why$ questions operate under broad contexts thereby providing answers that are irrelevant in some cases. We propose to constrain these $Why$ questions based on some context $Q$ so that our explanations answer contrastive questions of the form $`Why \text{ } P, \text{} rather \text{ } than \text{ } Q?'$. In this paper, we formalize the structure of contrastive visual explanations for neural networks. We define contrast based on neural networks and propose a methodology to extract defined contrasts. We then use the extracted contrasts as a plug-in on top of existing $`Why \text{ } P?'$ techniques, specifically Grad-CAM. We demonstrate their value in analyzing both networks and data in applications of large-scale recognition, fine-grained recognition, subsurface seismic analysis, and image quality assessment.

[78]  arXiv:2008.00180 [pdf, ps, other]
Title: Correlated Data in Differential Privacy: Definition and Analysis
Subjects: Cryptography and Security (cs.CR)

Differential privacy is a rigorous mathematical framework for evaluating and protecting data privacy. In most existing studies, there is a vulnerable assumption that records in a dataset are independent when differential privacy is applied. However, in real-world datasets, records are likely to be correlated, which may lead to unexpected data leakage. In this survey, we investigate the issue of privacy loss due to data correlation under differential privacy models. Roughly, we classify existing literature into three lines: 1) using parameters to describe data correlation in differential privacy, 2) using models to describe data correlation in differential privacy, and 3) describing data correlation based on the framework of Pufferfish. First, a detailed example is given to illustrate the issue of privacy leakage on correlated data in real-world scenarios. Then our main work is to analyze and compare these methods and to evaluate the situations in which these diverse studies are applied. Finally, we propose some future challenges on correlated differential privacy.

[79]  arXiv:2008.00181 [pdf, other]
Title: Relation-aware Meta-learning for Market Segment Demand Prediction with Limited Records
Comments: First two authors contributed equally
Subjects: Machine Learning (cs.LG)

Recently, E-commerce platforms have had an extensive impact on human life. To provide an efficient platform, one of the most fundamental problems is how to balance demand and supply in market segments. While conventional machine learning models have achieved great success on data-sufficient segments, they may fail in a large portion of segments in E-commerce platforms, where there are not sufficient records to learn well-trained models. In this paper, we tackle this problem in the context of market segment demand prediction. The goal is to facilitate the learning process in the target segments, even when facing a shortage of related training data, by leveraging the learned knowledge from data-sufficient source segments. Specifically, we propose a novel algorithm, RMLDP, to incorporate a multi-pattern fusion network (MPFN) with a meta-learning paradigm. The multi-pattern fusion network considers both local and global temporal patterns for segment demand prediction. In the meta-learning paradigm, the transferable knowledge is regarded as the model parameter initializations of MPFN, which are learned from diverse source segments. Furthermore, we capture the segment relations by combining data-driven segment representation and segment knowledge graph representation and tailor the segment-specific relations to customize transferable model parameter initializations. Thus, even with limited data, the target segment can quickly find the most relevant transferred knowledge and adapt to the optimal parameters. Extensive experiments are conducted on two large-scale industrial datasets. The results show that our RMLDP outperforms a set of state-of-the-art baselines. In addition, RMLDP has also been deployed in Taobao, a real-world E-commerce platform. The online A/B testing results further demonstrate the practicality of RMLDP.

[80]  arXiv:2008.00188 [pdf, other]
Title: Augmented Skeleton Based Contrastive Action Learning with Momentum LSTM for Unsupervised Action Recognition
Comments: Our codes are available at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Action recognition via 3D skeleton data has emerged as an important topic in recent years. Most existing methods either extract hand-crafted descriptors or learn action representations by supervised learning paradigms that require massive labeled data. In this paper, we for the first time propose a contrastive action learning paradigm named AS-CAL that can leverage different augmentations of unlabeled skeleton data to learn action representations in an unsupervised manner. Specifically, we first propose to contrast similarity between augmented instances (query and key) of the input skeleton sequence, which are transformed by multiple novel augmentation strategies, to learn inherent action patterns ("pattern-invariance") of different skeleton transformations. Second, to encourage learning the pattern-invariance with more consistent action representations, we propose a momentum LSTM, which is implemented as the momentum-based moving average of the LSTM-based query encoder, to encode long-term action dynamics of the key sequence. Third, we introduce a queue to store the encoded keys, which allows our model to flexibly reuse preceding keys and build a more consistent dictionary to improve contrastive learning. Last, by temporally averaging the hidden states of actions learned by the query encoder, a novel representation named Contrastive Action Encoding (CAE) is proposed to represent human actions effectively. Extensive experiments show that our approach typically improves existing hand-crafted methods by 10-50% top-1 accuracy, and it can achieve comparable or even superior performance to numerous supervised learning methods.
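
The momentum-based moving average mentioned in the abstract can be illustrated with a minimal sketch. The abstract gives no implementation details, so the function name `momentum_update`, the coefficient `m`, and the representation of encoder parameters as flat lists of floats are all assumptions made for illustration; this is not the authors' code.

```python
def momentum_update(key_params, query_params, m=0.999):
    """Return key-encoder parameters moved slightly toward the query encoder.

    Assumed update rule (a common momentum-encoder form, not taken from the paper):
        key <- m * key + (1 - m) * query
    Both arguments are hypothetical flat lists of floats of equal length.
    """
    if len(key_params) != len(query_params):
        raise ValueError("parameter lists must have the same length")
    return [m * k + (1.0 - m) * q for k, q in zip(key_params, query_params)]


# Usage: with m close to 1 the key encoder changes slowly, which is what
# would keep previously encoded keys in a queue roughly consistent.
key = [0.0, 1.0]
query = [1.0, 1.0]
key = momentum_update(key, query, m=0.9)
```

A larger `m` gives a slower-moving key encoder; the value used in the paper is not stated in the abstract.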

[81]  arXiv:2008.00189 [pdf, ps, other]
Title: Performance Analysis of Intelligent Reflecting Surface Aided Communication Systems
Subjects: Information Theory (cs.IT)

This letter presents a detailed performance analysis of intelligent reflecting surface (IRS) aided single-input single-output communication systems, taking into account the direct link between the transmitter and receiver. A closed-form upper bound is derived for the ergodic capacity, and an accurate approximation is obtained for the outage probability. In addition, simplified expressions are presented in the asymptotic regime. Numerical results are provided to validate the correctness of the theoretical analysis. It is found that increasing the number of reflecting elements can significantly boost the ergodic capacity and outage probability performance, and a strong line-of-sight component is also beneficial. In addition, it is desirable to deploy the IRS close to the transmitter or receiver, rather than in the middle.

[82]  arXiv:2008.00190 [pdf, ps, other]
Title: Nearest Empirical Distribution: An Asymptotically Optimal Algorithm For Supervised Classification of Data Vectors with Independent Non-Identically Distributed Elements
Subjects: Machine Learning (cs.LG); Information Theory (cs.IT)

In this paper, we propose a classifier for supervised classification of data vectors with mutually independent but non-identically distributed elements. For the proposed classifier, we derive an upper bound on the error probability and show that the error probability goes to zero as the length of the data vectors grows, even when there is only one training data vector per label available. As a result, the proposed classifier is asymptotically optimal for this type of data vector. Our numerical examples show that the proposed classifier outperforms conventional classification algorithms when the number of training data is small and the length of the data vectors is sufficiently high.

[83]  arXiv:2008.00191 [pdf, other]
Title: Dynamic Legged Manipulation of a Ball Through Multi-Contact Optimization
Subjects: Robotics (cs.RO)

The feet of robots are typically used to design locomotion strategies, such as balancing, walking, and running. However, they also have great potential to perform manipulation tasks. In this paper, we propose a model predictive control (MPC) framework for a quadrupedal robot to dynamically balance on a ball and simultaneously manipulate it to follow various trajectories such as straight lines, sinusoids, circles and in-place turning. We numerically validate our controller on the Mini Cheetah robot using different gaits including trotting, bounding, and pronking on the ball.

[84]  arXiv:2008.00192 [pdf, other]
Title: PanoNet: Real-time Panoptic Segmentation through Position-Sensitive Feature Embedding
Subjects: Computer Vision and Pattern Recognition (cs.CV)

We propose a simple, fast, and flexible framework to generate simultaneously semantic and instance masks for panoptic segmentation. Our method, called PanoNet, incorporates a clean and natural structure design that tackles the problem purely as a segmentation task without the time-consuming detection process. We also introduce position-sensitive embedding for instance grouping by accounting for both object's appearance and its spatial location. Overall, PanoNet yields high panoptic quality results of high-resolution Cityscapes images in real-time, significantly faster than all other methods with comparable performance. Our approach well satisfies the practical speed and memory requirement for many applications like autonomous driving and augmented reality.

[85]  arXiv:2008.00196 [pdf, other]
Title: Data-Driven Bandit Learning for Proactive Cache Placement in Fog-Assisted IoT Systems
Subjects: Networking and Internet Architecture (cs.NI); Signal Processing (eess.SP)

In Fog-assisted IoT systems, it is a common practice to cache popular content at the network edge to achieve high quality of service. Due to uncertainties in practice such as unknown file popularities, cache placement scheme design is still an open problem with unresolved challenges: 1) how to maintain time-averaged storage costs under budgets, 2) how to incorporate online learning to aid cache placement to minimize performance loss (a.k.a. regret), and 3) how to exploit offline history information to further reduce regret. In this paper, we formulate the cache placement problem with unknown file popularities as a constrained combinatorial multi-armed bandit (CMAB) problem. To solve the problem, we employ virtual queue techniques to manage time-averaged constraints, and adopt data-driven bandit learning methods to integrate offline history information into online learning to handle exploration-exploitation tradeoff. With an effective combination of online control and data-driven online learning, we devise a Cache Placement scheme with Data-driven Bandit Learning called CPDBL. Our theoretical analysis and simulations show that CPDBL achieves a sublinear time-averaged regret under long-term storage cost constraints.
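
The virtual queue idea named in the abstract can be sketched as follows. This is a generic illustration of how a virtual queue tracks a time-averaged budget constraint, not the CPDBL scheme itself; the function names and the example cost sequence are hypothetical.

```python
def virtual_queue_update(q, cost, budget):
    """One step of a virtual queue: backlog grows when cost exceeds the budget."""
    return max(q + cost - budget, 0.0)


def run_virtual_queue(costs, budget):
    """Return the queue backlog after each time slot, starting from an empty queue."""
    q = 0.0
    history = []
    for cost in costs:
        q = virtual_queue_update(q, cost, budget)
        history.append(q)
    return history


# Usage: per-slot storage costs (hypothetical numbers) against a budget of 2 per slot.
costs = [3.0, 1.0, 0.0, 4.0]
backlog = run_virtual_queue(costs, budget=2.0)
```

Because the backlog satisfies Q(T) >= (sum of costs) - T * budget, the time-averaged cost is at most budget + Q(T)/T, so keeping the queue small relative to T keeps the average cost close to the budget.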

[86]  arXiv:2008.00199 [pdf, other]
Title: Green Offloading in Fog-Assisted IoT Systems: An Online Perspective Integrating Learning and Control
Subjects: Networking and Internet Architecture (cs.NI); Artificial Intelligence (cs.AI); Signal Processing (eess.SP)

In fog-assisted IoT systems, it is a common practice to offload tasks from IoT devices to their nearby fog nodes to reduce task processing latencies and energy consumption. However, the design of an online energy-efficient scheme is still an open problem because of various uncertainties in system dynamics such as processing capacities and transmission rates. Moreover, the decision-making process is constrained by resource limits on fog nodes and IoT devices, making the design even more complicated. In this paper, we formulate such a task offloading problem with unknown system dynamics as a combinatorial multi-armed bandit (CMAB) problem with long-term constraints on time-averaged energy consumption. Through an effective integration of online learning and online control, we propose a Learning-Aided Green Offloading (LAGO) scheme. In LAGO, we employ bandit learning methods to handle the exploitation-exploration tradeoff and utilize virtual queue techniques to deal with the long-term constraints. Our theoretical analysis shows that LAGO can reduce the average task latency with a tunable sublinear regret bound over a finite time horizon and satisfy the long-term time-averaged energy constraints. We conduct extensive simulations to verify such theoretical results.

[87]  arXiv:2008.00202 [pdf, other]
Title: Contextual Document Similarity for Content-based Literature Recommender Systems
Authors: Malte Ostendorff
Comments: In Proceedings of the Doctoral Consortium at ACM/IEEE Joint Conference on Digital Libraries (JCDL 2020)
Journal-ref: Proceedings of the Doctoral Consortium at ACM/IEEE Joint Conference on Digital Libraries (JCDL 2020)
Subjects: Information Retrieval (cs.IR); Digital Libraries (cs.DL)

To cope with the ever-growing information overload, an increasing number of digital libraries employ content-based recommender systems. These systems traditionally recommend related documents with the help of similarity measures. However, current document similarity measures simply distinguish between similar and dissimilar documents. This simplification is especially problematic for extensive documents, which cover various facets of a topic and are often found in digital libraries. Still, these similarity measures neglect to what facet the similarity relates. Therefore, the context of the similarity remains ill-defined. In this doctoral thesis, we explore contextual document similarity measures, i.e., methods that determine document similarity as a triple of two documents and the context of their similarity. The context is here a further specification of the similarity. For example, in the scientific domain, research papers can be similar with respect to their background, methodology, or findings. The measurement of similarity with regard to one or more given contexts will enhance recommender systems. Namely, users will be able to explore document collections by formulating queries in terms of documents and their contextual similarities. Thus, our research objective is the development and evaluation of a recommender system based on contextual similarity. The underlying techniques will apply established similarity measures as well as neural approaches while utilizing semantic features obtained from links between documents and their text.

[88]  arXiv:2008.00204 [pdf, other]
Title: PORA: Predictive Offloading and Resource Allocation in Dynamic Fog Computing Systems
Subjects: Networking and Internet Architecture (cs.NI)

In multi-tiered fog computing systems, to accelerate the processing of computation-intensive tasks for real-time IoT applications, resource-limited IoT devices can offload part of their workloads to nearby fog nodes, whereafter such workloads may be offloaded to upper-tier fog nodes with greater computation capacities. Such hierarchical offloading, though promising to shorten processing latencies, may also induce excessive power consumptions and latencies for wireless transmissions. With the temporal variation of various system dynamics, such a trade-off makes it rather challenging to conduct effective and online offloading decision making. Meanwhile, the fundamental benefits of predictive offloading to fog computing systems still remain unexplored. In this paper, we focus on the problem of dynamic offloading and resource allocation with traffic prediction in multi-tiered fog computing systems. By formulating the problem as a stochastic network optimization problem, we aim to minimize the time-average power consumptions with stability guarantee for all queues in the system. We exploit unique problem structures and propose PORA, an efficient and distributed predictive offloading and resource allocation scheme for multi-tiered fog computing systems. Our theoretical analysis and simulation results show that PORA incurs near-optimal power consumptions with queue stability guarantee. Furthermore, PORA requires only mild-value of predictive information to achieve a notable latency reduction, even with prediction errors.

[89]  arXiv:2008.00206 [pdf, other]
Title: HMOR: Hierarchical Multi-Person Ordinal Relations for Monocular Multi-Person 3D Pose Estimation
Comments: To appear in ECCV 2020
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Remarkable progress has been made in 3D human pose estimation from a monocular RGB camera. However, only a few studies have explored 3D multi-person cases. In this paper, we attempt to address the lack of a global perspective in top-down approaches by introducing a novel form of supervision: Hierarchical Multi-person Ordinal Relations (HMOR). HMOR encodes interaction information hierarchically as the ordinal relations of depths and angles, which captures body-part and joint level semantics and maintains global consistency at the same time. In our approach, an integrated top-down model is designed to leverage these ordinal relations in the learning process. The integrated model estimates human bounding boxes, human depths, and root-relative 3D poses simultaneously, with a coarse-to-fine architecture to improve the accuracy of depth estimation. The proposed method significantly outperforms state-of-the-art methods on publicly available multi-person 3D pose datasets. In addition to superior performance, our method has lower computational complexity and fewer model parameters.

[90]  arXiv:2008.00207 [pdf, other]
Title: Online Task Scheduling for Fog Computing with Multi-Resource Fairness
Subjects: Networking and Internet Architecture (cs.NI); Artificial Intelligence (cs.AI)

In fog computing systems, one key challenge is online task scheduling, i.e., to decide the resource allocation for tasks that are continuously generated from end devices. The design is challenging because of various uncertainties manifested in fog computing systems; e.g., tasks' resource demands remain unknown before their actual arrivals. Recent works have applied deep reinforcement learning (DRL) techniques to conduct online task scheduling and improve various objectives. However, they overlook multi-resource fairness for different tasks, which is key to fair resource sharing among tasks but is in general non-trivial to achieve. Thus, it is still an open problem to design an online task scheduling scheme with multi-resource fairness. In this paper, we address the above challenges. In particular, by leveraging DRL techniques and adopting the idea of dominant resource fairness (DRF), we propose FairTS, an online task scheduling scheme that learns directly from experience to effectively shorten average task slowdown while ensuring multi-resource fairness among tasks. Simulation results show that FairTS outperforms state-of-the-art schemes with an ultra-low task slowdown and better resource fairness.
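
The idea of dominant resource fairness (DRF) that the abstract builds on can be sketched with a small progressive-filling routine. This illustrates plain DRF only, not FairTS or its DRL component; the function name, the two-resource capacities, and the per-task demands are hypothetical. As a simplifying assumption, a user whose next task no longer fits is dropped from consideration while the others continue.

```python
def drf_allocate(capacity, demands):
    """Progressive filling under DRF.

    capacity: list of total amounts per resource, e.g. [cpus, memory_gb].
    demands:  dict mapping user -> per-task demand vector (same length).
    Repeatedly grants one task to the user with the smallest dominant share,
    i.e. the largest fraction of any single resource that the user holds.
    """
    n_res = len(capacity)
    used = [0] * n_res
    tasks = {user: 0 for user in demands}
    share = {user: 0.0 for user in demands}
    active = set(demands)
    while active:
        # sorted() makes tie-breaking deterministic (alphabetical order).
        user = min(sorted(active), key=lambda u: share[u])
        d = demands[user]
        if any(used[i] + d[i] > capacity[i] for i in range(n_res)):
            active.remove(user)  # this user's next task does not fit
            continue
        for i in range(n_res):
            used[i] += d[i]
        tasks[user] += 1
        share[user] = max(tasks[user] * d[i] / capacity[i] for i in range(n_res))
    return tasks, share


# Usage: 9 CPUs and 18 GB of memory shared by two users with different task shapes.
tasks, share = drf_allocate([9, 18], {"A": [1, 4], "B": [3, 1]})
```

In this example user A is memory-dominant and user B is CPU-dominant, and the routine ends with both holding an equal dominant share of two thirds.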

[91]  arXiv:2008.00208 [pdf, other]
Title: Service Chain Composition with Failures in NFV Systems: A Game-Theoretic Perspective
Subjects: Networking and Internet Architecture (cs.NI); Artificial Intelligence (cs.AI)

For state-of-the-art network function virtualization (NFV) systems, it remains a key challenge to conduct effective service chain composition for different network services (NSs) with ultra-low request latencies and minimum network congestion. To this end, existing solutions often require full knowledge of the network state, while ignoring the privacy issues and overlooking the non-cooperative behaviors of users. What is more, they may fall short in the face of unexpected failures such as user unavailability and virtual machine breakdown. In this paper, we formulate the problem of service chain composition in NFV systems with failures as a non-cooperative game. By showing that such a game is a weighted potential game and exploiting the unique problem structure, we propose two effective distributed schemes that guide the service chain compositions of different NSs towards the Nash equilibrium (NE) state with both near-optimal latencies and minimum congestion. Besides, we develop two novel learning-aided schemes as comparisons, which are based on deep reinforcement learning (DRL) and Monte Carlo tree search (MCTS) techniques, respectively. Our theoretical analysis and simulation results demonstrate the effectiveness of our proposed schemes, as well as the adaptivity when faced with failures.

[92]  arXiv:2008.00211 [pdf]
Title: Device to Remotely Track and Locate the Position of a Child for Safety
Comments: 6 pages; 8 figures
Subjects: Computers and Society (cs.CY); Hardware Architecture (cs.AR)

Parents are always worried about the wellbeing of their children. As per the Statistics Report 2017 by Missing Children Europe Organization, a child is reported missing every 2 minutes. Due to the imminent threat, parents are prone to buy their children mobile phones to keep in touch with them. However, giving a mobile phone to a child can cause issues including cyber bullying, improper use of social networks, access to mature age and illicit content on the internet and, possibly, phone theft. As an effort to tackle some of those issues, this paper proposes a solution which enables parents to call, locate and track their children using a child-friendly mobile device. The common scenario in which the device would come into play is in enhancing the safety of a child who travels alone on a typical route; for instance, a child who walks from home to school and back. The device can be calibrated to keep track of a typical route of travel. Then, if the device detects some deviation from the usual route, it would trigger a notification to parents. A novel probability matrix based algorithm is introduced to detect route deviation. Design details of the mobile device, along with the details of the route deviation detection algorithm, are presented in this paper.

[93]  arXiv:2008.00212 [pdf, other]
Title: An adaptive BDF2 implicit time-stepping method for the phase field crystal model
Comments: 29 pages, 18 figures, 2 tables
Subjects: Numerical Analysis (math.NA)

An adaptive BDF2 implicit time-stepping method is analyzed for the phase field crystal model. The suggested method is proved to preserve a modified energy dissipation law at the discrete levels if the time-step ratios satisfy $r_k:=\tau_k/\tau_{k-1}<3.561$, a recent zero-stability restriction of the variable-step BDF2 scheme for ordinary differential problems. By using the discrete orthogonal convolution kernels and the corresponding convolution inequalities, an optimal $L^2$ norm error estimate is established under the weak step-ratio restriction $0<r_k<3.561$ ensuring the energy stability. This is the first time such an error estimate is theoretically proved for a nonlinear parabolic equation. On the basis of ample tests on random time meshes, a useful adaptive time-stepping strategy is suggested to efficiently capture the multi-scale behaviors and to accelerate the numerical simulations.
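
To make the role of the step ratio $r_k$ concrete, the following minimal sketch applies the standard variable-step BDF2 formula to the scalar test problem $u' = -\lambda u$ rather than to the phase field crystal model. The function name, the test problem, and the alternating step sizes are assumptions chosen for illustration; the first step uses backward Euler as a common starting choice.

```python
import math


def variable_step_bdf2_linear(lam, u0, steps):
    """Integrate u' = -lam * u with variable-step BDF2.

    With r = tau_k / tau_{k-1}, the scheme reads
        (1+2r)/(1+r) * u_k - (1+r) * u_{k-1} + r^2/(1+r) * u_{k-2} = tau_k * f(u_k),
    which is solved in closed form here because f(u) = -lam * u is linear.
    """
    u_prev2 = u0
    u_prev = u0 / (1.0 + steps[0] * lam)  # backward Euler start-up step
    for k in range(1, len(steps)):
        tau = steps[k]
        r = tau / steps[k - 1]
        if not 0.0 < r < 3.561:
            raise ValueError("step ratio outside the range quoted in the abstract")
        a0 = (1.0 + 2.0 * r) / (1.0 + r)
        rhs = (1.0 + r) * u_prev - (r * r / (1.0 + r)) * u_prev2
        u_new = rhs / (a0 + tau * lam)
        u_prev2, u_prev = u_prev, u_new
    return u_prev


# Usage: alternating steps give ratios of 1.5 and about 0.667, both within the bound.
steps = [0.004, 0.006] * 100
approx = variable_step_bdf2_linear(1.0, 1.0, steps)
exact = math.exp(-sum(steps))
```

For equal steps ($r = 1$) the coefficients reduce to the familiar uniform BDF2 weights 3/2, -2 and 1/2.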

[94]  arXiv:2008.00214 [pdf, other]
Title: Dissecting contact tracing apps in the Android platform
Comments: Revised manuscript according to comments from the moderators
Subjects: Cryptography and Security (cs.CR)

The paper at hand offers an analysis of all Android contact tracing apps deployed hitherto by European countries. Each app was closely scrutinised both statically and dynamically by means of dynamic instrumentation. The results reported from static analysis include permissions, API calls, and possible connections to external URLs. Dynamic analysis collected data pertaining to Java classes, network traffic, and intents. We present several key findings regarding static analysis. On the other hand, due also to the fact that we utilised virtual machines to run the apps, the dynamic analysis did not yield significant results and is to be further addressed in future work.

[95]  arXiv:2008.00217 [pdf, other]
Title: Efficient Adversarial Attacks for Visual Object Tracking
Journal-ref: eccv 2020
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)

Visual object tracking is an important task that requires the tracker to find the objects quickly and accurately. The existing state-of-the-art object trackers, i.e., Siamese based trackers, use DNNs to attain high accuracy. However, the robustness of visual tracking models is seldom explored. In this paper, we analyze the weakness of object trackers based on the Siamese network and then extend adversarial examples to visual object tracking. We present an end-to-end network FAN (Fast Attack Network) that uses a novel drift loss combined with the embedded feature loss to attack the Siamese network based trackers. Under a single GPU, FAN is efficient in the training speed and has a strong attack performance. FAN can generate an adversarial example in 10 ms and achieve effective targeted attacks (at least 40% drop rate on OTB) and untargeted attacks (at least 70% drop rate on OTB).

[96]  arXiv:2008.00223 [pdf, other]
Title: Unsupervised Deep Cross-modality Spectral Hashing
Comments: Accepted to IEEE Transactions on Image Processing (TIP)
Subjects: Computer Vision and Pattern Recognition (cs.CV)

This paper presents a novel framework, namely Deep Cross-modality Spectral Hashing (DCSH), to tackle the unsupervised learning problem of binary hash codes for efficient cross-modal retrieval. The framework is a two-step hashing approach which decouples the optimization into (1) binary optimization and (2) hashing function learning. In the first step, we propose a novel spectral embedding-based algorithm to simultaneously learn single-modality and binary cross-modality representations. While the former is capable of well preserving the local structure of each modality, the latter reveals the hidden patterns from all modalities. In the second step, to learn mapping functions from informative data inputs (images and word embeddings) to binary codes obtained from the first step, we leverage the powerful CNN for images and propose a CNN-based deep architecture to learn text modality. Quantitative evaluations on three standard benchmark datasets demonstrate that the proposed DCSH method consistently outperforms other state-of-the-art methods.

[97]  arXiv:2008.00226 [pdf, other]
Title: Regularization by Denoising via Fixed-Point Projection (RED-PRO)
Comments: 34 Pages, 6 figures, 9 tables
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)

Inverse problems in image processing are typically cast as optimization tasks, consisting of data fidelity and stabilizing regularization terms. A recent regularization strategy of great interest utilizes the power of denoising engines. Two such methods are the Plug-and-Play Prior (PnP) and Regularization by Denoising (RED). While both have shown state-of-the-art results in various recovery tasks, their theoretical justification is incomplete. In this paper, we aim to enrich the understanding of RED and its connection to PnP. Towards that end, we reformulate RED as a convex optimization problem utilizing a projection (RED-PRO) onto the fixed-point set of demicontractive denoisers. We offer a simple iterative solution to this problem, and establish a novel unification of RED-PRO and PnP, while providing guarantees for their convergence to the globally optimal solution. We also present several relaxations of RED-PRO that allow for handling denoisers with limited fixed-point sets. Finally, we demonstrate RED-PRO for the tasks of image deblurring and super-resolution, showing improved results with respect to the original RED framework.

[98]  arXiv:2008.00229 [pdf, other]
Title: Standardized Green View Index and Quantification of Different Metrics of Urban Green Vegetation
Comments: 14 pages, 9 figures
Subjects: Computers and Society (cs.CY)

Urban greenery is considered an important factor in relation to sustainable development and people's quality of life in the city. Although ways to measure urban greenery have been proposed, the characteristics of each metric have not been fully established, rendering previous research vulnerable to changes in greenery metrics. To make estimation more robust, this study aims to (1) propose an improved indicator of greenery visibility for analytical use (standardized GVI; sGVI), and (2) quantify the relation between sGVI and other greenery metrics. Analyzing a data set for Yokohama city, Japan, it is shown that the sGVI, a weighted form of GVI aggregated to an area, mitigates the bias of densely located measurement sites. Also, by comparing sGVI and NDVI at city block level, we found that sGVI captures the presence of vegetation better in the city center, whereas NDVI is better at capturing vegetation in parks and forests. These tools provide a foundation for assessing the effect of vegetation in urban landscapes in a more robust manner, enabling comparison on any arbitrary geographical scale.

[99]  arXiv:2008.00230 [pdf, other]
Title: RGB-D Salient Object Detection: A Survey
Comments: 24 pages, 12 figures. Code: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Salient object detection (SOD), which simulates the human visual perception system to locate the most attractive object(s) in a scene, has been widely applied to various computer vision tasks. Now, with the advent of depth sensors, depth maps with rich spatial information, which can be beneficial in boosting the performance of SOD, can easily be captured. Although various RGB-D based SOD models with promising performance have been proposed over the past several years, an in-depth understanding of these models and challenges in this topic remains lacking. In this paper, we provide a comprehensive survey of RGB-D based SOD models from various perspectives, and review related benchmark datasets in detail. Further, considering that the light field can also provide depth maps, we review SOD models and popular benchmark datasets from this domain as well. Moreover, to investigate the SOD ability of existing models, we carry out a comprehensive evaluation, as well as attribute-based evaluation of several representative RGB-D based SOD models. Finally, we discuss several challenges and open directions of RGB-D based SOD for future research. All collected models, benchmark datasets, source code links, datasets constructed for attribute-based evaluation, and codes for evaluation will be made publicly available at https://github.com/taozh2017/RGBDSODsurvey

[100]  arXiv:2008.00234 [pdf]
Title: Ergodic Annealing
Subjects: Artificial Intelligence (cs.AI); Theoretical Economics (econ.TH); Probability (math.PR); Machine Learning (stat.ML)

Simulated Annealing is the crowning glory of Markov Chain Monte Carlo Methods for the solution of NP-hard optimization problems in which the cost function is known. Here, by replacing the Metropolis engine of Simulated Annealing with a reinforcement learning variation -- that we call Macau Algorithm -- we show that the Simulated Annealing heuristic can be very effective also when the cost function is unknown and has to be learned by an artificial agent.

[101]  arXiv:2008.00238 [pdf, other]
Title: An Explainable Machine Learning Model for Early Detection of Parkinson's Disease using LIME on DaTscan Imagery
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)

Parkinson's disease (PD) is a degenerative and progressive neurological condition. Early diagnosis can improve treatment for patients and is performed through dopaminergic imaging techniques like the SPECT DaTscan. In this study, we propose a machine learning model that accurately classifies any given DaTscan as having Parkinson's disease or not, in addition to providing a plausible reason for the prediction. This kind of reasoning is done through the use of visual indicators generated using Local Interpretable Model-Agnostic Explainer (LIME) methods. DaTscans were drawn from the Parkinson's Progression Markers Initiative database and used to train a CNN (VGG16) via transfer learning, yielding an accuracy of 95.2%, a sensitivity of 97.5%, and a specificity of 90.9%. Keeping model interpretability of paramount importance, especially in the healthcare field, this study utilises LIME explanations to distinguish PD from non-PD, using visual superpixels on the DaTscans. It can be concluded that the proposed system, in conjunction with its measured interpretability and accuracy, may effectively aid medical workers in the early diagnosis of Parkinson's Disease.

[102]  arXiv:2008.00240 [pdf, ps, other]
Title: On the filtered polynomial interpolation at Chebyshev nodes
Comments: 20 pages, 19 figures given in 8 eps files
Subjects: Numerical Analysis (math.NA)

The paper deals with a special filtered approximation method, which generates interpolation polynomials at Chebyshev zeros by using de la Vall\'ee Poussin filters. These polynomials can be a useful device for many theoretical and applied problems since they combine the advantages of the classical Lagrange interpolation with the uniform convergence in spaces of locally continuous functions equipped with suitable, Jacobi--weighted, uniform norms. The uniform boundedness of the related Lebesgue constants, which is equivalent to the uniform convergence and is missing from Lagrange interpolation, has already been proved in the literature under different, but only sufficient, assumptions. Here, we state the necessary and sufficient conditions to get it. These conditions are easy to check since they are simple inequalities on the exponents of the Jacobi weight defining the norm. Moreover, they are necessary and sufficient to get filtered interpolating polynomials with a near best approximation error, which tends to zero as the number $n$ of nodes tends to infinity. In addition, the convergence rate is comparable with the error of best polynomial approximation of degree $n$, hence the approximation order improves with the smoothness of the sought function. Several numerical experiments are given in order to test the theoretical results, to make a comparison with the Lagrange interpolation at the same nodes and to show how the Gibbs phenomenon can be strongly reduced.

[103]  arXiv:2008.00247 [pdf, other]
Title: Meta-DRN: Meta-Learning for 1-Shot Image Segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)

Modern deep learning models have revolutionized the field of computer vision. However, a significant drawback of most of these models is that they require a large number of labelled examples to generalize properly. Recent developments in few-shot learning aim to alleviate this requirement. In this paper, we propose a novel lightweight CNN architecture for 1-shot image segmentation. The proposed model is created by taking inspiration from well-performing architectures for semantic segmentation and adapting them to the 1-shot domain. We train our model using 4 meta-learning algorithms that have worked well for image classification and compare the results. For the chosen dataset, our proposed model has a 70% lower parameter count than the benchmark, while having better or comparable mean IoU scores using all 4 of the meta-learning algorithms.

[104]  arXiv:2008.00248 [pdf, ps, other]
Title: Analysis and Optimization of Cache-Enabled Networks with Random DTX Scheme
Subjects: Information Theory (cs.IT)

In this paper, we focus on the meta distribution for the cache-enabled networks where the locations of base stations (BSs) are modeled as a Poisson point process (PPP). Under the random caching framework, we derive the moments of the conditional successful transmission probability (STP), the exact meta distribution and its beta approximation by utilizing stochastic geometry. The closed-form expression of the mean local delay is also derived. We consider the maximization of the STP and the minimization of the mean local delay by optimizing the caching probability and the BS active probability, respectively. For the former, a convex optimization problem is formulated and the optimal caching probability and BS active probability are achieved. Moreover, most popular caching (MPC) is proved to be optimal under the constraint that the mean local delay is finite. For the latter, a non-convex optimization problem is formulated and an iterative algorithm is proposed to obtain the optimal solution. The backhaul delay has a significant influence on the caching strategy. MPC is proved to be optimal when the backhaul delay is relatively low and the uniform caching (UC) is the optimal caching strategy when the backhaul delay is very large. Finally, the numerical results reveal the effect of the key network parameters on the cache-enabled networks in terms of STP, variance, meta distribution and mean local delay.

[105]  arXiv:2008.00255 [pdf, ps, other]
Title: Theta palindromes in theta conjugates
Comments: Any suggestions and comments are welcome
Subjects: Formal Languages and Automata Theory (cs.FL); Discrete Mathematics (cs.DM)

A DNA string is a Watson-Crick (WK) palindrome when the complement of its reverse is equal to itself. The Watson-Crick mapping $\theta$ is an involution that is also an antimorphism. The notion of $\theta$-conjugates of a word is a generalisation of conjugates of a word that incorporates the WK-involution $\theta$. In this paper, we study the distribution of palindromes and Watson-Crick palindromes, also known as $\theta$-palindromes, among both the set of conjugates and the set of $\theta$-conjugates of a word $w$. We also consider some general properties of the set $C_{\theta}(w)$, i.e., the set of $\theta$-conjugates of a word $w$, and characterize words $w$ such that $|C_{\theta}(w)|=|w|+1$, i.e., with the maximum number of elements in $C_{\theta}(w)$. We also find the structure of words that have at least one (WK)-palindrome in $C_{\theta}(w)$.
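The definitions above can be made concrete with a small Python sketch. The WK-palindrome test follows the abstract directly. The construction of $C_{\theta}(w)$ as $\{\theta(v)u : w = uv\}$ is an assumption based on the usual definition of $\theta$-conjugacy ($u'$ is a $\theta$-conjugate of $w$ if $u'v=\theta(v)w$ for some $v$) and is not stated in the abstract; all function names are hypothetical.

```python
# Watson-Crick complement over the DNA alphabet.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def theta(word: str) -> str:
    """Watson-Crick mapping: complement of the reverse (an antimorphic involution)."""
    return "".join(COMPLEMENT[c] for c in reversed(word))


def is_wk_palindrome(word: str) -> bool:
    """A word is a WK (theta-) palindrome when theta(word) == word."""
    return theta(word) == word


def theta_conjugates(word: str) -> set:
    """Assumed construction: C_theta(w) = { theta(v) + u : w = u + v }.

    There are len(word) + 1 factorisations, so the set has at most
    len(word) + 1 elements.
    """
    return {theta(word[i:]) + word[:i] for i in range(len(word) + 1)}


if __name__ == "__main__":
    print(is_wk_palindrome("ACGT"))
    print(sorted(theta_conjugates("AC")))
```

For the word `AC`, the three factorisations give three distinct words, so this example attains the maximum $|w|+1$ under the assumed definition.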

[106]  arXiv:2008.00261 [pdf, other]
Title: Distilling Visual Priors from Self-Supervised Learning
Comments: This is the 2nd place tech report for VIPriors Image Classification Challenge ECCVW2020
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Convolutional Neural Networks (CNNs) are prone to overfit small training datasets. We present a novel two-phase pipeline that leverages self-supervised learning and knowledge distillation to improve the generalization ability of CNN models for image classification under the data-deficient setting. The first phase is to learn a teacher model which possesses rich and generalizable visual representations via self-supervised learning, and the second phase is to distill the representations into a student model in a self-distillation manner, and meanwhile fine-tune the student model for the image classification task. We also propose a novel margin loss for the self-supervised contrastive learning proxy task to better learn the representation under the data-deficient scenario. Together with other tricks, we achieve competitive performance in the VIPriors image classification challenge.

[107]  arXiv:2008.00266 [pdf, ps, other]
Title: On parity decision trees for Fourier-sparse Boolean functions
Subjects: Computational Complexity (cs.CC)

We study parity decision trees for Boolean functions. The motivation of our study is the log-rank conjecture for XOR functions and its connection to Fourier analysis and parity decision tree complexity. Let f be a Boolean function with Fourier support S and Fourier sparsity k.
1) We prove via the probabilistic method that there exists a parity decision tree of depth O(sqrt k) that computes f. This matches the best known upper bound on the parity decision tree complexity of Boolean functions (Tsang, Wong, Xie, and Zhang, FOCS 2013). Moreover, while previous constructions (Tsang et al., FOCS 2013, Shpilka, Tal, and Volk, Comput. Complex. 2017) build the trees by carefully choosing the parities to be queried in each step, our proof shows that a naive sampling of the parities suffices.
2) We generalize the above result by showing that if the Fourier spectra of Boolean functions satisfy a natural "folding property", then the above proof can be adapted to establish existence of a tree of complexity polynomially smaller than O(sqrt k). We make a conjecture in this regard which, if true, implies that the communication complexity of an XOR function is bounded above by the fourth root of the rank of its communication matrix, improving upon the previously known upper bound of square root of rank (Tsang et al., FOCS 2013, Lovett, J. ACM. 2016).
3) It can be shown by elementary techniques that for any Boolean function f and all pairs (alpha, beta) of parities in S, there exists another pair (gamma, delta) of parities in S such that alpha + beta = gamma + delta. We show, among other results, that there must exist several gamma in F_2^n such that there are at least three pairs (alpha_1, alpha_2) of parities in S with alpha_1 + alpha_2 = gamma.
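To make the notions of Fourier support and Fourier sparsity concrete, here is a minimal brute-force Python sketch. It assumes the common convention that $f$ maps $\{0,1\}^n$ to $\{-1,+1\}$ and that $\hat f(\alpha) = 2^{-n}\sum_x f(x)(-1)^{\alpha\cdot x}$; the function names are hypothetical and the enumeration is only practical for small $n$.

```python
from itertools import product
from typing import Callable, Dict, Set, Tuple

Point = Tuple[int, ...]


def fourier_coefficients(f: Callable[[Point], int], n: int) -> Dict[Point, float]:
    """Brute-force Fourier coefficients of f: {0,1}^n -> {-1,+1}."""
    points = list(product((0, 1), repeat=n))
    coeffs = {}
    for alpha in points:
        total = 0
        for x in points:
            # Parity character chi_alpha(x) = (-1)^(alpha . x mod 2).
            parity = sum(a * b for a, b in zip(alpha, x)) % 2
            total += f(x) * (-1 if parity else 1)
        coeffs[alpha] = total / 2 ** n
    return coeffs


def fourier_support(f: Callable[[Point], int], n: int, tol: float = 1e-12) -> Set[Point]:
    """The set S of parities with non-zero coefficient; the sparsity k is len(S)."""
    return {a for a, c in fourier_coefficients(f, n).items() if abs(c) > tol}


if __name__ == "__main__":
    and2 = lambda x: -1 if all(x) else 1        # AND of two bits in the +/-1 convention
    xor2 = lambda x: -1 if sum(x) % 2 else 1    # XOR is a single parity
    print(len(fourier_support(and2, 2)), len(fourier_support(xor2, 2)))
```

A single parity such as XOR has sparsity 1, whereas the two-bit AND already has full support.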

[108]  arXiv:2008.00267 [pdf, other]
Title: From Shadow Segmentation to Shadow Removal
Comments: Accepted at ECCV 2020. All code, trained models, and data are available (soon) at: this https URL edu/~cvl/projects/FSS2SR/index.html
Subjects: Computer Vision and Pattern Recognition (cs.CV)

The requirement for paired shadow and shadow-free images limits the size and diversity of shadow removal datasets and hinders the possibility of training large-scale, robust shadow removal algorithms. We propose a shadow removal method that can be trained using only shadow and non-shadow patches cropped from the shadow images themselves. Our method is trained via an adversarial framework, following a physical model of shadow formation. Our central contribution is a set of physics-based constraints that enables this adversarial training. Our method achieves competitive shadow removal results compared to state-of-the-art methods that are trained with fully paired shadow and shadow-free images. The advantages of our training regime are even more pronounced in shadow removal for videos. Our method can be fine-tuned on a testing video with only the shadow masks generated by a pre-trained shadow detector and outperforms state-of-the-art methods on this challenging test. We illustrate the advantages of our method on our proposed video shadow removal dataset.

[109]  arXiv:2008.00270 [pdf, ps, other]
Title: Fast Classical and Quantum Algorithms for Online $k$-server Problem on Trees
Subjects: Data Structures and Algorithms (cs.DS); Quantum Physics (quant-ph)

We consider online algorithms for the $k$-server problem on trees. Chrobak and Larmore proposed a $k$-competitive algorithm for this problem that has the optimal competitive ratio. However, a naive implementation of their algorithm has $O(n)$ time complexity for processing each query, where $n$ is the number of nodes in the tree. We propose a new time-efficient implementation of this algorithm that has $O(n\log n)$ time complexity for preprocessing and $O\left(k^2 + k\cdot \log n\right)$ time for processing a query. We also propose a quantum algorithm for the case where the nodes of the tree are presented using string paths. In this case, no preprocessing is needed, and the time complexity for each query is $O(k^2\sqrt{n}\log n)$. When the number of queries is $o\left(\frac{\sqrt{n}}{k^2\log n}\right)$, we obtain a quantum speed-up on the total runtime compared to our classical algorithm.
Our algorithm builds on a result of independent interest: we give a quantum algorithm to find the first marked element in a collection of $m$ objects, that works even in the presence of two-sided bounded errors on the input oracle. It has worst-case complexity $O(\sqrt{m})$. In the particular case of one-sided errors on the input, it has expected time complexity $O(\sqrt{x})$ where $x$ is the position of the first marked element.

[110]  arXiv:2008.00277 [pdf, other]
Title: Guided Pattern Mining for API Misuse Detection by Change-Based Code Analysis
Subjects: Software Engineering (cs.SE)

Lack of experience, inadequate documentation, and sub-optimal API design frequently cause developers to make mistakes when re-using third-party implementations. Such API misuses can result in unintended behavior, performance losses, or software crashes. Therefore, current research aims to automatically detect such misuses by comparing the way a developer used an API to previously inferred patterns of the correct API usage. While research has made significant progress, these techniques have not yet been adopted in practice. In part, this is due to the still high numbers of false-positive patterns, but also due to the lack of a process capable of seamlessly integrating with software development processes. In this paper, we target both problems: (a) by providing a method which increases the likelihood of finding relevant and true-positive patterns concerning a given set of code changes and (b) by introducing a just-in-time API misuse detection process which analyzes changes at the time of commit. Particularly, we introduce different, lightweight code search and filtering strategies and evaluated them on 37 real-world API misuses to determine their usefulness in finding relevant API usage patterns. Our main results are (1) commit-based search with subsequent filtering effectively decreases the amount of code to be analyzed, (2) in particular method-level filtering is superior to file-level filtering, (3) project-internal and project-external code search find solutions for different types of misuses and thus are complementary, (4) incorporating prior knowledge of the misused API into the search has a negligible effect.

[111]  arXiv:2008.00278 [pdf, ps, other]
Title: Numerical Computation of Solitary Wave Solutions of the Rosenau Equation
Comments: 14 pages, 12 figures
Journal-ref: Wave Motion 98, Article number: 102618 (2020)
Subjects: Numerical Analysis (math.NA); Mathematical Physics (math-ph)

We construct numerically solitary wave solutions of the Rosenau equation using the Petviashvili iteration method. We first summarize the theoretical results available in the literature for the existence of solitary wave solutions. We then apply two numerical algorithms based on the Petviashvili method for solving the Rosenau equation with single or double power law nonlinearity. Numerical calculations rely on a uniform discretization of a finite computational domain. Through some numerical experiments we observe that the algorithm converges rapidly and it is robust to very general forms of the initial guess.

[112]  arXiv:2008.00279 [pdf, other]
Title: An Empirical Study of Clarifying Question-Based Systems
Comments: Parts of content are published on CIKM 2020
Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL)

Search and recommender systems that take the initiative to ask clarifying questions to better understand users' information needs are receiving increasing attention from the research community. However, to the best of our knowledge, there is no empirical study to quantify whether and to what extent users are willing or able to answer these questions. In this work, we conduct an online experiment by deploying an experimental system, which interacts with users by asking clarifying questions against a product repository. We collect both implicit interaction behavior data and explicit feedback from users showing that: (a) users are willing to answer a good number of clarifying questions (11-21 on average), but not many more than that; (b) most users answer questions until they reach the target product, but also a fraction of them stops due to fatigue or due to receiving irrelevant questions; (c) part of the users' answers (12-17%) are actually opposite to the description of the target product; while (d) most of the users (66-84%) find the question-based system helpful towards completing their tasks. Some of the findings of the study contradict current assumptions on simulated evaluations in the field, while they point towards improvements in the evaluation framework and can inspire future interactive search/recommender system designs.

[113]  arXiv:2008.00285 [pdf, other]
Title: Dividing Bads is Harder than Dividing Goods: On the Complexity of Fair and Efficient Division of Chores
Subjects: Computer Science and Game Theory (cs.GT)

We study the chore division problem where a set of agents needs to divide a set of chores (bads) among themselves fairly and efficiently. We assume that agents have linear disutility (cost) functions. As in the case of goods, competitive division is known to be arguably the best mechanism for the bads as well. However, unlike goods, there are multiple competitive divisions with very different disutility value profiles in bads. Although all competitive divisions satisfy the standard notions of fairness and efficiency, some divisions are significantly fairer and more efficient than the others. This raises two important natural questions: Does there exist a competitive division in which no agent is assigned a chore that she hugely dislikes? Are there simple sufficient conditions for the existence and polynomial-time algorithms assuming them?
We investigate both these questions in this paper. We show that the first problem is strongly NP-hard. Further, we derive a simple sufficient condition for the existence, and we show that finding a competitive division is PPAD-hard assuming the condition. These results are in sharp contrast to the case of goods where both problems are strongly polynomial-time solvable. To the best of our knowledge, these are the first hardness results for the chore division problem, and, in fact, for any economic model under linear preferences.

[114]  arXiv:2008.00293 [pdf, ps, other]
Title: The test set for the TransCoder system
Authors: Ernest Davis
Subjects: Computation and Language (cs.CL)

The TransCoder system translates source code between Java, C++, and Python 3. The test set that was used to evaluate its quality is missing important features of Java, including the ability to define and use classes and the ability to call user-defined functions other than recursively. Therefore, the accuracy of TransCoder over programs with those features remains unknown.

[115]  arXiv:2008.00294 [pdf, ps, other]
Title: Quadrature methods for integro-differential equations of Prandtl's type in weighted spaces of continuous functions
Comments: 34 pages
Subjects: Numerical Analysis (math.NA)

The paper deals with the approximate solution of integro-differential equations of Prandtl's type. Quadrature methods involving ``optimal'' Lagrange interpolation processes are proposed and conditions under which they are stable and convergent in suitable weighted spaces of continuous functions are proved.
The efficiency of the method has been tested by some numerical experiments, some of them including comparisons with other numerical procedures. In particular, as an application, we have implemented the method for solving Prandtl's equation governing the circulation air flow along the contour of a plane wing profile, in the case of elliptic or rectangular wing-shape.

[116]  arXiv:2008.00297 [pdf, other]
Title: The Price of Tailoring the Index to Your Data: Poisoning Attacks on Learned Index Structures
Subjects: Cryptography and Security (cs.CR); Databases (cs.DB); Data Structures and Algorithms (cs.DS); Machine Learning (cs.LG)

The concept of learned index structures relies on the idea that the input-output functionality of a database index can be viewed as a prediction task and, thus, be implemented using a machine learning model instead of traditional algorithmic techniques. This novel angle for a decades-old problem has inspired numerous exciting results in the intersection of machine learning and data structures. However, the main advantage of learned index structures, i.e., the ability to adjust to the data at hand via the underlying ML-model, can become a disadvantage from a security perspective as it could be exploited.
In this work, we present the first study of poisoning attacks on learned index structures. The required poisoning approach is different from all previous works since the model under attack is trained on a cumulative distribution function (CDF) and, thus, every injection on the training set has a cascading impact on multiple data values. We formulate the first poisoning attacks on linear regression models trained on the CDF, which is a basic building block of the proposed learned index structures. We generalize our poisoning techniques to attack a more advanced two-stage design of learned index structures called recursive model index (RMI), which has been shown to outperform traditional B-Trees. We evaluate our attacks on real-world and synthetic datasets under a wide variety of parameterizations of the model and show that the error of the RMI increases up to $300\times$ and the error of its second-stage models increases up to $3000\times$.
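The cascading effect of a single injected key can be illustrated with a toy Python sketch. It fits an ordinary least-squares line to (key, rank) pairs, a simple stand-in for a linear model trained on the CDF, and compares the fit error before and after one key is inserted. The keys, the inserted value, and the function name are made up for illustration; this is not the attack procedure from the paper, which selects poisoning keys deliberately.

```python
from typing import List, Tuple


def fit_cdf_line(keys: List[float]) -> Tuple[float, float, float]:
    """Least-squares fit of rank ~ slope * key + intercept over sorted keys.

    Returns (slope, intercept, mean squared error). Requires at least two
    distinct keys.
    """
    keys = sorted(keys)
    n = len(keys)
    ranks = list(range(n))  # rank = position in sorted order (unnormalised CDF)
    mean_k = sum(keys) / n
    mean_r = sum(ranks) / n
    var_k = sum((k - mean_k) ** 2 for k in keys)
    cov = sum((k - mean_k) * (r - mean_r) for k, r in zip(keys, ranks))
    slope = cov / var_k
    intercept = mean_r - slope * mean_k
    mse = sum((slope * k + intercept - r) ** 2 for k, r in zip(keys, ranks)) / n
    return slope, intercept, mse


if __name__ == "__main__":
    clean = [10, 20, 30, 40, 50]   # ranks are exactly linear in the keys
    poisoned = clean + [11]        # one injected key (hypothetical choice)
    print(fit_cdf_line(clean)[2], fit_cdf_line(poisoned)[2])
    # The injected key shifts the rank of every larger legitimate key by one.
    print(sorted(clean).index(20), sorted(poisoned).index(20))
```

Because ranks are recomputed over the whole key set, one insertion changes the training target of every larger key, which is the cascading effect described above.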

[117]  arXiv:2008.00299 [pdf]
Title: Eigen-CAM: Class Activation Map using Principal Components
Comments: 7 pages, 4 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

Deep neural networks are ubiquitous due to the ease of developing models and their influence on other domains. At the heart of this progress are convolutional neural networks (CNNs) that are capable of learning representations or features given a set of data. Making sense of such complex models (i.e., millions of parameters and hundreds of layers) remains challenging for developers as well as the end-users. This is partially due to the lack of tools or interfaces capable of providing interpretability and transparency. A growing body of literature, for example, class activation map (CAM), focuses on making sense of what a model learns from the data or why it behaves poorly in a given task. This paper builds on previous ideas to cope with the increasing demand for interpretable, robust, and transparent models. Our approach provides a simpler and intuitive (or familiar) way of generating CAM. The proposed Eigen-CAM computes and visualizes the principal components of the learned features/representations from the convolutional layers. Empirical studies were performed to compare the Eigen-CAM with the state-of-the-art methods (such as Grad-CAM, Grad-CAM++, CNN-fixations) by evaluating on benchmark datasets such as weakly-supervised localization and localizing objects in the presence of adversarial noise. Eigen-CAM was found to be robust against classification errors made by fully connected layers in CNNs, and it does not rely on the backpropagation of gradients, class relevance score, maximum activation locations, or any other form of weighting features. In addition, it works with all CNN models without the need to modify layers or retrain models. Empirical results show up to 12% improvement over the best method among the methods compared on weakly supervised object localization.

[118]  arXiv:2008.00302 [pdf, other]
Title: Accurate and Efficient Intracranial Hemorrhage Detection and Subtype Classification in 3D CT Scans with Convolutional and Long Short-Term Memory Neural Networks
Subjects: Computer Vision and Pattern Recognition (cs.CV)

In this paper, we present our system for the RSNA Intracranial Hemorrhage Detection challenge. The proposed system is based on a lightweight deep neural network architecture composed of a convolutional neural network (CNN) that takes as input individual CT slices, and a Long Short-Term Memory (LSTM) network that takes as input feature embeddings provided by the CNN. For efficient processing, we consider various feature selection methods to produce a subset of useful CNN features for the LSTM. Furthermore, we reduce the CT slices by a factor of 2x, allowing us to train the model faster. Even if our model is designed to balance speed and accuracy, we report a weighted mean log loss of 0.04989 on the final test set, which places us in the top 30 ranking (2%) from a total of 1345 participants. Although our computing infrastructure does not allow it, processing CT slices at their original scale is likely to improve performance. In order to enable others to reproduce our results, we provide our code as open source at https://github.com/warchildmd/ihd. After the challenge, we conducted a subjective intracranial hemorrhage detection assessment by radiologists, indicating that the performance of our deep model is on par with that of doctors specialized in reading CT scans. Another contribution of our work is to integrate Grad-CAM visualizations in our system, providing useful explanations for its predictions. We therefore consider our system as a viable option when a fast diagnosis or a second opinion on intracranial hemorrhage detection is needed.

[119]  arXiv:2008.00304 [pdf, other]
Title: SemEval-2020 Task 7: Assessing Humor in Edited News Headlines
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)

This paper describes the SemEval-2020 shared task "Assessing Humor in Edited News Headlines." The task's dataset contains news headlines in which short edits were applied to make them funny, and the funniness of these edited headlines was rated using crowdsourcing. This task includes two subtasks, the first of which is to estimate the funniness of headlines on a humor scale in the interval 0-3. The second subtask is to predict, for a pair of edited versions of the same original headline, which is the funnier version. To date, this task is the most popular shared computational humor task, attracting 48 teams for the first subtask and 31 teams for the second.

[120]  arXiv:2008.00305 [pdf, other]
Title: Self-supervised Learning of Point Clouds via Orientation Estimation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Machine Learning (cs.LG)

Point clouds provide a compact and efficient representation of 3D shapes. While deep neural networks have achieved impressive results on point cloud learning tasks, they require massive amounts of manually labeled data, which can be costly and time-consuming to collect. In this paper, we leverage 3D self-supervision for learning downstream tasks on point clouds with fewer labels. A point cloud can be rotated in infinitely many ways, which provides a rich label-free source for self-supervision. We consider the auxiliary task of predicting rotations that in turn leads to useful features for other tasks such as shape classification and 3D keypoint prediction. Using experiments on ShapeNet and ModelNet, we demonstrate that our approach outperforms the state-of-the-art. Moreover, features learned by our model are complementary to other self-supervised methods and combining them leads to further performance improvement.
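As a rough illustration of how rotations can supply labels for free, the sketch below builds one training example for a rotation-prediction pretext task. The abstract does not say which rotations are used or how the prediction is posed, so the discretisation into `num_bins` angles about the z axis, and the helper names `rotate_z` and `make_rotation_example`, are assumptions made only for this example.

```python
import math
import random

def rotate_z(points, angle):
    """Rotate a list of (x, y, z) points about the z axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]

def make_rotation_example(points, num_bins=8, rng=random):
    """Return (rotated_points, label) for a rotation-prediction pretext task.

    The label is the index of the applied rotation, so no manual
    annotation is needed (hypothetical setup, not the paper's exact one).
    """
    label = rng.randrange(num_bins)
    angle = 2.0 * math.pi * label / num_bins
    return rotate_z(points, angle), label
```

A network trained to recover `label` from `rotated_points` must pick up on shape cues, which is the intuition behind reusing its features for downstream tasks.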

[121]  arXiv:2008.00307 [pdf, other]
Title: Multi-Temporal Analysis and Scaling Relations of 100,000,000,000 Network Packets
Comments: 6 pages, 6 figures, 3 tables, 49 references, accepted to IEEE HPEC 2020
Subjects: Networking and Internet Architecture (cs.NI); Distributed, Parallel, and Cluster Computing (cs.DC); Social and Information Networks (cs.SI)

Our society has never been more dependent on computer networks. Effective utilization of networks requires a detailed understanding of the normal background behaviors of network traffic. Large-scale measurements of networks are computationally challenging. Building on prior work in interactive supercomputing and GraphBLAS hypersparse hierarchical traffic matrices, we have developed an efficient method for computing a wide variety of streaming network quantities on diverse time scales. Applying these methods to 100,000,000,000 anonymized source-destination pairs collected at a network gateway reveals many previously unobserved scaling relationships. These observations provide new insights into normal network background traffic that could be used for anomaly detection, AI feature engineering, and testing theoretical models of streaming networks.

[122]  arXiv:2008.00308 [pdf, other]
Title: Learning-based link prediction analysis for Facebook100 network
Comments: 8 pages, 7 figures, 3 tables
Subjects: Social and Information Networks (cs.SI); Machine Learning (cs.LG)

In social network science, Facebook is one of the most interesting and widely used social networks and media platforms. In the previous decade, Facebook data contributed to significant evolution of social network research. Paired with this topic, we have experienced growing popularity of link prediction techniques, which are important tools in link mining and analysis. This paper gives a comprehensive overview of link prediction analysis on the Facebook100 network, which was derived in 2005. We study performance and evaluate multiple machine learning algorithms on this network. We use network embeddings and topology-based techniques such as node2vec and vectors of similarity metrics. Using these techniques, similarity features for our classification models are derived. Further, we discuss our approach and present results. Lastly, we compare and review our models, where overall performance and classification rates are presented.

[123]  arXiv:2008.00311 [pdf, ps, other]
Title: Learning with Safety Constraints: Sample Complexity of Reinforcement Learning for Constrained MDPs
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)

Many physical systems have underlying safety considerations that require that the policy employed ensures the satisfaction of a set of constraints. The analytical formulation usually takes the form of a Constrained Markov Decision Process (CMDP), where the constraints are some function of the occupancy measure generated by the policy. We focus on the case where the CMDP is unknown, and RL algorithms obtain samples to discover the model and compute an optimal constrained policy. Our goal is to characterize the relationship between safety constraints and the number of samples needed to ensure a desired level of accuracy---both objective maximization and constraint satisfaction---in a PAC sense. We explore a generative-model-based class of RL algorithms wherein samples are taken initially to estimate a model. Our main finding is that compared to the best known bounds of the unconstrained regime, the sample complexity of constrained RL algorithms is increased by a factor that is logarithmic in the number of constraints, which suggests that the approach may be easily utilized in real systems.
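For orientation, one common textbook way of writing a CMDP is shown below. The abstract does not give its formulation, so the discounted setting, the direction of the inequalities, and every symbol here ($\rho$ for the initial state distribution, $\gamma$ for the discount factor, $r$ for the reward, $c_i$ and $b_i$ for the $i$-th constraint cost and budget, $m$ for the number of constraints) are assumptions.

```latex
\max_{\pi} \; V_{r}^{\pi}(\rho)
\quad \text{subject to} \quad
V_{c_i}^{\pi}(\rho) \le b_i, \qquad i = 1, \dots, m,
\qquad \text{where} \quad
V_{g}^{\pi}(\rho) = \mathbb{E}_{\pi,\, s_0 \sim \rho}\!\left[ \sum_{t=0}^{\infty} \gamma^{t}\, g(s_t, a_t) \right].
```

Each $V_{g}^{\pi}(\rho)$ is a linear function of the occupancy measure induced by $\pi$, which is consistent with the abstract's description of the constraints; the reported result concerns how the number of required samples grows with $m$.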

[124]  arXiv:2008.00312 [pdf, other]
Title: Trojaning Language Models for Fun and Profit
Comments: 19 pages (14 for main text, 5 for appendix), 6 figures, under review
Subjects: Cryptography and Security (cs.CR); Computation and Language (cs.CL); Machine Learning (cs.LG)

Recent years have witnessed a new paradigm of building natural language processing (NLP) systems: general-purpose, pre-trained language models (LMs) are fine-tuned with simple downstream models to attain state-of-the-art performance for a variety of target tasks. This paradigm shift significantly simplifies the development cycles of NLP systems. Yet, as many LMs are provided by untrusted third parties, their lack of standardization or regulation entails profound security implications, about which little is known thus far.
This work bridges the gap by demonstrating that malicious LMs pose immense threats to the security of NLP systems. Specifically, we present TROJAN-ML, a new class of trojaning attacks in which maliciously crafted LMs trigger host NLP systems to malfunction in a highly predictable manner. By empirically studying three state-of-the-art LMs (BERT, GPT-2, XLNet) in a range of security-sensitive NLP tasks (toxic comment classification, question answering, text completion), we demonstrate that TROJAN-ML possesses the following properties: (i) efficacy - the host systems misbehave as desired by the adversary with high probability, (ii) specificity - the trojaned LMs function indistinguishably from their benign counterparts on non-target inputs, and (iii) fluency - the trigger-embedded sentences are highly indistinguishable from natural language and highly relevant to the surrounding contexts. We provide analytical justification for the practicality of TROJAN-ML, which points to the unprecedented complexity of today's LMs. We further discuss potential countermeasures and their challenges, which lead to several promising research directions.

[125]  arXiv:2008.00324 [pdf, other]
Title: Improving Skeleton-based Action Recognition with Robust Spatial and Temporal Features
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Recently, skeleton-based action recognition has made significant progress in the computer vision community. Most state-of-the-art algorithms are based on Graph Convolutional Networks (GCN), and target at improving the network structure of the backbone GCN layers. In this paper, we propose a novel mechanism to learn more robust discriminative features in space and time. More specifically, we add a Discriminative Feature Learning (DFL) branch to the last layers of the network to extract discriminative spatial and temporal features to help regularize the learning. We also formally advocate the use of Direction-Invariant Features (DIF) as input to the neural networks. We show that action recognition accuracy can be improved when these robust features are learned and used. We compare our results with those of ST-GCN and related methods on four datasets: NTU-RGBD60, NTU-RGBD120, SYSU 3DHOI and Skeleton-Kinetics.

[126]  arXiv:2008.00325 [pdf, other]
Title: Bringing UMAP Closer to the Speed of Light with GPU Acceleration
Subjects: Machine Learning (cs.LG); Data Structures and Algorithms (cs.DS); Machine Learning (stat.ML)

The Uniform Manifold Approximation and Projection (UMAP) algorithm has become widely popular for its ease of use, quality of results, and support for exploratory, unsupervised, supervised, and semi-supervised learning. While many algorithms can be ported to a GPU in a simple and direct fashion, such efforts have resulted in inefficient and inaccurate versions of UMAP. We show a number of techniques that can be used to make a faster and more faithful GPU version of UMAP, and obtain speedups of up to 100x in practice. Many of these design choices/lessons are general purpose and may inform the conversion of other graph and manifold learning algorithms to use GPUs. Our implementation has been made publicly available as part of the open source RAPIDS cuML library (https://github.com/rapidsai/cuml).

[127]  arXiv:2008.00326 [pdf, other]
Title: PERCH 2.0 : Fast and Accurate GPU-based Perception via Search for Object Pose Estimation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)

Pose estimation of known objects is fundamental to tasks such as robotic grasping and manipulation. The need for reliable grasping imposes stringent accuracy requirements on pose estimation in cluttered, occluded scenes in dynamic environments. Modern methods employ large sets of training data to learn features in order to find correspondence between 3D models and observed data. However these methods require extensive annotation of ground truth poses. An alternative is to use algorithms that search for the best explanation of the observed scene in a space of possible rendered scenes. A recently developed algorithm, PERCH (PErception Via SeaRCH) does so by using depth data to converge to a globally optimum solution using a search over a specially constructed tree. While PERCH offers strong guarantees on accuracy, the current formulation suffers from low scalability owing to its high runtime. In addition, the sole reliance on depth data for pose estimation restricts the algorithm to scenes where no two objects have the same shape. In this work, we propose PERCH 2.0, a novel perception via search strategy that takes advantage of GPU acceleration and RGB data. We show that our approach can achieve a speedup of 100x over PERCH, as well as better accuracy than the state-of-the-art data-driven approaches on 6-DoF pose estimation without the need for annotating ground truth poses in the training data. Our code and video are available at https://sbpl-cruz.github.io/perception/.

[128]  arXiv:2008.00329 [pdf, other]
Title: CuttleSys: Data-Driven Resource Management for Interactive Applications on Reconfigurable Multicores
Subjects: Hardware Architecture (cs.AR)

Multi-tenancy for latency-critical applications leads to resource interference and unpredictable performance. Core reconfiguration opens up more opportunities for colocation, as it allows the hardware to adjust to the dynamic performance and power needs of a specific mix of co-scheduled applications. However, reconfigurability also introduces challenges, as even for a small number of reconfigurable cores, exploring the design space becomes more time- and resource-demanding.
We present CuttleSys, a runtime for reconfigurable multicores that leverages scalable and lightweight data mining to quickly identify suitable core and cache configurations for a set of co-scheduled applications. The runtime combines collaborative filtering to infer the behavior of each job on every core and cache configuration, with Dynamically Dimensioned Search to efficiently explore the configuration space. We evaluate CuttleSys on multicores with tens of reconfigurable cores and show up to 2.46x and 1.55x performance improvements compared to core-level gating and oracle-like asymmetric multicores respectively, under stringent power constraints.

[129]  arXiv:2008.00331 [pdf, ps, other]
Title: Learning from Mixtures of Private and Public Populations
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)

We initiate the study of a new model of supervised learning under privacy constraints. Imagine a medical study where a dataset is sampled from a population of both healthy and unhealthy individuals. Suppose healthy individuals have no privacy concerns (in such case, we call their data "public") while the unhealthy individuals desire stringent privacy protection for their data. In this example, the population (data distribution) is a mixture of private (unhealthy) and public (healthy) sub-populations that could be very different.
Inspired by the above example, we consider a model in which the population $\mathcal{D}$ is a mixture of two sub-populations: a private sub-population $\mathcal{D}_{\sf priv}$ of private and sensitive data, and a public sub-population $\mathcal{D}_{\sf pub}$ of data with no privacy concerns. Each example drawn from $\mathcal{D}$ is assumed to contain a privacy-status bit that indicates whether the example is private or public. The goal is to design a learning algorithm that satisfies differential privacy only with respect to the private examples.
Prior works in this context assumed a homogeneous population where private and public data arise from the same distribution, and in particular designed solutions which exploit this assumption. We demonstrate how to circumvent this assumption by considering, as a case study, the problem of learning linear classifiers in $\mathbb{R}^d$. We show that in the case where the privacy status is correlated with the target label (as in the above example), linear classifiers in $\mathbb{R}^d$ can be learned, in the agnostic as well as the realizable setting, with sample complexity which is comparable to that of the classical (non-private) PAC-learning. It is known that this task is impossible if all the data is considered private.

[130]  arXiv:2008.00332 [pdf, other]
Title: Data Oblivious Algorithms for Multicores
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Cryptography and Security (cs.CR); Data Structures and Algorithms (cs.DS)

As secure processors such as Intel SGX (with hyperthreading) become widely adopted, there is a growing appetite for private analytics on big data. Most prior works on data-oblivious algorithms adopt the classical PRAM model to capture parallelism. However, it is widely understood that PRAM does not best capture realistic multicore processors, nor does it reflect parallel programming models adopted in practice.
In this paper, we initiate the study of parallel data oblivious algorithms on realistic multicores, best captured by the binary fork-join model of computation. We first show that data-oblivious sorting can be accomplished by a binary fork-join algorithm with optimal total work and optimal (cache-oblivious) cache complexity, and in O(log n log log n) span (i.e., parallel time) that matches the best-known insecure algorithm. Using our sorting algorithm as a core primitive, we show how to data-obliviously simulate general PRAM algorithms in the binary fork-join model with non-trivial efficiency. We also present results for several applications including list ranking, Euler tour, tree contraction, connected components, and minimum spanning forest. For a subset of these applications, our data-oblivious algorithms asymptotically outperform the best known insecure algorithms. For other applications, we show data oblivious algorithms whose performance bounds match the best known insecure algorithms.
Complementing these asymptotically efficient results, we present a practical variant of our sorting algorithm that is self-contained and potentially implementable. It has optimal caching cost, and it is only a log log n factor off from optimal work and about a log n factor off in terms of span; moreover, it achieves small constant factors in its bounds.
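To illustrate what "data-oblivious" means for sorting, the sketch below uses a bitonic sorting network. This is a classical textbook construction, not the algorithm from the paper (it performs O(n log^2 n) comparisons rather than optimal work); it is shown only because the sequence of index pairs it touches depends on the input length and never on the values. The function names are hypothetical.

```python
def compare_exchange(a, i, j, ascending):
    """Order a[i] and a[j]; both slots are read and written in every call."""
    lo, hi = min(a[i], a[j]), max(a[i], a[j])
    a[i], a[j] = (lo, hi) if ascending else (hi, lo)

def bitonic_sort(values):
    """Sort a list whose length is a power of two with a fixed comparison schedule."""
    a = list(values)
    n = len(a)
    if n & (n - 1):
        raise ValueError("length must be a power of two")
    k = 2
    while k <= n:            # size of the bitonic sequences being merged
        j = k // 2
        while j >= 1:        # distance between compared elements
            for i in range(n):
                partner = i ^ j
                if partner > i:
                    # Direction depends only on the index, never on the data.
                    compare_exchange(a, i, partner, (i & k) == 0)
            j //= 2
        k *= 2
    return a
```

In Python the `min`/`max` calls are not constant-time primitives, so this only demonstrates the fixed access pattern; the inner `for` loop is also the part a fork-join runtime could execute in parallel.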

[131]  arXiv:2008.00333 [pdf, other]
Title: A spectral clustering approach for the evolution of the COVID-19 pandemic in the state of Rio Grande do Sul, Brazil
Comments: 16 pages
Subjects: Social and Information Networks (cs.SI); Physics and Society (physics.soc-ph)

The aim of this paper is to analyse the evolution of the COVID-19 pandemic in Rio Grande do Sul by applying graph-theoretical tools, particularly spectral clustering techniques, on weighted graphs defined on the set of 167 municipalities in the state with population 10,000 or more, which are based on data provided by government agencies and other sources. To respond to this outbreak, the state has adopted a system by which pre-determined regions are assigned flags on a weekly basis, and different measures go into effect according to the flag assigned. Our results suggest that considering a flexible approach to the regions themselves might be a useful additional tool to give more leeway to cities with lower incidence rates, while keeping the focus on public safety. Moreover, simulations show the dampening effect of isolation on the dissemination of the disease.

[132]  arXiv:2008.00334 [pdf, other]
Title: Uncertainty-based Traffic Accident Anticipation with Spatio-Temporal Relational Learning
Authors: Wentao Bao, Qi Yu, Yu Kong
Comments: Accepted by ACM MM 2020
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Traffic accident anticipation aims to predict accidents from dashcam videos as early as possible, which is critical to safety-guaranteed self-driving systems. With cluttered traffic scenes and limited visual cues, it is very challenging to predict, from early observed frames, how soon an accident will occur. Most existing approaches are developed to learn features of accident-relevant agents for accident anticipation, while ignoring the features of their spatial and temporal relations. Besides, current deterministic deep neural networks could be overconfident in false predictions, leading to high risk of traffic accidents caused by self-driving systems. In this paper, we propose an uncertainty-based accident anticipation model with spatio-temporal relational learning. It sequentially predicts the probability of traffic accident occurrence with dashcam videos. Specifically, we propose to take advantage of graph convolution and recurrent networks for relational feature learning, and leverage Bayesian neural networks to address the intrinsic variability of latent relational representations. The derived uncertainty-based ranking loss is found to significantly boost model performance by improving the quality of relational features. In addition, we collect a new Car Crash Dataset (CCD) for traffic accident anticipation which contains annotations of environmental attributes and accident reasons. Experimental results on both public and the newly compiled datasets show state-of-the-art performance of our model. Our code and CCD dataset are available at https://github.com/Cogito2012/UString.

[133]  arXiv:2008.00335 [pdf, other]
Title: V2I Connectivity-Based Dynamic Queue-Jump Lane for Emergency Vehicles: A Deep Reinforcement Learning Approach
Comments: 20 pages, 6 figures
Subjects: Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Systems and Control (eess.SY)

Emergency vehicle (EMV) service is a key function of cities and is exceedingly challenging due to urban traffic congestion. A main reason behind EMV service delay is the lack of communication and cooperation between vehicles blocking EMVs. In this paper, we study the improvement of EMV service under V2I connectivity. We consider the establishment of dynamic queue jump lanes (DQJLs) based on real-time coordination of connected vehicles. We develop a novel Markov decision process formulation for the DQJL problem, which explicitly accounts for the uncertainty of drivers' reaction to approaching EMVs. We propose a deep neural network-based reinforcement learning algorithm that efficiently computes the optimal coordination instructions. We also validate our approach on a micro-simulation testbed using Simulation of Urban Mobility (SUMO). Validation results show that with our proposed methodology, the centralized control system reduces EMV passing time by approximately 15% compared with the benchmark system.

[134]  arXiv:2008.00343 [pdf, other]
Title: Extracting actionable information from microtexts
Comments: Radboud University, Dissertation, 2019. Go to this https URL for the original version
Subjects: Computation and Language (cs.CL)

Microblogs such as Twitter represent a powerful source of information. Part of this information can be aggregated beyond the level of individual posts. Some of this aggregated information refers to events that could or should be acted upon in the interest of e-governance, public safety, or other levels of public interest. Moreover, a significant amount of this information, if aggregated, could complement existing information networks in a non-trivial way. This dissertation proposes a semi-automatic method for extracting actionable information that serves this purpose. First, we show that predicting time to event is possible for both in-domain and cross-domain scenarios. Second, we suggest a method which facilitates the definition of relevance for an analyst's context and the use of this definition to analyze new data. Finally, we propose a method to integrate the machine learning based relevant information classification method with a rule-based information classification technique to classify microtexts. Fully automating microtext analysis has been our goal since the first day of this research project. Our efforts in this direction informed us about the extent to which this automation can be realized. In most cases, we first developed an automated approach, then extended and improved it by integrating human intervention at various steps of the automated approach. Our experience confirms previous work that states that a well-designed human intervention or contribution in design, realization, or evaluation of an information system either improves its performance or enables its realization. As our studies and results directed us toward its necessity and value, we were inspired by previous studies in designing human involvement and customized our approaches to benefit from human input.

[135]  arXiv:2008.00345 [pdf, other]
Title: Overview of CLEF 2019 Lab ProtestNews: Extracting Protests from News in a Cross-context Setting
Comments: Conference and Labs of the Evaluation Forum (CLEF 2019), Overview of the Protest News analysis
Subjects: Computation and Language (cs.CL)

We present an overview of the CLEF-2019 Lab ProtestNews on Extracting Protests from News in the context of generalizable natural language processing. The lab consists of document, sentence, and token level information classification and extraction tasks that were referred to as task 1, task 2, and task 3 respectively in the scope of this lab. The tasks required the participants to identify protest relevant information from English local news at one or more aforementioned levels in a cross-context setting, which is cross-country in the scope of this lab. The training and development data were collected from India and test data was collected from India and China. The lab attracted 58 teams. 12 and 9 of these teams submitted results and working notes respectively. We have observed that neural networks yield the best results and that the performance drops significantly for the majority of the submissions in the cross-country setting, which is China.

[136]  arXiv:2008.00348 [pdf, other]
Title: Self-supervised Visual Attribute Learning for Fashion Compatibility
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Many self-supervised learning (SSL) methods have been successful in learning semantically meaningful visual representations by solving pretext tasks. However, state-of-the-art SSL methods focus on object recognition or detection tasks, which aim to learn object shapes, but ignore visual attributes such as color and texture via color distortion augmentation. However, learning these visual attributes could be more important than learning object shapes for other vision tasks, such as fashion compatibility. To address this deficiency, we propose Self-supervised Tasks for Outfit Compatibility (STOC) without any supervision. Specifically, STOC aims to learn colors and textures of fashion items and embed similar items nearby. STOC outperforms state-of-the-art SSL by 9.5% and a supervised Siamese Network by 3% on a fill-in-the-blank outfit completion task on our unsupervised benchmark.

[137]  arXiv:2008.00351 [pdf, ps, other]
Title: Cross-context News Corpus for Protest Events related Knowledge Base Construction
Comments: Presented at Automated Knowledge Base Construction (AKBC 2020) conference. See: this https URL
Subjects: Computation and Language (cs.CL)

We describe a gold standard corpus of protest events that comprises various local and international sources from various countries in English. The corpus contains document, sentence, and token level annotations. This corpus facilitates creating machine learning models that automatically classify news articles and extract protest event-related information, constructing knowledge bases which enable comparative social and political science studies. For each news source, the annotation starts on random samples of news articles and continues with samples that are drawn using active learning. Each batch of samples was annotated by two social and political scientists, adjudicated by an annotation supervisor, and was improved by identifying annotation errors semi-automatically. We found that the corpus has the variety and quality to develop and benchmark text classification and event extraction systems in a cross-context setting, which contributes to the generalizability and robustness of automated text processing systems. This corpus and the reported results will set the currently lacking common ground in automated protest event collection studies.

[138]  arXiv:2008.00354 [pdf]
Title: A Micro-PMU Placement Scheme for Distribution Systems Considering Practical Constraints
Comments: Accepted in IEEE PESGM 2020
Subjects: Systems and Control (eess.SY)

This paper presents an innovative approach to micro-phasor measurement unit (micro-PMU) placement in unbalanced distribution networks. The methodology accounts for the presence of single-and-two-phase laterals and acknowledges the fact that observing one phase in a distribution circuit does not translate to observing the other phases. Other practical constraints such as presence of distributed loads, unknown regulator/transformer tap ratios, zero-injection phases (ZIPs), modern smart meters, and multiple switch configurations are also incorporated. The proposed micro-PMU placement problem is solved using integer linear programming (ILP), guaranteeing optimality of results. The uniqueness of the developed algorithm is that it not only minimizes the micro-PMU installations, but also identifies the minimum number of phases that must be monitored by them.

[139]  arXiv:2008.00357 [pdf, other]
Title: A Causal Lens for Peeking into Black Box Predictive Models: Predictive Model Interpretation via Causal Attribution
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)

With the increasing adoption of predictive models trained using machine learning across a wide range of high-stakes applications, e.g., health care, security, criminal justice, finance, and education, there is a growing need for effective techniques for explaining such models and their predictions. We aim to address this problem in settings where the predictive model is a black box; that is, we can only observe the response of the model to various inputs, but have no knowledge about the internal structure of the predictive model, its parameters, the objective function, and the algorithm used to optimize the model. We reduce the problem of interpreting a black box predictive model to that of estimating the causal effects of each of the model inputs on the model output, from observations of the model inputs and the corresponding outputs. We estimate the causal effects of model inputs on model output using variants of the Rubin-Neyman potential outcomes framework for estimating causal effects from observational data. We show how the resulting causal attribution of responsibility for model output to the different model inputs can be used to interpret the predictive model and to explain its predictions. We present results of experiments that demonstrate the effectiveness of our approach to the interpretation of black box predictive models via causal attribution in the case of deep neural network models trained on one synthetic data set (where the input variables that impact the output variable are known by design) and two real-world data sets: handwritten digit classification and Parkinson's disease severity prediction. Because our approach does not require knowledge about the predictive model algorithm and is free of assumptions regarding the black box predictive model except that its input-output responses be observable, it can be applied, in principle, to any black box predictive model.

[140]  arXiv:2008.00358 [pdf, other]
Title: Relational Algorithms for k-means Clustering
Subjects: Data Structures and Algorithms (cs.DS); Databases (cs.DB); Machine Learning (cs.LG)

The majority of learning tasks faced by data scientists involve relational data, yet most standard algorithms for standard learning problems are not designed to accept relational data as input. The standard practice to address this issue is to join the relational data to create the type of geometric input that standard learning algorithms expect. Unfortunately, this standard practice has exponential worst-case time and space complexity. This leads us to consider what we call the Relational Learning Question: ``Which standard learning algorithms can be efficiently implemented on relational data, and for those that cannot, is there an alternative algorithm that can be efficiently implemented on relational data and that has similar performance guarantees to the standard algorithm?'' In this paper, we address the relational learning question for two well-known algorithms for the standard $k$-means clustering problem. We first show that the $k$-means++ algorithm can be efficiently implemented on relational data. In contrast, we show that the adaptive $k$-means algorithm likely cannot be efficiently implemented on relational data, as this would imply $P = \#P$. However, we show that a slight variation of this adaptive $k$-means algorithm can be efficiently implemented on relational data, and that this alternative algorithm has the same performance guarantee as the original algorithm, that is, it outputs an $O(1)$-approximate sketch.
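For readers unfamiliar with $k$-means++, the sketch below shows its standard seeding step on an explicit list of points, that is, on the already-joined geometric input whose materialisation the paper aims to avoid. It is the classical algorithm, not the relational implementation described in the abstract, and the function names are chosen for illustration.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans_pp_seeds(points, k, rng=random):
    """Pick k initial centers with D^2 weighting (standard k-means++ seeding)."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        # Each point is weighted by its squared distance to the nearest chosen center.
        weights = [min(sq_dist(p, c) for c in centers) for p in points]
        if sum(weights) == 0:
            # Every point coincides with an existing center; fall back to uniform.
            centers.append(rng.choice(points))
        else:
            centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers
```

Each round needs the distance from every point to the current centers, which is trivial on an explicit list but costly when the points exist only implicitly as the result of a join.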

[141]  arXiv:2008.00362 [pdf, other]
Title: Animating Through Warping: an Efficient Method for High-Quality Facial Expression Animation
Comments: 18 pages, 13 figures, Accepted to ACM Multimedia 2020
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)

Advances in deep neural networks have considerably improved the art of animating a still image without operating in the 3D domain. However, prior methods can only animate small images (typically no larger than 512x512) due to memory limitations, difficulty of training, and lack of high-resolution (HD) training datasets, which significantly reduces their potential for applications in movie production and interactive systems. Motivated by the idea that HD images can be generated by adding high-frequency residuals to low-resolution results produced by a neural network, we propose a novel framework known as Animating Through Warping (ATW) to enable efficient animation of HD images.
Specifically, the proposed framework consists of two modules, a novel two-stage neural-network generator and a novel post-processing module known as Residual Warping (ResWarp). It only requires the generator to be trained on small images and can do inference on an image of any size. During inference, an HD input image is decomposed into a low-resolution component (128x128) and its corresponding high-frequency residuals. The generator predicts the low-resolution result as well as the motion field that warps the input face to the desired state (e.g., expression categories or action units). Finally, the ResWarp module warps the residuals based on the motion field and adds the warped residuals to the naively up-sampled low-resolution results to generate the final HD results. Experiments show the effectiveness and efficiency of our method in generating high-resolution animations. Our proposed framework successfully animates a 4K facial image, which has never been achieved by prior neural models. In addition, our method generally guarantees the temporal coherency of the generated animations. Source code will be made publicly available.
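
The decomposition into a low-resolution component plus high-frequency residuals can be illustrated with a small sketch. It assumes block-average down-sampling and nearest-neighbour up-sampling on a grayscale image stored as nested lists; the paper's actual filters, the generator, and the motion-field warp are not reproduced here, and all function names are hypothetical.

```python
def downsample(img, s):
    """Block-average an image by factor s (assumes both sides divisible by s)."""
    h, w = len(img), len(img[0])
    return [
        [
            sum(img[i * s + di][j * s + dj] for di in range(s) for dj in range(s))
            / (s * s)
            for j in range(w // s)
        ]
        for i in range(h // s)
    ]


def upsample(low, s):
    """Naive nearest-neighbour up-sampling by factor s."""
    h, w = len(low) * s, len(low[0]) * s
    return [[low[i // s][j // s] for j in range(w)] for i in range(h)]


def decompose(img, s):
    """Split an image into a low-resolution part and high-frequency residuals."""
    low = downsample(img, s)
    up = upsample(low, s)
    residual = [[p - u for p, u in zip(prow, urow)] for prow, urow in zip(img, up)]
    return low, residual


def recompose(low_result, warped_residual, s):
    """Add (already warped) residuals back onto the up-sampled low-res result."""
    up = upsample(low_result, s)
    return [[u + r for u, r in zip(urow, rrow)] for urow, rrow in zip(up, warped_residual)]
```

With an unchanged low-resolution result and unwarped residuals, `recompose` returns the input image, which is why only the small component needs to pass through the network.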

[142]  arXiv:2008.00363 [pdf, other]
Title: Looking in the Right place for Anomalies: Explainable AI through Automatic Location Learning
Comments: 5 pages, Paper presented as a poster at the International Symposium on Biomedical Imaging, 2020, Paper Number 655
Journal-ref: 2020 IEEE 17th International Symposium on Biomedical Imaging (ISBI)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)

Deep learning has now become the de facto approach to the recognition of anomalies in medical imaging. Its 'black box' way of classifying medical images into anomaly labels poses problems for its acceptance, particularly with clinicians. Current explainable AI methods offer justifications through visualizations such as heat maps but cannot guarantee that the network is focusing on the relevant image region fully containing the anomaly. In this paper, we develop an approach to explainable AI in which the anomaly is assured to be overlapping the expected location when present. This is made possible by automatically extracting location-specific labels from textual reports and learning the association of expected locations to labels using a hybrid combination of Bi-Directional Long Short-Term Memory Recurrent Neural Networks (Bi-LSTM) and DenseNet-121. Use of this expected location to bias the subsequent attention-guided inference network based on ResNet101 results in the isolation of the anomaly at the expected location when present. The method is evaluated on a large chest X-ray dataset.

[143]  arXiv:2008.00364 [pdf, other]
Title: A Text Classification Survey: From Shallow to Deep Learning
Subjects: Computation and Language (cs.CL)

Text classification is the most fundamental and essential task in natural language processing. The last decade has seen a surge of research in this area due to the unprecedented success of deep learning. Numerous methods, datasets, and evaluation metrics have been proposed in the literature, raising the need for a comprehensive and updated survey. This paper fills the gap by reviewing the state-of-the-art approaches from 1961 to 2020, focusing on models from shallow to deep learning. We create a taxonomy for text classification according to the text involved and the models used for feature extraction and classification. We then discuss each of these categories in detail, dealing with both the technical developments and benchmark datasets that support tests of predictions. A comprehensive comparison between different techniques, as well as an identification of the pros and cons of various evaluation metrics, is also provided in this survey. Finally, we conclude by summarizing key implications, future research directions, and the challenges facing the research area.

[144]  arXiv:2008.00368 [pdf, other]
Title: Privacy-Aware Data Cleaning-as-a-Service (Extended Version)
Subjects: Databases (cs.DB)

Data cleaning is a pervasive problem for organizations as they try to reap value from their data. Recent advances in networking and cloud computing technology have fueled a new computing paradigm called Database-as-a-Service, where data management tasks are outsourced to large service providers. In this paper, we consider a Data Cleaning-as-a-Service model that allows a client to interact with a data cleaning provider who hosts curated and sensitive data. We present PACAS: a Privacy-Aware data Cleaning-As-a-Service model that facilitates interaction between the parties with client query requests for data, and a service provider using a data pricing scheme that computes prices according to data sensitivity. We propose new extensions to the model to define generalized data repairs that obfuscate sensitive data to allow data sharing between the client and service provider. We present a new semantic distance measure to quantify the utility of such repairs, and we re-define the notion of consistency in the presence of generalized values. The PACAS model uses (X,Y,L)-anonymity that extends existing data publishing techniques to consider the semantics in the data while protecting sensitive values. Our evaluation over real data shows that PACAS safeguards semantically related sensitive values and provides lower repair errors compared to existing privacy-aware cleaning techniques.

[145]  arXiv:2008.00371 [pdf]
Title: Análisis jurídico de la discriminación algorítmica en los procesos de selección laboral
Comments: in Spanish
Subjects: Computers and Society (cs.CY)

The use of machine learning systems in processing job applications has made the process agile and efficient, but at the same time it has created problems in terms of equality, reliability and transparency. In this paper we explain some of the uses of ML in job selection processes in the United States, and we present some of the racial and sexual biases that have been detected. There are both practical and legal obstacles that impede the detection and analysis of these biases. It is also unclear how to approach algorithmic discrimination from a legal point of view. A possible analytical tool is provided by the American doctrine of Disparate Impact, but we show some of its limitations and problems when adapted to other legal systems, such as Colombian law. To conclude, we offer some desiderata that any legal analysis of algorithmic discrimination should provide.

[146]  arXiv:2008.00372 [pdf, other]
Title: Better Together: Online Probabilistic Clique Change Detection in 3D Landmark-Based Maps
Comments: Accepted as Contributed Paper at IROS 2020
Subjects: Robotics (cs.RO)

Many modern simultaneous localization and mapping (SLAM) techniques rely on sparse landmark-based maps due to their real-time performance. However, these techniques frequently assume that these landmarks are fixed in position over time, known as the static-world assumption. This is rarely, if ever, the case in most real-world environments. Even worse, over long deployments, robots are bound to observe traditionally static landmarks change, for example when an autonomous vehicle encounters a construction zone. This work addresses this challenge, accounting for changes in complex three-dimensional environments with the creation of a probabilistic filter that operates on the features that give rise to landmarks. To accomplish this, landmarks are clustered into cliques and a filter is developed to estimate their persistence jointly among observations of the landmarks in a clique. This filter uses estimated spatial-temporal priors of geometric objects, allowing for dynamic and semi-static objects to be removed from a formerly static map. The proposed algorithm is validated in a 3D simulated environment.

[147]  arXiv:2008.00376 [pdf, other]
Title: Velocity Regulation of 3D Bipedal Walking Robots with Uncertain Dynamics Through Adaptive Neural Network Controller
Comments: Accepted at 2020 International Conference on Intelligent Robots and Systems (IROS 2020). Supplemental Video: this https URL
Subjects: Robotics (cs.RO)

This paper presents a neural-network-based adaptive feedback control structure to regulate the velocity of 3D bipedal robots under dynamics uncertainties. Existing Hybrid Zero Dynamics (HZD)-based controllers regulate velocity through the implementation of heuristic regulators that do not consider model and environmental uncertainties, which may significantly affect the tracking performance of the controllers. In this paper, we address the uncertainties in the robot dynamics from the perspective of the reduced-dimensional representation of virtual constraints and propose the integration of an adaptive neural-network-based controller to regulate the robot velocity in the presence of model parameter uncertainties. The proposed approach yields improved tracking performance under dynamics uncertainties. The shallow adaptive neural network used in this paper does not require training a priori and has the potential to be implemented on a real-time robotic controller. A comparative simulation study of a 3D Cassie robot is presented to illustrate the performance of the proposed approach under various scenarios.

[148]  arXiv:2008.00380 [pdf]
Title: Vision and Inertial Sensing Fusion for Human Action Recognition : A Review
Comments: 14 pages, 4 figures, 2 tables. Submitted to IEEE Sensors Journal
Subjects: Human-Computer Interaction (cs.HC); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

Human action recognition is used in many applications such as video surveillance, human-computer interaction, assistive living, and gaming. Many papers have appeared in the literature showing that the fusion of vision and inertial sensing improves recognition accuracy compared to using each sensing modality individually. This paper provides a survey of the papers in which vision and inertial sensing are used simultaneously within a fusion framework in order to perform human action recognition. The surveyed papers are categorized in terms of fusion approaches, features, classifiers, as well as multimodality datasets considered. Challenges as well as possible future directions are also stated for deploying the fusion of these two sensing modalities under realistic conditions.

[149]  arXiv:2008.00386 [pdf, other]
Title: Bayesian Optimization for Selecting Efficient Machine Learning Models
Comments: Published at CIKM MoST-Rec 2019
Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)

The performance of many machine learning models depends on their hyper-parameter settings. Bayesian Optimization has become a successful tool for hyper-parameter optimization of machine learning algorithms, which aims to identify optimal hyper-parameters during an iterative sequential process. However, most Bayesian Optimization algorithms are designed to select models for effectiveness only and ignore the important issue of model training efficiency. Given that both model effectiveness and training time are important for real-world applications, models selected for effectiveness may not meet the strict training time requirements necessary to deploy in a production environment. In this work, we present a unified Bayesian Optimization framework for jointly optimizing models for both prediction effectiveness and training efficiency. We propose an objective that captures the tradeoff between these two metrics and demonstrate how we can jointly optimize them in a principled Bayesian Optimization framework. Experiments on model selection for recommendation tasks indicate that models selected this way significantly improve model training efficiency while maintaining strong effectiveness as compared to state-of-the-art Bayesian Optimization algorithms.

[150]  arXiv:2008.00394 [pdf, other]
Title: Point Cloud Completion by Learning Shape Priors
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)

In view of the difficulty in reconstructing object details in point cloud completion, we propose a shape prior learning method for object completion. The shape priors include geometric information in both the complete and the partial point clouds. We design a feature alignment strategy to learn the shape prior from complete points, and a coarse-to-fine strategy to incorporate the partial prior in the fine stage. To learn the complete-object prior, we first train a point cloud auto-encoder to extract the latent embeddings from complete points. Then we learn a mapping to transfer the point features from partial points to those of the complete points by optimizing feature alignment losses. The feature alignment losses consist of an L2 distance and an adversarial loss obtained by a Maximum Mean Discrepancy Generative Adversarial Network (MMD-GAN). The L2 distance optimizes the partial features towards the complete ones in the feature space, and MMD-GAN decreases the statistical distance between the two sets of point features in a Reproducing Kernel Hilbert Space. We achieve state-of-the-art performance on the point cloud completion task. Our code is available at https://github.com/xiaogangw/point-cloud-completion-shape-prior.
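
As background on the statistical distance mentioned above, here is a minimal sketch of the plain (biased) squared-MMD estimator between two sets of feature vectors, using a fixed Gaussian kernel. This is only the textbook quantity; the adversarially learned kernel of MMD-GAN and the paper's actual loss are not shown, and the bandwidth `sigma` is an arbitrary assumption.

```python
import math


def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-d2 / (2.0 * sigma ** 2))


def mmd2(xs, ys, sigma=1.0):
    """Biased estimate of squared Maximum Mean Discrepancy between xs and ys.

    Equals the squared RKHS distance between the two empirical mean
    embeddings, so it is zero when both sets are identical.
    """
    def mean_k(a, b):
        total = sum(gaussian_kernel(u, v, sigma) for u in a for v in b)
        return total / (len(a) * len(b))

    return mean_k(xs, xs) + mean_k(ys, ys) - 2.0 * mean_k(xs, ys)
```

In a feature-alignment setting, `xs` would play the role of features from partial inputs and `ys` those from complete inputs, with training pushing the estimate towards zero.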

[151]  arXiv:2008.00395 [pdf, other]
Title: Balancing Common Treatment and Epidemic Control in Medical Procurement during COVID-19: Transform-and-Divide Evolutionary Optimization
Subjects: Neural and Evolutionary Computing (cs.NE)

Balancing common disease treatment and epidemic control is a key objective of medical supplies procurement in hospitals during a pandemic such as COVID-19. This problem can be formulated as a bi-objective optimization problem for simultaneously optimizing the effects of common disease treatment and epidemic control. However, due to the large number of supplies, difficulties in evaluating the effects, and the strict budget constraint, it is difficult for existing evolutionary multiobjective algorithms to efficiently approximate the Pareto front of the problem. In this paper, we present an approach that first transforms the original high-dimensional, constrained multiobjective optimization problem to a low-dimensional, unconstrained multiobjective optimization problem, and then evaluates each solution to the transformed problem by solving a set of simple single-objective optimization subproblems, such that the problem can be efficiently solved by existing evolutionary multiobjective algorithms. We applied the transform-and-divide evolutionary optimization approach to six hospitals in Zhejiang Province, China, during the peak of COVID-19. Results showed that the proposed approach exhibits significantly better performance than that of directly solving the original problem. Our study has also shown that transform-and-divide evolutionary optimization based on problem-specific knowledge can be an efficient solution approach to many other complex problems and, therefore, enlarge the application field of evolutionary algorithms.

[152]  arXiv:2008.00397 [pdf, ps, other]
Title: SeqDialN: Sequential Visual Dialog Networks in Joint Visual-Linguistic Representation Space
Comments: 18 pages, 4 figures, 5 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)

In this work, we formulate a visual dialog as an information flow in which each piece of information is encoded with the joint visual-linguistic representation of a single dialog round. Based on this formulation, we consider the visual dialog task as a sequence problem consisting of ordered visual-linguistic vectors. For featurization, we use a Dense Symmetric Co-Attention network as a lightweight vision-language joint representation generator to fuse multimodal features (i.e., image and text), yielding better computation and data efficiencies. For inference, we propose two Sequential Dialog Networks (SeqDialN): the first uses LSTM for information propagation (IP) and the second uses a modified Transformer for multi-step reasoning (MR). Our architecture separates the complexity of multimodal feature fusion from that of inference, which allows simpler design of the inference engine. IP-based SeqDialN is our baseline with a simple 2-layer LSTM design that achieves decent performance. MR-based SeqDialN, on the other hand, recurrently refines the semantic question/history representations through the self-attention stack of the Transformer and produces promising results on the visual dialog task. On the VisDial v1.0 test-std dataset, our best single generative SeqDialN achieves 62.54% NDCG and 48.63% MRR; our ensemble generative SeqDialN achieves 63.78% NDCG and 49.98% MRR, which sets a new state of the art for generative visual dialog models. We fine-tune discriminative SeqDialN with dense annotations and boost the performance up to 72.41% NDCG and 55.11% MRR. In this work, we discuss the extensive experiments we have conducted to demonstrate the effectiveness of our model components. We also provide visualization for the reasoning process from the relevant conversation rounds and discuss our fine-tuning methods. Our code is available at https://github.com/xiaoxiaoheimei/SeqDialN

[153]  arXiv:2008.00401 [pdf, other]
Title: Multilingual Translation with Extensible Multilingual Pretraining and Finetuning
Comments: 10 pages (main) + 5 pages (appendices). 9 tables and 2 figures
Subjects: Computation and Language (cs.CL)

Recent work demonstrates the potential of multilingual pretraining for creating one model that can be used for various tasks in different languages. Previous work in multilingual pretraining has demonstrated that machine translation systems can be created by finetuning on bitext. In this work, we show that multilingual translation models can be created through multilingual finetuning. Instead of finetuning on one direction, a pretrained model is finetuned on many directions at the same time. Compared to multilingual models trained from scratch, starting from pretrained models incorporates the benefits of large quantities of unlabeled monolingual data, which is particularly important for low resource languages where bitext is not available. We demonstrate that pretrained models can be extended to incorporate additional languages without loss of performance. We double the number of languages in mBART to support multilingual machine translation models of 50 languages. Finally, we create the ML50 benchmark, covering low, mid, and high resource languages, to facilitate reproducible research by standardizing training and evaluation data. On ML50, we demonstrate that multilingual finetuning improves by 1 BLEU on average over the strongest baselines (being either multilingual from scratch or bilingual finetuning) while improving by 9.3 BLEU on average over bilingual baselines from scratch.

[154]  arXiv:2008.00404 [pdf, other]
Title: Detecting Relevant Feature Interactions for Recommender Systems via Graph Neural Networks
Comments: 12 pages, 5 figures
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)

Feature interactions are essential for achieving high accuracy in recommender systems (RS), so they have been taken into consideration in many existing RS, where all feature interactions are modeled. Nevertheless, not all feature interactions have positive effects for RS: modeling irrelevant feature interactions may introduce noise and degrade the accuracy. To overcome this problem, in this work, we propose a graph neural network-based model, L0-SIGN, to detect the relevance of feature interactions and utilize only the relevant ones for RS, with features as nodes and feature interactions as edges. Generally, our model consists of two components: an L0 regularization based edge prediction model to explicitly detect relevant feature interactions; and a graph classification model, SIGN, to effectively model and aggregate the detected ones for recommendations. These two components positively influence each other to ensure that the most relevant feature interactions will be detected and modeled. In addition, we further prove that the effectiveness of our model is theoretically sound. We first show that our model is a variational approximation of the information bottleneck principle, i.e., the detected feature interactions are guaranteed to be most relevant. We then show that our model follows the definition of statistical interactions, proving that the modeling of detected feature interactions in L0-SIGN is effective. Experimental results show that (i) L0-SIGN outperforms existing baselines in terms of accuracy, and (ii) the detected feature interactions are beneficial for performance gain and interpretability.

[155]  arXiv:2008.00407 [pdf, other]
Title: Removing Backdoor-Based Watermarks in Neural Networks with Limited Data
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Deep neural networks have been widely applied and achieved great success in various fields. As training deep models usually consumes massive data and computational resources, trading trained deep models is in high demand and lucrative nowadays. Unfortunately, naive trading schemes typically involve potential risks related to copyright and trustworthiness issues, e.g., a sold model can be illegally resold to others without further authorization to reap huge profits. To tackle this problem, various watermarking techniques have been proposed to protect the model intellectual property, amongst which backdoor-based watermarking is the most commonly used. However, the robustness of these watermarking approaches is not well evaluated under realistic settings, such as limited availability of in-distribution data and no knowledge of the watermarking patterns. In this paper, we benchmark the robustness of watermarking, and propose a novel backdoor-based watermark removal framework using limited data, dubbed WILD. The proposed WILD removes the watermarks of deep models with only a small portion of training data, and the output model can perform the same as models trained from scratch without watermarks injected. In particular, a novel data augmentation method is utilized to mimic the behavior of watermark triggers. Combined with the distribution alignment between the normal and perturbed (e.g., occluded) data in the feature space, our approach generalizes well on all typical types of trigger contents. The experimental results demonstrate that our approach can effectively remove the watermarks without compromising the deep model performance for the original task with limited access to training data.

[156]  arXiv:2008.00408 [pdf]
Title: Blackbox Trojanising of Deep Learning Models : Using non-intrusive network structure and binary alterations
Authors: Jonathan Pan
Comments: 6 pages, 2 Figures
Subjects: Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR)

Recent advancements in Artificial Intelligence, namely in Deep Learning, have heightened its adoption in many applications. Some are playing important roles to the extent that we are heavily dependent on them for our livelihood. However, as with all technologies, there are vulnerabilities that malicious actors could exploit. One form of exploitation is to turn these technologies, intended for good, into dual-purpose instruments that support deviant acts, such as malicious software trojans. As part of proactive defense, researchers are proactively identifying such vulnerabilities so that protective measures can be developed subsequently. This research explores a novel blackbox trojanising approach using a simple network structure modification to any deep learning image classification model that would transform a benign model into a deviant one with a simple manipulation of the weights to induce specific types of errors. Proposals to protect against the occurrence of such simple exploits are discussed in this research. This research highlights the importance of providing sufficient safeguards to these models so that the intended good of AI innovation and adoption may be protected.

[157]  arXiv:2008.00409 [pdf, other]
Title: P-Cloth: Interactive Complex Cloth Simulation on Multi-GPU Systems using Dynamic Matrix Assembly and Pipelined Implicit Integrators
Subjects: Graphics (cs.GR)

We present a novel parallel algorithm for cloth simulation that exploits multiple GPUs for fast computation and the handling of very high resolution meshes. To accelerate implicit integration, we describe new parallel algorithms for sparse matrix-vector multiplication (SpMV) and for dynamic matrix assembly on a multi-GPU workstation. Our algorithms use a novel work queue generation scheme for a fat-tree GPU interconnect topology. Furthermore, we present a novel collision handling scheme that uses spatial hashing for discrete and continuous collision detection along with a non-linear impact zone solver. Our parallel schemes can distribute the computation and storage overhead among multiple GPUs and enable us to perform almost interactive simulation on complex cloth meshes, which can hardly be handled on a single GPU due to memory limitations. We have evaluated the performance with two multi-GPU workstations (with 4 and 8 GPUs, respectively) on cloth meshes with 0.5-1.65M triangles. Our approach can reliably handle the collisions and generate vivid wrinkles and folds at 2-5 fps, which is significantly faster than prior cloth simulation systems. We observe almost linear speedups with respect to the number of GPUs.
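
Spatial hashing, mentioned above as the basis of collision detection, can be sketched in a few lines. The version below is a deliberately simplified, sequential broad phase over 3D points rather than cloth triangles, and it makes no claim about the paper's GPU scheme; `close_pairs` and the choice of cell size equal to the query radius are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import product


def close_pairs(points, radius):
    """Return index pairs (i, j), i < j, of 3D points within `radius`.

    Points are bucketed into a uniform grid whose cell size equals the
    radius, so any close pair lies in the same or an adjacent cell.
    """
    grid = defaultdict(list)
    for idx, p in enumerate(points):
        key = tuple(int(c // radius) for c in p)
        grid[key].append(idx)

    r2 = radius * radius
    pairs = set()
    for key, members in grid.items():
        # Visit the cell itself and its 26 neighbours.
        for off in product((-1, 0, 1), repeat=3):
            nb = tuple(k + o for k, o in zip(key, off))
            # .get avoids creating empty cells while iterating the grid.
            for j in grid.get(nb, ()):
                for i in members:
                    if i < j:
                        d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                        if d2 <= r2:
                            pairs.add((i, j))
    return pairs
```

The hash keeps the number of exact distance tests close to the number of genuinely nearby pairs instead of growing quadratically with the point count.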

[158]  arXiv:2008.00410 [pdf, other]
Title: Interpretable Rule Discovery Through Bilevel Optimization of Split-Rules of Nonlinear Decision Trees for Classification Problems
Comments: Total 26 pages and 30 figures. Main Paper: 12 pages, 12 figures. Supplementary Document: 14 pages, 18 figures
Subjects: Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)

For supervised classification problems involving design, control, and other practical purposes, users are not only interested in finding a highly accurate classifier, but they also demand that the obtained classifier be easily interpretable. While the definition of interpretability of a classifier can vary from case to case, here we restrict a humanly interpretable classifier to one that can be expressed in simple mathematical terms. As a novel approach, we represent a classifier as an assembly of simple mathematical rules using a non-linear decision tree (NLDT). Each conditional (non-terminal) node of the tree represents a non-linear mathematical rule (split-rule) involving features in order to partition the dataset in the given conditional node into two non-overlapping subsets. This partitioning is intended to minimize the impurity of the resulting child nodes. By restricting the structure of the split-rule at each conditional node and the depth of the decision tree, the interpretability of the classifier is assured. The non-linear split-rule at a given conditional node is obtained using an evolutionary bilevel optimization algorithm, in which the upper level focuses on arriving at an interpretable structure of the split-rule, while the lower level finds the most appropriate weights (coefficients) of individual constituents of the rule to minimize the net impurity of the two resulting child nodes. The performance of the proposed algorithm is demonstrated on a number of controlled test problems, existing benchmark problems, and industrial problems. Results on two to 500-feature problems are encouraging and open up further scope for applying the proposed approach to more challenging and complex classification tasks.
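
To make the notion of a split-rule's net impurity concrete, the sketch below scores a candidate rule by the size-weighted impurity of the two child nodes it produces. The abstract does not name the impurity measure, so Gini impurity is used here purely as an assumption, and the bilevel evolutionary search itself is not shown; `gini` and `split_impurity` are illustrative names.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty or pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_impurity(X, y, rule):
    """Size-weighted impurity of the children produced by a split-rule.

    `rule` maps a feature vector to a real number; samples with
    rule(x) <= 0 go to the left child, the rest to the right child.
    """
    left = [label for x, label in zip(X, y) if rule(x) <= 0]
    right = [label for x, label in zip(X, y) if rule(x) > 0]
    n = len(y)
    return len(left) / n * gini(left) + len(right) / n * gini(right)
```

For data separated by a circle, a non-linear rule such as `lambda x: x[0] ** 2 + x[1] ** 2 - 1` reaches zero impurity in one node, whereas an axis-aligned linear rule cannot, which is the motivation for allowing non-linear rules while keeping the tree shallow.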

[159]  arXiv:2008.00414 [pdf, ps, other]
Title: On the Security of Networked Control Systems in Smart Vehicle and its Adaptive Cruise Control
Comments: This paper has been accepted and is to appear in IEEE Transactions on Intelligent Transportation Systems
Subjects: Cryptography and Security (cs.CR); Systems and Control (eess.SY)

With the benefits of the Internet of Vehicles (IoV) paradigm come unprecedented security challenges. Among many applications of inter-connected systems, vehicular networks and smart cars are examples that are already rolled out. Smart vehicles not only have networks connecting their internal components, e.g., via the Controller Area Network (CAN) bus, but also are connected to the outside world through road side units and other vehicles. In some cases, the internal and external network packets pass through the same hardware and are merely isolated by software-defined rules. Any misconfiguration opens a window for hackers to intrude into vehicles' internal components, e.g., the central lock system, Engine Control Unit (ECU), Anti-lock Braking System (ABS) or Adaptive Cruise Control (ACC) system. Compromise of any of these can lead to disastrous outcomes. In this paper, we study the security of smart vehicles' adaptive cruise control systems in the presence of covert attacks. We define two covert/stealth attacks in the context of cruise control and propose a novel intrusion detection and compensation method to disclose and respond to such attacks. More precisely, we focus on the covert cyber attacks that compromise the integrity of the cruise controller and employ a neural network identifier in the IDS engine to estimate the system output dynamically and compare it against the ACC output. If any anomaly is detected, an embedded substitute controller kicks in and takes over the control. We conducted extensive experiments in MATLAB to evaluate the effectiveness of the proposed scheme in a simulated environment.

[160]  arXiv:2008.00418 [pdf, other]
Title: Blind Face Restoration via Deep Multi-scale Component Dictionaries
Comments: In ECCV 2020. Code is available at: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Recent reference-based face restoration methods have received considerable attention due to their great capability in recovering high-frequency details on real low-quality images. However, most of these methods require a high-quality reference image of the same identity, making them only applicable in limited scenes. To address this issue, this paper suggests a deep face dictionary network (termed DFDNet) to guide the restoration process of degraded observations. To begin with, we use K-means to generate deep dictionaries for perceptually significant face components (i.e., left/right eyes, nose and mouth) from high-quality images. Next, with the degraded input, we match and select the most similar component features from their corresponding dictionaries and transfer the high-quality details to the input via the proposed dictionary feature transfer (DFT) block. In particular, component AdaIN is leveraged to eliminate the style diversity between the input and dictionary features (e.g., illumination), and a confidence score is proposed to adaptively fuse the dictionary feature to the input. Finally, multi-scale dictionaries are adopted in a progressive manner to enable coarse-to-fine restoration. Experiments show that our proposed method can achieve plausible performance in both quantitative and qualitative evaluation, and more importantly, can generate realistic and promising results on real degraded images without requiring a reference image of the same identity. The source code and models are available at https://github.com/csxmli2016/DFDNet.

[161]  arXiv:2008.00420 [pdf, ps, other]
Title: Forbidden Induced Subgraphs and the Łoś-Tarski Theorem
Subjects: Logic in Computer Science (cs.LO)

Let $\mathscr C$ be a class of finite and infinite graphs that is closed under induced subgraphs. The well-known Łoś-Tarski Theorem from classical model theory implies that $\mathscr C$ is definable in first-order logic (FO) by a sentence $\varphi$ if and only if $\mathscr C$ has a finite set of forbidden induced finite subgraphs. It provides a powerful tool to show nontrivial characterizations of graphs of small vertex cover, of bounded tree-depth, of bounded shrub-depth, etc. in terms of forbidden induced finite subgraphs. Furthermore, by the Completeness Theorem, we can compute from $\varphi$ the corresponding forbidden induced subgraphs. We show that this machinery fails on finite graphs.
- There is a class $\mathscr C$ of finite graphs which is definable in FO and closed under induced subgraphs but has no finite set of forbidden induced subgraphs.
- Even if we only consider classes $\mathscr C$ of finite graphs which can be characterized by a finite set of forbidden induced subgraphs, such a characterization cannot be computed from an FO-sentence $\varphi$, which defines $\mathscr C$, and the size of the characterization cannot be bounded by $f(|\varphi|)$ for any computable function $f$.
Besides their importance in graph theory, the above results also significantly strengthen similar known results for arbitrary structures.
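
As a small illustration of the correspondence between FO-definability and forbidden induced subgraphs (a standard textbook case, not an example taken from the paper), consider graphs whose connected components are all cliques. The sentence below is a sketch for simple, loopless graphs with edge relation $E$:

```latex
% Graphs with no induced path on three vertices (P_3):
\varphi \;=\; \forall x\,\forall y\,\forall z\;
  \bigl( (E(x,y) \wedge E(y,z) \wedge x \neq z) \rightarrow E(x,z) \bigr)
% The class \mathscr C = \{ G : G \models \varphi \} is closed under induced
% subgraphs, and its single forbidden induced subgraph is P_3.
```

Here the forbidden subgraph can be read off the sentence directly; the results listed above concern cases where no such finite or computable characterization is available.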

[162]  arXiv:2008.00421 [pdf, ps, other]
Title: Concolic Testing in CLP
Subjects: Logic in Computer Science (cs.LO); Programming Languages (cs.PL); Software Engineering (cs.SE)

Concolic testing is a popular software verification technique based on a combination of concrete and symbolic execution. Its main focus is finding bugs and generating test cases with the aim of maximizing code coverage. A previous approach to concolic testing in logic programming was not sound because it only dealt with positive constraints (by means of substitutions) but could not represent negative constraints. In this paper, we present a novel framework for concolic testing of CLP programs that generalizes the previous technique. In the CLP setting, one can represent both positive and negative constraints in a natural way, thus giving rise to a sound and (potentially) more efficient technique. Defining verification and testing techniques for CLP programs is increasingly relevant since this framework is becoming popular as an intermediate representation to analyze programs written in other programming paradigms.

[163]  arXiv:2008.00425 [pdf, other]
Title: Concentration-Bound Analysis for Probabilistic Programs and Probabilistic Recurrence Relations
Comments: 29 pages
Subjects: Programming Languages (cs.PL); Data Structures and Algorithms (cs.DS)

Analyzing probabilistic programs and randomized algorithms are classical problems in computer science. The first basic problem in the analysis of stochastic processes is to consider the expectation or mean, and another basic problem is to consider concentration bounds, i.e. showing that large deviations from the mean have small probability. Similarly, in the context of probabilistic programs and randomized algorithms, the analysis of expected termination time/running time and their concentration bounds are fundamental problems. In this work, we focus on concentration bounds for probabilistic programs and probabilistic recurrences of randomized algorithms. For probabilistic programs, the basic technique to achieve concentration bounds is to consider martingales and apply the classical Azuma's inequality. For probabilistic recurrences of randomized algorithms, Karp's classical "cookbook" method, which is similar to the master theorem for recurrences, is the standard approach to obtain concentration bounds. In this work, we propose a novel approach for deriving concentration bounds for probabilistic programs and probabilistic recurrence relations through the synthesis of exponential supermartingales. For probabilistic programs, we present algorithms for synthesis of such supermartingales in several cases. We also show that our approach can derive better concentration bounds than simply applying the classical Azuma's inequality over various probabilistic programs considered in the literature. For probabilistic recurrences, our approach can derive tighter bounds than the well-established methods for classical algorithms such as quick sort, quick select, and randomized diameter computation. We also present a prototype implementation that can automatically infer these bounds.
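
For reference, the two ingredients named above can be written out. The first display is the standard Azuma inequality; the second is the generic way an exponential supermartingale yields a tail bound via Markov's inequality. The symbols $X_n$, $c_k$, $\lambda$ and $t$ are notation assumed here for illustration, and the second display shows only the general mechanism, not the paper's synthesis algorithm.

```latex
% Azuma: (X_n) a martingale with |X_k - X_{k-1}| \le c_k almost surely, t > 0.
\Pr\bigl( X_n - X_0 \ge t \bigr)
  \;\le\; \exp\!\left( - \frac{t^2}{2 \sum_{k=1}^{n} c_k^2} \right)

% Exponential supermartingale: if Y_n = e^{\lambda X_n} with \lambda > 0
% is a supermartingale, then E[Y_n] \le E[Y_0] and, by Markov's inequality,
\Pr\bigl( X_n \ge t \bigr)
  \;=\; \Pr\bigl( Y_n \ge e^{\lambda t} \bigr)
  \;\le\; e^{-\lambda t}\, \mathbb{E}[Y_n]
  \;\le\; e^{-\lambda t}\, \mathbb{E}\bigl[ e^{\lambda X_0} \bigr]
```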

[164]  arXiv:2008.00427 [pdf, ps, other]
Title: Structured strong linearizations of structured rational matrices
Comments: 35 pages
Subjects: Numerical Analysis (math.NA)

Structured rational matrices such as symmetric, skew-symmetric, Hamiltonian, skew-Hamiltonian, Hermitian, and para-Hermitian rational matrices arise in many applications. Linearizations of rational matrices have been introduced recently for computing poles, eigenvalues, eigenvectors, minimal bases and minimal indices of rational matrices. For structured rational matrices, it is desirable to construct structure-preserving linearizations so as to preserve the symmetry in the eigenvalues and poles of the rational matrices. With a view to constructing structure-preserving linearizations of structured rational matrices, we propose a family of Fiedler-like pencils and show that the family of Fiedler-like pencils is a rich source of structure-preserving strong linearizations of structured rational matrices. We construct symmetric, skew-symmetric, Hamiltonian, skew-Hamiltonian, Hermitian, skew-Hermitian, para-Hermitian and para-skew-Hermitian strong linearizations of a rational matrix $G(\lambda)$ when $G(\lambda)$ has the same structure. Further, when $G(\lambda)$ is real and symmetric, we show that the transfer functions of real symmetric linearizations of $G(\lambda)$ preserve the Cauchy-Maslov index of $G(\lambda).$ We describe the recovery of eigenvectors, minimal bases and minimal indices of $G(\lambda)$ from those of the linearizations of $G(\lambda)$ and show that the recovery is operation-free.

[165]  arXiv:2008.00428 [pdf, other]
Title: Multiobjective Backstepping Controller for Parallel Buck Converter
Subjects: Systems and Control (eess.SY)

A backstepping controller is designed for a system of parallel buck converters sharing a load. The controller objective is to ensure proper current sharing and output voltage regulation. The designed controller is successfully tested for both constant load and sudden changes in loading conditions.

[166]  arXiv:2008.00436 [pdf, other]
Title: Comparison results of $P_2$-finite elements for fourth-order semilinear von Karman equations
Authors: Gouranga Mallik
Subjects: Numerical Analysis (math.NA)

Lower-order $P_2$ finite elements are popular for solving fourth-order elliptic PDEs when the solution has limited regularity. A priori and a posteriori error estimates for von Karman equations are considered in Carstensen et al. (2019, 2020) with respect to different mesh-dependent norms which involve different jump and penalization terms. This paper addresses the question of whether they are comparable with respect to a common norm. This article establishes that the errors for the quadratic symmetric interior discontinuous Galerkin, $C^0$ interior penalty and nonconforming Morley finite element methods are equivalent up to some higher-order oscillation term with respect to a unified norm. Numerical experiments are performed to substantiate the comparison results.

[167]  arXiv:2008.00438 [pdf, other]
Title: On extremal leaf status and internal status of trees
Authors: Haiyan Guo, Bo Zhou
Subjects: Discrete Mathematics (cs.DM); Social and Information Networks (cs.SI)

For a vertex $u$ of a tree $T$, the leaf (internal, respectively) status of $u$ is the sum of the distances from $u$ to all leaves (internal vertices, respectively) of $T$. The minimum (maximum, respectively) leaf status of a tree $T$ is the minimum (maximum, respectively) of the leaf statuses of all vertices of $T$. The minimum (maximum, respectively) internal status of a tree $T$ is the minimum (maximum, respectively) of the internal statuses of all vertices of $T$. We give the smallest and largest values for the minimum leaf status, maximum leaf status, minimum internal status, and maximum internal status of a tree and characterize the extremal cases. We also discuss these parameters of a tree with given diameter or maximum degree.
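
The definitions above can be checked on small trees with a breadth-first search. The sketch below assumes the tree is given as an adjacency dictionary and treats a vertex of degree one as a leaf; the function names are hypothetical.

```python
from collections import deque

def distances(adj, src):
    """Breadth-first distances from `src` to every vertex of the tree."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

def statuses(adj):
    """Map each vertex u to (leaf status of u, internal status of u)."""
    leaves = {v for v, nbrs in adj.items() if len(nbrs) == 1}
    result = {}
    for u in adj:
        d = distances(adj, u)
        leaf = sum(d[v] for v in leaves)
        internal = sum(d[v] for v in adj if v not in leaves)
        result[u] = (leaf, internal)
    return result

# Star with centre 0 and three leaves.
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
table = statuses(star)
min_leaf_status = min(leaf for leaf, _ in table.values())
max_leaf_status = max(leaf for leaf, _ in table.values())
```

For the star, the centre has leaf status 3 and each leaf has leaf status 4, so the minimum and maximum leaf statuses of this tree are 3 and 4.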

[168]  arXiv:2008.00441 [pdf, other]
Title: Relation Extraction with Self-determined Graph Convolutional Network
Comments: CIKM-2020
Subjects: Computation and Language (cs.CL)

Relation Extraction is a way of obtaining the semantic relationship between entities in text. The state-of-the-art methods use linguistic tools to build a graph for the text in which the entities appear and then a Graph Convolutional Network (GCN) is employed to encode the pre-built graphs. Although their performance is promising, the reliance on linguistic tools results in a non-end-to-end process. In this work, we propose a novel model, the Self-determined Graph Convolutional Network (SGCN), which determines a weighted graph using a self-attention mechanism, rather than using any linguistic tool. Then, the self-determined graph is encoded using a GCN. We test our model on the TACRED dataset and achieve the state-of-the-art result. Our experiments show that SGCN outperforms the traditional GCN, which uses dependency parsing tools to build the graph.
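
The idea of letting self-attention determine a weighted graph that a GCN then encodes can be sketched as follows. This is a minimal illustration under assumptions, not the SGCN architecture: attention here is a single head of scaled dot products with no learned query or key projections, the graph convolution is the plain form ReLU(A H W), and the token features and weight matrix are made up.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def self_determined_graph(h):
    """Row-stochastic weighted adjacency from scaled dot-product self-attention."""
    scale = math.sqrt(len(h[0]))
    return [softmax([sum(x * y for x, y in zip(hi, hj)) / scale for hj in h])
            for hi in h]

def gcn_layer(adj, h, w):
    """One graph convolution: ReLU(A H W)."""
    return [[max(0.0, v) for v in row] for row in matmul(matmul(adj, h), w)]

# Three token vectors and a 2x2 weight matrix (illustrative values).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights = [[1.0, -1.0], [0.5, 0.5]]
adj = self_determined_graph(tokens)
encoded = gcn_layer(adj, tokens, weights)
```

No parser is involved: the edge weights in `adj` come entirely from the token representations.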

[169]  arXiv:2008.00444 [pdf, other]
Title: Principles and Algorithms for Forecasting Groups of Time Series: Locality and Globality
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)

Forecasting groups of time series is of increasing practical importance, e.g. forecasting the demand for multiple products offered by a retailer or server loads within a data center. The local approach to this problem considers each time series separately and fits a function or model to each series. The global approach fits a single function to all series. For groups of similar time series, global methods outperform the more established local methods. However, recent results show good performance of global models even in heterogeneous datasets. This suggests a more general applicability of global methods, potentially leading to more accurate tools and new scenarios to study.
Formalizing the setting of forecasting a set of time series with local and global methods, we provide the following contributions:
1) Global methods are not more restrictive than local methods: both can produce the same forecasts without any assumptions about similarity of the series. Global models can succeed in a wider range of problems than previously thought.
2) Basic generalization bounds for local and global algorithms. The complexity of local methods grows with the size of the set while it remains constant for global methods. In large datasets, a global algorithm can afford to be quite complex and still benefit from better generalization. These bounds serve to clarify and support recent experimental results in the field, and guide the design of new algorithms. For the class of autoregressive models, this implies that global models can have much larger memory than local methods.
3) In an extensive empirical study, purposely naive algorithms derived from these principles, such as global linear models or deep networks, result in superior accuracy.
In particular, global linear models can provide competitive accuracy with two orders of magnitude fewer parameters than local methods.
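
The local/global distinction can be made concrete with the simplest autoregressive model. The sketch below assumes an AR(1) model without intercept, y_t = a * y_{t-1}, fitted by least squares; the function names and the toy data are illustrative only.

```python
def lagged_pairs(series):
    """(previous value, current value) pairs of one series."""
    return list(zip(series[:-1], series[1:]))

def fit_ar1(pairs):
    """Least-squares coefficient a for y_t = a * y_{t-1}."""
    num = sum(prev * cur for prev, cur in pairs)
    den = sum(prev * prev for prev, _ in pairs)
    return num / den

def fit_local(group):
    """Local approach: one coefficient per series."""
    return [fit_ar1(lagged_pairs(s)) for s in group]

def fit_global(group):
    """Global approach: a single coefficient fitted on the pooled pairs."""
    pooled = [pair for s in group for pair in lagged_pairs(s)]
    return fit_ar1(pooled)

group = [[1.0, 2.0, 4.0, 8.0], [3.0, 6.0, 12.0, 24.0]]
local_coeffs = fit_local(group)    # as many parameters as series
global_coeff = fit_global(group)   # one parameter for the whole set
```

The local fit needs one parameter per series, so its size grows with the group, while the global fit keeps a fixed number of parameters however many series are pooled.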

[170]  arXiv:2008.00446 [pdf, other]
Title: Stochastic Bundle Adjustment for Efficient and Scalable 3D Reconstruction
Comments: Accepted by ECCV 2020
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Current bundle adjustment solvers such as the Levenberg-Marquardt (LM) algorithm are limited by the bottleneck in solving the Reduced Camera System (RCS) whose dimension is proportional to the camera number. When the problem is scaled up, this step is neither efficient in computation nor manageable for a single compute node. In this work, we propose a stochastic bundle adjustment algorithm which seeks to decompose the RCS approximately inside the LM iterations to improve the efficiency and scalability. It first reformulates the quadratic programming problem of an LM iteration based on the clustering of the visibility graph by introducing the equality constraints across clusters. Then, we propose to relax it into a chance constrained problem and solve it through a sampled convex program. The relaxation is intended to eliminate the interdependence between clusters embodied by the constraints, so that a large RCS can be decomposed into independent linear sub-problems. Numerical experiments on unordered Internet image sets and sequential SLAM image sets, as well as distributed experiments on large-scale datasets, have demonstrated the high efficiency and scalability of the proposed approach. Code is released at https://github.com/zlthinker/STBA.

[171]  arXiv:2008.00448 [pdf, ps, other]
Title: Beamforming Design with Fast Convergence for IRS-Aided Full-Duplex Communication
Comments: accepted by IEEE Communications Letters
Subjects: Information Theory (cs.IT)

We study the beamforming optimization for an intelligent reflecting surface (IRS)-aided full-duplex (FD) communication system in this letter. Specifically, we maximize the sum rate of bi-directional transmissions by jointly optimizing the transmit beamforming and the beamforming of the IRS reflection. A fast converging alternating algorithm is developed to tackle this problem. In each iteration of the proposed algorithm, the solutions to the transmit beamforming and the IRS reflect beamforming are obtained in a semi-closed form and a closed form, respectively. Compared to an existing method based on the Arimoto-Blahut algorithm, the proposed method achieves almost the same performance while enjoying much faster convergence and lower computational complexity.

[172]  arXiv:2008.00455 [pdf, other]
Title: Video Super-Resolution with Recurrent Structure-Detail Network
Comments: ECCV 2020
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Most video super-resolution methods super-resolve a single reference frame with the help of neighboring frames in a temporal sliding window. They are less efficient compared to the recurrent-based methods. In this work, we propose a novel recurrent video super-resolution method which is both effective and efficient in exploiting previous frames to super-resolve the current frame. It divides the input into structure and detail components which are fed to a recurrent unit composed of several proposed two-stream structure-detail blocks. In addition, a hidden state adaptation module that allows the current frame to selectively use information from the hidden state is introduced to enhance its robustness to appearance change and error accumulation. Extensive ablation studies validate the effectiveness of the proposed modules. Experiments on several benchmark datasets demonstrate the superior performance of the proposed method compared to state-of-the-art methods on video super-resolution.

[173]  arXiv:2008.00456 [pdf, other]
Title: Hindsight for Foresight: Unsupervised Structured Dynamics Models from Physical Interaction
Comments: Accepted at the 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

A key challenge for an agent learning to interact with the world is to reason about physical properties of objects and to foresee their dynamics under the effect of applied forces. In order to scale learning through interaction to many objects and scenes, robots should be able to improve their own performance from real-world experience without requiring human supervision. To this end, we propose a novel approach for modeling the dynamics of a robot's interactions directly from unlabeled 3D point clouds and images. Unlike previous approaches, our method does not require ground-truth data associations provided by a tracker or any pre-trained perception network. To learn from unlabeled real-world interaction data, we enforce consistency of estimated 3D clouds, actions and 2D images with observed ones. Our joint forward and inverse network learns to segment a scene into salient object parts and predicts their 3D motion under the effect of applied actions. Moreover, our object-centric model outputs action-conditioned 3D scene flow, object masks and 2D optical flow as emergent properties. Our extensive evaluation both in simulation and with real-world data demonstrates that our formulation leads to effective, interpretable models that can be used for visuomotor control and planning. Videos, code and dataset are available at this http URL

[174]  arXiv:2008.00460 [pdf, other]
Title: Mask Point R-CNN
Subjects: Computer Vision and Pattern Recognition (cs.CV)

The attributes of object contours have great significance for the instance segmentation task. However, most of the current popular deep neural networks do not pay much attention to the target edge information. Inspired by the human annotation process when making instance segmentation datasets, in this paper, we propose Mask Point RCNN, which aims to promote the neural network's attention to the target edge information and can enhance information propagation between multiple tasks by using features of different attributes. Specifically, we present an auxiliary task to Mask RCNN, including utilizing keypoint detection technology to construct the target edge contour, and enhancing the sensitivity of the network to the object edge through multi-task learning and feature fusion. These improvements are easy to implement and have a small amount of additional computing overhead. Extensive evaluations on the Cityscapes dataset show that our approach outperforms vanilla Mask RCNN by 5.4 on the validation subset and 5.0 on the test subset.

[175]  arXiv:2008.00461 [pdf, other]
Title: Large-scale, Language-agnostic Discourse Classification of Tweets During COVID-19
Authors: Oguzhan Gencoglu
Comments: 28 pages, 4 figures
Subjects: Social and Information Networks (cs.SI); Computation and Language (cs.CL); Machine Learning (cs.LG)

Quantifying the characteristics of public attention is an essential prerequisite for appropriate crisis management during severe events such as pandemics. For this purpose, we propose language-agnostic tweet representations to perform large-scale Twitter discourse classification with machine learning. Our analysis of more than 26 million COVID-19 tweets shows that large-scale surveillance of public discourse is feasible with computationally lightweight classifiers by out-of-the-box utilization of these representations.

[176]  arXiv:2008.00463 [pdf, ps, other]
Title: Structural Causal Models Are (Solvable by) Credal Networks
Comments: To appear in the proceedings of the 10th International Conference on Probabilistic Graphical Models (PGM 2020)
Subjects: Artificial Intelligence (cs.AI)

A structural causal model is made of endogenous (manifest) and exogenous (latent) variables. We show that endogenous observations induce linear constraints on the probabilities of the exogenous variables. This allows a causal model to be mapped exactly into a credal network. Causal inferences, such as interventions and counterfactuals, can consequently be obtained by standard algorithms for the updating of credal nets. These natively return sharp values in the identifiable case, while intervals corresponding to the exact bounds are produced for unidentifiable queries. A characterization of the causal models that allow the map above to be compactly derived is given, along with a discussion about the scalability for general models. This contribution should be regarded as a systematic approach to represent structural causal models by credal networks and hence to systematically compute causal inferences. A number of demonstrative examples are presented to clarify our methodology. Extensive experiments show that approximate algorithms for credal networks can immediately be used to do causal inference in real-size problems.
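
The linear constraints mentioned above are easiest to see in the simplest case, sketched here under assumptions that are not taken from the paper: a single endogenous variable $X$ determined by one exogenous variable $U$ through a known structural equation $X = f(U)$, with $\theta_u := P(U = u)$ unknown and $P(X = x)$ available from data.

```latex
% Each observed probability constrains a sum of exogenous probabilities.
\sum_{u \,:\, f(u) = x} \theta_u \;=\; P(X = x) \quad \text{for every } x,
\qquad \sum_{u} \theta_u = 1,
\qquad \theta_u \ge 0 .
```

Every constraint is linear in $\theta$, so the admissible distributions over $U$ form a convex polytope, which is the kind of set of distributions a credal network works with.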

[177]  arXiv:2008.00474 [pdf]
Title: MDA Models and PIM/PSM Transformations Using Extended Automata
Comments: 11 pages, 5 figures, 2 tables
Subjects: Software Engineering (cs.SE); Formal Languages and Automata Theory (cs.FL)

This paper proposes a model of execution platform for the OMG request of a generic Platform-Independent Model (PIM) allowing realization of the Model Driven Architecture (MDA) standard. We propose AMDA (Automata based MDA), a method based on the use of parallel automata, which can be a common tool for building a PIM from UML diagrams (including OCL) and transforming the PIM to PSM automata and further to compilable code. Each platform would then have a mechanism to execute the translated code. Our architecture for a general PSM translator of these automata allows portable execution on various specific implementation platforms. This general translator must be written once, for the languages and with the libraries of the required specific PSM. This also allows interoperability between different PSMs. An ATM case study example is presented to illustrate the approach.

[178]  arXiv:2008.00476 [pdf, other]
Title: SCNet: A Neural Network for Automated Side-Channel Attack
Subjects: Cryptography and Security (cs.CR); Machine Learning (cs.LG)

The side-channel attack is an attack method based on the information gained about implementations of computer systems, rather than weaknesses in algorithms. Information about system characteristics such as power consumption, electromagnetic leaks and sound can be exploited by the side-channel attack to compromise the system. Much research effort has been directed towards this field. However, such an attack still requires strong skills, and thus can only be performed effectively by experts. Here, we propose SCNet, which automatically performs side-channel attacks. We also design this network by combining side-channel domain knowledge with different deep learning models, to improve the performance and to better explain the results. The results show that our model achieves good performance with fewer parameters. The proposed model is a useful tool for automatically testing the robustness of computer systems.

[179]  arXiv:2008.00482 [pdf]
Title: Investigating the Effect of Emoji in Opinion Classification of Uzbek Movie Review Comments
Comments: 10 pages, 1 figure, 3 tables
Subjects: Computation and Language (cs.CL)

Opinion mining on social media posts has become more and more popular. Users often express their opinion on a topic not only with words but also with image symbols such as emoticons and emoji. In this paper, we investigate the effect of emoji-based features in opinion classification of Uzbek texts, and more specifically movie review comments from YouTube. Several classification algorithms are tested, and feature ranking is performed to evaluate the discriminative ability of the emoji-based features.

[180]  arXiv:2008.00483 [pdf, ps, other]
Title: Single-Timescale Actor-Critic Provably Finds Globally Optimal Policy
Subjects: Machine Learning (cs.LG); Optimization and Control (math.OC); Machine Learning (stat.ML)

We study the global convergence and global optimality of actor-critic, one of the most popular families of reinforcement learning algorithms. While most existing works on actor-critic employ bi-level or two-timescale updates, we focus on the more practical single-timescale setting, where the actor and critic are updated simultaneously. Specifically, in each iteration, the critic update is obtained by applying the Bellman evaluation operator only once while the actor is updated in the policy gradient direction computed using the critic. Moreover, we consider two function approximation settings where both the actor and critic are represented by linear or deep neural networks. For both cases, we prove that the actor sequence converges to a globally optimal policy at a sublinear $O(K^{-1/2})$ rate, where $K$ is the number of iterations. To the best of our knowledge, we establish the rate of convergence and global optimality of single-timescale actor-critic with linear function approximation for the first time. Moreover, under the broader scope of policy optimization with nonlinear function approximation, we prove that actor-critic with deep neural networks finds the globally optimal policy at a sublinear rate for the first time.
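
The single-timescale update pattern described above (one application of the Bellman evaluation operator for the critic, one actor step using the current critic, both in the same iteration) can be sketched in a tabular setting. This is a toy illustration, not the algorithm analysed in the paper: there is no function approximation, transitions are assumed deterministic, and the actor step `theta += eta * Q` is a simplified natural-gradient-style update for softmax policies. The names `reward`, `next_state`, `eta` and the two-state example are hypothetical.

```python
import math

def softmax(prefs):
    """Numerically stable softmax over action preferences."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]

def single_timescale_actor_critic(reward, next_state, gamma=0.9, eta=0.1, iters=200):
    n_states, n_actions = len(reward), len(reward[0])
    theta = [[0.0] * n_actions for _ in range(n_states)]  # actor preferences
    q = [[0.0] * n_actions for _ in range(n_states)]      # critic estimate
    for _ in range(iters):
        pi = [softmax(row) for row in theta]
        # Value of each state under the current policy and current critic.
        v = [sum(p * qa for p, qa in zip(pi[s], q[s])) for s in range(n_states)]
        # Critic: apply the Bellman evaluation operator exactly once.
        new_q = [[reward[s][a] + gamma * v[next_state[s][a]]
                  for a in range(n_actions)] for s in range(n_states)]
        # Actor: one step using the current (not yet converged) critic.
        new_theta = [[theta[s][a] + eta * q[s][a]
                      for a in range(n_actions)] for s in range(n_states)]
        theta, q = new_theta, new_q  # simultaneous update
    return [softmax(row) for row in theta], q

# Two states, two actions: action 1 always pays 1 and leads to state 1.
reward = [[0.0, 1.0], [0.0, 1.0]]
next_state = [[0, 1], [0, 1]]
policy, critic = single_timescale_actor_critic(reward, next_state)
```

In this toy problem the learned policy concentrates on action 1 in both states even though the critic is never run to convergence between actor steps.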

[181]  arXiv:2008.00485 [pdf, other]
Title: SymmetryNet: Learning to Predict Reflectional and Rotational Symmetries of 3D Shapes from Single-View RGB-D Images
Comments: 14 pages
Journal-ref: ACM Transactions on Graphics(Proc. of SIGGRAPH Asia), 2020
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)

We study the problem of symmetry detection of 3D shapes from single-view RGB-D images, where severe missing data renders geometric detection approaches infeasible. We propose an end-to-end deep neural network which is able to predict both reflectional and rotational symmetries of 3D objects present in the input RGB-D image. Directly training a deep model for symmetry prediction, however, can quickly run into the issue of overfitting. We adopt a multi-task learning approach. Aside from symmetry axis prediction, our network is also trained to predict symmetry correspondences. In particular, given the 3D points present in the RGB-D image, our network outputs for each 3D point its symmetric counterpart corresponding to a specific predicted symmetry. In addition, our network is able to detect for a given shape multiple symmetries of different types. We also contribute a benchmark of 3D symmetry detection based on single-view RGB-D images. Extensive evaluation on the benchmark demonstrates the strong generalization ability of our method, in terms of high accuracy of both symmetry axis prediction and counterpart estimation. In particular, our method is robust in handling unseen object instances with large variation in shape, multi-symmetry composition, as well as novel object categories.
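
For a reflectional symmetry, the "symmetric counterpart" of a 3D point is its mirror image across the symmetry plane. The sketch below shows only that geometric relation, assuming the plane is given by a point on it and a normal vector; it says nothing about how the network predicts either quantity, and the function name is hypothetical.

```python
import math

def reflect(point, plane_point, normal):
    """Mirror `point` across the plane through `plane_point` with normal `normal`."""
    norm = math.sqrt(sum(n * n for n in normal))
    unit = [n / norm for n in normal]
    # Signed distance from the point to the plane along the unit normal.
    signed = sum((p - c) * u for p, c, u in zip(point, plane_point, unit))
    return [p - 2.0 * signed * u for p, u in zip(point, unit)]

# Mirror across the plane x = 0.
counterpart = reflect([1.0, 2.0, 3.0], [0.0, 0.0, 0.0], [2.0, 0.0, 0.0])
```

Reflecting the counterpart again returns the original point, which gives a simple consistency check for predicted correspondences.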

[182]  arXiv:2008.00490 [pdf, other]
Title: Tensor Low-Rank Reconstruction for Semantic Segmentation
Comments: ECCV2020. Top-1 performance on PASCAL-VOC12; Source code at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Context information plays an indispensable role in the success of semantic segmentation. Recently, non-local self-attention based methods have proved to be effective for context information collection. Since the desired context consists of spatial-wise and channel-wise attentions, 3D representation is an appropriate formulation. However, these non-local methods describe 3D context information based on a 2D similarity matrix, where space compression may lead to channel-wise attention missing. An alternative is to model the contextual information directly without compression. However, this effort confronts a fundamental difficulty, namely the high-rank property of context information. In this paper, we propose a new approach to model the 3D context representations, which not only avoids the space compression but also tackles the high-rank difficulty. Here, inspired by tensor canonical-polyadic decomposition theory (i.e., a high-rank tensor can be expressed as a combination of rank-1 tensors), we design a low-rank-to-high-rank context reconstruction framework (i.e., RecoNet). Specifically, we first introduce the tensor generation module (TGM), which generates a number of rank-1 tensors to capture fragments of context feature. Then we use these rank-1 tensors to recover the high-rank context features through our proposed tensor reconstruction module (TRM). Extensive experiments show that our method achieves state-of-the-art results on various public datasets. Additionally, our proposed method has over 100 times lower computational cost than conventional non-local-based methods.
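
The canonical-polyadic idea referred to above, building a 3D tensor as a weighted sum of rank-1 tensors, can be sketched directly. This is only the bare decomposition formula under assumptions: each rank-1 term is the outer product of a channel, a height and a width vector, and the mixing weights are plain numbers. How RecoNet's TGM and TRM produce and combine these factors is not reproduced here, and the function names are hypothetical.

```python
def rank1(c, h, w):
    """Outer product of a channel, a height and a width vector (a rank-1 tensor)."""
    return [[[ci * hj * wk for wk in w] for hj in h] for ci in c]

def reconstruct(triples, weights):
    """Weighted sum of rank-1 tensors, one per (c, h, w) triple."""
    n_c, n_h, n_w = (len(v) for v in triples[0])
    out = [[[0.0] * n_w for _ in range(n_h)] for _ in range(n_c)]
    for (c, h, w), lam in zip(triples, weights):
        term = rank1(c, h, w)
        for i in range(n_c):
            for j in range(n_h):
                for k in range(n_w):
                    out[i][j][k] += lam * term[i][j][k]
    return out

triples = [
    ([1.0, 2.0], [1.0, 0.0], [3.0, 4.0]),
    ([0.0, 1.0], [0.0, 1.0], [1.0, 1.0]),
]
context = reconstruct(triples, weights=[1.0, 0.5])
```

With r terms, the factors hold r * (C + H + W) numbers instead of the C * H * W entries of the full tensor, which is where the savings of a low-rank formulation come from.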

[183]  arXiv:2008.00496 [pdf, ps, other]
Title: Minimum $2$-vertex strongly biconnected spanning directed subgraph problem
Authors: Raed Jaberi
Subjects: Data Structures and Algorithms (cs.DS)

A directed graph $G=(V,E)$ is strongly biconnected if $G$ is strongly connected and its underlying graph is biconnected. A strongly biconnected directed graph $G=(V,E)$ is called $2$-vertex-strongly biconnected if $|V|\geq 3$ and the induced subgraph on $V\setminus\left\lbrace w\right\rbrace $ is strongly biconnected for every vertex $w\in V$. In this paper we study the following problem.
Given a $2$-vertex-strongly biconnected directed graph $G=(V,E)$, compute an edge subset $E^{2sb} \subseteq E$ of minimum size such that the subgraph $(V,E^{2sb})$ is $2$-vertex-strongly biconnected.

[184]  arXiv:2008.00497 [pdf, ps, other]
Title: Conforming Discrete Gradgrad-Complexes in Three Dimensions
Authors: Jun Hu, Yizhou Liang
Subjects: Numerical Analysis (math.NA)

In this paper, the first family of conforming discrete three-dimensional Gradgrad-complexes consisting of finite element spaces is constructed. These discrete complexes are exact in the sense that the range of each discrete map is the kernel space of the succeeding one. These spaces can be used in the mixed form of the linearized Einstein-Bianchi system.

[185]  arXiv:2008.00498 [pdf]
Title: HyperFaceNet: A Hyperspectral Face Recognition Method Based on Deep Fusion
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

Face recognition has already been well studied under visible light and infrared, in both intra-spectral and cross-spectral cases. However, how to fuse different light bands, i.e., hyperspectral face recognition, is still an open research problem, which has the advantages of richer information retaining and all-weather functionality over single band face recognition. Among the very few works for hyperspectral face recognition, traditional non-deep learning techniques are largely used. Thus, in this paper we bring deep learning into the topic of hyperspectral face recognition, and propose a new fusion model (termed HyperFaceNet) especially for hyperspectral faces. The proposed fusion model is characterized by residual dense learning, a feedback style encoder and a recognition-oriented loss function. In the experiments, our method is shown to achieve higher recognition rates than face recognition using either visible light or infrared alone. Moreover, our fusion model is shown to be superior to other general-purpose image fusion methods, including state-of-the-art ones, in terms of both image quality and recognition performance.

[186]  arXiv:2008.00500 [pdf, other]
Title: Dynamic Discrete Choice Estimation with Partially Observable States and Hidden Dynamics
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)

Dynamic discrete choice models are used to estimate the intertemporal preferences of an agent as described by a reward function based upon observable histories of states and implemented actions. However, in many applications, such as reliability and healthcare, the system state is partially observable or hidden (e.g., the level of deterioration of an engine, the condition of a disease), and the decision maker only has access to information imperfectly correlated with the true value of the hidden state. In this paper, we consider the estimation of a dynamic discrete choice model with state variables and system dynamics that are hidden (or partially observed) to both the agent and the modeler, thus generalizing Rust's model to partially observable cases. We analyze the structural properties of the model and prove that this model is still identifiable if the cardinality of the state space, the discount factor, the distribution of random shocks, and the rewards for a given (reference) action are given. We analyze both theoretically and numerically the potential mis-specification errors that may be incurred when Rust's model is improperly used in partially observable settings. We further apply the developed model to a subset of Rust's dataset for bus engine mileage and replacement decisions. The results show that our model can improve model fit as measured by the $\log$-likelihood function by $17.7\%$ and the $\log$-likelihood ratio test shows that our model statistically outperforms Rust's model. Interestingly, our hidden state model also reveals an economically meaningful route assignment behavior in the dataset which was hitherto ignored, i.e. routes with lower mileage are assigned to buses believed to be in worse condition.

[187]  arXiv:2008.00504 [pdf, other]
Title: Variational Filtering with Copula Models for SLAM
Comments: Published at the 2020 International Conference on Intelligent Robots and Systems (IROS)
Subjects: Robotics (cs.RO); Machine Learning (stat.ML)

The ability to infer map variables and estimate pose is crucial to the operation of autonomous mobile robots. In most cases the shared dependency between these variables is modeled through a multivariate Gaussian distribution, but there are many situations where that assumption is unrealistic. Our paper shows how it is possible to relax this assumption and perform simultaneous localization and mapping (SLAM) with a larger class of distributions, whose multivariate dependency is represented with a copula model. We integrate the distribution model with copulas into a Sequential Monte Carlo estimator and show how unknown model parameters can be learned through gradient-based optimization. We demonstrate our approach is effective in settings where Gaussian assumptions are clearly violated, such as environments with uncertain data association and nonlinear transition models.

[188]  arXiv:2008.00505 [pdf, other]
Title: Modelling, Controllability and Gait Design for a Spherical Flexible Swimmer
Subjects: Systems and Control (eess.SY); Fluid Dynamics (physics.flu-dyn)

This paper discusses modelling, controllability and gait design for a spherical flexible swimmer. We first present a kinematic model of a low Reynolds number spherical flexible swimming mechanism with periodic surface deformations in the radial and azimuthal directions. The model is then converted to a finite dimensional driftless, affine-in-control principal kinematic form by representing the surface deformations as a linear combination of finitely many Legendre polynomials. A controllability analysis is then carried out for this swimmer to conclude that the swimmer is locally controllable on $\mathbb{R}^3$ for certain combinations of the Legendre polynomials. The rates of the coefficients of the polynomials are considered as the control inputs for surface deformation. Finally, the Abelian nature of the structure group of the swimmer's configuration space is exploited to synthesize a curvature based gait for the spherical flexible swimmer and a rigid-link swimmer.

[189]  arXiv:2008.00506 [pdf, other]
Title: Differentiable Feature Aggregation Search for Knowledge Distillation
Comments: A feature distillation method via differentiable architecture search
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)

Knowledge distillation has become increasingly important in model compression. It boosts the performance of a miniaturized student network with the supervision of the output distribution and feature maps from a sophisticated teacher network. Some recent works introduce multi-teacher distillation to provide more supervision to the student network. However, the effectiveness of multi-teacher distillation methods comes at the cost of substantial computation resources. To address both the efficiency and the effectiveness of knowledge distillation, we introduce feature aggregation to imitate multi-teacher distillation in the single-teacher distillation framework by extracting informative supervision from multiple teacher feature maps. Specifically, we introduce DFA, a two-stage Differentiable Feature Aggregation search method motivated by DARTS in neural architecture search, to efficiently find the aggregations. In the first stage, DFA formulates the search problem as a bi-level optimization and leverages a novel bridge loss, which consists of a student-to-teacher path and a teacher-to-student path, to find appropriate feature aggregations. The two paths act as two players against each other, trying to optimize the unified architecture parameters in opposite directions while guaranteeing both expressivity and learnability of the feature aggregation simultaneously. In the second stage, DFA performs knowledge distillation with the derived feature aggregation. Experimental results show that DFA outperforms existing methods on CIFAR-100 and CINIC-10 datasets under various teacher-student settings, verifying the effectiveness and robustness of the design.

[190]  arXiv:2008.00508 [pdf, other]
Title: Unacceptable, where is my privacy? Exploring Accidental Triggers of Smart Speakers
Subjects: Cryptography and Security (cs.CR)

Voice assistants like Amazon's Alexa, Google's Assistant, or Apple's Siri, have become the primary (voice) interface in smart speakers that can be found in millions of households. For privacy reasons, these speakers analyze every sound in their environment for their respective wake word like ''Alexa'' or ''Hey Siri,'' before uploading the audio stream to the cloud for further processing. Previous work reported on the inaccurate wake word detection, which can be tricked using similar words or sounds like ''cocaine noodles'' instead of ''OK Google.''
In this paper, we perform a comprehensive analysis of such accidental triggers, i.e., sounds that should not have triggered the voice assistant, but did. More specifically, we automate the process of finding accidental triggers and measure their prevalence across 11 smart speakers from 8 different manufacturers using everyday media such as TV shows, news, and other kinds of audio datasets. To systematically detect accidental triggers, we describe a method to artificially craft such triggers using a pronouncing dictionary and a weighted, phone-based Levenshtein distance. In total, we have found hundreds of accidental triggers. Moreover, we explore potential gender and language biases and analyze the reproducibility. Finally, we discuss the resulting privacy implications of accidental triggers and explore countermeasures to reduce and limit their impact on users' privacy. To foster additional research on these sounds that mislead machine learning models, we publish a dataset of more than 1000 verified triggers as a research artifact.
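
As a rough illustration of a weighted, phone-based Levenshtein distance, the following minimal sketch compares two phone sequences with a substitution cost that depends on the phones involved. The vowel set, the cost values and the example phone sequences are assumptions made for this illustration and are not taken from the paper.

```python
from typing import Callable, Sequence

# Hypothetical broad phone class, used only for this illustration.
VOWELS = {"AA", "AE", "AH", "EH", "IH", "IY", "OW", "UW"}


def sub_cost(a: str, b: str) -> float:
    """Illustrative substitution cost: cheaper within the same broad class."""
    if a == b:
        return 0.0
    same_class = (a in VOWELS) == (b in VOWELS)
    return 0.5 if same_class else 1.0


def weighted_levenshtein(
    src: Sequence[str],
    dst: Sequence[str],
    sub: Callable[[str, str], float] = sub_cost,
    indel: float = 1.0,
) -> float:
    """Edit distance over phone sequences with weighted substitutions."""
    rows, cols = len(src) + 1, len(dst) + 1
    dist = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        dist[i][0] = i * indel  # delete every phone of src
    for j in range(1, cols):
        dist[0][j] = j * indel  # insert every phone of dst
    for i in range(1, rows):
        for j in range(1, cols):
            dist[i][j] = min(
                dist[i - 1][j] + indel,                           # deletion
                dist[i][j - 1] + indel,                           # insertion
                dist[i - 1][j - 1] + sub(src[i - 1], dst[j - 1]),  # substitution
            )
    return dist[-1][-1]
```

Under these assumed costs, candidate phrases whose phone sequence lies within a small distance of a wake word's phone sequence would be the ones worth testing against a device.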

[191]  arXiv:2008.00511 [pdf, other]
Title: Curriculum Learning with a Progression Function
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)

Curriculum Learning for Reinforcement Learning is an increasingly popular technique that involves training an agent on a defined sequence of intermediate tasks, called a Curriculum, to increase the agent's performance and learning speed. This paper introduces a novel paradigm for automatic curriculum generation based on a progression of task complexity. Different progression functions are introduced, including an autonomous online task progression based on the performance of the agent. The progression function also determines how long the agent should train on each intermediate task, which is an open problem in other task-based curriculum approaches. The benefits and wide applicability of our approach are shown by empirically comparing its performance to two state-of-the-art Curriculum Learning algorithms on a grid world and on a complex simulated navigation domain.

[192]  arXiv:2008.00512 [pdf]
Title: Integrated monitoring of ice in selected Swiss lakes. Final project report
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Various lake observables, including lake ice, are related to climate and climate change and provide a good opportunity for long-term monitoring. Lakes (and as part of them lake ice) are therefore considered an Essential Climate Variable (ECV) of the Global Climate Observing System (GCOS). Following the need for an integrated multi-temporal monitoring of lake ice in Switzerland, MeteoSwiss in the framework of GCOS Switzerland supported this 2-year project to explore not only the use of satellite images but also the possibilities of Webcams and in-situ measurements. The aim of this project is to monitor some target lakes and detect the extent of ice and especially the ice-on/off dates, with focus on the integration of various input data and processing methods. The target lakes are: St. Moritz, Silvaplana, Sils, Sihl, Greifen and Aegeri, whereby only the first four were mainly frozen during the observation period and thus processed. The observation period was mainly the winter 2016-17. During the project, various approaches were developed, implemented, tested and compared. Firstly, low spatial resolution (250 - 1000 m) but high temporal resolution (1 day) satellite images from the optical sensors MODIS and VIIRS were used. Secondly, and as a pilot project, the use of existing public Webcams was investigated for (a) validation of results from satellite data, and (b) independent estimation of lake ice, especially for small lakes like St. Moritz, that could not possibly be monitored in the satellite images. Thirdly, in-situ measurements were made in order to characterize the development of the temperature profiles and partly pressure before freezing and under the ice-cover until melting. This report presents the results of the project work.

[193]  arXiv:2008.00516 [pdf, other]
Title: Deep-Reinforcement-Learning-Based Semantic Navigation of Mobile Robots in Dynamic Environments
Comments: 6 pages, 5 figures, IEEE International Conference on Automation Science and Engineering (CASE) 2020, Hong Kong
Journal-ref: IEEE International Conference on Automation Science and Engineering (CASE) 2020, Hong Kong
Subjects: Robotics (cs.RO)

Mobile robots have gained increased importance within industrial tasks such as commissioning, delivery or operation in hazardous environments. The ability to navigate autonomously and safely, especially within dynamic environments, is paramount in industrial mobile robotics. Current navigation methods depend on preexisting static maps and are error-prone in dynamic environments. Furthermore, for safety reasons, they often rely on hand-crafted safety guidelines, which make the system less flexible and slower. Visual-based navigation and high-level semantics bear the potential to enhance the safety of path planning by creating links the agent can reason about for a more flexible navigation. On this account, we propose a reinforcement learning based local navigation system which learns navigation behavior based solely on visual observations to cope with highly dynamic environments. To this end, we develop a simple yet efficient simulator - ARENA2D - which is able to generate highly randomized training environments and provide semantic information to train our agent. We demonstrate enhanced results in terms of safety and robustness over a traditional baseline approach based on the dynamic window approach.

[194]  arXiv:2008.00517 [pdf, other]
Title: Interest Clustering Coefficient: a New Metric for Directed Networks like Twitter
Comments: 15 pages, 9 figures
Subjects: Social and Information Networks (cs.SI); Physics and Society (physics.soc-ph)

We study here the clustering of directed social graphs. The clustering coefficient has been introduced to capture the social phenomenon that a friend of a friend tends to be my friend. This metric has been widely studied and has been shown to be of great interest to describe the characteristics of a social graph. In fact, the clustering coefficient is adapted for a graph in which the links are undirected, such as friendship links (Facebook) or professional links (LinkedIn). For a graph in which links are directed from a source of information to a consumer of information, it is no longer adequate. We show that former studies have missed much of the information contained in the directed part of such graphs. We thus introduce a new metric to measure the clustering of a directed social graph with interest links, namely the interest clustering coefficient. We compute it (exactly and using sampling methods) on a very large social graph, a Twitter snapshot with 505 million users and 23 billion links. We additionally provide the values of the formerly introduced directed and undirected metrics, a first on such a large snapshot. We show that the interest clustering coefficient is larger than classic directed clustering coefficients introduced in the literature. This shows the relevance of the metric to capture the informational aspects of directed graphs.
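
For reference, the classic undirected clustering coefficient mentioned above can be computed as the fraction of connected triples that are closed into triangles (often called transitivity). The sketch below implements only this classic undirected metric; it does not implement the interest clustering coefficient, whose definition is not given in the abstract. The function name and the edge-list input format are choices made for this illustration.

```python
from itertools import combinations
from typing import Dict, Hashable, Iterable, Set, Tuple


def global_clustering(edges: Iterable[Tuple[Hashable, Hashable]]) -> float:
    """Classic undirected global clustering coefficient (transitivity).

    Returns (closed connected triples) / (all connected triples),
    or 0.0 when the graph has no connected triple.
    """
    adj: Dict[Hashable, Set[Hashable]] = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    closed = 0   # triples whose two end nodes are also linked
    triples = 0  # all paths of length two, counted by their centre node
    for nbrs in adj.values():
        k = len(nbrs)
        triples += k * (k - 1) // 2
        for a, b in combinations(nbrs, 2):
            if b in adj[a]:
                closed += 1
    return closed / triples if triples else 0.0
```

A triangle yields 1.0 and a simple path yields 0.0, which matches the "friend of a friend tends to be my friend" reading.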

[195]  arXiv:2008.00520 [pdf, other]
Title: Statistical Inference of Minimally Complex Models
Comments: 20 pages, 5 figures
Subjects: Artificial Intelligence (cs.AI); Statistics Theory (math.ST); Data Analysis, Statistics and Probability (physics.data-an); Quantitative Methods (q-bio.QM)

Finding the best model that describes a high dimensional dataset is a daunting task. For binary data, we show that this becomes feasible if the search is restricted to simple models. These models -- that we call Minimally Complex Models (MCMs) -- are simple because they are composed of independent components of minimal complexity, in terms of description length. Simple models are easy to infer and to sample from. In addition, model selection within the MCMs' class is invariant with respect to changes in the representation of the data. They portray the structure of dependencies among variables in a simple way. They provide robust predictions on dependencies and symmetries, as illustrated in several examples. MCMs may contain interactions between variables of any order. So, for example, our approach reveals whether a dataset is appropriately described by a pairwise interaction model.

[196]  arXiv:2008.00524 [pdf, other]
Title: Interactive Imitation Learning in State-Space
Comments: Submitted to the 4th Conference on Robot Learning (CoRL) 2020, 10 pages, 4 figures
Subjects: Robotics (cs.RO); Machine Learning (cs.LG)

Imitation Learning techniques enable programming the behavior of agents through demonstrations rather than manual engineering. However, they are limited by the quality of available demonstration data. Interactive Imitation Learning techniques can improve the efficacy of learning since they involve teachers providing feedback while the agent executes its task. In this work, we propose a novel Interactive Learning technique that uses human feedback in state-space to train and improve agent behavior (as opposed to alternative methods that use feedback in action-space). Our method, titled Teaching Imitative Policies in State-space (TIPS), enables providing guidance to the agent in terms of 'changing its state', which is often more intuitive for a human demonstrator. Through continuous improvement via corrective feedback, agents trained by non-expert demonstrators using TIPS outperformed the demonstrator and conventional Imitation Learning agents.

[197]  arXiv:2008.00525 [pdf, other]
Title: Trawling for Trolling: A Dataset
Subjects: Computers and Society (cs.CY); Social and Information Networks (cs.SI)

The ability to accurately detect and filter offensive content automatically is important to ensure a rich and diverse digital discourse. Trolling is a type of hurtful or offensive content that is prevalent in social media, but is underrepresented in datasets for offensive content detection. In this work, we present a dataset that models trolling as a subcategory of offensive content. The dataset was created by collecting samples from well-known datasets and reannotating them along precise definitions of different categories of offensive content. The dataset has 12,490 samples, split across 5 classes: Normal, Profanity, Trolling, Derogatory and Hate Speech. It encompasses content from Twitter, Reddit and Wikipedia Talk Pages. Models trained on our dataset show appreciable performance without any significant hyperparameter tuning and can potentially learn meaningful linguistic information effectively. We find that these models are sensitive to data ablation, which suggests that the dataset is largely devoid of spurious statistical artefacts that could otherwise distract and confuse classification models.

[198]  arXiv:2008.00528 [pdf, other]
Title: Towards Robust Visual Tracking for Unmanned Aerial Vehicle with Tri-Attentional Correlation Filters
Comments: IROS 2020 accepted, 8 pages, 6 figures, and 2 tables
Journal-ref: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2020), Las Vegas, USA
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)

Object tracking has been broadly applied in unmanned aerial vehicle (UAV) tasks in recent years. However, existing algorithms still face difficulties such as partial occlusion, cluttered background, and other challenging visual factors. Inspired by cutting-edge attention mechanisms, a novel object tracking framework is proposed to leverage multi-level visual attention. Three primary types of attention, i.e., contextual attention, dimensional attention, and spatiotemporal attention, are integrated into the training and detection stages of a correlation filter-based tracking pipeline. As a result, the proposed tracker is equipped with robust discriminative power against challenging factors while maintaining high operational efficiency in UAV scenarios. Quantitative and qualitative experiments on two well-known benchmarks with 173 challenging UAV video sequences demonstrate the effectiveness of the proposed framework. The proposed tracking algorithm favorably outperforms 12 state-of-the-art methods, yielding 4.8% relative gain in UAVDT and 8.2% relative gain in UAV123@10fps against the baseline tracker while operating at the speed of $\sim$ 28 frames per second.

[199]  arXiv:2008.00539 [pdf]
Title: An Investigation in Optimal Encoding of Protein Primary Sequence for Structure Prediction by Artificial Neural Networks
Subjects: Machine Learning (cs.LG); Quantitative Methods (q-bio.QM)

Machine learning and the use of neural networks have increased precipitously over the past few years, primarily due to the ever-increasing accessibility of data and the growth of computation power. It has become increasingly easy to harness the power of machine learning for predictive tasks. Protein structure prediction is one area where neural networks are becoming increasingly popular and successful. Although very powerful, the use of ANNs requires selection of the most appropriate input/output encoding, architecture, and class to produce optimal results. In this investigation we have explored and evaluated the effect of several conventional and newly proposed input encodings and selected an optimal architecture. We considered 11 variations of input encoding, 11 alternative window sizes, and 7 different architectures. In total, we evaluated 2,541 permutations in application to the training and testing of more than 10,000 protein structures over the course of 3 months. Our investigations concluded that one-hot encoding, the use of LSTMs, and window sizes of 9, 11, and 15 produce the optimal outcome. Through this optimization, we were able to improve the quality of protein structure prediction by predicting the $\phi$ dihedrals to within 14°-16° and the $\psi$ dihedrals to within 23°-25°. This is a notable improvement compared to previous similar investigations.
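
To make the combination of one-hot encoding and a sliding window concrete, here is a minimal sketch that turns a primary sequence into one feature row per residue. The 20-letter alphabet ordering, the zero-vector padding at the sequence ends and the function names are assumptions made for this illustration; the abstract does not specify how the authors handle these details.

```python
from typing import List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
PAD = "-"  # assumed padding symbol, encoded as an all-zero vector


def one_hot(residue: str) -> List[int]:
    """Return a 20-dimensional one-hot vector (all zeros for unknown/pad)."""
    vec = [0] * len(AMINO_ACIDS)
    if residue in INDEX:
        vec[INDEX[residue]] = 1
    return vec


def windowed_one_hot(sequence: str, window: int = 9) -> List[List[int]]:
    """Encode each residue together with its neighbours in an odd-sized window."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    padded = PAD * half + sequence + PAD * half
    features = []
    for center in range(len(sequence)):
        row: List[int] = []
        for residue in padded[center:center + window]:
            row.extend(one_hot(residue))
        features.append(row)  # length: window * 20
    return features
```

With a window of 9, each residue is described by 9 × 20 = 180 binary inputs, which could then be fed to a sequence model such as an LSTM.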

[200]  arXiv:2008.00540 [pdf, ps, other]
Title: Chaos of Learning Beyond Zero-sum and Coordination via Game Decompositions
Subjects: Computer Science and Game Theory (cs.GT)

Machine learning processes, e.g. ''learning in games'', can be viewed as non-linear dynamical systems. In general, such systems exhibit a wide spectrum of behaviors, ranging from stability/recurrence to the undesirable phenomena of chaos (or ''butterfly effect''). Chaos captures sensitivity to round-off errors and can severely affect predictability and reproducibility of ML systems, but the AI/ML community's understanding of it remains rudimentary. Much remains to be explored.
Recently, Cheung and Piliouras employed a volume-expansion argument to show that Lyapunov chaos occurs in the cumulative payoff space, when some popular learning algorithms, including Multiplicative Weights Update (MWU), Follow-the-Regularized-Leader (FTRL) and Optimistic MWU (OMWU), are used in several subspaces of games, e.g. zero-sum, coordination or graphical constant-sum games. It is natural to ask: can these results generalize to much broader families of games? We take a game decomposition approach and answer the question affirmatively.
Among other results, we propose a notion of ''matrix domination'' and design a linear program, and use them to characterize bimatrix games where MWU is Lyapunov chaotic almost everywhere. Such a family of games has positive Lebesgue measure in the bimatrix game space, indicating that chaos is a substantial issue of learning in games. For multi-player games, we present a local equivalence of volume change between general games and graphical games, which is used to perform volume and chaos analyses of MWU and OMWU in potential games.

[201]  arXiv:2008.00542 [pdf, other]
Title: Efficient Deep Learning of Non-local Features for Hyperspectral Image Classification
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)

Deep learning based methods, such as Convolutional Neural Networks (CNNs), have demonstrated their efficiency in hyperspectral image (HSI) classification. These methods can automatically learn spectral-spatial discriminative features within local patches. However, each pixel in an HSI is not only related to its nearby pixels but also has connections to pixels far away from itself. Therefore, to incorporate the long-range contextual information, a deep fully convolutional network (FCN) with an efficient non-local module, named ENL-FCN, is proposed for HSI classification. In the proposed framework, a deep FCN considers an entire HSI as input and extracts spectral-spatial information in a local receptive field. The efficient non-local module is embedded in the network as a learning unit to capture the long-range contextual information. Different from the traditional non-local neural networks, the long-range contextual information is extracted in a specially designed criss-cross path for computation efficiency. Furthermore, by using a recurrent operation, each pixel's response is aggregated from all pixels of the HSI. The benefits of our proposed ENL-FCN are threefold: 1) the long-range contextual information is incorporated effectively, 2) the efficient module can be freely embedded in a deep neural network in a plug-and-play fashion, and 3) it has far fewer learning parameters and requires fewer computational resources. The experiments conducted on three popular HSI datasets demonstrate that the proposed method achieves state-of-the-art classification performance with lower computational cost in comparison with several leading deep neural networks for HSI.

[202]  arXiv:2008.00544 [pdf, other]
Title: Video Question Answering on Screencast Tutorials
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

This paper presents a new video question answering task on screencast tutorials. We introduce a dataset including question, answer and context triples from the tutorial videos for a software product. Unlike other video question answering works, all the answers in our dataset are grounded to the domain knowledge base. A one-shot recognition algorithm is designed to extract the visual cues, which helps enhance the performance of video question answering. We also propose several baseline neural network architectures based on various aspects of video contexts from the dataset. The experimental results demonstrate that our proposed models significantly improve the question answering performance by incorporating multi-modal contexts and domain knowledge.

[203]  arXiv:2008.00546 [pdf, other]
Title: A Foliated View of Transfer Learning
Comments: 14 pages, 6 figures
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)

Transfer learning considers a learning process where a new task is solved by transferring relevant knowledge from known solutions to related tasks. While this has been studied experimentally, a foundational description of the transfer learning problem that exposes what related tasks are, and how they can be exploited, is still lacking. In this work, we present a definition for relatedness between tasks and identify foliations as a mathematical framework to represent such relationships.

[204]  arXiv:2008.00549 [pdf]
Title: IoT System for Real-Time Near-Crash Detection for Automated Vehicle Testing
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)

Our world is moving towards the goal of fully autonomous driving at a fast pace. While the latest automated vehicles (AVs) can handle most real-world scenarios they encounter, a major bottleneck for turning fully autonomous driving into reality is the lack of sufficient corner case data for training and testing AVs. Near-crash data, as widely used surrogate data for traffic safety research, can also serve the purpose of AV testing if properly collected. To this end, this paper proposes an Internet-of-Things (IoT) system for real-time near-crash data collection. The system has several notable features. First, it is a low-cost and standalone system that is backward-compatible with any existing vehicles. People can fix the system to their dashboards for near-crash data collection and collision warning without the approval or help of vehicle manufacturers. Second, we propose a new near-crash detection method that models the target's size changes and relative motions with the bounding boxes generated by deep-learning-based object detection and tracking. This near-crash detection method is fast, accurate, and reliable; in particular, it is insensitive to camera parameters and therefore transfers well to different dashboard cameras. We have conducted comprehensive experiments with 100 videos locally processed at Jetson, as well as real-world tests on cars and buses. Besides collecting corner cases, it can also serve as a white-box platform for testing innovative algorithms and evaluating other AV products. The system contributes to the real-world testing of AVs and has great potential to be brought into large-scale deployment.

[205]  arXiv:2008.00550 [pdf, other]
Title: Improving accuracy in the Leray model for incompressible non-isothermal flows via adaptive deconvolution-based nonlinear filtering
Comments: 21 pages, 4 figures, 2 tables
Subjects: Numerical Analysis (math.NA)

This paper considers a Leray regularization model of incompressible, non-isothermal fluid flows which uses nonlinear filtering based on indicator functions, and introduces an efficient numerical method for solving it. The proposed method uses a multi-step, second-order temporal discretization with a finite element (FE) spatial discretization in such a way that the resulting algorithm is linear at each time level, and decouples the evolution equations from the velocity filter step. Since the indicator function chosen in this model is mathematically based on approximation theory, the proposed numerical algorithm can be analyzed robustly, i.e., the stability and convergence of the method are provable. A series of numerical tests are carried out to verify the theoretical convergence rates, and to compare the algorithm with direct numerical simulation and the usual Leray-$\alpha$ model of the flow problem.

[206]  arXiv:2008.00553 [pdf, ps, other]
Title: A Unifying Framework for Parallel and Distributed Processing in R using Futures
Authors: Henrik Bengtsson
Comments: 16 pages, 0 figures, to be submitted
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Computation (stat.CO)

A future is a programming construct designed for concurrent and asynchronous evaluation of code, making it particularly useful for parallel processing. The future package implements the Future API for programming with futures in R. This minimal API provides sufficient constructs for implementing parallel versions of well-established, high-level map-reduce APIs. The future ecosystem supports exception handling, output and condition relaying, parallel random number generation, and automatic identification of globals, lowering the threshold to parallelize code. The Future API bridges parallel frontends with parallel backends following the philosophy that end-users are the ones who choose the parallel backend while the developer focuses on what to parallelize. A variety of backends exist and third-party contributions meeting the specifications, which ensure that the same code works on all backends, are automatically supported. The future framework solves several problems not addressed by other parallel frameworks in R.
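
The general idea of a future, and of building a parallel map on top of it, can be sketched with Python's standard `concurrent.futures` module. This is an analogy only: it is not the R future package or its Future API, and the helper names `parallel_map` and `slow_square` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def slow_square(x: int) -> int:
    """Stand-in for an expensive computation."""
    return x * x


def parallel_map(fn: Callable[[T], R], items: Iterable[T], max_workers: int = 4) -> List[R]:
    """Map fn over items by creating one future per item.

    The choice of executor (the "backend") is isolated here, so the
    calling code only states what to parallelize.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # submit() returns immediately with a future for each item.
        futures = [pool.submit(fn, item) for item in items]
        # result() blocks until the value is ready and re-raises any
        # exception that occurred inside the worker.
        return [f.result() for f in futures]
```

Calling `parallel_map(slow_square, [1, 2, 3])` collects the results in input order, and an exception raised inside a worker surfaces in the caller when the corresponding `result()` is requested.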

[207]  arXiv:2008.00558 [pdf, ps, other]
Title: Semi-supervised deep learning based on label propagation in a 2D embedded space
Comments: 7 pages, 5 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

While convolutional neural networks need large labeled sets for training images, expert human supervision of such datasets can be very laborious. Proposed solutions propagate labels from a small set of supervised images to a large set of unsupervised ones to obtain sufficient truly-and-artificially labeled samples to train a deep neural network model. Yet, such solutions need many supervised images for validation. We present a loop in which a deep neural network (VGG-16) is trained from a set with more correctly labeled samples along iterations, created by using t-SNE to project the features of its last max-pooling layer into a 2D embedded space in which labels are propagated using the Optimum-Path Forest semi-supervised classifier. As the labeled set improves along iterations, it improves the features of the neural network. We show that this can significantly improve classification results on test data (using only 1\% to 5\% of supervised samples) of three private challenging datasets and two public ones.

[208]  arXiv:2008.00563 [pdf, other]
Title: SemEval-2020 Task 5: Counterfactual Recognition
Comments: Task description paper of SemEval-2020 Task 5: Modelling Causal Reasoning in Language: Detecting Counterfactuals
Subjects: Computation and Language (cs.CL)

We present a counterfactual recognition (CR) task, the shared Task 5 of SemEval-2020. Counterfactuals describe potential outcomes (consequents) produced by actions or circumstances that did not happen or cannot happen and are counter to the facts (antecedent). Counterfactual thinking is an important characteristic of the human cognitive system; it connects antecedents and consequents with causal relations. Our task provides a benchmark for counterfactual recognition in natural language with two subtasks. Subtask-1 aims to determine whether a given sentence is a counterfactual statement or not. Subtask-2 requires the participating systems to extract the antecedent and consequent in a given counterfactual statement. During the SemEval-2020 official evaluation period, we received 27 submissions to Subtask-1 and 11 to Subtask-2. The data, baseline code, and leaderboard can be found at https://competitions.codalab.org/competitions/21691. The data and baseline code are also available at https://zenodo.org/record/3932442.

[209]  arXiv:2008.00571 [pdf, other]
Title: Exponential convergence for multipole and local expansions and their translations for sources in layered media: three-dimensional Laplace equation
Subjects: Numerical Analysis (math.NA)

In this paper, we prove the exponential convergence of the multipole and local expansions, shifting and translation operators used in fast multipole methods (FMMs) for 3-dimensional Laplace equations in layered media. These theoretical results ensure the exponential convergence of the FMM which has been shown by the numerical results recently reported in [9]. As the free space components are calculated by the classic FMM, this paper focuses on the analysis of the reaction components of the Green's function for the Laplace equation in layered media. We first prove that the density functions in the integral representations of the reaction components are analytic and bounded in the right half complex plane. Then, using the Cagniard-de Hoop transform and contour deformations, an estimate for the remainder terms of the truncated expansions is given, and, as a result, the exponential convergence for the expansions and translation operators is proven.

[210]  arXiv:2008.00579 [pdf, other]
Title: Modeling of Personalized Anatomy using Plastic Strains
Comments: 18 pages, 24 figures. Rejected from ACM SIGGRAPH 2020 and ACM SIGGRAPH Asia 2020. Resubmission is under preparation
Subjects: Graphics (cs.GR)

We give a method for modeling solid objects undergoing large spatially varying and/or anisotropic strains, and use it to reconstruct human anatomy from medical images. Our novel shape deformation method uses plastic strains and the Finite Element Method to successfully model shapes undergoing large and/or anisotropic strains, specified by sparse point constraints on the boundary of the object. We extensively compare our method to standard second-order shape deformation methods, variational methods and surface-based methods and demonstrate that our method avoids the spikiness, wiggliness and other artefacts of previous methods. We demonstrate how to perform such shape deformation both for attached and un-attached ("free flying") objects, using a novel method to solve linear systems with singular matrices with a known nullspace. While our method is applicable to general large-strain shape deformation modeling, we use it to create personalized 3D triangle and volumetric meshes of human organs, based on MRI or CT scans. Given a medically accurate anatomy template of a generic individual, we optimize the geometry of the organ to match the MRI or CT scan of a specific individual. Our examples include human hand muscles, a liver, a hip bone, and a gluteus medius muscle ("hip abductor").

[211]  arXiv:2008.00581 [pdf, other]
Title: A Combinatorial Design for Cascaded Coded Distributed Computing on General Networks
Comments: 30 pages, 6 figures
Subjects: Information Theory (cs.IT); Computational Complexity (cs.CC); Distributed, Parallel, and Cluster Computing (cs.DC)

Coding-theoretic approaches have been developed to significantly reduce the communication load in modern distributed computing systems. In particular, coded distributed computing (CDC), introduced by Li et al., can efficiently trade computation resources to reduce the communication load in MapReduce-like computing systems. For the more general cascaded CDC, Map computations are repeated at r nodes to significantly reduce the communication load among nodes tasked with computing Q Reduce functions s times. In this paper, we propose a novel low-complexity combinatorial design for cascaded CDC which 1) determines both input file and output function assignments, 2) requires significantly fewer input files and output functions, and 3) operates on heterogeneous networks where nodes have varying storage and computing capabilities. We provide an analytical characterization of the computation-communication tradeoff, from which we show that the proposed scheme can outperform the state-of-the-art scheme proposed by Li et al. for homogeneous networks. Further, when the network is heterogeneous, we show that the performance of the proposed scheme can be better than that of its homogeneous counterpart. In addition, the proposed scheme is optimal within a constant factor of the information-theoretic converse bound when the input file and output function assignments are fixed.

[212]  arXiv:2008.00582 [pdf, other]
Title: audioLIME: Listenable Explanations Using Source Separation
Comments: Submitted to The 13th International Workshop on Machine Learning and Music
Subjects: Sound (cs.SD); Information Retrieval (cs.IR); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)

Deep neural networks (DNNs) are successfully applied in a wide variety of music information retrieval (MIR) tasks but their predictions are usually not interpretable. We propose audioLIME, a method based on Local Interpretable Model-agnostic Explanations (LIME) extended by a musical definition of locality. The perturbations used in LIME are created by switching on/off components extracted by source separation which makes our explanations listenable. We validate audioLIME on two different music tagging systems and show that it produces sensible explanations in situations where a competing method cannot.
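The idea of perturbing by switching separated sources on and off can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `explain_with_components`, the proximity kernel, and the toy scoring function are all assumptions made for illustration, and the source separation step itself is assumed to have been done already.

```python
import numpy as np

def explain_with_components(components, predict_fn, n_samples=200, seed=0):
    """LIME-style importances for separated sources (illustrative only).

    components: array of shape (k, n); their sum approximates the mix.
    predict_fn: hypothetical model mapping a waveform (n,) to a scalar score.
    """
    rng = np.random.default_rng(seed)
    k = components.shape[0]
    # Binary masks: 1 keeps a component, 0 silences it.
    masks = rng.integers(0, 2, size=(n_samples, k))
    masks[0] = 1  # always include the unperturbed mix
    # Each perturbed waveform is the sum of the active components,
    # so every perturbation can be listened to.
    scores = np.array([predict_fn(m @ components) for m in masks])
    # Assumed proximity kernel: samples closer to the full mix weigh more.
    distances = 1.0 - masks.mean(axis=1)
    weights = np.exp(-(distances ** 2) / 0.25)
    # Weighted least squares fit of a linear surrogate with intercept.
    X = np.hstack([masks, np.ones((n_samples, 1))])
    sw = np.sqrt(weights)
    coef, *_ = np.linalg.lstsq(X * sw[:, None], scores * sw, rcond=None)
    return coef[:k]  # one importance value per component

# Toy usage: three orthogonal "sources" and a score that reacts to the first.
t = np.arange(1000) / 1000.0
sources = np.stack([np.sin(2 * np.pi * f * t) for f in (5, 9, 13)])
toy_score = lambda w: float(w @ sources[0] / (sources[0] @ sources[0]))
print(explain_with_components(sources, toy_score))
```

Because the interpretable features are whole sources rather than spectrogram patches, the component with the largest coefficient can be played back directly as the explanation.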

[213]  arXiv:2008.00583 [pdf, ps, other]
Title: Decision problems for linear recurrences involving arbitrary real numbers
Authors: Eike Neumann
Subjects: Logic in Computer Science (cs.LO)

We study the decidability of the Skolem Problem, the Positivity Problem, and the Ultimate Positivity Problem for linear recurrences with real number initial values and real number coefficients in the bit-model of real computation. We show that for each problem there exists a correct partial algorithm which halts for all problem instances for which the answer is locally constant, thus establishing that all three problems are as close to decidable as one can expect them to be in this setting. We further show that the algorithms for the Positivity Problem and the Ultimate Positivity Problem halt on almost every instance with respect to the usual Lebesgue measure on Euclidean space. In comparison, the analogous problems for exact rational or real algebraic coefficients are known to be decidable only for linear recurrences of fairly low order.

[214]  arXiv:2008.00584 [pdf, ps, other]
Title: On the optimal rates of convergence of Gegenbauer projections
Authors: Haiyong Wang
Comments: 30 pages; 8 figures
Subjects: Numerical Analysis (math.NA); Classical Analysis and ODEs (math.CA)

In this paper we present a comprehensive convergence rate analysis of Gegenbauer projections. We show that, for analytic functions, the convergence rate of the Gegenbauer projection of degree $n$ is the same as that of the best approximation of the same degree when $\lambda\leq0$ and the former is slower than the latter by a factor of $n^{\lambda}$ when $\lambda>0$, where $\lambda$ is the parameter in Gegenbauer polynomials. For piecewise analytic functions, we demonstrate that the convergence rate of the Gegenbauer projection of degree $n$ is the same as that of the best approximation of the same degree when $\lambda\leq1$ and the former is slower than the latter by a factor of $n^{\lambda-1}$ when $\lambda>1$. The extension to functions of fractional smoothness is also discussed. Our theoretical findings are illustrated by numerical experiments.

[215]  arXiv:2008.00589 [pdf, other]
Title: Finding Closed Quasigeodesics on Convex Polyhedra
Comments: 18 pages, 11 figures. Revised version of paper from SoCG 2020
Subjects: Computational Geometry (cs.CG); Metric Geometry (math.MG)

A closed quasigeodesic is a closed loop on the surface of a polyhedron with at most $180^\circ$ of surface on both sides at all points; such loops can be locally unfolded straight. In 1949, Pogorelov proved that every convex polyhedron has at least three (non-self-intersecting) closed quasigeodesics, but the proof relies on a nonconstructive topological argument. We present the first finite algorithm to find a closed quasigeodesic on a given convex polyhedron, which is the first positive progress on a 1990 open problem by O'Rourke and Wyman. The algorithm's running time is pseudopolynomial, namely $O\left({n^2 \over \varepsilon^2} {L \over \ell} b\right)$ time, where $\varepsilon$ is the minimum curvature of a vertex, $L$ is the length of the longest edge, $\ell$ is the smallest distance within a face between a vertex and a nonincident edge (minimum feature size of any face), and $b$ is the maximum number of bits of an integer in a constant-size radical expression of a real number representing the polyhedron. We take special care with the model of computation, introducing the $O(1)$-expression RAM and showing that it can be implemented in the standard word RAM.

[216]  arXiv:2008.00595 [pdf, other]
Title: Generating Minimum-Snap Quadrotor Trajectories Really Fast
Comments: 6 pages, 4 figures, to be published as part of 2020 IEEE International Conference on Intelligent Robots and Systems (IROS)
Subjects: Systems and Control (eess.SY)

We propose an algorithm for generating minimum-snap trajectories for quadrotors with linear computational complexity with respect to the number of segments in the spline trajectory. Our algorithm is numerically stable for large numbers of segments and is able to generate trajectories of more than $500,000$ segments. The computational speed and numerical stability of our algorithm make it suitable for real-time generation of very large scale trajectories. We demonstrate the performance of our algorithm and compare it to existing methods, showing that it is both faster and able to calculate larger trajectories than the state of the art. We also show the feasibility of the trajectories experimentally with a long quadrotor flight.

[217]  arXiv:2008.00601 [pdf, ps, other]
Title: The Amazing Power of Randomness: NP=RP
Authors: András Faragó
Comments: 58 pages (not including title page)
Subjects: Computational Complexity (cs.CC); Combinatorics (math.CO); Probability (math.PR)

We (claim to) prove the extremely surprising fact that NP=RP. It is achieved by creating a Fully Polynomial-Time Randomized Approximation Scheme (FPRAS) for approximately counting the number of independent sets in bounded degree graphs, with any fixed degree bound, which is known to imply NP=RP. While our method is rooted in the well known Markov Chain Monte Carlo (MCMC) approach, we overcome the notorious problem of slow mixing by a new idea for generating a random sample from among the independent sets. A key tool that enables the result is a solution to a novel sampling task that we call Subset Sampling. In its basic form, a stationary sample is given from the (exponentially large) state space of a Markov chain, as input, and we want to transform it into another stationary sample that is conditioned on falling into a given subset, which is still exponentially large. In general, Subset Sampling can be both harder and easier than stationary sampling from a Markov chain. It can be harder, due to the conditioning on a subset, which may have more complex structure than the original state space. But it may also be easier, since a stationary sample is already given, which, in a sense, already encompasses "most of the hardness" of such sampling tasks, being already in the stationary distribution, which is hard to reach in a slowly mixing chain. We show that it is possible to efficiently balance the two sides: we can capitalize on already having a stationary sample from the original space, so that the complexity of confining it to a subset is mitigated. We prove that an efficient approximation is possible for the considered sampling task, and then it is applied recursively to create the FPRAS.

[218]  arXiv:2008.00603 [pdf, other]
Title: Learning Agile Locomotion via Adversarial Training
Comments: To appear at the International Conference on Intelligent Robots and Systems (IROS 2020) as a full paper
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI)

Developing controllers for agile locomotion is a long-standing challenge for legged robots. Reinforcement learning (RL) and Evolution Strategy (ES) hold the promise of automating the design process of such controllers. However, dedicated and careful human effort is required to design training environments to promote agility. In this paper, we present a multi-agent learning system, in which a quadruped robot (protagonist) learns to chase another robot (adversary) while the latter learns to escape. We find that this adversarial training process not only encourages agile behaviors but also effectively alleviates the laborious environment design effort. In contrast to prior works that used only one adversary, we find that training an ensemble of adversaries, each of which specializes in a different escaping strategy, is essential for the protagonist to master agility. Through extensive experiments, we show that the locomotion controller learned with adversarial training significantly outperforms carefully designed baselines.

[219]  arXiv:2008.00610 [pdf, other]
Title: Robust Collaborative Learning of Patch-level and Image-level Annotations for Diabetic Retinopathy Grading from Fundus Image
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Diabetic retinopathy (DR) grading from fundus images has attracted increasing interest in both academic and industrial communities. Most convolutional neural network (CNN) based algorithms treat DR grading as a classification task via image-level annotations. However, they have not fully explored the valuable information from the DR-related lesions. In this paper, we present a robust framework, which can collaboratively utilize both patch-level lesion and image-level grade annotations, for DR severity grading. By end-to-end optimizing the entire framework, the fine-grained lesion and image-level grade information can be bidirectionally exchanged to exploit more discriminative features for DR grading. Compared with recent state-of-the-art algorithms and three ophthalmologists with over nine years of clinical experience, the proposed algorithm shows favorable performance. Tested on datasets from totally different scenarios and distributions (such as label and camera), our algorithm proves robust to the image quality and distribution problems that commonly exist in real-world practice. Extensive ablation studies dissect the proposed framework and indicate the effectiveness and necessity of each motivation. The code and some valuable annotations are now publicly available.

[220]  arXiv:2008.00612 [pdf, other]
Title: How Different is Test Case Prioritization for Open and Closed Source Projects?
Comments: 13 pages, 2 figures, 17 tables, submitted to TSE
Subjects: Software Engineering (cs.SE)

Improved test case prioritization means that software developers can detect and fix more software faults sooner than usual. But is there one "best" prioritization algorithm? Or do different kinds of projects deserve special kinds of prioritization? To answer these questions, this paper applies nine prioritization schemes to 31 projects that range from (a) highly rated open-source GitHub projects to (b) computational science software to (c) a closed-source project. We find that prioritization approaches that work best for open-source projects can work worst for the closed-source project (and vice versa). From these experiments, we conclude that (a) it is ill-advised to always apply one prioritization scheme to all projects since (b) prioritization requires tuning to different project types.

[221]  arXiv:2008.00614 [pdf, other]
Title: Dynamics Generalization via Information Bottleneck in Deep Reinforcement Learning
Comments: 16 pages
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)

Despite the significant progress of deep reinforcement learning (RL) in solving sequential decision making problems, RL agents often overfit to training environments and struggle to adapt to new, unseen environments. This prevents robust applications of RL in real world situations, where system dynamics may deviate wildly from the training settings. In this work, our primary contribution is to propose an information theoretic regularization objective and an annealing-based optimization method to achieve better generalization ability in RL agents. We demonstrate the extreme generalization benefits of our approach in different domains ranging from maze navigation to robotic tasks; for the first time, we show that agents can generalize to test parameters more than 10 standard deviations away from the training parameter distribution. This work provides a principled way to improve generalization in RL by gradually removing information that is redundant for task-solving; it opens doors for the systematic study of generalization from training to extremely different testing settings, focusing on the established connections between information theory and machine learning.

[222]  arXiv:2008.00622 [pdf, other]
Title: Anchor-Assisted Intelligent Reflecting Surface Channel Estimation for Multiuser Communications
Comments: We propose a new anchor-assisted channel estimation scheme for IRS-aided multiuser communications. By exploring multi-antennas at the BS, the training overhead in estimating all users' cascaded channels is significantly reduced. Numerical results validate the effectiveness of the proposed scheme, especially when the number of antennas at the BS and/or that of the users is large
Subjects: Information Theory (cs.IT); Signal Processing (eess.SP)

Due to the passive nature of the Intelligent Reflecting Surface (IRS), channel estimation is a fundamental challenge in IRS-aided wireless networks. Particularly, as the number of IRS reflecting elements and/or that of IRS-served users increases, the channel training overhead becomes excessively high. To tackle this challenge, we propose in this paper a new anchor-assisted two-phase channel estimation scheme, where two anchor nodes, namely A1 and A2, are deployed near the IRS to help the base station (BS) acquire the cascaded BS-IRS-user channels. Specifically, in the first phase, the partial channel state information (CSI), i.e., the element-wise channel gain square, of the BS-IRS link is obtained by estimating the BS-IRS-A1/A2 channels and the A1-IRS-A2 channel, separately. Then, in the second phase, by leveraging such partial knowledge of the BS-IRS channel that is common to all users, the individual cascaded BS-IRS-user channels are efficiently estimated. Simulation results demonstrate that the proposed anchor-assisted channel estimation scheme achieves mean-squared error (MSE) performance comparable to that of the conventional scheme, but with significantly reduced channel training time.

[223]  arXiv:2008.00623 [pdf, other]
Title: DeLighT: Very Deep and Light-weight Transformer
Comments: 16 pages including references and appendix
Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL)

We introduce a very deep and light-weight transformer, DeLighT, that delivers similar or better performance than transformer-based models with significantly fewer parameters. DeLighT more efficiently allocates parameters both (1) within each Transformer block using DExTra, a deep and light-weight transformation, and (2) across blocks using block-wise scaling, which allows for shallower and narrower DeLighT blocks near the input and wider and deeper DeLighT blocks near the output. Overall, DeLighT networks are 2.5 to 4 times deeper than standard transformer models and yet have fewer parameters and operations. Experiments on machine translation and language modeling tasks show that DeLighT matches the performance of baseline Transformers with significantly fewer parameters. On the WMT'14 En-Fr high resource dataset, DeLighT requires 1.8 times fewer parameters and 2 times fewer operations and achieves better performance (+0.4 BLEU score) than baseline transformers. On the WMT'16 En-Ro low resource dataset, DeLighT delivers similar performance with 2.8 times fewer parameters than baseline transformers.

[224]  arXiv:2008.00627 [pdf, other]
Title: Learning to Purify Noisy Labels via Meta Soft Label Corrector
Comments: 12 pages, 6 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Recent deep neural networks (DNNs) can easily overfit to biased training data with noisy labels. A label correction strategy is commonly used to alleviate this issue by designing a method to identify suspected noisy labels and then correct them. Current approaches to correcting corrupted labels usually need certain pre-defined label correction rules or manually preset hyper-parameters. These fixed settings make them hard to apply in practice, since accurate label correction usually depends on the concrete problem, the training data and the temporal information hidden in the dynamic iterations of the training process. To address this issue, we propose a meta-learning model which can estimate soft labels through a meta-gradient descent step under the guidance of noise-free meta data. By viewing the label correction procedure as a meta-process and using a meta-learner to automatically correct labels, we can adaptively obtain rectified soft labels iteratively according to the current training problem without manually preset hyper-parameters. Besides, our method is model-agnostic and can be combined with any other existing model with ease. Comprehensive experiments substantiate the superiority of our method on both synthetic and real-world problems with noisy labels compared with current SOTA label correction strategies.

[225]  arXiv:2008.00634 [pdf, other]
Title: Deep Photo Cropper and Enhancer
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)

This paper introduces a new type of image enhancement problem. Compared to traditional image enhancement methods, which mostly deal with pixel-wise modifications of a given photo, our proposed task is to crop an image which is embedded within a photo and enhance the quality of the cropped image. We split our proposed approach into two deep networks: deep photo cropper and deep image enhancer. In the photo cropper network, we employ a spatial transformer to extract the embedded image. In the photo enhancer, we employ super-resolution to increase the number of pixels in the embedded image and reduce the effect of stretching and distortion of pixels. We use cosine distance loss between image features and ground truth for the cropper and the mean square loss for the enhancer. Furthermore, we propose a new dataset to train and test the proposed method. Finally, we analyze the proposed method with respect to qualitative and quantitative evaluations.

[226]  arXiv:2008.00635 [pdf, other]
Title: BenchBot: Evaluating Robotics Research in Photorealistic 3D Simulation and on Real Robots
Comments: Future submission to RAL; software available at this http URL
Subjects: Robotics (cs.RO)

We introduce BenchBot, a novel software suite for benchmarking the performance of robotics research across both photorealistic 3D simulations and real robot platforms. BenchBot provides a simple interface to the sensorimotor capabilities of a robot when solving robotics research problems; an interface that is consistent regardless of whether the target platform is simulated or a real robot. In this paper we outline the BenchBot system architecture, and explore the parallels between its user-centric design and an ideal research development process devoid of tangential robot engineering challenges. The paper describes the research benefits of using the BenchBot system, including: enhanced capacity to focus solely on research problems, direct quantitative feedback to inform research development, tools for deriving comprehensive performance characteristics, and submission formats which promote sharability and repeatability of research outcomes. BenchBot is publicly available (this http URL), and we encourage its use in the research community for comprehensively evaluating the simulated and real world performance of novel robotic algorithms.

[227]  arXiv:2008.00637 [pdf, other]
Title: Self-supervised Object Tracking with Cycle-consistent Siamese Networks
Comments: 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Self-supervised learning for visual object tracking possesses valuable advantages compared to supervised learning, such as the non-necessity of laborious human annotations and online training. In this work, we exploit an end-to-end Siamese network in a cycle-consistent self-supervised framework for object tracking. Self-supervision can be performed by taking advantage of the cycle consistency in the forward and backward tracking. To better leverage the end-to-end learning of deep networks, we propose to integrate a Siamese region proposal and mask regression network in our tracking framework so that a fast and more accurate tracker can be learned without the annotation of each frame. The experiments on the VOT dataset for visual object tracking and on the DAVIS dataset for video object segmentation propagation show that our method outperforms prior approaches on both tasks.

[228]  arXiv:2008.00638 [pdf, other]
Title: High Throughput Matrix-Matrix Multiplication between Asymmetric Bit-Width Operands
Subjects: Machine Learning (cs.LG); Hardware Architecture (cs.AR); Computer Vision and Pattern Recognition (cs.CV); Distributed, Parallel, and Cluster Computing (cs.DC); Machine Learning (stat.ML)

Matrix multiplications between asymmetric bit-width operands, especially between 8- and 4-bit operands, are likely to become a fundamental kernel of many important workloads including neural networks and machine learning. While existing SIMD matrix multiplication instructions for symmetric bit-width operands can support operands of mixed precision by zero- or sign-extending the narrow operand to match the size of the other operands, they cannot exploit the benefit of the narrow bit-width of one of the operands. We propose a new SIMD matrix multiplication instruction that uses mixed precision on its inputs (8- and 4-bit operands) and accumulates product values into narrower 16-bit output accumulators, in turn allowing the SIMD operation at 128-bit vector width to process a greater number of data elements per instruction to improve processing throughput and memory bandwidth utilization without increasing the register read- and write-port bandwidth in CPUs. The proposed asymmetric-operand-size SIMD instruction offers 2x improvement in throughput of matrix multiplication in comparison to throughput obtained using existing symmetric-operand-size instructions while causing negligible (0.05%) overflow from 16-bit accumulators for representative machine learning workloads. The asymmetric-operand-size instruction can not only improve matrix multiplication throughput in CPUs, but can also effectively support multiply-and-accumulate (MAC) operations between 8- and 4-bit operands in state-of-the-art DNN hardware accelerators (e.g., the systolic array microarchitecture in Google TPU) and offer similar improvement in matrix multiply performance seamlessly without violating the various implementation constraints. We demonstrate how a systolic array architecture designed for symmetric-operand-size instructions could be modified to support an asymmetric-operand-size instruction.
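To make the accumulator-width trade-off concrete, here is a small NumPy sketch of an 8-bit by 4-bit product accumulated into 16 bits. It only models the arithmetic, not the proposed instruction: the function name `matmul_int8_int4_acc16` is hypothetical, signed operands are assumed, and wraparound on overflow is an assumption (the abstract does not say whether the hardware wraps or saturates).

```python
import numpy as np

def matmul_int8_int4_acc16(a, b):
    """Multiply int8 by int4 values and keep the sums in 16-bit accumulators.

    a: (m, k) integers in [-128, 127]  (8-bit operand)
    b: (k, n) integers in [-8, 7]      (4-bit operand, stored in a wider dtype)
    Returns (result, overflow): the int16 result and a boolean mask marking
    outputs whose exact sum does not fit in int16.
    """
    a = np.asarray(a)
    b = np.asarray(b)
    if a.min() < -128 or a.max() > 127:
        raise ValueError("a must hold signed 8-bit values")
    if b.min() < -8 or b.max() > 7:
        raise ValueError("b must hold signed 4-bit values")
    # Exact reference sums in 64-bit arithmetic.
    exact = a.astype(np.int64) @ b.astype(np.int64)
    # Only the final sum is checked; intermediate overflow is not modelled.
    overflow = (exact < -32768) | (exact > 32767)
    # Assumed behaviour: wrap modulo 2**16, as an integer cast does.
    return exact.astype(np.int16), overflow

# A single product has magnitude at most 128 * 8 = 1024, so 31 products
# always fit in int16 (31 * 1024 = 31744), while 32 worst-case products
# reach 32768 and overflow by one.
rng = np.random.default_rng(0)
a = rng.integers(-128, 128, size=(4, 32))
b = rng.integers(-8, 8, size=(32, 4))
result, overflow = matmul_int8_int4_acc16(a, b)
print(result.dtype, float(overflow.mean()))
```

Running such a model over real layer activations and weights is one way to estimate an overflow rate for a given workload; the 0.05% figure above is the authors' measurement, not something this sketch reproduces.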

[229]  arXiv:2008.00639 [pdf, other]
Title: An Electrocommunication System Using FSK Modulation and Deep Learning Based Demodulation for Underwater Robots
Comments: IROS2020
Subjects: Robotics (cs.RO)

Underwater communication is extremely challenging for small underwater robots that have stringent power and size constraints. In our previous work, we demonstrated that electrocommunication is an alternative method for small underwater robot communication. This paper presents a new electrocommunication system which utilizes Binary Frequency Shift Keying (2FSK) modulation and deep-learning-based demodulation for underwater robots. We first derive an underwater electrocommunication model which covers both the near-field area and a large transition area outside of the near-field area. The 2FSK modulation is adopted to improve the anti-interference ability of the signal. A deep learning algorithm is used at the receiver to demodulate the signal. Simulations and experiments show that under the same testing conditions, the new communication system has a lower bit error rate and a higher data rate than the previous electrocommunication system. The communication system achieves stable communication within a distance of 10 m at a data transfer rate of 5 Kbps with a power consumption of less than 0.1 W. The large improvement of the communication distance in this study further advances the application of electrocommunication.
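For readers unfamiliar with 2FSK, the following sketch shows the modulation step and a classical demodulator. It is illustrative only: the tone frequencies (20 kHz and 30 kHz) and the 200 kHz sampling rate are assumed values, only the 5 Kbps bit rate comes from the abstract, and the non-coherent correlation detector is a textbook stand-in for the paper's deep-learning demodulator, not a reproduction of it.

```python
import numpy as np

def fsk2_modulate(bits, f0, f1, bit_rate, fs):
    """Map each bit to one bit period of a sine at f0 (bit 0) or f1 (bit 1)."""
    spb = int(fs // bit_rate)            # samples per bit
    t = np.arange(spb) / fs
    tones = {0: np.sin(2 * np.pi * f0 * t), 1: np.sin(2 * np.pi * f1 * t)}
    return np.concatenate([tones[int(b)] for b in bits])

def fsk2_demodulate(signal, f0, f1, bit_rate, fs):
    """Non-coherent detector: pick the tone with the larger correlation."""
    spb = int(fs // bit_rate)
    t = np.arange(spb) / fs
    ref0 = np.exp(-2j * np.pi * f0 * t)  # complex references remove the
    ref1 = np.exp(-2j * np.pi * f1 * t)  # dependence on carrier phase
    n_bits = len(signal) // spb
    frames = signal[: n_bits * spb].reshape(n_bits, spb)
    e0 = np.abs(frames @ ref0)
    e1 = np.abs(frames @ ref1)
    return (e1 > e0).astype(int)

# Assumed parameters: 4 and 6 carrier cycles per bit keep the tones orthogonal.
F0, F1, BIT_RATE, FS = 20e3, 30e3, 5e3, 200e3
rng = np.random.default_rng(1)
bits = rng.integers(0, 2, size=64)
tx = fsk2_modulate(bits, F0, F1, BIT_RATE, FS)
rx = tx + 0.2 * rng.standard_normal(tx.size)  # additive noise, not a channel model
decoded = fsk2_demodulate(rx, F0, F1, BIT_RATE, FS)
print("bit errors:", int(np.sum(decoded != bits)))
```

A learned demodulator would replace `fsk2_demodulate` with a model trained on received frames, which is where distortions specific to the underwater electric-field channel could be absorbed.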

[230]  arXiv:2008.00644 [pdf, other]
Title: GP-SLAM+: real-time 3D lidar SLAM based on improved regionalized Gaussian process map reconstruction
Comments: Accepted by IROS 2020
Subjects: Robotics (cs.RO)

This paper presents a 3D lidar SLAM system based on improved regionalized Gaussian process (GP) map reconstruction to provide both low-drift state estimation and mapping in real-time for robotics applications. We utilize spatial GP regression to model the environment. This tool enables us to recover surfaces including those in sparsely scanned areas and obtain uniform samples with uncertainty. Those properties facilitate robust data association and map updating in our scan-to-map registration scheme, especially when working with sparse range data. Compared with previous GP-SLAM, this work overcomes the prohibitive computational complexity of GP and redesigns the registration strategy to meet the accuracy requirements in 3D scenarios. For large-scale tasks, a two-thread framework is employed to suppress the drift further. Aerial and ground-based experiments demonstrate that our method allows robust odometry and precise mapping in real-time. It also outperforms the state-of-the-art lidar SLAM systems in our tests with light-weight sensors.

[231]  arXiv:2008.00645 [pdf, other]
Title: Classification from Ambiguity Comparisons
Comments: Code and Dataset: this https URL
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)

Labeling data is an unavoidable pre-processing procedure for most machine learning tasks. However, it takes a considerable amount of time, money, and labor to collect accurate \textit{explicit class labels} for a large dataset. A positivity comparison oracle has been adapted to relieve this burden, where two data points are received as input and the oracle answers which one is more likely to be positive. However, when information about the classification threshold is lacking, this oracle alone can at most rank all data points on the basis of their relative positivity; thus, it still needs to access explicit class labels. In order to harness pairwise comparisons in a more effective way, we propose an \textit{ambiguity comparison oracle}. This oracle also receives two data points as input, and it answers which one is more ambiguous, or more difficult to assign a label to. We then propose an efficient adaptive labeling algorithm that can actively query \textit{only pairwise comparison oracles} without accessing the explicit labeling oracle. We also address the situation where the labeling budget is insufficient compared to the dataset size, which can be dealt with by plugging the proposed algorithm into an active learning algorithm. Furthermore, we confirm the feasibility of the proposed oracle and the performance of the proposed labeling algorithms theoretically and empirically.

[232]  arXiv:2008.00646 [pdf, other]
Title: Interpretable Sequence Learning for COVID-19 Forecasting
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)

We propose a novel approach that integrates machine learning into compartmental disease modeling to predict the progression of COVID-19. Our model is explainable by design as it explicitly shows how different compartments evolve and it uses interpretable encoders to incorporate covariates and improve performance. Explainability is valuable to ensure that the model's forecasts are credible to epidemiologists and to instill confidence in end-users such as policy makers and healthcare institutions. Our model can be applied at different geographic resolutions, and here we demonstrate it for states and counties in the United States. We show that our model provides more accurate forecasts, in metrics averaged across the entire US, than state-of-the-art alternatives, and that it provides qualitatively meaningful explanatory insights. Lastly, we analyze the performance of our model for different subgroups based on the subgroup distributions within the counties.

[233]  arXiv:2008.00650 [pdf, other]
Title: On the Expressive Power of Higher-Order Pushdown Systems
Authors: Paweł Parys
Subjects: Formal Languages and Automata Theory (cs.FL)

We show that deterministic collapsible pushdown automata of second order can recognize a language that is not recognizable by any deterministic higher-order pushdown automaton (without collapse) of any order. This implies that there exists a tree generated by a second order collapsible pushdown system (equivalently, by a recursion scheme of second order) that is not generated by any deterministic higher-order pushdown system (without collapse) of any order (equivalently, by any safe recursion scheme of any order). As a side effect, we present a pumping lemma for deterministic higher-order pushdown automata, which potentially can be useful for other applications.

[234]  arXiv:2008.00653 [pdf, ps, other]
Title: On the Approximation of Local Expansions of Laplace Potentials by the Fast Multipole Method
Comments: 25 pages, 3 figures
Subjects: Numerical Analysis (math.NA)

In this paper, we present a generalization of the classical error bounds of Greengard-Rokhlin for the Fast Multipole Method (FMM) for Laplace potentials in three dimensions, extended to the case of local expansion (instead of point) targets. We also present a complementary, less sharp error bound proven via approximation theory whose applicability is not restricted to Laplace potentials. Our study is motivated by the GIGAQBX FMM, an algorithm for the fast, high-order accurate evaluation of layer potentials near and on the source layer. GIGAQBX is based on the FMM, but unlike a conventional FMM, which is designed to evaluate potentials at point-shaped targets, GIGAQBX evaluates local expansions of potentials at ball-shaped targets. Although the accuracy (or the acceleration error, i.e., error due to the approximation of the potential by the fast algorithm) of the conventional FMM is well understood, the acceleration error of FMM-based algorithms applied to the evaluation of local expansions has not been as well studied. The main contribution of this paper is a proof of a set of hypotheses first demonstrated numerically in the paper "A Fast Algorithm for Quadrature by Expansion in Three Dimensions," which pertain to the accuracy of FMM approximation of local expansions of Laplace potentials in three dimensions. These hypotheses are also essential to the three-dimensional error bound for GIGAQBX, which was previously stated conditionally on their truth and can now be stated unconditionally.

[235]  arXiv:2008.00658 [pdf]
Title: PIC-Net: Point Cloud and Image Collaboration Network for Large-Scale Place Recognition
Comments: 8 pages, 8 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Place recognition is an active research area in automation technology and remains an open problem. Cameras and LiDAR are the two mainstream sensors used for this task: camera-based methods are easily affected by illumination and seasonal changes, while LiDAR cannot capture data as rich as images. In this paper, we propose PIC-Net (Point cloud and Image Collaboration Network), which uses an attention mechanism to fuse image and point cloud features and to mine the complementary information between the two. Furthermore, to improve recognition performance at night, we transform night images into the daytime style. Comparison results show that the collaboration of image and point cloud outperforms both image-based and point cloud-based methods, and that the attention strategy and day-night transform further improve performance.

[236]  arXiv:2008.00663 [pdf, ps, other]
Title: Near MDS codes from oval polynomials
Subjects: Information Theory (cs.IT)

A linear code with parameters of the form $[n, k, n-k+1]$ is referred to as an MDS (maximum distance separable) code. A linear code with parameters of the form $[n, k, n-k]$ is said to be almost MDS (i.e., almost maximum distance separable) or AMDS for short. A code is said to be near maximum distance separable (in short, near MDS or NMDS) if both the code and its dual are almost maximum distance separable. Near MDS codes correspond to interesting objects in finite geometry and have nice applications in combinatorics and cryptography. In this paper, seven infinite families of $[2^m+1, 3, 2^m-2]$ near MDS codes over $\mathrm{GF}(2^m)$ and seven infinite families of $[2^m+2, 3, 2^m-1]$ near MDS codes over $\mathrm{GF}(2^m)$ are constructed with special oval polynomials for odd $m$. In addition, nine infinite families of optimal $[2^m+3, 3, 2^m]$ near MDS codes over $\mathrm{GF}(2^m)$ are constructed with oval polynomials in general.
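
As a small illustration of the definitions above, the following Python sketch computes the Singleton defect $n - k + 1 - d$ of an $[n, k, d]$ code and labels it MDS (defect 0) or AMDS (defect 1). The function names are hypothetical and not taken from the paper. Note that the near MDS property additionally requires the dual code to be AMDS, which cannot be read off from $[n, k, d]$ alone, so it is not checked here.

```python
def singleton_defect(n: int, k: int, d: int) -> int:
    """Return the Singleton defect n - k + 1 - d of an [n, k, d] linear code."""
    if not (1 <= k <= n and 1 <= d <= n - k + 1):
        raise ValueError("parameters violate the Singleton bound d <= n - k + 1")
    return n - k + 1 - d


def classify(n: int, k: int, d: int) -> str:
    """Label a parameter set as MDS, AMDS, or neither (hypothetical helper)."""
    defect = singleton_defect(n, k, d)
    if defect == 0:
        return "MDS"
    if defect == 1:
        return "AMDS"
    return "neither"


# Parameter sets of the three families in the abstract, instantiated for m = 3.
m = 3
q = 2 ** m
for n, k, d in [(q + 1, 3, q - 2), (q + 2, 3, q - 1), (q + 3, 3, q)]:
    print((n, k, d), classify(n, k, d))
```

Each of the three parameter sets has defect 1, which is consistent with the codes being AMDS, a necessary condition for near MDS.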

[237]  arXiv:2008.00665 [pdf]
Title: The pursuit of beauty: Converting image labels to meaningful vectors
Comments: 20 pages, 8 figures, 1 table
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

A challenge for the computer vision community is to understand the semantics of an image, in order to allow image reconstruction based on existing high-level features or to better analyze (semi-)labelled datasets. Towards addressing this challenge, this paper introduces a method, called Occlusion-based Latent Representations (OLR), for converting image labels to meaningful representations that capture a significant amount of data semantics. Besides being informationally rich, these representations compose a disentangled low-dimensional latent space where each image label is encoded into a separate vector. We evaluate the quality of these representations in a series of experiments whose results suggest that the proposed model can capture data concepts and discover data interrelations.

[238]  arXiv:2008.00666 [pdf, other]
Title: Exemplar-based Layout Fine-tuning for Node-link Diagrams
Subjects: Graphics (cs.GR); Human-Computer Interaction (cs.HC)

We design and evaluate a novel layout fine-tuning technique for node-link diagrams that facilitates exemplar-based adjustment of a group of substructures in batching mode. The key idea is to transfer user modifications on a local substructure to other substructures in the whole graph that are topologically similar to the exemplar. We first precompute a canonical representation for each substructure with node embedding techniques and then use it for on-the-fly substructure retrieval. We design and develop a light-weight interactive system to enable intuitive adjustment, modification transfer, and visual graph exploration. We also report some results of quantitative comparisons, three case studies, and a within-participant user study.

[239]  arXiv:2008.00670 [pdf]
Title: Deep Learning based Topic Analysis on Financial Emerging Event Tweets
Subjects: Computation and Language (cs.CL)

Financial analyses of stock markets rely heavily on quantitative approaches in an attempt to predict subsequent market movements based on historical prices and other measurable metrics. These quantitative analyses may miss unquantifiable aspects such as sentiment and speculation that also impact the market. Analyzing vast amounts of qualitative text data to understand public opinion on social media platforms is one approach to address this gap. This work carried out topic analysis on 28264 financial tweets [1] via clustering to discover emerging events in the stock market. Three main topics were found to be discussed frequently within the period. First, the financial ratio EPS is a measure that was discussed frequently by investors. Second, short selling of shares was discussed heavily, often together with Morgan Stanley. Third, the oil and energy sectors were often discussed together with policy. The tweets were semantically clustered as follows. The word2vec algorithm was used to obtain word embeddings that map words to vectors, and semantic word clusters were then formed. Each tweet was then vectorized using the Term Frequency-Inverse Document Frequency (TF-IDF) values of its words and the clusters to which those words belonged. Tweet vectors were then converted to compressed representations by training a deep autoencoder, and K-means clusters were formed. This method reduces dimensionality and produces dense vectors, in contrast to the usual Vector Space Model. Topic modelling with Latent Dirichlet Allocation (LDA) and the most frequent words were used to analyze clusters and reveal emerging events.

[240]  arXiv:2008.00674 [pdf, other]
Title: Reinforcement Solver for H-infinity Filter with Bounded Noise
Subjects: Systems and Control (eess.SY); Optimization and Control (math.OC)

The H-infinity filter has been widely applied in engineering, but coping with bounded noise is still an open problem and difficult to solve. This paper considers the H-infinity filtering problem for linear systems with bounded process and measurement noise. The problem is first formulated as a zero-sum game where the dynamics of the estimation error are non-affine with respect to the filter gain and measurement noise. A nonquadratic Hamilton-Jacobi-Isaacs (HJI) equation is then derived by employing a nonquadratic cost to characterize the bounded noise; this equation is extremely difficult to solve due to its non-affine and nonlinear properties. Next, a reinforcement learning algorithm based on the gradient descent method, which can handle nonlinearity, is proposed to update the gain of the reinforcement filter, where the measurement noise is fixed to tackle the non-affine property and increase the convexity of the Hamiltonian. Two examples demonstrate the convergence and effectiveness of the proposed algorithm.

[241]  arXiv:2008.00679 [pdf, other]
Title: Cooperative Control of Mobile Robots with Stackelberg Learning
Comments: 8 pages, 7 figures
Journal-ref: Proceedings of the 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
Subjects: Robotics (cs.RO); Computer Science and Game Theory (cs.GT); Machine Learning (cs.LG); Multiagent Systems (cs.MA)

Multi-robot cooperation requires agents to make decisions that are consistent with the shared goal without disregarding action-specific preferences that might arise from asymmetry in capabilities and individual objectives. To accomplish this goal, we propose a method named SLiCC: Stackelberg Learning in Cooperative Control. SLiCC models the problem as a partially observable stochastic game composed of Stackelberg bimatrix games, and uses deep reinforcement learning to obtain the payoff matrices associated with these games. Appropriate cooperative actions are then selected with the derived Stackelberg equilibria. Using a bi-robot cooperative object transportation problem, we validate the performance of SLiCC against centralized multi-agent Q-learning and demonstrate that SLiCC achieves better combined utility.

[242]  arXiv:2008.00681 [pdf, other]
Title: A robust but easily implementable remote control for quadrotors: Experimental acrobatic flight tests
Comments: 9th International Conference on Advanced Technologies (ICAT'20), 10-12 August 2020, Istanbul, Turkey
Subjects: Systems and Control (eess.SY); Optimization and Control (math.OC)

Experimental flight tests of quadrotor UAVs are reported, using a recent model-free control (MFC) strategy that is easily implementable. We show that it is possible to achieve acrobatic rate control of the UAV, which is beyond the previous standard. The same remote controller is tested on two physical vehicles without any re-tuning. In both cases it produces low tracking error. We show that MFC is robust even when the quadrotor is highly damaged. Video footage can be found at: https://youtu.be/wtSLalA4szc

[243]  arXiv:2008.00682 [pdf, ps, other]
Title: Discovering indicators of dark horse of soccer games by deep learning from sequential trading data
Authors: Liyao Lu, Qiang Lyu
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)

It is no surprise that machine learning models provide decent prediction accuracy for soccer game outcomes based on various objective metrics. However, the performance is less impressive when predicting difficult and valuable matches. A deep learning model is designed and trained on real sequential trading data from a real prediction market, under the assumption that such trading data contain critical latent information that determines game outcomes. To train our model, a new loss function is proposed that biases the selection toward matches with high investment return. A full investigation of 4669 top soccer league matches showed that our model traded off prediction accuracy for high value return due to a certain ability to detect dark horses. A further attempt is made to depict some indicators discovered by our model that describe key features of big dark horses and regular hot horses.

[244]  arXiv:2008.00693 [pdf, other]
Title: Compliant Manipulation of Free-Floating Objects
Comments: Published in ICRA 2018 proceedings
Journal-ref: 2018 IEEE International Conference on Robotics and Automation (ICRA), Brisbane, QLD, 2018, pp. 865-872
Subjects: Robotics (cs.RO)

Compliant motions allow alignment of workpieces using naturally occurring interaction forces. However, free-floating objects do not have a fixed base to absorb the reaction forces caused by the interactions. Consequently, if the interaction forces are too high, objects can gain momentum and move away after contact. This paper proposes an approach based on direct force control for compliant manipulation of free-floating objects. The objective of the controller is to minimize the interaction forces while maintaining the contact. The proposed approach achieves this by maintaining a small constant force along the motion direction and an apparent reduction of manipulator inertia along the remaining Degrees of Freedom (DOF). Simulation results emphasize the importance of the relative inertia of the robotic manipulator with respect to the free-floating object. The experiments were performed with a KUKA LWR4+ manipulator arm and a two-dimensional micro-gravity emulator (object floating on an air bed), which was developed in-house. It was verified that the proposed control law is capable of controlling the interaction forces and aligning the tools without pushing the object away. We conclude that direct force control works better with a free-floating object than implicit force control algorithms, such as impedance control.

[245]  arXiv:2008.00694 [pdf, other]
Title: Asynchronous Periodic Distributed Event-Triggered Frequency Control of Microgrids
Comments: 8 pages
Subjects: Systems and Control (eess.SY); Dynamical Systems (math.DS); Optimization and Control (math.OC)

In this paper, we introduce a distributed secondary frequency control scheme for an islanded ac microgrid under event-triggered communication. An integral type event-triggered mechanism is proposed by which each distributed generator (DG) asynchronously and periodically checks its triggering condition and determines whether to update its control inputs and broadcast its states to neighboring DGs. In contrast to existing event-triggered strategies on secondary control of microgrids, under the proposed sampled-data based event-triggered mechanism, DGs need not be synchronized to a common clock and each individual DG checks its triggering condition periodically, relying on its own clock. Furthermore, the proposed method efficiently reduces communication and computation complexity. We provide sufficient conditions under which all DGs' frequencies asymptotically converge to the common reference frequency value. Finally, effectiveness of our proposed method is verified by simulating different scenarios on a well-established islanded ac microgrid benchmark in the MATLAB/Simulink environment.

[246]  arXiv:2008.00695 [pdf, ps, other]
Title: The Subfield Codes of $[q+1, 2, q]$ MDS Codes
Subjects: Information Theory (cs.IT)

Recently, subfield codes of geometric codes over large finite fields $\mathrm{GF}(q)$ with dimension $3$ and $4$ were studied and distance-optimal subfield codes over $\mathrm{GF}(p)$ were obtained, where $q=p^m$. The key idea for obtaining good subfield codes over small fields is to choose very good linear codes over an extension field with small dimension. This paper first presents a general construction of $[q+1, 2, q]$ MDS codes over $\mathrm{GF}(q)$, and then studies the subfield codes over $\mathrm{GF}(p)$ of some of the $[q+1, 2, q]$ MDS codes over $\mathrm{GF}(q)$. Several families of distance-optimal codes over small fields are produced.

[247]  arXiv:2008.00696 [pdf, other]
Title: Heterogeneous Swarms for Maritime Dynamic Target Search and Tracking
Comments: Accepted for IEEE/MTS OCEANS 2020, Singapore
Subjects: Robotics (cs.RO); Systems and Control (eess.SY)

Current strategies employed for maritime target search and tracking are primarily based on the use of agents following a predetermined path to perform a systematic sweep of a search area. Recently, dynamic Particle Swarm Optimization (PSO) algorithms have been used together with swarming multi-robot systems (MRS), giving search and tracking solutions the added properties of robustness, scalability, and flexibility. Swarming MRS also give the end-user the opportunity to incrementally upgrade the robotic system, inevitably leading to the use of heterogeneous swarming MRS. However, such systems have not been well studied and incorporating upgraded agents into a swarm may result in degraded mission performances. In this paper, we propose a PSO-based strategy using a topological k-nearest neighbor graph with tunable exploration and exploitation dynamics with an adaptive repulsion parameter. This strategy is implemented within a simulated swarm of 50 agents with varying proportions of fast agents tracking a target represented by a fictitious binary function. Through these simulations, we are able to demonstrate an increase in the swarm's collective response level and target tracking performance by substituting in a proportion of fast buoys.

[248]  arXiv:2008.00697 [pdf, other]
Title: Adversarial Semantic Data Augmentation for Human Pose Estimation
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Human pose estimation is the task of localizing body keypoints from still images. State-of-the-art methods suffer from insufficient examples of challenging cases such as symmetric appearance, heavy occlusion and nearby persons. To enlarge the number of challenging cases, previous methods augmented images by cropping and pasting image patches with weak semantics, which leads to unrealistic appearance and limited diversity. We instead propose Semantic Data Augmentation (SDA), a method that augments images by pasting segmented body parts with various semantic granularity. Furthermore, we propose Adversarial Semantic Data Augmentation (ASDA), which exploits a generative network to dynamically predict a tailored pasting configuration. Given an off-the-shelf pose estimation network as the discriminator, the generator seeks the most confusing transformation to increase the loss of the discriminator, while the discriminator takes the generated sample as input and learns from it. The whole pipeline is optimized in an adversarial manner. State-of-the-art results are achieved on challenging benchmarks.

[249]  arXiv:2008.00698 [pdf, other]
Title: Anti-Bandit Neural Architecture Search for Model Defense
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

Deep convolutional neural networks (DCNNs) have dominated as the best performers in machine learning, but can be challenged by adversarial attacks. In this paper, we defend against adversarial attacks using neural architecture search (NAS) which is based on a comprehensive search of denoising blocks, weight-free operations, Gabor filters and convolutions. The resulting anti-bandit NAS (ABanditNAS) incorporates a new operation evaluation measure and search process based on the lower and upper confidence bounds (LCB and UCB). Unlike the conventional bandit algorithm, which uses UCB for evaluation only, we use UCB to abandon arms for search efficiency and LCB for a fair competition between arms. Extensive experiments demonstrate that ABanditNAS is faster than other NAS methods, while achieving an $8.73\%$ improvement over prior art on CIFAR-10 under PGD-$7$.
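
To make the roles of the two bounds concrete, here is a minimal Python sketch that computes UCB1-style lower and upper confidence bounds for a set of arms and abandons the arm with the smallest upper bound. The abstract does not give the exact bounds or update rules of ABanditNAS, so the width $\sqrt{2 \ln t / n_i}$, the function names, and the example statistics are all assumptions for illustration only.

```python
import math


def confidence_bounds(mean: float, pulls: int, total_pulls: int) -> tuple[float, float]:
    """Return (LCB, UCB) using a UCB1-style width (an assumed form)."""
    width = math.sqrt(2.0 * math.log(total_pulls) / pulls)
    return mean - width, mean + width


def arm_to_abandon(stats: dict[str, tuple[float, int]]) -> str:
    """Pick the arm whose upper confidence bound is smallest.

    stats maps an arm name to (mean reward, number of pulls).
    """
    total = sum(pulls for _, pulls in stats.values())
    return min(stats, key=lambda arm: confidence_bounds(*stats[arm], total)[1])


# Hypothetical operation statistics: (mean reward, pulls).
stats = {"conv": (0.8, 10), "gabor": (0.5, 10), "denoise": (0.6, 2)}
print(arm_to_abandon(stats))
```

In this example the rarely sampled arm keeps a wide interval and therefore a large upper bound, so it is not abandoned prematurely; the arm removed is the one that is both well sampled and low in mean reward.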

[250]  arXiv:2008.00699 [pdf, other]
Title: Getting to Know One Another: Calibrating Intent, Capabilities and Trust for Human-Robot Collaboration
Comments: IROS 2020
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Human-Computer Interaction (cs.HC); Multiagent Systems (cs.MA)

Common experience suggests that agents who know each other well are better able to work together. In this work, we address the problem of calibrating intention and capabilities in human-robot collaboration. In particular, we focus on scenarios where the robot is attempting to assist a human who is unable to directly communicate her intent. Moreover, both agents may have differing capabilities that are unknown to one another. We adopt a decision-theoretic approach and propose the TICC-POMDP for modeling this setting, with an associated online solver. Experiments show our approach leads to better team performance both in simulation and in a real-world study with human subjects.

[251]  arXiv:2008.00701 [pdf, other]
Title: Memory Optimal Dispersion by Anonymous Mobile Robots
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

Consider a team of $k \leq n$ autonomous mobile robots initially placed at a node of an arbitrary graph $G$ with $n$ nodes. The dispersion problem asks for a distributed algorithm that allows the robots to reach a configuration in which each robot is at a distinct node of the graph. If the robots are anonymous, i.e., they do not have any unique identifiers, then the problem is not solvable by any deterministic algorithm. However, the problem can be solved even by anonymous robots if each robot is given access to a fair coin which it can use to generate random bits. In this setting, it is known that the robots require $\Omega(\log{\Delta})$ bits of memory to achieve dispersion, where $\Delta$ is the maximum degree of $G$. On the other hand, the best known memory upper bound is $\min \{\Delta, \max\{\log{\Delta}, \log{D}\}\}$ ($D$ = diameter of $G$), which can be $\omega(\log{\Delta})$, depending on the values of $\Delta$ and $D$. In this paper, we close this gap by presenting an optimal algorithm requiring $O(\log{\Delta})$ bits of memory.

[252]  arXiv:2008.00706 [pdf, other]
Title: LiDAR point-cloud processing based on projection methods: a comparison
Subjects: Robotics (cs.RO)

An accurate and rapid-response perception system is fundamental for autonomous vehicles to operate safely. 3D object detection methods handle point clouds given by LiDAR sensors to provide accurate depth and position information for each detection, together with its dimensions and classification. The information is then used to track vehicles and other obstacles in the surroundings of the autonomous vehicle, and also to feed control units that guarantee collision avoidance and motion planning. Current object detection systems can be divided into two main categories. The first is geometry-based methods, which retrieve obstacles using geometric and morphological operations on the 3D points. The second is deep learning-based methods, which process the 3D points, or an elaboration of the 3D point cloud, with deep learning techniques to retrieve a set of obstacles. This paper presents a comparison between these two approaches, with one implementation of each class on a real autonomous vehicle. The accuracy of the algorithms' estimates was evaluated with experimental tests carried out at the Monza ENI circuit. The positions of the ego vehicle and the obstacle are given by GPS sensors with RTK correction, which guarantees an accurate ground truth for the comparison. Both algorithms have been implemented on ROS and run on a consumer laptop.

[253]  arXiv:2008.00710 [pdf, other]
Title: Deep Complementary Joint Model for Complex Scene Registration and Few-shot Segmentation on Medical Images
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Deep learning-based joint models for medical image registration and segmentation utilize the complementarity between the two tasks (augmentation data or weakly supervised data from registration, region constraints from segmentation) to bring mutual improvement in complex scenes and few-shot situations. However, further adoption of the joint models is hindered by three issues: 1) the diversity of augmentation data is reduced, limiting further enhancement of segmentation; 2) misaligned regions in weakly supervised data disturb the training process; 3) the lack of label-based region constraints in few-shot situations limits registration performance. We propose a novel Deep Complementary Joint Model (DeepRS) for complex scene registration and few-shot segmentation. We embed a perturbation factor in the registration to increase the activity of deformation, thus maintaining the diversity of the augmentation data. We use a pixel-wise discriminator to extract alignment confidence maps that highlight aligned regions in weakly supervised data, so that the disturbance from misaligned regions is suppressed via weighting. The outputs from the segmentation model are utilized to implement deep-based region constraints, thus relieving the label requirements and yielding fine registration. Extensive experiments on the CT dataset of the MM-WHS 2017 Challenge show the great advantages of our DeepRS, which outperforms the existing state-of-the-art models.

[254]  arXiv:2008.00714 [pdf, other]
Title: AE TextSpotter: Learning Visual and Linguistic Representation for Ambiguous Text Spotting
Comments: Accepted by ECCV 2020
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Scene text spotting aims to detect and recognize entire words or sentences with multiple characters in natural images. It is still challenging because ambiguity often occurs when the spacing between characters is large or the characters are evenly spread in multiple rows and columns, allowing many visually plausible groupings of the characters (e.g. "BERLIN" is incorrectly detected as "BERL" and "IN" in Fig. 1(c)). Unlike previous works that merely employed visual features for text detection, this work proposes a novel text spotter, named Ambiguity Eliminating Text Spotter (AE TextSpotter), which learns both visual and linguistic features to significantly reduce ambiguity in text detection. The proposed AE TextSpotter has three important benefits. 1) The linguistic representation is learned together with the visual representation in a single framework. To our knowledge, this is the first time text detection has been improved by using a language model. 2) A carefully designed language module is utilized to reduce the detection confidence of incorrect text lines, making them easy to prune in the detection stage. 3) Extensive experiments show that AE TextSpotter outperforms other state-of-the-art methods by a large margin. For example, we carefully select a set of extremely ambiguous samples from the IC19-ReCTS dataset, where our approach surpasses other methods by more than 4%.

[255]  arXiv:2008.00715 [pdf, other]
Title: Learning to Drive Small Scale Cars from Scratch
Subjects: Robotics (cs.RO)

We consider the problem of learning to drive low-cost small scale cars using reinforcement learning. It is challenging to handle the long-tailed distributions of events in the real-world with handcrafted logical rules and reinforcement learning could be a potentially more scalable solution to deal with them. We adopt an existing platform called Donkey car for low-cost repeatable and reproducible research in autonomous driving. We consider the task of learning to drive around a track, given only monocular image observations from an on-board camera. We demonstrate that the soft actor-critic algorithm combined with state representation learning using a variational autoencoder can learn to drive around randomly generated tracks on the Donkey car simulator and a real-world track using the Donkey car platform. Our agent can learn from scratch using sparse and noisy rewards within just 10 minutes of driving experience.

[256]  arXiv:2008.00720 [pdf, ps, other]
Title: Pseudoinverse Graph Convolutional Networks: Fast Filters Tailored for Large Eigengaps of Dense Graphs and Hypergraphs
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)

Graph Convolutional Networks (GCNs) have proven to be successful tools for semi-supervised classification on graph-based datasets. We propose a new GCN variant whose three-part filter space is targeted at dense graphs. Examples include Gaussian graphs for 3D point clouds with an increased focus on non-local information, as well as hypergraphs based on categorical data. These graphs differ from the common sparse benchmark graphs in terms of the spectral properties of their graph Laplacian. Most notably we observe large eigengaps, which are unfavorable for popular existing GCN architectures. Our method overcomes these issues by utilizing the pseudoinverse of the Laplacian. Another key ingredient is a low-rank approximation of the convolutional matrix, ensuring computational efficiency and increasing accuracy at the same time. We outline how the necessary eigeninformation can be computed efficiently in each application and discuss the appropriate choice of the only metaparameter, the approximation rank. We finally showcase our method's performance regarding runtime and accuracy in various experiments with real-world datasets.

[257]  arXiv:2008.00724 [pdf, ps, other]
Title: A lemma on closures and its application to modularity in logic programming semantics
Authors: Michael J. Maher
Comments: 12 pages
Subjects: Logic in Computer Science (cs.LO)

This note points out a lemma on closures of monotonic increasing functions and shows how it is applicable to decomposition and modularity for semantics defined as the least fixedpoint of some monotonic function. In particular it applies to numerous semantics of logic programs. An appendix addresses the fixedpoints of (possibly non-monotonic) functions that are sandwiched between functions with the same fixedpoints.

[258]  arXiv:2008.00727 [pdf]
Title: Deep Bayesian Bandits: Exploring in Online Personalized Recommendations
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)

Recommender systems trained in a continuous learning fashion are plagued by the feedback loop problem, also known as algorithmic bias. This causes a newly trained model to act greedily and favor items that have already been engaged by users. This behavior is particularly harmful in personalised ads recommendations, as it can also cause new campaigns to remain unexplored. Exploration aims to address this limitation by providing new information about the environment, which encompasses user preference, and can lead to higher long-term reward. In this work, we formulate a display advertising recommender as a contextual bandit and implement exploration techniques that require sampling from the posterior distribution of click-through-rates in a computationally tractable manner. Traditional large-scale deep learning models do not provide uncertainty estimates by default. We approximate these uncertainty measurements of the predictions by employing a bootstrapped model with multiple heads and dropout units. We benchmark a number of different models in an offline simulation environment using a publicly available dataset of user-ads engagements. We test our proposed deep Bayesian bandits algorithm in the offline simulation and online AB setting with large-scale production traffic, where we demonstrate a positive gain of our exploration model.

[259]  arXiv:2008.00730 [pdf, other]
Title: Nonlinearity continuation method for steady-state groundwater flow modeling in variably saturated conditions
Subjects: Numerical Analysis (math.NA)

We present an application of the nonlinearity continuation method to the numerical solution of steady-state groundwater flow in variably saturated conditions. To solve the system of nonlinear equations obtained by finite volume discretization of the steady-state Richards equation, a series of problems with increasing nonlinearity is solved using the Newton method. This approach is compared to the pseudo-transient method on several test cases, including real site problems and problems involving parallel computations.
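
The continuation idea, solving a sequence of problems of increasing nonlinearity with Newton's method and reusing each solution as the next initial guess, can be sketched on a toy scalar equation. The equation `u + s*c*u**3 - 1 = 0`, the parameter `s`, and the function names are assumptions chosen for illustration; this is not the Richards equation or the authors' discretization.

```python
def newton(f, df, u0, tol=1e-12, max_iter=50):
    """Plain Newton iteration for a scalar equation f(u) = 0."""
    u = u0
    for _ in range(max_iter):
        r = f(u)
        if abs(r) < tol:
            return u
        u -= r / df(u)
    raise RuntimeError("Newton iteration did not converge")


def solve_by_continuation(steps=5, c=10.0):
    """Solve u + c*u**3 = 1 by ramping the nonlinear term from 0 to c."""
    u = 1.0  # exact solution of the linear problem (s = 0)
    for k in range(1, steps + 1):
        s = k / steps  # nonlinearity strength, increasing towards 1
        # Warm start: the previous solution is the initial guess here.
        u = newton(lambda v: v + s * c * v**3 - 1.0,
                   lambda v: 1.0 + 3.0 * s * c * v**2,
                   u)
    return u
```

Each intermediate problem is only a small step away from the previous one, so Newton starts close to the solution at every stage.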

[260]  arXiv:2008.00739 [pdf, other]
Title: Distributed Localization of Wireless Sensor Network Using Communication Wheel
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

We study the network localization problem, i.e., the problem of determining node positions of a wireless sensor network modeled as a unit disk graph. In an arbitrarily deployed network, positions of all nodes of the network may not be uniquely determined. It is known that even if the network corresponds to a unique solution, no polynomial-time algorithm can solve this problem in the worst case, unless RP = NP. So we are interested in algorithms that efficiently localize the network partially. A widely used technique that can efficiently localize a uniquely localizable portion of the network is trilateration: starting from three anchors (nodes with known positions), nodes having at least three localized neighbors are sequentially localized. However, the performance of trilateration can substantially differ for different choices of the initial three anchors. In this paper, we propose a distributed localization scheme with a theoretical characterization of nodes that are guaranteed to be localized. In particular, our proposed distributed algorithm starts localization from a strongly interior node and provided that the subgraph induced by the strongly interior nodes is connected, it localizes all nodes of the network except some boundary nodes and isolated weakly interior nodes.
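
The trilateration step mentioned in this abstract, localizing a node from three neighbours with known positions, can be sketched as follows. The sketch assumes exact distance measurements in the plane and non-collinear anchors. The function name `trilaterate` is hypothetical, and this is only the basic building block, not the distributed scheme the paper proposes.

```python
import math


def trilaterate(anchors, distances):
    """Return (x, y) given three anchor positions and distances to them."""
    (x1, y1), (x2, y2), (x3, y3) = anchors
    d1, d2, d3 = distances
    # Subtracting the first circle equation from the other two gives a
    # 2x2 linear system in the unknown position (x, y).
    a11, a12 = 2.0 * (x2 - x1), 2.0 * (y2 - y1)
    a21, a22 = 2.0 * (x3 - x1), 2.0 * (y3 - y1)
    b1 = d1**2 - d2**2 + x2**2 + y2**2 - x1**2 - y1**2
    b2 = d1**2 - d3**2 + x3**2 + y3**2 - x1**2 - y1**2
    det = a11 * a22 - a12 * a21
    if math.isclose(det, 0.0, abs_tol=1e-12):
        raise ValueError("anchors are collinear; position is not unique")
    # Cramer's rule for the 2x2 system.
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - b1 * a21) / det
    return x, y
```

Once a node is localized this way it can act as an anchor for its own neighbours, which is what makes the sequential procedure sensitive to the choice of the initial three anchors.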

[261]  arXiv:2008.00741 [pdf, other]
Title: Low-loss connection of weight vectors: distribution-based approaches
Comments: accepted to ICML 2020
Subjects: Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)

Recent research shows that sublevel sets of the loss surfaces of overparameterized networks are connected, exactly or approximately. We describe and compare experimentally a panel of methods used to connect two low-loss points by a low-loss curve on this surface. Our methods vary in accuracy and complexity. Most of our methods are based on "macroscopic" distributional assumptions, and some are insensitive to the detailed properties of the points to be connected. Some methods require a prior training of a "global connection model" which can then be applied to any pair of points. The accuracy of the method generally correlates with its complexity and sensitivity to the endpoint detail.

[262]  arXiv:2008.00742 [pdf, ps, other]
Title: Collaborative Learning as an Agreement Problem
Subjects: Machine Learning (cs.LG); Distributed, Parallel, and Cluster Computing (cs.DC); Machine Learning (stat.ML)

We address the problem of Byzantine collaborative learning: a set of $n$ nodes try to collectively learn from data, whose distributions may vary from one node to another. None of them is trusted and $f < n$ of them can behave arbitrarily.
We show that collaborative learning is equivalent to a new form of agreement, which we call averaging agreement. In this problem, nodes start each with an initial vector and seek to approximately agree on a common vector, while guaranteeing that this common vector remains within a constant (also called averaging constant) of the maximum distance between the original vectors. Essentially, the smaller the averaging constant, the better the learning.
We present three asynchronous solutions to averaging agreement, each interesting in its own right. The first, based on the minimum volume ellipsoid, achieves asymptotically the best-possible averaging constant but requires $n \geq 6f+1$. The second, based on reliable broadcast, achieves optimal Byzantine resilience, i.e., $n \geq 3f+1$, but requires signatures and induces a large number of communication rounds. The third, based on coordinate-wise trimmed mean, is faster and achieves optimal Byzantine resilience, i.e., $n \geq 4f+1$, within standard form algorithms that do not use signatures.
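
The coordinate-wise trimmed mean underlying the third solution can be sketched as a simple aggregation rule. The sketch assumes all vectors are already collected in one place and uses a hypothetical function name. It shows only the aggregation primitive, not the asynchronous agreement protocol itself.

```python
def coordinatewise_trimmed_mean(vectors, f):
    """Average each coordinate after dropping the f smallest and f largest values."""
    n = len(vectors)
    if n <= 2 * f:
        raise ValueError("need more than 2f vectors to trim f from each side")
    dim = len(vectors[0])
    result = []
    for j in range(dim):
        # Sort the j-th coordinate across all nodes.
        column = sorted(v[j] for v in vectors)
        # Discard the f lowest and f highest values, then average the rest.
        kept = column[f:n - f]
        result.append(sum(kept) / len(kept))
    return result
```

For example, with `f = 1` a single outlier such as `[100, -100]` among five vectors is discarded in every coordinate, so it cannot move the result arbitrarily far.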

[263]  arXiv:2008.00744 [pdf, other]
Title: The End-of-End-to-End: A Video Understanding Pentathlon Challenge (2020)
Comments: Individual reports, dataset information, rules, and released source code can be found at the competition webpage (this https URL)
Subjects: Computer Vision and Pattern Recognition (cs.CV)

We present a new video understanding pentathlon challenge, an open competition held in conjunction with the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2020. The objective of the challenge was to explore and evaluate new methods for text-to-video retrieval, the task of searching for content within a corpus of videos using natural language queries. This report summarizes the results of the first edition of the challenge together with the findings of the participants.

[264]  arXiv:2008.00745 [pdf, other]
Title: Community membership consistency in corporate board interlock networks
Subjects: Social and Information Networks (cs.SI); Physics and Society (physics.soc-ph)

Community detection is a well established method for studying the meso scale structure of social networks. Applying a community detection algorithm results in a division of a network into communities that is often used to inspect and reason about community membership of specific nodes. This micro level interpretation step of community structure is a crucial step in typical social science research. However, the methodological caveat in this step is that virtually all modern community detection methods are non-deterministic and based on randomization and approximated results. This needs to be explicitly taken into consideration when reasoning about community membership of individual nodes. To do so, we propose a metric of \emph{community membership consistency}, which provides node-level insight into how reliable the placement of that node into a community really is. In addition, it enables us to distinguish the \emph{community core} members of a community. The usefulness of the proposed metrics is demonstrated on corporate board interlock networks, in which weighted links represent shared senior level directors between firms. Results suggest that the community structure of global business groups is centered around persistent communities consisting of core countries tied by geographical and cultural proximity. In addition, we identify fringe countries that appear to associate with a number of different global business communities.

[265]  arXiv:2008.00748 [pdf, other]
Title: Tensorizing GAN with High-Order Pooling for Alzheimer's Disease Assessment
Comments: 15 pages, 20 figures
Subjects: Machine Learning (cs.LG); Image and Video Processing (eess.IV); Machine Learning (stat.ML)

It is of great significance to apply deep learning for the early diagnosis of Alzheimer's Disease (AD). In this work, a novel tensorizing GAN with high-order pooling is proposed to assess Mild Cognitive Impairment (MCI) and AD. By tensorizing a three-player cooperative game based framework, the proposed model can benefit from the structural information of the brain. By incorporating the high-order pooling scheme into the classifier, the proposed model can make full use of the second-order statistics of the holistic Magnetic Resonance Imaging (MRI) images. To the best of our knowledge, the proposed Tensor-train, High-pooling and Semi-supervised learning based GAN (THS-GAN) is the first work to deal with classification on MRI images for AD diagnosis. Extensive experimental results on the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset are reported to demonstrate that the proposed THS-GAN achieves superior performance compared with existing methods, and to show that both tensor-train and high-order pooling can enhance classification performance. The visualization of generated samples also shows that the proposed model can generate plausible samples for semi-supervised learning purposes.

[266]  arXiv:2008.00752 [pdf]
Title: GmFace: A Mathematical Model for Face Image Representation Using Multi-Gaussian
Comments: 12 pages, 12 figures, 4 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

Establishing mathematical models is a ubiquitous and effective method to understand the objective world. Due to complex physiological structures and dynamic behaviors, mathematical representation of the human face is an especially challenging task. A mathematical model for face image representation called GmFace is proposed in the form of a multi-Gaussian function in this paper. The model utilizes the advantages of the two-dimensional Gaussian function, which provides a symmetric bell surface with a shape that can be controlled by parameters. The GmNet is then designed using Gaussian functions as neurons, with parameters that correspond to each of the parameters of GmFace in order to transform the problem of GmFace parameter solving into a network optimization problem of GmNet. The face modeling process can be described by the following steps: (1) GmNet initialization; (2) feeding GmNet with face image(s); (3) training GmNet until convergence; (4) drawing out the parameters of GmNet (the same as those of GmFace); (5) recording the face model GmFace. Furthermore, using GmFace, several face image transformation operations can be realized mathematically through simple parameter computation.

[267]  arXiv:2008.00759 [pdf, other]
Title: Proximal Deterministic Policy Gradient
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)

This paper introduces two simple techniques to improve off-policy Reinforcement Learning (RL) algorithms. First, we formulate off-policy RL as a stochastic proximal point iteration. The target network plays the role of the variable of optimization and the value network computes the proximal operator. Second, we exploit the two value functions commonly employed in state-of-the-art off-policy algorithms to provide an improved action value estimate through bootstrapping with limited increase of computational resources. Further, we demonstrate significant performance improvement over state-of-the-art algorithms on standard continuous-control RL benchmarks.

[268]  arXiv:2008.00760 [pdf, other]
Title: IntroVAC: Introspective Variational Classifiers for Learning Interpretable Latent Subspaces
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)

Learning useful representations of complex data has been the subject of extensive research for many years. With the diffusion of Deep Neural Networks, Variational Autoencoders have gained lots of attention since they provide an explicit model of the data distribution based on an encoder/decoder architecture which is able to both generate images and encode them in a low-dimensional subspace. However, the latent space is not easily interpretable and the generation capabilities show some limitations since images typically look blurry and lack details. In this paper, we propose the Introspective Variational Classifier (IntroVAC), a model that learns interpretable latent subspaces by exploiting information from an additional label and provides improved image quality thanks to an adversarial training strategy. We show that IntroVAC is able to learn meaningful directions in the latent space enabling fine-grained manipulation of image attributes. We validate our approach on the CelebA dataset.

[269]  arXiv:2008.00766 [pdf, other]
Title: Tracking the Race Between Deep Reinforcement Learning and Imitation Learning -- Extended Version
Comments: Extended Version of the Conference Paper published in the Proceedings of the 17th International Conference on Quantitative Evaluation of SysTems (QEST)
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)

Learning-based approaches for solving large sequential decision making problems have become popular in recent years. The resulting agents perform differently and their characteristics depend on those of the underlying learning approach. Here, we consider a benchmark planning problem from the reinforcement learning domain, the Racetrack, to investigate the properties of agents derived from different deep (reinforcement) learning approaches. We compare the performance of deep supervised learning, in particular imitation learning, to reinforcement learning for the Racetrack model. We find that imitation learning yields agents that follow more risky paths. In contrast, the decisions of deep reinforcement learning are more foresighted, i.e., avoid states in which fatal decisions are more likely. Our evaluations show that for this sequential decision making problem, deep reinforcement learning performs best in many aspects even though for imitation learning optimal decisions are considered.

[270]  arXiv:2008.00767 [pdf, other]
Title: DCSFN: Deep Cross-scale Fusion Network for Single Image Rain Removal
Comments: Accepted to ACM International Conference on Multimedia (MM'20)
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Rain removal is an important but challenging computer vision task, as rain streaks can severely degrade the visibility of images, which may cause other vision or multimedia tasks to fail. Previous works mainly focused on feature extraction and processing or on neural network structure. While current rain removal methods can already achieve remarkable results, training based on a single network structure without considering the cross-scale relationship may cause information drop-out. In this paper, we explore cross-scale fusion between networks and an inner-scale fusion operation to solve the image rain removal task. Specifically, to learn features at different scales, we propose a multi-sub-network structure in which the sub-networks are fused in a cross-scale manner by Gated Recurrent Units to inner-learn and make full use of the information at different scales in these sub-networks. Further, we design an inner-scale connection block that utilizes multi-scale information and feature fusion between different scales to improve rain representation ability, and we introduce a dense block with skip connections to inner-connect these blocks. Experimental results on both synthetic and real-world datasets demonstrate the superiority of our proposed method, which outperforms state-of-the-art methods. The source code will be available at https://supercong94.wixsite.com/supercong94.

[271]  arXiv:2008.00769 [pdf, other]
Title: A Low-Complexity Algorithmic Framework for Large-Scale IRS-Assisted Wireless Systems
Subjects: Information Theory (cs.IT); Signal Processing (eess.SP)

Intelligent reflecting surfaces (IRSs) are revolutionary enablers for next-generation wireless communication networks, with the ability to customize the radio propagation environment. To fully exploit the potential of IRS-assisted wireless systems, reflective elements have to be jointly optimized with conventional communication techniques. However, the resulting optimization problems pose significant algorithmic challenges, mainly due to the large-scale non-convex constraints induced by the passive hardware implementations. In this paper, we propose a low-complexity algorithmic framework incorporating alternating optimization and gradient-based methods for large-scale IRS-assisted wireless systems. The proposed algorithm provably converges to a stationary point of the optimization problem. Extensive simulation results demonstrate that the proposed framework provides significant speedups compared with existing algorithms, while achieving a comparable or better performance.

[272]  arXiv:2008.00774 [pdf, ps, other]
Title: Elsevier OA CC-By Corpus
Comments: 6 pages, 0 figures
Subjects: Computation and Language (cs.CL); Digital Libraries (cs.DL)

We introduce the Elsevier OA CC-BY corpus. This is the first open corpus of Scientific Research papers which has a representative sample from across scientific disciplines. This corpus not only includes the full text of the article, but also the metadata of the documents, along with the bibliographic information for each reference.

[273]  arXiv:2008.00777 [pdf, other]
Title: Dynamic and Static Context-aware LSTM for Multi-agent Motion Prediction
Comments: 17 pages, 6 figures
Journal-ref: ECCV 2020
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Multi-agent motion prediction is challenging because it aims to foresee the future trajectories of multiple agents (\textit{e.g.} pedestrians) simultaneously in a complicated scene. Existing work addressed this challenge by either learning social spatial interactions represented by the positions of a group of pedestrians, while ignoring their temporal coherence (\textit{i.e.} dependencies between different long trajectories), or understanding the complicated scene layout (\textit{e.g.} scene segmentation) to ensure safe navigation. However, unlike previous work that isolated the spatial interaction, temporal coherence, and scene layout, this paper designs a new mechanism, \textit{i.e.}, Dynamic and Static Context-aware Motion Predictor (DSCMP), to integrate this rich information into a long short-term memory (LSTM) network. It has three appealing benefits. (1) DSCMP models the dynamic interactions between agents by learning both their spatial positions and temporal coherence, as well as understanding the contextual scene layout. (2) Different from previous LSTM models that predict motions by propagating hidden features frame by frame, limiting the capacity to learn correlations between long trajectories, we carefully design a differentiable queue mechanism in DSCMP, which is able to explicitly memorize and learn the correlations between long trajectories. (3) DSCMP captures the context of the scene by inferring a latent variable, which enables multimodal predictions with meaningful semantic scene layout. Extensive experiments show that DSCMP outperforms state-of-the-art methods by large margins, such as 9.05\% and 7.62\% relative improvements on the ETH-UCY and SDD datasets respectively.

[274]  arXiv:2008.00779 [pdf, other]
Title: Approximating pathwidth for graphs of small treewidth
Subjects: Data Structures and Algorithms (cs.DS); Discrete Mathematics (cs.DM); Combinatorics (math.CO)

We describe a polynomial-time algorithm which, given a graph $G$ with treewidth $t$, approximates the pathwidth of $G$ to within a ratio of $O(t\sqrt{\log t})$. This is the first algorithm to achieve an $f(t)$-approximation for some function $f$.
Our approach builds on the following key insight: every graph with large pathwidth has large treewidth or contains a subdivision of a large complete binary tree. Specifically, we show that every graph with pathwidth at least $th+2$ has treewidth at least $t$ or contains a subdivision of a complete binary tree of height $h+1$. The bound $th+2$ is best possible up to a multiplicative constant. This result was motivated by, and implies (with $c=2$), the following conjecture of Kawarabayashi and Rossman (SODA'18): there exists a universal constant $c$ such that every graph with pathwidth $\Omega(k^c)$ has treewidth at least $k$ or contains a subdivision of a complete binary tree of height $k$.
Our main technical algorithm takes a graph $G$ and some (not necessarily optimal) tree decomposition of $G$ of width $t'$ in the input, and it computes in polynomial time an integer $h$, a certificate that $G$ has pathwidth at least $h$, and a path decomposition of $G$ of width at most $(t'+1)h+1$. The certificate is closely related to (and implies) the existence of a subdivision of a complete binary tree of height $h$. The approximation algorithm for pathwidth is then obtained by combining this algorithm with the approximation algorithm of Feige, Hajiaghayi, and Lee (STOC'05) for treewidth.
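
As a rough worked step showing how the stated bounds combine into the $O(t\sqrt{\log t})$ ratio: the sketch below writes $\mathrm{pw}(G)$ for the pathwidth (notation introduced here, not in the abstract), assumes the treewidth approximation returns a tree decomposition of width $t' = O(t\sqrt{\log t})$, and assumes $h \geq 1$.

```latex
\[
  \frac{\text{width of the computed path decomposition}}{\mathrm{pw}(G)}
  \;\le\; \frac{(t'+1)h + 1}{h}
  \;=\; t' + 1 + \frac{1}{h}
  \;\le\; t' + 2
  \;=\; O\!\left(t\sqrt{\log t}\right).
\]
```

The first inequality uses the certificate $\mathrm{pw}(G) \geq h$ together with the width bound $(t'+1)h+1$ on the returned path decomposition.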

[275]  arXiv:2008.00783 [pdf, other]
Title: Attribute-aware Diversification for Sequential Recommendations
Comments: AIIS 2020, as part of SIGIR 2020 this https URL
Subjects: Information Retrieval (cs.IR)

Users prefer diverse recommendations over homogeneous ones. However, most previous work on Sequential Recommenders does not consider diversity, and strives for maximum accuracy, resulting in homogeneous recommendations. In this paper, we consider both accuracy and diversity by presenting an Attribute-aware Diversifying Sequential Recommender (ADSR). Specifically, ADSR utilizes available attribute information when modeling a user's sequential behavior to simultaneously learn the user's most likely item to interact with, and their preference of attributes. Then, ADSR diversifies the recommended items based on the predicted preference for certain attributes. Experiments on two benchmark datasets demonstrate that ADSR can effectively provide diverse recommendations while maintaining accuracy.

[276]  arXiv:2008.00784 [pdf]
Title: COVID-19 Misinformation and Disinformation on Social Networks -- The Limits of Veritistic Countermeasures
Authors: Andrew Buzzell
Subjects: Computers and Society (cs.CY); Social and Information Networks (cs.SI)

The COVID-19 pandemic has been the subject of a vast amount of misinformation, particularly in digital information environments, and major social media platforms recently publicized some of the countermeasures they are adopting. This presents an opportunity to examine the nature of the misinformation and disinformation being produced, and the theoretical and technological paradigm used to counter it. I argue that this approach is based on a conception of misinformation as epistemic pollution that can only justify a limited and potentially inadequate response, and that some of the measures undertaken in practice outrun this. In fact, social networks manage ecological and architectural conditions that influence discourse on their platforms in ways that should motivate reconsideration of the justifications that ground epistemic interventions to combat misinformation, and the types of intervention that they warrant. The editorial role of platforms should not be framed solely as the management of epistemic pollution, but instead as managing the epistemic environment in which narratives and social epistemic processes take place. There is an element of inevitable epistemic paternalism involved in this, and exploration of the independent constraints on its justifiability can help determine proper limits of its exercise in practice.

[277]  arXiv:2008.00787 [pdf, other]
Title: Fluid Composition of Intermittent IoT Energy Services
Comments: 9 pages, Accepted and to appear in 2020 IEEE International Conference on Services Computing (SCC). Content may change prior to final publication
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

We propose a novel fluid composition approach of wireless energy services in a crowdsourced IoT environment. The proposed approach selects an optimal set of dynamic energy services according to the consumer's requirements. We leverage the mobility patterns of the crowd in confined areas to capture the intermittent behavior of IoT energy services. We model the IoT energy services based on their mobility patterns to propose a knapsack-based heuristic for the fluid composition. Experimental results demonstrate the efficiency of the proposed approach.

[278]  arXiv:2008.00791 [pdf, other]
Title: Characterizing COVID-19 Misinformation Communities Using a Novel Twitter Dataset
Comments: 9 pages, under review
Subjects: Social and Information Networks (cs.SI); Computation and Language (cs.CL)

From conspiracy theories to fake cures and fake treatments, COVID-19 has become a hot-bed for the spread of misinformation online. It is more important than ever to identify methods to debunk and correct false information online. In this paper, we present a methodology and analyses to characterize the two competing COVID-19 misinformation communities online: (i) misinformed users or users who are actively posting misinformation, and (ii) informed users or users who are actively spreading true information, or calling out misinformation. The goals of this study are two-fold: (i) collecting a diverse annotated COVID-19 Twitter dataset that can be used by the research community to conduct meaningful analysis; and (ii) characterizing the two target communities in terms of their network structure, linguistic patterns, and their membership in other communities. Our analyses show that COVID-19 misinformed communities are denser, and more organized than informed communities, with a high volume of the misinformation being part of disinformation campaigns. Our analyses also suggest that a large majority of misinformed users may be anti-vaxxers. Finally, our sociolinguistic analyses suggest that COVID-19 informed users tend to use more narratives than misinformed users.

[279]  arXiv:2008.00792 [pdf, ps, other]
Title: Nonlinear MPC for Collision Avoidance and Control of UAVs With Dynamic Obstacles
Comments: 8 pages, 10 figures
Journal-ref: IEEE Robotics and Automation Letters, vol. 5, no. 4, pp. 6001-6008, Oct. 2020
Subjects: Robotics (cs.RO)

This article proposes a novel Nonlinear Model Predictive Control (NMPC) scheme for navigation and obstacle avoidance of an Unmanned Aerial Vehicle (UAV). The proposed NMPC formulation allows for a fully parametric obstacle trajectory, while in this article we apply a classification scheme to differentiate between different kinds of trajectories to predict future obstacle positions. The trajectory calculation is done from an initial condition, and fed to the NMPC as an additional input. The solver used is the nonlinear, non-convex solver Proximal Averaged Newton for Optimal Control (PANOC) and its associated software OpEn (Optimization Engine), in which we apply a penalty method to properly consider the obstacles and other constraints during navigation. The proposed NMPC scheme allows for real-time solutions using a sampling time of 50 ms and a two second prediction of both the obstacle trajectory and the NMPC problem, which implies that the scheme can be considered as a local path-planner. This paper presents the NMPC cost function and constraint formulation, as well as the methodology for dealing with the dynamic obstacles. We include multiple laboratory experiments to demonstrate the efficacy of the proposed control architecture, and to show that the proposed method delivers fast and computationally stable solutions to the dynamic obstacle avoidance scenarios.

[280]  arXiv:2008.00793 [pdf, other]
Title: Distributed Dispatching in the Parallel Server Model
Comments: 25 pages, 6 figures
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

With the rapid increase in the size and volume of cloud services and data centers, architectures with multiple job dispatchers are quickly becoming the norm. Load balancing is a key element of such systems. Nevertheless, current solutions to load balancing in such systems admit a paradoxical behavior in which more accurate information regarding server queue lengths degrades performance due to herding and detrimental incast effects. Indeed, both in theory and in practice, there is a common doubt regarding the value of information in the context of multi-dispatcher load balancing. As a result, both researchers and system designers resort to more straightforward solutions, such as the power-of-two-choices to avoid worst-case scenarios, potentially sacrificing overall resource utilization and system performance. A principal focus of our investigation concerns the value of information about queue lengths in the multi-dispatcher setting. We argue that, at its core, load balancing with multiple dispatchers is a distributed computing task. In that light, we propose a new job dispatching approach, called Tidal Water Filling, which addresses the distributed nature of the system. Specifically, by incorporating the existence of other dispatchers into the decision-making process, our protocols outperform previous solutions in many scenarios. In particular, when the dispatchers have complete and accurate information regarding the server queues, our policies significantly outperform all existing solutions.
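
For context, the power-of-two-choices baseline that the abstract contrasts against can be sketched in a few lines. The function name and the in-memory `queue_lengths` list are assumptions made for illustration; this is the simple baseline policy, not the proposed Tidal Water Filling approach.

```python
import random


def dispatch_power_of_two(queue_lengths, rng=random):
    """Send one job to the shorter of two randomly sampled server queues."""
    # Sample two distinct servers uniformly at random.
    first, second = rng.sample(range(len(queue_lengths)), 2)
    # Route the job to whichever sampled server has the shorter queue.
    target = first if queue_lengths[first] <= queue_lengths[second] else second
    queue_lengths[target] += 1
    return target
```

The policy deliberately looks at only two queues per job, which limits herding when several dispatchers act at once but ignores most of the available queue-length information.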

[281]  arXiv:2008.00801 [pdf, other]
Title: Real-Time Point Cloud Fusion of Multi-LiDAR Infrastructure Sensor Setups with Unknown Spatial Location and Orientation
Comments: Accepted to be published as part of the 23rd IEEE International Conference on Intelligent Transportation Systems (ITSC), Rhodes, Greece, September 20-23, 2020
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)

The use of infrastructure sensor technology for traffic detection has already been proven several times. However, extrinsic sensor calibration is still a challenge for the operator. While previous approaches are unable to calibrate the sensors without the use of reference objects in the sensor field of view (FOV), we present an algorithm that is completely detached from external assistance and runs fully automatically. Our method focuses on the high-precision fusion of LiDAR point clouds and is evaluated in simulation as well as on real measurements. We set the LiDARs in a continuous pendulum motion in order to simulate real-world operation as closely as possible and to increase the demands on the algorithm. However, it does not receive any information about the initial spatial location and orientation of the LiDARs throughout the entire measurement period. Experiments in simulation as well as with real measurements have shown that our algorithm performs a continuous point cloud registration of up to four 64-layer LiDARs in real-time. The averaged resulting translational error is within a few centimeters and the averaged error in rotation is below 0.15 degrees.

[282]  arXiv:2008.00805 [pdf, other]
Title: LT@Helsinki at SemEval-2020 Task 12: Multilingual or language-specific BERT?
Comments: Accepted at SemEval-2020 Task 12. Identical to camera-ready version except where adjustments to fit arXiv requirements were necessary
Subjects: Computation and Language (cs.CL)

This paper presents the different models submitted by the LT@Helsinki team for the SemEval 2020 Shared Task 12. Our team participated in sub-tasks A and C, titled offensive language identification and offense target identification, respectively. In both cases we used the so-called Bidirectional Encoder Representation from Transformer (BERT), a model pre-trained by Google and fine-tuned by us on the OLID and SOLID datasets. The results show that offensive tweet classification is one of several language-based tasks where BERT can achieve state-of-the-art results.

[283]  arXiv:2008.00806 [pdf, other]
Title: DAMO: Deep Agile Mask Optimization for Full Chip Scale
Subjects: Hardware Architecture (cs.AR)

Continuous scaling of VLSI systems poses a great challenge for manufacturing, and optical proximity correction (OPC) is widely applied in the conventional design flow for manufacturability optimization. Traditional techniques conducted OPC by leveraging a lithography model and suffered from prohibitive computational overhead, and mostly focused on optimizing a single clip without addressing how to tackle the full chip. In this paper, we present DAMO, a high performance and scalable deep learning-enabled OPC system for full chip scale. It is an end-to-end mask optimization paradigm which contains a Deep Lithography Simulator (DLS) for lithography modeling and a Deep Mask Generator (DMG) for mask pattern generation. Moreover, a novel layout splitting algorithm customized for DAMO is proposed to handle the full chip OPC problem. Extensive experiments show that DAMO outperforms state-of-the-art OPC solutions from both academia and an industrial commercial toolkit.

[284]  arXiv:2008.00807 [pdf, other]
Title: Adding Seemingly Uninformative Labels Helps in Low Data Regimes
Comments: ICML 2020
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Machine Learning (stat.ML)

Evidence suggests that networks trained on large datasets generalize well not solely because of the numerous training examples, but also because of class diversity, which encourages learning of enriched features. This raises the question of whether this remains true when data is scarce: is there an advantage to learning with additional labels in low-data regimes? In this work, we consider a task that requires difficult-to-obtain expert annotations: tumor segmentation in mammography images. We show that, in low-data settings, performance can be improved by complementing the expert annotations with seemingly uninformative labels from non-expert annotators, turning the task into a multi-class problem. We reveal that these gains increase when less expert data is available, and uncover several interesting properties through further studies. We demonstrate our findings on CSAW-S, a new dataset that we introduce here, and confirm them on two public datasets.

[285]  arXiv:2008.00809 [pdf, ps, other]
Title: Adaptive Hierarchical Decomposition of Large Deep Networks
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)

Deep learning has recently demonstrated its ability to rival the human brain for visual object recognition. As datasets get larger, a natural question to ask is whether existing deep learning architectures can be extended to handle the 50+K classes thought to be perceptible by a typical human. Most deep learning architectures concentrate on splitting diverse categories, while ignoring the similarities amongst them. This paper introduces a framework that automatically analyzes and configures a family of smaller deep networks as a replacement for a single, larger network. Class similarities guide the creation of a family of coarse-to-fine classifiers which solve categorical problems more effectively than a single large classifier. The resulting smaller networks are highly scalable, parallel and more practical to train, and achieve higher classification accuracy. This paper also proposes a method to adaptively select the configuration of the hierarchical family of classifiers using linkage statistics from overall and sub-classification confusion matrices. Depending on the number of classes and the complexity of the problem, a deep learning model is selected and the complexity is determined. Numerous experiments on network classes, layers, and architecture configurations validate our results.

[286]  arXiv:2008.00810 [pdf, other]
Title: CASNet: Common Attribute Support Network for image instance and panoptic segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Instance segmentation and panoptic segmentation have received increasing attention in recent years. In comparison with bounding-box-based object detection and semantic segmentation, instance segmentation can provide more analytical results at the pixel level. Given the insight that pixels belonging to one instance share one or more common attributes of that instance, we propose a one-stage instance segmentation network named Common Attribute Support Network (CASNet), which realizes instance segmentation by predicting and clustering common attributes. CASNet is fully convolutional and can be trained and run end to end. CASNet also predicts instances without overlaps and holes, a problem that exists in most current instance segmentation algorithms. Furthermore, it can be easily extended to panoptic segmentation through minor modifications with little computation overhead. CASNet builds a bridge between semantic and instance segmentation, from finding the pixel class ID to obtaining the class and instance ID, by operations on common attributes. In experiments on instance and panoptic segmentation, CASNet achieves mAP 32.8% and PQ 59.0% on the Cityscapes validation dataset with joint training, and mAP 36.3% and PQ 66.1% with separate training. For panoptic segmentation, CASNet achieves state-of-the-art performance on the Cityscapes validation dataset.

[287]  arXiv:2008.00811 [pdf, ps, other]
Title: Truly asymptotic lower bounds for online vector bin packing
Comments: Submitted to SODA 2021
Subjects: Data Structures and Algorithms (cs.DS); Discrete Mathematics (cs.DM); Combinatorics (math.CO); Optimization and Control (math.OC)

In this work, we consider online vector bin packing. It is known that no algorithm can have a competitive ratio of $o(d/\log^2 d)$ in the absolute sense, though upper bounds for this problem were always shown in the asymptotic sense. Since variants of bin packing are traditionally studied with respect to the asymptotic measure and since the two measures are different, we focus on the asymptotic measure and prove new lower bounds on the asymptotic competitive ratio. The existing lower bounds prior to this work were much smaller than $3$ even for very large dimensions.
We significantly improve the best known lower bounds on the asymptotic competitive ratio (and as a byproduct, on the absolute competitive ratio) for online vector packing of vectors with $d \geq 3$ dimensions, for every such dimension $d$. To obtain these results, we use several different constructions, one of which is an adaptive construction showing a lower bound of $\Omega(\sqrt{d})$. Our main result is that the lower bound of $\Omega(d/\log^2 d)$ on the competitive ratio holds also in the asymptotic sense. The last result requires a careful adaptation of constructions for online coloring rather than simple black-box reductions.

[288]  arXiv:2008.00813 [pdf, other]
Title: Kinematics of motion tracking using computer vision
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)

This paper describes the kinematics of the motion tracking of a rigid body using video recording. The novelty of the paper is on the adaptation of the methods and nomenclature used in Computer Vision to those used in Multibody System Dynamics. That way, the equations presented here can be used, for example, for inverse-dynamics multibody simulations driven by the motion tracking of selected bodies. This paper also adapts the well-known Zhang calibration method to the presented nomenclature.

[289]  arXiv:2008.00818 [pdf, other]
Title: Partially Supervised Multi-Task Network for Single-View Dietary Assessment
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Food volume estimation is an essential step in the pipeline of dietary assessment and demands the precise depth estimation of the food surface and table plane. Existing methods based on computer vision require either multi-image input or additional depth maps, reducing convenience of implementation and practical significance. Despite the recent advances in unsupervised depth estimation from a single image, the achieved performance in the case of large texture-less areas needs to be improved. In this paper, we propose a network architecture that jointly performs geometric understanding (i.e., depth prediction and 3D plane estimation) and semantic prediction on a single food image, enabling a robust and accurate food volume estimation regardless of the texture characteristics of the target plane. For the training of the network, only monocular videos with semantic ground truth are required, while the depth map and 3D plane ground truth are no longer needed. Experimental results on two separate food image databases demonstrate that our method performs robustly on texture-less scenarios and is superior to unsupervised networks and structure from motion based approaches, while it achieves comparable performance to fully-supervised methods.

[290]  arXiv:2008.00819 [pdf, other]
Title: Tell me what this is: Few-Shot Incremental Object Learning by a Robot
Comments: Accepted at IEEE IROS 2020
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)

For many applications, robots will need to be incrementally trained to recognize the specific objects needed for an application. This paper presents a practical system for incrementally training a robot to recognize different object categories using only a small set of visual examples provided by a human. The paper uses a recently developed state-of-the-art method for few-shot incremental learning of objects. After learning the object classes incrementally, the robot performs a table cleaning task organizing objects into categories specified by the human. We also demonstrate the system's ability to learn arrangements of objects and predict missing or incorrectly placed objects. Experimental evaluations demonstrate that our approach achieves nearly the same performance as a system trained with all examples at one time (batch training), which constitutes a theoretical upper bound.

[291]  arXiv:2008.00820 [pdf, other]
Title: Generating Visually Aligned Sound from Videos
Comments: Published in IEEE Transactions on Image Processing, 2020. Code, pre-trained models and demo video: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD); Audio and Speech Processing (eess.AS)

We focus on the task of generating sound from natural videos, where the sound should be both temporally and content-wise aligned with visual signals. This task is extremely challenging because some sounds generated outside the camera view cannot be inferred from video content. The model may be forced to learn an incorrect mapping between visual content and these irrelevant sounds. To address this challenge, we propose a framework named REGNET. In this framework, we first extract appearance and motion features from video frames to better distinguish the object that emits sound from complex background information. We then introduce an innovative audio forwarding regularizer that directly considers the real sound as input and outputs bottlenecked sound features. Using both visual and bottlenecked sound features for sound prediction during training provides stronger supervision for the sound prediction. The audio forwarding regularizer can control the irrelevant sound component and thus prevent the model from learning an incorrect mapping between video frames and sound emitted by an object that is off screen. During testing, the audio forwarding regularizer is removed to ensure that REGNET can produce purely aligned sound only from visual features. Extensive evaluations based on Amazon Mechanical Turk demonstrate that our method significantly improves both temporal and content-wise alignment. Remarkably, our generated sound can fool humans with a 68.12% success rate. Code and pre-trained models are publicly available at https://github.com/PeihaoChen/regnet

[292]  arXiv:2008.00821 [pdf, other]
Title: Experimental results on palmvein-based personal recognition by multi-snapshot fusion of textural features
Comments: 22 pages, 5 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

In this paper, we investigate multiple snapshot fusion of textural features for palmvein recognition, including identification and verification. Although the literature has proposed several approaches for palmvein recognition, palmvein performance is still affected by identification and verification errors. As is well known, palmveins are usually described by line-based methods which enhance the vein flow. This is claimed to be unique from person to person. However, palmvein images are also characterized by texture that can be captured by textural features, which rely on recent and efficient hand-crafted algorithms such as Local Binary Patterns, Local Phase Quantization, Local Tera Pattern, Local directional Pattern, and Binarized Statistical Image Features (LBP, LPQ, LTP, LDP and BSIF, respectively), among others. Finally, they can be easily managed at feature-level fusion, when more than one sample can be acquired for recognition. Therefore, multi-snapshot fusion can be adopted for exploiting the complementarity of these features. Our goal is to show that this is confirmed for palmvein recognition, thus making it possible to achieve very high recognition rates on a well-known benchmark data set.

[293]  arXiv:2008.00823 [pdf, other]
Title: Rethinking Image Deraining via Rain Streaks and Vapors
Comments: ECCV 2020
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Image and Video Processing (eess.IV)

Single image deraining regards an input image as a fusion of a background image, a transmission map, rain streaks, and atmosphere light. While advanced models are proposed for image restoration (i.e., background image generation), they regard rain streaks as having the same properties as the background rather than as a transmission medium. As vapors (i.e., rain streaks accumulation or fog-like rain) are conveyed in the transmission map to model the veiling effect, the fusion of rain streaks and vapors does not naturally reflect the rain image formation. In this work, we reformulate rain streaks as a transmission medium together with vapors to model rain imaging. We propose an encoder-decoder CNN named SNet to learn the transmission map of rain streaks. As rain streaks appear with various shapes and directions, we use ShuffleNet units within SNet to capture their anisotropic representations. As vapors are brought by rain streaks, we propose a VNet containing spatial pyramid pooling (SPP) to predict the transmission map of vapors at multiple scales based on that of rain streaks. Meanwhile, we use an encoder CNN named ANet to estimate atmosphere light. The SNet, VNet, and ANet are jointly trained to predict transmission maps and atmosphere light for rain image restoration. Extensive experiments on the benchmark datasets demonstrate the effectiveness of the proposed visual model to predict rain streaks and vapors. The proposed deraining method performs favorably against state-of-the-art deraining approaches.

[294]  arXiv:2008.00824 [pdf, other]
Title: State-of-the-art Techniques in Deep Edge Intelligence
Comments: 13 pages, 5 figures
Subjects: Artificial Intelligence (cs.AI); Distributed, Parallel, and Cluster Computing (cs.DC); Machine Learning (cs.LG); Signal Processing (eess.SP)

The potential held by the gargantuan volumes of data being generated across networks worldwide has been truly unlocked by machine learning techniques and more recently Deep Learning. The advantages offered by the latter have seen it rapidly becoming a framework of choice for various applications. However, the centralization of computational resources and the need for data aggregation have long been limiting factors in the democratization of Deep Learning applications. Edge Computing is an emerging paradigm that aims to utilize the hitherto untapped processing resources available at the network periphery. Edge Intelligence (EI) has quickly emerged as a powerful alternative to enable learning using the concepts of Edge Computing. Deep Learning-based Edge Intelligence or Deep Edge Intelligence (DEI) lies in this rapidly evolving domain. In this article, we provide an overview of the major constraints in operationalizing DEI. The major research avenues in DEI have been consolidated under Federated Learning, Distributed Computation, Compression Schemes and Conditional Computation. We also present some of the prevalent challenges and highlight prospective research avenues.

[295]  arXiv:2008.00825 [pdf, other]
Title: DSC IIT-ISM at SemEval-2020 Task 8: Bi-Fusion Techniques for Deep Meme Emotion Analysis
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Memes have become a ubiquitous social media entity and the processing and analysis of such multimodal data is currently an active area of research. This paper presents our work on the Memotion Analysis shared task of SemEval 2020, which involves the sentiment and humor analysis of memes. We propose a system which uses different bimodal fusion techniques to leverage the inter-modal dependency for sentiment and humor classification tasks. Out of all our experiments, the best system improved the baseline with macro F1 scores of 0.357 on Sentiment Classification (Task A), 0.510 on Humor Classification (Task B) and 0.312 on Scales of Semantic Classes (Task C).

[296]  arXiv:2008.00827 [pdf, other]
Title: Defining Traffic States using Spatio-temporal Traffic Graphs
Comments: Accepted in 23rd IEEE International Conference on Intelligent Transportation Systems September 20 to 23, 2020. 6 pages, 6 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Signal Processing (eess.SP)

Intersections are one of the main sources of congestion and hence, it is important to understand traffic behavior at intersections. Particularly, in developing countries with high vehicle density, mixed traffic type, and lane-less driving behavior, it is difficult to distinguish between congested and normal traffic behavior. In this work, we propose a way to understand the traffic state of smaller spatial regions at intersections using traffic graphs. The way these traffic graphs evolve over time reveals different traffic states: a) a congestion is forming (clumping), b) the congestion is dispersing (unclumping), or c) the traffic is flowing normally (neutral). We train a spatio-temporal deep network to identify these changes. Also, we introduce a large dataset called EyeonTraffic (EoT) containing 3 hours of aerial videos collected at 3 busy intersections in Ahmedabad, India. Our experiments on the EoT dataset show that the traffic graphs can help in correctly identifying congestion-prone behavior in different spatial regions of an intersection.

[297]  arXiv:2008.00829 [pdf]
Title: Deep Network Ensemble Learning applied to Image Classification using CNN Trees
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

Traditional machine learning approaches may fail to perform satisfactorily when dealing with complex data. In this context, the importance of data mining evolves with respect to building an efficient knowledge discovery and mining framework. Ensemble learning is aimed at integrating fusion, modeling and mining of data into a unified model. However, traditional ensemble learning methods are complex and have optimization or tuning problems. In this paper, we propose a simple, sequential, efficient, ensemble learning approach using multiple deep networks. The deep network used in the ensembles is ResNet50. The model draws inspiration from binary decision/classification trees. The proposed approach is compared against the baseline, namely the single classifier approach, i.e. using a single multiclass ResNet50, on the ImageNet and Natural Images datasets. Our approach outperforms the baseline on all experiments on the ImageNet dataset. Code is available at https://github.com/mueedhafiz1982/CNNTreeEnsemble.git

[298]  arXiv:2008.00832 [pdf, other]
Title: Hardware locality-aware partitioning and dynamic load-balancing of unstructured meshes for large-scale scientific applications
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

We present an open-source topology-aware hierarchical unstructured mesh partitioning and load-balancing tool TreePart. The framework provides powerful abstractions to automatically detect and build hierarchical MPI topology resembling the hardware at runtime. Using this information it intelligently chooses between shared and distributed parallel algorithms for partitioning and load-balancing. It provides a range of partitioning methods by interfacing with existing shared and distributed memory parallel partitioning libraries. It provides powerful and scalable abstractions like one-sided distributed dictionaries and MPI3 shared memory based halo communicators for optimising HPC codes. The tool was successfully integrated into our in-house code and we present results from a large-eddy simulation of a combustion problem.

[299]  arXiv:2008.00836 [pdf, other]
Title: LSOTB-TIR: A Large-Scale High-Diversity Thermal Infrared Object Tracking Benchmark
Comments: accepted by ACM Multimedia Conference, 2020
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)

In this paper, we present a Large-Scale and high-diversity general Thermal InfraRed (TIR) Object Tracking Benchmark, called LSOTB-TIR, which consists of an evaluation dataset and a training dataset with a total of 1,400 TIR sequences and more than 600K frames. We annotate the bounding box of objects in every frame of all sequences and generate over 730K bounding boxes in total. To the best of our knowledge, LSOTB-TIR is the largest and most diverse TIR object tracking benchmark to date. To evaluate a tracker on different attributes, we define 4 scenario attributes and 12 challenge attributes in the evaluation dataset. By releasing LSOTB-TIR, we encourage the community to develop deep learning based TIR trackers and evaluate them fairly and comprehensively. We evaluate and analyze more than 30 trackers on LSOTB-TIR to provide a series of baselines, and the results show that deep trackers achieve promising performance. Furthermore, we re-train several representative deep trackers on LSOTB-TIR, and their results demonstrate that the proposed training dataset significantly improves the performance of deep TIR trackers. Code and dataset are available at https://github.com/QiaoLiuHit/LSOTB-TIR.

[300]  arXiv:2008.00840 [pdf, other]
Title: GPP, the Generic Preprocessor
Comments: 4 pages, 0 figures
Journal-ref: Journal of Open Source Software, 5(51), 2020
Subjects: Programming Languages (cs.PL)

In computer science, a preprocessor (or macro processor) is a tool that programmatically alters its input, typically on the basis of inline annotations, to produce data that serves as input for another program. Preprocessors are used in software development and document processing workflows to translate or extend programming or markup languages, as well as for conditional or pattern-based generation of source code and text. Early preprocessors were relatively simple string replacement tools that were tied to specific programming languages and application domains, and while these have since given rise to more powerful, general-purpose tools, the latter often require the user to learn and use complex macro languages with their own syntactic conventions. In this paper, we present GPP, an extensible, general-purpose preprocessor whose principal advantage is that its syntax and behaviour can be customized to suit any given preprocessing task. This makes GPP of particular benefit to research applications, where it can be easily adapted for use with novel markup, programming, and control languages.
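
To make the "simple string replacement" style of early preprocessors concrete, here is a minimal Python sketch. The `#define NAME value` directive and the `preprocess` function are hypothetical choices for illustration only; they are not taken from the paper and make no claim about GPP's actual syntax or implementation.

```python
import re


def preprocess(text):
    """Expand single-pass, whole-word macros defined by '#define NAME value'.

    The directive syntax is an assumption made for this sketch.
    """
    macros = {}
    output = []
    for line in text.splitlines():
        definition = re.match(r"#define\s+(\w+)\s+(.*)", line)
        if definition:
            # Record the macro and drop the directive line from the output.
            macros[definition.group(1)] = definition.group(2)
            continue
        if macros:
            # Match only whole identifiers, so 'N' does not rewrite 'N1'.
            names = "|".join(re.escape(name) for name in macros)
            pattern = re.compile(r"\b(" + names + r")\b")
            line = pattern.sub(lambda m: macros[m.group(1)], line)
        output.append(line)
    return "\n".join(output)


result = preprocess("#define N 10\nx = N + N1\n")
```

Replacement text is not rescanned for further macros here, which keeps the sketch short but is one of the limitations that more general tools address.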

[301]  arXiv:2008.00842 [pdf, other]
Title: A Survey on the Evolution of Stream Processing Systems
Comments: 34 pages, 15 figures, 5 tables
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Computation and Language (cs.CL); Databases (cs.DB); Performance (cs.PF)

Stream processing has been an active research field for more than 20 years, but it is now witnessing its prime time due to recent successful efforts by the research community and numerous worldwide open-source communities. This survey provides a comprehensive overview of fundamental aspects of stream processing systems and their evolution in the functional areas of out-of-order data management, state management, fault tolerance, high availability, load management, elasticity, and reconfiguration. We review noteworthy past research findings, outline the similarities and differences between early ('00-'10) and modern ('11-'18) streaming systems, and discuss recent trends and open problems.

[302]  arXiv:2008.00843 [pdf, other]
Title: Profiles of dynamical systems and their algebra
Comments: 12 pages, 2 figures
Subjects: Discrete Mathematics (cs.DM); Commutative Algebra (math.AC)

The commutative semiring $\mathbf{D}$ of finite, discrete-time dynamical systems was introduced in order to study their (de)composition from an algebraic point of view. However, many decision problems related to solving polynomial equations over $\mathbf{D}$ are intractable (or conjectured to be so), and sometimes even undecidable. In order to take a more abstract look at those problems, we introduce the notion of ``topographic'' profile of a dynamical system $(A,f)$ with state transition function $f \colon A \to A$ as the sequence $\mathop{\mathrm{prof}} A = (|A|_i)_{i \in \mathbb{N}}$, where $|A|_i$ is the number of states having distance $i$, in terms of number of applications of $f$, from a limit cycle of $(A,f)$. We prove that the set of profiles is also a commutative semiring $(\mathbf{P},+,\times)$ with respect to operations compatible with those of $\mathbf{D}$ (namely, disjoint union and tensor product), and investigate its algebraic properties, such as its irreducible elements and factorisations, as well as the computability and complexity of solving polynomial equations over $\mathbf{P}$.
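
As a small illustration of the profile definition above, the following Python sketch counts, for a finite system $(A,f)$, how many states lie at each distance from a limit cycle. The function name `profile` and the example system are assumptions made here; the sketch returns only the finite prefix of the sequence up to the last nonzero entry, and it is a direct quadratic-time computation rather than anything taken from the paper.

```python
def profile(states, f):
    """Return [|A|_0, |A|_1, ...] up to the last nonzero entry.

    |A|_i is the number of states whose distance to a limit cycle,
    measured in applications of f, is exactly i.
    """
    states = list(states)
    if not states:
        return []
    n = len(states)

    # After n applications of f, any state has left its transient part,
    # so the state reached is on a limit cycle; walk that cycle once.
    cyclic = set()
    for s in states:
        x = s
        for _ in range(n):
            x = f(x)
        while x not in cyclic:
            cyclic.add(x)
            x = f(x)

    # Distance of each state = number of steps until a cyclic state is hit.
    counts = {}
    for s in states:
        distance, x = 0, s
        while x not in cyclic:
            x = f(x)
            distance += 1
        counts[distance] = counts.get(distance, 0) + 1
    return [counts.get(i, 0) for i in range(max(counts) + 1)]


# Hypothetical example: cycle 1 <-> 2, with 0 feeding into it and 3, 4 feeding 0.
example = {0: 1, 1: 2, 2: 1, 3: 0, 4: 0}
example_profile = profile(range(5), example.__getitem__)
```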

[303]  arXiv:2008.00851 [pdf, other]
Title: Planning to Score a Goal in Robotic Football with Heuristic Search
Comments: 12 pages, 5 figures, camera-ready version of the paper as to appear in ICR 2020 proceedings
Subjects: Robotics (cs.RO)

This paper considers the problem of planning an attack in robotic football (RoboCup). The problem is reduced to finding a trajectory of the ball from its current position to the opponent's goal. A heuristic search algorithm, namely A*, is used to find such a trajectory. For this algorithm to be applicable we introduce a discretized model of the environment, i.e. a graph, as well as the core search components: a cost function and a heuristic function. Both are designed to take into account all the available information of the game state. We extensively evaluate the suggested approach in simulation, comparing it to a range of baselines. The result of the conducted evaluation clearly shows the benefit of utilizing heuristic search within the RoboCup context.
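
The search components named above (a graph, a cost function and a heuristic function) can be sketched with a generic A* in Python. Everything specific below is an assumption for illustration: the 5x5 grid, the blocked cells, the unit step cost and the Manhattan heuristic are placeholders, not the paper's game-state-aware cost and heuristic.

```python
import heapq


def a_star(start, goal, neighbors, cost, heuristic):
    """Generic A*; returns a list of nodes from start to goal, or None.

    Assumes the heuristic is consistent, so a node is final once expanded.
    """
    open_heap = [(heuristic(start, goal), 0.0, start)]
    best_g = {start: 0.0}
    parent = {start: None}
    closed = set()
    while open_heap:
        _, g_cur, node = heapq.heappop(open_heap)
        if node in closed:
            continue  # stale queue entry
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        closed.add(node)
        for nxt in neighbors(node):
            g_new = g_cur + cost(node, nxt)
            if g_new < best_g.get(nxt, float("inf")):
                best_g[nxt] = g_new
                parent[nxt] = node
                heapq.heappush(open_heap, (g_new + heuristic(nxt, goal), g_new, nxt))
    return None


# Hypothetical discretized field: a 5x5 grid with a wall of blocked cells.
BLOCKED = {(1, 0), (1, 1), (1, 2), (1, 3)}


def grid_neighbors(cell):
    x, y = cell
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nxt = (x + dx, y + dy)
        if 0 <= nxt[0] < 5 and 0 <= nxt[1] < 5 and nxt not in BLOCKED:
            yield nxt


def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


path = a_star((0, 0), (2, 0), grid_neighbors, lambda a, b: 1.0, manhattan)
```

Swapping in a different `cost` or `heuristic` changes which trajectories are preferred without touching the search loop, which is the separation the abstract describes.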

[304]  arXiv:2008.00853 [pdf, other]
Title: OFAI-UKP at HAHA@IberLEF2019: Predicting the Humorousness of Tweets Using Gaussian Process Preference Learning
Comments: 11 pages, 1 figure
Journal-ref: Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2019), volume 2421 of CEUR Workshop Proceedings, pages 180-190, 2019
Subjects: Computation and Language (cs.CL)

Most humour processing systems to date make at best discrete, coarse-grained distinctions between the comical and the conventional, yet such notions are better conceptualized as a broad spectrum. In this paper, we present a probabilistic approach, a variant of Gaussian process preference learning (GPPL), that learns to rank and rate the humorousness of short texts by exploiting human preference judgments and automatically sourced linguistic annotations. We apply our system, which had previously shown good performance on English-language one-liners annotated with pairwise humorousness annotations, to the Spanish-language data set of the HAHA@IberLEF2019 evaluation campaign. We report system performance for the campaign's two subtasks, humour detection and funniness score prediction, and discuss some issues arising from the conversion between the numeric scores used in the HAHA@IberLEF2019 data and the pairwise judgment annotations required for our method.

[305]  arXiv:2008.00854 [pdf, other]
Title: Data-driven modeling of public risk perception and emotion on Twitter during the Covid-19 pandemic
Authors: Blas Kolic, Joel Dyer
Comments: 29 pages (main text: 19 pages; supplementary material: 10 pages), 9 Figures
Subjects: Social and Information Networks (cs.SI); Computers and Society (cs.CY)

Successful navigation of the Covid-19 pandemic is predicated on public cooperation with safety measures and appropriate perception of risk, in which emotion and attention play important roles. Signatures of public emotion and attention are present in social media data, thus natural language analysis of this text enables near-to-real-time monitoring of indicators of public risk perception. We compare key epidemiological indicators of the progression of the pandemic with indicators of the public perception of the pandemic constructed from approx. 20 million unique Covid-19-related tweets from 12 countries posted between 10th March and 14th June 2020. We find evidence of psychophysical numbing: Twitter users increasingly fixate on mortality, but in a decreasingly emotional and increasingly analytic tone. We find that the national attention on Covid-19 mortality is modelled accurately as a logarithmic or power law function of national daily Covid-19 death rates, implying generalisations of the Weber-Fechner and power law models of sensory perception to the collective. Our parameter estimates for these models are consistent with estimates from psychological experiments, and indicate that users in this dataset exhibit differential sensitivity by country to the national Covid-19 death rates. Our work illustrates the potential utility of social media for monitoring public risk perception and guiding public communication during crisis scenarios.
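
The two functional forms mentioned above can be written as attention ≈ a + b·log(deaths) (logarithmic, Weber-Fechner style) and attention ≈ k·deaths^β (power law). The Python sketch below fits both by ordinary least squares on log-transformed variables. This fitting procedure, the function names and the synthetic inputs are assumptions for illustration; the abstract does not state how the authors estimate their parameters.

```python
import math


def linear_fit(xs, ys):
    """Ordinary least squares for y ≈ slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_logarithmic(deaths, attention):
    """Fit attention ≈ a + b * log(deaths); returns (a, b)."""
    b, a = linear_fit([math.log(d) for d in deaths], attention)
    return a, b


def fit_power_law(deaths, attention):
    """Fit attention ≈ k * deaths**beta via a log-log regression; returns (k, beta)."""
    beta, log_k = linear_fit(
        [math.log(d) for d in deaths], [math.log(a) for a in attention]
    )
    return math.exp(log_k), beta


# Synthetic inputs generated from known parameters, purely to exercise the code.
deaths = [1.0, 10.0, 100.0, 1000.0]
k_hat, beta_hat = fit_power_law(deaths, [2.0 * d ** 0.5 for d in deaths])
a_hat, b_hat = fit_logarithmic(deaths, [1.0 + 3.0 * math.log(d) for d in deaths])
```

Both fits require strictly positive death counts (and, for the power law, positive attention values), so days with zero deaths would need separate handling.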

[306]  arXiv:2008.00859 [pdf, other]
Title: Adversarial Graph Representation Adaptation for Cross-Domain Facial Expression Recognition
Comments: Accepted at ACM MM 2020
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Data inconsistency and bias are inevitable among different facial expression recognition (FER) datasets due to subjective annotating process and different collecting conditions. Recent works resort to adversarial mechanisms that learn domain-invariant features to mitigate domain shift. However, most of these works focus on holistic feature adaptation, and they ignore local features that are more transferable across different datasets. Moreover, local features carry more detailed and discriminative content for expression recognition, and thus integrating local features may enable fine-grained adaptation. In this work, we propose a novel Adversarial Graph Representation Adaptation (AGRA) framework that unifies graph representation propagation with adversarial learning for cross-domain holistic-local feature co-adaptation. To achieve this, we first build a graph to correlate holistic and local regions within each domain and another graph to correlate these regions across different domains. Then, we learn the per-class statistical distribution of each domain and extract holistic-local features from the input image to initialize the corresponding graph nodes. Finally, we introduce two stacked graph convolution networks to propagate holistic-local feature within each domain to explore their interaction and across different domains for holistic-local feature co-adaptation. In this way, the AGRA framework can adaptively learn fine-grained domain-invariant features and thus facilitate cross-domain expression recognition. We conduct extensive and fair experiments on several popular benchmarks and show that the proposed AGRA framework achieves superior performance over previous state-of-the-art methods.

[307]  arXiv:2008.00861 [pdf]
Title: Processing of Crowdsourced Observations of Aircraft in a High Performance Computing Environment
Comments: 6 pages, 4 figures, 4 tables
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Computational Engineering, Finance, and Science (cs.CE)

As unmanned aircraft systems (UASs) continue to integrate into the U.S. National Airspace System (NAS), there is a need to quantify the risk of airborne collisions between unmanned and manned aircraft to support regulation and standards development. Both regulators and standards developing organizations have made extensive use of Monte Carlo collision risk analysis simulations using probabilistic models of aircraft flight. We've previously determined that the observations of manned aircraft by the OpenSky Network, a community network of ground-based sensors, are appropriate to develop models of the low altitude environment. This work overviews the high performance computing workflow designed and deployed on the Lincoln Laboratory Supercomputing Center to process 3.9 billion observations of aircraft. We then trained the aircraft models using more than 250,000 flight hours at 5,000 feet above ground level or below. A key feature of the workflow is that all the aircraft observations and supporting datasets are available as open source technologies or have been released to the public domain.

[308]  arXiv:2008.00877 [pdf]
Title: Towards a Semantic Model of the GDPR Register of Processing Activities
Subjects: Computers and Society (cs.CY); Cryptography and Security (cs.CR)

A core requirement for GDPR compliance is the maintenance of a register of processing activities (ROPA). Our analysis of six ROPA templates from EU data protection regulators shows the scope and granularity of a ROPA is subject to widely varying guidance in different jurisdictions. We present a consolidated data model based on common concepts and relationships across analysed templates. We then analyse the extent of using the Data Privacy Vocabulary - a vocabulary specification for GDPR. We show that the DPV currently does not provide sufficient concepts to represent the ROPA data model and propose an extension to fill this gap. This will enable creation of a pan-EU information management framework for interoperability between organisations and regulators for GDPR compliance.

[309]  arXiv:2008.00878 [pdf, other]
Title: Fusion of Deep and Non-Deep Methods for Fast Super-Resolution of Satellite Images
Comments: Accepted in IEEE BigMM 2020
Subjects: Computer Vision and Pattern Recognition (cs.CV)

In the emerging commercial space industry there is a drastic increase in access to low cost satellite imagery. The price for satellite images depends on the sensor quality and revisit rate. This work proposes to bridge the gap between image quality and the price by improving the image quality via super-resolution (SR). Recently, a number of deep SR techniques have been proposed to enhance satellite images. However, none of these methods utilize the region-level context information, giving equal importance to each region in the image. Because of this, and because most state-of-the-art SR methods are complex and cumbersome deep models, the time taken to process very large satellite images can be impractically high. We propose to handle this challenge by designing an SR framework that analyzes the regional information content of each patch of the low-resolution image and judiciously chooses to use more computationally complex deep models to super-resolve more structure-rich regions of the image, while using less resource-intensive non-deep methods on non-salient regions. Through extensive experiments on a large satellite image, we show a substantial decrease in inference time while achieving similar performance to that of existing deep SR methods over several evaluation measures like PSNR, MSE and SSIM.

[310]  arXiv:2008.00879 [pdf, ps, other]
Title: Failure Probability Analysis for Partial Extraction from Invertible Bloom Filters
Subjects: Information Theory (cs.IT); Distributed, Parallel, and Cluster Computing (cs.DC)

An Invertible Bloom Filter (IBF) is a data structure that employs a small set of hash functions. An IBF allows for an efficient insertion and, with high probability, for an efficient extraction of the data. However, the success probability of the extraction depends on the storage overhead of an IBF and the amount of the data stored. In an application, such as set reconciliation, where there is a need to extract data stored in the IBF, the extraction might succeed only partially, by recovering only part of the stored data. In this work, the probability of success for a partial extraction of data from an IBF is analyzed. It is shown that partial extraction could be useful in applications such as set reconciliation. In particular, it allows for set reconciliation by using the IBF, where the storage overhead is too small to allow full extraction. An upper bound on the number of rounds in an iterative set reconciliation protocol is presented. The numerical results are derived analytically and confirmed by computer simulations.
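
To make partial extraction concrete, the following is a minimal Python sketch of an insertion-only invertible Bloom filter with a peeling decoder. It is not taken from the paper: the class name `IBF`, the cell layout (count, XOR of keys, XOR of checksums), the SHA-256 based hashing, and the partitioning of the table into `k` sub-tables are all assumptions chosen for illustration.

```python
import hashlib


class IBF:
    """Illustrative insertion-only invertible Bloom filter (hypothetical layout)."""

    def __init__(self, m, k=3):
        self.m, self.k = m, k
        self.count = [0] * m     # number of keys hashed into each cell
        self.key_sum = [0] * m   # XOR of all keys in the cell
        self.chk_sum = [0] * m   # XOR of all key checksums in the cell

    def _h(self, key, salt):
        digest = hashlib.sha256(f"{salt}:{key}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    def _cells(self, key):
        # One cell per sub-table, so the k cells of a key are always distinct.
        size = self.m // self.k
        return [i * size + self._h(key, i) % size for i in range(self.k)]

    def _update(self, key, delta):
        for c in self._cells(key):
            self.count[c] += delta
            self.key_sum[c] ^= key
            self.chk_sum[c] ^= self._h(key, "chk")

    def insert(self, key):
        self._update(key, +1)

    def extract(self):
        """Peel pure cells until none remain; returns (recovered_keys, complete).

        The filter is consumed in the process. `complete` is False when the
        decoder stalls, i.e. only part of the stored data was recovered.
        """
        recovered = set()
        progress = True
        while progress:
            progress = False
            for c in range(self.m):
                pure = (self.count[c] == 1
                        and self.chk_sum[c] == self._h(self.key_sum[c], "chk"))
                if pure:
                    key = self.key_sum[c]
                    recovered.add(key)
                    self._update(key, -1)  # remove the key from all its cells
                    progress = True
        return recovered, all(v == 0 for v in self.count)


keys = set(range(1000, 1050))
small = IBF(m=30)            # deliberately too small for 50 keys
for key in keys:
    small.insert(key)
recovered, complete = small.extract()
print(len(recovered), "of", len(keys), "keys recovered; complete =", complete)
```

When the table is too small, the decoder may stall with some keys still unrecovered; the keys it did return are nevertheless valid members of the stored set, which is the situation the partial-extraction analysis addresses.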

[311]  arXiv:2008.00881 [pdf, other]
Title: Demystifying the Role of zk-SNARKs in Zcash
Subjects: Cryptography and Security (cs.CR)

Zero-knowledge proofs have always provided a clear solution when it comes to conveying information from a prover to a verifier or vice versa without revealing essential information about the process. Advancements in zero-knowledge have helped develop proofs which are succinct and provide non-interactive arguments of knowledge while maintaining the zero-knowledge criteria. zk-SNARKs (Zero-Knowledge Succinct Non-Interactive Arguments of Knowledge) are one such method that stands out among the advances in zero-knowledge proofs. The underlying principle of the Zcash algorithm is that it delivers a full-fledged ledger-based digital currency with strong privacy guarantees, and the root of ensuring privacy lies fully in the construction of a proper zk-SNARK. In this paper we elaborate and construct a concrete zk-SNARK proof from scratch and explain its role in the Zcash algorithm.

[312]  arXiv:2008.00892 [pdf, other]
Title: Shape Adaptor: A Learnable Resizing Module
Comments: Published at ECCV 2020
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)

We present a novel resizing module for neural networks: shape adaptor, a drop-in enhancement built on top of traditional resizing layers, such as pooling, bilinear sampling, and strided convolution. Whilst traditional resizing layers have fixed and deterministic reshaping factors, our module allows for a learnable reshaping factor. Our implementation enables shape adaptors to be trained end-to-end without any additional supervision, through which network architectures can be optimised for each individual task, in a fully automated way. We performed experiments across seven image classification datasets, and results show that by simply using a set of our shape adaptors instead of the original resizing layers, performance increases consistently over human-designed networks, across all datasets. Additionally, we show the effectiveness of shape adaptors on two other applications: network compression and transfer learning. The source code is available at: github.com/lorenmt/shape-adaptor.

[313]  arXiv:2008.00896 [pdf, other]
Title: A Dichotomy for the Generalized Model Counting Problem for Unions of Conjunctive Queries
Subjects: Databases (cs.DB); Computational Complexity (cs.CC)

We study the $generalized~model~counting~problem$, defined as follows: given a database, and a set of deterministic tuples, count the number of subsets of the database that include all deterministic tuples and satisfy the query. This problem is computationally equivalent to the evaluation of the query over a tuple-independent probabilistic database where all tuples have probabilities in $\{0,\frac{1}{2},1\}$. Previous work has established a dichotomy for Unions of Conjunctive Queries (UCQ) when the probabilities are arbitrary rational numbers, showing that, for each query, its complexity is either in polynomial time or #P-hard. The query is called $safe$ in the first case, and $unsafe$ in the second case. Here, we strengthen the hardness proof, by proving that an unsafe UCQ query remains #P-hard even if the probabilities are restricted to $\{0,\frac{1}{2},1\}$. This requires a complete redesign of the hardness proof, using new techniques. A related problem is the $model~counting~problem$, which asks for the probability of the query when the input probabilities are restricted to $\{0,\frac{1}{2}\}$. While our result does not extend to model counting for all unsafe UCQs, we prove that model counting is #P-hard for a class of unsafe queries called Type-I forbidden queries.
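
The problem definition can be illustrated with a brute-force counter in Python. This is only a sketch of the definition, not of any algorithm from the paper: the function name `generalized_model_count`, the tuple encoding, and the example query $Q = \exists x, y\; R(x) \wedge S(x,y)$ are assumptions, and the enumeration is exponential in the number of non-deterministic tuples.

```python
from itertools import combinations


def generalized_model_count(database, deterministic, query):
    """Count subsets of `database` that contain every deterministic tuple
    and satisfy `query`. Returns (count, number_of_free_tuples)."""
    deterministic = frozenset(deterministic)
    free = [t for t in database if t not in deterministic]
    count = 0
    for r in range(len(free) + 1):
        for extra in combinations(free, r):
            world = deterministic | frozenset(extra)
            if query(world):
                count += 1
    return count, len(free)


def q(world):
    # Q = exists x, y: R(x) and S(x, y); tuples are ("R", x) or ("S", x, y)
    return any(t[0] == "S" and ("R", t[1]) in world for t in world)


database = [("R", "a"), ("R", "b"), ("S", "a", 1), ("S", "b", 2)]
deterministic = [("R", "a")]

count, n_free = generalized_model_count(database, deterministic, q)
# Query probability when deterministic tuples have probability 1
# and every other tuple has probability 1/2.
probability = count / 2 ** n_free
print(count, probability)
```

Dividing the count by $2^{n}$, where $n$ is the number of non-deterministic tuples, gives the query probability over the corresponding tuple-independent database, which is the equivalence stated in the abstract.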

[314]  arXiv:2008.00899 [pdf, ps, other]
Title: Tikhonov regularization for polynomial approximation problems in Gauss quadrature points
Comments: 15 pages, 5 figures
Subjects: Numerical Analysis (math.NA)

This paper is concerned with the introduction of Tikhonov regularization into a least squares approximation scheme on $[-1,1]$ by orthonormal polynomials, in order to handle noisy data. This scheme includes interpolation and hyperinterpolation as special cases. With Gauss quadrature points employed as nodes, coefficients of the approximation polynomial with respect to a given basis are derived in an entry-wise closed form. Under interpolatory conditions, the solution to the regularized approximation problem is rewritten in the form of two kinds of barycentric interpolation formulae, by introducing only a multiplicative correction factor into both classical barycentric formulae. An $L_2$ error bound and a uniform error bound are derived, both showing that Tikhonov regularization is able to reduce the operator norm (Lebesgue constants) and the error term related to the level of noise, in each case by multiplying by a correction factor which is less than one. Numerical examples show the benefits of Tikhonov regularization when data is noisy or data size is relatively small.
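
As a worked illustration of how such a closed form and correction factor can arise, consider the simplest case, assuming a plain penalty $\lambda \|\beta\|_2^2$ on the coefficients (the paper's penalty may be more general). The symbols $x_j$, $w_j$, $p_k$, $f_j$, $\beta_k$, $N$, $L$ and $\lambda$ are notation introduced here, not taken from the abstract.

```latex
% Nodes x_0,...,x_N and weights w_0,...,w_N of an (N+1)-point Gauss rule,
% orthonormal polynomials p_0,...,p_L with L <= N, noisy samples f_j.
\[
  \min_{\beta \in \mathbb{R}^{L+1}}
  \sum_{j=0}^{N} w_j \Bigl( \sum_{k=0}^{L} \beta_k p_k(x_j) - f_j \Bigr)^{2}
  + \lambda \sum_{k=0}^{L} \beta_k^{2},
  \qquad \lambda > 0 .
\]
% With A_{jk} = p_k(x_j) and W = diag(w_0,...,w_N), the normal equations are
\[
  \bigl( A^{\top} W A + \lambda I \bigr) \beta = A^{\top} W f .
\]
% The Gauss rule is exact for polynomials of degree <= 2N+1 and
% deg(p_k p_l) <= 2L <= 2N, so discrete orthonormality holds:
\[
  \bigl( A^{\top} W A \bigr)_{kl}
  = \sum_{j=0}^{N} w_j \, p_k(x_j) \, p_l(x_j)
  = \int_{-1}^{1} p_k(x) \, p_l(x) \, dx
  = \delta_{kl} .
\]
% Hence the coefficients decouple entry-wise:
\[
  \beta_k = \frac{1}{1+\lambda} \sum_{j=0}^{N} w_j \, p_k(x_j) \, f_j ,
  \qquad k = 0, \dots, L .
\]
```

Under this assumption each unregularized coefficient is simply scaled by $1/(1+\lambda) < 1$, which is one way to see a correction factor smaller than one damping both the approximation operator and the propagated noise.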

[315]  arXiv:2008.00900 [pdf, other]
Title: A new technique to solve linear integro-differential equations (IDEs) with modified Bernoulli polynomials
Comments: 12 pages, 09 heads, 06 figures, 46 equations
Subjects: Numerical Analysis (math.NA)

In this work, a new technique is presented to find approximate solutions of linear integro-differential equations. The method is based on modified orthonormal Bernoulli polynomials and an operational matrix thereof. The method converts a given integro-differential equation into a set of algebraic equations with unknown coefficients, which are easily obtained with the help of the known functions appearing in the equation, modified Bernoulli polynomials and the operational matrix. The approximate solution is obtained in the form of a polynomial of the required degree. The method is also applied to three well known integro-differential equations to demonstrate its accuracy and efficacy. Numerical results of the approximate solutions are plotted for comparison with the available exact solutions. A considerably small approximation error is observed through numerical comparison, which is further reducible to a required level of significance. The method is comparatively simpler and shorter than many existing methods.

[316]  arXiv:2008.00902 [pdf, other]
Title: Efficient Orchestration of Host and Remote Shared Memory for Memory Intensive Workloads
Comments: 13 pages, 23 figures, 8 tables, MemSys '20: The International Symposium on Memory Systems, Sept 2020, Washington, DC
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)

Since there have been very few contributions to the development of a unified memory orchestration framework for efficient management of both host and remote idle memory, we present Valet, an efficient approach to orchestration of host and remote shared memory for improving performance of memory intensive workloads. The paper makes three original contributions. First, we redesign the data flow in the critical path by introducing a host-coordinated memory pool that works as a local cache to reduce the latency in the critical path of the host and remote memory orchestration. Second, Valet utilizes unused local memory across containers by managing local memory via the Valet host-coordinated memory pool, which allows containers to dynamically expand and shrink their memory allocations according to the workload demands. Third, Valet provides an efficient remote memory reclaiming technique on remote peers, based on two optimizations: (1) an activity-based victim selection scheme to allow the least-active-chunk of data to be selected for serving the eviction requests and (2) a migration protocol to move the least-active-chunk of data to a less-memory-pressured remote node. As a result, Valet can effectively reduce the performance impact and migration overhead on local nodes. Our extensive experiments on both NoSQL systems and Machine Learning (ML) workloads show that Valet outperforms existing representative remote paging systems, with up to 226X throughput improvement and up to 98% latency decrease over the conventional OS swap facility for big data and ML workloads, and up to 5.5X throughput improvement and up to 78.4% latency decrease over the state-of-the-art remote paging systems.

[317]  arXiv:2008.00905 [pdf, other]
Title: Learning Based Methods for Traffic Matrix Estimation from Link Measurements
Comments: 10 pages
Subjects: Networking and Internet Architecture (cs.NI); Artificial Intelligence (cs.AI); Signal Processing (eess.SP)

The network traffic demand matrix is a critical input for capacity planning, anomaly detection and many other network management related tasks. The demand matrix is often computed from link load measurements. The traffic matrix (TM) estimation problem is the determination of the traffic demand matrix from link load measurements. The relationship between the link loads and the traffic matrix that generated the link loads can be modeled as an under-determined linear system and has multiple feasible solutions. Therefore, prior knowledge of the traffic demand pattern has to be used in order to find a potentially feasible demand matrix. In this paper, we consider the TM estimation problem where we have information about the distribution of the demand sizes. This information can be obtained from the analysis of a few traffic matrices measured in the past or from operator experience. We develop an iterative projection based algorithm for the solution of this problem. If a large number of past traffic matrices are accessible, we propose a Generative Adversarial Network (GAN) based approach for solving the problem. We compare the strengths of the two approaches and evaluate their performance for several networks using varying amounts of past data.
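
The under-determined structure can be seen in a toy example. The sketch below is not the paper's algorithm, which projects onto a demand-size distribution; it swaps in a simpler alternating-projection scheme (Kaczmarz row projections onto each link constraint, followed by a non-negativity clamp) to show how a prior guess selects one of many feasible demand vectors. The function name `estimate_demands`, the routing matrix and the numbers are all hypothetical.

```python
def estimate_demands(routing, link_loads, prior, sweeps=200):
    """Alternating projections for routing @ x = link_loads with x >= 0.

    routing[i][j] is 1 if demand j crosses link i, else 0.
    Starts from `prior`, so the prior decides which feasible point is found.
    """
    x = list(prior)
    for _ in range(sweeps):
        for row, load in zip(routing, link_loads):
            norm_sq = sum(a * a for a in row)
            if norm_sq == 0:
                continue
            # Project x onto the hyperplane row . x = load.
            step = (load - sum(a * v for a, v in zip(row, x))) / norm_sq
            x = [v + step * a for v, a in zip(x, row)]
        # Project onto the non-negative orthant (demands cannot be negative).
        x = [max(v, 0.0) for v in x]
    return x


# Three demands, two links: link 0 carries demands 0 and 1,
# link 1 carries demands 1 and 2.
routing = [[1, 1, 0],
           [0, 1, 1]]
true_demands = [3.0, 2.0, 5.0]
link_loads = [sum(a * d for a, d in zip(row, true_demands)) for row in routing]

estimate = estimate_demands(routing, link_loads, prior=[2.0, 2.0, 2.0])
residuals = [load - sum(a * v for a, v in zip(row, estimate))
             for row, load in zip(routing, link_loads)]
print(estimate, residuals)
```

The estimate reproduces the link loads but need not equal the true demands, since two equations cannot pin down three unknowns; this is why prior knowledge about the demand pattern is needed.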

[318]  arXiv:2008.00910 [pdf, other]
Title: Color Texture Image Retrieval Based on Copula Multivariate Modeling in the Shearlet Domain
Comments: 37 pages, 16 figures, 4 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)

In this paper, a color texture image retrieval framework is proposed based on Shearlet domain modeling using a Copula multivariate model. In the proposed framework, a Gaussian Copula is used to model the dependencies between different sub-bands of the Non-Subsampled Shearlet Transform (NSST), and non-Gaussian models are used for marginal modeling of the coefficients. Six different schemes are proposed for modeling NSST coefficients based on the four types of neighboring defined; moreover, the closed form of the Kullback-Leibler Divergence (KLD) is calculated in different situations for the two Gaussian Copula and non-Gaussian functions in order to investigate the similarities in the proposed retrieval framework. The Jeffrey divergence (JD) criterion, which is a symmetrical version of KLD, is used for investigating similarities in the proposed framework. We have conducted experiments on four texture image retrieval benchmark datasets, the results of which show the superiority of the proposed framework over the existing state-of-the-art methods. In addition, the retrieval time of the proposed framework is analyzed in the two steps of feature extraction and similarity matching, which shows that the proposed framework achieves an appropriate retrieval time.
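
The role of a closed-form KLD and its symmetrized version in retrieval can be sketched in Python. As a stand-in for the paper's copula-based models, this example assumes univariate Gaussian features, for which the KLD has a well-known closed form, and it assumes the symmetrized divergence is the sum of the two directed KLDs. The function names and the toy database are hypothetical.

```python
import math


def kl_gauss(mu1, sigma1, mu2, sigma2):
    """Closed-form KL( N(mu1, sigma1^2) || N(mu2, sigma2^2) )."""
    return (math.log(sigma2 / sigma1)
            + (sigma1 ** 2 + (mu1 - mu2) ** 2) / (2.0 * sigma2 ** 2)
            - 0.5)


def symmetric_divergence(p, q):
    """Symmetrized KLD (sum convention): KL(p||q) + KL(q||p)."""
    return kl_gauss(*p, *q) + kl_gauss(*q, *p)


def rank(query, database):
    """Return database keys ordered from most to least similar to `query`."""
    return sorted(database,
                  key=lambda name: symmetric_divergence(query, database[name]))


# Each image is summarized by (mean, standard deviation) of one feature.
database = {"a": (0.0, 1.0), "b": (0.1, 1.1), "c": (3.0, 0.5)}
ranking = rank((0.0, 1.0), database)
print(ranking)
```

Because the divergence is available in closed form, similarity matching needs only the fitted model parameters of each image rather than its raw coefficients.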

[319]  arXiv:2008.00914 [pdf, other]
Title: On the Efficiency of Test Suite based Program Repair: A Systematic Assessment of 16 Automated Repair Systems for Java Programs
Subjects: Software Engineering (cs.SE)

Test-based automated program repair has been a prolific field of research in software engineering in the last decade. Many approaches have indeed been proposed, which leverage test suites as a weak, but affordable, approximation to program specifications. Although the literature regularly sets new records on the number of benchmark bugs that can be fixed, several studies increasingly raise concerns about the limitations and biases of state-of-the-art approaches. For example, the correctness of generated patches has been questioned in a number of studies, while other researchers pointed out that evaluation schemes may be misleading with respect to the processing of fault localization results. Nevertheless, there is little work addressing the efficiency of patch generation, with regard to the practicality of program repair. In this paper, we fill this gap in the literature by providing an extensive review of the efficiency of test suite based program repair. Our objective is to assess the number of generated patch candidates, since this information is correlated to (1) the strategy to traverse the search space efficiently in order to select sensical repair attempts, (2) the strategy to minimize the test effort for identifying a plausible patch, and (3) the strategy to prioritize the generation of a correct patch. To that end, we perform a large-scale empirical study on the efficiency, in terms of the quantity of generated patch candidates, of the 16 open-source repair tools for Java programs. The experiments are carefully conducted under the same fault localization configurations to limit biases.

[320]  arXiv:2008.00916 [pdf, other]
Title: Explainable Face Recognition
Comments: To appear in the Proceedings of ECCV 2020. Project page at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Explainable face recognition is the problem of explaining why a facial matcher matches faces. In this paper, we provide the first comprehensive benchmark and baseline evaluation for explainable face recognition. We define a new evaluation protocol called the ``inpainting game'', which is a curated set of 3648 triplets (probe, mate, nonmate) of 95 subjects, which differ by synthetically inpainting a chosen facial characteristic like the nose, eyebrows or mouth creating an inpainted nonmate. An explainable face matcher is tasked with generating a network attention map which best explains which regions in a probe image match with a mated image, and not with an inpainted nonmate for each triplet. This provides ground truth for quantifying what image regions contribute to face matching. Furthermore, we provide a comprehensive benchmark on this dataset comparing five state of the art methods for network attention in face recognition on three facial matchers. This benchmark includes two new algorithms for network attention called subtree EBP and Density-based Input Sampling for Explanation (DISE) which outperform the state of the art by a wide margin. Finally, we show qualitative visualization of these network attention techniques on novel images, and explore how these explainable face recognition models can improve transparency and trust for facial matchers.

[321]  arXiv:2008.00920 [pdf, other]
Title: On The Plurality of Graphs
Comments: Manuscript accepted at NETREASON @ ECAI2020
Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL)

We conduct a series of experiments designed to empirically demonstrate the effects of varying the structural features of a multi-agent emergent communication game framework. Specifically, we model the interactions (edges) between individual agents (nodes) as the structure of a graph generated according to a series of known random graph generating algorithms. Confirming the hypothesis proposed in [10], we show that the two factors of variation induced in this work, namely 1) the graph-generating process and 2) the centrality measure according to which edges are sampled, in fact play a significant role in determining the dynamics of language emergence within the population at hand.

[322]  arXiv:2008.00923 [pdf, other]
Title: Active Object Search
Comments: Accepted at ACM MM 2020
Subjects: Computer Vision and Pattern Recognition (cs.CV)

In this work, we investigate an Active Object Search (AOS) task that is not explicitly addressed in the literature; it aims to actively perform as few action steps as possible to search for and locate the target object in a 3D indoor scene. Different from classic object detection, which passively receives visual information, this task encourages an intelligent agent to perform active search via reasonable action planning; thus it can better recall the target objects, especially in challenging situations where the target is far from the agent, blocked by an obstacle, or out of view. To handle this cross-modal task, we formulate a reinforcement learning framework that consists of a 3D object detector, a state controller and a cross-modal action planner that work cooperatively to find the target object with minimal action steps. During training, we design a novel cost-sensitive active search reward that penalizes inaccurate object search and redundant action steps. To evaluate this novel task, we construct an Active Object Search (AOS) benchmark that contains 5,845 samples from 30 diverse indoor scenes. We conduct extensive qualitative and quantitative evaluations on this benchmark to demonstrate the effectiveness of the proposed approach and analyze the key factors that contribute most to addressing this task.

[323]  arXiv:2008.00927 [pdf, other]
Title: A parameter-dependent smoother for the multigrid method
Subjects: Numerical Analysis (math.NA)

The solution of parameter-dependent linear systems, by classical methods, leads to an arithmetic effort that grows exponentially in the number of parameters. This renders the multigrid method, which has a well understood convergence theory, infeasible. A parameter-dependent representation, e.g., a low-rank tensor format, can avoid this exponential dependence, but in these it is unknown how to calculate the inverse directly within the representation. The combination of these representations with the multigrid method requires a parameter-dependent version of the classical multigrid theory and a parameter-dependent representation of the linear system, the smoother, the prolongation and the restriction. A derived parameter-dependent version of the smoothing property, fulfilled by parameter-dependent versions of the Richardson and Jacobi methods, together with the approximation property prove the convergence of the multigrid method for arbitrary parameter-dependent representations. For a model problem low-rank tensor formats represent the parameter-dependent linear system, prolongation and restriction. The smoother, a damped Jacobi method, is directly approximated in the low-rank tensor format by using exponential sums. Proving the smoothing property for this approximation guarantees the convergence of the parameter-dependent method. Numerical experiments for the parameter-dependent model problem, with bounded parameter value range, indicate a grid size independent convergence rate.
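
For orientation, the classical (parameter-independent) damped Jacobi smoother that the abstract generalizes can be sketched in Python. This is not the low-rank tensor version from the paper; the 1D Poisson matrix tridiag(-1, 2, -1), the damping factor 2/3, and the function names are assumptions chosen to show the smoothing effect.

```python
import math


def damped_jacobi(u, b, omega=2.0 / 3.0, sweeps=1):
    """Damped Jacobi sweeps for A u = b with A = tridiag(-1, 2, -1)."""
    n = len(u)
    for _ in range(sweeps):
        new = u[:]
        for i in range(n):
            left = u[i - 1] if i > 0 else 0.0
            right = u[i + 1] if i < n - 1 else 0.0
            residual = b[i] - (2.0 * u[i] - left - right)
            new[i] = u[i] + omega * residual / 2.0  # diagonal entry is 2
        u = new
    return u


def mode(k, n):
    """k-th discrete sine mode, an eigenvector of the model matrix."""
    return [math.sin(k * math.pi * (i + 1) / (n + 1)) for i in range(n)]


def norm(v):
    return math.sqrt(sum(x * x for x in v))


n = 15
zero_rhs = [0.0] * n  # with b = 0 the iterate equals the error
high = mode(n, n)     # most oscillatory error component
low = mode(1, n)      # smoothest error component

high_ratio = norm(damped_jacobi(high, zero_rhs, sweeps=5)) / norm(high)
low_ratio = norm(damped_jacobi(low, zero_rhs, sweeps=5)) / norm(low)
print(high_ratio, low_ratio)
```

A few sweeps strongly damp the oscillatory error while leaving the smooth error almost untouched; the smooth remainder is what the coarse-grid correction in multigrid is meant to remove.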

[324]  arXiv:2008.00928 [pdf, other]
Title: Traffic Prediction Framework for OpenStreetMap using Deep Learning based Complex Event Processing and Open Traffic Cameras
Comments: 16 pages, 9 Figures, 3 Tables, Paper accepted in GIScience 2020 (now postponed to 2021)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computers and Society (cs.CY)

Displaying near-real-time traffic information is a useful feature of digital navigation maps. However, most commercial providers rely on privacy-compromising measures such as deriving location information from cellphones to estimate traffic. The lack of an open-source traffic estimation method using open data platforms is a bottleneck for building sophisticated navigation services on top of OpenStreetMap (OSM). We propose a deep learning-based Complex Event Processing (CEP) method that relies on publicly available video camera streams for traffic estimation. The proposed framework performs near-real-time object detection and object property extraction across camera clusters in parallel to derive multiple measures related to traffic, with the results visualized on OpenStreetMap. The estimation of object properties (e.g. vehicle speed, count, direction) provides multidimensional data that can be leveraged to create metrics and visualizations for congestion beyond commonly used density-based measures. Our approach couples both flow and count measures during interpolation by considering each vehicle as a sample point and its speed as the weight. We demonstrate multidimensional traffic metrics (e.g. flow rate, congestion estimation) over OSM by processing 22 traffic cameras from London streets. The system achieves a near-real-time performance of 1.42 seconds median latency and an average F-score of 0.80.

[325]  arXiv:2008.00932 [pdf, other]
Title: AUTSL: A Large Scale Multi-modal Turkish Sign Language Dataset and Baseline Methods
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Sign language recognition is a challenging problem where signs are identified by simultaneous local and global articulations of multiple sources, i.e. hand shape and orientation, hand movements, body posture and facial expressions. Solving this problem computationally for a large vocabulary of signs in real life settings is still a challenge, even with the state-of-the-art models. In this study, we present a new large-scale multi-modal Turkish Sign Language dataset (AUTSL) with a benchmark and provide baseline models for performance evaluations. Our dataset consists of 226 signs performed by 43 different signers and 38,336 isolated sign video samples in total. Samples contain a wide variety of backgrounds recorded in indoor and outdoor environments. Moreover, spatial positions and the postures of signers also vary in the recordings. Each sample is recorded with Microsoft Kinect v2 and contains color image (RGB), depth and skeleton data modalities.
We prepared benchmark training and test sets for user independent assessments of the models. We trained several deep learning based models and provide empirical evaluations using the benchmark; we used Convolutional Neural Networks (CNNs) to extract features, and unidirectional and bidirectional Long Short-Term Memory (LSTM) models to characterize temporal information. We also incorporated feature pooling modules and temporal attention into our models to improve performance. Using the benchmark test set, we obtained 62.02% accuracy with RGB+Depth data and 47.62% accuracy with RGB only data with the CNN+FPM+BLSTM+Attention model. Our dataset will be made publicly available at https://cvml.ankara.edu.tr.

[326]  arXiv:2008.00936 [pdf, other]
Title: Detection and Localization of Robotic Tools in Robot-Assisted Surgery Videos Using Deep Neural Networks for Region Proposal and Detection
Journal-ref: IEEE Transactions on Medical Imaging 36 (2017) 1542-1549
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Video understanding of robot-assisted surgery (RAS) videos is an active research area. Modeling the gestures and skill level of surgeons presents an interesting problem. The insights drawn may be applied in effective skill acquisition, objective skill assessment, real-time feedback, and human-robot collaborative surgeries. We propose a solution to the tool detection and localization open problem in RAS video understanding, using a strictly computer vision approach and the recent advances of deep learning. We propose an architecture using multimodal convolutional neural networks for fast detection and localization of tools in RAS videos. To our knowledge, this approach will be the first to incorporate deep neural networks for tool detection and localization in RAS videos. Our architecture applies a Region Proposal Network (RPN), and a multi-modal two stream convolutional network for object detection, to jointly predict objectness and localization on a fusion of image and temporal motion cues. Our results with an Average Precision (AP) of 91% and a mean computation time of 0.1 seconds per test frame detection indicate that our study is superior to conventionally used methods for medical imaging while also emphasizing the benefits of using RPN for precision and efficiency. We also introduce a new dataset, ATLAS Dione, for RAS video understanding. Our dataset provides video data of ten surgeons from Roswell Park Cancer Institute (RPCI) (Buffalo, NY) performing six different surgical tasks on the daVinci Surgical System (dVSS®) with annotations of robotic tools per frame.

[327]  arXiv:2008.00937 [pdf, other]
Title: The Need for Advanced Intelligence in NFV Management and Orchestration
Comments: To Appear in IEEE Network
Subjects: Networking and Internet Architecture (cs.NI); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)

With the constant demand for connectivity at an all-time high, Network Service Providers (NSPs) are required to optimize their networks to cope with rising capital and operational expenditures required to meet the growing connectivity demand. A solution to this challenge was presented through Network Function Virtualization (NFV). As network complexity increases and futuristic networks take shape, NSPs are required to incorporate an increasing amount of operational efficiency into their NFV-enabled networks. One such technique is Machine Learning (ML), which has been applied to various entities in NFV-enabled networks, most notably in the NFV Orchestrator. While traditional ML provides tremendous operational efficiencies, including real-time and high-volume data processing, challenges such as privacy, security, scalability, transferability, and concept drift hinder its widespread implementation. Through the adoption of Advanced Intelligence techniques such as Reinforcement Learning and Federated Learning, NSPs can leverage the benefits of traditional ML while simultaneously addressing the major challenges traditionally associated with it. This work presents the benefits of adopting these advanced techniques, provides a list of potential use cases and research topics, and proposes a bottom-up micro-functionality approach to applying these methods of Advanced Intelligence to NFV Management and Orchestration.

[328]  arXiv:2008.00938 [pdf, other]
Title: Implicit Regularization in Deep Learning: A View from Function Space
Comments: 24 pages. A preliminary version of this work has been presented at the NeurIPS 2019 Workshops on "Machine Learning with Guarantees" and "Science meets Engineering of Deep Learning"
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)

We approach the problem of implicit regularization in deep learning from a geometrical viewpoint. We highlight a possible regularization effect induced by a dynamical alignment of the neural tangent features introduced by Jacot et al., along a small number of task-relevant directions. By extrapolating a new analysis of Rademacher complexity bounds in linear models, we propose and study a new heuristic complexity measure for neural networks which captures this phenomenon, in terms of sequences of tangent kernel classes along the learning trajectories.

[329]  arXiv:2008.00942 [pdf, other]
Title: Improving Generative Adversarial Networks with Local Coordinate Coding
Comments: 20 pages, 5 figures
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV); Machine Learning (stat.ML)

Generative adversarial networks (GANs) have shown remarkable success in generating realistic data from some predefined prior distribution (e.g., Gaussian noises). However, such prior distribution is often independent of real data and thus may lose semantic information (e.g., geometric structure or content in images) of data. In practice, the semantic information might be represented by some latent distribution learned from data. However, such latent distribution may incur difficulties in data sampling for GANs. In this paper, rather than sampling from the predefined prior distribution, we propose an LCCGAN model with local coordinate coding (LCC) to improve the performance of generating data. First, we propose an LCC sampling method in LCCGAN to sample meaningful points from the latent manifold. With the LCC sampling method, we can exploit the local information on the latent manifold and thus produce new data with promising quality. Second, we propose an improved version, namely LCCGAN++, by introducing a higher-order term in the generator approximation. This term is able to achieve better approximation and thus further improve the performance. More critically, we derive the generalization bound for both LCCGAN and LCCGAN++ and prove that a low-dimensional input is sufficient to achieve good generalization performance. Extensive experiments on four benchmark datasets demonstrate the superiority of the proposed method over existing GANs.

[330]  arXiv:2008.00946 [pdf, other]
Title: Conditional Latent Block Model: a Multivariate Time Series Clustering Approach for Autonomous Driving Validation
Comments: 17 pages, 15 figures
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)

Autonomous driving systems validation remains one of the biggest challenges car manufacturers must tackle in order to provide safe driverless cars. The high complexity stems from several factors: the multiplicity of vehicles, embedded systems, use cases, and the very high required level of reliability for the driving system to be at least as safe as a human driver. In order to circumvent these issues, large-scale simulations reproducing this huge variety of physical conditions are intensively used to test driverless cars. Therefore, the validation step produces a massive amount of data, including many time-indexed ones, to be processed. In this context, building a structure in the feature space is mandatory to interpret the various scenarios. In this work, we propose a new co-clustering approach adapted to high-dimensional time series analysis that extends the standard model-based co-clustering. The FunCLBM model extends the recently proposed Functional Latent Block Model and allows creating a dependency structure between row and column clusters. This structured partition acts as a feature selection method that provides several clustering views of a dataset, while discriminating irrelevant features. In this workflow, time series are projected onto a common interpolated low-dimensional frequency space, which allows optimizing the projection basis. In addition, FunCLBM refines the definition of each latent block by performing block-wise dimension reduction and feature selection. We propose a SEM-Gibbs algorithm to infer this model, as well as a dedicated criterion to select the optimal nested partition. Experiments on both simulated and real-case Renault datasets show the effectiveness of the proposed tools and their adequacy to our use case.

[331]  arXiv:2008.00947 [pdf, other]
Title: Pre-training for Video Captioning Challenge 2020 Summary
Subjects: Computer Vision and Pattern Recognition (cs.CV)

The Pre-training for Video Captioning Challenge 2020 Summary: results and challenge participants' technical reports.

[332]  arXiv:2008.00948 [pdf, other]
Title: Frame-To-Frame Consistent Semantic Segmentation
Comments: ACVRW20
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

In this work, we aim for temporally consistent semantic segmentation throughout frames in a video. Many semantic segmentation algorithms process images individually, which leads to an inconsistent scene interpretation due to illumination changes, occlusions and other variations over time. To achieve a temporally consistent prediction, we train a convolutional neural network (CNN) which propagates features through consecutive frames in a video using a convolutional long short-term memory (ConvLSTM) cell. Besides the temporal feature propagation, we penalize inconsistencies in our loss function. We show in our experiments that the performance improves when utilizing video information compared to single frame prediction. The mean intersection over union (mIoU) metric on the Cityscapes validation set increases from 45.2 % for the single frames to 57.9 % for video data after implementing the ConvLSTM to propagate features through time on the ESPNet. Most importantly, inconsistency decreases from 4.5 % to 1.3 %, which is a reduction by 71.1 %. Our results indicate that the added temporal information produces a frame-to-frame consistent and more accurate image understanding compared to single frame processing.
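
As a quick arithmetic check of the figures quoted in this abstract, the sketch below recomputes the relative reduction in inconsistency and the absolute mIoU gain. The helper name `relative_reduction` is an illustrative choice and does not come from the paper.

```python
def relative_reduction(before: float, after: float) -> float:
    """Return the reduction from `before` to `after` as a percentage of `before`."""
    return (before - after) / before * 100.0


# Figures quoted in the abstract (all in percent).
inconsistency_drop = relative_reduction(4.5, 1.3)
miou_gain_points = 57.9 - 45.2  # absolute gain in percentage points

print(f"inconsistency reduced by {inconsistency_drop:.1f} %")
print(f"mIoU gain: {miou_gain_points:.1f} percentage points")
```

The computed relative reduction rounds to the 71.1 % stated in the abstract, and the mIoU gain corresponds to 12.7 percentage points.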

[333]  arXiv:2008.00951 [pdf, other]
Title: Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation
Subjects: Computer Vision and Pattern Recognition (cs.CV)

We present a generic image-to-image translation framework, Pixel2Style2Pixel (pSp). Our pSp framework is based on a novel encoder network that directly generates a series of style vectors which are fed into a pretrained StyleGAN generator, forming the extended W+ latent space. We first show that our encoder can directly embed real images into W+, with no additional optimization. We further introduce a dedicated identity loss which is shown to achieve improved performance in the reconstruction of an input image. We demonstrate pSp to be a simple architecture that, by leveraging a well-trained, fixed generator network, can be easily applied to a wide range of image-to-image translation tasks. Solving these tasks through the style representation results in a global approach that does not rely on a local pixel-to-pixel correspondence and further supports multi-modal synthesis via the resampling of styles. Notably, we demonstrate that pSp can be trained to align a face image to a frontal pose without any labeled data, generate multi-modal results for ambiguous tasks such as conditional face generation from segmentation maps, and construct high-resolution images from corresponding low-resolution images.

[334]  arXiv:2008.00956 [pdf, other]
Title: Interactive Text Graph Mining with a Prolog-based Dialog Engine
Comments: Under consideration in Theory and Practice of Logic Programming (TPLP). arXiv admin note: substantial text overlap with arXiv:1909.09742
Subjects: Computation and Language (cs.CL); Logic in Computer Science (cs.LO)

On top of a neural network-based dependency parser and a graph-based natural language processing module we design a Prolog-based dialog engine that explores interactively a ranked fact database extracted from a text document.
We reorganize dependency graphs to focus on the most relevant content elements of a sentence and integrate sentence identifiers as graph nodes.
Additionally, after ranking the graph we take advantage of the implicit semantic information that dependency links and WordNet bring in the form of subject-verb-object, is-a and part-of relations.
Working on the Prolog facts and their inferred consequences, the dialog engine specializes the text graph with respect to a query and reveals interactively the document's most relevant content elements.
The open-source code of the integrated system is available at https://github.com/ptarau/DeepRank .

[335]  arXiv:2008.00958 [pdf]
Title: SSGMT: A Secure Smart Grid Monitoring Technique
Authors: Sohini Roy
Comments: 7 pages, 3 figures. arXiv admin note: substantial text overlap with arXiv:2005.13093
Subjects: Networking and Internet Architecture (cs.NI); Systems and Control (eess.SY)

Critical infrastructure systems like power grid require an improved critical information infrastructure (CII) that can not only help in monitoring of the critical entities but also take part in failure analysis and self-healing. Efficient designing of a CII is challenging as each kind of communication technology has its own advantages and disadvantages. Wired networks are highly scalable and secure, but they are neither cost effective nor dynamic in nature. Wireless communication technologies on the other hand are easy to deploy, low cost etc. but they are vulnerable to cyber-attacks. In order to optimize cost, power consumption, dynamic nature, accuracy and scalability a hybrid communication network is designed in this paper where a portion of the communication network is built using wireless sensor networks (WSN) and the rest is a wired network of fiber optic channels. To offer seamless operation of the hybrid communication network and provide security a Secure Smart Grid Monitoring Technique (SSGMT) is also proposed. The performance of the proposed hybrid CII for the generation and transmission system of power grid coupled with the SSGMT during different cyber-attacks is tested using NS2 simulator. The simulation results show that the SSGMT for a joint power communication network of IEEE 118-Bus system performs better than the prevailing wireless CIIs like Lo-ADI and Modified AODV.

[336]  arXiv:2008.00960 [pdf, ps, other]
Title: New Results on the Storage-Retrieval Tradeoff in Private Information Retrieval Systems
Subjects: Information Theory (cs.IT)

In a private information retrieval (PIR) system, the user needs to retrieve one of the possible messages from a set of storage servers, but wishes to keep the identity of the requested message private from any given server. Existing efforts in this area have made it clear that the efficiency of the retrieval will be impacted significantly by the amount of the storage space allowed at the servers. In this work, we consider the tradeoff between the storage cost and the retrieval cost. We first present three fundamental results: 1) a regime-wise 2-approximate characterization of the optimal tradeoff, 2) a cyclic permutation lemma that can produce more sophisticated codes from simpler ones, and 3) a relaxed entropic linear program (LP) lower bound that has a polynomial complexity. Equipped with the cyclic permutation lemma, we then propose two novel code constructions, and by applying the lemma, obtain new storage-retrieval points. Furthermore, we derive more explicit lower bounds by utilizing only a subset of the constraints in the relaxed entropic LP in a systematic manner. Though the new upper bound and lower bound do not lead to a more precise approximate characterization in general, they are significantly tighter than the existing art.

[337]  arXiv:2008.00961 [pdf, other]
Title: Accelerating Genome Analysis: A Primer on an Ongoing Journey
Subjects: Hardware Architecture (cs.AR); Genomics (q-bio.GN); Computation (stat.CO)

Genome analysis fundamentally starts with a process known as read mapping, where sequenced fragments of an organism's genome are compared against a reference genome. Read mapping is currently a major bottleneck in the entire genome analysis pipeline, because state-of-the-art genome sequencing technologies are able to sequence a genome much faster than the computational techniques employed to analyze the genome. We describe the ongoing journey in significantly improving the performance of read mapping. We explain state-of-the-art algorithmic methods and hardware-based acceleration approaches. Algorithmic approaches exploit the structure of the genome as well as the structure of the underlying hardware. Hardware-based acceleration approaches exploit specialized microarchitectures or various execution paradigms (e.g., processing inside or near memory). We conclude with the challenges of adopting these hardware-accelerated read mappers.

[338]  arXiv:2008.00962 [pdf, other]
Title: Deep Traffic Sign Detection and Recognition Without Target Domain Real Images
Comments: arXiv admin note: text overlap with arXiv:1907.09679
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Deep learning has been successfully applied to several problems related to autonomous driving, often relying on large databases of real target-domain images for proper training. The acquisition of such real-world data is not always possible in the self-driving context, and sometimes their annotation is not feasible. Moreover, in many tasks, there is an intrinsic data imbalance that most learning-based methods struggle to cope with. Particularly, traffic sign detection is a challenging problem in which these three issues are seen altogether. To address these challenges, we propose a novel database generation method that requires only (i) arbitrary natural images, i.e., requires no real image from the target-domain, and (ii) templates of the traffic signs. The method does not aim at overcoming the training with real data, but to be a compatible alternative when the real data is not available. The effortlessly generated database is shown to be effective for the training of a deep detector on traffic signs from multiple countries. On large data sets, training with a fully synthetic data set almost matches the performance of training with a real one. When compared to training with a smaller data set of real images, training with synthetic images increased the accuracy by 12.25%. The proposed method also improves the performance of the detector when target-domain data are available.

[339]  arXiv:2008.00965 [pdf, other]
Title: End-to-end Full Projector Compensation
Comments: Source code: this https URL arXiv admin note: text overlap with arXiv:1908.06246, arXiv:1904.04335
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Full projector compensation aims to modify a projector input image to compensate for both geometric and photometric disturbance of the projection surface. Traditional methods usually solve the two parts separately and may suffer from suboptimal solutions. In this paper, we propose the first end-to-end differentiable solution, named CompenNeSt++, to solve the two problems jointly. First, we propose a novel geometric correction subnet, named WarpingNet, which is designed with a cascaded coarse-to-fine structure to learn the sampling grid directly from sampling images. Second, we propose a novel photometric compensation subnet, named CompenNeSt, which is designed with a siamese architecture to capture the photometric interactions between the projection surface and the projected images, and to use such information to compensate the geometrically corrected images. By concatenating WarpingNet with CompenNeSt, CompenNeSt++ accomplishes full projector compensation and is end-to-end trainable. Third, to improve practicability, we propose a novel synthetic data-based pre-training strategy to significantly reduce the number of training images and training time. Moreover, we construct the first setup-independent full compensation benchmark to facilitate future studies. In thorough experiments, our method shows clear advantages over prior art with promising compensation quality and meanwhile being practically convenient.

[340]  arXiv:2008.00969 [pdf, other]
Title: Predicted Composite Signed-Distance Fields for Real-Time Motion Planning in Dynamic Environments
Comments: 8 pages, 8 figures, 1 table, submitted to IEEE Robotics and Automation Letters (RA-L)
Subjects: Robotics (cs.RO); Systems and Control (eess.SY)

We present a novel framework for motion planning in dynamic environments that accounts for the predicted trajectories of moving objects in the scene. We explore the use of composite signed-distance fields in motion planning and detail how they can be used to generate signed-distance fields (SDFs) in real-time to incorporate predicted obstacle motions. We benchmark our approach of using composite SDFs against performing exact SDF calculations on the workspace occupancy grid. Our proposed technique generates predictions substantially faster and typically exhibits an 81--97% reduction in time for subsequent predictions. We integrate our framework with GPMP2 to demonstrate a full implementation of our approach in real-time, enabling a 7-DoF Panda arm to smoothly avoid a moving robot.
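
To illustrate the general idea of composing signed-distance fields, here is a minimal sketch that assumes circular obstacles and a constant-velocity prediction; neither assumption comes from the abstract, and the function names are hypothetical. It relies on the standard fact that, for a query point outside all obstacles, the signed distance to a union of shapes is the minimum of the per-shape signed distances.

```python
import math


def circle_sdf(point, center, radius):
    """Signed distance from `point` to a circle: negative inside, positive outside."""
    return math.dist(point, center) - radius


def composite_sdf(point, obstacles):
    """Signed distance to a set of obstacles: the minimum of the per-obstacle SDFs."""
    return min(circle_sdf(point, center, radius) for center, radius in obstacles)


def predict_obstacle(center, radius, velocity, dt):
    """Constant-velocity prediction of an obstacle after `dt` seconds (an assumption)."""
    moved = (center[0] + velocity[0] * dt, center[1] + velocity[1] * dt)
    return moved, radius


static_obstacle = ((0.0, 0.0), 1.0)
moving_center, moving_radius, moving_velocity = (4.0, 4.0), 1.0, (-1.0, -1.0)

query = (3.0, 3.0)
for t in (0.0, 1.0):
    predicted = predict_obstacle(moving_center, moving_radius, moving_velocity, t)
    distance = composite_sdf(query, [static_obstacle, predicted])
    print(f"t={t:.0f}s  signed distance at {query}: {distance:.3f}")
```

Because only the moving obstacle's field changes between time steps, the static field can be reused and only the cheap minimum needs recomputing. That is one plausible reason a composite approach can be faster than recomputing an exact SDF over the whole occupancy grid.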

[341]  arXiv:2008.00975 [pdf, other]
Title: SeCo: Exploring Sequence Supervision for Unsupervised Representation Learning
Subjects: Computer Vision and Pattern Recognition (cs.CV)

A steady momentum of innovations and breakthroughs has convincingly pushed the limits of unsupervised image representation learning. Compared to static 2D images, video has one more dimension (time). The inherent supervision existing in such sequential structure offers a fertile ground for building unsupervised learning models. In this paper, we compose a trilogy of exploring the basic and generic supervision in the sequence from spatial, spatiotemporal and sequential perspectives. We materialize the supervisory signals through determining whether a pair of samples is from one frame or from one video, and whether a triplet of samples is in the correct temporal order. We uniquely regard the signals as the foundation in contrastive learning and derive a particular form named Sequence Contrastive Learning (SeCo). SeCo shows superior results under the linear protocol on action recognition (Kinetics), untrimmed activity recognition (ActivityNet) and object tracking (OTB-100). More remarkably, SeCo demonstrates considerable improvements over recent unsupervised pre-training techniques, and leads the accuracy by 2.96% and 6.47% against fully-supervised ImageNet pre-training in action recognition task on UCF101 and HMDB51, respectively.

[342]  arXiv:2008.00977 [pdf, other]
Title: Inter-Coder Agreement for Improving Reliability in Software Engineering Qualitative Research
Comments: 37 pages, 24 figures, 12 tables. arXiv admin note: text overlap with arXiv:2005.10388
Subjects: Software Engineering (cs.SE); Methodology (stat.ME)

In recent years, research on empirical software engineering that uses qualitative data analysis (e.g. thematic analysis, content analysis, and grounded theory) has been increasing. However, most of this research does not delve into the reliability and validity of findings, specifically the reliability of coding, despite the existence of a variety of statistical techniques known as Inter-Coder Agreement (ICA) for analyzing consensus in team coding.
This paper aims to establish a novel theoretical framework that enables a methodological approach for conducting this validity analysis. This framework is based on a set of statistics for measuring the degree of agreement that different coders achieve when judging a common matter. We analyze different reliability coefficients and provide detailed examples of calculation, with special attention to Krippendorff's $\alpha$ coefficients. We systematically review several variants of Krippendorff's $\alpha$ reported in the literature and provide a novel common mathematical framework in which all of them are unified through a universal $\alpha$ coefficient.
Finally, this paper provides a detailed guide to the use of this theoretical framework in a large case study on DevOps culture. We explain how $\alpha$ coefficients are computed and interpreted using a widely used software tool for qualitative analysis such as Atlas.ti.
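
For readers unfamiliar with the statistic, the following is a minimal sketch of the classic Krippendorff's $\alpha$ for nominal codes, computed as $\alpha = 1 - D_o / D_e$ from a coincidence matrix. It is not the unified coefficient proposed in the paper, nor the Atlas.ti implementation; the function name and the toy data are illustrative assumptions.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal codes.

    `units` is a list of lists; each inner list holds the codes that the
    coders assigned to one unit. Units with fewer than two codes are skipped.
    """
    coincidences = Counter()
    for codes in units:
        m = len(codes)
        if m < 2:
            continue
        # Every ordered pair of codes from different coders counts 1 / (m - 1).
        for a, b in permutations(codes, 2):
            coincidences[(a, b)] += 1.0 / (m - 1)

    # Marginal totals n_c of the coincidence matrix and the grand total n.
    totals = Counter()
    for (a, _), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())

    # Observed and expected disagreement, both scaled by the same factor n.
    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(
        totals[a] * totals[b] for a in totals for b in totals if a != b
    ) / (n - 1)
    if expected == 0:
        raise ValueError("alpha is undefined when only one code is ever used")
    return 1.0 - observed / expected


# Two coders, four units, one disagreement (toy data).
ratings = [["a", "a"], ["a", "a"], ["b", "b"], ["a", "b"]]
print(round(krippendorff_alpha_nominal(ratings), 4))
```

On this toy data the coincidence matrix has marginals $n_a = 5$, $n_b = 3$ and two off-diagonal coincidences, so $\alpha = 1 - (7 \cdot 2)/30 = 8/15 \approx 0.533$.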

[343]  arXiv:2008.00979 [pdf, other]
Title: Anakatabatic Inertia: Particle-wise Adaptive Inertia for PSO
Comments: 6 pages, 5 figures, 2 tables. arXiv admin note: substantial text overlap with arXiv:1906.02474
Subjects: Neural and Evolutionary Computing (cs.NE)

Throughout the course of the development of Particle Swarm Optimization, particle inertia has been established as an important aspect of the method for researching possible method improvements. As a continuation of our previous research, we propose a novel generalized technique of inertia weight adaptation based on each individual particle's fitness improvement, called anakatabatic inertia. This technique allows for adapting the inertia weight value for each particle corresponding to the particle's increasing or decreasing fitness, i.e. conditioned by the particle's ascending (anabatic) or descending (katabatic) movement. The proposed inertia weight control framework was metaoptimized and tested on the 30 test functions of the CEC 2014 test suite. The conducted procedure produced four anakatabatic models, two for each of the PSO methods used (Standard PSO and TVAC-PSO). The benchmark testing results show that using the proposed anakatabatic inertia models reliably yields moderate improvements in accuracy of Standard PSO (final fitness minimum reduced up to 0.09 orders of magnitude) and rather strong improvements for TVAC-PSO (final fitness minimum reduced up to 0.59 orders of magnitude), mostly without any adverse effects on the method's performance.
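
The idea of a per-particle inertia weight can be sketched as follows. The abstract does not give the metaoptimized anakatabatic models, so the two constant weights, the direction of the rule and the function names below are placeholders chosen purely for illustration. Only the velocity update itself is the standard PSO formula.

```python
def particle_inertia(previous_fitness, current_fitness,
                     w_improving=0.9, w_worsening=0.4):
    """Pick an inertia weight for one particle from its latest fitness change.

    Assumes minimisation; the two weights are hypothetical placeholders.
    """
    return w_improving if current_fitness < previous_fitness else w_worsening


def update_velocity(velocity, position, personal_best, global_best, inertia,
                    r1, r2, c1=2.0, c2=2.0):
    """Standard PSO velocity update, with the inertia weight supplied per particle."""
    return [
        inertia * v + c1 * r1 * (pb - x) + c2 * r2 * (gb - x)
        for v, x, pb, gb in zip(velocity, position, personal_best, global_best)
    ]


# r1 and r2 are normally drawn uniformly from [0, 1]; fixed here for reproducibility.
w = particle_inertia(previous_fitness=5.0, current_fitness=3.0)
new_velocity = update_velocity(
    velocity=[1.0, -1.0], position=[0.0, 0.0],
    personal_best=[1.0, 1.0], global_best=[2.0, 2.0],
    inertia=w, r1=0.5, r2=0.5,
)
print(w, new_velocity)
```

In a full optimizer, `particle_inertia` would be called once per particle per iteration, so that particles with improving and worsening fitness receive different inertia in the same swarm.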

[344]  arXiv:2008.00987 [pdf, other]
Title: Age of Information-Reliability Trade-offs in Energy Harvesting Sensor Networks
Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
Subjects: Information Theory (cs.IT); Signal Processing (eess.SP)

Age of Information (AoI) is a recently defined quantity, which measures the freshness of information in a communication scheme. In this paper, we analyze a network that consists of a sensor node, an energy source and a receiver. The energy source is broadcasting energy and the sensor is charging its battery using energy-harvesting technologies. Whenever the battery gets fully charged, the sensor measures some quantity (called its status) from an environment, and/or transmits its status to the receiver. The full analysis of the AoI of this network, in the setting where each status is sent once, was given previously. However, that approach does not present a reliability guarantee better than the success probability of one transmission. In this paper, we present a closed-form expression for the AoI of a deterministic and a randomized scheme that guarantee a desired probability of successful transmission for each status, along with a zero-error scheme. Furthermore, we define a novel notion called the AoI-reliability trade-off and present the AoI-reliability trade-offs of our schemes. Additionally, we show that numerical results match our theoretical findings.

[345]  arXiv:2008.00989 [pdf, other]
Title: Exposed Buffer Architecture for Programmable and Stateful Networking
Subjects: Networking and Internet Architecture (cs.NI)

Exposed Buffer Architecture addresses network ossification by confronting the virtualization that is inherent to the Internet architecture. In the Internet stack below the Network Layer, the Link layer models services that are local to network nodes, or that connect them in local area networks. Aggregating these low level resources in the implementation of IP to implement wide area services serves two different purposes: 1) It virtualizes local services, enabling interoperability through the adoption of a common model, and 2) It hides the topology of local infrastructure. The premise of Exposed Buffer Architecture is that we can separate these two tasks.

[346]  arXiv:2008.00992 [pdf, other]
Title: An Exploration of Target-Conditioned Segmentation Methods for Visual Object Trackers
Comments: European Conference on Computer Vision (ECCV) 2020, Visual Object Tracking Challenge VOT2020 workshop
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Visual object tracking is the problem of predicting a target object's state in a video. Generally, bounding-boxes have been used to represent states, and a surge of effort has been spent by the community to produce efficient causal algorithms capable of locating targets with such representations. As the field is moving towards binary segmentation masks to define objects more precisely, in this paper we propose to extensively explore target-conditioned segmentation methods available in the computer vision community, in order to transform any bounding-box tracker into a segmentation tracker. Our analysis shows that such methods allow trackers to compete with recently proposed segmentation trackers, while performing quasi real-time.

[347]  arXiv:2008.00993 [pdf]
Title: Independent publishers and social networks in the 21st century: the balance of power in the transatlantic Spanish-language book market
Subjects: Social and Information Networks (cs.SI); Digital Libraries (cs.DL)

The present paper uses Twitter to analyze the current state of the worldwide, Spanish-language, independent publishing market. The main purposes are to determine whether certain Latin American Spanish-language independent publishers function as gatekeepers of World Literature and to analyze the geopolitical structure of this global market, addressing both the Europe-America dialectic and neocolonial practices. After selecting the sample of publishers, we conducted a search for their Twitter profiles and located 131; we then downloaded data from the corresponding Twitter APIs. Finally, we applied social network analysis to study the presence of and interaction between our sample of independent publishers on this social media. Our results provide data-based evidence supporting the hypothesis of some literary critics who suggest that in Latin America, certain publishers act as gatekeepers to the mainstream book market. Therefore, Twitter could be considered a valid source of information to address the independent book market in Spanish. By extension, this approach could be applied to other cultural industries in which small and medium-sized agents develop a digital presence in social media. This paper combines social network analysis and literary criticism to provide new evidence about the Spanish-language book market. It helps validate the aforementioned hypothesis, proposed by literary critics, and opens up new paths along which to pursue an interpretative, comparative analysis.

[348]  arXiv:2008.00994 [pdf, other]
Title: Cluster-Based Cooperative Digital Over-the-Air Aggregation for Wireless Federated Edge Learning
Subjects: Information Theory (cs.IT); Machine Learning (cs.LG)

In this paper, we study a federated learning system at the wireless edge that uses over-the-air computation (AirComp). In such a system, users transmit their messages over a multi-access channel concurrently to achieve fast model aggregation. Recently, an AirComp scheme based on digital modulation has been proposed featuring one-bit gradient quantization and truncated channel inversion at users and a majority-voting based decoder at the fusion center (FC). We propose an improved digital AirComp scheme to relax its requirements on the transmitters, where users perform phase correction and transmit with full power. To characterize the decoding failure probability at the FC, we introduce the normalized detection signal-to-noise ratio (SNR), which can be interpreted as the effective participation rate of users. To mitigate wireless fading, we further propose a cluster-based system and design the relay selection scheme based on the normalized detection SNR. By local data fusion within each cluster and relay selection, our scheme can fully exploit spatial diversity to increase the effective number of voting users and accelerate model convergence.
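
The digital abstraction behind one-bit gradient quantization with majority voting can be sketched as below. This sketch deliberately ignores everything that makes the over-the-air setting hard (fading, noise, phase correction, clustering and relay selection). The function names and the tie-breaking rule are assumptions made for illustration.

```python
def sign(value):
    """One-bit quantisation: +1 for non-negative values, -1 otherwise."""
    return 1 if value >= 0 else -1


def majority_vote(user_gradients):
    """Aggregate the per-coordinate signs of all users by majority vote.

    Ties resolve to +1 here, which is an arbitrary illustrative choice.
    """
    votes = [sum(sign(g) for g in coordinate) for coordinate in zip(*user_gradients)]
    return [sign(v) for v in votes]


def apply_update(weights, voted_signs, learning_rate=0.1):
    """Move each weight one fixed step against the voted gradient sign."""
    return [w - learning_rate * s for w, s in zip(weights, voted_signs)]


# Three users, each holding a local gradient for a three-parameter model.
local_gradients = [
    [0.3, -1.2, 0.1],
    [0.5, 0.4, -0.2],
    [-0.1, -0.7, -0.9],
]
decoded = majority_vote(local_gradients)
print(decoded)
print(apply_update([0.0, 0.0, 0.0], decoded))
```

In an over-the-air scheme the per-coordinate sum is formed by the superposition of the users' signals on the channel rather than computed explicitly, which is why the effective number of voting users matters for decoding reliability.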

[349]  arXiv:2008.00997 [pdf, other]
Title: Segmenting overlapped objects in images. A study to support the diagnosis of sickle cell disease
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)

Overlapped objects are found in many kinds of images and are a source of problems because each object provides only partial information. Multiple types of algorithms are used to address this problem, from simple and naive methods to more complex ones. In this work we propose a new method for the segmentation of overlapped objects. Finally, we compare the results of this algorithm with the state of the art in two experiments: one with a new dataset, developed specially for this work, and one with red blood smears from sickle-cell disease patients.

[350]  arXiv:2008.01000 [pdf, other]
Title: Predicting Channel Quality Indicators for 5G Downlink Scheduling in a Deep Learning Approach
Subjects: Networking and Internet Architecture (cs.NI)

5G networks provide more bandwidth and more complex control to enhance users' experience, while also requiring a more accurate estimation of the communication channels compared with previous mobile networks. In this paper, we propose a channel quality indicator (CQI) prediction method based on deep learning, specifically a Long Short-Term Memory (LSTM) algorithm. An online training module is introduced for the downlink scheduling in the 5G New Radio (NR) system, to reduce the communication degradation caused by outdated CQI, especially in high-speed mobility scenarios. First, we analyze the impact of outdated CQI in the downlink scheduling of the 5G NR system. Then, we design a data generation and online training module to evaluate our prediction method in ns-3. The simulation results show that the proposed LSTM method outperforms the Feedforward Neural Networks (FNN) method in improving the system performance of the downlink transmission. Our study may provide insights into designing new deep learning algorithms to enhance the network performance of the 5G NR system.

[351]  arXiv:2008.01003 [pdf, other]
Title: Teacher-Student Training and Triplet Loss for Facial Expression Recognition under Occlusion
Comments: Accepted at ICPR 2020
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

In this paper, we study the task of facial expression recognition under strong occlusion. We are particularly interested in cases where 50% of the face is occluded, e.g. when the subject wears a Virtual Reality (VR) headset. While previous studies show that pre-training convolutional neural networks (CNNs) on fully-visible (non-occluded) faces improves the accuracy, we propose to employ knowledge distillation to achieve further improvements. First of all, we employ the classic teacher-student training strategy, in which the teacher is a CNN trained on fully-visible faces and the student is a CNN trained on occluded faces. Second of all, we propose a new approach for knowledge distillation based on triplet loss. During training, the goal is to reduce the distance between an anchor embedding, produced by a student CNN that takes occluded faces as input, and a positive embedding (from the same class as the anchor), produced by a teacher CNN trained on fully-visible faces, so that it becomes smaller than the distance between the anchor and a negative embedding (from a different class than the anchor), produced by the student CNN. Third of all, we propose to combine the distilled embeddings obtained through the classic teacher-student strategy and our novel teacher-student strategy based on triplet loss into a single embedding vector. We conduct experiments on two benchmarks, FER+ and AffectNet, with two CNN architectures, VGG-f and VGG-face, showing that knowledge distillation can bring significant improvements over the state-of-the-art methods designed for occluded faces in the VR setting.
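
The triplet objective described in this abstract can be sketched as a hinge loss on embedding distances. The use of plain Euclidean distance, the margin value and the toy two-dimensional embeddings are assumptions for illustration; the abstract does not specify them.

```python
import math


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss on Euclidean distances between embeddings.

    The loss is zero once the anchor is closer to the positive than to the
    negative by at least `margin`.
    """
    d_pos = math.dist(anchor, positive)
    d_neg = math.dist(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Anchor: student embedding of an occluded face.
# Positive: teacher embedding of a fully-visible face from the same class.
# Negative: student embedding of an occluded face from a different class.
student_anchor = (0.0, 0.0)
teacher_positive = (3.0, 4.0)
student_negative = (0.0, 1.0)

print(triplet_loss(student_anchor, teacher_positive, student_negative))
print(triplet_loss(student_anchor, (0.0, 0.5), (3.0, 4.0)))
```

In the first call the positive is farther from the anchor than the negative, so the loss is positive and training would pull the student's embedding toward the teacher's. In the second call the constraint is already satisfied and the loss is zero.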

[352]  arXiv:2008.01009 [pdf, other]
Title: The Splay-List: A Distribution-Adaptive Concurrent Skip-List
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Data Structures and Algorithms (cs.DS)

The design and implementation of efficient concurrent data structures have seen significant attention. However, most of this work has focused on concurrent data structures providing good \emph{worst-case} guarantees. In real workloads, objects are often accessed at different rates, since access distributions may be non-uniform. Efficient distribution-adaptive data structures are known in the sequential case, e.g. splay-trees; however, they are often hard to translate efficiently to the concurrent case.
In this paper, we investigate distribution-adaptive concurrent data structures and propose a new design called the splay-list. At a high level, the splay-list is similar to a standard skip-list, with the key distinction that the height of each element adapts dynamically to its access rate: popular elements ``move up,'' whereas rarely-accessed elements decrease in height. We show that the splay-list provides order-optimal amortized complexity bounds for a subset of operations while being amenable to efficient concurrent implementation. Experimental results show that the splay-list can leverage distribution-adaptivity to improve on the performance of classic concurrent designs, and can outperform the only previously-known distribution-adaptive design in certain settings.

[353]  arXiv:2008.01013 [pdf, other]
Title: Swipe dynamics as a means of authentication: results from a Bayesian unsupervised approach
Comments: 9 pages, 8 figures
Subjects: Cryptography and Security (cs.CR); Applications (stat.AP)

The field of behavioural biometrics stands as an appealing alternative to more traditional biometric systems, due to the ease of use from a user perspective and the potential robustness to presentation attacks. Due to the nature of the characteristic features being modelled, a person's behaviour can be measured in a myriad of ways, growing with the evolution of embedded sensor technologies. This paper focuses on a specific type of behavioural biometric utilising swipe dynamics, also often referred to as touch gestures.
One characteristic of swipe authentication and new behavioural biometrics in general is the lack of available data to train and validate models, which makes unsupervised models particularly suited to the task. There is a strong usability requirement to be able to enrol a user with as few attempts as possible. From a machine learning perspective, this presents the classic curse of dimensionality problem, where one needs to train a model on a high dimensional feature space with only a few observations. The problem of modelling behavioural biometrics in this setting is discussed as one of learning probability distribution functions. This is viewed through the lens of Bayesian unsupervised models, which are well-suited to the low-data problem.
This paper presents results from a set of experiments consisting of 38 sessions with labelled victim as well as blind and over-the-shoulder presentation attacks. Three models are compared using this dataset; two single-mode models: a shrunk covariance and a Bayesian Gaussian, as well as a Bayesian non-parametric infinite mixture of Gaussians, modelled as a Dirichlet Process (DP). Equal Error Rates (EER) for the three models are compared and attention is paid to how EER varies across the two single-mode models at low number of enrolment samples.
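As a rough illustration of the Equal Error Rate metric named above, the following sketch sweeps a decision threshold over two sets of scores and reports the operating point where the false accept and false reject rates are closest. The function name, the made-up scores, and the convention that a higher score means "more likely the enrolled user" are assumptions for illustration, not details from the paper.

```python
def equal_error_rate(genuine, impostor):
    """Approximate the EER by sweeping thresholds over all observed scores.

    Assumes (hypothetically) that a higher score means the attempt looks
    more like the enrolled user.
    """
    best_gap, eer = None, None
    for t in sorted(set(genuine) | set(impostor)):
        # False reject: a genuine attempt scored below the threshold.
        frr = sum(s < t for s in genuine) / len(genuine)
        # False accept: an impostor attempt scored at or above the threshold.
        far = sum(s >= t for s in impostor) / len(impostor)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            # Report the midpoint of the two rates at the closest crossing.
            best_gap, eer = gap, (far + frr) / 2
    return eer


genuine_scores = [0.9, 0.8, 0.7, 0.4]   # made-up scores for the enrolled user
impostor_scores = [0.6, 0.3, 0.2, 0.1]  # made-up scores for attack attempts
eer = equal_error_rate(genuine_scores, impostor_scores)
```

With only a handful of scores the two error rates move in coarse steps, so the reported value is the midpoint at the closest crossing rather than an exact intersection.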

[354]  arXiv:2008.01018 [pdf, other]
Title: RareAct: A video dataset of unusual interactions
Subjects: Computer Vision and Pattern Recognition (cs.CV)

This paper introduces a manually annotated video dataset of unusual actions, namely RareAct, including actions such as "blend phone", "cut keyboard" and "microwave shoes". RareAct aims at evaluating the zero-shot and few-shot compositionality of action recognition models for unlikely compositions of common action verbs and object nouns. It contains 122 different actions which were obtained by combining verbs and nouns rarely co-occurring together in the large-scale textual corpus from HowTo100M, but that frequently appear separately. We provide benchmarks using a state-of-the-art HowTo100M pretrained video and text model and show that zero-shot and few-shot compositionality of actions remains a challenging and unsolved task.

[355]  arXiv:2008.01023 [pdf, other]
Title: From Design Draft to Real Attire: Unaligned Fashion Image Translation
Comments: Accepted by ACMMM 2020. Our project website is available at: https://victoriahy.github.io/MM2020/
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)

Fashion manipulation has attracted growing interest due to its great application value, which has inspired much research on fashion images. However, little attention has been paid to fashion design drafts. In this paper, we study a new unaligned translation problem between design drafts and real fashion items, whose main challenge lies in the huge misalignment between the two modalities. We first collect paired design drafts and real fashion item images without pixel-wise alignment. To solve the misalignment problem, our main idea is to train a sampling network to adaptively adjust the input to an intermediate state with structure alignment to the output. Moreover, built upon the sampling network, we present a design draft to real fashion item translation network (D2RNet), where two separate translation streams that focus on texture and shape, respectively, are combined tactfully to get both benefits. D2RNet is able to generate realistic garments with both texture and shape consistency to their design drafts. We show that this idea can be effectively applied to the reverse translation problem and present R2DNet accordingly. Extensive experiments on unaligned fashion design translation demonstrate the superiority of our method over state-of-the-art methods. Our project website is available at: https://victoriahy.github.io/MM2020/.

[356]  arXiv:2008.01024 [pdf, other]
Title: Finite-time Control of Discrete-time Positive Linear Systems via Convex Optimization
Subjects: Systems and Control (eess.SY); Optimization and Control (math.OC)

In this paper, we study a class of finite-time control problems for discrete-time positive linear systems with time-varying state parameters. Although several interesting control problems appearing in population biology, economics, and network epidemiology can be described as the class of finite-time control problems, an efficient solution to the control problem has not yet been found in the literature. In this paper, we propose an optimization framework for solving the class of finite-time control problems via convex optimization. We illustrate the effectiveness of the proposed method by numerical simulation in the context of dynamical product development processes.

[357]  arXiv:2008.01034 [pdf, other]
Title: Project to Adapt: Domain Adaptation for Depth Completion from Noisy and Sparse Sensor Data
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Depth completion aims to predict a dense depth map from a sparse depth input. The acquisition of dense ground truth annotations for depth completion settings can be difficult and, at the same time, a significant domain gap between real LiDAR measurements and synthetic data has prevented the successful training of models in virtual settings. We propose a domain adaptation approach for sparse-to-dense depth completion that is trained from synthetic data, without annotations in the real domain or additional sensors. Our approach simulates the real sensor noise in an RGB+LiDAR set-up, and consists of three modules: simulating the real LiDAR input in the synthetic domain via projections, filtering the real noisy LiDAR for supervision and adapting the synthetic RGB image using a CycleGAN approach. We extensively evaluate these modules against the state-of-the-art in the KITTI depth completion benchmark, showing significant improvements.

[358]  arXiv:2008.01036 [pdf, other]
Title: Multiple Descent: Design Your Own Generalization Curve
Subjects: Machine Learning (cs.LG); Statistics Theory (math.ST); Machine Learning (stat.ML)

This paper explores the generalization loss of linear regression in variably parameterized families of models, both under-parameterized and over-parameterized. We show that the generalization curve can have an arbitrary number of peaks, and moreover, locations of those peaks can be explicitly controlled.
Our results highlight the fact that both classical U-shaped generalization curve and the recently observed double descent curve are not intrinsic properties of the model family. Instead, their emergence is due to the interaction between the properties of the data and the inductive biases of learning algorithms.

[359]  arXiv:2008.01039 [pdf, other]
Title: Spiking neuromorphic chip learns entangled quantum states
Comments: 9+2 pages, 4+1 figures
Subjects: Emerging Technologies (cs.ET); Disordered Systems and Neural Networks (cond-mat.dis-nn); Neural and Evolutionary Computing (cs.NE); Quantum Physics (quant-ph)

Neuromorphic systems are designed to emulate certain structural and dynamical properties of biological neuronal networks, with the aim of inheriting the brain's functional performance and energy efficiency in artificial-intelligence applications [1,2]. Among the platforms existing today, the spike-based BrainScaleS system stands out by realizing fast analog dynamics which can boost computationally expensive tasks [3]. Here we use the latest BrainScaleS generation [4] for the algorithm-free simulation of quantum systems, thereby opening up an entirely new application space for these devices. This requires an appropriate spike-based representation of quantum states and an associated training method for imprinting a desired target state onto the network. We employ a representation of quantum states using probability distributions [5,6], enabling the use of a Bayesian sampling framework for spiking neurons [7]. For training, we developed a Hebbian learning scheme that explicitly exploits the inherent speed of the substrate, which enables us to realize a variety of network topologies. We encoded maximally entangled states of up to four qubits and observed fidelities that imply genuine $N$-partite entanglement. In particular, the encoding of entangled pure and mixed two-qubit states reaches a quality that allows the observation of Bell correlations, thus demonstrating that non-classical features of quantum systems can be captured by spiking neural dynamics. Our work establishes an intriguing connection between quantum systems and classical spiking networks, and demonstrates the feasibility of simulating quantum systems with neuromorphic hardware.

[360]  arXiv:2008.01040 [pdf, other]
Title: A Learned Performance Model for the Tensor Processing Unit
Subjects: Performance (cs.PF); Machine Learning (cs.LG)

Accurate hardware performance models are critical to efficient code generation. They can be used by compilers to make heuristic decisions, by superoptimizers as a minimization objective, or by autotuners to find an optimal configuration of a specific program. However, they are difficult to develop because contemporary processors are complex, and the recent proliferation of deep learning accelerators has increased the development burden. We demonstrate a method of learning performance models from a corpus of tensor computation graph programs for the Tensor Processing Unit (TPU). We train a neural network over kernel-level sub-graphs from the corpus and find that the learned model is competitive with a heavily-optimized analytical cost model used in the production XLA compiler.

[361]  arXiv:2008.01046 [pdf, other]
Title: Understanding and Improving Artifact Sharing in Software Engineering Research
Comments: 42 pages
Subjects: Software Engineering (cs.SE)

In recent years, many software engineering researchers have begun to include artifacts alongside their research papers. Ideally, artifacts, which include tools, benchmarks, data, and more, support the dissemination of ideas, provide evidence for research claims, and serve as a starting point for future research. This often takes the form of a link in the paper pointing to a website containing these additional materials. However, in practice, artifacts suffer from a variety of issues that prevent them from fully realising that potential.
To help the software engineering community realise the potential of artifacts, we seek to understand the challenges involved in the creation, sharing, and use of artifacts. To that end, we perform a mixed-methods study including a publication analysis and online survey of 153 software engineering researchers. We apply the established theory of diffusion of innovation, and draw from the field of implementation science, to make evidence-based recommendations.
By analysing the perspectives of artifact creators, users, and reviewers, we identify several high-level challenges that affect the quality of artifacts including mismatched expectations between these groups, and a lack of sufficient reward for both creators and reviewers. Using diffusion of innovation as a framework, we analyse how these challenges relate to one another, and build an understanding of the factors that affect the sharing and success of artifacts. Finally, using principles from implementation science, we make evidence-based recommendations for specific sub-communities (e.g., students and postdocs, artifact evaluation committees, funding bodies, and professional organisations) to improve the quality of artifacts.

[362]  arXiv:2008.01050 [pdf, other]
Title: Implicit automata in typed $λ$-calculi II: streaming transducers vs categorical semantics
Comments: 89 pages, 22 figures
Subjects: Logic in Computer Science (cs.LO); Formal Languages and Automata Theory (cs.FL)

We characterize regular string transductions as programs in a linear $\lambda$-calculus with additives. One direction of this equivalence is proved by encoding copyless streaming string transducers (SSTs), which compute regular functions, into our $\lambda$-calculus. For the converse, we consider a categorical framework for defining automata and transducers over words, which allows us to relate register updates in SSTs to the semantics of the linear $\lambda$-calculus in a suitable monoidal closed category. To illustrate the relevance of monoidal closure to automata theory, we also leverage this notion to give abstract generalizations of the arguments showing that copyless SSTs may be determinized and that the composition of two regular functions may be implemented by a copyless SST. Our main result is then generalized from strings to trees using a similar approach. In doing so, we exhibit a connection between a feature of streaming tree transducers and the multiplicative/additive distinction of linear logic.
Keywords: MSO transductions, implicit complexity, Dialectica categories, Church encodings

[363]  arXiv:2008.01051 [pdf, other]
Title: Enhancing autonomy transparency: an option-centric rationale approach
Subjects: Human-Computer Interaction (cs.HC)

While the advances in artificial intelligence and machine learning empower a new generation of autonomous systems for assisting human performance, one major concern arises from the human factors perspective: Humans have difficulty deciphering autonomy-generated solutions and increasingly perceive autonomy as a mysterious black box. The lack of transparency contributes to the lack of trust in autonomy and sub-optimal team performance. To enhance autonomy transparency, this study proposed an option-centric rationale display and evaluated its effectiveness. We developed a game Treasure Hunter wherein a human uncovers a map for treasures with the help from an intelligent assistant, and conducted a human-in-the-loop experiment with 34 participants. Results indicated that by conveying the intelligent assistant's decision-making rationale via the option-centric rationale display, participants had higher trust in the system and calibrated their trust faster. Additionally, higher trust led to higher acceptance of recommendations from the intelligent assistant, and in turn higher task performance.

[364]  arXiv:2008.01053 [pdf, other]
Title: Towards Leveraging End-of-Life Tools as an Asset: Value Co-Creation based on Deep Learning in the Machining Industry
Comments: Proceedings of the 53rd Hawaii International Conference on System Sciences | 2020
Subjects: Other Computer Science (cs.OH); Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)

Sustainability is the key concept in the management of products that reached their end-of-life. We propose that end-of-life products have -- besides their value as recyclable assets -- additional value for producer and consumer. We argue this is especially true for the machining industry, where we illustrate an automatic characterization of worn cutting tools to foster value co-creation between tool manufacturer and tool user (customer) in the future. In the work at hand, we present a deep-learning-based computer vision system for the automatic classification of worn tools regarding flank wear and chipping. The resulting Matthews Correlation Coefficient of 0.878 and 0.644 confirms the feasibility of our system based on the VGG-16 network and Gradient Boosting. Based on these first results we derive a research agenda which addresses the need for a more holistic tool characterization by semantic segmentation and assesses the perceived business impact and usability by different user groups.
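For readers unfamiliar with the metric reported above, here is a minimal sketch of the standard binary Matthews Correlation Coefficient computed from confusion-matrix counts. The counts in the usage line are invented for illustration and are unrelated to the paper's results; returning 0.0 when the denominator vanishes is a common convention, assumed here.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Binary MCC from true/false positive/negative counts."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: report 0.0 when any marginal count is zero.
    return numerator / denominator if denominator else 0.0


# Hypothetical counts for a worn / not-worn classifier.
mcc = matthews_corrcoef(tp=45, tn=40, fp=10, fn=5)
```

The value ranges from -1 (total disagreement) through 0 (no better than chance) to 1 (perfect prediction), which makes it less sensitive to class imbalance than plain accuracy.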

[365]  arXiv:2008.01054 [pdf, other]
Title: Solving Cosserat Rod Models via Collocation and the Magnus Expansion
Comments: Accepted for publication in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2020
Subjects: Robotics (cs.RO)

Choosing a kinematic model for a continuum robot typically involves making a tradeoff between accuracy and computational complexity. One common modeling approach is to use the Cosserat rod equations, which have been shown to be accurate for many types of continuum robots. This approach, however, still presents significant computational cost, particularly when many Cosserat rods are coupled via kinematic constraints. In this work, we propose a numerical method that combines orthogonal collocation on the local rod curvature and forward integration of the Cosserat rod kinematic equations via the Magnus expansion, allowing the equilibrium shape to be written as a product of matrix exponentials. We provide a bound on the maximum step size to guarantee convergence of the Magnus expansion for the case of Cosserat rods, compare in simulation against other approaches, and demonstrate the tradeoffs between speed and accuracy for the fourth and sixth order Magnus expansions as well as for different numbers of collocation points. Our results show that the proposed method can find accurate solutions to the Cosserat rod equations and can potentially be competitive in computation speed.

[366]  arXiv:2008.01055 [pdf]
Title: Value driven Analysis Framework of Service Ecosystem Evolution Mechanism
Comments: 14 pages
Subjects: Other Computer Science (cs.OH)

With the development of cloud computing, service computing, IoT (Internet of Things) and mobile Internet, the diversity and sociality of services are increasingly apparent. To meet the customized user demands, Service Ecosystem is emerging as a complex social-technology system, which is formed with various IT services through cross-border integration. However, how to analyze and promote the evolution mechanism of service ecosystem is still a serious challenge in the field, which is of great significance to achieve the expected system evolution trends. Based on this, this paper proposes a value-driven analysis framework of service ecosystem, including value creation, value operation, value realization and value distribution. In addition, a computational experiment system is established to verify the effectiveness of the analysis framework, which simulates the effect of different operation strategies on the value network in the service ecosystem. The result shows that our analysis framework can provide new means and ideas for the analysis of service ecosystem evolution, and can also support the design of operation strategies.

[367]  arXiv:2008.01057 [pdf, other]
Title: Residual Frames with Efficient Pseudo-3D CNN for Human Action Recognition
Subjects: Computer Vision and Pattern Recognition (cs.CV)

Human action recognition is regarded as a key cornerstone in domains such as surveillance or video understanding. Despite recent progress in the development of end-to-end solutions for video-based action recognition, achieving state-of-the-art performance still requires using auxiliary hand-crafted motion representations, e.g., optical flow, which are usually computationally demanding. In this work, we propose to use residual frames (i.e., differences between adjacent RGB frames) as an alternative "lightweight" motion representation, which carries salient motion information and is computationally efficient. In addition, we develop a new pseudo-3D convolution module which decouples 3D convolution into 2D and 1D convolution. The proposed module exploits residual information in the feature space to better structure motions, and is equipped with a self-attention mechanism that assists to recalibrate the appearance and motion features. Empirical results confirm the efficiency and effectiveness of residual frames as well as the proposed pseudo-3D convolution module.
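The residual-frame idea described above (differences between adjacent RGB frames) can be sketched in a few lines. This is only an illustration using nested Python lists; the function name and the tiny clip are hypothetical, and whether the paper uses signed or absolute differences, or any normalisation, is not stated in the abstract, so the signed difference here is an assumption.

```python
def residual_frames(frames):
    """Return element-wise differences between each pair of adjacent frames.

    `frames` is a list of T frames, each an H x W x 3 nested list of ints.
    The result holds T - 1 residual frames of the same shape.
    """
    return [
        [[[c1 - c0 for c0, c1 in zip(p0, p1)]       # per colour channel
          for p0, p1 in zip(row0, row1)]            # per pixel in a row
         for row0, row1 in zip(prev, curr)]         # per row
        for prev, curr in zip(frames, frames[1:])   # per adjacent frame pair
    ]


# Three hypothetical 1 x 2 RGB frames: only the second pixel changes.
clip = [
    [[[10, 10, 10], [0, 0, 0]]],
    [[[10, 10, 10], [5, 0, 0]]],
    [[[10, 10, 10], [9, 2, 0]]],
]
residuals = residual_frames(clip)
```

Static pixels cancel to zero, so only the moving content survives, which is why such residuals can serve as a cheap stand-in for a motion representation.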

[368]  arXiv:2008.01059 [pdf, other]
Title: Improving One-stage Visual Grounding by Recursive Sub-query Construction
Comments: ECCV 2020
Subjects: Computer Vision and Pattern Recognition (cs.CV)

We improve one-stage visual grounding by addressing current limitations on grounding long and complex queries. Existing one-stage methods encode the entire language query as a single sentence embedding vector, e.g., taking the embedding from BERT or the hidden state from LSTM. This single vector representation is prone to overlooking the detailed descriptions in the query. To address this query modeling deficiency, we propose a recursive sub-query construction framework, which reasons between image and query for multiple rounds and reduces the referring ambiguity step by step. We show our new one-stage method obtains 5.0%, 4.5%, 7.5%, 12.8% absolute improvements over the state-of-the-art one-stage baseline on ReferItGame, RefCOCO, RefCOCO+, and RefCOCOg, respectively. In particular, superior performance on longer and more complex queries validates the effectiveness of our query modeling.

[369]  arXiv:2008.01062 [pdf, other]
Title: QPLEX: Duplex Dueling Multi-Agent Q-Learning
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Multiagent Systems (cs.MA); Machine Learning (stat.ML)

We explore value-based multi-agent reinforcement learning (MARL) in the popular paradigm of centralized training with decentralized execution (CTDE). CTDE requires the consistency of the optimal joint action selection with optimal individual action selections, which is called the IGM (Individual-Global-Max) principle. However, in order to achieve scalability, existing MARL methods either limit representation expressiveness of their value function classes or relax the IGM consistency, which may lead to poor policies or even divergence. This paper presents a novel MARL approach, called duPLEX dueling multi-agent Q-learning (QPLEX), that takes a duplex dueling network architecture to factorize the joint value function. This duplex dueling architecture transforms the IGM principle to easily realized constraints on advantage functions and thus enables efficient value function learning. Theoretical analysis shows that QPLEX solves a rich class of tasks. Empirical experiments on StarCraft II unit micromanagement tasks demonstrate that QPLEX significantly outperforms state-of-the-art baselines in both online and offline task settings, and also reveal that QPLEX achieves high sample efficiency and can benefit from offline datasets without additional exploration.
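To make the IGM (Individual-Global-Max) condition above concrete, the sketch below checks whether the greedy joint action of a joint value table coincides with the tuple of each agent's greedy individual action. It does not implement QPLEX's duplex dueling architecture; the additive joint value used as the positive example is a deliberately simple stand-in, and all names and numbers are hypothetical.

```python
from itertools import product


def satisfies_igm(q_joint, q_individual):
    """True if argmax of the joint Q equals the tuple of per-agent argmaxes."""
    greedy_joint = max(q_joint, key=q_joint.get)
    greedy_individual = tuple(
        max(range(len(q)), key=q.__getitem__) for q in q_individual
    )
    return greedy_joint == greedy_individual


# Two agents with two actions each (made-up individual values).
q_individual = [[1.0, 3.0], [2.0, 0.5]]

# Additive joint value (a simple stand-in, not QPLEX): IGM holds by construction.
q_additive = {
    a: sum(q[i] for q, i in zip(q_individual, a))
    for a in product(range(2), repeat=2)
}

# Raising one joint entry breaks consistency with the individual maximisers.
q_conflicting = dict(q_additive)
q_conflicting[(0, 0)] = 10.0
```

Here `satisfies_igm(q_additive, q_individual)` holds while `satisfies_igm(q_conflicting, q_individual)` does not, which shows the kind of inconsistency that decentralized greedy execution cannot recover from.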

[370]  arXiv:2008.01064 [pdf, other]
Title: Predicting What You Already Know Helps: Provable Self-Supervised Learning
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)

Self-supervised representation learning solves auxiliary prediction tasks (known as pretext tasks), which do not require labeled data, to learn semantic representations. These pretext tasks are created solely using the input features, such as predicting a missing image patch, recovering the color channels of an image from context, or predicting missing words, yet predicting this \emph{known} information helps in learning representations effective for downstream prediction tasks. This paper posits a mechanism based on conditional independence to formalize how solving certain pretext tasks can learn representations that provably decrease the sample complexity of downstream supervised tasks. Formally, we quantify how approximate independence between the components of the pretext task (conditional on the label and latent variables) allows us to learn representations that can solve the downstream task with drastically reduced sample complexity by just training a linear layer on top of the learned representation.

[371]  arXiv:2008.01065 [pdf, other]
Title: Memory-augmented Dense Predictive Coding for Video Representation Learning
Comments: ECCV2020, Spotlight
Subjects: Computer Vision and Pattern Recognition (cs.CV)

The objective of this paper is self-supervised learning from video, in particular for representations for action recognition. We make the following contributions: (i) We propose a new architecture and learning framework Memory-augmented Dense Predictive Coding (MemDPC) for the task. It is trained with a predictive attention mechanism over the set of compressed memories, such that any future states can always be constructed by a convex combination of the condensed representations, allowing it to make multiple hypotheses efficiently. (ii) We investigate visual-only self-supervised video representation learning from RGB frames, or from unsupervised optical flow, or both. (iii) We thoroughly evaluate the quality of the learnt representation on four different downstream tasks: action recognition, video retrieval, learning with scarce annotations, and unintentional action classification. In all cases, we demonstrate state-of-the-art or comparable performance over other approaches with orders of magnitude less training data.

[372]  arXiv:2008.01066 [pdf, ps, other]
Title: Multifidelity Data Fusion via Gradient-Enhanced Gaussian Process Regression
Subjects: Computational Engineering, Finance, and Science (cs.CE); Machine Learning (stat.ML)

We propose a data fusion method based on a multi-fidelity Gaussian process regression (GPR) framework. This method combines available data of the quantity of interest (QoI) and its gradients with different fidelity levels, namely, it is a Gradient-enhanced Cokriging method (GE-Cokriging). It provides the approximations of both the QoI and its gradients simultaneously with uncertainty estimates. We compare this method with the conventional multi-fidelity Cokriging method that does not use gradient information, and the result suggests that GE-Cokriging has a better performance in predicting both QoI and its gradients. Moreover, GE-Cokriging even shows better generalization results in some cases where Cokriging performs poorly due to the singularity of the covariance matrix. We demonstrate the application of GE-Cokriging in several practical cases including reconstructing the trajectories and velocity of an underdamped oscillator with respect to time simultaneously, and investigating the sensitivity of power factor of a load bus with respect to varying power inputs of a generator bus in a large scale power system. We also show that though the GE-Cokriging method requires a slightly higher computational cost than the Cokriging method, the result of accuracy comparison shows that this cost is usually worth it.

[373]  arXiv:2008.01068 [pdf, other]
Title: Unsupervised 3D Learning for Shape Analysis via Multiresolution Instance Discrimination
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)

Although unsupervised feature learning has demonstrated its advantages in reducing the workload of data labeling and network design in many fields, existing unsupervised 3D learning methods still cannot offer a generic network for various shape analysis tasks with competitive performance to supervised methods. In this paper, we propose an unsupervised method for learning a generic and efficient shape encoding network for different shape analysis tasks. The key idea of our method is to jointly encode and learn shape and point features from unlabeled 3D point clouds. For this purpose, we adapt HR-Net to octree-based convolutional neural networks for jointly encoding shape and point features with fused multiresolution subnetworks and design a simple-yet-efficient \emph{Multiresolution Instance Discrimination} (MID) loss for jointly learning the shape and point features. Our network takes a 3D point cloud as input and outputs both shape and point features. After training, the network is concatenated with simple task-specific back-end layers and fine-tuned for different shape analysis tasks. We evaluate the efficacy and generality of our method and validate our network and loss design with a set of shape analysis tasks, including shape classification, semantic shape segmentation, as well as shape registration tasks. With simple back-ends, our network demonstrates the best performance among all unsupervised methods and achieves competitive performance to supervised methods, especially in tasks with a small labeled dataset. For fine-grained shape segmentation, our method even surpasses existing supervised methods by a large margin.

[374]  arXiv:1204.6093 (cross-list from math.OC) [pdf, ps, other]
Title: Linear Consensus Algorithms Based on Balanced Asymmetric Chains
Comments: 15 pages
Subjects: Optimization and Control (math.OC); Systems and Control (eess.SY); Dynamical Systems (math.DS)

Multi-agent consensus algorithms with update steps based on so-called balanced asymmetric chains are analyzed. For such algorithms it is shown that (i) the set of accumulation points of states is finite, (ii) the asymptotic unconditional occurrence of single consensus or multiple consensuses is directly related to the property of absolute infinite flow for the underlying update chain. The results are applied to well known consensus models.

[375]  arXiv:1204.6624 (cross-list from math.DS) [pdf, ps, other]
Title: Theorems about Ergodicity and Class-Ergodicity of Chains with Applications in Known Consensus Models
Comments: 7 pages
Subjects: Dynamical Systems (math.DS); Systems and Control (eess.SY); Optimization and Control (math.OC)

In a multi-agent system, unconditional (multiple) consensus is the property of reaching (multiple) consensus irrespective of the instant and values at which states are initialized. For linear algorithms, occurrence of unconditional (multiple) consensus turns out to be equivalent to (class-) ergodicity of the transition chain (A_n). For a wide class of chains, chains with so-called balanced asymmetry property, necessary and sufficient conditions for ergodicity and class-ergodicity are derived. The results are employed to analyze the limiting behavior of agents' states in the JLM model, the Krause model, and the Cucker-Smale model. In particular, unconditional single or multiple consensus occurs in all three models. Moreover, a necessary and sufficient condition for unconditional consensus in the JLM model and a sufficient condition for consensus in the Cucker-Smale model are obtained.
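A linear consensus algorithm driven by a transition chain (A_n) can be illustrated with the update x(n+1) = A_n x(n), where each A_n is row-stochastic. The sketch below is a toy example only: the three-agent chain that alternates between two matrices, the initial states, and the function name are all assumptions chosen so that the behaviour is easy to follow, and it says nothing about the balanced asymmetry conditions studied in the paper.

```python
def consensus_step(A, x):
    """One linear update x(n+1) = A_n x(n), with A_n row-stochastic."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


# Hypothetical chain: agents 1 and 2 average on even steps,
# agents 2 and 3 average on odd steps.
A_even = [[0.5, 0.5, 0.0],
          [0.5, 0.5, 0.0],
          [0.0, 0.0, 1.0]]
A_odd = [[1.0, 0.0, 0.0],
         [0.0, 0.5, 0.5],
         [0.0, 0.5, 0.5]]

x = [0.0, 1.0, 4.0]  # made-up initial states
for n in range(200):
    x = consensus_step(A_even if n % 2 == 0 else A_odd, x)

# Gap between the largest and smallest state after 200 updates.
spread = max(x) - min(x)
```

Neither matrix alone connects all three agents, yet the alternating chain drives the states together. Because both matrices in this example happen to be doubly stochastic, the common limit is the average of the initial states, 5/3.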

[376]  arXiv:1303.6674 (cross-list from math.DS) [pdf, other]
Title: Consensus Algorithms and the Decomposition-Separation Theorem
Comments: 33 pages
Subjects: Dynamical Systems (math.DS); Systems and Control (eess.SY); Optimization and Control (math.OC)

Convergence properties of time inhomogeneous Markov chain based discrete and continuous time linear consensus algorithms are analyzed. Provided that a so-called infinite jet flow property is satisfied by the underlying chains, necessary conditions for both consensus and multiple consensus are established. A recent extension by Sonin of the classical Kolmogorov-Doeblin decomposition-separation for homogeneous Markov chains to the inhomogeneous case is then employed to show that the obtained necessary conditions are also sufficient when the chain is of Class P*, as defined by Touri and Nedic. It is also shown that Sonin's theorem leads to a rediscovery and generalization of most of the existing related consensus results in the literature.

[377]  arXiv:1409.7091 (cross-list from math.DS) [pdf, other]
Title: Eminence Grise Coalitions: On the Shaping of Public Opinion
Comments: 35 pages
Subjects: Dynamical Systems (math.DS); Systems and Control (eess.SY); Optimization and Control (math.OC)

We consider a network of evolving opinions. It includes multiple individuals with first-order opinion dynamics defined in continuous time and evolving based on a general exogenously defined time-varying underlying graph. In such a network, for an arbitrary fixed initial time, a subset of individuals forms an eminence grise coalition, abbreviated as EGC, if the individuals in that subset are capable of leading the entire network to agreeing on any desired opinion, through a cooperative choice of their own initial opinions. In this endeavor, the coalition members are assumed to have access to full profile of the underlying graph of the network as well as the initial opinions of all other individuals. While the complete coalition of individuals always qualifies as an EGC, we establish the existence of a minimum size EGC for an arbitrary time-varying network; also, we develop a non-trivial set of upper and lower bounds on that size. As a result, we show that, even when the underlying graph does not guarantee convergence to a global or multiple consensus, a generally restricted coalition of agents can steer public opinion towards a desired global consensus without affecting any of the predefined graph interactions, provided they can cooperatively adjust their own initial opinions. Geometric insights into the structure of EGC's are given. The results are also extended to the discrete time case where the relation with Decomposition-Separation Theorem is also made explicit.

[378]  arXiv:2008.00029 (cross-list from stat.ML) [pdf, other]
Title: Cold Posteriors and Aleatoric Uncertainty
Comments: 5 pages, 3 figures
Journal-ref: ICML workshop on Uncertainty and Robustness in Deep Learning (2020)
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)

Recent work has observed that one can outperform exact inference in Bayesian neural networks by tuning the "temperature" of the posterior on a validation set (the "cold posterior" effect). To help interpret this phenomenon, we argue that commonly used priors in Bayesian neural networks can significantly overestimate the aleatoric uncertainty in the labels on many classification datasets. This problem is particularly pronounced in academic benchmarks like MNIST or CIFAR, for which the quality of the labels is high. For the special case of Gaussian process regression, any positive temperature corresponds to a valid posterior under a modified prior, and tuning this temperature is directly analogous to empirical Bayes. On classification tasks, there is no direct equivalence between modifying the prior and tuning the temperature; however, reducing the temperature can lead to models which better reflect our belief that one gains little information by relabeling existing examples in the training set. Therefore, although cold posteriors do not always correspond to an exact inference procedure, we believe they may often better reflect our true prior beliefs.
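
For readers unfamiliar with the term, the "temperature" above is usually introduced through a tempered posterior. The form below is a common convention and an assumption here, not a formula quoted from the paper; the symbols $\theta$ (model parameters), $\mathcal{D}$ (training data) and $T$ (temperature) are illustrative.

```latex
p_T(\theta \mid \mathcal{D}) \;\propto\; \bigl( p(\mathcal{D} \mid \theta)\, p(\theta) \bigr)^{1/T},
\qquad T = 1 \text{ (Bayes posterior)}, \quad T < 1 \text{ (cold posterior)}.
```

Under this convention, raising a zero-mean Gaussian density with covariance $\Sigma$ to the power $1/T$ gives, after normalization, a Gaussian with covariance $T\Sigma$, which is one way to see why tempering in a fully Gaussian model can be read as a change of model rather than a departure from Bayesian inference.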

[379]  arXiv:2008.00052 (cross-list from math.AP) [pdf, ps, other]
Title: Online Prediction With History-Dependent Experts: The General Case
Subjects: Analysis of PDEs (math.AP); Computer Science and Game Theory (cs.GT); Machine Learning (cs.LG); Optimization and Control (math.OC)

We study the problem of prediction of binary sequences with expert advice in the online setting, which is a classic example of online machine learning. We interpret the binary sequence as the price history of a stock, and view the predictor as an investor, which converts the problem into a stock prediction problem. In this framework, an investor, who predicts the daily movements of a stock, and an adversarial market, who controls the stock, play against each other over $N$ turns. The investor combines the predictions of $n\geq 2$ experts in order to make a decision about how much to invest at each turn, and aims to minimize their regret with respect to the best-performing expert at the end of the game. We consider the problem with history-dependent experts, in which each expert uses the previous $d$ days of history of the market in making their predictions. We prove that the value function for this game, rescaled appropriately, converges as $N\to \infty$ at a rate of $O(N^{-1/6})$ to the viscosity solution of a nonlinear degenerate elliptic PDE, which can be understood as the Hamilton-Jacobi-Isaacs equation for the two-person game. As a result, we are able to deduce asymptotically optimal strategies for the investor. Our results extend those established by the first author and R. V. Kohn [13] for $n=2$ experts and $d\leq 4$ days of history.

[380]  arXiv:2008.00053 (cross-list from q-bio.NC) [pdf, other]
Title: Neural Network Degeneration and its Relationship to the Brain
Authors: Jacob Adamczyk
Subjects: Neurons and Cognition (q-bio.NC); Neural and Evolutionary Computing (cs.NE)

This report discusses the application of neural networks (NNs) as small segments of the brain. The networks representing the biological connectome are altered both spatially and temporally. The degradation techniques applied here are "weight degradation", "weight scrambling", and variable activation function. These methods aim to shed light on the study of neurodegenerative diseases such as Alzheimer's, Huntington's and Parkinson's disease as well as strokes and brain tumors disrupting the flow of information in the brain's network. Fundamental insights into memory loss and generalized learning dysfunction are gained by monitoring the network's error function during network degradation. The biological significance of each facet is also discussed.

[381]  arXiv:2008.00070 (cross-list from math.LO) [pdf, ps, other]
Title: Language Models for Some Extensions of the Lambek Calculus
Comments: Extended version of our WoLLIC 2019 paper. Submitted to Information and Computation (WoLLIC 2019 special issue)
Subjects: Logic (math.LO); Logic in Computer Science (cs.LO)

We investigate language interpretations of two extensions of the Lambek calculus: with additive conjunction and disjunction and with additive conjunction and the unit constant. For extensions with additive connectives, we show that conjunction and disjunction behave differently. Adding both of them leads to incompleteness due to the distributivity law. We show that with conjunction only no issues with distributivity arise. In contrast, there exists a corollary of the distributivity law in the language with disjunction only which is not derivable in the non-distributive system. Moreover, this difference remains valid for systems with permutation and/or weakening structural rules, that is, intuitionistic linear and affine logics and affine multiplicative-additive Lambek calculus. For the extension of the Lambek calculus with the unit constant, we present a calculus which reflects natural algebraic properties of the empty word. We do not claim completeness for this calculus, but we prove undecidability for the whole range of systems extending this minimal calculus and sound w.r.t. language models. As a corollary, we show that in the language with the unit there exists a sequent that is true if all variables are interpreted by regular languages, but not true in language models in general.

[382]  arXiv:2008.00107 (cross-list from eess.AS) [pdf, other]
Title: An Acoustic Segment Model Based Segment Unit Selection Approach to Acoustic Scene Classification with Partial Utterances
Comments: Accepted by Interspeech 2020
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)

In this paper, we propose a sub-utterance unit selection framework to remove acoustic segments in audio recordings that carry little information for acoustic scene classification (ASC). Our approach is built upon a universal set of acoustic segment units covering the overall acoustic scene space. First, those units are modeled with acoustic segment models (ASMs) used to tokenize acoustic scene utterances into sequences of acoustic segment units. Next, paralleling the idea of stop words in information retrieval, stop ASMs are automatically detected. Finally, acoustic segments associated with the stop ASMs are blocked, because of their low indexing power in retrieval of most acoustic scenes. In contrast to building scene models with whole utterances, the ASM-removed sub-utterances, i.e., acoustic utterances without stop acoustic segments, are then used as inputs to the AlexNet-L back-end for final classification. On the DCASE 2018 dataset, scene classification accuracy increases from 68%, with whole utterances, to 72.1%, with segment selection. This represents a competitive accuracy without any data augmentation, and/or ensemble strategy. Moreover, our approach compares favourably to AlexNet-L with attention.

[383]  arXiv:2008.00110 (cross-list from eess.AS) [pdf, other]
Title: Relational Teacher Student Learning with Neural Label Embedding for Device Adaptation in Acoustic Scene Classification
Comments: Accepted by Interspeech 2020
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)

In this paper, we propose a domain adaptation framework to address the device mismatch issue in acoustic scene classification, leveraging neural label embedding (NLE) and relational teacher student learning (RTSL). Taking into account the structural relationships between acoustic scene classes, our proposed framework captures such relationships, which are intrinsically device-independent. In the training stage, transferable knowledge is condensed in NLE from the source domain. Next, in the adaptation stage, a novel RTSL strategy is adopted to learn adapted target models without using the paired source-target data often required in conventional teacher student learning. The proposed framework is evaluated on the DCASE 2018 Task1b data set. Experimental results based on AlexNet-L deep classification models confirm the effectiveness of our proposed approach for mismatch situations. NLE-alone adaptation compares favourably with the conventional device adaptation and teacher student based adaptation techniques. NLE with RTSL further improves the classification accuracy.

[384]  arXiv:2008.00118 (cross-list from cond-mat.str-el) [pdf, other]
Title: Phases of two-dimensional spinless lattice fermions with first-quantized deep neural-network quantum states
Subjects: Strongly Correlated Electrons (cond-mat.str-el); Machine Learning (cs.LG); Quantum Physics (quant-ph)

First-quantized deep neural network techniques are developed for analyzing strongly coupled fermionic systems on the lattice. Using a Slater-Jastrow inspired ansatz which exploits deep residual networks with convolutional residual blocks, we approximately determine the ground state of spinless fermions on a square lattice with nearest-neighbor interactions. The flexibility of the neural-network ansatz results in a high level of accuracy when compared to exact diagonalization results on small systems, both for energy and correlation functions. On large systems, we obtain accurate estimates of the boundaries between metallic and charge ordered phases as a function of the interaction strength and the particle density.

[385]  arXiv:2008.00119 (cross-list from eess.IV) [pdf, other]
Title: CorrSigNet: Learning CORRelated Prostate Cancer SIGnatures from Radiology and Pathology Images for Improved Computer Aided Diagnosis
Comments: Accepted to MICCAI 2020
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)

Magnetic Resonance Imaging (MRI) is widely used for screening and staging prostate cancer. However, many prostate cancers have subtle features which are not easily identifiable on MRI, resulting in missed diagnoses and alarming variability in radiologist interpretation. Machine learning models have been developed in an effort to improve cancer identification, but current models localize cancer using MRI-derived features, while failing to consider the disease pathology characteristics observed on resected tissue. In this paper, we propose CorrSigNet, an automated two-step model that localizes prostate cancer on MRI by capturing the pathology features of cancer. First, the model learns MRI signatures of cancer that are correlated with corresponding histopathology features using Common Representation Learning. Second, the model uses the learned correlated MRI features to train a Convolutional Neural Network to localize prostate cancer. The histopathology images are used only in the first step to learn the correlated features. Once learned, these correlated features can be extracted from MRI of new patients (without histopathology or surgery) to localize cancer. We trained and validated our framework on a unique dataset of 75 patients with 806 slices who underwent MRI followed by prostatectomy surgery. We tested our method on an independent test set of 20 prostatectomy patients (139 slices, 24 cancerous lesions, 1.12M pixels) and achieved a per-pixel sensitivity of 0.81, specificity of 0.71, AUC of 0.86 and a per-lesion AUC of $0.96 \pm 0.07$, outperforming the current state-of-the-art accuracy in predicting prostate cancer using MRI.

[386]  arXiv:2008.00148 (cross-list from eess.IV) [pdf]
Title: Diabetic Retinopathy Diagnosis based on Convolutional Neural Network
Comments: 8 pages
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)

Diabetic retinopathy (DR) is a common disease that affects many people as a result of age or diabetes, and it can cause blindness. Diagnosing the disease, especially at an early stage, can therefore prevent its effects for many patients. To achieve this diagnosis, the retina must be examined continuously, so computer-aided tools based on computer vision techniques can be used in this field. Various works have applied different machine learning techniques. The convolutional neural network (CNN) is one of the promising methods, so it is used for diabetic retinopathy detection in this paper. The proposed work applies visual enhancement in the pre-processing phase, and the CNN model is then trained for the recognition and classification phase, to distinguish healthy from unhealthy retina images. Three public datasets, DiaretDB0, DiaretDB1 and DrimDB, were used in practical testing. The work was implemented in Matlab R2019a, using the Deep Learning Toolbox and Deep Network Designer to design the architecture of the convolutional neural network and train it. The results were evaluated with different metrics, accuracy being one of them. The best accuracy achieved was 100% for DiaretDB0, 99.495% for DiaretDB1 and 97.55% for DrimDB.

[387]  arXiv:2008.00167 (cross-list from physics.comp-ph) [pdf, other]
Title: DeePKS: a comprehensive data-driven approach towards chemically accurate density functional theory
Subjects: Computational Physics (physics.comp-ph); Machine Learning (cs.LG); Chemical Physics (physics.chem-ph)

We propose a general machine learning-based framework for building an accurate and widely-applicable energy functional within the framework of generalized Kohn-Sham density functional theory. To this end, we develop a way of training self-consistent models that are capable of taking large datasets from different systems and different kinds of labels. We demonstrate that the functional that results from this training procedure gives chemically accurate predictions on energy, force, dipole, and electron density for a large class of molecules. It can be continuously improved when more and more data are available.

[388]  arXiv:2008.00195 (cross-list from eess.IV) [pdf, other]
Title: Joint Generative Learning and Super-Resolution For Real-World Camera-Screen Degradation
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)

In the real-world single image super-resolution (SISR) task, the low-resolution image suffers from more complicated degradations than downsampling by unknown kernels alone. However, existing SISR methods are generally studied with synthetic low-resolution generation such as bicubic interpolation (BI), which greatly limits their performance. Recently, some researchers have investigated real-world SISR from the perspective of the camera and smartphone. However, besides the acquisition equipment, the display device also introduces more complicated degradations. In this paper, we focus on the camera-screen degradation and build a real-world dataset (Cam-ScreenSR), where HR images are original ground truths from the previous DIV2K dataset and the corresponding LR images are camera-captured versions of the HR images displayed on the screen. We conduct extensive experiments to demonstrate that involving more real degradations improves the generalization of SISR models. Moreover, we propose a joint two-stage model. First, the downsampling degradation GAN (DD-GAN) is trained to model the degradation and produce more varied LR images, which is validated to be efficient for data augmentation. Then the dual residual channel attention network (DuRCAN) learns to recover the SR image. A weighted combination of the L1 loss and the proposed Laplacian loss is applied to sharpen the high-frequency edges. Extensive experimental results on both typical synthetic and complicated real-world degradations validate that the proposed method outperforms existing SOTA models with fewer parameters, faster speed and better visual results. Moreover, on real captured photographs, our model also delivers the best visual quality, with sharper edges, fewer artifacts and, in particular, appropriate color enhancement, which has not been accomplished by previous methods.

[389]  arXiv:2008.00198 (cross-list from eess.AS) [pdf, other]
Title: Singer Identification Using Convolutional Acoustic Motif Embeddings
Comments: 5 pages
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)

Flamenco singing is characterized by pitch instability, micro-tonal ornamentations, large vibrato ranges, and a high degree of melodic variability. These musical features make the automatic identification of flamenco singers a difficult computational task. In this article we present an end-to-end pipeline for flamenco singer identification based on acoustic motif embeddings. In the approach taken, the fundamental frequency obtained directly from the raw audio signal is approximated. This approximation reduces the high variability of the audio signal and allows for small melodic patterns to be discovered using a sequential pattern mining technique, thus creating a dictionary of motifs. Several acoustic features are then used to extract fixed length embeddings of variable length motifs by using convolutional architectures. We test the quality of the embeddings in a flamenco singer identification task, comparing our approach with previous deep learning architectures, and study the effect of motivic patterns and acoustic features in the identification task. Results indicate that motivic patterns play a crucial role in identifying flamenco singers by minimizing the size of the signal to be learned, discarding information that is not relevant in the identification task. The deep learning architecture presented outperforms denser models used in large-scale audio classification problems.

[390]  arXiv:2008.00203 (cross-list from eess.AS) [pdf, other]
Title: Score-informed Networks for Music Performance Assessment
Comments: To appear at 21st International Society for Music Information Retrieval Conference, Montréal, Canada, 2020
Subjects: Audio and Speech Processing (eess.AS); Information Retrieval (cs.IR); Machine Learning (cs.LG)

The assessment of music performances in most cases takes into account the underlying musical score being performed. While there have been several automatic approaches for objective music performance assessment (MPA) based on extracted features from both the performance audio and the score, deep neural network-based methods incorporating score information into MPA models have not yet been investigated. In this paper, we introduce three different models capable of score-informed performance assessment. These are (i) a convolutional neural network that utilizes a simple time-series input comprising aligned pitch contours and score, (ii) a joint embedding model which learns a joint latent space for pitch contours and scores, and (iii) a distance matrix-based convolutional neural network which utilizes patterns in the distance matrix between pitch contours and musical score to predict assessment ratings. Our results provide insights into the suitability of different architectures and input representations and demonstrate the benefits of score-informed models as compared to score-independent models.
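
To make the input of model (iii) concrete, the sketch below builds a pairwise distance matrix between a performed pitch contour and a score pitch sequence. It is only an illustration: the abstract does not specify the distance function, pitch units or any normalization, so the absolute difference, the MIDI-style pitch values and the name `distance_matrix` are all assumptions.

```python
import numpy as np


def distance_matrix(pitch_contour, score_pitches):
    """Pairwise absolute pitch differences (hypothetical helper).

    Entry [i, j] is |pitch_contour[i] - score_pitches[j]|, so the result has
    one row per performance frame and one column per score frame.
    """
    p = np.asarray(pitch_contour, dtype=float)[:, None]  # shape (n_perf, 1)
    s = np.asarray(score_pitches, dtype=float)[None, :]  # shape (1, n_score)
    return np.abs(p - s)  # broadcasting yields shape (n_perf, n_score)


if __name__ == "__main__":
    # Two performance frames against three score frames (MIDI-like numbers).
    print(distance_matrix([60.0, 62.5], [60.0, 62.0, 64.0]))
```

A matrix of this kind can be treated as a single-channel image, which is presumably what makes a convolutional network a natural fit for it.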

[391]  arXiv:2008.00209 (cross-list from eess.AS) [pdf]
Title: Neural ODE with Temporal Convolution and Time Delay Neural Networks for Small-Footprint Keyword Spotting
Comments: 5 pages, 5 figures
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)

In this paper, we propose neural network models based on the neural ordinary differential equation (NODE) for small-footprint keyword spotting (KWS). We present techniques for applying NODE to KWS that make it possible to apply batch normalization to NODE-based networks and to reduce the number of computations during inference. Finally, we show that the number of model parameters of the proposed model is 68% smaller than that of the conventional KWS model.

[392]  arXiv:2008.00215 (cross-list from math.CO) [pdf, ps, other]
Title: Superregular matrices over small finite fields
Subjects: Combinatorics (math.CO); Information Theory (cs.IT); Number Theory (math.NT)

A trivially zero minor of a matrix is a minor having all its terms in the Leibniz formula equal to zero. A matrix is superregular if all of its minors that are not trivially zero are nonzero. In the area of Coding Theory, superregular matrices over finite fields are connected with codes with optimum distance properties. When a superregular matrix has all its entries nonzero, it is called full superregular and these matrices are used to construct Maximum Distance Separable block codes. In the context of convolutional codes, lower triangular Toeplitz superregular matrices are employed to build convolutional codes with optimal column distance. Although full superregular matrices over small fields are known (e.g. Cauchy matrices), the few known general constructions of these matrices having a lower triangular Toeplitz structure require very large field sizes. In this work we investigate lower triangular Toeplitz superregular matrices over small finite prime fields. Following the work of Hutchinson, Smarandache and Trumpf, we study the minimum number of different nontrivial minors that such a matrix has, and exhibit concrete constructions of superregular matrices of this kind.
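
The two definitions at the start of this abstract can be checked directly on small matrices. The following brute-force sketch applies them literally over a prime field GF(p); it is not the authors' method, it enumerates every minor and every permutation (so it is only usable for very small sizes), and all function names are hypothetical. It needs Python 3.8 or later for the modular inverse `pow(x, -1, p)`.

```python
from itertools import combinations, permutations


def det_mod_p(rows, p):
    """Determinant of a square matrix modulo a prime p (Gaussian elimination)."""
    m = [[x % p for x in r] for r in rows]
    n = len(m)
    det = 1
    for c in range(n):
        pivot = next((r for r in range(c, n) if m[r][c]), None)
        if pivot is None:
            return 0  # no pivot in this column: singular modulo p
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            det = -det  # a row swap flips the sign
        det = det * m[c][c] % p
        inv = pow(m[c][c], -1, p)  # modular inverse of the pivot
        for r in range(c + 1, n):
            f = m[r][c] * inv % p
            if f:
                m[r] = [(a - f * b) % p for a, b in zip(m[r], m[c])]
    return det % p


def trivially_zero(sub):
    """True if every term of the Leibniz formula contains a zero entry."""
    n = len(sub)
    return not any(
        all(sub[i][s[i]] for i in range(n)) for s in permutations(range(n))
    )


def is_superregular(a, p):
    """Check that every minor that is not trivially zero is nonzero mod p."""
    n = len(a)
    a = [[x % p for x in row] for row in a]
    for k in range(1, n + 1):
        for rows in combinations(range(n), k):
            for cols in combinations(range(n), k):
                sub = [[a[i][j] for j in cols] for i in rows]
                if not trivially_zero(sub) and det_mod_p(sub, p) == 0:
                    return False
    return True


def lower_toeplitz(first_col):
    """Lower triangular Toeplitz matrix with the given first column."""
    n = len(first_col)
    return [[first_col[i - j] if i >= j else 0 for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    print(is_superregular(lower_toeplitz([1, 1, 2]), 5))
    print(is_superregular(lower_toeplitz([1, 1, 1]), 5))
```

For a 3x3 lower triangular Toeplitz matrix with first column (a0, a1, a2), the only minor that is neither trivially zero nor a product of entries is a1^2 - a0*a2, which is why (1, 1, 1) fails the check while (1, 1, 2) passes over GF(5).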

[393]  arXiv:2008.00216 (cross-list from quant-ph) [pdf, other]
Title: Faster Schrödinger-style simulation of quantum circuits
Comments: 12 pages, 9 figures, 3 tables
Subjects: Quantum Physics (quant-ph); Emerging Technologies (cs.ET)

Recent demonstrations of superconducting quantum computers by Google and IBM and trapped-ion computers from IonQ fueled new research in quantum algorithms, compilation into quantum circuits, and empirical algorithmics. While online access to quantum hardware remains too limited to meet the demand, simulating quantum circuits on conventional computers satisfies many needs. We advance Schrödinger-style simulation of quantum circuits that is useful standalone and as a building block in layered simulation algorithms; both cases are illustrated in our results. Our algorithmic contributions show how to simulate multiple quantum gates at once, how to avoid floating-point multiplies, how to best use instruction-level and thread-level parallelism as well as CPU cache, and how to leverage these optimizations by reordering circuit gates. While not described previously, these techniques implemented by us supported published high-performance distributed simulations up to 64 qubits. To show additional impact, we benchmark our simulator against Microsoft, IBM and Google simulators on hard circuits from Google.

[394]  arXiv:2008.00239 (cross-list from eess.IV) [pdf, other]
Title: Exploring Multi-Scale Feature Propagation and Communication for Image Super Resolution
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)

Multi-scale techniques have achieved great success in a wide range of computer vision tasks. However, while this technique is incorporated in existing works, a comprehensive investigation of variants of multi-scale convolution in image super resolution is still lacking. In this work, we present a unified formulation over widely-used multi-scale structures. With this framework, we systematically explore the two factors of multi-scale convolution -- feature propagation and cross-scale communication. Based on the investigation, we propose a generic and efficient multi-scale convolution unit -- Multi-Scale cross-Scale Share-weights convolution (MS$^3$-Conv). Extensive experiments demonstrate that the proposed MS$^3$-Conv can achieve better SR performance than the standard convolution with fewer parameters and lower computational cost. Beyond quantitative analysis, we comprehensively study the visual quality, which shows that MS$^3$-Conv recovers high-frequency details better.

[395]  arXiv:2008.00250 (cross-list from eess.SP) [pdf, ps, other]
Title: Deep Reinforcement Learning Based Mobile Edge Computing for Intelligent Internet of Things
Subjects: Signal Processing (eess.SP); Machine Learning (cs.LG); Systems and Control (eess.SY)

In this paper, we investigate mobile edge computing (MEC) networks for intelligent internet of things (IoT), where multiple users have some computational tasks assisted by multiple computational access points (CAPs). By offloading some tasks to the CAPs, the system performance can be improved through reducing the latency and energy consumption, which are the two important metrics of interest in the MEC networks. We design the system with an intelligent offloading strategy based on a deep reinforcement learning algorithm. In this algorithm, Deep Q-Network is used to automatically learn the offloading decision in order to optimize the system performance, and a neural network (NN) is trained to predict the offloading action, where the training data is generated from the environmental system. Moreover, we employ bandwidth allocation in order to optimize the wireless spectrum for the links between the users and CAPs, where several bandwidth allocation schemes are proposed. Furthermore, we use CAP selection in order to choose the best CAP to assist the computational tasks from the users. Simulation results are finally presented to show the effectiveness of the proposed reinforcement learning offloading strategy. In particular, the system cost of latency and energy consumption can be reduced significantly by the proposed deep reinforcement learning based algorithm.

[396]  arXiv:2008.00252 (cross-list from math.OC) [pdf, other]
Title: Distributed Nonconvex Optimization: Oracle-free Iterations and Globally Optimal Solution
Subjects: Optimization and Control (math.OC); Systems and Control (eess.SY)

Distributed optimization is concerned with using local computation and communication to realize a global aim of optimizing the sum of local objective functions. It has gained wide attention for a variety of applications in networked systems. This paper addresses a class of constrained distributed nonconvex optimization problems involving univariate objective functions, aiming to achieve global optimization with a simple iteration rule not requiring local oracle queries (i.e., evaluations of gradients or function values). We propose a novel algorithm named CPCA, exploiting the notion of combining Chebyshev polynomial approximation, average consensus and polynomial optimization. The proposed algorithm is i) able to yield $\epsilon$ globally optimal solutions for any arbitrarily small given tolerance $\epsilon$, ii) efficient in terms of both oracle complexities and inter-agent communication costs, and iii) terminable in a distributed manner when the specified precision requirement is met. The key insight is to use polynomial approximations to substitute for general objectives, and then solve an easier approximate version of the original problem. Due to the nice analytic properties of polynomials, this approximation not only facilitates efficient global optimization, but also allows the proposed algorithm's consensus-based iteration structure to be free from local oracle queries. We provide a comprehensive analysis of the accuracy and complexities of the proposed algorithm.
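
The three ingredients named in this abstract (Chebyshev approximation, average consensus, polynomial optimization) can be illustrated with a toy simulation. This is a simplified sketch and not the CPCA algorithm itself: it uses a fixed polynomial degree, runs consensus for a fixed number of rounds with a hand-picked doubly stochastic weight matrix, has no distributed stopping rule, and minimizes the polynomial by finding roots of its derivative, which may differ from the paper's polynomial optimization step. All function names are hypothetical.

```python
import numpy as np
from numpy.polynomial import Chebyshev


def local_coeffs(f, degree, domain):
    """Chebyshev coefficients of one agent's local objective on `domain`."""
    return Chebyshev.interpolate(f, degree, domain=domain).coef


def average_consensus(values, weights, iterations):
    """Repeated averaging x <- W x; rows tend to the mean for doubly stochastic W."""
    x = np.array(values, dtype=float)
    for _ in range(iterations):
        x = weights @ x
    return x


def minimize_on_interval(coef, domain):
    """Global minimizer of a Chebyshev series on a closed interval."""
    dom = np.asarray(domain, dtype=float)
    # Drop numerically negligible trailing coefficients before root finding.
    p = Chebyshev(coef, domain=dom).trim(tol=1e-12)
    crit = p.deriv().roots()
    crit = crit[np.abs(crit.imag) < 1e-9].real  # keep real critical points
    crit = crit[(crit >= dom[0]) & (crit <= dom[1])]
    candidates = np.concatenate([crit, dom])  # also test both endpoints
    return candidates[np.argmin(p(candidates))]


if __name__ == "__main__":
    domain = [-3.0, 3.0]
    objectives = [lambda x: (x - 1.0) ** 2, lambda x: (x + 2.0) ** 2, lambda x: x ** 2]
    coeffs = [local_coeffs(f, 4, domain) for f in objectives]
    W = np.array([[0.5, 0.25, 0.25], [0.25, 0.5, 0.25], [0.25, 0.25, 0.5]])
    agreed = average_consensus(coeffs, W, iterations=60)
    print([minimize_on_interval(row, domain) for row in agreed])
```

Because interpolation is linear in the function, averaging coefficient vectors is the same as interpolating the average objective, so after consensus each agent holds a proxy for the global objective and can optimize it locally without further function or gradient evaluations.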

[397]  arXiv:2008.00263 (cross-list from q-bio.MN) [pdf, ps, other]
Title: Signal metrics analysis of oscillatory patterns in bacterial multi-omic networks
Comments: 8 pages, 5 figures, 3 algorithms, journal paper
Subjects: Molecular Networks (q-bio.MN); Computational Engineering, Finance, and Science (cs.CE)

Motivation: One of the branches of Systems Biology is focused on a deep understanding of underlying regulatory networks through the analysis of biomolecular oscillations and their interplay. Synthetic Biology exploits gene and/or protein regulatory networks towards the design of oscillatory networks for producing useful compounds. Therefore, at different levels of application and for different purposes, the study of biomolecular oscillations can lead to different clues about the mechanisms underlying living cells. It is known that network-level interactions involve more than one type of biomolecule as well as biological processes operating at multiple omic levels. By combining network/pathway-level information with genetic information, it is possible to describe well-understood or unknown bacterial mechanisms and organism-specific dynamics. Results: Network multi-omic integration has led to the discovery of interesting oscillatory signals. Following the methodologies used in signal processing and communication engineering, a new methodology is introduced to identify and quantify the extent of the multi-omic oscillations of the signal. New signal metrics are designed to allow further biotechnological explanations and provide important clues about the oscillatory nature of the pathways and their regulatory circuits. Our algorithms designed for the analysis of multi-omic signals are tested and validated on 11 different bacteria for thousands of multi-omic signals perturbed at the network level by different experimental conditions. Information on the order of genes, codon usage, gene expression, and protein molecular weight is integrated at three different functional levels. Oscillations show interesting evidence that network-level multi-omic signals present a synchronized response to perturbations and evolutionary relations along with taxa.

[398]  arXiv:2008.00264 (cross-list from eess.AS) [pdf, other]
Title: DCCRN: Deep Complex Convolution Recurrent Network for Phase-Aware Speech Enhancement
Comments: Accepted by Interspeech 2020
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)

Speech enhancement has benefited from the success of deep learning in terms of intelligibility and perceptual quality. Conventional time-frequency (TF) domain methods focus on predicting TF-masks or speech spectrum, via a naive convolutional neural network (CNN) or recurrent neural network (RNN). Some recent studies use a complex-valued spectrogram as a training target but train in a real-valued network, predicting the magnitude and phase component or real and imaginary part, respectively. In particular, the convolution recurrent network (CRN) integrates a convolutional encoder-decoder (CED) structure and long short-term memory (LSTM), which has been proven to be helpful for complex targets. In order to train the complex target more effectively, in this paper, we design a new network structure simulating the complex-valued operation, called Deep Complex Convolution Recurrent Network (DCCRN), where both CNN and RNN structures can handle complex-valued operation. The proposed DCCRN models are very competitive with previous networks on either objective or subjective metrics. With only 3.7M parameters, our DCCRN models submitted to the Interspeech 2020 Deep Noise Suppression (DNS) challenge ranked first for the real-time track and second for the non-real-time track in terms of Mean Opinion Score (MOS).
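
The idea of "simulating the complex-valued operation" with real-valued building blocks can be illustrated on a single 1-D convolution. The sketch below only demonstrates the underlying arithmetic identity (a + ib)(c + id) = (ac - bd) + i(ad + bc) applied to convolutions; it is not the DCCRN layer, and the function name `complex_conv1d` is hypothetical.

```python
import numpy as np


def complex_conv1d(x_re, x_im, w_re, w_im):
    """Complex convolution assembled from four real-valued convolutions."""
    y_re = np.convolve(x_re, w_re) - np.convolve(x_im, w_im)  # real part
    y_im = np.convolve(x_re, w_im) + np.convolve(x_im, w_re)  # imaginary part
    return y_re, y_im


if __name__ == "__main__":
    x = np.array([1 + 2j, 3 - 1j, 0.5j])
    w = np.array([2 - 1j, 1j])
    y_re, y_im = complex_conv1d(x.real, x.imag, w.real, w.imag)
    # The result should agree with NumPy's native complex convolution.
    print(np.allclose(y_re + 1j * y_im, np.convolve(x, w)))
```

Keeping the real and imaginary parts as separate real tensors is what lets such an operation be built from standard real-valued layers.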

[399]  arXiv:2008.00268 (cross-list from math.CO) [pdf, ps, other]
Title: Big Ramsey degrees of 3-uniform hypergraphs are finite
Comments: 9 pages
Subjects: Combinatorics (math.CO); Discrete Mathematics (cs.DM); Logic (math.LO)

Generalizing the passing number construction by Sauer, we give a short proof of the fact that the universal homogeneous 3-uniform hypergraph has finite big Ramsey degrees. Our proof is based on the vector (or product) form of Milliken's tree theorem and demonstrates a general method to carry existing results on structures in binary relational languages to higher arities.

[400]  arXiv:2008.00280 (cross-list from physics.app-ph) [pdf, other]
Title: Analytical Modeling and Design of Gallium Oxide Schottky Barrier Diodes Beyond Unipolar Figure of Merit Using High-k Dielectric Superjunction Structures
Subjects: Applied Physics (physics.app-ph); Systems and Control (eess.SY)

This work presents the design of a beta-Ga2O3 Schottky barrier diode using a high-k dielectric superjunction to significantly enhance the breakdown voltage vs on-resistance trade-off beyond its already high unipolar figure of merit. The device parameters are optimized using both TCAD simulations and analytical modeling based on the conformal mapping technique. The dielectric superjunction structure is found to be highly sensitive to the device dimensions and the dielectric constant of the insulator. The aspect ratio, which is the ratio of the length to the width of the drift region, is found to be the most important parameter in designing the structure, and the proposed approach only works for aspect ratios much greater than one. The width of the dielectric layer and the dielectric constant also play a crucial role in improving the device properties and are optimized to achieve the maximum figure of merit. Using the optimized structure with an aspect ratio of 10 and a dielectric constant of 300, the structure is predicted to surpass the beta-Ga2O3 unipolar figure of merit by four times, indicating the promise of such structures for exceptional FOM vertical power electronics.

[401]  arXiv:2008.00315 (cross-list from stat.OT) [pdf, other]
Title: A fresh look at introductory data science
Subjects: Other Statistics (stat.OT); Computers and Society (cs.CY)

The proliferation of vast quantities of available datasets that are large and complex in nature has challenged universities to keep up with the demand for graduates trained in both the statistical and the computational sets of skills required to effectively plan, acquire, manage, analyze, and communicate the findings of such data. To keep up with this demand, attracting students early on to data science as well as providing them a solid foray into the field becomes increasingly important. We present a case study of an introductory undergraduate course in data science that is designed to address these needs. Offered at Duke University, this course has no pre-requisites and serves a wide audience of aspiring statistics and data science majors as well as humanities, social sciences, and natural sciences students. We discuss the unique set of challenges posed by offering such a course and, in light of these challenges, we present a detailed discussion of the pedagogical design elements, content, structure, computational infrastructure, and assessment methodology of the course. We also offer a repository containing all teaching materials that are open-source, along with supplemental materials and the R code for reproducing the figures found in the paper.

[402]  arXiv:2008.00316 (cross-list from quant-ph) [pdf, other]
Title: Order from chaos in quantum walks on cyclic graphs
Comments: 5 pages, 7 figures
Subjects: Quantum Physics (quant-ph); Disordered Systems and Neural Networks (cond-mat.dis-nn); Neural and Evolutionary Computing (cs.NE); Quantum Algebra (math.QA); Chaotic Dynamics (nlin.CD)

It has been shown classically that combining two chaotic random walks can yield an ordered (periodic) walk. Our aim in this paper is to find a quantum analog for this rather counter-intuitive result. We study the chaotic and periodic nature of cyclic quantum walks and focus on a unique situation wherein a periodic quantum walk on a 3-cycle graph is generated via a deterministic combination of two chaotic quantum walks on the same graph.

[403]  arXiv:2008.00323 (cross-list from stat.ML) [pdf, other]
Title: Convergence of Sparse Variational Inference in Gaussian Processes Regression
Comments: Extended version of this http URL (arXiv version: arXiv:1903.03571). Published in Journal of Machine Learning Research: this http URL. Code available at: this https URL
Journal-ref: Journal of Machine Learning Research, 21(131), 1-63 (2020)
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)

Gaussian processes are distributions over functions that are versatile and mathematically convenient priors in Bayesian modelling. However, their use is often impeded for data with large numbers of observations, $N$, due to the cubic (in $N$) cost of matrix operations used in exact inference. Many solutions have been proposed that rely on $M \ll N$ inducing variables to form an approximation at a cost of $\mathcal{O}(NM^2)$. While the computational cost appears linear in $N$, the true complexity depends on how $M$ must scale with $N$ to ensure a certain quality of the approximation. In this work, we investigate upper and lower bounds on how $M$ needs to grow with $N$ to ensure high quality approximations. We show that we can make the KL-divergence between the approximate model and the exact posterior arbitrarily small for a Gaussian-noise regression model with $M\ll N$. Specifically, for the popular squared exponential kernel and $D$-dimensional Gaussian distributed covariates, $M=\mathcal{O}((\log N)^D)$ suffices and a method with an overall computational cost of $\mathcal{O}(N(\log N)^{2D}(\log\log N)^2)$ can be used to perform inference.
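
To get a feel for the growth rates quoted above, the following sketch tabulates $(\log N)^D$, $N(\log N)^{2D}(\log\log N)^2$ and $N^3$ for a few values of $N$. All hidden constants are set to 1, which is an assumption made purely for illustration (the abstract only gives asymptotic orders), and the helper names are hypothetical.

```python
import math


def inducing_points(n, d, c=1.0):
    """Illustrative M = c * (log N)^D, rounded up; c = 1 is an assumed constant."""
    return math.ceil(c * math.log(n) ** d)


def sparse_cost(n, d):
    """N * (log N)^(2D) * (log log N)^2 with all constant factors dropped."""
    return n * math.log(n) ** (2 * d) * math.log(math.log(n)) ** 2


def exact_cost(n):
    """Cubic cost of exact inference with all constant factors dropped."""
    return n ** 3


D = 2  # assumed covariate dimension for this example
for n in (10**3, 10**4, 10**5, 10**6):
    print(f"N={n:>8}  M~{inducing_points(n, D):>4}  "
          f"sparse~{sparse_cost(n, D):.3g}  exact~{exact_cost(n):.3g}")
```

The numbers are only order-of-magnitude guides; the actual constants depend on the kernel parameters and the target approximation quality.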

[404]  arXiv:2008.00422 (cross-list from stat.ML) [pdf, other]
Title: Rule-based Bayesian regression
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Methodology (stat.ME)

We introduce a novel rule-based approach for handling regression problems. The new methodology carries elements from two frameworks: (i) it provides information about the uncertainty of the parameters of interest using Bayesian inference, and (ii) it allows the incorporation of expert knowledge through rule-based systems. The blending of those two different frameworks can be particularly beneficial for various domains (e.g. engineering), where, even though the significance of uncertainty quantification motivates a Bayesian approach, there is no simple way to incorporate researcher intuition into the model. We validate our models by applying them to synthetic applications: a simple linear regression problem and two more complex structures based on partial differential equations. Finally, we review the advantages of our methodology, which include the simplicity of the implementation, the uncertainty reduction due to the added information and, on some occasions, the derivation of better point predictions, and we address limitations, mainly from the computational complexity perspective, such as the difficulty in choosing an appropriate algorithm and the added computational burden.

[405]  arXiv:2008.00431 (cross-list from eess.SP) [pdf, other]
Title: Contact Classification in COVID-19 Tracing
Subjects: Signal Processing (eess.SP); Cryptography and Security (cs.CR)

The present paper addresses the task of reliably identifying critical contacts by using COVID-19 tracing apps. A reliable classification is crucial to ensure a high level of protection, and at the same time to prevent many people from being sent to quarantine by the app. Tracing apps are based on the capabilities of current smartphones to enable the broadest possible availability. Existing capabilities of smartphones include the exchange of Bluetooth Low Energy (BLE) signals and of audio signals, as well as the use of gyroscopes and magnetic sensors. The Bluetooth power measurements, which are often used today, may be complemented by audio ranging and attitude estimation in the future. Smartphones are worn in different ways, often in pockets and bags, which makes the propagation of signals and thus the classification rather unpredictable. Relying on the cooperation of users to wear their phones hanging from their necks would change the situation considerably. In this case the performance achievable with BLE and audio measurements becomes predictable. Our analysis identifies parameters that result in accurate warnings, at least within the scope of validity of the models. A significant reduction of the spreading of the disease can then be achieved by the apps, without causing many people to unduly go to quarantine. The present paper is the first of three papers which analyze the situation in some detail.

[406]  arXiv:2008.00466 (cross-list from quant-ph) [pdf, other]
Title: Complexity continuum within Ising formulation of NP problems
Comments: 11 pages, 4 figures
Subjects: Quantum Physics (quant-ph); Statistical Mechanics (cond-mat.stat-mech); Computational Complexity (cs.CC); Emerging Technologies (cs.ET); Computational Physics (physics.comp-ph)

A promising approach to achieve computational supremacy over the classical von Neumann architecture explores classical and quantum hardware as Ising machines. The minimisation of the Ising Hamiltonian is known to be an NP-hard problem for certain interaction matrix classes, yet not all problem instances are equivalently hard to optimise. We propose to identify computationally simple instances with an 'optimisation simplicity criterion'. Such optimisation simplicity can be found for a wide range of models from spin glasses to k-regular maximum cut problems. Many optical, photonic, and electronic systems are neuromorphic architectures that can naturally operate to optimise problems satisfying this criterion and, therefore, such problems are often chosen to illustrate the computational advantages of new Ising machines. We further probe an intermediate complexity for sparse and dense models by analysing circulant coupling matrices that can be 'rewired' to introduce greater complexity. A compelling approach for distinguishing easy and hard instances within the same NP-hard class of problems can be a starting point in developing a standardised procedure for the performance evaluation of emerging physical simulators and physics-inspired algorithms.
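
For readers unfamiliar with the setup, here is a minimal sketch of Ising energy minimisation on a circulant coupling matrix, solved by exhaustive search on a tiny instance. The sign convention $H(s) = -\frac{1}{2}\sum_{i,j} J_{ij} s_i s_j$, the 6-spin antiferromagnetic ring, and all function names are assumptions for this example; it does not implement the paper's simplicity criterion or its rewiring procedure.

```python
from itertools import product


def circulant(first_row):
    """Build a circulant matrix: row i is the first row shifted right by i."""
    n = len(first_row)
    return [[first_row[(j - i) % n] for j in range(n)] for i in range(n)]


def ising_energy(J, s):
    """H(s) = -1/2 * sum_{i,j} J[i][j] * s_i * s_j (assumed sign convention)."""
    n = len(s)
    return -0.5 * sum(J[i][j] * s[i] * s[j] for i in range(n) for j in range(n))


def brute_force_ground_state(J):
    """Try all 2^n spin configurations; only feasible for very small n."""
    n = len(J)
    return min(product((-1, 1), repeat=n), key=lambda s: ising_energy(J, s))


# Hypothetical instance: antiferromagnetic nearest-neighbour ring on 6 spins.
J = circulant([0, -1, 0, 0, 0, -1])
ground = brute_force_ground_state(J)
energy = ising_energy(J, ground)
print(ground, energy)
```

The exhaustive search scales as $2^n$, which is exactly why physical Ising machines and heuristics are of interest for larger instances.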

[407]  arXiv:2008.00492 (cross-list from math.AT) [pdf, ps, other]
Title: Extendability of simplicial maps is undecidable
Authors: A. Skopenkov
Comments: 11 pages, 2 figures
Subjects: Algebraic Topology (math.AT); Computational Geometry (cs.CG)

We present a short proof of the Čadek-Krčál-Matoušek-Vokřínek-Wagner result from the title (in the following form due to Filakovský-Wagner-Zhechev).
For any fixed integer $l>1$ there is no algorithm recognizing the extendability of the identity map of $S^l\vee S^l$ to a PL map $X\to S^l\vee S^l$ of a given $2l$-dimensional simplicial complex $X$ containing a subdivision of $S^l\vee S^l$ as a given subcomplex.
We also exhibit a gap in the Filakovský-Wagner-Zhechev proof that embeddability of complexes is undecidable in codimension $>1$.

[408]  arXiv:2008.00499 (cross-list from eess.IV) [pdf, other]
Title: Multi-level Wavelet-based Generative Adversarial Network for Perceptual Quality Enhancement of Compressed Video
Journal-ref: 16th European conference on computer vision. 2020 Aug 23
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)

The past few years have witnessed fast development in video quality enhancement via deep learning. Existing methods mainly focus on enhancing the objective quality of compressed video while ignoring its perceptual quality. In this paper, we focus on enhancing the perceptual quality of compressed video. Our main observation is that enhancing the perceptual quality mostly relies on recovering high-frequency sub-bands in the wavelet domain. Accordingly, we propose a novel generative adversarial network (GAN) based on the multi-level wavelet packet transform (WPT) to enhance the perceptual quality of compressed video, which is called multi-level wavelet-based GAN (MW-GAN). In MW-GAN, we first apply motion compensation with a pyramid architecture to obtain temporal information. Then, we propose a wavelet reconstruction network with wavelet-dense residual blocks (WDRB) to recover the high-frequency details. In addition, the adversarial loss of MW-GAN is added via WPT to further encourage the recovery of high-frequency details for video frames. Experimental results demonstrate the superiority of our method.
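
To make the notion of sub-bands in a multi-level wavelet packet transform concrete, the sketch below decomposes a 1-D signal with the Haar wavelet, splitting every band (low and high) again at each level. The choice of the Haar wavelet, the 1-D setting, and the function names are assumptions for illustration; the paper works on video frames and does not specify its filters in the abstract.

```python
def haar_step(x):
    """One orthonormal Haar analysis step: returns (low, high) half-length bands."""
    s = 2 ** 0.5
    low = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return low, high


def wavelet_packet(x, levels):
    """Wavelet packet transform: unlike the plain DWT, every band is split again."""
    bands = [list(x)]
    for _ in range(levels):
        next_bands = []
        for band in bands:
            next_bands.extend(haar_step(band))
        bands = next_bands
    return bands


# Hypothetical signal: flat first half, rapidly oscillating second half.
signal = [4.0, 4.0, 4.0, 4.0, 1.0, 7.0, 1.0, 7.0]
bands = wavelet_packet(signal, levels=2)
for index, band in enumerate(bands):
    print(index, [round(v, 3) for v in band])
```

The oscillating part of the signal shows up in bands derived from the high-pass branch, which is the kind of detail a perceptual enhancement method would aim to restore.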

[409]  arXiv:2008.00545 (cross-list from eess.AS) [pdf, other]
Title: Cross-Domain Adaptation of Spoken Language Identification for Related Languages: The Curious Case of Slavic Languages
Comments: Accepted for oral presentation in INTERSPEECH 2020
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL)

State-of-the-art spoken language identification (LID) systems, which are based on end-to-end deep neural networks, have shown remarkable success not only in discriminating between distant languages but also between closely-related languages or even different spoken varieties of the same language. However, it is still unclear to what extent neural LID models generalize to speech samples with different acoustic conditions due to domain shift. In this paper, we present a set of experiments to investigate the impact of domain mismatch on the performance of neural LID systems for a subset of six Slavic languages across two domains (read speech and radio broadcast) and examine two low-level signal descriptors (spectral and cepstral features) for this task. Our experiments show that (1) out-of-domain speech samples severely hinder the performance of neural LID models, and (2) while both spectral and cepstral features show comparable performance within-domain, spectral features show more robustness under domain mismatch. Moreover, we apply unsupervised domain adaptation to minimize the discrepancy between the two domains in our study. We achieve relative accuracy improvements that range from 9% to 77% depending on the diversity of acoustic conditions in the source domain.

[410]  arXiv:2008.00561 (cross-list from math.CO) [pdf, other]
Title: Tree pivot-minors and linear rank-width
Comments: 25 pages, 5 figures. An extended abstract of this paper appeared in the proceedings of EuroComb 2019
Subjects: Combinatorics (math.CO); Discrete Mathematics (cs.DM)

Tree-width and its linear variant path-width play a central role for the graph minor relation. In particular, Robertson and Seymour (1983) proved that for every tree $T$, the class of graphs that do not contain $T$ as a minor has bounded path-width. For the pivot-minor relation, rank-width and linear rank-width take over the role from tree-width and path-width. As such, it is natural to examine if for every tree $T$, the class of graphs that do not contain $T$ as a pivot-minor has bounded linear rank-width. We first prove that this statement is false whenever $T$ is a tree that is not a caterpillar. We conjecture that the statement is true if $T$ is a caterpillar. We are also able to give partial confirmation of this conjecture by proving: (1) for every tree $T$, the class of $T$-pivot-minor-free distance-hereditary graphs has bounded linear rank-width if and only if $T$ is a caterpillar; (2) for every caterpillar $T$ on at most four vertices, the class of $T$-pivot-minor-free graphs has bounded linear rank-width. To prove our second result, we only need to consider $T=P_4$ and $T=K_{1,3}$, but we follow a general strategy: first we show that the class of $T$-pivot-minor-free graphs is contained in some class of $(H_1,H_2)$-free graphs, which we then show to have bounded linear rank-width. In particular, we prove that the class of $(K_3,S_{1,2,2})$-free graphs has bounded linear rank-width, which strengthens a known result that this graph class has bounded rank-width.

[411]  arXiv:2008.00565 (cross-list from stat.ML) [pdf, other]
Title: Geometrically Enriched Latent Spaces
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)

A common assumption in generative models is that the generator immerses the latent space into a Euclidean ambient space. Instead, we consider the ambient space to be a Riemannian manifold, which allows for encoding domain knowledge through the associated Riemannian metric. Shortest paths can then be defined accordingly in the latent space to both follow the learned manifold and respect the ambient geometry. Through careful design of the ambient metric we can ensure that shortest paths are well-behaved even for deterministic generators that otherwise would exhibit a misleading bias. Experimentally we show that our approach improves interpretability of learned representations both using stochastic and deterministic generators.

[412]  arXiv:2008.00569 (cross-list from math.DS) [pdf, other]
Title: On Frink's type metrization of weighted graphs
Comments: 13 pages, 3 figures
Subjects: Dynamical Systems (math.DS); Machine Learning (cs.LG); Analysis of PDEs (math.AP); General Topology (math.GN)

Using the technique of the metrization theorem of uniformities with countable bases, in this note we provide, test and compare an explicit algorithm to produce a metric $d(x,y)$ between the vertices $x$ and $y$ of an affinity weighted undirected graph.

[413]  arXiv:2008.00572 (cross-list from eess.SP) [pdf, other]
Title: All-Digital FPGA-based DAC with None or Few External Components
Comments: Submitted to IEEE Transactions on Circuits and Systems I; 9 pages, 13 figures
Subjects: Signal Processing (eess.SP); Systems and Control (eess.SY)

One of the many limitations of mixed-signal design is physically testing circuit ideas. While it is easier to test digital circuits with FPGAs, this usually cannot be done with mixed-signal circuits. Although some FPGAs have built-in analog-to-digital and digital-to-analog converters, regular commercial FPGA development boards and low-cost FPGAs lack built-in data converters. Here we introduce an all-digital FPGA-based DAC, which is one of the main blocks to enable mixed-signal experiments. The DAC can be synthesized entirely in an FPGA and does not require the use of external components. Furthermore, to extend its range of applications, a discussion regarding the proposed DAC's problems and possible solutions is presented. Experimental demonstration of a 4-bit and a 5-bit DAC corroborates the theoretical analysis developed in this work. This work also suggests a scheme which includes a few external resistors to improve the linearity (DNL $\leq$ 0.25 LSB and INL $\leq$ 0.5 LSB) and the power consumption (5X improvement over the standalone configuration).
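
For context on the linearity figures quoted above, the sketch below computes differential and integral nonlinearity from a list of DAC output levels using the common endpoint-fit definitions. The measured values are invented for the example, and the paper may use a different fitting convention; `dnl_inl` is a hypothetical helper name.

```python
def dnl_inl(levels):
    """Endpoint-fit DNL and INL, both in LSB units, from output levels per code."""
    n = len(levels)
    lsb = (levels[-1] - levels[0]) / (n - 1)  # average step size
    # DNL: deviation of each actual step from one ideal LSB.
    dnl = [(levels[k + 1] - levels[k]) / lsb - 1 for k in range(n - 1)]
    # INL: deviation of each level from the straight line through the endpoints.
    inl = [(levels[k] - (levels[0] + k * lsb)) / lsb for k in range(n)]
    return dnl, inl


# Hypothetical output voltages of a 2-bit DAC (codes 0..3), in volts.
measured = [0.0, 0.9, 2.1, 3.0]
dnl, inl = dnl_inl(measured)
print("DNL:", [round(v, 3) for v in dnl])
print("INL:", [round(v, 3) for v in inl])
print("max |DNL|:", round(max(abs(v) for v in dnl), 3))
print("max |INL|:", round(max(abs(v) for v in inl), 3))
```

A specification such as DNL $\leq$ 0.25 LSB is then a bound on the largest absolute value in the first list.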

[414]  arXiv:2008.00573 (cross-list from math.CO) [pdf, other]
Title: On the degree sequences of dual graphs on surfaces
Subjects: Combinatorics (math.CO); Discrete Mathematics (cs.DM)

Given two graphs $G$ and $G^*$ with a one-to-one correspondence between their edges, when do $G$ and $G^*$ form a pair of dual graphs realizing the vertices and countries of a map embedded in a surface? A criterion was obtained by Jack Edmonds in 1965. Furthermore, let $\boldsymbol{d}=(d_1,\ldots,d_n)$ and $\boldsymbol{t}=(t_1,\ldots,t_m)$ be their degree sequences. Then, clearly, $\sum_{i=1}^n d_i = \sum_{j=1}^m t_j = 2\ell$, where $\ell$ is the number of edges in each of the two graphs, and $\chi = n - \ell + m$ is the Euler characteristic of the surface. Which sequences $\boldsymbol{d}$ and $\boldsymbol{t}$ satisfying these conditions still cannot be realized as the degree sequences? We make use of Edmonds' criterion to obtain several infinite series of exceptions for the sphere, $\chi = 2$, and projective plane, $\chi = 1$. We conjecture that there exist no exceptions for $\chi \leq 0$.
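
The necessary conditions stated above are easy to check mechanically. The sketch below verifies $\sum d_i = \sum t_j = 2\ell$ and evaluates $\chi = n - \ell + m$ for two well-known maps; the function name is hypothetical. Passing this check does not imply realizability, which is precisely the question the paper studies.

```python
def euler_characteristic(d, t):
    """Return chi = n - l + m if the degree sums agree and are even, else None."""
    if sum(d) != sum(t) or sum(d) % 2 != 0:
        return None
    edges = sum(d) // 2  # l: number of edges shared by G and its dual
    return len(d) - edges + len(t)


# Cube on the sphere: 8 vertices of degree 3, 6 square countries.
cube = euler_characteristic([3] * 8, [4] * 6)

# K_7 triangulating the torus: 7 vertices of degree 6, 14 triangular countries.
k7_torus = euler_characteristic([6] * 7, [3] * 14)

# Mismatched degree sums cannot come from a pair of dual graphs.
mismatch = euler_characteristic([3, 3, 3], [4, 4])

print(cube, k7_torus, mismatch)
```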

[415]  arXiv:2008.00605 (cross-list from eess.IV) [pdf, other]
Title: The Rate-Distortion-Accuracy Tradeoff: JPEG Case Study
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)

Handling digital images is almost always accompanied by lossy compression in order to facilitate efficient transmission and storage. This introduces an unavoidable tension between the allocated bit-budget (rate) and the faithfulness of the resulting image to the original one (distortion). An additional complicating consideration is the effect of the compression on recognition performance by given classifiers (accuracy). This work aims to explore this rate-distortion-accuracy tradeoff. As a case study, we focus on the design of the quantization tables in the JPEG compression standard. We offer a novel optimal tuning of these tables via continuous optimization, leveraging a differential implementation of both the JPEG encoder-decoder and an entropy estimator. This enables us to offer a unified framework that considers the interplay between rate, distortion and classification accuracy. On all these fronts, we report a substantial boost in performance by a simple and easily implemented modification of these tables.

[416]  arXiv:2008.00613 (cross-list from eess.AS) [pdf, other]
Title: Exploiting Deep Sentential Context for Expressive End-to-End Speech Synthesis
Comments: Accepted by Interspeech2020
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)

Attention-based seq2seq text-to-speech systems, especially those that use self-attention networks (SAN), have achieved state-of-the-art performance. But an expressive corpus with rich prosody is still challenging to model because 1) prosodic aspects, which span different sentential granularities and mainly determine acoustic expressiveness, are difficult to quantize and label and 2) the current seq2seq framework extracts prosodic information solely from a text encoder, which easily collapses to an averaged expression for expressive contents. In this paper, we propose a context extractor, which is built upon a SAN-based text encoder, to sufficiently exploit the sentential context over an expressive corpus for seq2seq-based TTS. Our context extractor first collects prosody-related sentential context information from different SAN layers and then aggregates it to learn a comprehensive sentence representation to enhance the expressiveness of the final generated speech. Specifically, we investigate two methods of context aggregation: 1) direct aggregation, which directly concatenates the outputs of different SAN layers, and 2) weighted aggregation, which uses multi-head attention to automatically learn contributions for different SAN layers. Experiments on two expressive corpora show that our approach can produce more natural speech with much richer prosodic variations, and that weighted aggregation is superior in modeling expressivity.

[417]  arXiv:2008.00616 (cross-list from eess.AS) [pdf, other]
Title: Multitask learning for instrument activation aware music source separation
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG)

Music source separation is a core task in music information retrieval which has seen a dramatic improvement in the past years. Nevertheless, most of the existing systems focus exclusively on the problem of source separation itself and ignore the utilization of other, possibly related, MIR tasks which could lead to additional quality gains. In this work, we propose a novel multitask structure to investigate using instrument activation information to improve source separation performance. Furthermore, we investigate our system on six independent instruments, a more realistic scenario than the three instruments included in the widely-used MUSDB dataset, by leveraging a combination of the MedleyDB and Mixing Secrets datasets. The results show that our proposed multitask model outperforms the baseline Open-Unmix model on the mixture of the Mixing Secrets and MedleyDB datasets while maintaining comparable performance on the MUSDB dataset.

[418]  arXiv:2008.00620 (cross-list from eess.AS) [pdf, ps, other]
Title: Audiovisual Speech Synthesis using Tacotron2
Comments: This work has been submitted to the IEEE transactions on Multimedia for possible publication
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)

Audiovisual speech synthesis is the problem of synthesizing a talking face while maximizing the coherency of the acoustic and visual speech. In this paper, we propose and compare two audiovisual speech synthesis systems for 3D face models. The first system is the AVTacotron2, which is an end-to-end text-to-audiovisual speech synthesizer based on the Tacotron2 architecture. AVTacotron2 converts a sequence of phonemes representing the sentence to synthesize into a sequence of acoustic features and the corresponding controllers of a face model. The output acoustic features are used to condition a WaveRNN to reconstruct the speech waveform, and the output facial controllers are used to generate the corresponding video of the talking face. The second audiovisual speech synthesis system is modular, where acoustic speech is synthesized from text using the traditional Tacotron2. The reconstructed acoustic speech signal is then used to drive the facial controls of the face model using an independently trained audio-to-facial-animation neural network. We further condition both the end-to-end and modular approaches on emotion embeddings that encode the required prosody to generate emotional audiovisual speech. We analyze the performance of the two systems and compare them to the ground truth videos using subjective evaluation tests. The end-to-end and modular systems are able to synthesize close to human-like audiovisual speech with mean opinion scores (MOS) of 4.1 and 3.9, respectively, compared to a MOS of 4.1 for the ground truth generated from professionally recorded videos. While the end-to-end system gives a better overall quality, the modular approach is more flexible, and the quality of its acoustic speech synthesis and that of its visual speech synthesis are almost independent of each other.

[419]  arXiv:2008.00651 (cross-list from physics.soc-ph) [pdf, ps, other]
Title: Effective Self-Healing Networks against Attacks or Disasters in Resource Allocation Control
Comments: 7 pages, 6 figures, 2 tables, Proc. of 12th Int. Conf. on Adaptive and Self-Adaptive Systems and Applications
Subjects: Physics and Society (physics.soc-ph); Social and Information Networks (cs.SI); Adaptation and Self-Organizing Systems (nlin.AO)

With increasing threats from large attacks or disasters, the time has come to reconstruct network infrastructures such as communication or transportation systems rather than to recover them as before in case of accidents, because many real networks are extremely vulnerable. Thus, we consider self-healing mechanisms by rewirings (reuse or addition of links) to obtain sustainable and resilient networks even against malicious attacks. In the distributed local process for healing, the key strategies are the extension of candidates of linked nodes and enhancing loops by applying a message-passing algorithm inspired by statistical physics. Simulation results show that our proposed combination of ring formation and enhancing loops is particularly effective in comparison with the conventional methods when more than half of the damaged links are alive or are compensated from reserved ones.

[420]  arXiv:2008.00691 (cross-list from quant-ph) [pdf, other]
Title: Quantum versus Classical Generative Modelling in Finance
Comments: 17 Pages, 19 Figures
Subjects: Quantum Physics (quant-ph); Machine Learning (cs.LG)

Finding a concrete use case for quantum computers in the near term is still an open question, with machine learning typically touted as one of the first fields which will be impacted by quantum technologies. In this work, we investigate and compare the capabilities of quantum versus classical models for the task of generative modelling in machine learning. We use a real world financial dataset consisting of correlated currency pairs and compare two models in their ability to learn the resulting distribution: a restricted Boltzmann machine and a quantum circuit Born machine. We provide extensive numerical results indicating that the simulated Born machine always at least matches the performance of the Boltzmann machine in this task, and demonstrates superior performance as the model scales. We perform experiments on both simulated and physical quantum chips using the Rigetti Forest platform, and are also able to partially train the largest instance to date of a quantum circuit Born machine on quantum hardware. Finally, by studying the entanglement capacity of the training Born machines, we find that entanglement typically plays a role in the problem instances which demonstrate an advantage over the Boltzmann machine.

[421]  arXiv:2008.00702 (cross-list from eess.AS) [pdf, other]
Title: Multimodal Semi-supervised Learning Framework for Punctuation Prediction in Conversational Speech
Comments: Accepted for Interspeech 2020
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL)

In this work, we explore a multimodal semi-supervised learning approach for punctuation prediction by learning representations from large amounts of unlabelled audio and text data. Conventional approaches in speech processing typically use forced alignment to encode per-frame acoustic features to word-level features and perform multimodal fusion of the resulting acoustic and lexical representations. As an alternative, we explore attention-based multimodal fusion and compare its performance with forced-alignment-based fusion. Experiments conducted on the Fisher corpus show that our proposed approach achieves ~6-9% and ~3-4% absolute improvement (F1 score) over the baseline BLSTM model on reference transcripts and ASR outputs, respectively. We further improve the model's robustness to ASR errors by performing data augmentation with N-best lists, which achieves up to an additional ~2-6% improvement on ASR outputs. We also demonstrate the effectiveness of the semi-supervised learning approach by performing an ablation study on various sizes of the corpus. When trained on 1 hour of speech and text data, the proposed model achieved ~9-18% absolute improvement over the baseline model.

[422]  arXiv:2008.00705 (cross-list from quant-ph) [pdf, other]
Title: Certified Randomness From Steering Using Sequential Measurements
Comments: 35 pages, 9 Figures. This is a pre-published extended version of a workshop edition which appeared in the proceedings of PC 2018 (EPTCS 273, 2018, pp. 14-26). The published version of this work is available below
Journal-ref: Cryptography 2019, 3(4)
Subjects: Quantum Physics (quant-ph); Cryptography and Security (cs.CR)

The generation of certifiable randomness is one of the most promising applications of quantum technologies. Furthermore, the intrinsic non-locality of quantum correlations allows us to certify randomness in a device-independent way, i.e. one need not make assumptions about the devices used. Due to the work of Curchod et al., a single entangled two-qubit pure state can be used to produce arbitrary amounts of certified randomness. However, obtaining this randomness is experimentally challenging as it requires a large number of measurements, both projective and general. Motivated by these difficulties in the device-independent setting, we instead consider the scenario of one-sided device independence where certain devices are trusted, and others not; a scenario motivated by asymmetric experimental set-ups such as ion-photon networks. We show how certain aspects of previous work can be adapted to this scenario and provide theoretical bounds on the amount of randomness which can be certified. Furthermore, we give a protocol for unbounded randomness certification in this scenario, and provide numerical results demonstrating the protocol in the ideal case. Finally, we numerically test the possibility of implementing this scheme on near-term quantum technologies, by considering the performance of the protocol on several physical platforms.

[423]  arXiv:2008.00713 (cross-list from quant-ph) [pdf, other]
Title: Exploiting degeneracy to construct good ternary quantum error correcting code
Subjects: Quantum Physics (quant-ph); Emerging Technologies (cs.ET)

Quantum error-correcting codes for higher dimensional systems can, in general, be directly constructed from the codes for qubit systems. What remains unknown is whether there exist efficient code design techniques for higher dimensional systems. In this paper, we propose a 7-qutrit error-correcting code for the ternary quantum system and show that this design formulation has no equivalent in qubit systems. This code is optimum in the number of qutrits required to correct a single error while maintaining the CSS structure. This degenerate CSS code can (i) correct up to seven simultaneous phase errors and a single bit error and (ii) correct two simultaneous bit errors on pre-defined pairs of qutrits on eighteen out of twenty-one possible pairs; moreover, (iii) in terms of the cost of implementation, the depth of the circuit of this code is only two more than that of the ternary Steane code. Our proposed code shows that it is possible to design better codes explicitly for ternary quantum systems instead of simply carrying over codes from binary quantum systems.

[424]  arXiv:2008.00731 (cross-list from eess.AS) [pdf]
Title: Unsupervised Discovery of Recurring Speech Patterns Using Probabilistic Adaptive Metrics
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL)

Unsupervised spoken term discovery (UTD) aims at finding recurring segments of speech in a corpus of acoustic speech data. One potential approach to this problem is to use dynamic time warping (DTW) to find well-aligning patterns in the speech data. However, automatic selection of initial candidate segments for the DTW alignment and detection of "sufficiently good" alignments among those require some type of pre-defined criteria, often operationalized as threshold parameters for pair-wise distance metrics between signal representations. In existing UTD systems, the optimal hyperparameters may differ across datasets, limiting their applicability to new corpora and truly low-resource scenarios. In this paper, we propose a novel probabilistic approach to DTW-based UTD named PDTW. In PDTW, distributional characteristics of the processed corpus are utilized for adaptive evaluation of alignment quality, thereby enabling systematic discovery of pattern pairs whose similarity exceeds what would be expected by coincidence. We test PDTW on the Zero Resource Speech Challenge 2017 datasets as part of the 2020 implementation of the challenge. The results show that the system performs consistently on all five tested languages using fixed hyperparameters, clearly outperforming the earlier DTW-based system in terms of coverage of the detected patterns.
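
As a rough illustration of the DTW alignment cost that the abstract above builds on, the following is a minimal sketch of classic dynamic time warping between two one-dimensional sequences. It is not the PDTW method of the paper: the function name `dtw_distance`, the absolute-difference local cost, and the scalar inputs are assumptions made for illustration only.

```python
import math


def dtw_distance(a, b):
    """Classic DTW cost between two 1-D sequences (illustrative sketch only).

    Assumption: the local cost is the absolute difference |a[i] - b[j]|.
    Real UTD systems compare frame-level feature vectors instead.
    """
    n, m = len(a), len(b)
    # acc[i][j] holds the cheapest cost of aligning a[:i] with b[:j].
    acc = [[math.inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            acc[i][j] = local + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return acc[n][m]
```

For example, `dtw_distance([1, 2, 3], [1, 2, 2, 3])` evaluates to 0, because the repeated sample can be absorbed by the warping path. A threshold on such a cost is the kind of dataset-dependent hyperparameter the abstract describes.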

[425]  arXiv:2008.00756 (cross-list from eess.AS) [pdf, other]
Title: Structure and Automatic Segmentation of Dhrupad Vocal Bandish Audio
Comments: Part of this work published in ISMIR 2020
Subjects: Audio and Speech Processing (eess.AS); Information Retrieval (cs.IR); Machine Learning (cs.LG)

A Dhrupad vocal concert comprises a composition section that is interspersed with improvised episodes of increased rhythmic activity involving the interaction between the vocals and the percussion. Tracking the changing rhythmic density, in relation to the underlying metric tempo of the piece, thus facilitates the detection and labeling of the improvised sections in the concert structure. This work concerns the automatic detection of the musically relevant rhythmic densities as they change in time across the bandish (composition) performance. An annotated dataset of Dhrupad bandish concert sections is presented. We investigate a CNN-based system, trained to detect local tempo relationships, and follow it with temporal smoothing. We also employ audio source separation as a pre-processing step to the detection of the individual surface densities of the vocals and the percussion. This helps us obtain the complete musical description of the concert sections in terms of capturing the changing rhythmic interaction of the two performers.

[426]  arXiv:2008.00768 (cross-list from eess.AS) [pdf, other]
Title: One Model, Many Languages: Meta-learning for Multilingual Text-to-Speech
Comments: Accepted to INTERSPEECH 2020; for the source files, see this https URL
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG)

We introduce an approach to multilingual speech synthesis which uses the meta-learning concept of contextual parameter generation and produces natural-sounding multilingual speech using more languages and less training data than previous approaches. Our model is based on Tacotron 2 with a fully convolutional input text encoder whose weights are predicted by a separate parameter generator network. To boost voice cloning, the model uses an adversarial speaker classifier with a gradient reversal layer that removes speaker-specific information from the encoder.
We arranged two experiments to compare our model with baselines using various levels of cross-lingual parameter sharing, in order to evaluate: (1) stability and performance when training on low amounts of data, (2) pronunciation accuracy and voice quality of code-switching synthesis. For training, we used the CSS10 dataset and our new small dataset based on Common Voice recordings in five languages. Our model is shown to effectively share information across languages and according to a subjective evaluation test, it produces more natural and accurate code-switching speech than the baselines.

[427]  arXiv:2008.00781 (cross-list from eess.AS) [pdf, other]
Title: MusiCoder: A Universal Music-Acoustic Encoder Based on Transformers
Comments: 12 pages, submitted to MMM2021
Subjects: Audio and Speech Processing (eess.AS); Multimedia (cs.MM)

Music annotation has always been one of the critical topics in the field of Music Information Retrieval (MIR). Traditional models use supervised learning for music annotation tasks. However, as supervised machine learning approaches increase in complexity, the increasing need for more annotated training data can often not be matched with available data. Moreover, over-reliance on labeled data when training supervised learning models can lead to unexpected results and open vulnerabilities for adversarial attacks. In this paper, a new self-supervised music acoustic representation learning approach named MusiCoder is proposed. Inspired by the success of BERT, MusiCoder builds upon the architecture of self-attention bidirectional transformers. Two pre-training objectives, including Contiguous Frames Masking (CFM) and Contiguous Channels Masking (CCM), are designed to adapt BERT-like masked reconstruction pre-training to continuous acoustic frame domain. The performance of MusiCoder is evaluated in two downstream music annotation tasks. The results show that MusiCoder outperforms the state-of-the-art models in both music genre classification and auto-tagging tasks. The effectiveness of MusiCoder indicates a great potential of a new self-supervised learning approach to understand music: first apply masked reconstruction tasks to pre-train a transformer-based model with massive unlabeled music acoustic data, and then finetune the model on specific downstream tasks with labeled data.

[428]  arXiv:2008.00802 (cross-list from eess.IV) [pdf, other]
Title: Multi-Scale Deep Compressive Imaging
Comments: 12 pages, 11 figures
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)

Recently, deep learning-based compressive imaging (DCI) has surpassed conventional compressive imaging in both reconstruction quality and running time. While multi-scale sampling has shown superior performance over single-scale sampling, research in DCI has been limited to single-scale sampling. Despite training with single-scale images, DCI tends to favor low-frequency components, similar to conventional multi-scale sampling, especially at low subrates. From this perspective, it would be easier for the network to learn multi-scale features with a multi-scale sampling architecture. In this work, we propose a multi-scale deep compressive imaging (MS-DCI) framework which jointly learns to decompose, sample, and reconstruct images at multiple scales. A three-phase end-to-end training scheme is introduced, with an initial and two enhanced reconstruction phases, to demonstrate the efficiency of multi-scale sampling and further improve the reconstruction performance. We analyze the decomposition methods (including Pyramid, Wavelet, and Scale-space), sampling matrices, and measurements, and show the empirical benefit of MS-DCI, which consistently outperforms both conventional and deep learning-based approaches.

[429]  arXiv:2008.00816 (cross-list from eess.AS) [pdf, other]
Title: Evolving Multi-Resolution Pooling CNN for Monaural Singing Voice Separation
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)

Monaural Singing Voice Separation (MSVS) is a challenging task and has been studied for decades. Deep neural networks (DNNs) are the current state-of-the-art methods for MSVS. However, the existing DNNs are often designed manually, which is time-consuming and error-prone. In addition, the network architectures are usually pre-defined, and not adapted to the training data. To address these issues, we introduce a Neural Architecture Search (NAS) method to the structure design of DNNs for MSVS. Specifically, we propose a new multi-resolution Convolutional Neural Network (CNN) framework for MSVS namely Multi-Resolution Pooling CNN (MRP-CNN), which uses various-size pooling operators to extract multi-resolution features. Based on the NAS, we then develop an evolving framework namely Evolving MRP-CNN (E-MRP-CNN), by automatically searching the effective MRP-CNN structures using genetic algorithms, optimized in terms of a single-objective considering only separation performance, or multi-objective considering both the separation performance and the model complexity. The multi-objective E-MRP-CNN gives a set of Pareto-optimal solutions, each providing a trade-off between separation performance and model complexity. Quantitative and qualitative evaluations on the MIR-1K and DSD100 datasets are used to demonstrate the advantages of the proposed framework over several recent baselines.

[430]  arXiv:2008.00817 (cross-list from eess.IV) [pdf, other]
Title: Retinal Image Segmentation with a Structure-Texture Demixing Network
Comments: Accepted to MICCAI 2020
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)

Retinal image segmentation plays an important role in automatic disease diagnosis. This task is very challenging because complex structure and texture information are mixed in a retinal image, and distinguishing between them is difficult. Existing methods handle texture and structure jointly, which may bias models toward recognizing textures and thus result in inferior segmentation performance. To address this, we propose a segmentation strategy that seeks to separate structure and texture components and significantly improves the performance. To this end, we design a structure-texture demixing network (STD-Net) that can process structures and textures differently and better. Extensive experiments on two retinal image segmentation tasks (i.e., blood vessel segmentation, optic disc and cup segmentation) demonstrate the effectiveness of the proposed method.

[431]  arXiv:2008.00826 (cross-list from physics.soc-ph) [pdf, other]
Title: A Generalized SIS Epidemic Model on Temporal Networks with Asymptomatic Carriers and Comments on Decay Ratio
Subjects: Physics and Society (physics.soc-ph); Social and Information Networks (cs.SI); Systems and Control (eess.SY)

We study the class of SIS epidemics on temporal networks and propose a new activity-driven and adaptive epidemic model that captures the impact of asymptomatic and infectious individuals in the network. In the proposed model, referred to as the A-SIYS epidemic, each node can be in one of three possible states: susceptible, infected without symptoms (asymptomatic), or infected with symptoms (symptomatic). Both asymptomatic and symptomatic individuals are infectious. We show that the proposed A-SIYS epidemic captures several well-established epidemic models as special cases and, by resorting to mean-field approximations, obtain sufficient conditions under which the disease is eradicated.
In addition, we highlight a potential inaccuracy in the derivation of the upper bound on the decay ratio in the activity-driven adaptive SIS (A-SIS) model in (Ogura et al., 2019) and present a more general version of their result. We numerically illustrate the evolution of the fraction of infected nodes in the A-SIS epidemic model and show that the bound in (Ogura et al., 2019) often fails to capture the behavior of the epidemic, in contrast with our results.

[432]  arXiv:2008.00864 (cross-list from eess.IV) [pdf, other]
Title: Intensity-only Mode Decomposition on Multimode Fibers using a Densely Connected Convolutional Network
Comments: 17 pages
Subjects: Image and Video Processing (eess.IV); Systems and Control (eess.SY)

The use of multimode fibers offers advantages in the field of communication technology in terms of transferable information density and information security. For applications using physical layer security or mode division multiplexing, the complex transmission matrix must be known. To measure the transmission matrix, the individual modes of the multimode fiber are excited sequentially at the input and a mode decomposition is performed at the output. Mode decomposition is usually performed using digital holography, which requires a reference wave and entails considerable effort. To overcome these drawbacks, a neural network is proposed, which performs mode decomposition with intensity-only camera recordings of the multimode fiber facet. Due to the high computational complexity of the problem, this approach has so far been limited to 6 modes. In this work, it is shown for the first time that a DenseNet with 121 layers can overcome the limit of 6 modes. The advance is demonstrated experimentally by a mode decomposition with 10 modes. The training process is based on synthetic data. The proposed method is quantitatively compared to the conventional approach with digital holography. In addition, it is shown that the network can perform mode decomposition on a 55-mode fiber, which also supports modes unknown to the neural network. The smart detection using a DenseNet opens new ways for the application of multimode fibers in optical communication networks for physical layer security.

[433]  arXiv:2008.00889 (cross-list from eess.AS) [pdf, other]
Title: Speaker dependent articulatory-to-acoustic mapping using real-time MRI of the vocal tract
Comments: 5 pages, accepted for publication at Interspeech 2020
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD); Image and Video Processing (eess.IV)

Articulatory-to-acoustic (forward) mapping is a technique to predict speech using various articulatory acquisition techniques (e.g. ultrasound tongue imaging, lip video). Real-time MRI (rtMRI) of the vocal tract has not been used before for this purpose. The advantage of MRI is that it has a high 'relative' spatial resolution: it can capture not only lingual, labial and jaw motion, but also the velum and the pharyngeal region, which is typically not possible with other techniques. In the current paper, we train various DNNs (fully connected, convolutional and recurrent neural networks) for articulatory-to-speech conversion, using rtMRI as input, in a speaker-specific way. We use two male and two female speakers of the USC-TIMIT articulatory database, each of them uttering 460 sentences. We evaluate the results with objective (Normalized MSE and MCD) and subjective (perceptual test) measures and show that CNN-LSTM networks, which take multiple images as input, are preferred and achieve MCD scores between 2.8 and 4.5 dB. In the experiments, we find that the predictions for speaker 'm1' are significantly weaker than those for the other speakers. We show that this is caused by the fact that 74% of the recordings of speaker 'm1' are out of sync.

[434]  arXiv:2008.00893 (cross-list from physics.soc-ph) [pdf, other]
Title: Tracing carbon dioxide emissions in the European electricity markets
Comments: Accepted conference proceedings paper for the EEM 2020
Subjects: Physics and Society (physics.soc-ph); Systems and Control (eess.SY)

Consumption-based carbon emission measures aim to account for emissions associated with power transmission from distant regions, as opposed to measures which only consider local power generation. Outlining key differences between two different methodological variants of this approach, we report results on consumption-based emission intensities of power generation for European countries from 2016 to 2019. We find that in particular for well connected smaller countries, the consideration of imports has a significant impact on the attributed emissions. For these countries, implicit methodological choices in the input-output model are reflected in both hourly and average yearly emission measures.

[435]  arXiv:2008.00901 (cross-list from eess.IV) [pdf, other]
Title: Automated Segmentation of Brain Gray Matter Nuclei on Quantitative Susceptibility Mapping Using Deep Convolutional Neural Network
Comments: submitted to IEEE Transactions on Medical Imaging
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)

Abnormal iron accumulation in the brain subcortical nuclei has been reported to be correlated with various neurodegenerative diseases, and it can be measured through the magnetic susceptibility obtained from quantitative susceptibility mapping (QSM). To quantitatively measure the magnetic susceptibility, the nuclei should be accurately segmented, which is a tedious task for clinicians. In this paper, we propose a double-branch residual-structured U-Net (DB-ResUNet) based on a 3D convolutional neural network (CNN) to automatically segment such brain gray matter nuclei. To better trade off segmentation accuracy against memory efficiency, the proposed DB-ResUNet feeds image patches with high resolution and patches with low resolution but a larger field of view into the local and global branches, respectively. Experimental results reveal that by jointly using QSM and T$_\text{1}$ weighted imaging (T$_\text{1}$WI) as inputs, the proposed method achieves better segmentation accuracy than its single-branch counterpart, as well as the conventional atlas-based method and the classical 3D-UNet structure. The susceptibility values and the volumes were also measured, which indicated that the measurements from the proposed DB-ResUNet correlate highly with values from the manually annotated regions of interest.

[436]  arXiv:2008.00930 (cross-list from eess.IV) [pdf, other]
Title: FaultFace: Deep Convolutional Generative Adversarial Network (DCGAN) based Ball-Bearing Failure Detection Method
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Signal Processing (eess.SP)

Failure detection is employed in industry to improve system performance and reduce costs due to unexpected malfunction events. A good dataset of the system is therefore desirable for designing an automated failure detection system. However, industrial process datasets are unbalanced and contain little information about failure behavior, due to the uniqueness of these events and the high cost of running the system just to get information about the undesired behaviors. For this reason, performing correct training and validation of automated failure detection methods is challenging. This paper proposes a methodology called FaultFace for failure detection on Ball-Bearing joints for rotational shafts, using deep learning techniques to create balanced datasets. The FaultFace methodology uses 2D representations of vibration signals, called faceportraits, obtained by time-frequency transformation techniques. From the obtained faceportraits, a Deep Convolutional Generative Adversarial Network is employed to produce new faceportraits of the nominal and failure behaviors to get a balanced dataset. A Convolutional Neural Network is trained for fault detection employing the balanced dataset. The FaultFace methodology is compared with other deep learning techniques to evaluate its performance in fault detection with unbalanced datasets. The results show that the FaultFace methodology performs well for failure detection on unbalanced datasets.

[437]  arXiv:2008.00953 (cross-list from eess.AS) [pdf, other]
Title: Modular End-to-end Automatic Speech Recognition Framework for Acoustic-to-word Model
Comments: Accepted by IEEE TASLP
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)

End-to-end (E2E) systems have played an increasingly important role in automatic speech recognition (ASR) and achieved great performance. However, E2E systems recognize output word sequences directly from the input acoustic features and can only be trained on limited acoustic data. Extra text data is widely used to improve the results of traditional artificial neural network-hidden Markov model (ANN-HMM) hybrid systems. Incorporating extra text data into standard E2E ASR systems may break the E2E property during decoding. In this paper, a novel modular E2E ASR system is proposed. The modular E2E ASR system consists of two parts: an acoustic-to-phoneme (A2P) model and a phoneme-to-word (P2W) model. The A2P model is trained on acoustic data, while extra data including large scale text data can be used to train the P2W model. This additional data enables the modular E2E ASR system to model not only the acoustic part but also the language part. During the decoding phase, the two models are integrated and act as a standard acoustic-to-word (A2W) model. In other words, the proposed modular E2E ASR system can be easily trained with extra text data and decoded in the same way as a standard E2E ASR system. Experimental results on the Switchboard corpus show that the modular E2E model achieves a better word error rate (WER) than standard A2W models.

[438]  arXiv:2008.00966 (cross-list from cond-mat.soft) [pdf, other]
Title: Using neural networks to predict icephobic performance
Subjects: Soft Condensed Matter (cond-mat.soft); Materials Science (cond-mat.mtrl-sci); Machine Learning (cs.LG)

Icephobic surfaces inspired by superhydrophobic surfaces offer a passive solution to the problem of icing. However, modeling icephobicity is challenging because some material features that aid superhydrophobicity can adversely affect the icephobic performance. This study presents a new approach based on artificial neural networks to model icephobicity. Artificial neural network models were developed to predict the icephobic performance of concrete. The models were trained on experimental data to predict the surface ice adhesion strength and the coefficient of restitution (COR) of a water droplet bouncing off the surface under freezing conditions. The material and coating compositions and the environmental conditions were used as the models' input variables. A multilayer perceptron was trained to predict COR with a root mean squared error of 0.08 and a 90% confidence interval of [0.042, 0.151]. The model had a coefficient of determination of 0.92 after deployment. Since ice adhesion strength varied over a wide range of values for the samples, a mixture density network model was developed to learn the underlying relationship in the multimodal data. The coefficient of determination for this model was 0.96. The relative importance of the input variables in icephobic performance was calculated using permutation importance. The developed models will be beneficial for optimizing the icephobicity of concrete.

[439]  arXiv:2008.01004 (cross-list from q-bio.BM) [pdf]
Title: Identification of 1H-NMR Spectra of Xyloglucan Oligosaccharides: A Comparative Study of Artificial Neural Networks and Bayesian Classification Using Nonparametric Density Estimation
Comments: 6 pages. Published in IEEE ICAI99
Journal-ref: Published in IEEE ICAI 1999 549-553
Subjects: Biomolecules (q-bio.BM); Neural and Evolutionary Computing (cs.NE)

Proton nuclear magnetic resonance (1H-NMR) is a widely used tool for chemical structural analysis. However, 1H-NMR spectra suffer from natural aberrations that render computer-assisted automated identification of these spectra difficult, and at times impossible. Previous efforts have successfully implemented instrument dependent or conditional identification of these spectra. In this paper, we report the first instrument independent computer-assisted automated identification system for a group of complex carbohydrates known as the xyloglucan oligosaccharides. The developed system is also implemented on the world wide web (this http URL) as part of an identification package called the CCRC-Net and is intended to recognize any submitted 1H-NMR spectrum of these structures with reasonable signal-to-noise ratio, recorded on any 500 MHz NMR instrument. The system uses Artificial Neural Network (ANN) technology and is insensitive to the instrument and environment-dependent variations in 1H-NMR spectroscopy. In this paper, comparative results of the ANN engine versus a multidimensional Bayes' classifier are also presented.

[440]  arXiv:2008.01007 (cross-list from stat.ME) [pdf, other]
Title: Parametric Copula-GP model for analyzing multidimensional neuronal and behavioral relationships
Subjects: Methodology (stat.ME); Neural and Evolutionary Computing (cs.NE)

One of the main challenges in current systems neuroscience is the analysis of high-dimensional neuronal and behavioral data that are characterized by different statistics and timescales of the recorded variables. We propose a parametric copula model which separates the statistics of the individual variables from their dependence structure, and escapes the curse of dimensionality by using vine copula constructions. We use a Bayesian framework with Gaussian Process (GP) priors over copula parameters, conditioned on a continuous task-related variable. We validate the model on synthetic data and compare its performance in estimating mutual information against the commonly used non-parametric algorithms.
Our model provides accurate information estimates when the dependencies in the data match the parametric copulas used in our framework. When the exact density estimation with a parametric model is not possible, our Copula-GP model is still able to provide reasonable information estimates, close to the ground truth and comparable to those obtained with a neural network estimator. Finally, we apply our framework to real neuronal and behavioral recordings obtained in awake mice. We demonstrate the ability of our framework to
1) produce accurate and interpretable bivariate models for the analysis of inter-neuronal noise correlations or behavioral modulations;
2) expand to more than 100 dimensions and measure information content in the whole-population statistics. These results demonstrate that the Copula-GP framework is particularly useful for the analysis of complex multidimensional relationships between neuronal, sensory and behavioral data.

[441]  arXiv:2008.01011 (cross-list from math.FA) [pdf, other]
Title: Phase Transitions in Rate Distortion Theory and Deep Learning
Subjects: Functional Analysis (math.FA); Machine Learning (cs.LG)

Rate distortion theory is concerned with optimally encoding a given signal class $\mathcal{S}$ using a budget of $R$ bits, as $R\to\infty$. We say that $\mathcal{S}$ can be compressed at rate $s$ if we can achieve an error of $\mathcal{O}(R^{-s})$ for encoding $\mathcal{S}$; the supremal compression rate is denoted $s^\ast(\mathcal{S})$. Given a fixed coding scheme, there usually are elements of $\mathcal{S}$ that are compressed at a higher rate than $s^\ast(\mathcal{S})$ by the given coding scheme; we study the size of this set of signals. We show that for certain "nice" signal classes $\mathcal{S}$, a phase transition occurs: We construct a probability measure $\mathbb{P}$ on $\mathcal{S}$ such that for every coding scheme $\mathcal{C}$ and any $s >s^\ast(\mathcal{S})$, the set of signals encoded with error $\mathcal{O}(R^{-s})$ by $\mathcal{C}$ forms a $\mathbb{P}$-null-set. In particular our results apply to balls in Besov and Sobolev spaces that embed compactly into $L^2(\Omega)$ for a bounded Lipschitz domain $\Omega$. As an application, we show that several existing sharpness results concerning function approximation using deep neural networks are generically sharp.
We also provide quantitative and non-asymptotic bounds on the probability that a random $f\in\mathcal{S}$ can be encoded to within accuracy $\varepsilon$ using $R$ bits. This result is applied to the problem of approximately representing $f\in\mathcal{S}$ to within accuracy $\varepsilon$ by a (quantized) neural network that is constrained to have at most $W$ nonzero weights and is generated by an arbitrary "learning" procedure. We show that for any $s >s^\ast(\mathcal{S})$ there are constants $c,C$ such that, no matter how we choose the "learning" procedure, the probability of success is bounded from above by $\min\big\{1,2^{C\cdot W\lceil\log_2(1+W)\rceil^2 -c\cdot\varepsilon^{-1/s}}\big\}$.

[442]  arXiv:2008.01012 (cross-list from math.RA) [pdf, ps, other]
Title: The Largest Entry in the Inverse of a Vandermonde Matrix
Subjects: Rings and Algebras (math.RA); Numerical Analysis (math.NA)

We investigate the size of the largest entry (in absolute value) in the inverse of certain Vandermonde matrices. More precisely, for every real $b > 1$, let $M_b(n)$ be the maximum of the absolute values of the entries of the inverse of the $n \times n$ matrix $[b^{i j}]_{0 \leq i, j < n}$. We prove that $\lim_{n \to +\infty} M_b(n)$ exists, and we provide some formulas for it.
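
The quantity $M_b(n)$ defined above can be explored numerically for small $n$. The following is a minimal sketch, not the authors' method: it builds the matrix $[b^{ij}]_{0 \leq i, j < n}$ with exact rational arithmetic and inverts it by Gauss-Jordan elimination, since floating-point inversion of Vandermonde matrices becomes unreliable quickly. The function names `inverse_exact` and `largest_inverse_entry` are hypothetical, and $b$ is assumed to be an integer or a `Fraction`.

```python
from fractions import Fraction


def inverse_exact(matrix):
    """Invert a square matrix of rationals by Gauss-Jordan elimination."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [
        [Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        # Find a row with a non-zero pivot (raises StopIteration if singular).
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def largest_inverse_entry(b, n):
    """Return M_b(n): the largest absolute entry of the inverse of [b^(i*j)]."""
    b = Fraction(b)
    vandermonde = [[b ** (i * j) for j in range(n)] for i in range(n)]
    inverse = inverse_exact(vandermonde)
    return max(abs(x) for row in inverse for x in row)
```

For instance, with $b = 2$ and $n = 2$ the matrix is $[[1, 1], [1, 2]]$ with inverse $[[2, -1], [-1, 1]]$, so `largest_inverse_entry(2, 2)` equals 2. Evaluating the function for increasing $n$ gives an informal view of the convergence the abstract proves.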

[443]  arXiv:2008.01047 (cross-list from math-ph) [pdf, ps, other]
Title: A Matrix Basis Formulation For The Green's Functions Of Maxwell's Equations And The Elastic Wave Equations In Layered Media
Subjects: Mathematical Physics (math-ph); Numerical Analysis (math.NA)

A matrix basis formulation is introduced to represent the 3 x 3 dyadic Green's functions in the frequency domain for Maxwell's equations and the elastic wave equation in layered media. The formulation can be used to decompose the Maxwell Green's functions into independent TE and TM components, each satisfying a Helmholtz equation, and to decompose the elastic wave Green's function into S-wave and P-wave components. In addition, a derived vector basis formulation is applied to the case of acoustic wave sources from a non-viscous fluid layer.

[444]  arXiv:2008.01056 (cross-list from math.CO) [pdf, other]
Title: On the Broadcast Dimension of a Graph
Authors: Emily Zhang
Comments: 23 pages, 8 figures
Subjects: Combinatorics (math.CO); Discrete Mathematics (cs.DM)

A function $f:V(G)\rightarrow \mathbb{Z}^+ \cup \{0\}$ is a resolving broadcast of a graph $G$ if, for any distinct $x,y\in V(G)$, there exists a vertex $z\in V(G)$ with $f(z)>0$ such that $\min\{d(x,z), f(z)+1\} \neq \min\{d(y,z), f(z)+1\}.$ The broadcast dimension of $G$ is the minimum of $\sum_{v\in V(G)}f(v)$ over all resolving broadcasts $f$ of $G$. The concept of broadcast dimension was introduced by Geneson and Yi as a variant of metric dimension and has applications in areas such as network discovery and robot navigation.
In this paper, we derive an asymptotically tight lower bound on the broadcast dimension of an acyclic graph in the number of vertices, and we show that a lower bound by Geneson and Yi on the broadcast dimension of a general graph in the adjacency dimension is asymptotically tight. We also study the change in the broadcast dimension of a graph under a single edge deletion. We show that both the additive increase and the additive decrease of the broadcast dimension of a graph under edge deletion are unbounded. Moreover, we show that under edge deletion, the broadcast dimension of any graph increases by a multiplicative factor of at most 3. These results fully answer three questions asked by Geneson and Yi.
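
The definition of a resolving broadcast above translates directly into a brute-force check for very small graphs. The sketch below is illustrative only and is not taken from the paper: the adjacency-dictionary input, the function names, and the restriction to connected graphs are assumptions. Values of $f$ are capped at the diameter, because $\min\{d(x,z), f(z)+1\}$ no longer changes once $f(z)$ reaches the largest distance in the graph.

```python
from collections import deque
from itertools import combinations, product


def all_distances(adj):
    """Breadth-first-search distances between all vertex pairs (connected graph assumed)."""
    dist = {}
    for source in adj:
        d = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in d:
                    d[v] = d[u] + 1
                    queue.append(v)
        dist[source] = d
    return dist


def is_resolving(f, dist, vertices):
    """Check the resolving-broadcast condition for every pair of distinct vertices."""
    for x, y in combinations(vertices, 2):
        if not any(
            f[z] > 0
            and min(dist[x][z], f[z] + 1) != min(dist[y][z], f[z] + 1)
            for z in vertices
        ):
            return False
    return True


def broadcast_dimension(adj):
    """Minimum total cost of a resolving broadcast, by exhaustive search."""
    vertices = sorted(adj)
    dist = all_distances(adj)
    diameter = max(max(d.values()) for d in dist.values())
    best = None
    for values in product(range(diameter + 1), repeat=len(vertices)):
        cost = sum(values)
        if best is not None and cost >= best:
            continue  # cannot improve on the best cost found so far
        if is_resolving(dict(zip(vertices, values)), dist, vertices):
            best = cost
    return best
```

For the path on three vertices, `broadcast_dimension({0: [1], 1: [0, 2], 2: [1]})` evaluates to 1, since assigning $f = 1$ to an end vertex already separates all three vertices. The search is exponential in the number of vertices, so it is only suitable for checking small examples by hand.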

Replacements for Tue, 4 Aug 20

[445]  arXiv:1509.06837 (replaced) [pdf]
Title: Generalization of the Truth-relevant Semantics to the Predicate Calculus
Authors: X. Y. Newberry
Subjects: Logic in Computer Science (cs.LO)
[446]  arXiv:1612.01041 (replaced) [pdf, ps, other]
Title: The Optimality of Correlated Sampling
Comments: 12 pages; Improved presentation based on feedback from anonymous reviewers
Subjects: Computational Complexity (cs.CC); Information Theory (cs.IT)
[447]  arXiv:1701.05054 (replaced) [pdf, other]
Title: POD reduced order modeling for evolution equations utilizing arbitrary finite element discretizations
Subjects: Numerical Analysis (math.NA)
[448]  arXiv:1701.07657 (replaced) [pdf, other]
Title: Operationalizing Declarative and Procedural Knowledge: a Benchmark on Logic Programming Petri Nets (LPPNs)
Authors: Giovanni Sileno
Comments: draft version -- updated
Subjects: Artificial Intelligence (cs.AI)
[449]  arXiv:1705.04351 (replaced) [pdf, other]
Title: A rational analysis of curiosity
Comments: Conference paper in CogSci 2017
Journal-ref: 39th Annual Conference of the Cognitive Science Society (CogSci), 2017
Subjects: Artificial Intelligence (cs.AI)
[450]  arXiv:1710.01867 (replaced) [pdf, ps, other]
Title: Improved Schemes for Asymptotically Optimal Repair of MDS Codes
Comments: Submitted to IEEE Transactions on Information Theory
Subjects: Information Theory (cs.IT)
[451]  arXiv:1711.04268 (replaced) [pdf, other]
Title: Active Sampling for the Quickest Detection of Markov Networks
Comments: 50 pages, 12 figures
Subjects: Methodology (stat.ME); Information Theory (cs.IT)
[452]  arXiv:1712.08500 (replaced) [pdf, other]
Title: On Perfect Privacy
Subjects: Information Theory (cs.IT)
[453]  arXiv:1801.05605 (replaced) [pdf, other]
Title: Efficient Test Collection Construction via Active Learning
Comments: Accepted as a full paper in ICTIR 2020
Subjects: Information Retrieval (cs.IR)
[454]  arXiv:1802.09503 (replaced) [pdf, other]
Title: Online Coloring of Short Intervals
Comments: APPROX 2020
Subjects: Combinatorics (math.CO); Discrete Mathematics (cs.DM); Data Structures and Algorithms (cs.DS)
[455]  arXiv:1803.03248 (replaced) [pdf, other]
Title: Improved Distributed $\Delta$-Coloring
Subjects: Data Structures and Algorithms (cs.DS); Distributed, Parallel, and Cluster Computing (cs.DC)
[456]  arXiv:1803.09040 (replaced) [pdf, other]
Title: A Bounded Formulation for The School Bus Scheduling Problem
Subjects: Optimization and Control (math.OC); Data Structures and Algorithms (cs.DS)
[457]  arXiv:1804.11331 (replaced) [pdf, ps, other]
Title: Optimal error estimates of Galerkin finite element methods for stochastic Allen-Cahn equation with additive noise
Comments: 22 pages, 6 figures
Journal-ref: Journal of Scientific Computing, 2019, 80(2): 1171-1194
Subjects: Numerical Analysis (math.NA); Probability (math.PR)
[458]  arXiv:1805.08342 (replaced) [pdf, other]
Title: Nearest neighbor density functional estimation from inverse Laplace transform
Comments: 53 pages, 4 figures. Submitted to the IEEE Transactions on Information Theory
Subjects: Statistics Theory (math.ST); Information Theory (cs.IT); Methodology (stat.ME); Machine Learning (stat.ML)
[459]  arXiv:1805.09887 (replaced) [pdf, other]
Title: Super-stability in the Student-Project Allocation Problem with Ties
Comments: 28 pages (including Appendix), 6 figures, 2 tables. A preliminary version of a part of this paper appeared in Proceedings of International Conference on Combinatorial Optimisation and Applications (COCOA) 2018. This paper has been accepted for publication in a special issue of Journal of Combinatorial Optimisation featuring selected papers from COCOA 2018
Subjects: Data Structures and Algorithms (cs.DS)
[460]  arXiv:1805.11021 (replaced) [pdf, ps, other]
Title: A Generalized Modality for Recursion
Authors: Adrien Guatto
Comments: 17 pages, 13 figures, LICS 2018 (extended version); (fixed typos in op. semantics on 2020-08-03)
Subjects: Programming Languages (cs.PL); Logic in Computer Science (cs.LO)
[461]  arXiv:1806.05866 (replaced) [pdf, other]
Title: Cliques and a new measure of clustering: with application to U.S. domestic airlines
Comments: 33 pages, 15 figures
Subjects: Social and Information Networks (cs.SI); Physics and Society (physics.soc-ph)
[462]  arXiv:1808.00691 (replaced) [pdf, other]
Title: On Triangle Estimation using Tripartite Independent Set Queries
Comments: 27 pages. A preliminary version appeared in ISAAC'2019. This version contains an improved bound on query complexity
Subjects: Data Structures and Algorithms (cs.DS)
[463]  arXiv:1808.03682 (replaced) [src]
Title: Research on Control Method and Evaluation System of Ground Unmanned Vehicle Formation Transform
Comments: This paper's conclusion is not adequate and the method is not innovative. We want to improve the paper's quality
Subjects: Robotics (cs.RO)
[464]  arXiv:1808.05924 (replaced) [pdf, other]
Title: A Projector-Based Approach to Quantifying Total and Excess Uncertainties for Sketched Linear Regression
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Numerical Analysis (math.NA)
[465]  arXiv:1808.06887 (replaced) [pdf, other]
Title: Multimodal Interaction-aware Motion Prediction for Autonomous Street Crossing
Comments: The International Journal of Robotics Research (2020)
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[466]  arXiv:1809.01561 (replaced) [pdf, other]
Title: Imitation learning-based framework for learning 6-D linear compliant motions
Comments: Submitted to Autonomous Robots
Subjects: Robotics (cs.RO)
[467]  arXiv:1810.00873 (replaced) [pdf, other]
Title: Extending Stan for Deep Probabilistic Programming
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Programming Languages (cs.PL); Machine Learning (stat.ML)
[468]  arXiv:1811.01421 (replaced) [pdf, other]
Title: Why Extension-Based Proofs Fail
Comments: This version of the paper is for the NIS model. Previous versions of the paper are for the NIIS model
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
[469]  arXiv:1811.03254 (replaced) [pdf, ps, other]
Title: Fully Asynchronous Stochastic Coordinate Descent: A Tight Lower Bound on the Parallelism Achieving Linear Speedup
Comments: Accepted for publication in Mathematical Programming (Series A)
Subjects: Optimization and Control (math.OC); Distributed, Parallel, and Cluster Computing (cs.DC)
[470]  arXiv:1811.03311 (replaced) [pdf, other]
Title: Speaker-adaptive neural vocoders for parametric speech synthesis systems
Comments: Accepted to the IEEE Workshop of MMSP 2020
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[471]  arXiv:1812.01776 (replaced) [pdf, other]
Title: InferLine: ML Prediction Pipeline Provisioning and Management for Tight Latency Objectives
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
[472]  arXiv:1812.02464 (replaced) [pdf, other]
Title: Pseudo-Rehearsal: Achieving Deep Reinforcement Learning without Catastrophic Forgetting
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[473]  arXiv:1812.04530 (replaced) [pdf, other]
Title: Generating Summaries for Methods of Event-Driven Programs: an Android Case Study
Subjects: Software Engineering (cs.SE)
[474]  arXiv:1812.09851 (replaced) [pdf, other]
Title: On the Distortion Value of the Elections with Abstention
Comments: Revised version of the paper appeared in AAAI-19
Subjects: Computer Science and Game Theory (cs.GT)
[475]  arXiv:1901.01156 (replaced) [pdf, ps, other]
Title: Signal and System Design for Wireless Power Transfer: Prototype, Experiment and Validation
Comments: Accepted to IEEE Transactions on Wireless Communications
Subjects: Information Theory (cs.IT)
[476]  arXiv:1901.03931 (replaced) [pdf, other]
Title: Joint Placement and Allocation of VNF Nodes with Budget and Capacity Constraints
Authors: Gamal Sallam, Bo Ji
Subjects: Networking and Internet Architecture (cs.NI); Data Structures and Algorithms (cs.DS)
[477]  arXiv:1901.06731 (replaced) [pdf, ps, other]
Title: Four Deviations Suffice for Rank 1 Matrices
Comments: Minor revisions. To appear in Advances in Mathematics
Subjects: Combinatorics (math.CO); Discrete Mathematics (cs.DM); Functional Analysis (math.FA)
[478]  arXiv:1901.09663 (replaced) [pdf]
Title: A multi-dimensional framework for characterizing the citation impact of scientific publications
Comments: 32 pages, 9 figures, 7 tables
Subjects: Digital Libraries (cs.DL)
[479]  arXiv:1902.01946 (replaced) [pdf, ps, other]
Title: Thirty Years of Machine Learning: The Road to Pareto-Optimal Wireless Networks
Comments: 46 pages, 22 figs
Subjects: Networking and Internet Architecture (cs.NI); Machine Learning (cs.LG); Machine Learning (stat.ML)
[480]  arXiv:1902.07138 (replaced) [pdf, other]
Title: Who started this rumor? Quantifying the natural differential privacy guarantees of gossip protocols
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Cryptography and Security (cs.CR)
[481]  arXiv:1902.07399 (replaced) [pdf, other]
Title: LipschitzLR: Using theoretically computed adaptive learning rates for fast convergence
Comments: v4; comparison studies added
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
[482]  arXiv:1902.08551 (replaced) [pdf, ps, other]
Title: Ring Learning With Errors: A crossroads between postquantum cryptography, machine learning and number theory
Comments: arXiv admin note: text overlap with arXiv:1508.01375 by other authors. Author's comment: quotation has been added to Theorem 5.8
Subjects: Cryptography and Security (cs.CR); Information Theory (cs.IT)
[483]  arXiv:1902.08657 (replaced) [pdf, ps, other]
Title: Two-Multicast Channel with Confidential Messages
Subjects: Information Theory (cs.IT)
[484]  arXiv:1902.10222 (replaced) [pdf, other]
Title: ROMANet: Fine-Grained Reuse-Driven Off-Chip Memory Access Management and Data Organization for Deep Neural Network Accelerators
Comments: Submitted to the IEEE-TVLSI journal, 14 pages, 26 figures
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Hardware Architecture (cs.AR); Machine Learning (cs.LG)
[485]  arXiv:1903.07019 (replaced) [pdf, other]
Title: Circumscribing Polygons and Polygonizations for Disjoint Line Segments
Comments: Extended version (preliminary abstract accepted in the proceedings of SoCG 2019)
Subjects: Computational Geometry (cs.CG)
[486]  arXiv:1903.07938 (replaced) [pdf, other]
Title: Optimal reduced model algorithms for data-based state estimation
Subjects: Numerical Analysis (math.NA)
[487]  arXiv:1904.06788 (replaced) [pdf, other]
Title: Multi-Branch Tensor Network Structure for Tensor-Train Discriminant Analysis
Subjects: Signal Processing (eess.SP); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
[488]  arXiv:1905.09943 (replaced) [pdf, ps, other]
Title: On Pruning for Score-Based Bayesian Network Structure Learning
Authors: Alv