Artificial Intelligence

New submissions

New submissions for Thu, 27 Jan 22

[1]  arXiv:2201.10556 [pdf]
Title: Learning Norms via Natural Language Teachings
Comments: Presented at The Ninth Advances in Cognitive Systems (ACS) Conference 2021 (arXiv:2201.06134)
Subjects: Artificial Intelligence (cs.AI)

To interact with humans, artificial intelligence (AI) systems must understand our social world. Within this world, norms play an important role in motivating and guiding agents. However, very few computational theories for learning social norms have been proposed. There also exists a long history of debate on the distinction between what is normal (is) and what is normative (ought). Many have argued that being capable of learning both concepts and recognizing the difference is necessary for all social agents. This paper introduces and demonstrates a computational approach to learning norms from natural language text that accounts for both what is normal and what is normative. It provides a foundation for everyday people to train AI systems about social norms.

[2]  arXiv:2201.10822 [pdf, other]
Title: An Explainable Artificial Intelligence Framework for Quality-Aware IoE Service Delivery
Comments: Accepted article by IEEE International Conference on Communications (ICC 2022), Copyright 2022 IEEE
Subjects: Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Networking and Internet Architecture (cs.NI)

One of the core visions for sixth-generation (6G) wireless networks is to incorporate artificial intelligence (AI) for autonomous control of the Internet of Everything (IoE). In particular, the quality of IoE service delivery must be maintained by analyzing contextual metrics of IoE such as people, data, process, and things. However, challenges arise when the AI model lacks interpretability and intuition for the network service provider. Therefore, this paper provides an explainable artificial intelligence (XAI) framework for quality-aware IoE service delivery that enables both intelligence and interpretation. First, a problem of quality-aware IoE service delivery is formulated by taking into account network dynamics and contextual metrics of IoE, where the objective is to maximize the channel quality index (CQI) of each IoE service user. Second, a regression problem is devised to solve the formulated problem, where explainable coefficients of the contextual matrices are estimated by Shapley value interpretation. Third, the XAI-enabled quality-aware IoE service delivery algorithm is implemented by employing ensemble-based regression models to ensure the interpretation of contextual relationships among the matrices when reconfiguring network parameters. Finally, the experimental results show that the uplink improvement rate reaches 42.43% and 16.32% for AdaBoost and Extra Trees, respectively, while the downlink improvement rate reaches up to 28.57% and 14.29%. However, the AdaBoost-based approach cannot maintain the CQI of IoE service users. Therefore, the proposed Extra Trees-based regression model shows a significant performance gain over other baselines in mitigating the trade-off between accuracy and interpretability.

[3]  arXiv:2201.11109 [pdf]
Title: Using a Novel COVID-19 Calculator to Measure Positive U.S. Socio-Economic Impact of a COVID-19 Pre-Screening Solution (AI/ML)
Subjects: Artificial Intelligence (cs.AI); Machine Learning (cs.LG)

The COVID-19 pandemic has been a scourge upon humanity, claiming the lives of more than 5.1 million people worldwide; the global economy contracted by 3.5% in 2020. This paper presents a COVID-19 calculator, synthesizing existing published calculators and data points, to measure the positive U.S. socio-economic impact of a COVID-19 AI/ML pre-screening solution (algorithm & application).

[4]  arXiv:2201.11117 [pdf]
Title: Cybertrust: From Explainable to Actionable and Interpretable AI (AI2)
Subjects: Artificial Intelligence (cs.AI)

To benefit from AI advances, users and operators of AI systems must have reason to trust them. Trust arises from multiple interactions, where predictable and desirable behavior is reinforced over time. Providing the system's users with some understanding of AI operations can support predictability, but forcing AI to explain itself risks constraining AI capabilities to only those reconcilable with human cognition. We argue that AI systems should be designed with features that build trust by bringing decision-analytic perspectives and formal tools into AI. Instead of trying to achieve explainable AI, we should develop interpretable and actionable AI. Actionable and Interpretable AI (AI2) will incorporate explicit quantifications and visualizations of user confidence in AI recommendations. In doing so, it will allow examining and testing of AI system predictions to establish a basis for trust in the systems' decision making and ensure broad benefits from deploying and advancing its computational capabilities.

Cross-lists for Thu, 27 Jan 22

[5]  arXiv:2201.10592 (cross-list from cs.SE) [pdf, other]
Title: DebtFree: Minimizing Labeling Cost in Self-Admitted Technical Debt Identification using Semi-Supervised Learning
Authors: Huy Tu, Tim Menzies
Comments: Accepted at EMSE
Subjects: Software Engineering (cs.SE); Artificial Intelligence (cs.AI)

Keeping track of and managing Self-Admitted Technical Debts (SATDs) is important for maintaining a healthy software project. The current active-learning SATD recognition tool requires manual inspection of 24% of the test comments on average to reach 90% recall. Among all the test comments, about 5% are SATDs. Human experts are therefore required to read almost five times as many comments as there are SATDs, which indicates the inefficiency of the tool. Moreover, human experts are still prone to error: 95% of the false-positive labels from previous work were actually true positives.
To solve the above problems, we propose DebtFree, a two-mode framework based on unsupervised learning for identifying SATDs. In mode1, when the existing training data is unlabeled, DebtFree starts with an unsupervised learner to automatically pseudo-label the programming comments in the training data. In contrast, in mode2, where labels are available with the corresponding training data, DebtFree starts with a pre-processor that identifies the highly probable SATDs in the test dataset. Then, our machine learning model is employed to assist human experts in manually identifying the remaining SATDs. Our experiments on 10 software projects show that both models yield a statistically significant improvement in effectiveness over the state-of-the-art automated and semi-automated models. Specifically, DebtFree can reduce the labeling effort by 99% in mode1 (unlabeled training data), and by up to 63% in mode2 (labeled training data), while improving the current active learner's F1 by almost 100% in relative terms.
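The inspection-cost figures quoted in this abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name `reading_effort` is hypothetical, and the only inputs are the 24%, 5%, and 90% figures stated in the abstract.

```python
def reading_effort(inspected_fraction, satd_fraction, recall):
    """Return (comments read per SATD present, comments read per SATD found).

    All arguments are fractions of the test comments, taken from the abstract.
    """
    per_satd_present = inspected_fraction / satd_fraction
    per_satd_found = inspected_fraction / (satd_fraction * recall)
    return per_satd_present, per_satd_found


present, found = reading_effort(inspected_fraction=0.24,
                                satd_fraction=0.05,
                                recall=0.90)
print(f"{present:.1f} comments read per SATD present")  # 4.8
print(f"{found:.1f} comments read per SATD found")      # 5.3
```

The first ratio (4.8) matches the abstract's "almost five times" figure; the second shows that the cost per SATD actually recovered is slightly higher once the 90% recall is taken into account.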

[6]  arXiv:2201.10631 (cross-list from cs.GT) [pdf, other]
Title: The Price of Strategyproofing Peer Assessment
Subjects: Computer Science and Game Theory (cs.GT); Artificial Intelligence (cs.AI)

Strategic behavior is a fundamental problem in a variety of real-world applications that require some form of peer assessment, such as peer grading of assignments, grant proposal review, conference peer review, and peer assessment of employees. Since an individual's own work is in competition with the submissions they are evaluating, they may provide dishonest evaluations to increase the relative standing of their own submission. This issue is typically addressed by partitioning the individuals and assigning them to evaluate the work of only those from different subsets. Although this method ensures strategyproofness, each submission may require a different type of expertise for effective evaluation. In this paper, we focus on finding an assignment of evaluators to submissions that maximizes assigned expertise subject to the constraint of strategyproofness. We analyze the price of strategyproofness: that is, the amount of compromise on the assignment quality required in order to get strategyproofness. We establish several polynomial-time algorithms for strategyproof assignment along with assignment-quality guarantees. Finally, we evaluate the methods on a dataset from conference peer review.

[7]  arXiv:2201.10643 (cross-list from cs.HC) [pdf, other]
Title: Intersectionality Goes Analytical: Taming Combinatorial Explosion Through Type Abstraction
Subjects: Human-Computer Interaction (cs.HC); Artificial Intelligence (cs.AI)

HCI researchers' and practitioners' awareness of intersectionality has been expanding, producing knowledge, recommendations, and prototypes for supporting intersectional populations. However, doing intersectional HCI work is uniquely expensive: it leads to a combinatorial explosion of empirical work (expense 1), and little of the work on one intersectional population can be leveraged to serve another (expense 2). In this paper, we explain how representations employed by certain analytical design methods correspond to type abstractions, and use that correspondence to identify a (de)compositional model in which a population's diverse identity properties can be joined and split. We formally prove the model's correctness, and show how it enables HCI designers to harness existing analytical HCI methods for use on new intersectional populations of interest. We illustrate, through four design use-cases, how the model can reduce the amount of expense 1 and enable designers to leverage prior work for new intersectional populations, addressing expense 2.

[8]  arXiv:2201.10650 (cross-list from cs.CV) [pdf, other]
Title: Beyond Visual Image: Automated Diagnosis of Pigmented Skin Lesions Combining Clinical Image Features with Patient Data
Comments: 33 pages, 11 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)

Skin cancer is considered one of the most common types of cancer in several countries. Due to the difficulty and subjectivity of the clinical diagnosis of skin lesions, Computer-Aided Diagnosis systems are being developed to assist experts in making more reliable diagnoses. The clinical analysis and diagnosis of skin lesions relies not only on visual information but also on context information provided by the patient. This work addresses the problem of detecting pigmented skin lesions in smartphone-captured images. In addition to the features extracted from images, patient context information was collected to provide a more accurate diagnosis. The experiments showed that combining visual features with context information improved the final results. Experimental results are very promising and comparable to experts.

[9]  arXiv:2201.10675 (cross-list from cs.CV) [pdf]
Title: Virtual Adversarial Training for Semi-supervised Breast Mass Classification
Comments: To appear in the conference Biophotonics and Immune Responses of SPIE
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Image and Video Processing (eess.IV); Signal Processing (eess.SP)

This study aims to develop a novel computer-aided diagnosis (CAD) scheme for mammographic breast mass classification using semi-supervised learning. Although supervised deep learning has achieved huge success across various medical image analysis tasks, its success relies on large amounts of high-quality annotations, which can be challenging to acquire in practice. To overcome this limitation, we propose employing a semi-supervised method, i.e., virtual adversarial training (VAT), to leverage and learn useful information latent in unlabeled data for better classification of breast masses. Accordingly, our VAT-based models have two types of losses, namely supervised and virtual adversarial losses. The former loss acts as in supervised classification, while the latter loss aims at enhancing model robustness against virtual adversarial perturbation, thus improving model generalizability. To evaluate the performance of our VAT-based CAD scheme, we retrospectively assembled a total of 1024 breast mass images, with equal numbers of benign and malignant masses. A large CNN and a small CNN were used in this investigation, and both were trained with and without the adversarial loss. When the labeled ratios were 40% and 80%, VAT-based CNNs delivered the highest classification accuracy of 0.740 and 0.760, respectively. The experimental results suggest that the VAT-based CAD scheme can effectively utilize meaningful knowledge from unlabeled data to better classify mammographic breast mass images.

[10]  arXiv:2201.10728 (cross-list from cs.CV) [pdf, other]
Title: Training Vision Transformers with Only 2040 Images
Comments: 11 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)

Vision Transformers (ViTs) are emerging as an alternative to convolutional neural networks (CNNs) for visual recognition. They achieve competitive results with CNNs, but the lack of the typical convolutional inductive bias makes them more data-hungry than common CNNs. They are often pretrained on JFT-300M or at least ImageNet, and few works study training ViTs with limited data. In this paper, we investigate how to train ViTs with limited data (e.g., 2040 images). We give theoretical analyses showing that our method (based on parametric instance discrimination) is superior to other methods in that it can capture both feature alignment and instance similarities. We achieve state-of-the-art results when training from scratch on 7 small datasets under various ViT backbones. We also investigate the transfer ability of small datasets and find that representations learned from small datasets can even improve large-scale ImageNet training.

[11]  arXiv:2201.10737 (cross-list from cs.CV) [pdf, other]
Title: Class-Aware Generative Adversarial Transformers for Medical Image Segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Image and Video Processing (eess.IV)

Transformers have made remarkable progress towards modeling long-range dependencies within the medical image analysis domain. However, current transformer-based models suffer from several disadvantages: 1) existing methods fail to capture the important features of the images due to the naive tokenization scheme; 2) the models suffer from information loss because they only consider single-scale feature representations; and 3) the segmentation label maps generated by the models are not accurate enough without considering rich semantic contexts and anatomical textures. In this work, we present CA-GANformer, a novel type of generative adversarial transformers, for medical image segmentation. First, we take advantage of the pyramid structure to construct multi-scale representations and handle multi-scale variations. We then design a novel class-aware transformer module to better learn the discriminative regions of objects with semantic structures. Lastly, we utilize an adversarial training strategy that boosts segmentation accuracy and correspondingly allows a transformer-based discriminator to capture high-level semantically correlated contents and low-level anatomical features. Our experiments demonstrate that CA-GANformer dramatically outperforms previous state-of-the-art transformer-based approaches on three benchmarks, obtaining absolute 2.54%-5.88% improvements in Dice over previous models. Further qualitative experiments provide a more detailed picture of the model's inner workings, shed light on the challenges in improved transparency, and demonstrate that transfer learning can greatly improve performance and reduce the size of medical image datasets in training, making CA-GANformer a strong starting point for downstream medical image analysis tasks. Codes and models will be available to the public.
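For readers unfamiliar with the metric reported in this abstract, the following is a minimal sketch of the standard Dice coefficient on binary masks. It is the textbook definition, not the paper's evaluation code; the function name `dice_coefficient` and the toy masks are hypothetical.

```python
def dice_coefficient(pred, target):
    """Dice = 2 * |P ∩ T| / (|P| + |T|) for two flat binary (0/1) masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Convention assumed here: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy 2x2 masks flattened row by row (hypothetical data).
predicted = [1, 1, 0, 0]
reference = [1, 0, 0, 0]
print(round(dice_coefficient(predicted, reference), 3))  # 0.667
```

An "absolute" improvement of 2.54% in Dice means the score itself rises by 0.0254 (for example, from a hypothetical 0.800 to 0.8254), rather than by 2.54% of its previous value.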

[12]  arXiv:2201.10746 (cross-list from cs.RO) [pdf, other]
Title: A Cooperation-Aware Lane Change Method for Autonomous Vehicles
Comments: 13 pages, 14 figures, 2 tables, submitted to IEEE Transactions on Vehicular Technology
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI)

Lane change for autonomous vehicles (AVs) is an important but challenging task in complex dynamic traffic environments. Because it is difficult to guarantee both safety and high efficiency, AVs are inclined to choose relatively conservative strategies for lane change. To avoid this conservatism, this paper presents a cooperation-aware lane change method utilizing interactions between vehicles. We first propose an interactive trajectory prediction method to explore possible cooperation between an AV and other vehicles. Further, an evaluation is designed to make a decision on lane change, in which safety, efficiency and comfort are taken into consideration. Thereafter, we propose a motion planning algorithm based on model predictive control (MPC), which incorporates the AV's decision and surrounding vehicles' interactive behaviors into constraints so as to avoid collisions during lane change. Quantitative testing results show that, compared with methods without interactive prediction, our method enhances the driving efficiency of the AV and other vehicles by 14.8% and 2.6%, respectively, which indicates that proper utilization of vehicle interactions can effectively reduce the conservatism of the AV and promote cooperation between the AV and others.

[13]  arXiv:2201.10753 (cross-list from cs.CV) [pdf, other]
Title: Interactive Image Inpainting Using Semantic Guidance
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)

Image inpainting approaches have achieved significant progress with the help of deep neural networks. However, existing approaches mainly focus on leveraging the prior distribution learned by neural networks to produce a single inpainting result or to yield multiple solutions, and controllability is not well studied. This paper develops a novel image inpainting approach that enables users to customize the inpainting result according to their own preference or memory. Specifically, our approach is composed of two stages that utilize the neural network's prior and the user's guidance to jointly inpaint corrupted images. In the first stage, an autoencoder based on a novel external spatial attention mechanism is deployed to produce reconstructed features of the corrupted image and a coarse inpainting result that provides a semantic mask as the medium for user interaction. In the second stage, a semantic decoder that takes the reconstructed features as a prior is adopted to synthesize a fine inpainting result guided by the user's customized semantic mask, so that the final inpainting result shares the same content as the user's guidance while the textures and colors reconstructed in the first stage are preserved. Extensive experiments demonstrate the superiority of our approach in terms of inpainting quality and controllability.

[14]  arXiv:2201.10803 (cross-list from cs.LG) [pdf, other]
Title: Exploiting Semantic Epsilon Greedy Exploration Strategy in Multi-Agent Reinforcement Learning
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Multiagent Systems (cs.MA)

Multi-agent reinforcement learning (MARL) can model many real world applications. However, many MARL approaches rely on epsilon greedy for exploration, which may discourage visiting advantageous states in hard scenarios. In this paper, we propose a new approach QMIX(SEG) for tackling MARL. It makes use of the value function factorization method QMIX to train per-agent policies and a novel Semantic Epsilon Greedy (SEG) exploration strategy. SEG is a simple extension to the conventional epsilon greedy exploration strategy, yet it is experimentally shown to greatly improve the performance of MARL. We first cluster actions into groups of actions with similar effects and then use the groups in a bi-level epsilon greedy exploration hierarchy for action selection. We argue that SEG facilitates semantic exploration by exploring in the space of groups of actions, which have richer semantic meanings than atomic actions. Experiments show that QMIX(SEG) largely outperforms QMIX and leads to strong performance competitive with current state-of-the-art MARL approaches on the StarCraft Multi-Agent Challenge (SMAC) benchmark.
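The bi-level exploration idea described in this abstract can be sketched roughly as follows. This is one plausible reading, not the authors' implementation: the abstract does not give the exact sampling rule, so the two-stage scheme below (explore over groups, then explore within the chosen group) is an assumption, as are the function name `semantic_epsilon_greedy`, the parameters `eps_group` and `eps_action`, and the hand-written groups (the paper obtains groups by clustering actions with similar effects).

```python
import random


def semantic_epsilon_greedy(q_values, groups, eps_group, eps_action, rng=random):
    """Select an action index with a two-level epsilon-greedy rule (assumed).

    q_values: per-action value estimates for one agent.
    groups:   lists of action indices; assumed to partition all actions.
    """
    greedy_action = max(range(len(q_values)), key=lambda a: q_values[a])

    # Level 1: explore over groups, or keep the group holding the greedy action.
    if rng.random() < eps_group:
        group = rng.choice(groups)
    else:
        group = next(g for g in groups if greedy_action in g)

    # Level 2: explore inside the chosen group, or take its best action.
    if rng.random() < eps_action:
        return rng.choice(group)
    return max(group, key=lambda a: q_values[a])


# Hypothetical example: four actions in two hand-made groups.
q = [0.1, 0.9, 0.3, 0.2]
action_groups = [[0, 1], [2, 3]]
print(semantic_epsilon_greedy(q, action_groups, eps_group=0.0, eps_action=0.0))  # 1
```

With both epsilons at zero the rule reduces to plain greedy selection. Setting `eps_group` high and `eps_action` low makes the agent try the best action of a randomly chosen group, which is one way exploration can operate over groups rather than over atomic actions.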

[15]  arXiv:2201.10808 (cross-list from econ.GN) [pdf, other]
Title: Speed, Quality, and the Optimal Timing of Complex Decisions: Field Evidence
Subjects: General Economics (econ.GN); Artificial Intelligence (cs.AI); Applications (stat.AP)

This paper presents an empirical investigation of the relation between decision speed and decision quality for a real-world setting of cognitively-demanding decisions in which the timing of decisions is endogenous: professional chess. Move-by-move data provide exceptionally detailed and precise information about decision times and decision quality, based on a comparison of actual decisions to a computational benchmark of best moves constructed using the artificial intelligence of a chess engine. The results reveal that faster decisions are associated with better performance. The findings are consistent with the predictions of procedural decision models like drift-diffusion-models in which decision makers sequentially acquire information about decision alternatives with uncertain valuations.

[16]  arXiv:2201.10859 (cross-list from cs.LG) [pdf, other]
Title: Visualizing the diversity of representations learned by Bayesian neural networks
Comments: 15 pages, 13 figures
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)

Explainable artificial intelligence (XAI) aims to make learning machines less opaque, and offers researchers and practitioners various tools to reveal the decision-making strategies of neural networks. In this work, we investigate how XAI methods can be used for exploring and visualizing the diversity of feature representations learned by Bayesian neural networks (BNNs). Our goal is to provide a global understanding of BNNs by making their decision-making strategies a) visible and tangible through feature visualizations and b) quantitatively measurable with a distance measure learned by contrastive learning. Our work provides new insights into the posterior distribution in terms of human-understandable feature information with regard to the underlying decision-making strategies. Our main findings are the following: 1) global XAI methods can be applied to explain the diversity of decision-making strategies of BNN instances, 2) Monte Carlo dropout exhibits increased diversity in feature representations compared to the multimodal posterior approximation of MultiSWAG, 3) the diversity of learned feature representations highly correlates with the uncertainty estimates, and 4) the inter-mode diversity of the multimodal posterior decreases as the network width increases, while the intra-mode diversity increases. Our findings are consistent with the recent deep neural networks theory, providing additional intuitions about what the theory implies in terms of humanly understandable concepts.

[17]  arXiv:2201.10860 (cross-list from cs.LG) [pdf, other]
Title: A deep learning method based on patchwise training for reconstructing temperature field
Comments: 18 pages, 16 figures, 42 conference
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)

Physical field reconstruction is highly desirable for the measurement and control of engineering systems. The reconstruction of the temperature field from limited observation plays a crucial role in thermal management for electronic equipment. Deep learning has been employed in physical field reconstruction, but accurate estimation of regions with large gradients is still difficult. To solve this problem, this work proposes a novel deep learning method based on patchwise training to reconstruct the temperature field of electronic equipment accurately from limited observation. Firstly, the temperature field reconstruction (TFR) problem of the electronic equipment is modeled mathematically and transformed into an image-to-image regression task. Then a patchwise training and inference framework consisting of an adaptive UNet and a shallow multilayer perceptron (MLP) is developed to establish the mapping from the observation to the temperature field. The adaptive UNet is utilized to reconstruct the whole temperature field, while the MLP is designed to predict the patches with large temperature gradients. Experiments employing finite element simulation data are conducted to demonstrate the accuracy of the proposed method. Furthermore, the generalization is evaluated by investigating cases under different heat source layouts, different power intensities, and different observation point locations. The maximum absolute errors of the reconstructed temperature field are less than 1 K under the patchwise training approach.

[18]  arXiv:2201.10890 (cross-list from cs.LG) [pdf, other]
Title: One Student Knows All Experts Know: From Sparse to Dense
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)

The human education system trains one student with multiple experts. Mixture-of-experts (MoE) is a powerful sparse architecture that includes multiple experts. However, sparse MoE models are hard to implement, easy to overfit, and not hardware-friendly. In this work, inspired by the human education model, we propose a novel task, knowledge integration, to obtain a dense student model (OneS) as knowledgeable as one sparse MoE. We investigate this task by proposing a general training framework including knowledge gathering and knowledge distillation. Specifically, we first propose Singular Value Decomposition Knowledge Gathering (SVD-KG) to gather key knowledge from different pretrained experts. We then refine the dense student model by knowledge distillation to offset the noise from gathering. On ImageNet, our OneS preserves 61.7% of the benefits from MoE. OneS can achieve 78.4% top-1 accuracy with only 15M parameters. On four natural language processing datasets, OneS obtains 88.2% of the MoE benefits and outperforms SoTA by 51.7% using the same architecture and training data. In addition, compared with the MoE counterpart, OneS can achieve a 3.7× inference speedup due to the hardware-friendly architecture.

[19]  arXiv:2201.10908 (cross-list from cs.LG) [pdf, other]
Title: Improving robustness and calibration in ensembles with diversity regularization
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)

Calibration and uncertainty estimation are crucial topics in high-risk environments. We introduce a new diversity regularizer for classification tasks that uses out-of-distribution samples and increases the overall accuracy, calibration and out-of-distribution detection capabilities of ensembles. Following the recent interest in the diversity of ensembles, we systematically evaluate the viability of explicitly regularizing ensemble diversity to improve calibration on in-distribution data as well as under dataset shift. We demonstrate that diversity regularization is highly beneficial in architectures where weights are partially shared between the individual members, and that it even allows fewer ensemble members to be used to reach the same level of robustness. Experiments on CIFAR-10, CIFAR-100, and SVHN show that regularizing diversity can have a significant impact on calibration and robustness, as well as out-of-distribution detection.

[20]  arXiv:2201.10918 (cross-list from cs.RO) [pdf, other]
Title: Behavior Tree-Based Asynchronous Task Planning for Multiple Mobile Robots using a Data Distribution Service
Comments: 8 pages, 11 figures
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Multiagent Systems (cs.MA); Networking and Internet Architecture (cs.NI)

In this study, we propose a task planning framework for multiple robots that builds on a behavior tree (BT). BTs communicate with a data distribution service (DDS) to send and receive data. Since the standard BT derived from one root node with a single tick is unsuitable for multiple robots, a novel type of BT action and improved nodes are proposed to control multiple robots through a DDS asynchronously. To plan tasks for robots efficiently, a single task planning unit is implemented with the proposed task types. The task planning unit assigns tasks to each robot simultaneously through a single coalesced BT. If any robot encounters a fault while performing its assigned task, another BT embedded in the robot is executed; the robot enters the recovery mode in order to overcome the fault. To perform this function, the action in the BT corresponding to the task is defined as a variable, which is shared with the DDS so that any action can be exchanged between the task planning unit and the robots. To show the feasibility of our framework in a real-world application, three mobile robots were experimentally coordinated to travel alternately to four goal positions by the proposed single task planning unit via a DDS.

[21]  arXiv:2201.10945 (cross-list from cs.SI) [pdf, other]
Title: On the Power of Gradual Network Alignment Using Dual-Perception Similarities
Comments: 16 pages, 11 figures, 4 tables; its two-page extended summary to be presented in the AAAI-22 Student Abstract and Poster Program
Subjects: Social and Information Networks (cs.SI); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Networking and Internet Architecture (cs.NI)

Network alignment (NA) is the task of finding the correspondence of nodes between two networks based on the network structure and node attributes. Our study is motivated by the fact that, since most existing NA methods have attempted to discover all node pairs at once, they do not harness information enriched through interim discovery of node correspondences to more accurately find the next correspondences during the node matching. To tackle this challenge, we propose Grad-Align, a new NA method that gradually discovers node pairs by making full use of node pairs exhibiting strong consistency, which are easy to discover in the early stage of gradual matching. Specifically, Grad-Align first generates node embeddings of the two networks based on graph neural networks along with our layer-wise reconstruction loss, a loss built upon capturing the first-order and higher-order neighborhood structures. Then, nodes are gradually aligned by computing dual-perception similarity measures including the multi-layer embedding similarity as well as the Tversky similarity, an asymmetric set similarity using the Tversky index applicable to networks with different scales. Additionally, we incorporate an edge augmentation module into Grad-Align to reinforce the structural consistency. Through comprehensive experiments using real-world and synthetic datasets, we empirically demonstrate that Grad-Align consistently outperforms state-of-the-art NA methods.
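The Tversky index mentioned in this abstract is a standard asymmetric set similarity, and a minimal sketch may help show why it suits sets of different sizes. The code below implements only the general definition; how Grad-Align builds the sets being compared is not stated in the abstract, so the neighbor sets and the function name `tversky_index` are hypothetical.

```python
def tversky_index(x, y, alpha, beta):
    """Tversky index: |X ∩ Y| / (|X ∩ Y| + alpha * |X - Y| + beta * |Y - X|)."""
    x, y = set(x), set(y)
    common = len(x & y)
    denominator = common + alpha * len(x - y) + beta * len(y - x)
    # Convention assumed here: similarity is 0 when the denominator vanishes.
    return common / denominator if denominator else 0.0


# Hypothetical neighbor sets: a node in a small network and one in a larger network.
small = {"a", "b"}
large = {"a", "b", "c", "d", "e", "f"}

print(tversky_index(small, large, alpha=1.0, beta=1.0))  # Jaccard case: 2 / 6
print(tversky_index(small, large, alpha=1.0, beta=0.0))  # ignores extras in `large`: 1.0
```

With `alpha = beta = 1` the index reduces to the Jaccard similarity, and with `alpha = beta = 0.5` to the Dice coefficient. Unequal weights make the measure asymmetric, so a node from the smaller network is not penalized for the additional neighbors that exist only in the larger one.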

[22]  arXiv:2201.10947 (cross-list from cs.LG) [pdf, ps, other]
Title: Enabling Deep Learning on Edge Devices through Filter Pruning and Knowledge Transfer
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)

Deep learning models have introduced various intelligent applications to edge devices, such as image classification, speech recognition, and augmented reality. There is an increasing need to train such models on the devices in order to deliver personalized, responsive, and private learning. To address this need, this paper presents a new solution for deploying and training state-of-the-art models on resource-constrained devices. First, the paper proposes a novel filter-pruning-based model compression method to create lightweight trainable models from large models trained in the cloud, without much loss of accuracy. Second, it proposes a novel knowledge transfer method that enables the on-device model to update incrementally in real time or near real time using incremental learning on new data, and to learn unseen categories with the help of the in-cloud model in an unsupervised fashion. The results show that 1) our model compression method can remove up to 99.36% of the parameters of WRN-28-10, while preserving a Top-1 accuracy of over 90% on CIFAR-10; 2) our knowledge transfer method enables the compressed models to achieve more than 90% accuracy on CIFAR-10 and retain good accuracy on old categories; 3) it allows the compressed models to converge within real time (three to six minutes) on the edge for incremental learning tasks; 4) it enables the model to classify unseen categories of data (78.92% Top-1 accuracy) that it was never trained on.

[23]  arXiv:2201.10972 (cross-list from cs.CV) [pdf, other]
Title: How Robust are Discriminatively Trained Zero-Shot Learning Models?
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)

Data shift robustness is an active research topic; however, it has been primarily investigated from a fully supervised perspective, and the robustness of zero-shot learning (ZSL) models has been largely neglected. In this paper, we present a novel analysis of the robustness of discriminative ZSL to image corruptions. We leverage the well-known label embedding model and subject it to a large set of common corruptions and defenses. In order to realize the corruption analysis, we curate and release the first ZSL corruption robustness datasets: SUN-C, CUB-C and AWA2-C. We analyse our results by taking into account the dataset characteristics, class imbalance, class transition trends between seen and unseen classes, and the discrepancies between ZSL and GZSL performances. Our results show that discriminative ZSL suffers from corruptions and that this trend is further exacerbated by the severe class imbalance and model weakness inherent in ZSL methods. We then combine our findings with those based on adversarial attacks in ZSL, and highlight the different effects of corruptions and adversarial examples, such as the pseudo-robustness effect present under adversarial attacks. We also obtain new strong baselines for the label embedding model with certain corruption robustness enhancement methods. Finally, our experiments show that although existing methods to improve robustness somewhat work for ZSL models, they do not produce a tangible effect.

[24]  arXiv:2201.10983 (cross-list from cs.IR) [pdf, other]
Title: Online POI Recommendation: Learning Dynamic Geo-Human Interactions in Streams
Subjects: Information Retrieval (cs.IR); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)

In this paper, we focus on the problem of modeling dynamic geo-human interactions in streams for online POI recommendations. Specifically, we formulate the in-stream geo-human interaction modeling problem as a novel deep interactive reinforcement learning framework, where the agent is a recommender and an action is the next POI to visit. We uniquely model the reinforcement learning environment as a joint and connected composition of users and geospatial contexts (POIs, POI categories, functional zones). An in-stream event in which a user visits a POI updates the states of both users and geospatial contexts; the agent perceives the updated environment state to make online recommendations. In particular, we model a mixed-user event stream by unifying all users, visits, and geospatial contexts as a dynamic knowledge graph stream, in order to model human-human, geo-human, and geo-geo interactions. We design an exit mechanism to address the expired information challenge, devise a meta-path method to address the recommendation candidate generation challenge, develop a new deep policy network structure to address the varying action space challenge, and propose an effective adversarial training method for optimization. Finally, we present extensive experiments to demonstrate the enhanced performance of our method.

[25]  arXiv:2201.10985 (cross-list from cs.CV) [pdf, other]
Title: Jalisco's multiclass land cover analysis and classification using a novel lightweight convnet with real-world multispectral and relief data
Comments: 12 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)

The understanding of global climate change, agriculture resilience, and deforestation control relies on timely observations of Land Use and Land Cover Change (LULCC). Recently, some deep learning (DL) methods have been adapted to automatically classify Land Cover (LC) for global and homogeneous data. However, most of these DL models cannot be applied effectively to real-world data, i.e., data with a large number of classes, multi-seasonal coverage, diverse climate regions, highly imbalanced labels, and low spatial resolution. In this work, we present our novel lightweight (only 89k parameters) Convolutional Neural Network (ConvNet) for LC classification and analysis that handles these problems for the Jalisco region. In contrast to global approaches, regional data provide the context-specificity that policymakers require to plan land use and management, conservation areas, or ecosystem services. We combine three real-world open data sources to obtain 13 channels. Our embedded analysis anticipates the limited performance in some classes and gives us the opportunity to group the most similar ones; as a result, the test accuracy increases from 73% to 83%. We hope that this research helps other regional groups with limited data sources or computational resources to attain the United Nations Sustainable Development Goal (SDG) concerning Life on Land.

[26]  arXiv:2201.10989 (cross-list from stat.ML) [pdf, other]
Title: Uphill Roads to Variational Tightness: Monotonicity and Monte Carlo Objectives
Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Computation (stat.CO); Methodology (stat.ME)

We revisit the theory of importance weighted variational inference (IWVI), a promising strategy for learning latent variable models. IWVI uses new variational bounds, known as Monte Carlo objectives (MCOs), obtained by replacing intractable integrals by Monte Carlo estimates -- usually simply obtained via importance sampling. Burda, Grosse and Salakhutdinov (2016) showed that increasing the number of importance samples provably tightens the gap between the bound and the likelihood. Inspired by this simple monotonicity theorem, we present a series of nonasymptotic results that link properties of Monte Carlo estimates to tightness of MCOs. We challenge the rationale that smaller Monte Carlo variance leads to better bounds. We confirm theoretically the empirical findings of several recent papers by showing that, in a precise sense, negative correlation reduces the variational gap. We also generalise the original monotonicity theorem by considering non-uniform weights. We discuss several practical consequences of our theoretical results. Our work borrows many ideas and results from the theory of stochastic orders.
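For reference, the monotonicity result attributed above to Burda, Grosse and Salakhutdinov (2016) can be written as follows. The notation is an assumption made here and is not taken from the abstract: $p(x, z)$ is the joint model, $q(z \mid x)$ the variational proposal, and $K$ the number of importance samples.

```latex
\mathcal{L}_K(x)
  = \mathbb{E}_{z_1,\dots,z_K \sim q(\cdot \mid x)}
    \left[ \log \frac{1}{K} \sum_{k=1}^{K} \frac{p(x, z_k)}{q(z_k \mid x)} \right],
\qquad
\mathcal{L}_1(x) \le \mathcal{L}_K(x) \le \mathcal{L}_{K+1}(x) \le \log p(x).
```

Here $\mathcal{L}_1$ is the standard evidence lower bound, and the gap $\log p(x) - \mathcal{L}_K(x)$ is the variational gap whose tightness the abstract discusses.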

[27]  arXiv:2201.11091 (cross-list from cs.CV) [pdf, ps, other]
Title: Momentum Capsule Networks
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)

Capsule networks are a class of neural networks that have achieved promising results on many computer vision tasks. However, baseline capsule networks have failed to reach state-of-the-art results on more complex datasets due to their high computation and memory requirements. We tackle this problem by proposing a new network architecture, called Momentum Capsule Network (MoCapsNet). MoCapsNets are inspired by Momentum ResNets, a type of network that applies reversible residual building blocks. Reversible networks allow for recalculating activations of the forward pass in the backpropagation algorithm, so that memory requirements can be drastically reduced. In this paper, we provide a framework on how invertible residual building blocks can be applied to capsule networks. We show that MoCapsNet beats the accuracy of baseline capsule networks on MNIST, SVHN and CIFAR-10 while using considerably less memory. The source code is available at https://github.com/moejoe95/MoCapsNet.

[28]  arXiv:2201.11104 (cross-list from cs.LG) [pdf, other]
Title: Combining optimal path search with task-dependent learning in a neural network
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)

Finding optimal paths in connected graphs requires determining the smallest total cost for traveling along the graph's edges. This problem can be solved by several classical algorithms in which costs are usually predefined for all edges. Conventional planning methods therefore normally cannot be used when costs need to change adaptively according to the requirements of some task. Here we show that one can define a neural network representation of path-finding problems by transforming cost values into synaptic weights, which allows for online weight adaptation using network learning mechanisms. When starting with an initial activity value of one, activity propagation in this network leads to solutions that are identical to those found by the Bellman-Ford algorithm. The neural network has the same algorithmic complexity as Bellman-Ford and, in addition, we show that network learning mechanisms (such as Hebbian learning) can adapt the weights in the network, augmenting the resulting paths according to the task at hand. We demonstrate this by learning to navigate in an environment with obstacles as well as by learning to follow certain sequences of path nodes. Hence, the novel algorithm presented here may open up a different regime of applications in which path augmentation (by learning) is directly coupled with path finding in a natural way.
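For readers unfamiliar with the baseline named in this abstract, here is a minimal sketch of the classical Bellman-Ford algorithm with fixed, predefined edge costs. It is not the paper's neural network formulation; the function name, the edge-list representation, and the example graph are assumptions chosen for illustration.

```python
from math import inf


def bellman_ford(num_nodes, edges, source):
    """Classical Bellman-Ford: cheapest total cost from source to every node.

    edges is a list of (u, v, cost) tuples for directed edges u -> v.
    Unreachable nodes keep a distance of infinity.
    """
    dist = [inf] * num_nodes
    dist[source] = 0.0

    # Relax every edge up to num_nodes - 1 times; stop early if nothing changes.
    for _ in range(num_nodes - 1):
        updated = False
        for u, v, cost in edges:
            if dist[u] + cost < dist[v]:
                dist[v] = dist[u] + cost
                updated = True
        if not updated:
            break

    # One more pass: any further improvement implies a negative-cost cycle.
    for u, v, cost in edges:
        if dist[u] + cost < dist[v]:
            raise ValueError("graph contains a negative-cost cycle")
    return dist


# Hypothetical 4-node graph: the direct edge 0 -> 1 costs more than going via node 2.
example_edges = [(0, 1, 4.0), (0, 2, 1.0), (2, 1, 2.0), (1, 3, 1.0)]
print(bellman_ford(4, example_edges, source=0))
```

The costs here are fixed inputs, which is the limitation the abstract points to: changing them for a new task means rerunning the search rather than adapting weights online.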

[29]  arXiv:2201.11105 (cross-list from cs.MM) [pdf]
Title: Do You See What I See? Capabilities and Limits of Automated Multimedia Content Analysis
Subjects: Multimedia (cs.MM); Artificial Intelligence (cs.AI)

The ever-increasing amount of user-generated content online has led, in recent years, to an expansion in research and investment in automated content analysis tools. Scrutiny of automated content analysis has accelerated during the COVID-19 pandemic, as social networking services have placed a greater reliance on these tools due to concerns about health risks to their moderation staff from in-person work. At the same time, there are important policy debates around the world about how to improve content moderation while protecting free expression and privacy. In order to advance these debates, we need to understand the potential role of automated content analysis tools.
This paper explains the capabilities and limitations of tools for analyzing online multimedia content and highlights the potential risks of using these tools at scale without accounting for their limitations. It focuses on two main categories of tools: matching models and computer prediction models. Matching models include cryptographic and perceptual hashing, which compare user-generated content with existing and known content. Predictive models (including computer vision and computer audition) are machine learning techniques that aim to identify characteristics of new or previously unknown content.

[30]  arXiv:2201.11113 (cross-list from cs.LG) [pdf, other]
Title: Post-training Quantization for Neural Networks with Provable Guarantees
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)

While neural networks have been remarkably successful in a wide array of applications, implementing them in resource-constrained hardware remains an area of intense research. By replacing the weights of a neural network with quantized (e.g., 4-bit, or binary) counterparts, massive savings in computation cost, memory, and power consumption are attained. We modify a post-training neural-network quantization method, GPFQ, that is based on a greedy path-following mechanism, and rigorously analyze its error. We prove that for quantizing a single-layer network, the relative square error essentially decays linearly in the number of weights -- i.e., level of over-parametrization. Our result holds across a range of input distributions and for both fully-connected and convolutional architectures. To empirically evaluate the method, we quantize several common architectures with few bits per weight, and test them on ImageNet, showing only minor loss of accuracy. We also demonstrate that standard modifications, such as bias correction and mixed precision quantization, further improve accuracy.

[31]  arXiv:2201.11114 (cross-list from cs.CV) [pdf, other]
Title: Natural Language Descriptions of Deep Visual Features
Comments: To be published as a conference paper at ICLR 2022
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)

Some neurons in deep networks specialize in recognizing highly specific perceptual, structural, or semantic features of inputs. In computer vision, techniques exist for identifying neurons that respond to individual concept categories like colors, textures, and object classes. But these techniques are limited in scope, labeling only a small subset of neurons and behaviors in any network. Is a richer characterization of neuron-level computation possible? We introduce a procedure (called MILAN, for mutual-information-guided linguistic annotation of neurons) that automatically labels neurons with open-ended, compositional, natural language descriptions. Given a neuron, MILAN generates a description by searching for a natural language string that maximizes pointwise mutual information with the image regions in which the neuron is active. MILAN produces fine-grained descriptions that capture categorical, relational, and logical structure in learned features. These descriptions obtain high agreement with human-generated feature descriptions across a diverse set of model architectures and tasks, and can aid in understanding and controlling learned models. We highlight three applications of natural language neuron descriptions. First, we use MILAN for analysis, characterizing the distribution and importance of neurons selective for attribute, category, and relational information in vision models. Second, we use MILAN for auditing, surfacing neurons sensitive to protected categories like race and gender in models trained on datasets intended to obscure these features. Finally, we use MILAN for editing, improving robustness in an image classifier by deleting neurons sensitive to text features spuriously correlated with class labels.
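The selection criterion described in this abstract, choosing the description that maximizes pointwise mutual information (PMI) with a neuron's active image regions, can be illustrated with a toy calculation. This is only a sketch of the PMI criterion, not the MILAN implementation: the function names and all probability values below are hypothetical, and in practice such probabilities would come from learned models.

```python
from math import log


def pmi(p_desc_given_regions: float, p_desc: float) -> float:
    """Pointwise mutual information: log p(d | E) - log p(d)."""
    return log(p_desc_given_regions) - log(p_desc)


def best_description(candidates: dict) -> str:
    """Return the description whose (p(d | E), p(d)) pair gives the highest PMI."""
    return max(candidates, key=lambda d: pmi(*candidates[d]))


# Hypothetical values: description -> (p(d | E), p(d)).
# The generic caption is likely under any regions; the specific one is rare overall.
candidates = {
    "an object": (0.30, 0.25),
    "dog faces": (0.20, 0.01),
}
print(best_description(candidates))
```

Because PMI divides out the prior probability of a description, a generic caption that is likely for any neuron scores low even when its conditional probability is high, which favors specific descriptions.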

Replacements for Thu, 27 Jan 22

[32]  arXiv:2012.09966 (replaced) [pdf, other]
Title: Predicting Decisions in Language Based Persuasion Games
Comments: Under review for the Journal of Artificial Intelligence Research (JAIR)
Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Science and Game Theory (cs.GT)
[33]  arXiv:2010.01069 (replaced) [pdf, other]
Title: A Deeper Look at Discounting Mismatch in Actor-Critic Algorithms
Comments: AAMAS 2022
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
[34]  arXiv:2012.09737 (replaced) [pdf, other]
Title: Model-free and Bayesian Ensembling Model-based Deep Reinforcement Learning for Particle Accelerator Control Demonstrated on the FERMI FEL
Comments: 13 pages, 17 figures - minor changes and adaption to physical review journals
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Systems and Control (eess.SY); Accelerator Physics (physics.acc-ph)
[35]  arXiv:2107.10998 (replaced) [pdf, other]
Title: Pruning Ternary Quantization
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[36]  arXiv:2110.09074 (replaced) [pdf, other]
Title: Towards General Deep Leakage in Federated Learning
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
[37]  arXiv:2110.14754 (replaced) [pdf, other]
Title: Confidence-Aware Imitation Learning from Demonstrations with Varying Optimality
Comments: 18 pages, 4 figures, 3 tables. Published at Conference on Neural Information Processing Systems (NeurIPS) 2021
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Robotics (cs.RO)
[38]  arXiv:2110.15803 (replaced) [pdf, other]
Title: Natural Language Processing for Smart Healthcare
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
[39]  arXiv:2111.00131 (replaced) [pdf, other]
Title: Three approaches to facilitate DNN generalization to objects in out-of-distribution orientations and illuminations
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[40]  arXiv:2111.08995 (replaced) [pdf, other]
Title: Self-Learning Tuning for Post-Silicon Validation
Comments: Paper is currently under review for TuZ 22 (Testmethoden und Zuverlässigkeit von Schaltungen und Systemen)
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
[41]  arXiv:2112.14706 (replaced) [pdf]
Title: Intersection focused Situation Coverage-based Verification and Validation Framework for Autonomous Vehicles Implemented in CARLA
Comments: International Conference on Modelling and Simulation for Autonomous Systems, MESAS 2021
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Multiagent Systems (cs.MA); Software Engineering (cs.SE)
[42]  arXiv:2201.03891 (replaced) [pdf, other]
Title: A Saliency based Feature Fusion Model for EEG Emotion Estimation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[43]  arXiv:2201.06626 (replaced) [pdf, other]
Title: Neural Network Compression of ACAS Xu is Unsafe: Closed-Loop Verification through Quantized State Backreachability
Subjects: Numerical Analysis (math.NA); Artificial Intelligence (cs.AI); Logic in Computer Science (cs.LO)
[44]  arXiv:2201.07708 (replaced) [pdf, other]
Title: Debiased Graph Neural Networks with Agnostic Label Selection Bias
Comments: Accepted by TNNLS;12 pages
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
[45]  arXiv:2201.10474 (replaced) [pdf, other]
Title: Whose Language Counts as High Quality? Measuring Language Ideologies in Text Data Selection
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)