Machine Learning

New submissions

[ total of 57 entries: 1-57 ]

New submissions for Thu, 3 Dec 20

[1]  arXiv:2012.01012 [pdf, other]
Title: Information Theory in Density Destructors
Comments: Accepted at the Workshop on Invertible Neural Nets and Normalizing Flows, ICML 2019
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)

Density destructors are differentiable and invertible transforms that map multivariate PDFs of arbitrary structure (low entropy) into non-structured PDFs (maximum entropy). Multivariate Gaussianization and multivariate equalization are specific examples of this family, which break down the complexity of the original PDF through a set of elementary transforms that progressively remove the structure of the data. We demonstrate how this property of density destructive flows is connected to classical information theory, and how density destructors can be used to get more accurate estimates of information theoretic quantities. Experiments with total correlation and mutual information in multivariate sets illustrate the ability of density destructors compared to competing methods. These results suggest that information theoretic measures may serve as alternative optimization criteria when learning density destructive flows.
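
For readers unfamiliar with total correlation, the quantity named in the abstract above is the sum of the marginal entropies minus the joint entropy. The sketch below evaluates that textbook definition for a small discrete distribution. It is only an illustration of the quantity itself, not of the paper's density destructors, and the function names and example distributions are assumptions made here.

```python
from collections import defaultdict
from math import log

def entropy(pmf):
    """Shannon entropy in nats of a dict mapping outcomes to probabilities."""
    return -sum(p * log(p) for p in pmf.values() if p > 0)

def marginal(joint, axis):
    """Marginal pmf of coordinate `axis` from a joint pmf keyed by tuples."""
    out = defaultdict(float)
    for outcome, p in joint.items():
        out[outcome[axis]] += p
    return dict(out)

def total_correlation(joint):
    """TC(X) = sum_i H(X_i) - H(X_1, ..., X_d), in nats."""
    d = len(next(iter(joint)))
    return sum(entropy(marginal(joint, i)) for i in range(d)) - entropy(joint)

# Two independent fair bits: no shared structure.
independent = {(a, b): 0.25 for a in (0, 1) for b in (0, 1)}
# Second bit is a copy of the first: maximal dependence.
copied = {(0, 0): 0.5, (1, 1): 0.5}

tc_independent = total_correlation(independent)
tc_copied = total_correlation(copied)
```

For two variables, total correlation coincides with mutual information, so the copied-bit case yields one bit (log 2 nats) and the independent case yields zero.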

[2]  arXiv:2012.01089 [pdf, other]
Title: Aligning Hyperbolic Representations: an Optimal Transport-based approach
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)

Hyperbolic spaces are better suited to represent data with underlying hierarchical relationships, e.g., tree-like data. However, it is often necessary to incorporate, through alignment, different but related representations meaningfully. Such alignment is an important class of machine learning problems, with applications such as ontology matching and cross-lingual alignment. Optimal transport (OT)-based approaches are a natural choice to tackle the alignment problem as they aim to find a transformation of the source dataset to match a target dataset, subject to some distribution constraints. This work proposes a novel approach based on OT of embeddings on the Poincaré model of hyperbolic spaces. Our method relies on the gyrobarycenter mapping on Möbius gyrovector spaces. As a result of this formalism, we derive extensions to some existing Euclidean methods of OT-based domain adaptation to their hyperbolic counterparts. Empirically, we show that both Euclidean and hyperbolic methods have similar performances in the context of retrieval.

Cross-lists for Thu, 3 Dec 20

[3]  arXiv:2012.00745 (cross-list from econ.EM) [pdf, ps, other]
Title: Double machine learning for sample selection models
Comments: arXiv admin note: text overlap with arXiv:2012.00370
Subjects: Econometrics (econ.EM); Methodology (stat.ME); Machine Learning (stat.ML)

This paper considers treatment evaluation when outcomes are only observed for a subpopulation due to sample selection or outcome attrition/non-response. For identification, we combine a selection-on-observables assumption for treatment assignment with either selection-on-observables or instrumental variable assumptions concerning the outcome attrition/sample selection process. To control in a data-driven way for potentially high dimensional pre-treatment covariates that motivate the selection-on-observables assumptions, we adapt the double machine learning framework to sample selection problems. That is, we make use of (a) Neyman-orthogonal and doubly robust score functions, which imply the robustness of treatment effect estimation to moderate regularization biases in the machine learning-based estimation of the outcome, treatment, or sample selection models and (b) sample splitting (or cross-fitting) to prevent overfitting bias. We demonstrate that the proposed estimators are asymptotically normal and root-n consistent under specific regularity conditions concerning the machine learners. The estimator is available in the causalweight package for the statistical software R.

[4]  arXiv:2012.00752 (cross-list from cs.LG) [pdf, other]
Title: Forecasting Black Sigatoka Infection Risks with Latent Neural ODEs
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)

Black Sigatoka disease severely decreases global banana production, and climate change aggravates the problem by altering fungal species distributions. Due to the heavy financial burden of managing this infectious disease, farmers in developing countries face significant banana crop losses. Though scientists have produced mathematical models of infectious diseases, adapting these models to incorporate climate effects is difficult. We present MR. NODE (Multiple predictoR Neural ODE), a neural network that models the dynamics of black Sigatoka infection learnt directly from data via Neural Ordinary Differential Equations. Our method encodes external predictor factors into the latent space in addition to the variable that we infer, and it can also predict the infection risk at an arbitrary point in time. Empirically, we demonstrate on historical climate data that our method has superior generalization performance on time points up to one month in the future and unseen irregularities. We believe that our method can be a useful tool to control the spread of black Sigatoka.

[5]  arXiv:2012.00780 (cross-list from cs.LG) [pdf, other]
Title: Refining Deep Generative Models via Wasserstein Gradient Flows
Comments: Preprint
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)

Deep generative modeling has seen impressive advances in recent years, to the point where it is now commonplace to see simulated samples (e.g., images) that closely resemble real-world data. However, generation quality is generally inconsistent for any given model and can vary dramatically between samples. We introduce Discriminator Gradient flow (DGflow), a new technique that improves generated samples via the gradient flow of entropy-regularized f-divergences between the real and the generated data distributions. The gradient flow takes the form of a non-linear Fokker-Planck equation, which can be easily simulated by sampling from the equivalent McKean-Vlasov process. By refining inferior samples, our technique avoids wasteful sample rejection used by previous methods (DRS & MH-GAN). Compared to existing works that focus on specific GAN variants, we show our refinement approach can be applied to GANs with vector-valued critics and even other deep generative models such as VAEs and Normalizing Flows. Empirical results on multiple synthetic, image, and text datasets demonstrate that DGflow leads to significant improvement in the quality of generated samples for a variety of generative models, outperforming the state-of-the-art Discriminator Optimal Transport (DOT) and Discriminator Driven Latent Sampling (DDLS) methods.

[6]  arXiv:2012.00805 (cross-list from cs.LG) [pdf, other]
Title: Stochastic Approximation with Markov Noise: Analysis and applications in reinforcement learning
Comments: 124 pages, PhD thesis, IIS, Bangalore (2020)
Subjects: Machine Learning (cs.LG); Dynamical Systems (math.DS); Probability (math.PR); Machine Learning (stat.ML)

We present for the first time an asymptotic convergence analysis of two time-scale stochastic approximation driven by "controlled" Markov noise. In particular, the faster and slower recursions have non-additive controlled Markov noise components in addition to martingale difference noise. We analyze the asymptotic behavior of our framework by relating it to limiting differential inclusions in both time scales that are defined in terms of the ergodic occupation measures associated with the controlled Markov processes. Using a special case of our results, we present a solution to the off-policy convergence problem for temporal-difference learning with linear function approximation. We compile several aspects of the dynamics of stochastic approximation algorithms with Markov iterate-dependent noise when the iterates are not known to be stable beforehand. We achieve the same by extending the lock-in probability framework to such recursions, where the lock-in probability is the probability of convergence to a specific attractor of the limiting o.d.e. given that the iterates are in its domain of attraction after a sufficiently large number of iterations (say) n_0. We use these results to prove almost sure convergence of the iterates to the specified attractor when the iterates satisfy an "asymptotic tightness" condition. This, in turn, is shown to be useful in analyzing the tracking ability of general "adaptive" algorithms. Finally, we obtain the first informative error bounds on function approximation for the policy evaluation algorithm proposed by Basu et al. when the aim is to find the risk-sensitive cost represented using exponential utility. We show that this happens due to the absence of a difference term in the earlier bound which is always present in all our bounds when the state space is large.

[7]  arXiv:2012.00807 (cross-list from math.ST) [pdf, ps, other]
Title: Minimum $\ell_1$-norm interpolation via basis pursuit is robust to errors
Subjects: Statistics Theory (math.ST); Information Theory (cs.IT); Numerical Analysis (math.NA); Machine Learning (stat.ML)

This article studies basis pursuit, i.e. minimum $\ell_1$-norm interpolation, in sparse linear regression with additive errors. No conditions on the errors are imposed. It is assumed that the number of i.i.d. Gaussian features grows superlinearly in the number of samples. The main result is that under these conditions the Euclidean error of recovering the true regressor is of the order of the average noise level. Hence, the regressor recovered by basis pursuit is close to the truth if the average noise level is small. Lower bounds that show near optimality of the results complement the analysis. In addition, these results are extended to low rank trace regression. The proofs rely on new lower tail bounds for maxima of Gaussian vectors and the spectral norm of Gaussian matrices, respectively, and might be of independent interest as they are significantly stronger than the corresponding upper tail bounds.
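
As a rough illustration of the estimator the abstract above refers to, basis pursuit solves min ||w||_1 subject to Xw = y, which can be posed as a linear program by splitting w into nonnegative parts. The sketch below does this with `scipy.optimize.linprog`. The problem sizes, noise scale, and the helper name `basis_pursuit` are assumptions chosen for the example and are not taken from the paper.

```python
import numpy as np
from scipy.optimize import linprog

def basis_pursuit(X, y):
    """Minimum l1-norm interpolator: min ||w||_1 subject to X w = y."""
    n, p = X.shape
    # Write w = u - v with u, v >= 0; minimizing sum(u) + sum(v)
    # then equals minimizing ||w||_1 at the optimum.
    cost = np.ones(2 * p)
    A_eq = np.hstack([X, -X])
    res = linprog(cost, A_eq=A_eq, b_eq=y, bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:p] - res.x[p:]

# Illustrative setup: many more Gaussian features than samples,
# a sparse true regressor, and small additive errors.
rng = np.random.default_rng(0)
n, p = 40, 200
X = rng.standard_normal((n, p))
w_true = np.zeros(p)
w_true[:3] = [1.5, -2.0, 1.0]
y = X @ w_true + 0.01 * rng.standard_normal(n)

w_hat = basis_pursuit(X, y)
recovery_error = np.linalg.norm(w_hat - w_true)
```

Because the constraint is an exact equality, `w_hat` interpolates the noisy responses; `recovery_error` is the Euclidean error that the abstract's main result concerns.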

[8]  arXiv:2012.00898 (cross-list from cs.CL) [pdf, ps, other]
Title: Federated Marginal Personalization for ASR Rescoring
Authors: Zhe Liu, Fuchun Peng
Subjects: Computation and Language (cs.CL); Machine Learning (stat.ML)

We introduce federated marginal personalization (FMP), a novel method for continuously updating personalized neural network language models (NNLMs) on private devices using federated learning (FL). Instead of fine-tuning the parameters of NNLMs on personal data, FMP regularly estimates global and personalized marginal distributions of words, and adjusts the probabilities from NNLMs by an adaptation factor that is specific to each word. Our presented approach can overcome the limitations of federated fine-tuning and efficiently learn personalized NNLMs on devices. We study the application of FMP on second-pass ASR rescoring tasks. Experiments on two speech evaluation datasets show modest word error rate (WER) reductions. We also demonstrate that FMP could offer reasonable privacy with only a negligible cost in speech recognition accuracy.

[9]  arXiv:2012.01064 (cross-list from cs.LG) [pdf, other]
Title: About contrastive unsupervised representation learning for classification and its convergence
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)

Contrastive representation learning has recently been shown to be very efficient for self-supervised training. These methods have been successfully used to train encoders which perform comparably to supervised training on downstream classification tasks. A few works have started to build a theoretical framework around contrastive learning in which guarantees for its performance can be proven. We provide extensions of these results to training with multiple negative samples and for multiway classification. Furthermore, we provide convergence guarantees for the minimization of the contrastive training error with gradient descent of an overparametrized deep neural encoder, and provide some numerical experiments that complement our theoretical findings.

[10]  arXiv:2012.01088 (cross-list from math.OC) [pdf, ps, other]
Title: Residuals-based distributionally robust optimization with covariate information
Subjects: Optimization and Control (math.OC); Machine Learning (stat.ML)

We consider data-driven approaches that integrate a machine learning prediction model within distributionally robust optimization (DRO) given limited joint observations of uncertain parameters and covariates. Our framework is flexible in the sense that it can accommodate a variety of learning setups and DRO ambiguity sets. We investigate the asymptotic and finite sample properties of solutions obtained using Wasserstein, sample robust optimization, and phi-divergence-based ambiguity sets within our DRO formulations, and explore cross-validation approaches for sizing these ambiguity sets. Through numerical experiments, we validate our theoretical results, study the effectiveness of our approaches for sizing ambiguity sets, and illustrate the benefits of our DRO formulations in the limited data regime even when the prediction model is misspecified.

[11]  arXiv:2012.01185 (cross-list from math.OC) [pdf, other]
Title: New Algorithms And Fast Implementations To Approximate Stochastic Processes
Subjects: Optimization and Control (math.OC); Machine Learning (stat.ML)

We present new algorithms and fast implementations to find efficient approximations for modelling stochastic processes. For many numerical computations it is essential to develop finite approximations for stochastic processes. While the goal is always to find a finite model that represents the given knowledge about the real data process as accurately as possible, the ways of estimating the discrete approximating model may be quite different: (i) if the stochastic model is known, e.g., as a solution of a stochastic differential equation, one may generate the scenario tree directly from the specified model; (ii) if a simulation algorithm is available, which allows simulating trajectories from all conditional distributions, a scenario tree can be generated by stochastic approximation; (iii) if only some observed trajectories of the scenario process are available, the construction of the approximating process can be based on non-parametric conditional density estimates.

[12]  arXiv:2012.01194 (cross-list from math.NA) [pdf, ps, other]
Title: Deep learning based numerical approximation algorithms for stochastic partial differential equations and high-dimensional nonlinear filtering problems
Comments: arXiv admin note: text overlap with arXiv:1907.03452
Subjects: Numerical Analysis (math.NA); Machine Learning (cs.LG); Probability (math.PR); Machine Learning (stat.ML)

In this article we introduce and study a deep learning based approximation algorithm for solutions of stochastic partial differential equations (SPDEs). In the proposed approximation algorithm we employ a deep neural network for every realization of the driving noise process of the SPDE to approximate the solution process of the SPDE under consideration. We test the performance of the proposed approximation algorithm in the case of stochastic heat equations with additive noise, stochastic heat equations with multiplicative noise, stochastic Black--Scholes equations with multiplicative noise, and Zakai equations from nonlinear filtering. In each of these SPDEs the proposed approximation algorithm produces accurate results with short run times in up to 50 space dimensions.

[13]  arXiv:2012.01205 (cross-list from cs.LG) [pdf, other]
Title: VisEvol: Visual Analytics to Support Hyperparameter Search through Evolutionary Optimization
Comments: This manuscript is currently under review
Subjects: Machine Learning (cs.LG); Human-Computer Interaction (cs.HC); Machine Learning (stat.ML)

During the training phase of machine learning (ML) models, it is usually necessary to configure several hyperparameters. This process is computationally intensive and requires an extensive search to infer the best hyperparameter set for the given problem. The challenge is exacerbated by the fact that most ML models are complex internally, and training involves trial-and-error processes that could remarkably affect the predictive result. Moreover, each hyperparameter of an ML algorithm is potentially intertwined with the others, and changing it might result in unforeseeable impacts on the remaining hyperparameters. Evolutionary optimization is a promising method to try and address those issues. According to this method, performant models are stored, while the remainder are improved through crossover and mutation processes inspired by genetic algorithms. We present VisEvol, a visual analytics tool that supports interactive exploration of hyperparameters and intervention in this evolutionary procedure. In summary, our proposed tool helps the user to generate new models through evolution and eventually explore powerful hyperparameter combinations in diverse regions of the extensive hyperparameter space. The outcome is a voting ensemble (with equal rights) that boosts the final predictive performance. The utility and applicability of VisEvol are demonstrated with two use cases and interviews with ML experts who evaluated the effectiveness of the tool.
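
The evolutionary loop described in the abstract above (keep the performant configurations, replace the remainder through crossover and mutation) can be sketched in a few lines. This is a generic illustration under stated assumptions, not VisEvol's implementation: `toy_score` is a hypothetical stand-in for a real validation metric, and the two hyperparameters, their ranges, and the population size are invented for the example.

```python
import math
import random

def toy_score(cfg):
    """Hypothetical stand-in for validation accuracy; peaks at lr=0.01, depth=5."""
    return -((math.log10(cfg["lr"]) + 2) ** 2) - (cfg["depth"] - 5) ** 2

def crossover(a, b, rng):
    """Take each hyperparameter from one of the two parents at random."""
    return {key: rng.choice((a[key], b[key])) for key in a}

def mutate(cfg, rng):
    """Scale the learning rate randomly and shift the depth by at most one."""
    return {
        "lr": cfg["lr"] * 10 ** rng.uniform(-0.5, 0.5),
        "depth": max(1, cfg["depth"] + rng.choice((-1, 0, 1))),
    }

def next_generation(population, n_keep, rng):
    """Keep the n_keep best configurations; replace the rest with mutated offspring."""
    ranked = sorted(population, key=toy_score, reverse=True)
    elite = ranked[:n_keep]
    children = []
    for _ in ranked[n_keep:]:
        parent_a, parent_b = rng.sample(ranked, 2)
        children.append(mutate(crossover(parent_a, parent_b, rng), rng))
    return elite + children

rng = random.Random(0)
population = [
    {"lr": 10 ** rng.uniform(-5, 0), "depth": rng.randint(1, 12)}
    for _ in range(10)
]
best_scores = []
for _ in range(20):
    best_scores.append(max(map(toy_score, population)))
    population = next_generation(population, n_keep=3, rng=rng)
best_scores.append(max(map(toy_score, population)))
```

Because the best configurations are carried over unchanged, the best score in `best_scores` can never decrease from one generation to the next; in practice `toy_score` would be replaced by an actual train-and-validate step.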

[14]  arXiv:2012.01293 (cross-list from cs.LG) [pdf, other]
Title: The Self-Simplifying Machine: Exploiting the Structure of Piecewise Linear Neural Networks to Create Interpretable Models
Authors: William Knauth
Comments: 38 pages, 32 figures, appendices A, B
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)

Today, it is more important than ever before for users to have trust in the models they use. As Machine Learning models fall under increased regulatory scrutiny and begin to see more applications in high-stakes situations, it becomes critical to explain our models. Piecewise Linear Neural Networks (PLNN) with the ReLU activation function have quickly become extremely popular models due to many appealing properties; however, they still present many challenges in the areas of robustness and interpretation. To this end, we introduce novel methodology toward simplification and increased interpretability of Piecewise Linear Neural Networks for classification tasks. Our methods include the use of a trained, deep network to produce a well-performing, single-hidden-layer network without further stochastic training, in addition to an algorithm to reduce flat networks to a smaller, more interpretable size with minimal loss in performance. On these methods, we conduct preliminary studies of model performance, as well as a case study on Wells Fargo's Home Lending dataset, together with visual model interpretation.

[15]  arXiv:2012.01349 (cross-list from stat.AP) [pdf, other]
Title: The temporal overfitting problem with applications in wind power curve modeling
Comments: 30 pages, 6 figures
Subjects: Applications (stat.AP); Machine Learning (stat.ML)

This paper is concerned with a nonparametric regression problem in which the independence assumption of the input variables and the residuals is no longer valid. Using existing model selection methods, like cross validation, the presence of temporal autocorrelation in the input variables and the error terms leads to model overfitting. This phenomenon is referred to as temporal overfitting, which causes loss of performance while predicting responses for a time domain different from the training time domain. We propose a new method to tackle the temporal overfitting problem. Our nonparametric model is partitioned into two parts -- a time-invariant component and a time-varying component, each of which is modeled through a Gaussian process regression. The key in our inference is a thinning-based strategy, an idea borrowed from Markov chain Monte Carlo sampling, to estimate the two components, respectively. Our specific application in this paper targets the power curve modeling in wind energy. In our numerical studies, we extensively compare our proposed method with both existing power curve models and available ideas for handling temporal overfitting. Our approach yields significant improvement in prediction both in and outside the time domain covered by the training data.

Replacements for Thu, 3 Dec 20

[16]  arXiv:1706.06296 (replaced) [pdf, ps, other]
Title: Approximate Kernel PCA Using Random Features: Computational vs. Statistical Trade-off
Comments: 57 pages
Subjects: Machine Learning (stat.ML); Statistics Theory (math.ST)
[17]  arXiv:1805.11028 (replaced) [pdf, other]
Title: Autoencoding any Data through Kernel Autoencoders
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)
[18]  arXiv:1810.07924 (replaced) [pdf, other]
Title: Explaining Machine Learning Models using Entropic Variable Projection
Authors: François Bachoc (IMT), Fabrice Gamboa (IMT), Max Halford (IMT, IRIT), Jean-Michel Loubes (IMT), Laurent Risser (IMT)
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)
[19]  arXiv:1908.05339 (replaced) [pdf]
Title: Mixed pooling of seasonality for time series forecasting: An application to pallet transport data
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Applications (stat.AP)
[20]  arXiv:2003.06281 (replaced) [pdf, other]
Title: BayesFlow: Learning complex stochastic models with invertible neural networks
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)
[21]  arXiv:2006.08539 (replaced) [pdf, other]
Title: Layer-wise Learning of Kernel Dependence Networks
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)
[22]  arXiv:1807.04252 (replaced) [pdf, other]
Title: Last-Iterate Convergence: Zero-Sum Games and Constrained Min-Max Optimization
Comments: Appeared in ITCS 2019
Subjects: Optimization and Control (math.OC); Computer Science and Game Theory (cs.GT); Machine Learning (stat.ML)
[23]  arXiv:1811.03666 (replaced) [pdf, other]
Title: Statistical Characteristics of Deep Representations: An Empirical Investigation
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
[24]  arXiv:1901.00109 (replaced) [pdf, other]
Title: Morphological Network: How Far Can We Go with Morphological Neurons?
Comments: 35 pages, 19 figures, 7 tables
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)
[25]  arXiv:1902.03466 (replaced) [pdf, other]
Title: Hierarchical Multi-task Deep Neural Network Architecture for End-to-End Driving
Comments: 18 pages, 17 plots and figures
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO); Machine Learning (stat.ML)
[26]  arXiv:1903.03332 (replaced) [pdf, other]
Title: GCOMB: Learning Budget-constrained Combinatorial Algorithms over Billion-sized Graphs
Comments: To appear in NeurIPS 2020
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
[27]  arXiv:1905.00568 (replaced) [pdf, ps, other]
Title: Weight Map Layer for Noise and Adversarial Attack Robustness
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
[28]  arXiv:1911.01067 (replaced) [pdf, other]
Title: Blind Network Revenue Management and Bandits with Knapsacks under Limited Switches
Subjects: Machine Learning (cs.LG); Computer Science and Game Theory (cs.GT); Optimization and Control (math.OC); Machine Learning (stat.ML)
[29]  arXiv:1911.02497 (replaced) [pdf, other]
Title: A Programmable Approach to Neural Network Compression
Comments: This is an updated version of a paper published in IEEE Micro, vol. 40, no. 5, pp. 17-25, Sept.-Oct. 2020
Journal-ref: IEEE Micro, Volume: 40, Issue: 5, Sept.-Oct. 2020, pp. 17-25
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
[30]  arXiv:2002.02826 (replaced) [pdf, other]
Title: Deep Moment Matching Kernel for Multi-source Gaussian Processes
Comments: Revised version, title changed, experiments on high dimensional multi-fidelity data added, related works revised
Subjects: Machine Learning (cs.LG); Disordered Systems and Neural Networks (cond-mat.dis-nn); Machine Learning (stat.ML)
[31]  arXiv:2002.06910 (replaced) [pdf, other]
Title: t-viSNE: Interactive Assessment and Interpretation of t-SNE Projections
Comments: This manuscript is published in the IEEE Transactions on Visualization and Computer Graphics Journal (IEEE TVCG)
Journal-ref: IEEE TVCG 2020, 26(8), 2696-2714
Subjects: Machine Learning (cs.LG); Human-Computer Interaction (cs.HC); Machine Learning (stat.ML)
[32]  arXiv:2002.08104 (replaced) [pdf, other]
Title: Analyzing Neural Networks Based on Random Graphs
Comments: Added new results and discussion
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Neurons and Cognition (q-bio.NC); Machine Learning (stat.ML)
[33]  arXiv:2003.00295 (replaced) [pdf, other]
Title: Adaptive Federated Optimization
Subjects: Machine Learning (cs.LG); Distributed, Parallel, and Cluster Computing (cs.DC); Optimization and Control (math.OC); Machine Learning (stat.ML)
[34]  arXiv:2003.02214 (replaced) [pdf, ps, other]
Title: Maximal Causes for Exponential Family Observables
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
[35]  arXiv:2003.07201 (replaced) [pdf, ps, other]
Title: The Elliptical Processes: a Family of Fat-tailed Stochastic Processes
Subjects: Methodology (stat.ME); Machine Learning (cs.LG); Machine Learning (stat.ML)
[36]  arXiv:2003.13299 (replaced) [pdf, other]
Title: Variable fusion for Bayesian linear regression via spike-and-slab priors
Comments: 19 pages
Subjects: Methodology (stat.ME); Machine Learning (stat.ML)
[37]  arXiv:2004.08889 (replaced) [pdf, other]
Title: Sequential hypothesis testing in machine learning, and crude oil price jump size detection
Comments: 24 pages, 7 figures
Subjects: Methodology (stat.ME); Mathematical Finance (q-fin.MF); Machine Learning (stat.ML)
[38]  arXiv:2005.01317 (replaced) [pdf, ps, other]
Title: Robust Non-Linear Matrix Factorization for Dictionary Learning, Denoising, and Clustering
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
[39]  arXiv:2006.06179 (replaced) [pdf, other]
Title: Recovery and Generalization in Over-Realized Dictionary Learning
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
[40]  arXiv:2006.06830 (replaced) [pdf, other]
Title: Data Augmentation for Graph Neural Networks
Comments: AAAI 2021. This complete version contains the Appendix
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
[41]  arXiv:2006.09762 (replaced) [pdf, other]
Title: Maximum Roaming Multi-Task Learning
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Machine Learning (stat.ML)
[42]  arXiv:2006.10598 (replaced) [pdf, other]
Title: Shapeshifter Networks: Decoupling Layers from Parameters for Scalable and Effective Deep Learning
Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
[43]  arXiv:2006.10803 (replaced) [pdf, other]
Title: Supervision Accelerates Pre-training in Contrastive Semi-Supervised Learning of Visual Representations
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
[44]  arXiv:2007.01669 (replaced) [pdf, other]
Title: Gaussian Process Regression with Local Explanation
Subjects: Machine Learning (cs.LG); Methodology (stat.ME); Machine Learning (stat.ML)
[45]  arXiv:2007.07383 (replaced) [pdf, ps, other]
Title: Supervised learning from noisy observations: Combining machine-learning techniques with data assimilation
Subjects: Data Analysis, Statistics and Probability (physics.data-an); Machine Learning (cs.LG); Computational Physics (physics.comp-ph); Methodology (stat.ME); Machine Learning (stat.ML)
[46]  arXiv:2008.03658 (replaced) [pdf, other]
Title: DIET-SNN: Direct Input Encoding With Leakage and Threshold Optimization in Deep Spiking Neural Networks
Subjects: Neural and Evolutionary Computing (cs.NE); Machine Learning (cs.LG); Machine Learning (stat.ML)
[47]  arXiv:2009.00437 (replaced) [pdf, other]
Title: NATS-Bench: Benchmarking NAS Algorithms for Architecture Topology and Size
Comments: An extended version of NAS-Bench-201 published in ICLR 2020 [arXiv:2001.00326]
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
[48]  arXiv:2009.01339 (replaced) [pdf, other]
Title: Heterogeneous Explore-Exploit Strategies on Multi-Star Networks
Subjects: Optimization and Control (math.OC); Machine Learning (stat.ML)
[49]  arXiv:2009.02183 (replaced) [pdf, other]
Title: On the implementation of a global optimization method for mixed-variable problems
Subjects: Machine Learning (cs.LG); Discrete Mathematics (cs.DM); Optimization and Control (math.OC); Machine Learning (stat.ML)
[50]  arXiv:2009.03455 (replaced) [pdf, other]
Title: Addressing Cold Start in Recommender Systems with Hierarchical Graph Neural Networks
Comments: V2 with multiple changes
Journal-ref: IEEE Big Data 2020
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
[51]  arXiv:2009.03727 (replaced) [pdf]
Title: Highly Accurate CNN Inference Using Approximate Activation Functions over Homomorphic Encryption
Comments: Accepted at 7th International Workshop on Privacy and Security of Big Data in conjunction with 2020 IEEE International Conference on Big Data (IEEE BigData 2020)
Subjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR); Machine Learning (stat.ML)
[52]  arXiv:2009.14702 (replaced) [pdf, ps, other]
Title: Some Remarks on Replicated Simulated Annealing
Subjects: Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Optimization and Control (math.OC); Probability (math.PR); Machine Learning (stat.ML)
[53]  arXiv:2010.00578 (replaced) [pdf, other]
Title: Understanding Self-supervised Learning with Dual Deep Networks
Comments: Extended version after conference rebuttal. Remove assumptions in main theorems and add more experiments / theoretical analysis
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
[54]  arXiv:2010.15116 (replaced) [pdf, other]
Title: On Graph Neural Networks versus Graph-Augmented MLPs
Subjects: Machine Learning (cs.LG); Combinatorics (math.CO); Machine Learning (stat.ML)
[55]  arXiv:2011.14420 (replaced) [pdf]
Title: Improving Neural Network with Uniform Sparse Connectivity
Authors: Weijun Luo
Comments: paper accepted by IEEE Access
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
[56]  arXiv:2011.14439 (replaced) [pdf, other]
Title: Scaling down Deep Learning
Authors: Sam Greydanus
Comments: 10 pages, 9 figures
Subjects: Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)
[57]  arXiv:2011.14572 (replaced) [pdf, ps, other]
Title: Gradient Sparsification Can Improve Performance of Differentially-Private Convex Machine Learning
Authors: Farhad Farokhi
Comments: Fixed typos and a mistake in the proof of Proposition 1
Subjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR); Optimization and Control (math.OC); Machine Learning (stat.ML)