Machine Learning

New submissions

[ total of 42 entries: 1-42 ]

New submissions for Mon, 18 Oct 21

[1]  arXiv:2110.07618 [pdf, other]
Title: Sparse Implicit Processes for Approximate Inference
Comments: 10 pages for the main text (with 3 figures and 1 table), and 9 pages of supplementary material (with 6 figures and 3 tables)
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)

Implicit Processes (IPs) are flexible priors that can describe models such as Bayesian neural networks, neural samplers and data generators. IPs allow for approximate inference in function-space. This avoids some degenerate problems of parameter-space approximate inference due to the high number of parameters and strong dependencies. For this, an extra IP is often used to approximate the posterior of the prior IP. However, simultaneously adjusting the parameters of the prior IP and the approximate posterior IP is a challenging task. Existing methods that can tune the prior IP result in a Gaussian predictive distribution, which fails to capture important data patterns. By contrast, methods producing flexible predictive distributions by using another IP to approximate the posterior process cannot fit the prior IP to the observed data. We propose here a method that can carry out both tasks. For this, we rely on an inducing-point representation of the prior IP, as often done in the context of sparse Gaussian processes. The result is a scalable method for approximate inference with IPs that can tune the prior IP parameters to the data, and that provides accurate non-Gaussian predictive distributions.

[2]  arXiv:2110.07739 [pdf, other]
Title: Model-Change Active Learning in Graph-Based Semi-Supervised Learning
Comments: Submitted to SIAM Journal on Mathematics of Data Science (SIMODS)
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)

Active learning in semi-supervised classification involves introducing additional labels for unlabelled data to improve the accuracy of the underlying classifier. A challenge is to identify which points to label to best improve performance while limiting the number of new labels. "Model-change" active learning quantifies the resulting change incurred in the classifier by introducing the additional label(s). We pair this idea with graph-based semi-supervised learning methods that use the spectrum of the graph Laplacian matrix, which can be truncated to avoid prohibitively large computational and storage costs. We consider a family of convex loss functions for which the acquisition function can be efficiently approximated using the Laplace approximation of the posterior distribution. We show a variety of multiclass examples that illustrate improved performance over the prior state of the art.

[3]  arXiv:2110.07756 [pdf, other]
Title: Learning Mean-Field Equations from Particle Data Using WSINDy
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Dynamical Systems (math.DS); Numerical Analysis (math.NA); Optimization and Control (math.OC); Probability (math.PR)

We develop a weak-form sparse identification method for interacting particle systems (IPS) with the primary goals of reducing computational complexity for large particle number $N$ and offering robustness to either intrinsic or extrinsic noise. In particular, we use concepts from mean-field theory of IPS in combination with the weak-form sparse identification of nonlinear dynamics algorithm (WSINDy) to provide a fast and reliable system identification scheme for recovering the governing stochastic differential equations for an IPS when the number of particles per experiment $N$ is on the order of several thousand and the number of experiments $M$ is less than 100. This is in contrast to existing work showing that system identification for $N$ less than 100 and $M$ on the order of several thousand is feasible using strong-form methods. We prove that under some standard regularity assumptions the scheme converges with rate $\mathcal{O}(N^{-1/2})$ in the ordinary least squares setting and we demonstrate the convergence rate numerically on several systems in one and two spatial dimensions. Our examples include a canonical problem from homogenization theory (as a first step towards learning coarse-grained models), the dynamics of an attractive-repulsive swarm, and the IPS description of the parabolic-elliptic Keller-Segel model for chemotaxis.

[4]  arXiv:2110.07788 [pdf, other]
Title: Gaussian Process Bandit Optimization with Few Batches
Subjects: Machine Learning (stat.ML); Information Theory (cs.IT); Machine Learning (cs.LG); Optimization and Control (math.OC)

In this paper, we consider the problem of black-box optimization using Gaussian Process (GP) bandit optimization with a small number of batches. Assuming the unknown function has a low norm in the Reproducing Kernel Hilbert Space (RKHS), we introduce a batch algorithm inspired by batched finite-arm bandit algorithms, and show that it achieves the cumulative regret upper bound $O^\ast(\sqrt{T\gamma_T})$ using $O(\log\log T)$ batches within time horizon $T$, where the $O^\ast(\cdot)$ notation hides dimension-independent logarithmic factors and $\gamma_T$ is the maximum information gain associated with the kernel. This bound is near-optimal for several kernels of interest and improves on the typical $O^\ast(\sqrt{T}\gamma_T)$ bound, and our approach is arguably the simplest among algorithms attaining this improvement. In addition, in the case of a constant number of batches (not depending on $T$), we propose a modified version of our algorithm, and characterize how the regret is impacted by the number of batches, focusing on the squared exponential and Matérn kernels. The algorithmic upper bounds are shown to be nearly minimax optimal via analogous algorithm-independent lower bounds.

[5]  arXiv:2110.08045 [pdf, ps, other]
Title: Compressive Independent Component Analysis: Theory and Algorithms
Comments: 27 pages, 8 figures, under review
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Signal Processing (eess.SP)

Compressive learning forms the exciting intersection between compressed sensing and statistical learning where one exploits forms of sparsity and structure to reduce the memory and/or computational complexity of the learning task. In this paper, we look at the independent component analysis (ICA) model through the compressive learning lens. In particular, we show that solutions to the cumulant-based ICA model have a particular structure that induces a low-dimensional model set that resides in the cumulant tensor space. By showing that a restricted isometry property holds for random cumulants, e.g., Gaussian ensembles, we prove the existence of a compressive ICA scheme. Thereafter, we propose two algorithms for compressive ICA, in the form of an iterative projection gradient (IPG) and an alternating steepest descent (ASD) algorithm, where the order of compression asserted from the restricted isometry property is realised through empirical results. We provide analysis of the CICA algorithms including the effects of finite samples. The effects of compression are characterised by a trade-off between the sketch size and the statistical efficiency of the ICA estimates. By considering synthetic and real datasets, we show the substantial memory gains achieved over well-known ICA algorithms by using one of the proposed CICA algorithms. Finally, we conclude the paper with open problems including interesting challenges from the emerging field of compressive learning.

[6]  arXiv:2110.08087 [pdf, other]
Title: Causal Identification with Additive Noise Models: Quantifying the Effect of Noise
Comments: Presented at 10èmes Journées Francophones sur les Réseaux Bayésiens et les Modèles Graphiques Probabilistes (JFRB-2021), this https URL
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)

In recent years, a lot of research has been conducted within the area of causal inference and causal learning. Many methods have been developed to identify the cause-effect pairs in models and have been successfully applied to observational real-world data to determine the direction of causal relationships. Yet in bivariate situations, causal discovery problems remain challenging. One class of such methods, which also allows tackling the bivariate case, is based on Additive Noise Models (ANMs). Unfortunately, one aspect of these methods has not received much attention until now: the impact of different noise levels on the ability of these methods to identify the direction of the causal relationship. This work aims to bridge this gap with the help of an empirical study. We test Regression with Subsequent Independence Test (RESIT) using an exhaustive range of models where the level of additive noise gradually changes from 1% to 10000% of the causes' noise level (the latter remains fixed). Additionally, the experiments in this work consider several different types of distributions as well as linear and non-linear models. The results of the experiments show that ANM methods can fail to capture the true causal direction for some levels of noise.

[7]  arXiv:2110.08111 [pdf, other]
Title: An active learning approach for improving the performance of equilibrium based chemical simulations
Comments: 22 pages, 17 figures
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)

In this paper, we propose a novel sequential data-driven method for dealing with equilibrium based chemical simulations, which can be seen as a specific machine learning approach called active learning. The underlying idea of our approach is to consider the function to estimate as a sample of a Gaussian process which allows us to compute the global uncertainty on the function estimation. Thanks to this estimation and with almost no parameter to tune, the proposed method sequentially chooses the most relevant input data at which the function to estimate has to be evaluated to build a surrogate model. Hence, the number of evaluations of the function to estimate is dramatically limited. Our active learning method is validated through numerical experiments and applied to a complex chemical system commonly used in geoscience.

[8]  arXiv:2110.08217 [pdf, other]
Title: Choice functions based multi-objective Bayesian optimisation
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)

In this work we introduce a new framework for multi-objective Bayesian optimisation where the multi-objective functions can only be accessed via choice judgements, such as "I pick options A,B,C among this set of five options A,B,C,D,E". The fact that the option D is rejected means that there is at least one option among the selected ones A,B,C that I strictly prefer over D (but I do not have to specify which one). We assume that there is a latent vector function f for some dimension $n_e$ which embeds the options into the real vector space of dimension $n_e$, so that the choice set can be represented through a Pareto set of non-dominated options. By placing a Gaussian process prior on f and deriving a novel likelihood model for choice data, we propose a Bayesian framework for choice function learning. We then apply this surrogate model to solve a novel problem of multi-objective Bayesian optimisation from choice data.

Cross-lists for Mon, 18 Oct 21

[9]  arXiv:2110.07751 (cross-list from cs.LG) [pdf, other]
Title: Leveraging Spatial and Temporal Correlations in Sparsified Mean Estimation
Comments: Accepted to NeurIPS 2021
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)

We study the problem of estimating at a central server the mean of a set of vectors distributed across several nodes (one vector per node). When the vectors are high-dimensional, the communication cost of sending entire vectors may be prohibitive, and it may be imperative for them to use sparsification techniques. While most existing work on sparsified mean estimation is agnostic to the characteristics of the data vectors, in many practical applications such as federated learning, there may be spatial correlations (similarities in the vectors sent by different nodes) or temporal correlations (similarities in the data sent by a single node over different iterations of the algorithm) in the data vectors. We leverage these correlations by simply modifying the decoding method used by the server to estimate the mean. We provide an analysis of the resulting estimation error as well as experiments for PCA, K-Means and Logistic Regression, which show that our estimators consistently outperform more sophisticated and expensive sparsification methods.
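
To make the setting of entry [9] concrete, here is a minimal sketch of a correlation-agnostic baseline of the kind the abstract contrasts against. It assumes a rand-k sparsifier (each node sends k random coordinates scaled by d/k) and a server that simply averages the received values. The names `rand_k_encode` and `decode_mean`, the toy vectors, and the choice of rand-k are illustrative assumptions; the paper's correlation-aware decoder is not reproduced here.

```python
import random


def rand_k_encode(vec, k, rng):
    """Keep k random coordinates of vec, scaled by d/k so the message is unbiased."""
    d = len(vec)
    kept = rng.sample(range(d), k)
    return {i: vec[i] * d / k for i in kept}


def decode_mean(messages, d):
    """Server side: sum the sparse messages coordinate-wise, then divide by node count."""
    total = [0.0] * d
    for msg in messages:
        for i, value in msg.items():
            total[i] += value
    return [t / len(messages) for t in total]


rng = random.Random(0)  # fixed seed so the run is repeatable
# Hypothetical data: one 4-dimensional vector per node, three nodes.
vectors = [[1.0, 2.0, 3.0, 4.0], [3.0, 2.0, 1.0, 0.0], [2.0, 2.0, 2.0, 2.0]]

# Sparsified estimate: each node sends only 2 of its 4 coordinates.
estimate = decode_mean([rand_k_encode(v, 2, rng) for v in vectors], 4)

# Sanity check: with k = d nothing is dropped, so the exact mean is recovered.
full = decode_mean([rand_k_encode(v, 4, rng) for v in vectors], 4)
```

With k equal to the dimension the decoder returns the exact mean; for smaller k the estimate is noisy, which is the error that a smarter decoder would try to reduce.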

[10]  arXiv:2110.07810 (cross-list from cs.LG) [pdf, other]
Title: Towards Statistical and Computational Complexities of Polyak Step Size Gradient Descent
Comments: First three authors contributed equally. 40 pages, 4 figures
Subjects: Machine Learning (cs.LG); Statistics Theory (math.ST); Machine Learning (stat.ML)

We study the statistical and computational complexities of the Polyak step size gradient descent algorithm under generalized smoothness and Lojasiewicz conditions of the population loss function, namely, the limit of the empirical loss function when the sample size goes to infinity, and the stability between the gradients of the empirical and population loss functions, namely, the polynomial growth on the concentration bound between the gradients of sample and population loss functions. We demonstrate that the Polyak step size gradient descent iterates reach a final statistical radius of convergence around the true parameter after a logarithmic number of iterations in terms of the sample size. It is computationally cheaper than the polynomial number of iterations on the sample size of the fixed-step size gradient descent algorithm to reach the same final statistical radius when the population loss function is not locally strongly convex. Finally, we illustrate our general theory under three statistical examples: generalized linear model, mixture model, and mixed linear regression model.
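
As a rough illustration of the Polyak step size in entry [10], the sketch below compares it with a fixed step on the toy objective f(x) = x^4, which is not locally strongly convex at its minimizer. The objective, the fixed step 0.1, the iteration count, and the names `polyak_gd` and `fixed_step_gd` are assumptions chosen for illustration; the paper itself works with empirical and population losses rather than this one-dimensional example.

```python
def polyak_gd(f, grad, x0, f_star, num_iters):
    """Gradient descent with the Polyak step (f(x) - f_star) / |f'(x)|^2 (scalar case)."""
    x = x0
    for _ in range(num_iters):
        g = grad(x)
        gap = f(x) - f_star
        if g == 0.0 or gap <= 0.0:
            break  # already at a minimizer
        x = x - (gap / (g * g)) * g
    return x


def fixed_step_gd(grad, x0, step, num_iters):
    """Plain gradient descent with a constant step size."""
    x = x0
    for _ in range(num_iters):
        x = x - step * grad(x)
    return x


def f(x):
    return x ** 4


def grad(x):
    return 4.0 * x ** 3


# The optimal value f_star = 0 is assumed known, as the Polyak step requires.
x_polyak = polyak_gd(f, grad, 1.0, 0.0, 50)
x_fixed = fixed_step_gd(grad, 1.0, 0.1, 50)
```

On this objective the Polyak update reduces to x - x/4, so the iterate shrinks by a factor of 0.75 per step (geometric convergence), while the fixed-step iterate decays only polynomially in the iteration count.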

[11]  arXiv:2110.07959 (cross-list from cs.LG) [pdf, other]
Title: Low-rank Matrix Recovery With Unknown Correspondence
Subjects: Machine Learning (cs.LG); Information Retrieval (cs.IR); Machine Learning (stat.ML)

We study a matrix recovery problem with unknown correspondence: given the observation matrix $M_o=[A,\tilde P B]$, where $\tilde P$ is an unknown permutation matrix, we aim to recover the underlying matrix $M=[A,B]$. Such a problem commonly arises in many applications where heterogeneous data are utilized and the correspondence among them is unknown, e.g., due to privacy concerns. We show that it is possible to recover $M$ via solving a nuclear norm minimization problem under a proper low-rank condition on $M$, with provable non-asymptotic error bound for the recovery of $M$. We propose an algorithm, $\text{M}^3\text{O}$ (Matrix recovery via Min-Max Optimization) which recasts this combinatorial problem as a continuous minimax optimization problem and solves it by proximal gradient with a Max-Oracle. $\text{M}^3\text{O}$ can also be applied to a more general scenario where we have missing entries in $M_o$ and multiple groups of data with distinct unknown correspondence. Experiments on simulated data, the MovieLens 100K dataset and Yale B database show that $\text{M}^3\text{O}$ achieves state-of-the-art performance over several baselines and can recover the ground-truth correspondence with high accuracy.

[12]  arXiv:2110.08138 (cross-list from math.DG) [pdf, ps, other]
Title: Convergence of Laplacian Eigenmaps and its Rate for Submanifolds with Singularities
Authors: Masayuki Aino
Comments: 63 pages
Subjects: Differential Geometry (math.DG); Machine Learning (stat.ML)

In this paper, we give a spectral approximation result for the Laplacian on submanifolds of Euclidean spaces with singularities by the $\epsilon$-neighborhood graph constructed from random points on the submanifold. Our convergence rate for the eigenvalue of the Laplacian is $O\left(\left(\log n/n\right)^{1/(m+2)}\right)$, where $m$ and $n$ denote the dimension of the manifold and the sample size, respectively.

[13]  arXiv:2110.08150 (cross-list from math.OC) [pdf, ps, other]
Title: Halpern-Type Accelerated and Splitting Algorithms For Monotone Inclusions
Comments: 33 pages
Subjects: Optimization and Control (math.OC); Machine Learning (stat.ML)

In this paper, we develop a new type of accelerated algorithms to solve some classes of maximally monotone equations as well as monotone inclusions. Instead of using Nesterov's accelerating approach, our methods rely on a so-called Halpern-type fixed-point iteration from [32], recently exploited by a number of researchers, including [24, 70]. Firstly, we derive a new variant of the anchored extra-gradient scheme in [70] based on Popov's past extra-gradient method to solve a maximally monotone equation $G(x) = 0$. We show that our method achieves the same $\mathcal{O}(1/k)$ convergence rate (up to a constant factor) as in the anchored extra-gradient algorithm on the operator norm $\Vert G(x_k)\Vert$, but requires only one evaluation of $G$ at each iteration, where $k$ is the iteration counter. Next, we develop two splitting algorithms to approximate a zero point of the sum of two maximally monotone operators. The first algorithm originates from the anchored extra-gradient method combined with a splitting technique, while the second one is its Popov variant, which can reduce the per-iteration complexity. Both algorithms appear to be new and can be viewed as accelerated variants of the Douglas-Rachford (DR) splitting method. They both achieve $\mathcal{O}(1/k)$ rates on the norm $\Vert G_{\gamma}(x_k)\Vert$ of the forward-backward residual operator $G_{\gamma}(\cdot)$ associated with the problem. We also propose a new accelerated Douglas-Rachford splitting scheme for solving this problem which achieves $\mathcal{O}(1/k)$ convergence rate on $\Vert G_{\gamma}(x_k)\Vert$ under only maximally monotone assumptions. Finally, we specify our first algorithm to solve convex-concave minimax problems and apply our accelerated DR scheme to derive a new variant of the alternating direction method of multipliers (ADMM).
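
For readers unfamiliar with the Halpern-type iteration that entry [13] builds on, the sketch below shows the classical fixed-point form x_{k+1} = beta_k x_0 + (1 - beta_k) T(x_k) with the anchoring weights beta_k = 1/(k+2). The operator (a 90-degree rotation of the plane, which is nonexpansive with the origin as its only fixed point), the starting point, and the names `halpern` and `residual` are assumptions for illustration; this is not the paper's extra-gradient or splitting scheme.

```python
import math


def rotate90(p):
    """A nonexpansive map on the plane whose only fixed point is the origin."""
    x, y = p
    return (-y, x)


def halpern(T, x0, num_iters):
    """Halpern iteration: average T(x_k) with the anchor x0 using weight 1/(k+2)."""
    x = x0
    for k in range(num_iters):
        beta = 1.0 / (k + 2)
        tx = T(x)
        x = tuple(beta * a + (1.0 - beta) * b for a, b in zip(x0, tx))
    return x


def residual(T, x):
    """Fixed-point residual ||x - T(x)|| in the Euclidean norm."""
    tx = T(x)
    return math.hypot(x[0] - tx[0], x[1] - tx[1])


x0 = (1.0, 0.0)
x_halpern = halpern(rotate90, x0, 1000)

# Plain iteration x_{k+1} = T(x_k) just circles the fixed point forever.
x_plain = x0
for _ in range(1000):
    x_plain = rotate90(x_plain)
```

In this example the plain iteration keeps a residual of sqrt(2), while the anchored iterate's residual shrinks on the order of 1/k, which is the kind of rate the abstract refers to.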

[14]  arXiv:2110.08161 (cross-list from stat.ME) [pdf, other]
Title: SAFFRON and LORD Ensure Online Control of the False Discovery Rate Under Positive Dependence
Authors: Aaron Fisher
Subjects: Methodology (stat.ME); Machine Learning (stat.ML)

Online testing procedures assume that hypotheses are observed in sequence, and allow the significance thresholds for upcoming tests to depend on the test statistics observed so far. Some of the most popular online methods include alpha investing, LORD++ (hereafter, LORD), and SAFFRON. These three methods have been shown to provide online control of the "modified" false discovery rate (mFDR). However, to our knowledge, they have only been shown to control the traditional false discovery rate (FDR) under an independence condition on the test statistics. Our work bolsters these results by showing that SAFFRON and LORD additionally ensure online control of the FDR under nonnegative dependence. Because alpha investing can be recovered as a special case of the SAFFRON framework, the same result applies to this method as well. Our result also allows for certain forms of adaptive stopping times, for example, stopping after a certain number of rejections have been observed.

[15]  arXiv:2110.08205 (cross-list from stat.ME) [pdf, other]
Title: Fast Online Changepoint Detection via Functional Pruning CUSUM statistics
Subjects: Methodology (stat.ME); Computation (stat.CO); Machine Learning (stat.ML)

Many modern applications of online changepoint detection require the ability to process high-frequency observations, sometimes with limited available computational resources. Online algorithms for detecting a change in mean often involve using a moving window, or specifying the expected size of change. Such choices affect which changes the algorithms have most power to detect. We introduce an algorithm, Functional Online CuSUM (FOCuS), which is equivalent to running these earlier methods simultaneously for all sizes of window, or all possible values for the size of change. Our theoretical results give tight bounds on the expected computational cost per iteration of FOCuS, with this being logarithmic in the number of observations. We show how FOCuS can be applied to a number of different change-in-mean scenarios, and demonstrate its practical utility through its state-of-the-art performance at detecting anomalous behaviour in computer server data.
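
The "earlier methods" that entry [15] generalises can be illustrated with the classical Page CUSUM recursion for a single, pre-specified size of change. The sketch assumes unit-variance Gaussian observations with pre-change mean 0; the function name `page_cusum`, the threshold 4.0, and the noiseless toy stream are illustrative assumptions. FOCuS itself, which tracks all change sizes at once, is not implemented here.

```python
def page_cusum(stream, post_change_mean, threshold):
    """Return the 0-based index at which the CUSUM statistic first exceeds threshold.

    Each observation adds the Gaussian log-likelihood ratio
    mu * (x - mu / 2) for mean mu versus mean 0 (unit variance),
    and the statistic is reset to zero whenever it would go negative.
    Returns None if no alarm is raised.
    """
    stat = 0.0
    for t, x in enumerate(stream):
        stat = max(0.0, stat + post_change_mean * (x - post_change_mean / 2.0))
        if stat > threshold:
            return t
    return None


# Hypothetical noiseless stream: the mean jumps from 0 to 1 at index 20.
data = [0.0] * 20 + [1.0] * 20
alarm = page_cusum(data, post_change_mean=1.0, threshold=4.0)

# A stream with no change never raises an alarm.
no_alarm = page_cusum([0.0] * 40, post_change_mean=1.0, threshold=4.0)
```

Here each post-change observation adds 0.5 to the statistic, so the threshold is crossed nine observations after the change. Choosing a different `post_change_mean` changes which shifts are detected quickly, which is the sensitivity to tuning that the abstract describes.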

[16]  arXiv:2110.08211 (cross-list from astro-ph.IM) [pdf, other]
Title: Astronomical source finding services for the CIRASA visual analytic platform
Comments: 16 pages, 6 figures
Subjects: Instrumentation and Methods for Astrophysics (astro-ph.IM); Computation (stat.CO); Machine Learning (stat.ML)

Innovative developments in data processing, archiving, analysis, and visualization are now indispensable for dealing with the data deluge expected in next-generation facilities for radio astronomy, such as the Square Kilometre Array (SKA) and its precursors. In this context, the integration of source extraction and analysis algorithms into data visualization tools could significantly improve and speed up the cataloguing process of large area surveys, boosting astronomer productivity and shortening publication time. To this end, we are developing a visual analytic platform (CIRASA) for advanced source finding and classification, integrating state-of-the-art tools, such as the CAESAR source finder, the ViaLactea Visual Analytic (VLVA) and Knowledge Base (VLKB). In this work, we present the project objectives and the platform architecture, focusing on the implemented source finding services.

Replacements for Mon, 18 Oct 21

[17]  arXiv:2105.07610 (replaced) [pdf, other]
Title: Cross-Cluster Weighted Forests
Comments: 19 pages, 6 figures, 1 table
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)
[18]  arXiv:2106.02078 (replaced) [pdf, other]
Title: Improving Neural Network Robustness via Persistency of Excitation
Subjects: Machine Learning (stat.ML); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Systems and Control (eess.SY)
[19]  arXiv:2106.02713 (replaced) [pdf, other]
Title: Learning Curves for SGD on Structured Features
Comments: Added new analysis of optimal batchsize and learning rate. Provided theoretical learning curves for case where test and training measures are different and apply to predicting errors for test/train splits on real datasets. Also provided a new bound for non-Gaussian features based on a regularity condition proposed by Varre et al 2021 arXiv:2102.03183
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)
[20]  arXiv:2107.02474 (replaced) [pdf, other]
Title: Viscos Flows: Variational Schur Conditional Sampling With Normalizing Flows
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)
[21]  arXiv:2107.09088 (replaced) [pdf, other]
Title: Reward-Weighted Regression Converges to a Global Optimum
Comments: 7 pages in main text + 2 pages of references + 6 pages of appendices, 1 figure in main text + 1 figure in appendices; source code available at this https URL
Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[22]  arXiv:2110.06381 (replaced) [pdf, other]
Title: Meta Learning Low Rank Covariance Factors for Energy-Based Deterministic Uncertainty
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)
[23]  arXiv:2110.06581 (replaced) [pdf, other]
Title: Averting A Crisis In Simulation-Based Inference
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)
[24]  arXiv:1912.08140 (replaced) [pdf, other]
Title: On-the-fly Global Embeddings Using Random Projections for Extreme Multi-label Classification
Authors: Yashaswi Verma
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Information Retrieval (cs.IR); Machine Learning (stat.ML)
[25]  arXiv:2002.12036 (replaced) [pdf, other]
Title: Complexity Measures and Features for Times Series classification
Subjects: Machine Learning (cs.LG); Information Theory (cs.IT); Machine Learning (stat.ML)
[26]  arXiv:2006.06592 (replaced) [pdf, other]
Title: The Backbone Method for Ultra-High Dimensional Sparse Machine Learning
Comments: First submission to Machine Learning: 06/2020. Revised: 10/2021
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
[27]  arXiv:2006.09773 (replaced) [pdf, other]
Title: Neural Ordinary Differential Equation Control of Dynamics on Graphs
Comments: Fifth version improves and clears notation
Subjects: Machine Learning (cs.LG); Social and Information Networks (cs.SI); Machine Learning (stat.ML)
[28]  arXiv:2007.12652 (replaced) [pdf, other]
Title: MurTree: Optimal Classification Trees via Dynamic Programming and Search
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Data Structures and Algorithms (cs.DS); Machine Learning (stat.ML)
[29]  arXiv:2009.06921 (replaced) [pdf, ps, other]
Title: Optimal Decision Trees for Nonlinear Metrics
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Data Structures and Algorithms (cs.DS); Machine Learning (stat.ML)
[30]  arXiv:2102.12353 (replaced) [pdf, other]
Title: Nonlinear Invariant Risk Minimization: A Causal Approach
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
[31]  arXiv:2102.13088 (replaced) [pdf, other]
Title: Even your Teacher Needs Guidance: Ground-Truth Targets Dampen Regularization Imposed by Self-Distillation
Comments: To be published at NeurIPS 2021; 21 pages, 14 figures
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
[32]  arXiv:2103.12591 (replaced) [pdf, ps, other]
Title: BoXHED2.0: Scalable boosting of dynamic survival analysis
Comments: 12 pages
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
[33]  arXiv:2106.05319 (replaced) [pdf, other]
Title: Stein Latent Optimization for Generative Adversarial Networks
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
[34]  arXiv:2106.11609 (replaced) [pdf, other]
Title: Distributional Gradient Matching for Learning Uncertain Neural Dynamics Models
Comments: Published at NeurIPS 2021
Journal-ref: Advances in Neural Information Processing Systems, 2021
Subjects: Machine Learning (cs.LG); Dynamical Systems (math.DS); Machine Learning (stat.ML)
[35]  arXiv:2106.12248 (replaced) [pdf, other]
Title: ADAVI: Automatic Dual Amortized Variational Inference Applied To Pyramidal Bayesian Models
Authors: Louis Rouillard (PARIETAL, Inria, CEA), Demian Wassermann (PARIETAL, Inria, CEA)
Comments: Preprint submitted to ICLR 2022
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Neurons and Cognition (q-bio.NC); Machine Learning (stat.ML)
[36]  arXiv:2107.00594 (replaced) [pdf, other]
Title: Pretext Tasks selection for multitask self-supervised speech representation learning
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[37]  arXiv:2108.03039 (replaced) [pdf, other]
Title: Identifiable Energy-based Representations: An Application to Estimating Heterogeneous Causal Effects
Comments: 19 pages, 2 figures, 9 tables
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
[38]  arXiv:2110.01583 (replaced) [pdf, other]
Title: Online Control of the False Discovery Rate under "Decision Deadlines"
Authors: Aaron Fisher
Comments: Keywords: adaptive stopping time, batch testing, data decay, decaying memory, quality preserving database. Updates: Expanded simulations, other minor edits for submission
Subjects: Methodology (stat.ME); Machine Learning (stat.ML)
[39]  arXiv:2110.03684 (replaced) [pdf, other]
Title: Cross-Domain Imitation Learning via Optimal Transport
Comments: typos corrected, references added
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Robotics (cs.RO); Machine Learning (stat.ML)
[40]  arXiv:2110.04020 (replaced) [pdf, other]
Title: Pathologies in priors and inference for Bayesian transformers
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
[41]  arXiv:2110.05316 (replaced) [pdf, ps, other]
Title: Reversible Genetically Modified Mode Jumping MCMC
Comments: 6 pages, 2 tables, based on arXiv:1806.02160, which got divided into two revised articles
Journal-ref: Published in Proceedings of 22nd European Young Statisticians Meeting (ISBN: 978-960-7943-23-1), 2021. URL: https://www.eysm2021.panteion.gr/files/Proceedings_EYSM_2021.pdf Parpoula & Athanasios Rakitzis
Subjects: Methodology (stat.ME); Computation (stat.CO); Machine Learning (stat.ML)
[42]  arXiv:2110.06871 (replaced) [pdf, other]
Title: Two-argument activation functions learn soft XOR operations like cortical neurons
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Neurons and Cognition (q-bio.NC); Machine Learning (stat.ML)