Machine Learning
New submissions
New submissions for Mon, 18 Oct 21
 [1] arXiv:2110.07618 [pdf, other]

Title: Sparse Implicit Processes for Approximate Inference
Comments: 10 pages for the main text (with 3 figures and 1 table), and 9 pages of supplementary material (with 6 figures and 3 tables)
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)
Implicit Processes (IPs) are flexible priors that can describe models such as Bayesian neural networks, neural samplers and data generators. IPs allow for approximate inference in function-space. This avoids some degenerate problems of parameter-space approximate inference due to the high number of parameters and strong dependencies. For this, an extra IP is often used to approximate the posterior of the prior IP. However, simultaneously adjusting the parameters of the prior IP and the approximate posterior IP is a challenging task. Existing methods that can tune the prior IP result in a Gaussian predictive distribution, which fails to capture important data patterns. By contrast, methods producing flexible predictive distributions by using another IP to approximate the posterior process cannot fit the prior IP to the observed data. We propose here a method that can carry out both tasks. For this, we rely on an inducing-point representation of the prior IP, as often done in the context of sparse Gaussian processes. The result is a scalable method for approximate inference with IPs that can tune the prior IP parameters to the data, and that provides accurate non-Gaussian predictive distributions.
 [2] arXiv:2110.07739 [pdf, other]

Title: Model-Change Active Learning in Graph-Based Semi-Supervised Learning
Comments: Submitted to SIAM Journal on Mathematics of Data Science (SIMODS)
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)
Active learning in semi-supervised classification involves introducing additional labels for unlabelled data to improve the accuracy of the underlying classifier. A challenge is to identify which points to label to best improve performance while limiting the number of new labels. "Model-change" active learning quantifies the resulting change incurred in the classifier by introducing the additional label(s). We pair this idea with graph-based semi-supervised learning methods that use the spectrum of the graph Laplacian matrix, which can be truncated to avoid prohibitively large computational and storage costs. We consider a family of convex loss functions for which the acquisition function can be efficiently approximated using the Laplace approximation of the posterior distribution. We show a variety of multiclass examples that illustrate improved performance over the prior state of the art.
 [3] arXiv:2110.07756 [pdf, other]

Title: Learning Mean-Field Equations from Particle Data Using WSINDy
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Dynamical Systems (math.DS); Numerical Analysis (math.NA); Optimization and Control (math.OC); Probability (math.PR)
We develop a weak-form sparse identification method for interacting particle systems (IPS) with the primary goals of reducing computational complexity for large particle number $N$ and offering robustness to either intrinsic or extrinsic noise. In particular, we use concepts from mean-field theory of IPS in combination with the weak-form sparse identification of nonlinear dynamics algorithm (WSINDy) to provide a fast and reliable system identification scheme for recovering the governing stochastic differential equations for an IPS when the number of particles per experiment $N$ is on the order of several thousand and the number of experiments $M$ is less than 100. This is in contrast to existing work showing that system identification for $N$ less than 100 and $M$ on the order of several thousand is feasible using strong-form methods. We prove that under some standard regularity assumptions the scheme converges with rate $\mathcal{O}(N^{-1/2})$ in the ordinary least squares setting and we demonstrate the convergence rate numerically on several systems in one and two spatial dimensions. Our examples include a canonical problem from homogenization theory (as a first step towards learning coarse-grained models), the dynamics of an attractive-repulsive swarm, and the IPS description of the parabolic-elliptic Keller-Segel model for chemotaxis.
 [4] arXiv:2110.07788 [pdf, other]

Title: Gaussian Process Bandit Optimization with Few Batches
Subjects: Machine Learning (stat.ML); Information Theory (cs.IT); Machine Learning (cs.LG); Optimization and Control (math.OC)
In this paper, we consider the problem of black-box optimization using Gaussian Process (GP) bandit optimization with a small number of batches. Assuming the unknown function has a low norm in the Reproducing Kernel Hilbert Space (RKHS), we introduce a batch algorithm inspired by batched finite-arm bandit algorithms, and show that it achieves the cumulative regret upper bound $O^\ast(\sqrt{T\gamma_T})$ using $O(\log\log T)$ batches within time horizon $T$, where the $O^\ast(\cdot)$ notation hides dimension-independent logarithmic factors and $\gamma_T$ is the maximum information gain associated with the kernel. This bound is near-optimal for several kernels of interest and improves on the typical $O^\ast(\sqrt{T}\gamma_T)$ bound, and our approach is arguably the simplest among algorithms attaining this improvement. In addition, in the case of a constant number of batches (not depending on $T$), we propose a modified version of our algorithm, and characterize how the regret is impacted by the number of batches, focusing on the squared exponential and Matérn kernels. The algorithmic upper bounds are shown to be nearly minimax optimal via analogous algorithm-independent lower bounds.
 [5] arXiv:2110.08045 [pdf, ps, other]

Title: Compressive Independent Component Analysis: Theory and Algorithms
Comments: 27 pages, 8 figures, under review
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Signal Processing (eess.SP)
Compressive learning forms the exciting intersection between compressed sensing and statistical learning where one exploits forms of sparsity and structure to reduce the memory and/or computational complexity of the learning task. In this paper, we look at the independent component analysis (ICA) model through the compressive learning lens. In particular, we show that solutions to the cumulant-based ICA model have particular structure that induces a low-dimensional model set that resides in the cumulant tensor space. By showing a restricted isometry property holds for random cumulants, e.g. Gaussian ensembles, we prove the existence of a compressive ICA scheme. Thereafter, we propose two algorithms of the form of an iterative projection gradient (IPG) and an alternating steepest descent (ASD) algorithm for compressive ICA, where the order of compression asserted from the restricted isometry property is realised through empirical results. We provide analysis of the CICA algorithms including the effects of finite samples. The effects of compression are characterised by a trade-off between the sketch size and the statistical efficiency of the ICA estimates. By considering synthetic and real datasets, we show the substantial memory gains achieved over well-known ICA algorithms by using one of the proposed CICA algorithms. Finally, we conclude the paper with open problems including interesting challenges from the emerging field of compressive learning.
 [6] arXiv:2110.08087 [pdf, other]

Title: Causal Identification with Additive Noise Models: Quantifying the Effect of Noise
Comments: Presented at 10èmes Journées Francophones sur les Réseaux Bayésiens et les Modèles Graphiques Probabilistes (JFRB2021), this https URL
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)
In recent years, a lot of research has been conducted within the area of causal inference and causal learning. Many methods have been developed to identify the cause-effect pairs in models and have been successfully applied to observational real-world data to determine the direction of causal relationships. Yet in bivariate situations, causal discovery problems remain challenging. One class of such methods, which also allows tackling the bivariate case, is based on Additive Noise Models (ANMs). Unfortunately, one aspect of these methods has not received much attention until now: the impact of different noise levels on the ability of these methods to identify the direction of the causal relationship. This work aims to bridge this gap with the help of an empirical study. We test Regression with Subsequent Independence Test (RESIT) using an exhaustive range of models where the level of additive noise gradually changes from 1% to 10000% of the causes' noise level (the latter remains fixed). Additionally, the experiments in this work consider several different types of distributions as well as linear and non-linear models. The results of the experiments show that ANM methods can fail to capture the true causal direction for some levels of noise.
 [7] arXiv:2110.08111 [pdf, other]

Title: An active learning approach for improving the performance of equilibrium based chemical simulations
Comments: 22 pages, 17 figures
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)
In this paper, we propose a novel sequential data-driven method for dealing with equilibrium based chemical simulations, which can be seen as a specific machine learning approach called active learning. The underlying idea of our approach is to consider the function to estimate as a sample of a Gaussian process which allows us to compute the global uncertainty on the function estimation. Thanks to this estimation and with almost no parameter to tune, the proposed method sequentially chooses the most relevant input data at which the function to estimate has to be evaluated to build a surrogate model. Hence, the number of evaluations of the function to estimate is dramatically limited. Our active learning method is validated through numerical experiments and applied to a complex chemical system commonly used in geoscience.
 [8] arXiv:2110.08217 [pdf, other]

Title: Choice functions based multi-objective Bayesian optimisation
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)
In this work we introduce a new framework for multi-objective Bayesian optimisation where the multi-objective functions can only be accessed via choice judgements, such as "I pick options A,B,C among this set of five options A,B,C,D,E". The fact that the option D is rejected means that there is at least one option among the selected ones A,B,C that I strictly prefer over D (but I do not have to specify which one). We assume that there is a latent vector function f for some dimension $n_e$ which embeds the options into the real vector space of dimension $n_e$, so that the choice set can be represented through a Pareto set of non-dominated options. By placing a Gaussian process prior on f and deriving a novel likelihood model for choice data, we propose a Bayesian framework for choice functions learning. We then apply this surrogate model to solve a novel multi-objective Bayesian optimisation from choice data problem.
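The link between a choice judgement and a Pareto set in the abstract above can be illustrated with a minimal sketch. It assumes the latent utilities are already known and that larger values are better; the utility vectors and the names `dominates` and `pareto_set` are hypothetical and are not taken from the paper, which learns the latent function from choice data instead.

```python
from typing import List, Sequence


def dominates(u: Sequence[float], v: Sequence[float]) -> bool:
    """True if u is at least as good as v everywhere and strictly better somewhere."""
    return all(a >= b for a, b in zip(u, v)) and any(a > b for a, b in zip(u, v))


def pareto_set(values: List[Sequence[float]]) -> List[int]:
    """Indices of the options that no other option dominates."""
    return [
        i
        for i, v in enumerate(values)
        if not any(dominates(u, v) for j, u in enumerate(values) if j != i)
    ]


# Hypothetical two-dimensional latent utilities for options A..E.
names = ["A", "B", "C", "D", "E"]
utilities = [(3.0, 1.0), (2.0, 2.0), (1.0, 3.0), (1.0, 1.0), (0.5, 2.5)]

chosen = [names[i] for i in pareto_set(utilities)]
print(chosen)  # the non-dominated options form the choice set
```

With these made-up utilities, D and E are each dominated by at least one selected option, which matches the reading of a rejection described in the abstract.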
Cross-lists for Mon, 18 Oct 21
 [9] arXiv:2110.07751 (cross-list from cs.LG) [pdf, other]

Title: Leveraging Spatial and Temporal Correlations in Sparsified Mean Estimation
Comments: Accepted to NeurIPS 2021
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
We study the problem of estimating at a central server the mean of a set of vectors distributed across several nodes (one vector per node). When the vectors are high-dimensional, the communication cost of sending entire vectors may be prohibitive, and it may be imperative for them to use sparsification techniques. While most existing work on sparsified mean estimation is agnostic to the characteristics of the data vectors, in many practical applications such as federated learning, there may be spatial correlations (similarities in the vectors sent by different nodes) or temporal correlations (similarities in the data sent by a single node over different iterations of the algorithm) in the data vectors. We leverage these correlations by simply modifying the decoding method used by the server to estimate the mean. We provide an analysis of the resulting estimation error as well as experiments for PCA, K-Means and Logistic Regression, which show that our estimators consistently outperform more sophisticated and expensive sparsification methods.
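For orientation, a correlation-agnostic baseline of the kind the abstract above contrasts itself with can be sketched as follows. This is an assumed, standard random-k sparsification scheme with an unbiased rescaling at the server, not the decoding method proposed in the paper; the function names are hypothetical.

```python
import random
from typing import Dict, List


def rand_k_sparsify(vec: List[float], k: int, rng: random.Random) -> Dict[int, float]:
    """Node side: keep k coordinates chosen uniformly at random."""
    kept = rng.sample(range(len(vec)), k)
    return {i: vec[i] for i in kept}


def estimate_mean(messages: List[Dict[int, float]], d: int, k: int) -> List[float]:
    """Server side: rescale each received coordinate by d/k, then average.

    Each coordinate is kept with probability k/d, so the d/k factor makes
    the estimate unbiased for the true mean.
    """
    n = len(messages)
    est = [0.0] * d
    for msg in messages:
        for i, val in msg.items():
            est[i] += (d / k) * val / n
    return est


rng = random.Random(0)
vectors = [[1.0, 2.0, 3.0, 4.0], [3.0, 4.0, 5.0, 6.0]]
messages = [rand_k_sparsify(v, k=2, rng=rng) for v in vectors]
print(estimate_mean(messages, d=4, k=2))
```

The estimate is noisy for small k; the abstract's point is that exploiting correlations at the decoder can reduce that error without changing what the nodes send.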
 [10] arXiv:2110.07810 (cross-list from cs.LG) [pdf, other]

Title: Towards Statistical and Computational Complexities of Polyak Step Size Gradient Descent
Comments: First three authors contributed equally. 40 pages, 4 figures
Subjects: Machine Learning (cs.LG); Statistics Theory (math.ST); Machine Learning (stat.ML)
We study the statistical and computational complexities of the Polyak step size gradient descent algorithm under generalized smoothness and Łojasiewicz conditions of the population loss function, namely, the limit of the empirical loss function when the sample size goes to infinity, and the stability between the gradients of the empirical and population loss functions, namely, the polynomial growth on the concentration bound between the gradients of sample and population loss functions. We demonstrate that the Polyak step size gradient descent iterates reach a final statistical radius of convergence around the true parameter after a logarithmic number of iterations in terms of the sample size. It is computationally cheaper than the polynomial number of iterations on the sample size of the fixed-step size gradient descent algorithm to reach the same final statistical radius when the population loss function is not locally strongly convex. Finally, we illustrate our general theory under three statistical examples: generalized linear model, mixture model, and mixed linear regression model.
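The Polyak step size named in the abstract above is commonly written as $\eta_k = (f(x_k) - f^\ast) / \Vert \nabla f(x_k) \Vert^2$. A minimal sketch follows, assuming the optimal value $f^\ast$ is known; the test function $f(x) = x^4$ is a hypothetical example of a loss that is not locally strongly convex, not one of the paper's statistical models.

```python
from typing import Callable, List


def polyak_gd(
    f: Callable[[List[float]], float],
    grad: Callable[[List[float]], List[float]],
    x0: List[float],
    f_star: float,
    n_iters: int,
) -> List[float]:
    """Gradient descent with step size (f(x) - f_star) / ||grad f(x)||^2."""
    x = list(x0)
    for _ in range(n_iters):
        g = grad(x)
        g_norm_sq = sum(gi * gi for gi in g)
        if g_norm_sq == 0.0:  # stationary point reached, nothing left to do
            break
        step = (f(x) - f_star) / g_norm_sq
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x


# f(x) = x^4 has minimum value 0 at x = 0 and zero curvature there.
quartic = lambda x: x[0] ** 4
quartic_grad = lambda x: [4.0 * x[0] ** 3]

x_final = polyak_gd(quartic, quartic_grad, [1.0], f_star=0.0, n_iters=50)
print(x_final)
```

On this example each update works out to $x_{k+1} = \tfrac{3}{4} x_k$, so the iterates contract geometrically even though the curvature vanishes at the minimiser, where a fixed step size would slow to a sublinear rate.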
 [11] arXiv:2110.07959 (cross-list from cs.LG) [pdf, other]

Title: Low-rank Matrix Recovery With Unknown Correspondence
Subjects: Machine Learning (cs.LG); Information Retrieval (cs.IR); Machine Learning (stat.ML)
We study a matrix recovery problem with unknown correspondence: given the observation matrix $M_o=[A,\tilde P B]$, where $\tilde P$ is an unknown permutation matrix, we aim to recover the underlying matrix $M=[A,B]$. Such a problem commonly arises in many applications where heterogeneous data are utilized and the correspondence among them is unknown, e.g., due to privacy concerns. We show that it is possible to recover $M$ via solving a nuclear norm minimization problem under a proper low-rank condition on $M$, with provable non-asymptotic error bound for the recovery of $M$. We propose an algorithm, $\text{M}^3\text{O}$ (Matrix recovery via Min-Max Optimization) which recasts this combinatorial problem as a continuous minimax optimization problem and solves it by proximal gradient with a Max-Oracle. $\text{M}^3\text{O}$ can also be applied to a more general scenario where we have missing entries in $M_o$ and multiple groups of data with distinct unknown correspondence. Experiments on simulated data, the MovieLens 100K dataset and Yale B database show that $\text{M}^3\text{O}$ achieves state-of-the-art performance over several baselines and can recover the ground-truth correspondence with high accuracy.
 [12] arXiv:2110.08138 (cross-list from math.DG) [pdf, ps, other]

Title: Convergence of Laplacian Eigenmaps and its Rate for Submanifolds with Singularities
Authors: Masayuki Aino
Comments: 63 pages
Subjects: Differential Geometry (math.DG); Machine Learning (stat.ML)
In this paper, we give a spectral approximation result for the Laplacian on submanifolds of Euclidean spaces with singularities by the $\epsilon$-neighborhood graph constructed from random points on the submanifold. Our convergence rate for the eigenvalue of the Laplacian is $O\left(\left(\log n/n\right)^{1/(m+2)}\right)$, where $m$ and $n$ denote the dimension of the manifold and the sample size, respectively.
 [13] arXiv:2110.08150 (cross-list from math.OC) [pdf, ps, other]

Title: Halpern-Type Accelerated and Splitting Algorithms For Monotone Inclusions
Comments: 33 pages
Subjects: Optimization and Control (math.OC); Machine Learning (stat.ML)
In this paper, we develop a new type of accelerated algorithms to solve some classes of maximally monotone equations as well as monotone inclusions. Instead of using Nesterov's accelerating approach, our methods rely on a so-called Halpern-type fixed-point iteration in [32], and recently exploited by a number of researchers, including [24, 70]. Firstly, we derive a new variant of the anchored extra-gradient scheme in [70] based on Popov's past extra-gradient method to solve a maximally monotone equation $G(x) = 0$. We show that our method achieves the same $\mathcal{O}(1/k)$ convergence rate (up to a constant factor) as in the anchored extra-gradient algorithm on the operator norm $\Vert G(x_k)\Vert$, but requires only one evaluation of $G$ at each iteration, where $k$ is the iteration counter. Next, we develop two splitting algorithms to approximate a zero point of the sum of two maximally monotone operators. The first algorithm originates from the anchored extra-gradient method combining with a splitting technique, while the second one is its Popov's variant which can reduce the per-iteration complexity. Both algorithms appear to be new and can be viewed as accelerated variants of the Douglas-Rachford (DR) splitting method. They both achieve $\mathcal{O}(1/k)$ rates on the norm $\Vert G_{\gamma}(x_k)\Vert$ of the forward-backward residual operator $G_{\gamma}(\cdot)$ associated with the problem. We also propose a new accelerated Douglas-Rachford splitting scheme for solving this problem which achieves $\mathcal{O}(1/k)$ convergence rate on $\Vert G_{\gamma}(x_k)\Vert$ under only maximally monotone assumptions. Finally, we specify our first algorithm to solve convex-concave minimax problems and apply our accelerated DR scheme to derive a new variant of the alternating direction method of multipliers (ADMM).
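The Halpern-type fixed-point iteration that the abstract above builds on can be sketched in its classical form, $x_{k+1} = \beta_k x_0 + (1-\beta_k) T(x_k)$ with $\beta_k = 1/(k+2)$, for a nonexpansive map $T$. The sketch below is that textbook scheme only, not the anchored extra-gradient or splitting variants of the paper; the 90-degree rotation used as $T$ is a hypothetical test operator whose only fixed point is the origin.

```python
import math
from typing import Callable, List

Vector = List[float]


def halpern(T: Callable[[Vector], Vector], x0: Vector, n_iters: int) -> Vector:
    """Classical Halpern iteration, anchored at the starting point x0."""
    x = list(x0)
    for k in range(n_iters):
        beta = 1.0 / (k + 2)  # anchoring weight, decays like 1/k
        tx = T(x)
        x = [beta * a + (1.0 - beta) * b for a, b in zip(x0, tx)]
    return x


def rotate90(x: Vector) -> Vector:
    """Rotation by 90 degrees in the plane: nonexpansive, fixed point at 0."""
    return [-x[1], x[0]]


def residual(T: Callable[[Vector], Vector], x: Vector) -> float:
    """Norm of x - T(x), i.e. the norm of G(x) for G = I - T."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, T(x))))


x_k = halpern(rotate90, [1.0, 0.0], 1000)
print(residual(rotate90, x_k))
```

Plain fixed-point iteration $x_{k+1} = T(x_k)$ would circle forever on this operator, which is why the anchoring term matters here.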
 [14] arXiv:2110.08161 (cross-list from stat.ME) [pdf, other]

Title: SAFFRON and LORD Ensure Online Control of the False Discovery Rate Under Positive Dependence
Authors: Aaron Fisher
Subjects: Methodology (stat.ME); Machine Learning (stat.ML)
Online testing procedures assume that hypotheses are observed in sequence, and allow the significance thresholds for upcoming tests to depend on the test statistics observed so far. Some of the most popular online methods include alpha investing, LORD++ (hereafter, LORD), and SAFFRON. These three methods have been shown to provide online control of the "modified" false discovery rate (mFDR). However, to our knowledge, they have only been shown to control the traditional false discovery rate (FDR) under an independence condition on the test statistics. Our work bolsters these results by showing that SAFFRON and LORD additionally ensure online control of the FDR under nonnegative dependence. Because alpha investing can be recovered as a special case of the SAFFRON framework, the same result applies to this method as well. Our result also allows for certain forms of adaptive stopping times, for example, stopping after a certain number of rejections have been observed.
 [15] arXiv:2110.08205 (cross-list from stat.ME) [pdf, other]

Title: Fast Online Changepoint Detection via Functional Pruning CUSUM statistics
Subjects: Methodology (stat.ME); Computation (stat.CO); Machine Learning (stat.ML)
Many modern applications of online changepoint detection require the ability to process high-frequency observations, sometimes with limited available computational resources. Online algorithms for detecting a change in mean often involve using a moving window, or specifying the expected size of change. Such choices affect which changes the algorithms have most power to detect. We introduce an algorithm, Functional Online CuSUM (FOCuS), which is equivalent to running these earlier methods simultaneously for all sizes of window, or all possible values for the size of change. Our theoretical results give tight bounds on the expected computational cost per iteration of FOCuS, with this being logarithmic in the number of observations. We show how FOCuS can be applied to a number of different change in mean scenarios, and demonstrate its practical utility through its state-of-the-art performance at detecting anomalous behaviour in computer server data.
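One of the "earlier methods" the abstract above refers to, a CUSUM detector that requires the size of change to be specified in advance, can be sketched as follows. This is the classical Page-style recursion for an upward shift in mean, assuming a pre-change mean of 0 and unit variance; it is not FOCuS itself, and the names `cusum_detect`, `mu` and `threshold` are hypothetical.

```python
from typing import Optional, Sequence


def cusum_detect(xs: Sequence[float], mu: float, threshold: float) -> Optional[int]:
    """Return the first index at which the CUSUM statistic reaches the threshold.

    The statistic accumulates the log-likelihood ratio of mean mu versus
    mean 0 and resets to zero whenever it would become negative.
    """
    s = 0.0
    for t, x in enumerate(xs):
        s = max(0.0, s + mu * (x - mu / 2.0))
        if s >= threshold:
            return t
    return None  # no change detected


# Noise-free toy stream: the mean jumps from 0 to 1 at index 50.
stream = [0.0] * 50 + [1.0] * 50
print(cusum_detect(stream, mu=1.0, threshold=5.0))
```

The detector is tuned to the assumed `mu`: a much smaller or larger true shift is detected later or less reliably, which is the sensitivity the abstract says FOCuS removes by covering all change sizes at once.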
 [16] arXiv:2110.08211 (cross-list from astro-ph.IM) [pdf, other]

Title: Astronomical source finding services for the CIRASA visual analytic platform
Authors: S. Riggia, C. Bordiu, F. Vitello, G. Tudisco, E. Sciacca, D. Magro, R. Sortino, C. Pino, M. Molinaro, M. Benedettini, S. Leurini, F. Bufano, M. Raciti, U. Becciani
Comments: 16 pages, 6 figures
Subjects: Instrumentation and Methods for Astrophysics (astro-ph.IM); Computation (stat.CO); Machine Learning (stat.ML)
Innovative developments in data processing, archiving, analysis, and visualization are nowadays unavoidable to deal with the data deluge expected in next-generation facilities for radio astronomy, such as the Square Kilometre Array (SKA) and its precursors. In this context, the integration of source extraction and analysis algorithms into data visualization tools could significantly improve and speed up the cataloguing process of large area surveys, boosting astronomer productivity and shortening publication time. To this aim, we are developing a visual analytic platform (CIRASA) for advanced source finding and classification, integrating state-of-the-art tools, such as the CAESAR source finder, the ViaLactea Visual Analytic (VLVA) and Knowledge Base (VLKB). In this work, we present the project objectives and the platform architecture, focusing on the implemented source finding services.
Replacements for Mon, 18 Oct 21
 [17] arXiv:2105.07610 (replaced) [pdf, other]

Title: Cross-Cluster Weighted Forests
Comments: 19 pages, 6 figures, 1 table
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)
 [18] arXiv:2106.02078 (replaced) [pdf, other]

Title: Improving Neural Network Robustness via Persistency of Excitation
Subjects: Machine Learning (stat.ML); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Systems and Control (eess.SY)
 [19] arXiv:2106.02713 (replaced) [pdf, other]

Title: Learning Curves for SGD on Structured Features
Comments: Added new analysis of optimal batch size and learning rate. Provided theoretical learning curves for case where test and training measures are different and apply to predicting errors for test/train splits on real datasets. Also provided a new bound for non-Gaussian features based on a regularity condition proposed by Varre et al 2021 arXiv:2102.03183
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)
 [20] arXiv:2107.02474 (replaced) [pdf, other]

Title: Viscos Flows: Variational Schur Conditional Sampling With Normalizing Flows
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)
 [21] arXiv:2107.09088 (replaced) [pdf, other]

Title: Reward-Weighted Regression Converges to a Global Optimum
Authors: Miroslav Štrupl, Francesco Faccio, Dylan R. Ashley, Rupesh Kumar Srivastava, Jürgen Schmidhuber
Comments: 7 pages in main text + 2 pages of references + 6 pages of appendices, 1 figure in main text + 1 figure in appendices; source code available at this https URL
Subjects: Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
 [22] arXiv:2110.06381 (replaced) [pdf, other]

Title: Meta Learning Low Rank Covariance Factors for Energy-Based Deterministic Uncertainty
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)
 [23] arXiv:2110.06581 (replaced) [pdf, other]

Title: Averting A Crisis In Simulation-Based Inference
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)
 [24] arXiv:1912.08140 (replaced) [pdf, other]

Title: On-the-fly Global Embeddings Using Random Projections for Extreme Multi-label Classification
Authors: Yashaswi Verma
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Information Retrieval (cs.IR); Machine Learning (stat.ML)
 [25] arXiv:2002.12036 (replaced) [pdf, other]

Title: Complexity Measures and Features for Times Series classification
Subjects: Machine Learning (cs.LG); Information Theory (cs.IT); Machine Learning (stat.ML)
 [26] arXiv:2006.06592 (replaced) [pdf, other]

Title: The Backbone Method for Ultra-High Dimensional Sparse Machine Learning
Comments: First submission to Machine Learning: 06/2020. Revised: 10/2021
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
 [27] arXiv:2006.09773 (replaced) [pdf, other]

Title: Neural Ordinary Differential Equation Control of Dynamics on Graphs
Comments: Fifth version improves and clears notation
Subjects: Machine Learning (cs.LG); Social and Information Networks (cs.SI); Machine Learning (stat.ML)
 [28] arXiv:2007.12652 (replaced) [pdf, other]

Title: MurTree: Optimal Classification Trees via Dynamic Programming and Search
Authors: Emir Demirović, Anna Lukina, Emmanuel Hebrard, Jeffrey Chan, James Bailey, Christopher Leckie, Kotagiri Ramamohanarao, Peter J. Stuckey
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Data Structures and Algorithms (cs.DS); Machine Learning (stat.ML)
 [29] arXiv:2009.06921 (replaced) [pdf, ps, other]

Title: Optimal Decision Trees for Nonlinear Metrics
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Data Structures and Algorithms (cs.DS); Machine Learning (stat.ML)
 [30] arXiv:2102.12353 (replaced) [pdf, other]

Title: Nonlinear Invariant Risk Minimization: A Causal Approach
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
 [31] arXiv:2102.13088 (replaced) [pdf, other]

Title: Even your Teacher Needs Guidance: Ground-Truth Targets Dampen Regularization Imposed by Self-Distillation
Comments: To be published at NeurIPS 2021; 21 pages, 14 figures
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
 [32] arXiv:2103.12591 (replaced) [pdf, ps, other]

Title: BoXHED2.0: Scalable boosting of dynamic survival analysis
Comments: 12 pages
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
 [33] arXiv:2106.05319 (replaced) [pdf, other]

Title: Stein Latent Optimization for Generative Adversarial Networks
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
 [34] arXiv:2106.11609 (replaced) [pdf, other]

Title: Distributional Gradient Matching for Learning Uncertain Neural Dynamics Models
Comments: Published at NeurIPS 2021
Journal-ref: Advances in Neural Information Processing Systems, 2021
Subjects: Machine Learning (cs.LG); Dynamical Systems (math.DS); Machine Learning (stat.ML)
 [35] arXiv:2106.12248 (replaced) [pdf, other]

Title: ADAVI: Automatic Dual Amortized Variational Inference Applied To Pyramidal Bayesian Models
Comments: Preprint submitted to ICLR 2022
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Neurons and Cognition (q-bio.NC); Machine Learning (stat.ML)
 [36] arXiv:2107.00594 (replaced) [pdf, other]

Title: Pretext Tasks selection for multitask self-supervised speech representation learning
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
 [37] arXiv:2108.03039 (replaced) [pdf, other]

Title: Identifiable Energy-based Representations: An Application to Estimating Heterogeneous Causal Effects
Comments: 19 pages, 2 figures, 9 tables
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
 [38] arXiv:2110.01583 (replaced) [pdf, other]

Title: Online Control of the False Discovery Rate under "Decision Deadlines"
Authors: Aaron Fisher
Comments: Keywords: adaptive stopping time, batch testing, data decay, decaying memory, quality preserving database. Updates: Expanded simulations, other minor edits for submission
Subjects: Methodology (stat.ME); Machine Learning (stat.ML)
 [39] arXiv:2110.03684 (replaced) [pdf, other]

Title: Cross-Domain Imitation Learning via Optimal Transport
Comments: typos corrected, references added
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Robotics (cs.RO); Machine Learning (stat.ML)
 [40] arXiv:2110.04020 (replaced) [pdf, other]

Title: Pathologies in priors and inference for Bayesian transformers
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
 [41] arXiv:2110.05316 (replaced) [pdf, ps, other]

Title: Reversible Genetically Modified Mode Jumping MCMC
Comments: 6 pages, 2 tables, based on arXiv:1806.02160, which got divided into two revised articles
Journal-ref: Published in Proceedings of 22nd European Young Statisticians Meeting (ISBN: 9789607943231), 2021. URL: https://www.eysm2021.panteion.gr/files/Proceedings_EYSM_2021.pdf Parpoula & Athanasios Rakitzis
Subjects: Methodology (stat.ME); Computation (stat.CO); Machine Learning (stat.ML)
 [42] arXiv:2110.06871 (replaced) [pdf, other]

Title: Two-argument activation functions learn soft XOR operations like cortical neurons
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Neurons and Cognition (q-bio.NC); Machine Learning (stat.ML)