Data Analysis, Statistics and Probability
New submissions
New submissions for Tue, 25 Feb 20
 [1] arXiv:2002.09530 [pdf, other]

Title: Using machine learning to separate hadronic and electromagnetic interactions in the GlueX forward calorimeter
Comments: 12 pages, 10 figures, submitted to JINST
Subjects: Data Analysis, Statistics and Probability (physics.data-an); Instrumentation and Detectors (physics.ins-det)
The GlueX forward calorimeter is an array of 2800 lead glass modules that was constructed to detect photons produced in the decays of hadrons. A background to this process originates from hadronic interactions in the calorimeter, which, in some instances, can be difficult to distinguish from low energy photon interactions. Machine learning techniques were applied to the classification of particle interactions in the GlueX forward calorimeter. The algorithms were trained on data using decays of the $\omega$ meson, which contain both true photons and charged particles that interact with the calorimeter. Algorithms were evaluated on efficiency, rate of false positives, run time, and implementation complexity. An algorithm that utilizes a multilayer perceptron neural net was deployed in the GlueX software stack and provides a signal efficiency of 85% with a background rejection of 60% for an inclusive $\pi^0$ data sample for an intermediate quality constraint.
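The two figures of merit quoted above, signal efficiency and background rejection, can be illustrated with a minimal sketch. It assumes a classifier that assigns each shower a score and a simple threshold cut. The function name, scores, and labels below are hypothetical and are not taken from the GlueX software.

```python
def efficiency_and_rejection(scores, is_signal, threshold):
    """Return (signal efficiency, background rejection) for a cut score >= threshold."""
    n_sig = sum(1 for s in is_signal if s)
    n_bkg = len(is_signal) - n_sig
    if n_sig == 0 or n_bkg == 0:
        raise ValueError("need at least one signal and one background shower")
    # Signal efficiency: fraction of true photons that pass the cut.
    sig_kept = sum(1 for x, s in zip(scores, is_signal) if s and x >= threshold)
    # Background rejection: fraction of hadronic showers that fail the cut.
    bkg_cut = sum(1 for x, s in zip(scores, is_signal) if not s and x < threshold)
    return sig_kept / n_sig, bkg_cut / n_bkg


# Hypothetical classifier outputs: four photon showers, four hadronic showers.
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2, 0.1, 0.4]
is_signal = [True, True, True, True, False, False, False, False]

eff, rej = efficiency_and_rejection(scores, is_signal, threshold=0.5)
print(f"signal efficiency = {eff:.2f}, background rejection = {rej:.2f}")
```

Raising the threshold trades efficiency for rejection, which is presumably what the different quality constraints mentioned in the abstract select between.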
 [2] arXiv:2002.09865 [pdf, other]

Title: Bryan's Maximum Entropy Method - diagnosis of a flawed argument and its remedy
Authors: Alexander Rothkopf
Comments: 7 pages, 1 figure
Subjects: Data Analysis, Statistics and Probability (physics.data-an); High Energy Physics - Lattice (hep-lat); High Energy Physics - Phenomenology (hep-ph); Nuclear Theory (nucl-th); Computational Physics (physics.comp-ph)
The Maximum Entropy Method (MEM) is a popular data analysis technique, based on Bayesian inference, which has found various applications in the research literature. While the MEM itself is well grounded in statistics, I argue that its state-of-the-art implementation, suggested originally by Bryan, artificially restricts its solution space. This restriction leads to a systematic error often unaccounted for in contemporary MEM studies. Since previously published arguments on the shortcoming of Bryan's MEM have recently been questioned in arXiv:2001.10205, this paper will carefully revisit Bryan's train of thought, point out its flaw in applying linear algebra arguments to an inherently nonlinear problem and suggest possible ways to overcome it.
Cross-lists for Tue, 25 Feb 20
 [3] arXiv:2002.09713 (cross-list from stat.OT) [pdf, other]

Title: Connections between statistical practice in elementary particle physics and the severity concept as discussed in Mayo's Statistical Inference as Severe Testing
Authors: Robert D. Cousins
Comments: 25 pages including 4 figures
Subjects: Other Statistics (stat.OT); High Energy Physics - Experiment (hep-ex); Data Analysis, Statistics and Probability (physics.data-an)
For many years, philosopher-of-statistics Deborah Mayo has been advocating the concept of severe testing as a key part of hypothesis testing. Her recent book, Statistical Inference as Severe Testing, is a comprehensive exposition of her arguments in the context of a historical study of many threads of statistical inference, both frequentist and Bayesian. Her foundational point of view is called error statistics, emphasizing frequentist evaluation of the errors called Type I and Type II in the Neyman-Pearson theory of frequentist hypothesis testing. Since the field of elementary particle physics (also known as high energy physics) has strong traditions in frequentist inference, one might expect that something like the severity concept was independently developed in the field. Indeed, I find that, at least operationally (numerically), we high-energy physicists have long interpreted data in ways that map directly onto severity. Whether or not we subscribe to Mayo's philosophical interpretations of severity is a more complicated story that I do not address here.
 [4] arXiv:2002.09770 (cross-list from physics.soc-ph) [pdf, other]

Title: Allotaxonometry and rank-turbulence divergence: A universal instrument for comparing complex systems
Authors: P. S. Dodds, J. R. Minot, M. V. Arnold, T. Alshaabi, J. L. Adams, D. R. Dewhurst, T. J. Gray, M. R. Frank, A. J. Reagan, C. M. Danforth
Comments: 22 pages, 7 figures, 1 table; online appendices: this http URL
Subjects: Physics and Society (physics.soc-ph); Data Analysis, Statistics and Probability (physics.data-an)
Complex systems often comprise many kinds of components which vary over many orders of magnitude in size: Populations of cities in countries, individual and corporate wealth in economies, species abundance in ecologies, word frequency in natural language, and node degree in complex networks. Comparisons of component size distributions for two complex systems, or a system with itself at two different time points, generally employ information-theoretic instruments, such as Jensen-Shannon divergence. We argue that these methods lack transparency and adjustability, and should not be applied when component probabilities are non-sensible or are problematic to estimate. Here, we introduce `allotaxonometry' along with `rank-turbulence divergence', a tunable instrument for comparing any two (Zipfian) ranked lists of components. We analytically develop our rank-based divergence in a series of steps, and then establish a rank-based allotaxonograph which pairs a map-like histogram for rank-rank pairs with an ordered list of components according to divergence contribution. We explore the performance of rank-turbulence divergence for a series of distinct settings including: Language use on Twitter and in books, species abundance, baby name popularity, market capitalization, performance in sports, mortality causes, and job titles. We provide a series of supplementary flipbooks which demonstrate the tunability and storytelling power of rank-based allotaxonometry.
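For orientation, the probability-based baseline named in the abstract, Jensen-Shannon divergence, can be sketched as below. This is only the conventional instrument the authors argue against, not their rank-turbulence divergence. The function name and the toy word counts are illustrative assumptions.

```python
import math


def jensen_shannon_divergence(counts_a, counts_b):
    """Jensen-Shannon divergence in bits between two {component: count} systems."""
    total_a = sum(counts_a.values())
    total_b = sum(counts_b.values())
    jsd = 0.0
    for key in set(counts_a) | set(counts_b):
        p = counts_a.get(key, 0) / total_a
        q = counts_b.get(key, 0) / total_b
        m = 0.5 * (p + q)  # mixture distribution
        # Each term is half of a Kullback-Leibler contribution against the mixture.
        if p > 0:
            jsd += 0.5 * p * math.log2(p / m)
        if q > 0:
            jsd += 0.5 * q * math.log2(q / m)
    return jsd


# Hypothetical word counts for two small corpora.
corpus_1 = {"the": 50, "of": 30, "physics": 5}
corpus_2 = {"the": 45, "of": 20, "entropy": 10}
print(jensen_shannon_divergence(corpus_1, corpus_2))
```

The sketch needs a probability for every component, so counts must first be normalized. That is the step the abstract flags as problematic when probabilities are hard to estimate.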
 [5] arXiv:2002.10440 (cross-list from astro-ph.IM) [pdf, other]

Title: Modeling Aerial Gamma-Ray Backgrounds using Non-negative Matrix Factorization
Comments: 14 pages, 12 figures, accepted for publication in IEEE Transactions on Nuclear Science
Subjects: Instrumentation and Methods for Astrophysics (astro-ph.IM); Applied Physics (physics.app-ph); Data Analysis, Statistics and Probability (physics.data-an)
Airborne gamma-ray surveys are useful for many applications, ranging from geology and mining to public health and nuclear security. In all these contexts, the ability to decompose a measured spectrum into a linear combination of background source terms can provide useful insights into the data and lead to improvements over techniques that use spectral energy windows. Multiple methods for the linear decomposition of spectra exist but are subject to various drawbacks, such as allowing negative photon fluxes or requiring detailed Monte Carlo modeling. We propose using Non-negative Matrix Factorization (NMF) as a data-driven approach to spectral decomposition. Using aerial surveys that include flights over water, we demonstrate that the mathematical approach of NMF finds physically relevant structure in aerial gamma-ray background, namely that measured spectra can be expressed as the sum of nearby terrestrial emission, distant terrestrial emission, and radon and cosmic emission. These NMF background components are compared to the background components obtained using Noise-Adjusted Singular Value Decomposition (NASVD), which contain negative photon fluxes and thus do not represent emission spectra in as straightforward a way. Finally, we comment on potential areas of research that are enabled by NMF decompositions, such as new approaches to spectral anomaly detection and data fusion.
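The idea of decomposing spectra into non-negative components can be sketched with the textbook multiplicative-update rule for NMF under a squared-error objective. This is an assumption made for illustration: the abstract does not say which NMF algorithm or loss the authors use, and the toy spectra and helper names below are hypothetical.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for rv, rw in zip(v, wh) for x, y in zip(rv, rw)) ** 0.5


def nmf(v, k, n_iter=200, seed=0, eps=1e-12):
    """Factor non-negative V (m x n) into W (m x k) and H (k x n), both non-negative."""
    rng = random.Random(seed)
    m, n = len(v), len(v[0])
    # Strictly positive starting values keep every update well defined.
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(m)]
    h = [[rng.random() + 0.1 for _ in range(n)] for _ in range(k)]
    for _ in range(n_iter):
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n)] for i in range(k)]
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)] for i in range(m)]
    return w, h


# Hypothetical spectra: rows are measurements, columns are energy bins.
# Each row is a non-negative mix of two made-up source spectra.
spectra = [
    [5.0, 3.0, 1.0, 0.0],
    [0.0, 1.0, 2.0, 4.0],
    [5.0, 4.0, 3.0, 4.0],
    [10.0, 7.0, 4.0, 4.0],
]
w, h = nmf(spectra, k=2)
print("reconstruction error:", frobenius_error(spectra, w, h))
```

Rows of `h` play the role of component spectra and rows of `w` hold their weights in each measurement. Because the updates only multiply by non-negative ratios, neither factor can go negative, which is the property the abstract contrasts with NASVD.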
Replacements for Tue, 25 Feb 20
 [6] arXiv:1708.08794 (replaced) [pdf, other]

Title: Impact of non-stationarity on hybrid ensemble filters: A study with a doubly stochastic advection-diffusion-decay model
Comments: The accepted version of the published article
Journal-ref: Quarterly Journal of the Royal Meteorological Society, 2019, v. 145, N 722, 2255-2271
Subjects: Data Analysis, Statistics and Probability (physics.data-an); Atmospheric and Oceanic Physics (physics.ao-ph); Geophysics (physics.geo-ph)
 [7] arXiv:1907.11674 (replaced) [pdf, other]

Title: Reducing the dependence of the neural network function to systematic uncertainties in the input space
Journal-ref: Comput Softw Big Sci 4, 5 (2020)
Subjects: Data Analysis, Statistics and Probability (physics.data-an)
 [8] arXiv:1910.11571 (replaced) [pdf, other]

Title: Report from RAMP challenge on fast vertexing
Authors: Florian Reiss
Comments: 7 pages, 7 figures, Connecting The Dots and Workshop on Intelligent Trackers 2019
Subjects: Instrumentation and Detectors (physics.ins-det); High Energy Physics - Experiment (hep-ex); Data Analysis, Statistics and Probability (physics.data-an)