New submissions

New submissions for Tue, 4 Aug 20

[1]  arXiv:2008.00532 [pdf, other]
Title: A Stochastic EM Algorithm for Cure Rate Model with Negative Binomial Competing Risks and Non-homogeneous Lifetime
Authors: Suvra Pal
Comments: 28 pages, 4 figures
Subjects: Methodology (stat.ME); Computation (stat.CO)

In this paper, we consider a long-term survival model under a competing risks scenario. Since the number of competing risks is unobserved, we assume it to follow a negative binomial distribution that can capture both the over- and under-dispersion we usually encounter when modeling count data. The distribution of the progression time, corresponding to each competing risk, is associated with a set of risk factors that allow us to capture the non-homogeneous patient population. We also provide flexibility in modeling the cure or long-term survival rate, which is considered as a function of risk factors. Considering the latent competing risks as missing data, we develop a variation of the well-known expectation maximization (EM) algorithm, called the stochastic EM (SEM) algorithm, which is the main contribution of this paper. We show that the SEM algorithm avoids calculation of complicated expectations, which is its major advantage over the EM algorithm. Our proposed procedure allows the objective function being maximized to be split into two simpler functions, one corresponding to the parameters associated with the cure rate and the other corresponding to the parameters associated with the progression times. The advantage of this approach is that each function, with lower parameter dimension, can be maximized independently. Through an extensive Monte Carlo simulation study, we assess the performance of the proposed SEM algorithm in terms of bias, root mean square error, and coverage probability of the asymptotic confidence interval. We also show that the SEM algorithm is not sensitive to the choice of the initial values. Finally, for illustration, we analyze a breast cancer survival data set.

[2]  arXiv:2008.00553 [pdf, ps, other]
Title: A Unifying Framework for Parallel and Distributed Processing in R using Futures
Authors: Henrik Bengtsson
Comments: 16 pages, 0 figures, to be submitted
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Computation (stat.CO)

A future is a programming construct designed for concurrent and asynchronous evaluation of code, making it particularly useful for parallel processing. The future package implements the Future API for programming with futures in R. This minimal API provides sufficient constructs for implementing parallel versions of well-established, high-level map-reduce APIs. The future ecosystem supports exception handling, output and condition relaying, parallel random number generation, and automatic identification of globals, lowering the threshold to parallelize code. The Future API bridges parallel frontends with parallel backends, following the philosophy that end-users are the ones who choose the parallel backend while the developer focuses on what to parallelize. A variety of backends exist, and third-party contributions meeting the specifications, which ensure that the same code works on all backends, are automatically supported. The future framework solves several problems not addressed by other parallel frameworks in R.
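
As a rough illustration of the general idea of a future (not of the R future package itself), the sketch below uses Python's standard-library concurrent.futures module. It creates one future per input, collects the values in input order, and shows that an error raised in a worker resurfaces where the value is requested. The function names slow_square, parallel_map, failing and relayed_error_message are hypothetical and chosen only for this example; the thread pool is an arbitrary choice of backend.

```python
from concurrent.futures import ThreadPoolExecutor


def slow_square(x):
    """Stand-in for an expensive computation (hypothetical example function)."""
    return x * x


def parallel_map(fn, xs, max_workers=4):
    """Create one future per input, then collect the values in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(fn, x) for x in xs]  # submit() returns at once
        return [f.result() for f in futures]        # result() blocks until done


def failing(x):
    """Hypothetical task that always fails, used to show error relaying."""
    raise ValueError(f"bad input: {x}")


def relayed_error_message(x):
    """An exception raised in the worker is re-raised by result() in the caller."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        fut = pool.submit(failing, x)
        try:
            fut.result()
        except ValueError as err:
            return str(err)
    return None


if __name__ == "__main__":
    print(parallel_map(slow_square, [1, 2, 3, 4]))
    print(relayed_error_message(7))
```

Swapping ThreadPoolExecutor for ProcessPoolExecutor would change where the work runs without changing parallel_map's interface, which loosely mirrors the separation between what to parallelize and which backend to use that the abstract describes.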

[3]  arXiv:2008.00961 [pdf, other]
Title: Accelerating Genome Analysis: A Primer on an Ongoing Journey
Subjects: Hardware Architecture (cs.AR); Genomics (q-bio.GN); Computation (stat.CO)

Genome analysis fundamentally starts with a process known as read mapping, where sequenced fragments of an organism's genome are compared against a reference genome. Read mapping is currently a major bottleneck in the entire genome analysis pipeline, because state-of-the-art genome sequencing technologies are able to sequence a genome much faster than the computational techniques employed to analyze the genome. We describe the ongoing journey in significantly improving the performance of read mapping. We explain state-of-the-art algorithmic methods and hardware-based acceleration approaches. Algorithmic approaches exploit the structure of the genome as well as the structure of the underlying hardware. Hardware-based acceleration approaches exploit specialized microarchitectures or various execution paradigms (e.g., processing inside or near memory). We conclude with the challenges of adopting these hardware-accelerated read mappers.

Replacements for Tue, 4 Aug 20

[4]  arXiv:1912.11914 (replaced) [pdf, other]
Title: Inverses of Matern Covariances on Grids
Authors: Joseph Guinness
Subjects: Computation (stat.CO); Statistics Theory (math.ST); Machine Learning (stat.ML)
[5]  arXiv:2005.14281 (replaced) [pdf, ps, other]
Title: MCMC for Bayesian uncertainty quantification from time-series data
Journal-ref: LNCS, volume 12143, Computational Science - ICCS 2020
Subjects: Computation (stat.CO)
[6]  arXiv:2002.09006 (replaced) [pdf, ps, other]
Title: A table of short-period Tausworthe generators for Markov chain quasi-Monte Carlo
Authors: Shin Harase
Subjects: Numerical Analysis (math.NA); Computation (stat.CO)
[7]  arXiv:2006.07561 (replaced) [pdf, other]
Title: Model Based Screening Embedded Bayesian Variable Selection for Ultra-high Dimensional Settings
Comments: 54 pages including supplementary, 4 figures and 6 tables
Subjects: Methodology (stat.ME); Computation (stat.CO)