Data Structures and Algorithms
New submissions
New submissions for Fri, 9 Apr 21
 [1] arXiv:2104.03461 [pdf, other]

Title: Linear and Sublinear Time Spectral Density Estimation
Subjects: Data Structures and Algorithms (cs.DS); Numerical Analysis (math.NA)
We analyze the popular kernel polynomial method (KPM) for approximating the spectral density (eigenvalue distribution) of a real symmetric (or Hermitian) matrix $A \in \mathbb{R}^{n\times n}$. We prove that a simple and practical variant of the KPM algorithm can approximate the spectral density to $\epsilon$ accuracy in the Wasserstein-1 distance with roughly $O(1/\epsilon)$ matrix-vector multiplications with $A$. This yields a provable linear-time result for the problem.
The KPM variant we study is based on damped Chebyshev polynomial expansions. We show that it is stable, meaning that it can be combined with any approximate matrix-vector multiplication algorithm for $A$. As an application, we develop an $O(n/\text{poly}(\epsilon))$ time algorithm for computing the spectral density of any $n\times n$ normalized graph adjacency or Laplacian matrix. This runtime is sublinear in the size of the matrix, and assumes sample access to the graph.
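As a rough illustration of the damped Chebyshev idea, the sketch below estimates Chebyshev moments with random sign vectors and a three-term recurrence, then evaluates a damped expansion. It is not the variant analyzed in the paper: the damping uses the standard Jackson kernel coefficients from the KPM literature, the matrix is assumed to have all eigenvalues in $[-1, 1]$, and the function names (`chebyshev_moments`, `jackson_coefficients`, `damped_density`) are hypothetical.

```python
import math
import random


def matvec(A, v):
    """Dense matrix-vector product for a list-of-lists matrix."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def chebyshev_moments(A, num_moments, num_probes=10, seed=0):
    """Estimate tr(T_k(A)) / n for k = 0..num_moments-1.

    Assumes A is symmetric with eigenvalues in [-1, 1]. Uses random
    +/-1 probe vectors and the recurrence
    T_{k+1}(A) v = 2 A T_k(A) v - T_{k-1}(A) v.
    """
    rng = random.Random(seed)
    n = len(A)
    moments = [0.0] * num_moments
    for _ in range(num_probes):
        v = [rng.choice((-1.0, 1.0)) for _ in range(n)]
        t_prev = v             # T_0(A) v
        t_curr = matvec(A, v)  # T_1(A) v
        moments[0] += dot(v, t_prev)
        if num_moments > 1:
            moments[1] += dot(v, t_curr)
        for k in range(2, num_moments):
            av = matvec(A, t_curr)
            t_next = [2.0 * a - b for a, b in zip(av, t_prev)]
            moments[k] += dot(v, t_next)
            t_prev, t_curr = t_curr, t_next
    return [m / (num_probes * n) for m in moments]


def jackson_coefficients(num_moments):
    """Standard Jackson damping factors (an assumption, see lead-in)."""
    big_n = num_moments
    theta = math.pi / (big_n + 1)
    return [
        ((big_n - k + 1) * math.cos(k * theta)
         + math.sin(k * theta) / math.tan(theta)) / (big_n + 1)
        for k in range(big_n)
    ]


def damped_density(moments, x):
    """Evaluate the damped Chebyshev expansion at x in the open interval (-1, 1)."""
    g = jackson_coefficients(len(moments))
    total = g[0] * moments[0]
    for k in range(1, len(moments)):
        total += 2.0 * g[k] * moments[k] * math.cos(k * math.acos(x))
    return total / (math.pi * math.sqrt(1.0 - x * x))
```

Each probe costs one matrix-vector product per moment, so the total work is driven by the number of moments times the number of probes. A normalized adjacency matrix already has its spectrum in $[-1, 1]$; other matrices would need rescaling first.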
Our approach leverages several tools from approximation theory, including Jackson's seminal work on approximation with positive kernels [Jackson, 1912], and stability properties of three-term recurrence relations for orthogonal polynomials.
 [2] arXiv:2104.03484 [pdf, ps, other]

Title: Advances in Metric Ramsey Theory and its Applications
Authors: Yair Bartal
Comments: This paper is still in preparation; this version is not intended for distribution. A preliminary version of this article was written by the author in 2006 and was presented at the 2007 ICMS Workshop on Geometry and Algorithms. The basic result on constructive metric Ramsey decomposition and the metric Ramsey theorem has also appeared in the author's lecture notes
Subjects: Data Structures and Algorithms (cs.DS); Computational Geometry (cs.CG); Metric Geometry (math.MG)
Metric Ramsey theory is concerned with finding large well-structured subsets of more complex metric spaces. For finite metric spaces this problem was first studied by Bourgain, Figiel and Milman \cite{bfm}, and studied further in depth by Bartal et al. \cite{BLMN03}. In this paper we provide deterministic constructions for this problem via a novel notion of \emph{metric Ramsey decomposition}. This method yields several more applications, reflecting on some basic results in metric embedding theory.
The applications include various results in metric Ramsey theory, including the first deterministic construction yielding Ramsey theorems with tight bounds, as well as stronger theorems and properties, implying appropriate distance oracle applications.
In addition, this decomposition provides the first deterministic Bourgain-type embedding of finite metric spaces into Euclidean space, and an optimal multi-embedding into ultrametrics, thus improving its applications in approximation and online algorithms.
The decomposition presented here, its techniques, and its consequences have already been used in recent research in the field of metric embedding for various applications.
 [3] arXiv:2104.03932 [pdf, ps, other]

Title: Universally-Optimal Distributed Algorithms for Known Topologies
Comments: Full version of extended abstract in STOC 2021
Subjects: Data Structures and Algorithms (cs.DS)
Many distributed optimization algorithms achieve existentially-optimal running times, meaning that there exists some pathological worst-case topology on which no algorithm can do better. Still, most networks of interest allow for exponentially faster algorithms. This motivates two questions: (1) What network topology parameters determine the complexity of distributed optimization? (2) Are there universally-optimal algorithms that are as fast as possible on every topology?
We resolve these 25-year-old open problems in the known-topology setting (i.e., supported CONGEST) for a wide class of global network optimization problems including MST, $(1+\varepsilon)$-min-cut, various approximate shortest paths problems, subgraph connectivity, etc.
In particular, we provide several (equivalent) graph parameters and show they are tight universal lower bounds for the above problems, fully characterizing their inherent complexity. Our results also imply that algorithms based on the low-congestion shortcut framework match the above lower bound, making them universally optimal if shortcuts are efficiently approximable. We leverage a recent result in hop-constrained oblivious routing to show this is the case if the topology is known, giving universally-optimal algorithms for all above problems.
Cross-lists for Fri, 9 Apr 21
 [4] arXiv:2104.03353 (cross-list from cs.DB) [pdf, other]

Title: Correlation Sketches for Approximate Join-Correlation Queries
Comments: Proceedings of the 2021 International Conference on Management of Data (SIGMOD '21)
Subjects: Databases (cs.DB); Data Structures and Algorithms (cs.DS); Information Retrieval (cs.IR)
The increasing availability of structured datasets, from Web tables and open-data portals to enterprise data, opens up opportunities to enrich analytics and improve machine learning models through relational data augmentation. In this paper, we introduce a new class of data augmentation queries: join-correlation queries. Given a column $Q$ and a join column $K_Q$ from a query table $\mathcal{T}_Q$, retrieve tables $\mathcal{T}_X$ in a dataset collection such that $\mathcal{T}_X$ is joinable with $\mathcal{T}_Q$ on $K_Q$ and there is a column $C \in \mathcal{T}_X$ such that $Q$ is correlated with $C$. A naive approach to evaluate these queries, which first finds joinable tables and then explicitly joins and computes correlations between $Q$ and all columns of the discovered tables, is prohibitively expensive. To efficiently support correlated column discovery, we 1) propose a sketching method that enables the construction of an index for a large number of tables and that provides accurate estimates for join-correlation queries, and 2) explore different scoring strategies that effectively rank the query results based on how well the columns are correlated with the query. We carry out a detailed experimental evaluation, using both synthetic and real data, which shows that our sketches attain high accuracy and the scoring strategies lead to high-quality rankings.
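The abstract does not spell out how the sketch is built, so the following is only one plausible illustration of estimating a join-correlation without materializing the join. It assumes a bottom-k sample keyed on hashed join keys, unique keys per table, and Pearson correlation as the measure; `build_sketch` and `estimate_correlation` are hypothetical names, not the paper's API.

```python
import hashlib
import math


def key_hash(key):
    """Map a join key to a 64-bit integer using a fixed hash function."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def build_sketch(rows, k):
    """Keep the k (key hash, value) pairs with the smallest key hashes.

    rows is an iterable of (join_key, numeric_value) pairs; keys are
    assumed unique within a table.
    """
    hashed = sorted((key_hash(key), value) for key, value in rows)
    return dict(hashed[:k])


def pearson(xs, ys):
    """Pearson correlation, or None if either side has zero variance."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    return cov / math.sqrt(var_x * var_y)


def estimate_correlation(sketch_q, sketch_c):
    """Join two sketches on hashed keys and correlate the paired values."""
    common = sorted(sketch_q.keys() & sketch_c.keys())
    if len(common) < 2:
        return None
    xs = [sketch_q[h] for h in common]
    ys = [sketch_c[h] for h in common]
    return pearson(xs, ys)
```

Because both tables use the same hash function, keys that survive in one sketch tend to survive in the other, so the sketch join behaves like a sample of the true join.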
 [5] arXiv:2104.03673 (cross-list from cs.DC) [pdf, other]

Title: Practical Byzantine Reliable Broadcast on Partially Connected Networks
Comments: This is a preprint of a paper that will appear at the IEEE ICDCS 2021 conference
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Data Structures and Algorithms (cs.DS); Networking and Internet Architecture (cs.NI)
In this paper, we consider the Byzantine reliable broadcast problem on authenticated and partially connected networks. The state-of-the-art method to solve this problem consists in combining two algorithms from the literature. Handling asynchrony and faulty senders is typically done thanks to Gabriel Bracha's authenticated double-echo broadcast protocol, which assumes an asynchronous fully connected network. Danny Dolev's algorithm can then be used to provide reliable communications between processes in the global fault model, where up to f processes among N can be faulty in a communication network that is at least (2f+1)-connected. Following recent works that showed that Dolev's protocol can be made more practical thanks to several optimizations, we show that the state-of-the-art methods to solve our problem can be optimized thanks to layer-specific and cross-layer optimizations. Our simulations with the Omnet++ network simulator show that these optimizations can be efficiently combined to decrease the total amount of information transmitted or the protocol's latency (e.g., respectively, 25% and 50% with a 16B payload, N=31 and f=4) compared to the state-of-the-art combination of Bracha's and Dolev's protocols.
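To make the fault-tolerance parameters concrete, the sketch below computes the message thresholds usually quoted for Bracha's double-echo broadcast and checks the connectivity condition stated above for Dolev's algorithm. The thresholds and the N > 3f requirement come from the textbook description of the protocol, not from this abstract, and the function names are hypothetical.

```python
def bracha_thresholds(n, f):
    """Textbook quorum sizes for double-echo broadcast with n processes, f faulty.

    Raises ValueError unless n > 3f, the usual resilience requirement.
    """
    if f < 0 or n <= 3 * f:
        raise ValueError("double-echo broadcast requires n > 3f")
    return {
        # ECHO messages needed before sending READY: more than (n + f) / 2.
        "echo_quorum": (n + f) // 2 + 1,
        # READY messages that trigger sending READY without an echo quorum.
        "ready_amplify": f + 1,
        # READY messages needed before delivering the payload.
        "deliver": 2 * f + 1,
    }


def dolev_connectivity_ok(connectivity, f):
    """Check that the network's vertex connectivity is at least 2f + 1."""
    return connectivity >= 2 * f + 1
```

For the configuration quoted in the abstract (N=31, f=4), `bracha_thresholds(31, 4)` gives an echo quorum of 18, a ready amplification threshold of 5, and a delivery threshold of 9, and the network must be at least 9-connected.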
Replacements for Fri, 9 Apr 21
 [6] arXiv:1901.03627 (replaced) [pdf, ps, other]

Title: Destroying Bicolored $P_3$s by Deleting Few Edges
Comments: 30 pages
Subjects: Data Structures and Algorithms (cs.DS); Discrete Mathematics (cs.DM); Combinatorics (math.CO)
 [7] arXiv:2006.08668 (replaced) [pdf, ps, other]

Title: Algorithmic Aspects of Temporal Betweenness
Subjects: Data Structures and Algorithms (cs.DS); Discrete Mathematics (cs.DM)
 [8] arXiv:2011.03622 (replaced) [pdf, ps, other]

Title: Settling the Robust Learnability of Mixtures of Gaussians
Subjects: Data Structures and Algorithms (cs.DS); Machine Learning (cs.LG); Statistics Theory (math.ST); Machine Learning (stat.ML)
 [9] arXiv:1602.07570 (replaced) [pdf, ps, other]

Title: Bayesian Exploration: Incentivizing Exploration in Bayesian Games
Comments: All revisions focused on presentation; all results (except Appendix C) have been present since the initial version
Subjects: Computer Science and Game Theory (cs.GT); Data Structures and Algorithms (cs.DS); Machine Learning (cs.LG)
 [10] arXiv:1801.07029 (replaced) [pdf, other]

Title: Deterministic Scheduling of Periodic Messages for Low Latency in Cloud RAN
Comments: 40 pages, 23 Figures
Subjects: Networking and Internet Architecture (cs.NI); Data Structures and Algorithms (cs.DS)
 [11] arXiv:2012.03879 (replaced) [pdf, other]

Title: Sequential Stratified Regeneration: MCMC for Large State Spaces with an Application to Subgraph Count Estimation
Comments: Markov Chain Monte Carlo, Random Walk, Regenerative Sampling, Motif Analysis, Subgraph Counting, Graph Mining
Subjects: Social and Information Networks (cs.SI); Data Structures and Algorithms (cs.DS)