Data Structures and Algorithms

New submissions

Submissions received from Wed 17 Apr 24 to Thu 18 Apr 24, announced Fri, 19 Apr 24


New submissions for Fri, 19 Apr 24

[1] arXiv:2404.11673 [pdf, other]: Title: Hairpin Completion Distance Lower Bound

Authors: Itai Boneh, Dvir Fried, Shay Golan, Matan Kraus

Comments: To be published in CPM 2024

Subjects: Data Structures and Algorithms (cs.DS)

Hairpin completion, derived from the hairpin formation observed in DNA biochemistry, is an operation applied to strings, particularly useful in DNA computing. Conceptually, a right hairpin completion operation transforms a string $S$ into $S\cdot S'$ where $S'$ is the reverse complement of a prefix of $S$. Similarly, a left hairpin completion operation transforms a string $S$ into $S'\cdot S$ where $S'$ is the reverse complement of a suffix of $S$. The hairpin completion distance from $S$ to $T$ is the minimum number of hairpin completion operations needed to transform $S$ into $T$. Recently Boneh et al. showed an $O(n^2)$ time algorithm for finding the hairpin completion distance between two strings of length at most $n$. In this paper we show that for any $\varepsilon>0$ there is no $O(n^{2-\varepsilon})$-time algorithm for the hairpin completion distance problem unless the Strong Exponential Time Hypothesis (SETH) is false. Thus, under SETH, the time complexity of the hairpin completion distance problem is quadratic, up to sub-polynomial factors.
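The two operations defined in the abstract above can be illustrated with a minimal sketch. It follows only the conceptual definition given there (append or prepend the reverse complement of a prefix or suffix) and assumes the DNA alphabet A, C, G, T with the usual pairing A-T and C-G. The function names and the length parameter `k` are hypothetical, and any further constraints the paper's formal definition may impose are not modelled.

```python
# Illustrative sketch only: names and the parameter k are assumptions,
# not taken from the paper.
COMPLEMENT = str.maketrans("ACGT", "TGCA")  # assumed Watson-Crick pairing


def reverse_complement(s: str) -> str:
    """Complement every symbol, then reverse the string."""
    return s.translate(COMPLEMENT)[::-1]


def right_hairpin_completion(s: str, k: int) -> str:
    """Return S . S', where S' is the reverse complement of the length-k prefix of S."""
    if not 0 <= k <= len(s):
        raise ValueError("k must be between 0 and len(s)")
    return s + reverse_complement(s[:k])


def left_hairpin_completion(s: str, k: int) -> str:
    """Return S' . S, where S' is the reverse complement of the length-k suffix of S."""
    if not 0 <= k <= len(s):
        raise ValueError("k must be between 0 and len(s)")
    return reverse_complement(s[len(s) - k:]) + s
```

For instance, `right_hairpin_completion("ACGT", 2)` appends the reverse complement of the prefix `AC`, which is `GT`.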
[2] arXiv:2404.11879 [pdf, other]: Title: Public Event Scheduling with Busy Agents

Authors: Bo Li, Lijun Li, Minming Li, Ruilong Zhang

Comments: To appear in IJCAI 2024

Subjects: Data Structures and Algorithms (cs.DS)

We study a public event scheduling problem, where multiple public events are scheduled to coordinate the availability of multiple agents. The availability of each agent is determined by solving a separate flexible interval job scheduling problem, where the jobs are required to be preemptively processed. The agents want to attend as many events as possible, and their agreements are considered to be the total length of time during which they can attend these events. The goal is to find a schedule for events as well as the job schedule for each agent such that the total agreement is maximized.
We first show that the problem is NP-hard, and then prove that a simple greedy algorithm achieves a $\frac{1}{2}$-approximation when the whole timeline is polynomially bounded. Our method also implies a $(1-\frac{1}{e})$-approximate algorithm for this case. Subsequently, for the general timeline case, we present an algorithmic framework that extends a $\frac{1}{\alpha}$-approximate algorithm for the one-event instance to the general case, achieving a $\frac{1}{\alpha+1}$-approximation. Finally, we give a polynomial time algorithm that solves the one-event instance, and this implies a $\frac{1}{2}$-approximate algorithm for the general case.
[3] arXiv:2404.12080 [pdf, other]: Title: A Mathematical Formalisation of the γ-contraction Problem

Authors: Elia Onofri

Subjects: Data Structures and Algorithms (cs.DS); Combinatorics (math.CO)

Networks play a ubiquitous role in computer science and real-world applications, offering multiple kinds of information that can be retrieved with adequate methods. With the continuous growth in the amount of data available, networks are becoming larger day by day. Consequently, tasks that were easily achievable on smaller networks often become impractical on huge amounts of data, either due to the high computational cost or due to the impracticality of visualising the corresponding data. Using distinctive node features to group large amounts of connected data into a limited number of clusters, each represented by a single representative, proves to be a valuable approach. The resulting contracted graphs are more manageable in size and can reveal previously hidden characteristics of the original networks. Furthermore, in many real-world use cases, a definition of cluster is intrinsic to the data, possibly obtained with the injection of some expert knowledge represented by a categorical function. Clusters then result in sets of connected vertices taking the same value in a finite set C. In the recent literature, Lombardi and Onofri proposed a novel, fast, and easily parallelisable approach under the name of $\gamma$-contraction to contract a graph given a categorical function. In this work, we formally define this approach by providing a rigorous mathematical definition of the problem, which, to the best of our knowledge, was missing in the existing literature. Specifically, we explore the variadic nature of the contraction operation and use it to introduce a weaker version of the colour contraction, under the name of $\beta$-contraction, that the algorithmic solution exploits. We finally dive into the details of the algorithm and provide a full assessment of its convergence complexity, relying on two constructive proofs that deeply unveil its mode of operation.
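The notion of a cluster used in the abstract above (a connected set of vertices sharing the same value of a categorical function) can be sketched as follows. This is a simple sequential union-find illustration of contracting a graph by colour, not the parallel $\gamma$-contraction or $\beta$-contraction algorithm of the paper. The function name, the edge-list input format, and the choice of the smallest vertex as cluster representative are all assumptions made for the example.

```python
# Illustrative sketch only: a sequential union-find contraction, NOT the
# paper's algorithm. All names and the input format are assumptions.
def colour_contract(n, edges, colour):
    """Merge adjacent vertices of equal colour.

    n      -- number of vertices, labelled 0..n-1
    edges  -- iterable of undirected edges (u, v)
    colour -- list mapping each vertex to its categorical value

    Returns (rep, contracted_edges): rep[v] is the smallest vertex in
    v's cluster, and contracted_edges links distinct clusters.
    """
    edges = list(edges)
    parent = list(range(n))

    def find(x):
        # Path halving keeps the trees shallow.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        if colour[u] == colour[v]:
            ru, rv = find(u), find(v)
            if ru != rv:
                # Attach the larger root to the smaller one, so every
                # root is the minimum vertex of its cluster.
                parent[max(ru, rv)] = min(ru, rv)

    rep = [find(v) for v in range(n)]
    contracted_edges = {
        (min(rep[u], rep[v]), max(rep[u], rep[v]))
        for u, v in edges
        if rep[u] != rep[v]
    }
    return rep, contracted_edges
```

On a path 0-1-2-3-4 with colours a, a, b, b, a, the sketch yields three clusters: vertex 4 stays separate from {0, 1} because the two sets share a colour but are not connected through vertices of that colour.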

Cross-lists for Fri, 19 Apr 24

[4] arXiv:2404.11862 (cross-list from cs.SI) [pdf, ps, other]: Title: A Fast Maximum Clique Algorithm Based on Network Decomposition for Large Sparse Networks

Authors: Tianlong Fan, Wenjun Jiang, Yi-Cheng Zhang, Linyuan Lü

Comments: 12 pages, 2 figures, 1 table

Subjects: Social and Information Networks (cs.SI); Data Structures and Algorithms (cs.DS); Information Retrieval (cs.IR); Data Analysis, Statistics and Probability (physics.data-an)

Finding maximum cliques in large networks is a challenging combinatorial problem with many real-world applications. We present a fast algorithm that computes the exact solution to the maximum clique problem in large sparse networks, based on efficient graph decomposition. Several effective techniques are used to prune the graph substantially, and a novel concept called Complete-Upper-Bound-Induced Subgraph (CUBIS) is proposed to ensure that the structures with the potential to form the maximum clique are retained in the process of graph decomposition. Our algorithm first pre-prunes peripheral nodes; subsequently, one or two small-scale CUBISs are constructed, guided by the core number and the current maximum clique size. Bron-Kerbosch search is performed on each CUBIS to find the maximum clique. Experiments on 50 empirical networks with a scale of up to 20 million show that the CUBIS scales are largely independent of the original network scale. This enables an approximately linear runtime, making our algorithm amenable to large networks. Our work provides a new framework for effectively solving maximum clique problems on massive sparse graphs, which not only makes the graph scale no longer the bottleneck but also sheds some light on solving other clique-related problems.
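The Bron-Kerbosch search mentioned in the abstract above can be sketched in its textbook form with pivoting, adapted to keep only the largest clique found. This is not the CUBIS construction or the paper's pruning pipeline. The function name, the adjacency-dictionary input, and the simple size bound are assumptions made for illustration.

```python
# Illustrative sketch only: textbook Bron-Kerbosch with pivoting, NOT the
# paper's CUBIS-based algorithm. Names and input format are assumptions.
def max_clique(adj):
    """Return one maximum clique of an undirected graph.

    adj -- dict mapping each vertex to the set of its neighbours
           (no self-loops; adjacency must be symmetric).
    """
    best = []

    def expand(r, p, x):
        nonlocal best
        if not p and not x:
            # r is a maximal clique; keep it if it is the largest so far.
            if len(r) > len(best):
                best = list(r)
            return
        if len(r) + len(p) <= len(best):
            # Even taking all candidates cannot beat the current best.
            return
        # Pivot on the vertex with the most neighbours among candidates.
        pivot = max(p | x, key=lambda v: len(p & adj[v]))
        for v in list(p - adj[pivot]):
            expand(r + [v], p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand([], set(adj), set())
    return best
```

A call such as `max_clique({0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}})` returns the three vertices of the triangle in some order.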
[5] arXiv:2404.12107 (cross-list from cs.DB) [pdf, other]: Title: Effective Individual Fairest Community Search over Heterogeneous Information Networks

Authors: Taige Zhao, Jianxin Li, Ningning Cui, Wei Luo

Subjects: Databases (cs.DB); Data Structures and Algorithms (cs.DS); Social and Information Networks (cs.SI)

Community search over heterogeneous information networks (HINs) has been applied in a wide range of domains, such as activity organization and team formation. In these scenarios, the members of a group who receive the same treatment often have different levels of activity and workloads, which causes unfairness in the treatment between active members and inactive members (called individual unfairness). However, existing works do not pay attention to individual fairness and do not sufficiently consider the rich semantics of HINs (e.g., high-order structure), which prevents complex queries. To fill the gap, we formally define the problem of individual fairest community search over HINs (denoted as IFCS), which aims to find a set of vertices from the HIN that have the same type, close relationships, and a small difference in activity level. The problem has been demonstrated to be NP-hard. To solve it, we first develop an exploration-based filter that effectively reduces the search space of the community. Further, to avoid repeated computation and to prune unfair communities in advance, we propose a message-based scheme and a lower bound-based scheme. Finally, we conduct extensive experiments on four real-world datasets to demonstrate the effectiveness and efficiency of our proposed algorithms, which are at least 3 times faster than the baseline solution.

Replacements for Fri, 19 Apr 24

[6] arXiv:1704.04370 (replaced) [pdf, other]: Title: Fast Similarity Sketching

Authors: Søren Dahlgaard, Mathias Bæk Tejs Langhede, Jakob Bæk Tejs Houen, Mikkel Thorup

Comments: The original version was directly based on a conference paper of the same title from FOCS'17. This new version is substantially revised with some cleaner and stronger theorems, particularly concerning the high probability domain. Moreover, there is one more author, Jakob Houen. In addition, one of the old authors, Mathias, has changed surname from Knudsen to Langhede

Subjects: Data Structures and Algorithms (cs.DS)
[7] arXiv:2001.05053 (replaced) [src]: Title: Tight Static Lower Bounds for Non-Adaptive Data Structures

Authors: Giuseppe Persiano, Kevin Yeo

Comments: This paper has been superseded and merged with arXiv:2308.16042

Subjects: Data Structures and Algorithms (cs.DS)
[8] arXiv:2303.06090 (replaced) [pdf, ps, other]: Title: Simple and efficient four-cycle counting on sparse graphs

Authors: Paul Burkhardt, David G. Harris

Subjects: Data Structures and Algorithms (cs.DS)
[9] arXiv:2303.14467 (replaced) [pdf, other]: Title: A Survey on the Densest Subgraph Problem and Its Variants

Authors: Tommaso Lanciano, Atsushi Miyauchi, Adriano Fazzone, Francesco Bonchi

Comments: Accepted to ACM Computing Surveys

Subjects: Data Structures and Algorithms (cs.DS); Artificial Intelligence (cs.AI)
[10] arXiv:2308.07476 (replaced) [pdf, ps, other]: Title: Dependent rounding with strong negative-correlation, and scheduling on unrelated machines to minimize completion time

Authors: David G. Harris

Journal-ref: SODA 2024

Subjects: Data Structures and Algorithms (cs.DS)
[11] arXiv:2310.02670 (replaced) [pdf, other]: Title: Searching 2D-Strings for Matching Frames

Authors: Itai Boneh, Dvir Fried, Shay Golan, Matan Kraus, Adrian Miclaus, Arseny Shur

Subjects: Data Structures and Algorithms (cs.DS)
[12] arXiv:2311.13590 (replaced) [pdf, other]: Title: Triangle-free $2$-matchings

Authors: Katarzyna Paluch

Comments: A more polished version and a clearer explanation of \Delta-diminishing hinges and 1-vulnerable triangles

Subjects: Data Structures and Algorithms (cs.DS); Discrete Mathematics (cs.DM)
[13] arXiv:2403.07760 (replaced) [pdf, other]: Title: Simplified Tight Bounds for Monotone Minimal Perfect Hashing

Authors: Dmitry Kosolobov

Comments: 13 pages, 4 figures

Subjects: Data Structures and Algorithms (cs.DS)
[14] arXiv:2403.12213 (replaced) [pdf, ps, other]: Title: Private graphon estimation via sum-of-squares

Authors: Hongjie Chen, Jingqiu Ding, Tommaso d'Orsi, Yiding Hua, Chih-Hung Liu, David Steurer

Comments: 71 pages, accepted to STOC 2024

Subjects: Data Structures and Algorithms (cs.DS); Computational Complexity (cs.CC); Machine Learning (cs.LG); Machine Learning (stat.ML)
[15] arXiv:2404.11607 (replaced) [pdf, other]: Title: Private federated discovery of out-of-vocabulary words for Gboard

Authors: Ziteng Sun, Peter Kairouz, Haicheng Sun, Adria Gascon, Ananda Theertha Suresh

Subjects: Data Structures and Algorithms (cs.DS)
[16] arXiv:2207.08015 (replaced) [pdf, ps, other]: Title: Parallel Best Arm Identification in Heterogeneous Environments

Authors: Nikolai Karpov, Qin Zhang

Comments: 15 pages (published in SPAA 2024)

Subjects: Machine Learning (cs.LG); Data Structures and Algorithms (cs.DS)
