Data Structures and Algorithms
New submissions
New submissions for Fri, 19 Apr 24
- [1] arXiv:2404.11673 [pdf, other]
Title: Hairpin Completion Distance Lower Bound
Comments: To be published in CPM 2024
Subjects: Data Structures and Algorithms (cs.DS)
Hairpin completion, derived from the hairpin formation observed in DNA biochemistry, is an operation applied to strings, particularly useful in DNA computing. Conceptually, a right hairpin completion operation transforms a string $S$ into $S\cdot S'$ where $S'$ is the reverse complement of a prefix of $S$. Similarly, a left hairpin completion operation transforms a string $S$ into $S'\cdot S$ where $S'$ is the reverse complement of a suffix of $S$. The hairpin completion distance from $S$ to $T$ is the minimum number of hairpin completion operations needed to transform $S$ into $T$. Recently Boneh et al. showed an $O(n^2)$ time algorithm for finding the hairpin completion distance between two strings of length at most $n$. In this paper we show that for any $\varepsilon>0$ there is no $O(n^{2-\varepsilon})$-time algorithm for the hairpin completion distance problem unless the Strong Exponential Time Hypothesis (SETH) is false. Thus, under SETH, the time complexity of the hairpin completion distance problem is quadratic, up to sub-polynomial factors.
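To make the two operations concrete, here is a minimal Python sketch of the conceptual definition given in the abstract. It assumes the Watson-Crick complement over the alphabet {A, C, G, T}, and the function names and the prefix/suffix length parameter `k` are illustrative choices, not taken from the paper. The paper's formal definition may impose further conditions on when an operation is allowed; this sketch only builds the resulting string.

```python
# Watson-Crick complement over the DNA alphabet (an assumption; the
# abstract does not fix an alphabet).
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(s: str) -> str:
    """Return the reverse complement of s."""
    return "".join(COMPLEMENT[c] for c in reversed(s))


def right_hairpin_completion(s: str, k: int) -> str:
    """Append the reverse complement of the length-k prefix of s."""
    if not 0 <= k <= len(s):
        raise ValueError("k must be between 0 and len(s)")
    return s + reverse_complement(s[:k])


def left_hairpin_completion(s: str, k: int) -> str:
    """Prepend the reverse complement of the length-k suffix of s."""
    if not 0 <= k <= len(s):
        raise ValueError("k must be between 0 and len(s)")
    return reverse_complement(s[len(s) - k:]) + s
```

For example, `right_hairpin_completion("ACGT", 2)` appends the reverse complement of the prefix `AC`, which is `GT`.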
- [2] arXiv:2404.11879 [pdf, other]
Title: Public Event Scheduling with Busy Agents
Comments: To appear in IJCAI 2024
Subjects: Data Structures and Algorithms (cs.DS)
We study a public event scheduling problem, where multiple public events are scheduled to coordinate the availability of multiple agents. The availability of each agent is determined by solving a separate flexible interval job scheduling problem, where the jobs are required to be preemptively processed. The agents want to attend as many events as possible, and their agreements are considered to be the total length of time during which they can attend these events. The goal is to find a schedule for events as well as the job schedule for each agent such that the total agreement is maximized.
We first show that the problem is NP-hard, and then prove that a simple greedy algorithm achieves $\frac{1}{2}$-approximation when the whole timeline is polynomially bounded. Our method also implies a $(1-\frac{1}{e})$-approximate algorithm for this case. Subsequently, for the general timeline case, we present an algorithmic framework that extends a $\frac{1}{\alpha}$-approximate algorithm for the one-event instance to the general case that achieves $\frac{1}{\alpha+1}$-approximation. Finally, we give a polynomial time algorithm that solves the one-event instance, and this implies a $\frac{1}{2}$-approximate algorithm for the general case.
- [3] arXiv:2404.12080 [pdf, other]
Title: A Mathematical Formalisation of the γ-contraction Problem
Authors: Elia Onofri
Subjects: Data Structures and Algorithms (cs.DS); Combinatorics (math.CO)
Networks play a ubiquitous role in computer science and real-world applications, offering multiple kinds of information that can be retrieved with adequate methods. With the continuous growth in the amount of data available, networks are becoming larger day by day. Consequently, tasks that were easily achievable on smaller networks often become impractical on huge amounts of data, either due to the high computational cost or due to the impracticality of visualising the corresponding data. Using distinctive node features to group large amounts of connected data into a limited number of clusters, each represented by a single representative, proves to be a valuable approach. The resulting contracted graphs are more manageable in size and can reveal previously hidden characteristics of the original networks. Furthermore, in many real-world use cases, a definition of cluster is intrinsic to the data, possibly obtained by injecting expert knowledge represented by a categorical function. Clusters are then sets of connected vertices taking the same value in a finite set $C$. In the recent literature, Lombardi and Onofri proposed a novel, fast, and easily parallelisable approach under the name of $\gamma$-contraction to contract a graph given a categorical function. In this work, we formally define this approach by providing a rigorous mathematical definition of the problem, which, to the best of our knowledge, was missing in the existing literature. Specifically, we explore the variadic nature of the contraction operation and use it to introduce a weaker version of colour contraction, under the name of $\beta$-contraction, that the algorithmic solution exploits. We finally dive into the details of the algorithm and provide a full assessment of its convergence complexity, relying on two constructive proofs that deeply unveil its mode of operation.
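The outcome of contracting a graph by a categorical function can be illustrated with a short sequential sketch. This is not the $\gamma$-contraction or $\beta$-contraction algorithm of the paper; it is a plain union-find pass, under the assumption of an undirected graph on vertices `0..n-1`, that merges connected vertices sharing a colour and keeps one representative per cluster. The function name and its parameters are hypothetical.

```python
def contract_by_colour(n, edges, colour):
    """Contract connected same-colour vertex sets of an undirected graph.

    n      -- number of vertices, labelled 0..n-1
    edges  -- iterable of (u, v) pairs
    colour -- list mapping each vertex to its category
    Returns (rep, contracted_edges), where rep[v] is the representative
    (smallest vertex) of the cluster containing v.
    """
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    edges = list(edges)
    for u, v in edges:
        if colour[u] == colour[v]:
            ru, rv = find(u), find(v)
            if ru != rv:
                # Attach the larger root under the smaller one, so the
                # root of a cluster is always its smallest vertex.
                parent[max(ru, rv)] = min(ru, rv)

    rep = [find(v) for v in range(n)]
    # Keep only edges that join two different clusters, without duplicates.
    contracted = {
        tuple(sorted((rep[u], rep[v]))) for u, v in edges if rep[u] != rep[v]
    }
    return rep, sorted(contracted)
```

On a path `0-1-2-3-4` coloured `a, a, b, b, a`, the two `a` vertices at the start merge, the two `b` vertices merge, and vertex 4 stays separate because it is not adjacent to another `a` vertex.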
Cross-lists for Fri, 19 Apr 24
- [4] arXiv:2404.11862 (cross-list from cs.SI) [pdf, ps, other]
Title: A Fast Maximum Clique Algorithm Based on Network Decomposition for Large Sparse Networks
Comments: 12 pages, 2 figures, 1 table
Subjects: Social and Information Networks (cs.SI); Data Structures and Algorithms (cs.DS); Information Retrieval (cs.IR); Data Analysis, Statistics and Probability (physics.data-an)
Finding maximum cliques in large networks is a challenging combinatorial problem with many real-world applications. We present a fast algorithm that finds the exact solution to the maximum clique problem in large sparse networks based on efficient graph decomposition. Several effective techniques are used to greatly prune the graph, and a novel concept called Complete-Upper-Bound-Induced Subgraph (CUBIS) is proposed to ensure that the structures with the potential to form the maximum clique are retained in the process of graph decomposition. Our algorithm first pre-prunes peripheral nodes. Subsequently, one or two small-scale CUBISs are constructed, guided by the core number and the current maximum clique size. Bron-Kerbosch search is performed on each CUBIS to find the maximum clique. Experiments on 50 empirical networks with a scale of up to 20 million show that the CUBIS scales are largely independent of the original network scale. This enables an approximately linear runtime, making our algorithm amenable to large networks. Our work provides a new framework for effectively solving maximum clique problems on massive sparse graphs, which not only removes graph scale as the bottleneck but also sheds some light on solving other clique-related problems.
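For readers unfamiliar with the search step named in the abstract, the following is a minimal sketch of Bron-Kerbosch with pivoting, adapted to return one maximum clique. It does not implement CUBIS construction, core-number guidance, or the pre-pruning described above; the function name, the adjacency-dictionary input format, and the simple size bound are assumptions made for illustration.

```python
def max_clique(adj):
    """Return one maximum clique of an undirected graph as a sorted list.

    adj maps each vertex to the set of its neighbours.
    """
    best = []

    def expand(r, p, x):
        # r: current clique, p: candidates, x: already explored vertices
        nonlocal best
        if not p and not x:
            # r is a maximal clique; keep it if it is the largest so far.
            if len(r) > len(best):
                best = list(r)
            return
        if len(r) + len(p) <= len(best):
            return  # this branch cannot beat the current best clique
        # Pivot on the vertex with the most neighbours among the candidates.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            expand(r + [v], p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand([], set(adj), set())
    return sorted(best)
```

On a triangle `0-1-2` with a tail `2-3-4`, the function returns the triangle.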
- [5] arXiv:2404.12107 (cross-list from cs.DB) [pdf, other]
Title: Effective Individual Fairest Community Search over Heterogeneous Information Networks
Subjects: Databases (cs.DB); Data Structures and Algorithms (cs.DS); Social and Information Networks (cs.SI)
Community search over heterogeneous information networks (HINs) has been applied to a wide range of domains, such as activity organization and team formation. In these scenarios, the members of a group who receive the same treatment often have different levels of activity and workloads, which causes unfairness in the treatment of active and inactive members (called individual unfairness). However, existing works do not pay attention to individual fairness and do not sufficiently consider the rich semantics of HINs (e.g., high-order structure), which prevents complex queries. To fill this gap, we formally define the problem of individual fairest community search over HINs (denoted IFCS), which aims to find a set of vertices from the HIN that share the same type, have close relationships, and differ little in activity level. The problem is shown to be NP-hard. To solve it, we first develop an exploration-based filter that effectively reduces the search space of the community. Further, to avoid repeated computation and prune unfair communities in advance, we propose a message-based scheme and a lower bound-based scheme. Finally, we conduct extensive experiments on four real-world datasets to demonstrate the effectiveness and efficiency of our proposed algorithms, which run at least 3 times faster than the baseline solution.
Replacements for Fri, 19 Apr 24
- [6] arXiv:1704.04370 (replaced) [pdf, other]
Title: Fast Similarity Sketching
Comments: The original version was directly based on a conference paper of the same title from FOCS'17. This new version is substantially revised with some cleaner and stronger theorems, particularly concerning the high probability domain. Moreover, there is one more author, Jakob Houen. In addition, one of the old authors, Mathias, has changed surname from Knudsen to Langhede.
Subjects: Data Structures and Algorithms (cs.DS)
- [7] arXiv:2001.05053 (replaced) [src]
Title: Tight Static Lower Bounds for Non-Adaptive Data Structures
Comments: This paper has been superseded and merged with arXiv:2308.16042
Subjects: Data Structures and Algorithms (cs.DS)
- [8] arXiv:2303.06090 (replaced) [pdf, ps, other]
Title: Simple and efficient four-cycle counting on sparse graphs
Subjects: Data Structures and Algorithms (cs.DS)
- [9] arXiv:2303.14467 (replaced) [pdf, other]
Title: A Survey on the Densest Subgraph Problem and Its Variants
Comments: Accepted to ACM Computing Surveys
Subjects: Data Structures and Algorithms (cs.DS); Artificial Intelligence (cs.AI)
- [10] arXiv:2308.07476 (replaced) [pdf, ps, other]
Title: Dependent rounding with strong negative-correlation, and scheduling on unrelated machines to minimize completion time
Authors: David G. Harris
Journal-ref: SODA 2024
Subjects: Data Structures and Algorithms (cs.DS)
- [11] arXiv:2310.02670 (replaced) [pdf, other]
Title: Searching 2D-Strings for Matching Frames
Subjects: Data Structures and Algorithms (cs.DS)
- [12] arXiv:2311.13590 (replaced) [pdf, other]
Title: Triangle-free $2$-matchings
Authors: Katarzyna Paluch
Comments: A more polished version and a clearer explanation of $\Delta$-diminishing hinges and 1-vulnerable triangles
Subjects: Data Structures and Algorithms (cs.DS); Discrete Mathematics (cs.DM)
- [13] arXiv:2403.07760 (replaced) [pdf, other]
Title: Simplified Tight Bounds for Monotone Minimal Perfect Hashing
Authors: Dmitry Kosolobov
Comments: 13 pages, 4 figures
Subjects: Data Structures and Algorithms (cs.DS)
- [14] arXiv:2403.12213 (replaced) [pdf, ps, other]
Title: Private graphon estimation via sum-of-squares
Comments: 71 pages, accepted to STOC 2024
Subjects: Data Structures and Algorithms (cs.DS); Computational Complexity (cs.CC); Machine Learning (cs.LG); Machine Learning (stat.ML)
- [15] arXiv:2404.11607 (replaced) [pdf, other]
Title: Private federated discovery of out-of-vocabulary words for Gboard
Subjects: Data Structures and Algorithms (cs.DS)
- [16] arXiv:2207.08015 (replaced) [pdf, ps, other]
Title: Parallel Best Arm Identification in Heterogeneous Environments
Comments: 15 pages (published in SPAA 2024)
Subjects: Machine Learning (cs.LG); Data Structures and Algorithms (cs.DS)