Data Structures and Algorithms
New submissions
New submissions for Wed, 14 Apr 21
 [1] arXiv:2104.05771 [pdf, ps, other]

Title: Online Weighted Bipartite Matching with a Sample
Comments: 20 pages, 1 figure
Subjects: Data Structures and Algorithms (cs.DS)
We study the classical online bipartite matching problem: one side of the graph is known and vertices of the other side arrive online. It is well known that when the graph is edge-weighted, and vertices arrive in an adversarial order, no online algorithm has a nontrivial competitive ratio. To bypass this hurdle, we modify the rules such that the adversary still picks the graph but has to reveal a random part (say half) of it to the player. The remaining part is given to the player in an adversarial order. This models practical scenarios in which the online algorithm has some history to learn from.
This way of modeling a history was formalized recently by the authors (SODA 20) and was called the AOS model (for Adversarial Online with a Sample). It allows developing online algorithms for the secretary problem that compete even when the secretaries arrive in an adversarial order. Here we use the same model to attack the much more challenging matching problem.
We analyze a natural algorithmic framework that decides how to match an arriving vertex $v$ by applying an offline matching algorithm to $v$ and the sample. We get roughly $1/4$ of the maximum weight by applying the offline greedy matching algorithm to the sample and $v$. Our analysis ties the performance of this algorithm to the performance of the offline greedy matching on the online part, and we also prove that it is tight. Surprisingly, when replacing greedy with an optimal algorithm for maximum matching, no constant competitive ratio can be guaranteed when the size of the sample is comparable to the size of the online part. However, when the sample is quadratic in the size of the online part, we do get a competitive ratio of $1/e$.
 [2] arXiv:2104.06133 [pdf, other]

Title: A New Coreset Framework for Clustering
Subjects: Data Structures and Algorithms (cs.DS)
Given a metric space, the $(k,z)$-clustering problem consists of finding $k$ centers such that the sum of the distances raised to the power $z$ of every point to its closest center is minimized. This encapsulates the famous $k$-median ($z=1$) and $k$-means ($z=2$) clustering problems. Designing small-space sketches of the data that approximately preserve the cost of the solutions, also known as \emph{coresets}, has been an important research direction over the last 15 years.
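As a small illustration of the objective defined above, the following Python sketch evaluates the $(k,z)$-clustering cost of a fixed set of centers. The function name `kz_cost`, the use of the Euclidean metric, and the optional per-point weights are assumptions made for this example and are not taken from the paper; weights are included only because a coreset is typically a weighted point set evaluated with this same cost.

```python
import math

def kz_cost(points, centers, z, weights=None):
    """Sum over points of weight * (distance to the closest center) ** z.

    Hypothetical helper: Euclidean distance is assumed; z=1 gives the
    k-median cost and z=2 gives the k-means cost.
    """
    if weights is None:
        weights = [1.0] * len(points)
    total = 0.0
    for p, w in zip(points, weights):
        nearest = min(math.dist(p, c) for c in centers)
        total += w * nearest ** z
    return total

# Toy instance (made up for illustration): three points, two centers.
points = [(0.0, 0.0), (0.0, 2.0), (4.0, 0.0)]
centers = [(0.0, 1.0), (4.0, 0.0)]
print(kz_cost(points, centers, z=1))  # k-median cost of these centers
print(kz_cost(points, centers, z=2))  # k-means cost of these centers
```

Under this reading, a weighted subset is a good coreset if `kz_cost` on the subset stays close to `kz_cost` on the full data for every candidate set of centers.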
In this paper, we present a new, simple coreset framework that simultaneously improves upon the best known bounds for a large variety of settings, including Euclidean spaces, doubling metrics, minor-free metrics, and general metrics.
 [3] arXiv:2104.06210 [pdf, ps, other]

Title: A simple proof of the Moore-Hodgson Algorithm for minimizing the number of late jobs
Comments: 3 pages
Subjects: Data Structures and Algorithms (cs.DS); Performance (cs.PF); Combinatorics (math.CO)
The Moore-Hodgson Algorithm minimizes the number of late jobs on a single machine. That is, it finds an optimal schedule for the classical problem $1 \mid\mid \sum U_j$. Several proofs of the correctness of this algorithm have been published. We present a new short proof.
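For readers unfamiliar with the algorithm, here is a minimal Python sketch of its usual textbook formulation: process jobs in order of due date and, whenever the current job would finish late, discard the longest job scheduled so far. The function name `moore_hodgson` and the `(processing_time, due_date)` input format are assumptions for this example; the abstract itself does not spell out the procedure.

```python
import heapq

def moore_hodgson(jobs):
    """jobs: list of (processing_time, due_date) pairs.

    Returns (on_time, late): indices of the jobs scheduled on time,
    in earliest-due-date order, and indices of the jobs declared late.
    """
    order = sorted(range(len(jobs)), key=lambda j: jobs[j][1])
    heap = []      # max-heap on processing time, stored as (-p, index)
    elapsed = 0    # total processing time of the jobs currently kept
    late = []
    for j in order:
        p, d = jobs[j]
        heapq.heappush(heap, (-p, j))
        elapsed += p
        if elapsed > d:
            # The current job would be late: drop the longest kept job.
            neg_p, longest = heapq.heappop(heap)
            elapsed += neg_p
            late.append(longest)
    on_time = sorted((j for _, j in heap), key=lambda j: jobs[j][1])
    return on_time, late

# Toy instance (made up for illustration).
jobs = [(2, 3), (4, 5), (1, 6), (3, 7)]
print(moore_hodgson(jobs))
```

The late jobs can be appended after the on-time jobs in any order, since each of them counts as late regardless of where it is placed.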
Cross-lists for Wed, 14 Apr 21
 [4] arXiv:2011.11907 (cross-list from cs.DB) [pdf, other]

Title: Efficient Approximate Nearest Neighbor Search for Multiple Weighted $l_{p\leq2}$ Distance Functions
Subjects: Databases (cs.DB); Data Structures and Algorithms (cs.DS)
Nearest neighbor search is fundamental to a wide range of applications. Since the exact nearest neighbor search suffers from the "curse of dimensionality", approximate approaches, such as Locality-Sensitive Hashing (LSH), are widely used to trade a little query accuracy for a much higher query efficiency. In many scenarios, it is necessary to perform nearest neighbor search under multiple weighted distance functions in high-dimensional spaces. This paper considers the important problem of supporting efficient approximate nearest neighbor search for multiple weighted distance functions in high-dimensional spaces. To the best of our knowledge, prior work can only solve the problem for the $l_2$ distance. However, numerous studies have shown that the $l_p$ distance with $p\in(0,2)$ could be more effective than the $l_2$ distance in high-dimensional spaces. We propose a novel method, WLSH, to address the problem for the $l_p$ distance for $p\in(0,2]$. WLSH takes the LSH approach and can theoretically guarantee both the efficiency of processing queries and the accuracy of query results while minimizing the required total number of hash tables. We conduct extensive experiments on synthetic and real data sets, and the results show that WLSH achieves high performance in terms of query efficiency, query accuracy and space consumption.
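To make the problem setting concrete, the following Python sketch shows an exact, brute-force baseline for nearest neighbor search under a weighted $l_p$ distance. It is not WLSH and involves no hashing. The definition $d_w(x,y) = (\sum_i w_i |x_i - y_i|^p)^{1/p}$ is an assumption for this example (the abstract does not state how the weights enter), and the names `weighted_lp` and `nearest` are hypothetical.

```python
def weighted_lp(x, y, w, p):
    """Assumed weighted l_p distance: (sum_i w_i * |x_i - y_i|^p)^(1/p)."""
    return sum(wi * abs(xi - yi) ** p for xi, yi, wi in zip(x, y, w)) ** (1.0 / p)

def nearest(query, data, w, p):
    """Index of the point in data closest to query (linear scan)."""
    return min(range(len(data)), key=lambda i: weighted_lp(query, data[i], w, p))

# Toy data (made up): the answer depends on the weight vector.
data = [(3.0, 0.0), (1.0, 1.5)]
query = (1.0, 0.0)
print(nearest(query, data, w=(1.0, 1.0), p=1))  # unweighted l_1
print(nearest(query, data, w=(1.0, 4.0), p=1))  # second coordinate weighted up
```

Because different weight vectors can select different neighbors for the same query, an index built for one distance function does not directly answer queries for another, which is the difficulty the abstract describes.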
 [5] arXiv:2104.05983 (cross-list from cs.CR) [pdf, ps, other]

Title: Towards Better Understanding of User Authorization Query Problem via Multivariable Complexity Analysis
Comments: Accepted for publication in ACM Transactions on Privacy and Security (TOPS)
Subjects: Cryptography and Security (cs.CR); Databases (cs.DB); Data Structures and Algorithms (cs.DS)
User authorization queries in the context of role-based access control have attracted considerable interest in the last 15 years. Such queries are used to determine whether it is possible to allocate a set of roles to a user that enables the user to complete a task, in the sense that all the permissions required to complete the task are assigned to the roles in that set. Answering such a query, in general, must take into account a number of factors, including, but not limited to, the roles to which the user is assigned and constraints on the sets of roles that can be activated. Answering such a query is known to be NP-hard. The presence of multiple parameters and the need to find efficient and exact solutions to the problem suggest that a multivariate approach will enable us to better understand the complexity of the user authorization query problem (UAQ). In this paper, we establish a number of complexity results for UAQ. Specifically, we show the problem remains hard even when quite restrictive conditions are imposed on the structure of the problem. Our FPT results show that we have to use either a parameter with potentially quite large values or quite a restricted version of UAQ. Moreover, our second FPT algorithm is complex and requires sophisticated, state-of-the-art techniques. In short, our results show that it is unlikely that all variants of UAQ that arise in practice can be solved reasonably quickly in general.
Replacements for Wed, 14 Apr 21
 [6] arXiv:1805.06869 (replaced) [pdf, ps, other]

Title: Revisiting the tree edit distance and its backtracing: A tutorial
Authors: Benjamin Paaßen
Comments: Supplementary material for the ICML 2018 paper: Tree Edit Distance Learning via Adaptive Symbol Embeddings
Subjects: Data Structures and Algorithms (cs.DS)
 [7] arXiv:2007.08204 (replaced) [pdf, other]

Title: A Faster Exponential Time Algorithm for Bin Packing With a Constant Number of Bins via Additive Combinatorics
Comments: Presented at SODA 2021; 42 pages; 4 figures
Subjects: Data Structures and Algorithms (cs.DS)
 [8] arXiv:1911.04415 (replaced) [pdf, other]

Title: Revisiting the Approximate Carathéodory Problem via the Frank-Wolfe Algorithm
Comments: 21 pages and 2 figures
Subjects: Optimization and Control (math.OC); Data Structures and Algorithms (cs.DS); Machine Learning (cs.LG)
 [9] arXiv:2007.15306 (replaced) [pdf, other]

Title: Phase Transition of the k-Majority Dynamics in Biased Communication Models
Comments: Preliminary versions published in DISC 2020 (Brief Announcement) and ICDCN 2021
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Data Structures and Algorithms (cs.DS); Probability (math.PR)