Databases

Authors and titles for cs.DB in Apr 2024

[ total of 87 entries: 1-87 ]
[1]  arXiv:2404.00007 [pdf, other]
Title: A Comprehensive Tutorial on over 100 Years of Diagrammatic Representations of Logical Statements and Relational Queries
Comments: 6 pages, 2 figures, preprint of ICDE 2024 tutorial. arXiv admin note: substantial text overlap with arXiv:2308.10319
Subjects: Databases (cs.DB); Logic in Computer Science (cs.LO)
[2]  arXiv:2404.00065 [pdf, other]
Title: Towards a Theoretical Foundation of Process Science
Comments: 8 pages, 1 figure, submitted to 19th International Conference on Wirtschaftsinformatik 2024. arXiv admin note: text overlap with arXiv:2203.09602
Subjects: Databases (cs.DB); Computers and Society (cs.CY); Physics and Society (physics.soc-ph)
[3]  arXiv:2404.00137 [pdf, ps, other]
Title: Budget-aware Query Tuning: An AutoML Perspective
Authors: Wentao Wu, Chi Wang
Subjects: Databases (cs.DB); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[4]  arXiv:2404.00746 [pdf, other]
Title: Mining Weighted Sequential Patterns in Incremental Uncertain Databases
Comments: Accepted to the Information Sciences journal
Journal-ref: Information Sciences 582 (2022): 865-896
Subjects: Databases (cs.DB); Artificial Intelligence (cs.AI)
[5]  arXiv:2404.00766 [pdf, other]
Title: SoK: The Faults in our Graph Benchmarks
Subjects: Databases (cs.DB)
[6]  arXiv:2404.00966 [pdf, ps, other]
Title: GTS: GPU-based Tree Index for Fast Similarity Search
Comments: Accepted by SIGMOD 2024
Journal-ref: Proc. ACM Manag. Data, 2(3): 142:1-142:27
Subjects: Databases (cs.DB)
[7]  arXiv:2404.01347 [pdf, other]
Title: Mining Sequential Patterns in Uncertain Databases Using Hierarchical Index Structure
Comments: Accepted at PAKDD 2021. arXiv admin note: text overlap with arXiv:2404.00746
Subjects: Databases (cs.DB)
[8]  arXiv:2404.01585 [pdf, other]
Title: FLEXIS: FLEXible Frequent Subgraph Mining using Maximal Independent Sets
Subjects: Databases (cs.DB); Performance (cs.PF)
[9]  arXiv:2404.01710 [pdf, other]
Title: Practical Persistent Multi-Word Compare-and-Swap Algorithms for Many-Core CPUs
Comments: 8 pages, 14 figures
Subjects: Databases (cs.DB)
[10]  arXiv:2404.02276 [pdf, ps, other]
Title: Heterogeneous Data Access Model for Concurrency Control and Methods to Deal with High Data Contention
Subjects: Databases (cs.DB); Performance (cs.PF)
[11]  arXiv:2404.02933 [pdf, other]
Title: NL2KQL: From Natural Language to Kusto Query
Subjects: Databases (cs.DB); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[12]  arXiv:2404.03194 [pdf, other]
Title: Reservoir Sampling over Joins
Subjects: Databases (cs.DB)
[13]  arXiv:2404.03880 [pdf, other]
Title: Semantic SQL -- Combining and optimizing semantic predicates in SQL
Subjects: Databases (cs.DB)
[14]  arXiv:2404.03929 [pdf, other]
Title: SLSM : An Efficient Strategy for Lazy Schema Migration on Shared-Nothing Databases
Subjects: Databases (cs.DB)
[15]  arXiv:2404.04352 [pdf, other]
Title: Qr-Hint: Actionable Hints Towards Correcting Wrong SQL Queries
Comments: SIGMOD 2024
Subjects: Databases (cs.DB)
[16]  arXiv:2404.04703 [pdf, other]
Title: Aleph Filter: To Infinity in Constant Time
Subjects: Databases (cs.DB); Data Structures and Algorithms (cs.DS)
[17]  arXiv:2404.04713 [pdf, other]
Title: Faster Algorithms for Fair Max-Min Diversification in $\mathbb{R}^d$
Journal-ref: SIGMOD 2024
Subjects: Databases (cs.DB); Data Structures and Algorithms (cs.DS)
[18]  arXiv:2404.05696 [pdf, ps, other]
Title: BOLD v4: A Centralized Bioinformatics Platform for DNA-based Biodiversity Data
Subjects: Databases (cs.DB); Quantitative Methods (q-bio.QM)
[19]  arXiv:2404.05777 [pdf, other]
Title: IA2: Leveraging Instance-Aware Index Advisor with Reinforcement Learning for Diverse Workloads
Comments: EuroMLSys 24, April 22, 2024, Athens, Greece
Subjects: Databases (cs.DB); Artificial Intelligence (cs.AI)
[20]  arXiv:2404.05778 [pdf, ps, other]
Title: Database-Driven Mathematical Inquiry
Authors: Steven Clontz
Subjects: Databases (cs.DB); History and Overview (math.HO)
[21]  arXiv:2404.05949 [pdf, ps, other]
Title: Balanced Partitioning for Optimizing Big Graph Computation: Complexities and Approximation Algorithms
Subjects: Databases (cs.DB); Data Structures and Algorithms (cs.DS)
[22]  arXiv:2404.06035 [pdf, ps, other]
Title: PM4Py.LLM: a Comprehensive Module for Implementing PM on LLMs
Authors: Alessandro Berti
Subjects: Databases (cs.DB)
[23]  arXiv:2404.06043 [pdf, other]
Title: Automatic Configuration Tuning on Cloud Database: A Survey
Subjects: Databases (cs.DB)
[24]  arXiv:2404.06278 [pdf, other]
Title: Dimensionality Reduction in Sentence Transformer Vector Databases with Fast Fourier Transform
Comments: 13 pages, 5 figures
Subjects: Databases (cs.DB); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[25]  arXiv:2404.06563 [pdf, other]
Title: Demonstration of MaskSearch: Efficiently Querying Image Masks for Machine Learning Workflows
Subjects: Databases (cs.DB); Machine Learning (cs.LG); Multimedia (cs.MM)
[26]  arXiv:2404.07354 [pdf, other]
Title: FairEM360: A Suite for Responsible Entity Matching
Subjects: Databases (cs.DB); Computers and Society (cs.CY); Machine Learning (cs.LG)
[27]  arXiv:2404.07663 [pdf, other]
Title: Interactive Ontology Matching with Cost-Efficient Learning
Subjects: Databases (cs.DB); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[28]  arXiv:2404.08727 [pdf, other]
Title: Can LLMs substitute SQL? Comparing Resource Utilization of Querying LLMs versus Traditional Relational Databases
Comments: 13 pages, 2 figures, 5 tables
Subjects: Databases (cs.DB); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[29]  arXiv:2404.08901 [pdf, other]
Title: Bullion: A Column Store for Machine Learning
Subjects: Databases (cs.DB); Machine Learning (cs.LG)
[30]  arXiv:2404.09109 [pdf, other]
Title: Optimizing Disjunctive Queries with Tagged Execution
Subjects: Databases (cs.DB)
[31]  arXiv:2404.09637 [pdf, other]
Title: climber++: Pivot-Based Approximate Similarity Search over Big Data Series
Comments: 16 pages, 14 figures, 1 table
Journal-ref: ICDE 2024
Subjects: Databases (cs.DB)
[32]  arXiv:2404.10086 [pdf, ps, other]
Title: Empowering Enterprise Development by Building and Deploying Admin Dashboard using Refine Framework
Subjects: Databases (cs.DB); Software Engineering (cs.SE)
[33]  arXiv:2404.10413 [pdf, other]
Title: VDTuner: Automated Performance Tuning for Vector Data Management Systems
Comments: Accepted by ICDE 2024
Subjects: Databases (cs.DB); Machine Learning (cs.LG); Performance (cs.PF)
[34]  arXiv:2404.11105 [pdf, other]
Title: XMiner: Efficient Directed Subgraph Matching with Pattern Reduction
Subjects: Databases (cs.DB); Distributed, Parallel, and Cluster Computing (cs.DC)
[35]  arXiv:2404.11450 [pdf, other]
Title: Real-Time Trajectory Synthesis with Local Differential Privacy
Comments: Accepted by ICDE 2024. Code is available at: this https URL
Subjects: Databases (cs.DB); Cryptography and Security (cs.CR)
[36]  arXiv:2404.12107 [pdf, other]
Title: Effective Individual Fairest Community Search over Heterogeneous Information Networks
Subjects: Databases (cs.DB); Data Structures and Algorithms (cs.DS); Social and Information Networks (cs.SI)
[37]  arXiv:2404.12128 [pdf, other]
Title: Optimizing Intensive Database Tasks Through Caching Proxy Mechanisms
Comments: 8 pages, submitted at conference
Subjects: Databases (cs.DB)
[38]  arXiv:2404.12505 [pdf, other]
Title: Open Research Issues and Tools for Visualization and Big Data Analytics
Comments: 28 pages, 4 figures
Journal-ref: International Journal of Computing and Digital Systems, 15, 1103-1117, 2024
Subjects: Databases (cs.DB)
[39]  arXiv:2404.12552 [pdf, other]
Title: Cocoon: Semantic Table Profiling Using Large Language Models
Subjects: Databases (cs.DB)
[40]  arXiv:2404.12608 [pdf, other]
Title: Auto-Formula: Recommend Formulas in Spreadsheets using Contrastive Learning for Table Representations
Comments: full version of a paper to appear in SIGMOD 2024
Subjects: Databases (cs.DB); Computation and Language (cs.CL); Programming Languages (cs.PL)
[41]  arXiv:2404.12872 [pdf, other]
Title: LLM-R2: A Large Language Model Enhanced Rule-based Rewrite System for Boosting Query Efficiency
Comments: 12 pages
Subjects: Databases (cs.DB); Computation and Language (cs.CL)
[42]  arXiv:2404.12913 [pdf, other]
Title: Influential Billboard Slot Selection under Zonal Influence Constraint
Comments: 14 pages
Subjects: Databases (cs.DB)
[43]  arXiv:2404.13091 [pdf, other]
Title: A process mining-based error correction approach to improve data quality of an IoT-sourced event log
Comments: 10 pages
Subjects: Databases (cs.DB); Emerging Technologies (cs.ET)
[44]  arXiv:2404.13105 [pdf, other]
Title: On-Demand Earth System Data Cubes
Comments: Accepted at IGARSS24
Subjects: Databases (cs.DB); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[45]  arXiv:2404.13359 [pdf, other]
Title: Declarative Concurrent Data Structures
Subjects: Databases (cs.DB); Data Structures and Algorithms (cs.DS); Programming Languages (cs.PL)
[46]  arXiv:2404.13489 [pdf, other]
Title: SCHENO: Measuring Schema vs. Noise in Graphs
Subjects: Databases (cs.DB)
[47]  arXiv:2404.13682 [pdf, other]
Title: Reproducible data science over data lakes: replayable data pipelines with Bauplan and Nessie
Comments: Pre-print of paper accepted at SIGMOD (DEEM2024)
Subjects: Databases (cs.DB); Machine Learning (cs.LG)
[48]  arXiv:2404.14831 [pdf, other]
Title: Towards Universal Dense Blocking for Entity Resolution
Comments: Code and data are available at this https URL
Subjects: Databases (cs.DB); Computation and Language (cs.CL); Information Retrieval (cs.IR)
[49]  arXiv:2404.14999 [pdf, other]
Title: A Unified Replay-based Continuous Learning Framework for Spatio-Temporal Prediction on Streaming Data
Comments: Accepted by ICDE 2024
Subjects: Databases (cs.DB); Machine Learning (cs.LG)
[50]  arXiv:2404.15670 [pdf, other]
Title: HTAP Databases: A Survey
Comments: IEEE Transactions on Knowledge and Data Engineering, 2024
Subjects: Databases (cs.DB)
[51]  arXiv:2404.16224 [pdf, ps, other]
Title: Tractable Conjunctive Queries over Static and Dynamic Relations
Subjects: Databases (cs.DB)
[52]  arXiv:2404.16322 [pdf, other]
Title: Bridging Speed and Accuracy to Approximate $K$-Nearest Neighbor Search
Comments: 13 pages
Subjects: Databases (cs.DB)
[53]  arXiv:2404.16486 [pdf, other]
Title: OpenIVM: a SQL-to-SQL Compiler for Incremental Computations
Subjects: Databases (cs.DB)
[54]  arXiv:2404.17136 [pdf, other]
Title: Automated Data Visualization from Natural Language via Large Language Models: An Exploratory Study
Subjects: Databases (cs.DB); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[55]  arXiv:2404.17679 [pdf, other]
Title: Recent Increments in Incremental View Maintenance
Authors: Dan Olteanu
Comments: 18 pages, 7 figures, Gems of PODS 2024
Subjects: Databases (cs.DB); Data Structures and Algorithms (cs.DS)
[56]  arXiv:2404.18428 [pdf, other]
Title: Geospatial Big Data: Survey and Challenges
Comments: IEEE JSTARS. 14 pages, 5 figures
Subjects: Databases (cs.DB)
[57]  arXiv:2404.18673 [pdf, other]
Title: Open-Source Drift Detection Tools in Action: Insights from Two Use Cases
Subjects: Databases (cs.DB); Machine Learning (cs.LG)
[58]  arXiv:2404.18681 [pdf, other]
Title: LLMClean: Context-Aware Tabular Data Cleaning via LLM-Generated OFDs
Subjects: Databases (cs.DB)
[59]  arXiv:2404.19052 [pdf, other]
Title: Exploring Weighted Property Approaches for RDF Graph Similarity Measure
Subjects: Databases (cs.DB); Information Retrieval (cs.IR)
[60]  arXiv:2404.19243 [pdf, other]
Title: Co-occurrence order-preserving pattern mining
Subjects: Databases (cs.DB)
[61]  arXiv:2404.19591 [pdf, other]
Title: Towards Interactively Improving ML Data Preparation Code via "Shadow Pipelines"
Subjects: Databases (cs.DB); Machine Learning (cs.LG); Software Engineering (cs.SE)
[62]  arXiv:2404.00776 (cross-list from cs.LG) [pdf, other]
Title: PyTorch Frame: A Modular Framework for Multi-Modal Tabular Learning
Comments: this https URL
Subjects: Machine Learning (cs.LG); Databases (cs.DB); Machine Learning (stat.ML)
[63]  arXiv:2404.01593 (cross-list from cs.DC) [pdf, other]
Title: Optimizing Distributed Protocols with Query Rewrites [Technical Report]
Comments: Technical report of paper accepted at SIGMOD 2024
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Databases (cs.DB)
[64]  arXiv:2404.02930 (cross-list from cs.CR) [pdf, other]
Title: What Blocks My Blockchain's Throughput? Developing a Generalizable Approach for Identifying Bottlenecks in Permissioned Blockchains
Subjects: Cryptography and Security (cs.CR); Databases (cs.DB)
[65]  arXiv:2404.03299 (cross-list from cs.LG) [pdf, other]
Title: SiloFuse: Cross-silo Synthetic Data Generation with Latent Tabular Diffusion Models
Comments: Accepted at 40th IEEE International Conference on Data Engineering (ICDE 2024)
Subjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR); Databases (cs.DB); Distributed, Parallel, and Cluster Computing (cs.DC)
[66]  arXiv:2404.04271 (cross-list from cs.IR) [pdf, other]
Title: Towards Effective Next POI Prediction: Spatial and Semantic Augmentation with Remote Sensing Data
Comments: 12 pages, 11 figures, Accepted by ICDE 2024
Subjects: Information Retrieval (cs.IR); Artificial Intelligence (cs.AI); Databases (cs.DB)
[67]  arXiv:2404.04621 (cross-list from cs.PL) [pdf, other]
Title: IsoPredict: Dynamic Predictive Analysis for Detecting Unserializable Behaviors in Weakly Isolated Data Store Applications
Journal-ref: Proc. ACM Program. Lang., Vol. 8, No. PLDI, Article 161. Publication date: June 2024
Subjects: Programming Languages (cs.PL); Databases (cs.DB)
[68]  arXiv:2404.05057 (cross-list from cs.LG) [pdf, other]
Title: TimeCSL: Unsupervised Contrastive Learning of General Shapelets for Explorable Time Series Analysis
Subjects: Machine Learning (cs.LG); Databases (cs.DB)
[69]  arXiv:2404.06819 (cross-list from cs.CR) [pdf, other]
Title: Enc2DB: A Hybrid and Adaptive Encrypted Query Processing Framework
Comments: 33 pages, 33 figures, DASFAA 2024
Subjects: Cryptography and Security (cs.CR); Databases (cs.DB)
[70]  arXiv:2404.08669 (cross-list from cs.IR) [pdf, ps, other]
Title: Combining PatternRank with Huffman Coding: A Novel Compression Algorithm
Subjects: Information Retrieval (cs.IR); Databases (cs.DB)
[71]  arXiv:2404.09674 (cross-list from cs.DS) [pdf, ps, other]
Title: A Circus of Circuits: Connections Between Decision Diagrams, Circuits, and Automata
Comments: 26 pages
Subjects: Data Structures and Algorithms (cs.DS); Databases (cs.DB); Formal Languages and Automata Theory (cs.FL)
[72]  arXiv:2404.10150 (cross-list from cs.CL) [pdf, other]
Title: TabSQLify: Enhancing Reasoning Capabilities of LLMs Through Table Decomposition
Comments: Accepted to NAACL 2024 (long, main)
Subjects: Computation and Language (cs.CL); Databases (cs.DB); Information Retrieval (cs.IR)
[73]  arXiv:2404.11581 (cross-list from cs.AI) [pdf, other]
Title: LLMTune: Accelerate Database Knob Tuning with Large Language Models
Subjects: Artificial Intelligence (cs.AI); Databases (cs.DB)
[74]  arXiv:2404.12560 (cross-list from cs.CL) [pdf, other]
Title: Dubo-SQL: Diverse Retrieval-Augmented Generation and Fine Tuning for Text-to-SQL
Comments: 10 pages, 3 figures, 3 tables
Subjects: Computation and Language (cs.CL); Databases (cs.DB)
[75]  arXiv:2404.13990 (cross-list from cs.LG) [pdf, other]
Title: QCore: Data-Efficient, On-Device Continual Calibration for Quantized Models -- Extended Version
Comments: 15 pages. An extended version of "QCore: Data-Efficient, On-Device Continual Calibration for Quantized Models" accepted at PVLDB 2024
Subjects: Machine Learning (cs.LG); Databases (cs.DB)
[76]  arXiv:2404.14061 (cross-list from cs.LG) [pdf, other]
Title: FedTAD: Topology-aware Data-free Knowledge Distillation for Subgraph Federated Learning
Comments: Accepted by IJCAI 2024
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Databases (cs.DB); Social and Information Networks (cs.SI)
[77]  arXiv:2404.14453 (cross-list from cs.CL) [pdf, other]
Title: EPI-SQL: Enhancing Text-to-SQL Translation with Error-Prevention Instructions
Authors: Xiping Liu, Zhao Tan
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Databases (cs.DB)
[78]  arXiv:2404.14692 (cross-list from cs.SI) [pdf, other]
Title: Deep Overlapping Community Search via Subspace Embedding
Subjects: Social and Information Networks (cs.SI); Databases (cs.DB); Physics and Society (physics.soc-ph)
[79]  arXiv:2404.14809 (cross-list from cs.CL) [pdf, other]
Title: A Survey of Large Language Models on Generative Graph Analytics: Query, Learning, and Applications
Comments: 31 pages including references, 22 figures
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Databases (cs.DB)
[80]  arXiv:2404.15840 (cross-list from cs.LO) [pdf, ps, other]
Title: Constructive Interpolation and Concept-Based Beth Definability for Description Logics via Sequents
Comments: Accepted to IJCAI 2024
Subjects: Logic in Computer Science (cs.LO); Artificial Intelligence (cs.AI); Databases (cs.DB); Logic (math.LO)
[81]  arXiv:2404.17757 (cross-list from cs.AI) [pdf, ps, other]
Title: Middle Architecture Criteria
Comments: 14 pages
Subjects: Artificial Intelligence (cs.AI); Databases (cs.DB); Logic in Computer Science (cs.LO)
[82]  arXiv:2404.17758 (cross-list from cs.AI) [pdf, ps, other]
Title: The Common Core Ontologies
Comments: 13 pages
Subjects: Artificial Intelligence (cs.AI); Databases (cs.DB); Logic in Computer Science (cs.LO)
[83]  arXiv:2404.18209 (cross-list from cs.LG) [pdf, other]
Title: 4DBInfer: A 4D Benchmarking Toolbox for Graph-Centric Predictive Modeling on Relational DBs
Comments: Under review
Subjects: Machine Learning (cs.LG); Databases (cs.DB)
[84]  arXiv:2404.18388 (cross-list from cs.CR) [pdf, other]
Title: SPECIAL: Synopsis Assisted Secure Collaborative Analytics
Subjects: Cryptography and Security (cs.CR); Databases (cs.DB)
[85]  arXiv:2404.19234 (cross-list from cs.AI) [pdf, other]
Title: Multi-hop Question Answering over Knowledge Graphs using Large Language Models
Authors: Abir Chakraborty
Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Databases (cs.DB)
[86]  arXiv:2404.19519 (cross-list from cs.LG) [pdf, ps, other]
Title: Generating Robust Counterfactual Witnesses for Graph Neural Networks
Comments: This paper has been accepted by ICDE 2024
Subjects: Machine Learning (cs.LG); Databases (cs.DB)
[87]  arXiv:2404.01514 (cross-list from q-bio.QM) [pdf, ps, other]
Title: A drug classification pipeline for Medicaid claims using RxNorm
Subjects: Quantitative Methods (q-bio.QM); Databases (cs.DB)