Relevance queries for interval data
Reuse License
Description of rights: CC-BY-4.0
Abstract
A wide range of applications manage large collections of interval data. For instance, temporal databases manage the validity intervals of objects or of their versions, while probabilistic databases associate the attribute values of records with confidence or uncertainty intervals. The main search operation on interval data is the retrieval of data intervals that intersect (i.e., overlap with) a query interval (e.g., find records that were valid in September 2020, or find temperature readings with a non-zero probability of being within [24, 26] degrees). Because a query may return many results, we need mechanisms that filter or order them by how relevant they are to the query interval. We define alternative relevance scores between a data interval and a query interval based on their (relative) overlap. We also define relevance queries, which compute only a subset of the most relevant intervals that intersect a query. We then propose a framework for evaluating relevance queries that can be applied to popular domain-partitioning interval indices (interval tree and HINT). Experiments on real datasets demonstrate the efficiency of our framework over baseline approaches.
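To make the idea of an overlap-based relevance query concrete, the following is a minimal sketch and not the paper's method. The abstract does not give the score formulas, so the Jaccard-like ratio used here (overlap length divided by the length of the span covered by both intervals) is an assumption, as are closed intervals and the hypothetical names `overlap`, `intersects`, `relative_overlap`, and `top_k_relevant`. The linear scan stands in for a simple baseline; it does not use an interval tree or HINT.

```python
import heapq
from typing import List, Tuple

Interval = Tuple[float, float]  # closed interval [start, end]; an assumption


def overlap(a: Interval, b: Interval) -> float:
    """Length of the intersection; 0.0 if the intervals only touch or are disjoint."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def intersects(a: Interval, b: Interval) -> bool:
    """True if the two closed intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


def relative_overlap(data: Interval, query: Interval) -> float:
    """Assumed score: overlap length over the length of the combined span."""
    span = max(data[1], query[1]) - min(data[0], query[0])
    # A zero-length span means both intervals are the same single point.
    return overlap(data, query) / span if span > 0 else 1.0


def top_k_relevant(
    data: List[Interval], query: Interval, k: int
) -> List[Tuple[float, Interval]]:
    """Linear-scan baseline: the k intersecting intervals with the highest score."""
    scored = (
        (relative_overlap(d, query), d) for d in data if intersects(d, query)
    )
    return heapq.nlargest(k, scored)


if __name__ == "__main__":
    intervals = [(1, 5), (4, 10), (6, 7), (20, 30)]
    for score, interval in top_k_relevant(intervals, (4, 8), k=2):
        print(interval, round(score, 3))
```

With the query interval (4, 8), this sketch ranks (4, 10) first and (6, 7) second, and it never scores (20, 30) because that interval does not intersect the query. Swapping in a different score (for example, overlap divided by the query length) only requires changing `relative_overlap`.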
Published in
Proceedings of the ACM on Management of Data, 3, 3, ACM, New York, NY, 2025, https://doi.org/10.1145/3725343
