Relevance queries for interval data

A wide range of applications manage large collections of interval data. For instance, temporal databases manage validity intervals of objects or versions thereof, while in probabilistic databases attribute values of records are associated with confidence or uncertainty intervals. The main search operation on interval data is the retrieval of data intervals that intersect (i.e., overlap with) a query interval (e.g., find records which were valid in September 2020, or find temperature readings with non-zero probability of lying within [24, 26] degrees). Because a query can return many results, we need mechanisms that filter or order them based on how relevant they are to the query interval. We define alternative relevance scores between a data interval and a query interval based on their (relative) overlap. We define relevance queries, which compute only a subset of the most relevant intervals that intersect a query. Then, we propose a framework for evaluating relevance queries that can be applied to popular domain-partitioning interval indices (interval tree and HINT). We present experiments on real datasets that demonstrate the efficiency of our framework over baseline approaches.
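To make the idea of a relevance query concrete, the following is a minimal Python sketch of an unindexed linear scan. It is not the framework proposed in the paper. The abstract does not spell out the score definitions, so the Jaccard-style `relative_overlap` used here (overlap length divided by the span of the union) is an assumption chosen for illustration, and the names `overlap`, `intersects`, `relative_overlap`, and `top_k_relevant` are hypothetical.

```python
import heapq
from typing import Callable, List, Tuple

# A closed interval [start, end] with start <= end.
Interval = Tuple[float, float]


def overlap(data: Interval, query: Interval) -> float:
    """Length of the intersection of two closed intervals (0 if none)."""
    return max(0.0, min(data[1], query[1]) - max(data[0], query[0]))


def intersects(data: Interval, query: Interval) -> bool:
    """True if the two closed intervals share at least one point."""
    return data[0] <= query[1] and query[0] <= data[1]


def relative_overlap(data: Interval, query: Interval) -> float:
    """ASSUMED score: overlap length over the span covered by both intervals."""
    union = max(data[1], query[1]) - min(data[0], query[0])
    # Two identical point intervals have zero span; treat them as a full match.
    return overlap(data, query) / union if union > 0 else 1.0


def top_k_relevant(
    intervals: List[Interval],
    query: Interval,
    k: int,
    score: Callable[[Interval, Interval], float] = relative_overlap,
) -> List[Tuple[float, Interval]]:
    """Linear scan: score every intersecting interval, keep the k best."""
    candidates = [
        (score(iv, query), iv) for iv in intervals if intersects(iv, query)
    ]
    return heapq.nlargest(k, candidates, key=lambda pair: pair[0])


# Usage: temperature readings with uncertainty intervals, query [24, 26].
readings = [(0, 10), (24, 26), (25, 30), (20, 40), (50, 60)]
best = top_k_relevant(readings, (24, 26), k=2)
# Ranks (24, 26) first, then (25, 30); (0, 10) and (50, 60) do not intersect.
```

A scan like this touches every interval in the collection. Avoiding that cost is what evaluating the query over an index such as an interval tree or HINT is meant to achieve.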

DOI

https://doi.org/10.25358/openscience-13381

URI

https://openscience.ub.uni-mainz.de/handle/20.500.12030/13402

Published in

Proceedings of the ACM on Management of Data, vol. 3, no. 3, ACM, New York, NY, 2025, https://doi.org/10.1145/3725343

Collections

DFG-491381577-H
