Relevance queries for interval data

Description of rights: CC-BY-4.0
Item type: Journal article
Access status: Open Access

Abstract

A wide range of applications manage large collections of interval data. For instance, temporal databases manage the validity intervals of objects or their versions, while in probabilistic databases the attribute values of records are associated with confidence or uncertainty intervals. The main search operation on interval data is the retrieval of data intervals that intersect (i.e., overlap with) a query interval (e.g., find records that were valid in September 2020, or find temperature readings with a non-zero probability of lying within [24, 26] degrees). Because a query may return many results, we need mechanisms that filter or order them according to how relevant they are to the query interval. We define alternative relevance scores between a data interval and a query interval based on their (relative) overlap. We also define relevance queries, which compute only the subset of the most relevant intervals that intersect a query. We then propose a framework for evaluating relevance queries that can be applied to popular domain-partitioning interval indices (the interval tree and HINT). Experiments on real datasets demonstrate the efficiency of our framework over baseline approaches.
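
To make the idea of an overlap-based relevance score concrete, here is a minimal Python sketch. It is not the paper's framework: the abstract does not state the exact score definitions, so the Jaccard-like score below (overlap length divided by the span of the union) is an assumption, and the names `overlap`, `intersects`, `relative_overlap`, and `top_k_relevant` are hypothetical. The query is answered by a brute-force scan, which corresponds to a naive baseline rather than an index-based evaluation on an interval tree or HINT.

```python
import heapq
from typing import List, Tuple

# Closed interval [start, end] with start <= end (assumed representation).
Interval = Tuple[float, float]


def overlap(a: Interval, b: Interval) -> float:
    """Length of the common part of a and b (0 if they share at most a point)."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def intersects(a: Interval, b: Interval) -> bool:
    """True if the closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


def relative_overlap(data: Interval, query: Interval) -> float:
    """Assumed relevance score: overlap length over the span of the union.

    Intended for intersecting intervals, where the union is one interval.
    """
    span = max(data[1], query[1]) - min(data[0], query[0])
    # Two identical point intervals have zero span; treat them as a full match.
    return overlap(data, query) / span if span > 0 else 1.0


def top_k_relevant(
    data: List[Interval], query: Interval, k: int
) -> List[Tuple[float, Interval]]:
    """Brute-force relevance query: the k highest-scoring intersecting intervals."""
    scored = [(relative_overlap(d, query), d) for d in data if intersects(d, query)]
    return heapq.nlargest(k, scored)


if __name__ == "__main__":
    records = [(0, 10), (4, 6), (5, 20), (30, 40)]
    for score, interval in top_k_relevant(records, (4, 8), k=2):
        print(interval, score)
```

Other normalizations, such as dividing the overlap by the length of the data interval or of the query interval, fit the same structure by swapping out `relative_overlap`.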

Published in

Proceedings of the ACM on Management of Data, 3, 3, ACM, New York, NY, 2025, https://doi.org/10.1145/3725343
