In-Memory Interval Joins

dc.contributor.authorBouros, Panagiotis
dc.contributor.authorMamoulis, Nikos
dc.contributor.authorTsitsigkos, Dimitrios
dc.contributor.authorTerrovitis, Manolis
dc.date.accessioned2022-07-04T08:23:34Z
dc.date.available2022-07-04T08:23:34Z
dc.date.issued2021
dc.description.abstractThe interval join is a popular operation in temporal, spatial, and uncertain databases. The majority of interval join algorithms assume that input data reside on disk and so, their focus is to minimize the I/O accesses. Recently, an in-memory approach based on plane sweep (PS) for modern hardware was proposed which greatly outperforms previous work. However, this approach relies on a complex data structure and its parallelization has not been adequately studied. In this article, we investigate in-memory interval joins in two directions. First, we explore the applicability of a largely ignored forward scan (FS)-based plane sweep algorithm, for single-threaded join evaluation. We propose four optimizations for FS that greatly reduce its cost, making it competitive or even faster than the state-of-the-art. Second, we study in depth the parallel computation of interval joins. We design a non-partitioning-based approach that determines independent tasks of the join algorithm to run in parallel. Then, we address the drawbacks of the previously proposed hash-based partitioning and suggest a domain-based partitioning approach that does not produce duplicate results. Within our approach, we propose a novel breakdown of the partition-joins into mini-joins to be scheduled in the available CPU threads and propose an adaptive domain partitioning, aiming at load balancing. We also investigate how the partitioning phase can benefit from modern parallel hardware. Our thorough experimental analysis demonstrates the advantage of our novel partitioning-based approach for parallel computation.en_GB
dc.identifier.doihttp://doi.org/10.25358/openscience-7287
dc.identifier.urihttps://openscience.ub.uni-mainz.de/handle/20.500.12030/7301
dc.language.isoengde
dc.rightsCC-BY-4.0*
dc.rights.urihttps://creativecommons.org/licenses/by/4.0/*
dc.subject.ddc004 Informatikde_DE
dc.subject.ddc004 Data processingen_GB
dc.titleIn-Memory Interval Joinsen_GB
dc.typeZeitschriftenaufsatzde
jgu.journal.titleThe VLDB Journalde
jgu.journal.volume30de
jgu.organisation.departmentFB 08 Physik, Mathematik u. Informatikde
jgu.organisation.nameJohannes Gutenberg-Universität Mainz
jgu.organisation.number7940
jgu.organisation.placeMainz
jgu.organisation.rorhttps://ror.org/023b0x485
jgu.pages.end691de
jgu.pages.start667de
jgu.publisher.doi10.1007/s00778-020-00639-0de
jgu.publisher.issn0949-877Xde
jgu.publisher.nameSpringerde
jgu.publisher.placeBerlin u.a.
jgu.publisher.year2021
jgu.rights.accessrightsopenAccess
jgu.subject.ddccode004de
jgu.type.dinitypeArticleen_GB
jgu.type.resourceTextde
jgu.type.versionPublished versionde

Files

Original bundle

Now showing 1 - 1 of 1
Loading...
Thumbnail Image
Name:
inmemory_interval_joins-20220701133214926.pdf
Size:
2.37 MB
Format:
Adobe Portable Document Format
Description:

License bundle

Now showing 1 - 1 of 1
Loading...
Thumbnail Image
Name:
license.txt
Size:
3.57 KB
Format:
Item-specific license agreed upon to submission
Description: