Toward Efficient Navigation of Massive-Scale Geo-Textual Streams

Chengcheng Yang, Lisi Chen, Shuo Shang, Fan Zhu, Li Liu, Ling Shao

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence
Main track. Pages 4838-4845. https://doi.org/10.24963/ijcai.2019/672

With the popularization of portable devices, numerous applications continuously produce huge streams of geo-tagged textual data. Indexing such geo-textual streams efficiently is challenging, yet it is an important task in both data management and AI applications, e.g., real-time data stream mining and targeted advertising. State-of-the-art indexing methods are ill-suited to this task because they focus on search optimizations for static datasets and incur high index maintenance costs. In this paper, we present NQ-tree, which combines new structure designs and self-tuning methods to navigate between update and search efficiency. Our contributions include: (1) designing multiple stores, each with a different emphasis on write-friendliness and read-friendliness; (2) utilizing data compression techniques to reduce the I/O cost; (3) exploiting both spatial and keyword information to improve the pruning efficiency; (4) proposing an analytical cost model and using an online self-tuning method to achieve efficient access under different workloads. Experiments on two real-world datasets show that NQ-tree outperforms two well-designed baselines by up to 10×.
Keywords:
Multidisciplinary Topics and Applications: Databases
Multidisciplinary Topics and Applications: Information Retrieval