Comparing Implementation Variants Of Distributed Spatial Join on Spark
Abstract
As an increasing number of sensor devices (Internet of Things) is used, more and more spatio-temporal data becomes available. Being able to process and analyze large quantities of such datasets is therefore critical. Spatial joins in classical geo-information systems do not scale well. Nevertheless, distributed implementations are promising to solve this. Various implementation variants for distributed spatial joins are documented in literature, with some being only suitable for specific use cases. We compared broadcast and multiple variants of a distributed spatially partitioned join. We anticipate that this comparison will give guidance to when to use which implementation strategy.
Type
Publication
Poster IEEE Big Data proceedings: Comparing Implementation Variants Of Distributed Spatial Join on Spark