Blog

Production-grade PySpark jobs

Use additional Python packages with PySpark

Dr. Georg Heiler

Blazing-fast data science on GPUs

Fast calculation of ego networks using RAPIDS-AI.

Dr. Georg Heiler

Deterministic scale-out for Spark jobs under increased load

Make Spark jobs scale reliably using iteration

Dr. Georg Heiler

Spark and Hive 3

Get Spark and Hive to play nice again on HDP 3.1

Dr. Georg Heiler

Parallel aggregation of dataframes

Use idempotency of RDDs to your advantage

Dr. Georg Heiler

Tricks for Scala with Gradle

Get compiler-heavy Scala projects to build using Gradle

Dr. Georg Heiler

Reproducible geospatial visualization in kepler.gl

Reproducible, effortless, and great-looking visualization of geospatial data.

Dr. Georg Heiler

Geospatial binning with hexagons on Spark

Bring hexagons to Spark as efficient spatial operations

Dr. Georg Heiler

Spark descriptive name for cached dataframes

Display user-friendly names for cached tables in the Spark web UI

Dr. Georg Heiler

Data links KW 28

Useful links

Dr. Georg Heiler