C · Linux

MiniSpark

A simplified Spark-like execution engine for partitioned datasets and transformation pipelines.

Role

Systems implementation

Year

2025

Stack

C · Linux

Links

GitHub

Overview

MiniSpark executes MapReduce-style transformations on a single node using worker threads, a global task queue, locks, and condition variables. The project focuses on scheduling partition-level work as soon as its dependencies become available, so that the DAG of transformations is materialized in dependency order.
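The scheduling idea can be illustrated with a minimal sketch. This is not MiniSpark's actual code: the `Task` and `Scheduler` types, the helper names (`sched_init`, `add_dep`, `sched_run`), and the fixed-size arrays are all assumptions made for illustration. Each task tracks how many dependencies are still unfinished. Workers block on a condition variable until the ready queue has work, and a finishing task enqueues any dependent whose counter drops to zero.

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>

#define MAX_TASKS 16
#define MAX_WORKERS 8

/* Hypothetical task record: one unit of partition-level work. */
typedef struct {
    int pending_deps;           /* dependencies not yet finished */
    int dependents[MAX_TASKS];  /* tasks that wait on this one */
    int num_dependents;
} Task;

/* Hypothetical scheduler state, all guarded by `lock`. */
typedef struct {
    Task tasks[MAX_TASKS];
    int num_tasks;
    int queue[MAX_TASKS];       /* ready queue; each task enters exactly once */
    int head, tail;
    int completed;
    int order[MAX_TASKS];       /* completion order, kept for inspection */
    pthread_mutex_t lock;
    pthread_cond_t ready;
} Scheduler;

static void sched_init(Scheduler *s, int num_tasks) {
    assert(num_tasks <= MAX_TASKS);
    memset(s, 0, sizeof *s);
    s->num_tasks = num_tasks;
    pthread_mutex_init(&s->lock, NULL);
    pthread_cond_init(&s->ready, NULL);
}

/* Declare that task `to` cannot start until task `from` has finished. */
static void add_dep(Scheduler *s, int from, int to) {
    Task *f = &s->tasks[from];
    f->dependents[f->num_dependents++] = to;
    s->tasks[to].pending_deps++;
}

static void *worker(void *arg) {
    Scheduler *s = arg;
    for (;;) {
        pthread_mutex_lock(&s->lock);
        /* Sleep while nothing is ready but unfinished work remains. */
        while (s->head == s->tail && s->completed < s->num_tasks)
            pthread_cond_wait(&s->ready, &s->lock);
        if (s->head == s->tail) {           /* every task has completed */
            pthread_mutex_unlock(&s->lock);
            return NULL;
        }
        int id = s->queue[s->head++];
        pthread_mutex_unlock(&s->lock);

        /* The real per-partition computation would run here, unlocked. */

        pthread_mutex_lock(&s->lock);
        s->order[s->completed++] = id;
        Task *t = &s->tasks[id];
        for (int i = 0; i < t->num_dependents; i++) {
            int d = t->dependents[i];
            if (--s->tasks[d].pending_deps == 0)
                s->queue[s->tail++] = d;    /* dependent is now ready */
        }
        /* Wake waiters: new work may be ready, or everything is done. */
        pthread_cond_broadcast(&s->ready);
        pthread_mutex_unlock(&s->lock);
    }
}

/* Seed the queue with dependency-free tasks, then run until all finish.
   Assumes the dependency graph is acyclic. */
static void sched_run(Scheduler *s, int num_workers) {
    pthread_t threads[MAX_WORKERS];
    assert(num_workers >= 1 && num_workers <= MAX_WORKERS);
    for (int i = 0; i < s->num_tasks; i++)
        if (s->tasks[i].pending_deps == 0)
            s->queue[s->tail++] = i;
    for (int i = 0; i < num_workers; i++)
        pthread_create(&threads[i], NULL, worker, s);
    for (int i = 0; i < num_workers; i++)
        pthread_join(threads[i], NULL);
}
```

For a diamond-shaped graph (task 0 feeds tasks 1 and 2, which both feed task 3), calling `sched_run` with several workers always completes task 0 first and task 3 last, while tasks 1 and 2 may finish in either order. Keeping the task body outside the lock is the main design choice in this sketch: the mutex protects only queue and counter updates, so independent partitions can be processed in parallel.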

Demo in progress...