Open Source Data Engineering Tools
A comprehensive collection of tools for modern data engineering
1. Data Ingestion & ETL/ELT
Apache NiFi
Airbyte
Singer
Meltano
Talend Open Studio
2. Workflow Orchestration
Apache Airflow
Dagster
Luigi
3. Data Processing & Transformation
Apache Spark
Apache Flink
dbt (data build tool)
Pandas
4. Data Storage & Management
Apache Hadoop (HDFS)
Apache Iceberg
Delta Lake
DuckDB
ClickHouse
5. Data Streaming
Apache Kafka
Redpanda
Apache Pulsar
Flink SQL
6. Data Warehousing & Query Engines
Presto/Trino
Apache Druid
Apache Pinot
7. Data Cataloging & Governance
Apache Atlas
Amundsen
DataHub
8. Data Quality & Monitoring
Great Expectations
Deequ
Monte Carlo (commercial platform, not open source)