v2.4.1 (stable) · Open Source
All systems operational

Pipeline watches your data warehouse like a hawk — catching schema drift, flagging stale models, and killing zombie DAGs before they poison downstream dashboards. Built for dbt, Airflow, Spark, and every warehouse in between.

Deploy in Your Stack · View on GitHub
4,218 engineers trust Pipeline
Pipeline Uptime: 30-day rolling avg
Schema Changes: caught this week
Freshness SLA: avg check latency
Failing Tests: right now
pipeline — health check
$ pipeline check --all --warehouse snowflake
✓ 247 models checked
✓ 0 schema drifts detected
✓ 0 zombie DAGs found
✓ Freshness SLAs: all green
Pipeline healthy. Go to sleep.
Interactive Tool

Pipeline Health Estimator

Enter your stack size and incident history. Get an honest estimate of what Pipeline saves you per month.

Your Stack
Models: 120 (slider range 10 to 2,000; total in your project)
Sources: 8 (slider range 1 to 50; warehouses, APIs, streams)
Incidents: 4 (slider range 0 to 40; broken dashboards, SLA misses)
120 models monitored · 8 source freshness checks · 29 alerts/week est.
Estimated Impact / Month
42h engineering hours recovered
78% incident reduction
29 proactive alerts/wk
$6,090 est. cost avoided @ $145/hr
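
The cost figure in the estimate is consistent with hours recovered multiplied by the hourly rate (42 × 145 = 6,090). A minimal sketch of that last step, assuming a simple product; the function name is hypothetical, and the estimator's formulas for hours, incident reduction, and alert volume are not published on this page, so they are not reproduced here:

```python
def cost_avoided(hours_recovered: float, hourly_rate: float) -> float:
    """Monthly cost avoided, assuming it is simply hours recovered times the hourly rate."""
    return hours_recovered * hourly_rate


# Values taken from the example estimate: 42 hours at $145/hr.
estimate = cost_avoided(42, 145)
print(f"${estimate:,.0f}")  # $6,090
```

Swap in your own blended engineering rate to see how sensitive the dollar figure is to that one assumption.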
Seeing real numbers?

Get a full audit of your actual pipeline with personalized recommendations.

Integrations

Works with your stack.
All of it.

One config file. Pipeline connects to every layer of your data stack and starts catching issues in under 10 minutes.

Snowflake
Data Warehouse
stable

Real-time schema drift detection across all databases, schemas, and tables. Catches column drops, type changes, and constraint mutations before your dbt models break.

Schema diff · Row count anomalies · Freshness SLAs · Query cost spikes
See the Docs
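
To make the idea of a schema diff concrete, here is a minimal sketch that compares two column snapshots and reports dropped columns, added columns, and type changes. It is an illustration of the general technique, not Pipeline's implementation; `SchemaDiff`, `diff_schemas`, and the sample columns are hypothetical, and a real check would read the snapshots from the warehouse's information schema.

```python
from dataclasses import dataclass, field


@dataclass
class SchemaDiff:
    """Differences between two {column_name: data_type} snapshots."""
    dropped: list[str] = field(default_factory=list)
    added: list[str] = field(default_factory=list)
    retyped: dict[str, tuple[str, str]] = field(default_factory=dict)

    @property
    def has_drift(self) -> bool:
        return bool(self.dropped or self.added or self.retyped)


def diff_schemas(previous: dict[str, str], current: dict[str, str]) -> SchemaDiff:
    """Compare an earlier snapshot with the current one (type names compared case-insensitively)."""
    diff = SchemaDiff()
    for column, old_type in previous.items():
        if column not in current:
            diff.dropped.append(column)
        elif current[column].upper() != old_type.upper():
            diff.retyped[column] = (old_type, current[column])
    diff.added = [column for column in current if column not in previous]
    return diff


# Hypothetical snapshots of an orders table before and after a deploy.
before = {"order_id": "NUMBER", "amount": "NUMBER", "placed_at": "TIMESTAMP_NTZ"}
after = {"order_id": "NUMBER", "amount": "VARCHAR", "region": "VARCHAR"}

result = diff_schemas(before, after)
print(result.dropped)  # ['placed_at']
print(result.retyped)  # {'amount': ('NUMBER', 'VARCHAR')}
print(result.added)    # ['region']
```

Dropped and retyped columns are the changes most likely to break downstream models, so a check like this would typically alert on those and only log additions.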
BigQuery
Data Warehouse
stable

Partition freshness monitoring, slot usage anomalies, and streaming insert validation.

Partition staleness · Slot anomalies · Streaming inserts · Cost attribution
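
As a rough illustration of a partition freshness check, the sketch below flags a table when its most recently written partition is older than an allowed age. This is an assumption about how such a check could work, not Pipeline's actual logic; `breaches_freshness_sla` and the sample partition ids are hypothetical, and the timestamps would normally come from warehouse metadata.

```python
from datetime import datetime, timedelta, timezone


def breaches_freshness_sla(
    partition_last_modified: dict[str, datetime],
    now: datetime,
    sla: timedelta,
) -> bool:
    """Return True when the newest partition write is older than the SLA.

    A table with no partitions at all is treated as a breach.
    """
    if not partition_last_modified:
        return True
    newest_write = max(partition_last_modified.values())
    return now - newest_write > sla


# Hypothetical daily partitions; the latest write happened 30 hours ago.
now = datetime(2024, 1, 10, 12, 0, tzinfo=timezone.utc)
partitions = {
    "20240108": now - timedelta(hours=54),
    "20240109": now - timedelta(hours=30),
}

print(breaches_freshness_sla(partitions, now, sla=timedelta(hours=24)))  # True
print(breaches_freshness_sla(partitions, now, sla=timedelta(hours=36)))  # False
```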
dbt Core
Transformation
stable

Parses your manifest.json live. Catches test failures, model staleness, and dependency graph breaks before CI does.

Test failures · Model freshness · DAG integrity · Selector drift
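
One way to picture a dependency graph break: a node in dbt's `manifest.json` lists a parent under `depends_on.nodes` that no longer exists in the project. The sketch below is a simplified illustration of that check, not Pipeline's parser. `find_missing_parents` is a hypothetical helper, and it only considers the manifest's `nodes` and `sources` sections, so parents of other resource types would need to be added for a real project.

```python
import json
from pathlib import Path


def load_manifest(path: str = "target/manifest.json") -> dict:
    """Load a dbt manifest from disk (target/ is dbt's default artifact directory)."""
    return json.loads(Path(path).read_text())


def find_missing_parents(manifest: dict) -> dict[str, list[str]]:
    """Map each node id to the parents it references that are absent from the manifest."""
    known = set(manifest.get("nodes", {})) | set(manifest.get("sources", {}))
    broken: dict[str, list[str]] = {}
    for node_id, node in manifest.get("nodes", {}).items():
        parents = node.get("depends_on", {}).get("nodes", [])
        missing = [parent for parent in parents if parent not in known]
        if missing:
            broken[node_id] = missing
    return broken


# Hypothetical in-memory manifest: fct_orders depends on a model that was removed.
manifest = {
    "nodes": {
        "model.shop.stg_orders": {"depends_on": {"nodes": ["source.shop.raw.orders"]}},
        "model.shop.fct_orders": {
            "depends_on": {"nodes": ["model.shop.stg_orders", "model.shop.stg_customers"]}
        },
    },
    "sources": {"source.shop.raw.orders": {}},
}

print(find_missing_parents(manifest))
# {'model.shop.fct_orders': ['model.shop.stg_customers']}
```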
Apache Airflow
Orchestration
stable

Zombie DAG detection, task SLA monitoring, and backfill anomaly alerts. Kills stuck tasks before they block the whole graph.

Zombie DAGs · Task SLAs · Backfill anomalies · Worker saturation
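
A common way to spot zombie tasks is a heartbeat timeout: a task still marked as running whose worker has not reported in for too long. The sketch below shows that heuristic on plain dictionaries. It is an assumption about the general approach rather than Pipeline's detector, and the record fields (`task_id`, `state`, `last_heartbeat`) and the five-minute default are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def find_zombie_tasks(
    tasks: list[dict],
    now: datetime,
    heartbeat_timeout: timedelta = timedelta(minutes=5),
) -> list[str]:
    """Return ids of tasks marked running whose last heartbeat is older than the timeout."""
    return [
        task["task_id"]
        for task in tasks
        if task["state"] == "running"
        and now - task["last_heartbeat"] > heartbeat_timeout
    ]


# Hypothetical task records; only build_marts is running without a recent heartbeat.
now = datetime(2024, 1, 10, 12, 0, tzinfo=timezone.utc)
tasks = [
    {"task_id": "load_orders", "state": "running", "last_heartbeat": now - timedelta(seconds=30)},
    {"task_id": "build_marts", "state": "running", "last_heartbeat": now - timedelta(minutes=42)},
    {"task_id": "export_csv", "state": "success", "last_heartbeat": now - timedelta(hours=3)},
]

print(find_zombie_tasks(tasks, now))  # ['build_marts']
```

Finished tasks are ignored regardless of heartbeat age, which keeps long-completed runs from being flagged.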
Apache Spark
Processing
stable

Stage-level lineage tracking and executor failure pattern detection across batch and streaming jobs.

Stage failures · Executor churn · Schema evolution · Checkpoint lag
Databricks
Lakehouse
beta

Unity Catalog schema monitoring, Delta table freshness, and job cluster anomaly detection.

Unity Catalog diff · Delta freshness · Job anomalies · Photon regressions
Also integrates with: Redshift, DuckDB, Trino, Fivetran, dbt Cloud, Prefect, Dagster
Request Integration
Social Proof

Engineers who stopped guessing.

From 50-model startups to 2,000-model platforms — Pipeline runs in production across every scale.

47s detection latency
Snowflake · dbt · Airflow
"We had a column drop in our Snowflake orders table at 11 PM on a Thursday. Pipeline caught it in 47 seconds and paged our on-call before a single dashboard went red. That's the first time in three years our data lead slept through a deploy."
Marcus Webb
Staff Analytics Engineer · Clearpath Logistics
~600 engineers
−94% schema incidents
dbt Core · BigQuery
"Our dbt project has 847 models. Before Pipeline, every sprint had at least two incidents that traced back to schema drift we didn't catch in CI. Now we catch them in staging."
Priya Nambiar
Data Platform Lead · Kestrel Health
Series C
8h saved in week one
Airflow · Spark · Databricks
"Pipeline replaced three separate monitoring tools we were duct-taping together. The YAML config took 12 minutes. The zombie DAG detector alone saved us 8 hours in the first week."
Tom Okafor
Senior Data Engineer · Forge Financial
FinTech, NYSE-listed
99.3% SLA compliance
Snowflake · dbt Cloud
"I had to present pipeline reliability numbers to our VP of Product. Pipeline gave me a freshness SLA dashboard I could point to in 10 minutes. That conversation used to be a nightmare."
Sofia Lindqvist
Head of Analytics · Nordicwave Commerce
€2B GMV
Open Source Health
Active
4.2k GitHub Stars
138 Contributors
1,847 Issues Closed
47 Releases
Last commit: 6 hours ago · View on GitHub
Architecture

One agent. Every layer.
Zero blind spots.

Pipeline sits between your data sources and your consumers — a single health engine that speaks to every tool in your stack.

Live topology
Data Sources: Snowflake (warehouse), BigQuery (warehouse), Airflow (orchestration), dbt Core (transformation), Spark (processing)
Pipeline Engine: Pipeline (health engine)
Outputs: Alerts (Slack / PagerDuty), Dashboard (SLA reporting), REST API (CI/CD gates)
Primary Path

Deploy in Your Stack

Get a hosted sandbox connected to your actual warehouse. Pipeline running in your environment in under 10 minutes.

No credit card · No sales call · Tears down after 72h

Quick Install
Or run it yourself in 2 minutes:
$ pip install pipeline-oss
$ pipeline init --warehouse snowflake
$ pipeline check --all
✓ Connected. 247 models under watch.
Full Documentation
Config reference, alert rules, CI/CD recipes
Read the Docs
Join 1,400+ engineers
Pipeline community on Slack
Join Slack
No data leaves your VPC
SOC 2 Type II in progress
100% open source (Apache 2.0)
Sub-60s detection latency