Pipeline — Know Before It Breaks

v2.4.1, stable · Open Source

All systems operational

Pipeline watches your data warehouse like a hawk — catching schema drift, flagging stale models, and killing zombie DAGs before they poison downstream dashboards. Built for dbt, Airflow, Spark, and every warehouse in between.

Deploy in Your Stack · View on GitHub

4,218 engineers trust Pipeline


pipeline — health check

$ pipeline check --all --warehouse snowflake

✓ 247 models checked

✓ 0 schema drifts detected

✓ 0 zombie DAGs found

✓ Freshness SLAs: all green

→ Pipeline healthy. Go to sleep.
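If you want to script around the check, the summary above can be turned into numbers. This is a minimal sketch, assuming the CLI prints lines in exactly the format shown in this demo; the real output may differ, and `parse_check_output` is a hypothetical helper, not part of Pipeline.

```python
import re

# Sample text copied from the demo output above; the real CLI output may differ.
SAMPLE_OUTPUT = """\
✓ 247 models checked
✓ 0 schema drifts detected
✓ 0 zombie DAGs found
✓ Freshness SLAs: all green
→ Pipeline healthy. Go to sleep.
"""

# Matches lines such as "✓ 247 models checked": a check mark, a count, a label.
COUNT_LINE = re.compile(r"^✓\s+(\d+)\s+(.+)$")


def parse_check_output(text: str) -> dict[str, int]:
    """Return {label: count} for every counted line in the summary (hypothetical helper)."""
    counts = {}
    for line in text.splitlines():
        match = COUNT_LINE.match(line.strip())
        if match:
            counts[match.group(2)] = int(match.group(1))
    return counts


if __name__ == "__main__":
    print(parse_check_output(SAMPLE_OUTPUT))
```

Lines without a leading count, such as the freshness line, are skipped rather than guessed at.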


Interactive Tool

Pipeline Health Estimator

Input your stack size and incident history. Get an honest estimate of what Pipeline saves you per sprint.

Based on data from

340+ production deployments

Your Stack

dbt Models: 120 (total in your project; range 10 to 2,000)

Data Sources: 8 (warehouses, APIs, streams; range 1 to 50)

Incidents / Month: 4 (broken dashboards, SLA misses; range 0 to 40)

120 models monitored · 8 source freshness checks · 29 alerts/week est.

Estimated Impact / Month

42h

engineering hours recovered

78%

incident reduction

29

proactive alerts/wk

$6,090

est. cost avoided @ $145/hr
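The dollar figure is the recovered hours multiplied by the hourly rate. Here is a minimal sketch of that one step. How the estimator arrives at the 42 hours is not stated on this page, so that part is not modeled, and `cost_avoided` is a hypothetical name.

```python
def cost_avoided(hours_recovered: float, hourly_rate: float = 145.0) -> float:
    """Monthly cost avoided: recovered engineering hours times the hourly rate."""
    return hours_recovered * hourly_rate


if __name__ == "__main__":
    # 42 hours at $145/hr, the figures shown in the estimator above.
    print(f"${cost_avoided(42):,.0f}")  # prints $6,090
```

Swap in your own loaded hourly rate to see how sensitive the estimate is to it.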

Seeing real numbers?

Get a full audit of your actual pipeline with personalized recommendations.

Integrations

Works with your stack.
All of it.

One config file. Pipeline connects to every layer of your data stack and starts catching issues in under 10 minutes.

Snowflake

Data Warehouse

stable

Real-time schema drift detection across all databases, schemas, and tables. Catches column drops, type changes, and constraint mutations before your dbt models break.

✓ Schema diff ✓ Row count anomalies ✓ Freshness SLAs ✓ Query cost spikes
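To make "schema diff" concrete, here is a rough sketch of the general technique: compare two snapshots of a table's columns and report drops, additions, and type changes. It is not Pipeline's implementation. The snapshot shape (`{column_name: data_type}`), `SchemaDiff`, and `diff_schemas` are all assumptions for illustration; in Snowflake such a snapshot could plausibly be read from `INFORMATION_SCHEMA.COLUMNS`. Constraint changes are not covered.

```python
from dataclasses import dataclass, field


@dataclass
class SchemaDiff:
    dropped: list[str] = field(default_factory=list)   # columns that disappeared
    added: list[str] = field(default_factory=list)     # columns that are new
    # column name -> (old type, new type)
    retyped: dict[str, tuple[str, str]] = field(default_factory=dict)

    @property
    def has_drift(self) -> bool:
        return bool(self.dropped or self.added or self.retyped)


def diff_schemas(before: dict[str, str], after: dict[str, str]) -> SchemaDiff:
    """Compare two {column_name: data_type} snapshots of the same table."""
    diff = SchemaDiff()
    diff.dropped = sorted(set(before) - set(after))
    diff.added = sorted(set(after) - set(before))
    for name in sorted(set(before) & set(after)):
        if before[name] != after[name]:
            diff.retyped[name] = (before[name], after[name])
    return diff


if __name__ == "__main__":
    # Hypothetical snapshots of an orders table, one day apart.
    yesterday = {"order_id": "NUMBER", "amount": "NUMBER", "status": "VARCHAR"}
    today = {"order_id": "NUMBER", "amount": "VARCHAR"}
    print(diff_schemas(yesterday, today))
```

A dropped column and a changed type are the two cases most likely to break a downstream model, which is why the sketch reports them separately from additions.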

See the Docs

BigQuery

Data Warehouse

stable

Partition freshness monitoring, slot usage anomalies, and streaming insert validation.

✓ Partition staleness ✓ Slot anomalies ✓ Streaming inserts ✓ Cost attribution
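A partition staleness check boils down to comparing each partition's last-modified time against a maximum allowed age. The sketch below shows that comparison only. How the timestamps are collected is left out, and `stale_partitions` and its input shape are hypothetical, not Pipeline's API.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def stale_partitions(
    last_modified: dict[str, datetime],
    max_age: timedelta,
    now: Optional[datetime] = None,
) -> list[str]:
    """Return ids of partitions last modified more than max_age ago."""
    now = now or datetime.now(timezone.utc)
    return sorted(pid for pid, ts in last_modified.items() if now - ts > max_age)


if __name__ == "__main__":
    # Hypothetical partition ids and timestamps, with a fixed "now" for repeatability.
    now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
    partitions = {
        "20240101": datetime(2024, 1, 1, 0, 0, tzinfo=timezone.utc),
        "20240102": datetime(2024, 1, 2, 11, 0, tzinfo=timezone.utc),
    }
    print(stale_partitions(partitions, timedelta(hours=24), now))
```

Passing `now` explicitly keeps the check deterministic in tests; in production it defaults to the current UTC time.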

See the Docs

dbt Core

Transformation

stable

Parses your manifest.json live. Catches test failures, model staleness, and dependency graph breaks before CI does.

✓ Test failures ✓ Model freshness ✓ DAG integrity ✓ Selector drift
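As an illustration of a DAG integrity check, the sketch below walks a dbt `manifest.json` and lists any upstream ids that a node depends on but that are missing from the manifest. It relies on the manifest's `nodes`, `sources`, and per-node `depends_on.nodes` keys. It deliberately ignores other resource types (an assumption made for brevity), does no live parsing, and `missing_dependencies` is a hypothetical helper rather than Pipeline's own code.

```python
import json


def load_manifest(path: str = "target/manifest.json") -> dict:
    """Load a dbt manifest; target/manifest.json is dbt's default output path."""
    with open(path) as fh:
        return json.load(fh)


def missing_dependencies(manifest: dict) -> dict[str, list[str]]:
    """Map each node id to the upstream ids it references that are not in the manifest."""
    nodes = manifest.get("nodes", {})
    # Assumption: only nodes and sources count as valid upstream ids here.
    known = set(nodes) | set(manifest.get("sources", {}))
    broken = {}
    for node_id, node in nodes.items():
        upstream = node.get("depends_on", {}).get("nodes", [])
        missing = [dep for dep in upstream if dep not in known]
        if missing:
            broken[node_id] = missing
    return broken


if __name__ == "__main__":
    # Tiny hand-written manifest fragment with hypothetical ids.
    sample = {
        "sources": {"source.shop.raw.orders": {}},
        "nodes": {
            "model.shop.orders": {"depends_on": {"nodes": ["source.shop.raw.orders"]}},
            "model.shop.customers": {"depends_on": {"nodes": ["model.shop.missing"]}},
        },
    }
    print(missing_dependencies(sample))
```

An empty result means every referenced upstream id resolves; anything else points at the node whose lineage is broken.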

See the Docs

Apache Airflow

Orchestration

stable

Zombie DAG detection, task SLA monitoring, and backfill anomaly alerts. Kills stuck tasks before they block the whole graph.

✓ Zombie DAGs ✓ Task SLAs ✓ Backfill anomalies ✓ Worker saturation
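One common way to spot stuck work is to look for tasks that still claim to be running but have not sent a heartbeat recently. The sketch below shows that idea on plain Python objects. It does not call Airflow, and `TaskRun`, `find_zombies`, and the timeout value are illustrative assumptions, not Pipeline's detector.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class TaskRun:
    dag_id: str
    task_id: str
    state: str                # e.g. "running", "success", "failed"
    last_heartbeat: datetime  # last time the worker reported in


def find_zombies(
    runs: list[TaskRun],
    timeout: timedelta,
    now: Optional[datetime] = None,
) -> list[TaskRun]:
    """Return runs marked as running whose last heartbeat is older than timeout."""
    now = now or datetime.now(timezone.utc)
    return [r for r in runs if r.state == "running" and now - r.last_heartbeat > timeout]


if __name__ == "__main__":
    now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
    runs = [
        TaskRun("orders", "load", "running", now - timedelta(minutes=30)),
        TaskRun("orders", "transform", "running", now - timedelta(minutes=1)),
        TaskRun("users", "load", "success", now - timedelta(hours=5)),
    ]
    for run in find_zombies(runs, timedelta(minutes=10), now):
        print(f"{run.dag_id}.{run.task_id} looks stuck")
```

Finished tasks are ignored no matter how old their heartbeat is, so only work that still holds a slot gets flagged.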

See the Docs

Apache Spark

Processing

stable

Stage-level lineage tracking and executor failure pattern detection across batch and streaming jobs.

✓ Stage failures ✓ Executor churn ✓ Schema evolution ✓ Checkpoint lag

See the Docs

Databricks

Lakehouse

beta

Unity Catalog schema monitoring, Delta table freshness, and job cluster anomaly detection.

✓ Unity Catalog diff ✓ Delta freshness ✓ Job anomalies ✓ Photon regressions

See the Docs

Also integrates with: Redshift, DuckDB, Trino, Fivetran, dbt Cloud, Prefect, Dagster

Request Integration

Social Proof

Engineers who stopped guessing.

From 50-model startups to 2,000-model platforms — Pipeline runs in production across every scale.

47s detection latency

Snowflake · dbt · Airflow

"We had a column drop in our Snowflake orders table at 11 PM on a Thursday. Pipeline caught it in 47 seconds and paged our on-call before a single dashboard went red. That's the first time in three years our data lead slept through a deploy."

Marcus Webb

Staff Analytics Engineer · Clearpath Logistics

~600 engineers

−94% schema incidents

dbt Core · BigQuery

"Our dbt project has 847 models. Before Pipeline, every sprint had at least two incidents that traced back to schema drift we didn't catch in CI. Now we catch them in staging."

Priya Nambiar

Data Platform Lead · Kestrel Health

Series C

8h saved in week one

Airflow · Spark · Databricks

"Pipeline replaced three separate monitoring tools we were duct-taping together. The YAML config took 12 minutes. The zombie DAG detector alone saved us 8 hours in the first week."

Tom Okafor

Senior Data Engineer · Forge Financial

FinTech, NYSE-listed

99.3% SLA compliance

Snowflake · dbt Cloud

"I had to present pipeline reliability numbers to our VP of Product. Pipeline gave me a freshness SLA dashboard I could point to in 10 minutes. That conversation used to be a nightmare."

Sofia Lindqvist

Head of Analytics · Nordicwave Commerce

€2B GMV

Open Source Health

Active

4.2k

GitHub Stars

138

Contributors

1,847

Issues Closed

Releases

Last commit: 6 hours ago · View on GitHub

Architecture

One agent. Every layer.
Zero blind spots.

Pipeline sits between your data sources and your consumers — a single health engine that speaks to every tool in your stack.

[Architecture diagram: Data Sources → Pipeline Engine → Outputs]

Deploy in Your Stack

Get a hosted sandbox connected to your actual warehouse. Pipeline running in your environment in under 10 minutes.

Quick Install

Or run it yourself in 2 minutes:

$ pip install pipeline-oss

$ pipeline init --warehouse snowflake

$ pipeline check --all

✓ Connected. 247 models under watch.

Full Documentation

Config reference, alert rules, CI/CD recipes

Read the Docs

Join 1,400+ engineers

Pipeline community on Slack

Join Slack

No data leaves your VPC

SOC 2 Type II in progress

100% open source (Apache 2.0)

Sub-60s detection latency
