Local-first data pipeline toolkit combining DLT, dbt, and DuckDB. Zero cloud dependencies, enterprise-grade processing.
Setting up a modern data pipeline shouldn't take 4+ hours of wrestling with cloud configs.
Configure AWS/GCP, manage credentials, deploy infrastructure, debug networking issues...
Forced to use specific vendors, pay egress fees, and deal with vendor-specific quirks
Development environments rack up cloud bills. Data egress charges surprise you at month-end.
No cloud setup. No credentials. Just code.
One command gets you a complete ETL sandbox with DLT, dbt, and DuckDB integrated.
Execute the full pipeline locally. Watch data flow through ingestion, transformation, and analytics.
Query your data with DuckDB. Sub-second performance on millions of rows, all running locally.
Capabilities that unlock new possibilities for data teams
No waiting for cloud resources. Test ideas immediately with hot-reload dev mode.
Your data never leaves your machine. Perfect for compliance, sovereignty, or air-gapped environments.
DLT for ingestion, dbt for modeling, DuckDB for analytics. Industry-standard tools, zero config.
Watch data flow through your pipeline. Catch errors instantly. Debug with confidence.
396K+ operations per second. Process millions of rows without cloud-scale hardware.
Clear error messages. Validation at every step. Professional CLI experience.
Watch a complete pipeline run from ingestion to analytics
Watch data flow from ingestion through transformation to analytics
Data ingestion & generation
Data transformation & modeling
Analytics & querying
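For the transformation stage, a dbt model in the generated project might look like this (a hypothetical sketch; the model and source names are illustrative, not SBDK's actual scaffold):

```sql
-- models/user_metrics.sql (illustrative)
SELECT
    user_id,
    count(*)               AS total_orders,
    sum(amount)            AS revenue,
    sum(amount) / count(*) AS avg_order_value,
    max(order_date)        AS last_order_date
FROM {{ ref('stg_orders') }}
GROUP BY user_id
```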
See how easy it is to set up and run a complete data pipeline
- `sbdk init <name>`: create a new project
- `sbdk run --visual`: run the pipeline with the visual UI
- `sbdk query <sql>`: execute a SQL query
396K+ operations per second, sub-10ms latency
`SELECT * FROM analytics.user_metrics ORDER BY revenue DESC LIMIT 10`

| user_id | username | total_orders | revenue | avg_order_value | last_order_date |
|---|---|---|---|---|---|
| 1,247 | alice_smith | 42 | 8,940.50 | 212.87 | 2024-12-15 |
| 3,891 | bob_jones | 38 | 7,215.30 | 189.88 | 2024-12-14 |
| 5,632 | carol_white | 35 | 6,842.75 | 195.51 | 2024-12-16 |
| 2,109 | david_brown | 31 | 6,124.20 | 197.55 | 2024-12-13 |
| 7,854 | emma_davis | 29 | 5,890.15 | 203.11 | 2024-12-15 |
| 4,321 | frank_miller | 27 | 5,445.80 | 201.70 | 2024-12-12 |
| 9,087 | grace_wilson | 26 | 5,234.60 | 201.33 | 2024-12-16 |
| 1,563 | henry_moore | 24 | 4,896.40 | 204.02 | 2024-12-14 |
| 6,745 | iris_taylor | 23 | 4,678.90 | 203.43 | 2024-12-11 |
| 8,234 | jack_anderson | 22 | 4,512.30 | 205.11 | 2024-12-15 |
Powered by DuckDB, an in-process analytical database with zero configuration
Teams who value speed, simplicity, and sovereignty
Tired of cloud complexity and want to iterate faster on local development.
Need enterprise-grade analytics without enterprise infrastructure.
Want to avoid cloud vendor lock-in and keep infrastructure costs low.
Require data sovereignty and local processing for compliance.
Real metrics from production usage
See how SBDK compares to traditional approaches
| Feature | SBDK | Cloud ETL | Custom Scripts |
|---|---|---|---|
| Setup Time | 30 seconds | 4+ hours | 2-3 days |
| Cloud Required | No | Yes | No |
| Monthly Cost | $0 | $500+ | $0 |
| Data Sovereignty | Yes | No | Yes |
| Visual Pipeline UI | Yes | Varies | No |
| Hot-Reload Dev | Yes | No | No |
| Production Ready | Yes | Yes | Varies |
| Learning Curve | Low | High | Medium |
**Is SBDK production-ready?** Yes! SBDK uses battle-tested tools (DLT, dbt, DuckDB) that power production data pipelines at thousands of companies. The CLI provides professional error handling, validation, and clear messages.
**Can I still pull from cloud data sources?** Absolutely. SBDK supports all DLT sources (APIs, databases, SaaS apps). You ingest from cloud sources but process and analyze locally, avoiding data egress costs.
**How much data can it handle?** DuckDB can process millions of rows in seconds on a laptop. For truly massive datasets (100GB+), you can still use SBDK for development and deploy to a larger local or on-prem machine.
**Is SBDK free and open source?** Yes. SBDK core is MIT-licensed and will always be free and open source. Future Team and Enterprise tiers will add collaboration features, but the core toolkit remains free forever.
**Can I deploy SBDK pipelines to production?** Your SBDK pipelines are just Python code using standard tools. You can deploy them anywhere: Docker containers, Kubernetes, cloud VMs, or serverless functions.
**How is SBDK different from using DLT, dbt, and DuckDB separately?** SBDK gives you the complete stack (ingestion + transformation + analytics) with one command. No juggling multiple tools, configs, or databases. Everything works together out of the box.
Join data engineers who've ditched cloud complexity for local-first simplicity
`uv pip install sbdk-dev && sbdk init`