dltHub AI Workbench

dlt (data load tool) is an open-source Python library for loading data from APIs and databases into a warehouse or lakehouse. dltHub (paid platform) extends dlt with enterprise-grade features tailored to the needs of coding agents: transformations, data quality validation, managed runtime infrastructure, managed data apps, and an AI-powered workspace environment.

The dltHub AI Workbench is a collection of toolkits that give AI coding assistants step-by-step workflows to build data pipelines with dlt. You can use the workbench as-is or fork and customize it for your own stack. The dlthub ai CLI installs toolkit components into the right locations for your assistant and runs the workspace MCP server.

Build toolkits cover ingestion (REST API, SQL), transformation, and data quality; Run toolkits handle deployment and exploration. The REST API toolkit is backed by the dltHub context — over 9,700 source definitions the agent queries to find verified connectors before writing code. New users can start with the quick-start toolkit for a guided end-to-end run from data to dashboard.

The dltHub AI Workbench is tested with Claude Code, Cursor, and Codex and may work with other AI coding assistants. When getting started, we recommend working in accept edits (Claude) / --approval-mode (Codex) mode so that you can review each change and familiarize yourself with the dltHub AI workflows.

The dlthub AI workbench supports the iterative data engineering workflow

Building data pipelines is iterative and covers two major phases — ingestion and transformations — each following the same inner loop:

Build (local development)

Develop the pipeline iteratively — for ingestion: first REST API endpoint, then additional endpoints; for transformation: data model first, then the full transformation pipeline
Explore the loaded data and validate it after each step
Loop back to refine until the pipeline is solid

Run (production)

Deploy the ingestion or transformation pipeline to production
Serve insights via data apps built on top of the loaded data

The outer loop connects the two phases: insights from the transformation and serving layer feed back into ingestion refinement. The workbench Build toolkits support the local development loop; the Run toolkits handle deployment and data apps.

dltHub AI Workbench Toolkits

The workbench gives your coding assistant toolkits, each containing a structured, guided workflow for a specific phase. Instead of generating ad-hoc code, the assistant follows a defined sequence of steps from start to finish.

A toolkit contains skills, commands, rules, and an MCP server, tied together by a workflow that tells the assistant which skill to run at each step and how to use the MCP server.

All toolkits depend on init for shared rules, secrets handling, and the MCP server. When using the dlthub ai CLI, init is installed automatically as a dependency. When using the Claude marketplace, install the init plugin separately.

Toolkit components

| Component | What it is | When it runs |
|---|---|---|
| Skill | Step-by-step procedure the assistant follows | Triggered by user intent or explicitly with `/skill-name` |
| Command | A slash command for a specific action | User invokes with `/toolkit:command` |
| Rule | Always-on context (conventions, constraints) | Every session, automatically |
| Workflow | Ordered sequence of skills with a fixed entry point | Loaded as a rule, so always active |
| MCP server | Exposes pipelines, tables, and secrets as tools | During a session, via MCP protocol |
| dltHub context | 9,700+ REST API source definitions with verified connectors and pipeline patterns | During source discovery, via `search_dlthub_sources` |

MCP tools

Two MCP servers give the agent structured context throughout the workflow to avoid the need for manual copy-pasting.

dlt-workspace-mcp (local, installed by dlthub ai init) exposes: data inspection tools (list_tables, preview_table, execute_sql_query, get_row_counts, display_schema, get_local_pipeline_state), secrets tools (secrets_view_redacted, secrets_update_fragment), and toolkit discovery (list_toolkits, toolkit_info).

dltHub context (remote) provides search_dlthub_sources, which the find-source skill uses to search 9,700+ REST API source definitions and return verified connectors with reference links before the agent writes any code.
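To make "exposes tools" more concrete: MCP clients invoke a tool by sending a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request for `preview_table`. The tool name comes from the list above, but the argument names (`pipeline_name`, `table_name`, `limit`) and their values are assumptions for illustration only; the server's `tools/list` response is the place to find the real parameter schema.

```python
import json
from itertools import count

# Simple counter so every request gets a unique JSON-RPC id.
_request_ids = count(1)


def build_tool_call(tool_name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical argument names and values; check the server's tool schema.
request = build_tool_call(
    "preview_table",
    {"pipeline_name": "stripe_pipeline", "table_name": "customers", "limit": 5},
)
print(json.dumps(request, indent=2))
```

In practice the coding assistant builds and sends these messages itself; the sketch only shows what travels over the wire.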

Available toolkits

| Toolkit | Phase | Workflow entry | What it does | Example prompts | Availability |
|---|---|---|---|---|---|
| `quick-start` | Setup | `quick-start` | Guided end-to-end run from data to dashboard in 3–5 prompts; routes to the right entry skill based on a chosen depth | "Use quick-start to take me through the full workflow with the GitHub API" | Run `/quick-start:quick-start` |
| `bootstrap` | Setup | `/init-workspace` | Checks for `uv`, Python venv, and `dlthub`; installs what's missing; initializes the workspace; then runs `dlthub ai init` and lists available toolkits | "Run /init-workspace to set up a Python environment with dlthub" | Try it out yourself! Run `/init-workspace` |
| `rest-api-pipeline` | Build | `find-source` | Scaffold, debug, and validate REST API ingestion pipelines | "Use find-source to load data from the Stripe API into DuckDB" | Try it out yourself! Run `/find-source` |
| `sql-database-pipeline` | Build | `find-source` | Scaffold, debug, and validate SQL database ingestion pipelines | "Use find-source to load tables from my Postgres database into DuckDB" | Run `/find-source` |
| `filesystem-pipeline` | Build | `create-filesystem-pipeline` | Load files (CSV, Parquet, JSONL, or custom) from local disk, S3, GCS, Azure, or SFTP into a destination | "Use create-filesystem-pipeline to load my S3 CSV files into DuckDB" | Sign up |
| `data-exploration` | Explore | `explore-data` | Query loaded data and create marimo dashboards | "Use explore-data to explore my Stripe pipeline and create a dashboard" | Try it out yourself! Run `/explore-data` |
| `dlthub-platform` | Run | `setup-runtime` | Deploy pipelines to the dltHub Platform | "Use setup-runtime to deploy my pipeline to dltHub" | Sign up |
| `transformations` | Transform | `annotate-sources` | Design a Canonical Data Model (CDM) and write dlthub transformation functions from existing pipelines | "Use annotate-sources to start building a CDM from my HubSpot and Luma pipelines" | Sign up |
| `data-quality` | Build | `setup-data-quality` | Define, run, and review data quality checks and metrics on dlt pipeline data | "Use setup-data-quality to add validation checks to my Stripe pipeline" | Sign up |

init is a shared dependency that provides rules, secrets handling, and the MCP server. It is installed automatically by dlthub ai init or as a separate plugin via the Claude marketplace.

Getting started

New project (recommended)

The fastest way to start a new dltHub project is dlthub-start, which needs no prior setup:

uvx dlthub-start@latest my-workspace

This interactive prompt scaffolds a ready-to-run workspace: it lets you pick a scaffold (Starter or Minimal), installs the AI workbench files for your coding assistant(s), and runs uv sync to install all dependencies. Once done:

cd my-workspace
uv run dlthub run load_breweries   # run the example pipeline on dltHub
uv run dlthub show                 # open the dltHub dashboard

Existing project

To add the AI workbench to an existing project:

Note: All dlthub ai commands below use uv run dlthub ... syntax. If you have dlthub installed globally or in an active virtual environment, you can omit uv run and call dlthub directly. We recommend using uv.

# Initialize the environment 
uv init 

# Install dlthub
uv add "dlt[hub]"

# Initialize the dlthub workspace and follow its instructions (most importantly `uv sync`)
uv run dlthub init

# Set up AI support (auto-detects your coding assistant)
uv run dlthub ai init

# If multiple coding assistants are detected, specify one explicitly:
uv run dlthub ai init --agent <agent>  # <agent>: claude | cursor | codex

dlthub ai init detects your coding assistant from environment variables and config files, then installs skills, rules, and the MCP server in the correct locations for that tool.
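The detection step can be pictured with a small sketch. This is not the actual dlthub implementation: it looks only at marker directories in the project root, and the mapping from directory to agent (`.claude`, `.cursor`, `.agents`) is an assumption made for illustration. The real command also consults environment variables, which are not documented here. The sketch shows why `--agent` is needed once more than one assistant is found.

```python
import tempfile
from pathlib import Path
from typing import List, Optional

# Assumed marker directories; this mapping is illustrative only.
AGENT_MARKERS = {"claude": ".claude", "cursor": ".cursor", "codex": ".agents"}


def detect_agents(project_dir: Path) -> List[str]:
    """Return the agents whose marker directory exists in project_dir."""
    return sorted(
        name
        for name, marker in AGENT_MARKERS.items()
        if (project_dir / marker).is_dir()
    )


def init_command(detected: List[str], preferred: Optional[str] = None) -> List[str]:
    """Build the init command, adding --agent when detection is ambiguous."""
    command = ["uv", "run", "dlthub", "ai", "init"]
    if len(detected) > 1:
        if preferred not in detected:
            raise ValueError("several assistants detected; choose one with --agent")
        command += ["--agent", preferred]
    return command


# Demo in a throwaway directory that looks like a Claude + Cursor project.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / ".claude").mkdir()
    (root / ".cursor").mkdir()
    detected = detect_agents(root)

print(detected)
print(" ".join(init_command(detected, preferred="cursor")))
```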

Claude Code note: Add the following to your CLAUDE.md to enforce safe credential handling:
CRITICAL: never ask for credentials in chat. Always let the user edit secrets directly and do not attempt to read them.

Cursor note: After running the command, manually enable the dlt-workspace-mcp server in Cursor Settings > MCP. Add the following to your .cursor/rules/security.mdc to enforce safe credential handling:
CRITICAL: never ask for credentials in chat. Always let the user edit secrets directly and do not attempt to read them.

Codex note: Codex does not support commands and rules, so the installer converts those into skills and AGENTS.md. Codex also runs in a strict sandbox — consider enabling web access in your project or global config:
# .codex/config.toml
web_search = "live"
Add the following to your AGENTS.md to enforce safe credential handling:
CRITICAL: never ask for credentials in chat. Always let the user edit secrets directly and do not attempt to read them.

Browse and install toolkits

No Python environment yet? uvx dlthub-start@latest my-workspace (see New project above) handles this in one command. If you're going through the Claude Code marketplace instead, the bootstrap toolkit's /init-workspace does the same from inside Claude Code.

List the available toolkits:

uv run dlthub ai toolkit list

Install toolkits (if you are not sure which ones you need, we recommend installing all of them):

uv run dlthub ai toolkit install quick-start
uv run dlthub ai toolkit install bootstrap
uv run dlthub ai toolkit install rest-api-pipeline
uv run dlthub ai toolkit install sql-database-pipeline
uv run dlthub ai toolkit install filesystem-pipeline
uv run dlthub ai toolkit install dlthub-platform
uv run dlthub ai toolkit install data-exploration
uv run dlthub ai toolkit install transformations
uv run dlthub ai toolkit install data-quality
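If you prefer to script the installation rather than paste nine commands, a small helper along these lines could work. It is a sketch, not part of the workbench: the function names are made up here, and it defaults to a dry run that only prints the commands. The toolkit names and the `--agent` flag are taken from this README.

```python
import subprocess

TOOLKITS = [
    "quick-start",
    "bootstrap",
    "rest-api-pipeline",
    "sql-database-pipeline",
    "filesystem-pipeline",
    "dlthub-platform",
    "data-exploration",
    "transformations",
    "data-quality",
]


def install_commands(toolkits, agent=None):
    """Return one `dlthub ai toolkit install` argv list per toolkit."""
    commands = []
    for name in toolkits:
        cmd = ["uv", "run", "dlthub", "ai", "toolkit", "install", name]
        if agent is not None:
            cmd += ["--agent", agent]  # override agent detection
        commands.append(cmd)
    return commands


def install_all(toolkits, agent=None, dry_run=True):
    """Print the commands (dry run) or execute them one by one."""
    for cmd in install_commands(toolkits, agent):
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops at the first failing installation.
            subprocess.run(cmd, check=True)


install_all(TOOLKITS)  # pass dry_run=False to actually install
```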

Starting the workbench

Use one of the example prompts from the Available toolkits table above to kick off a workflow.

Claude Code — start a new session via claude in your terminal. Restart after installation for skills and MCP to take effect.

Cursor — open the project in Cursor and use the chat panel (Cmd+L). The installed skills and rules are picked up automatically.

Codex — launch the Codex CLI via codex or use the Codex chat in the UI. Restart Codex after setup for the MCP server to take effect.

Claude Code marketplace plugin (Early Access)

Early Access: The Claude Code plugin is currently in early access, and linking between different toolkits may not work as smoothly yet. For new projects we recommend uvx dlthub-start@latest (see New project above). The marketplace path below is useful when you don't have uv/Python and want to bootstrap from inside Claude Code via the bootstrap toolkit.

The workbench is also available as a Claude Code plugin via the marketplace. Start a Claude Code session and run:

/plugin marketplace add dlt-hub/dlthub-ai-workbench
/plugin install init@dlthub-ai-workbench --scope project
/plugin install quick-start@dlthub-ai-workbench --scope project
/plugin install bootstrap@dlthub-ai-workbench --scope project
/plugin install rest-api-pipeline@dlthub-ai-workbench --scope project
/plugin install sql-database-pipeline@dlthub-ai-workbench --scope project
/plugin install dlthub-platform@dlthub-ai-workbench --scope project
/plugin install data-exploration@dlthub-ai-workbench --scope project
/plugin install transformations@dlthub-ai-workbench --scope project
/plugin install data-quality@dlthub-ai-workbench --scope project

Start a new session — plugins take effect only after restarting Claude Code: claude

Resuming a session? Plugins installed mid-session are not active until you start a new one.

The `dlthub ai` CLI

The dlthub ai subcommand is the bridge between the workbench and your coding assistant. dlthub ai init installs project rules, a secrets management skill, and appropriate ignore files, and configures the dlt MCP server for your agent. dlthub ai toolkit install copies additional toolkit components (skills, rules, commands) into the right locations for your assistant.

Toolkit management — copies skills, rules, commands, and MCP config from the workbench into your project's agent config directory (.claude/, .cursor/, .agents/, etc.):

uv run dlthub ai status                        # show installed agent, dlthub version, active toolkits
uv run dlthub ai toolkit list                  # list available toolkits from the workbench
uv run dlthub ai toolkit info <name>           # show a toolkit's skills, commands, and workflow
uv run dlthub ai toolkit install <name>        # install a toolkit for the detected agent
uv run dlthub ai toolkit install <name> --agent <agent>  # <agent>: claude | cursor | codex  - override agent detection

Secrets management — dlt stores credentials in TOML files; these commands let the assistant inspect and update them without reading raw secret values:

uv run dlthub ai secrets list                  # show which secret files exist and where
uv run dlthub ai secrets view-redacted         # print secrets with values masked
uv run dlthub ai secrets update-fragment --path <file> '<toml>'  # merge a TOML snippet into a secrets file
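The two ideas behind these commands, masking values and merging a fragment, can be sketched on already-parsed TOML (nested dictionaries). This is an illustration of the concept, not the CLI's actual code: the `***` mask, the function names, and the example keys and values are all assumptions.

```python
from copy import deepcopy


def redact(config: dict) -> dict:
    """Return a copy in which every leaf value is replaced by a mask."""
    return {
        key: redact(value) if isinstance(value, dict) else "***"
        for key, value in config.items()
    }


def merge_fragment(base: dict, fragment: dict) -> dict:
    """Deep-merge fragment into a copy of base; fragment values win."""
    merged = deepcopy(base)
    for key, value in fragment.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_fragment(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Hypothetical secrets and fragment, as they would look after TOML parsing.
secrets = {"sources": {"stripe": {"api_key": "sk_live_example"}}}
fragment = {"destination": {"postgres": {"credentials": {"password": "example"}}}}

merged = merge_fragment(secrets, fragment)
print(redact(merged))
```

Because the assistant only ever sees the redacted view, it can reason about which keys exist without reading the values themselves.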

MCP server — starts a local server that exposes your dlthub workspace (pipelines, schemas, tables, secrets) as tools the assistant can call:

uv run dlthub ai mcp run                       # run in SSE mode (default)
uv run dlthub ai mcp run --stdio               # run in stdio mode (for assistants that require it)
uv run dlthub ai mcp install                   # register the MCP server in the agent's config

The MCP server allows the assistant to answer questions like "what tables were loaded?" or "show me the schema" without you having to copy-paste output into the chat.

License

This project is licensed under the dltHub AI Workbench License.
