feat(destination): add Hotdata managed-database destination by eddietejeda · Pull Request #4013 · dlt-hub/dlt

eddietejeda · 2026-06-01T19:56:10Z

Adds Hotdata as a first-party dlt destination:

dlt.destinations.hotdata

Hotdata is a managed-database service that accepts Parquet uploads via HTTP API.

This destination implements `JobClientBase` and `WithStateSync`, the same interfaces used by other non-SQL destinations, so it works with existing pipelines without code changes.

Feature Support

| Feature | Status | Notes |
| --- | --- | --- |
| `replace` | ✅ | Full table re-upload |
| `append` | ✅ | Permissive concat; schema drift handled |
| `merge` / `upsert` | ✅ | Client-side upsert by primary key |
| `insert-only` | ✅ | Insert when not matched; existing rows untouched |
| `truncate-and-insert` | ✅ | Declared replace strategy |
| Table nesting | ✅ | Effectively unlimited (`max_table_nesting=1000`), configurable per destination |
| dlt metadata columns | ✅ | `_dlt_id`, `_dlt_load_id`, `_dlt_parent_id`, etc. preserved |
| Pipeline state sync | ✅ | `WithStateSync`; survives across runs |
| Schema versioning | ✅ | Stored in `_dlt_version` managed table |
| Load tracking | ✅ | Stored in `_dlt_loads` managed table |
| Auto-create database | ✅ | Configurable via `create_database_if_missing` |
| Schema evolution | ✅ | Delete and recreate DB with union of old and new tables |
| Retry / backoff | ✅ | Configurable retries, exponential backoff capped at 30 seconds |
| Error classification | ✅ | Transient (`408`, `409`, `425`, `429`, `5xx`) vs terminal |
| Parallelism strategy | ✅ | `table-sequential` default, configurable |
| Max table nesting | ✅ | Configurable per destination instance |
| Identifier normalisation | ✅ | `snake_case` convention: `[a-z0-9_]`; nested tables as `parent__child` |
| Dataset read API | ❌ | Requires `SqlJobClientBase` |
| SCD2 | ❌ | Requires server-side SQL |
| Staging area | ❌ | No Hotdata staging concept |
| Type mapper | ❌ | Not needed; Parquet carries its own types |
| Clone table | ❌ | No API endpoint |
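
For reviewers, the retry and error-classification rows above can be read as roughly the following. This is a minimal sketch, not the code in `_api_client.py` or `errors.py`: `HttpError`, `is_transient`, `backoff_delay`, `call_with_retry`, the 1-second base delay and the default of 5 retries are all hypothetical names and values chosen for illustration. Only the status codes and the 30-second cap come from the table.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# Status codes listed as transient in the feature table; any 5xx is also transient.
TRANSIENT_STATUS = {408, 409, 425, 429}
MAX_DELAY_SECONDS = 30.0  # cap stated in the feature table


class HttpError(Exception):
    """Hypothetical error type carrying an HTTP status code."""

    def __init__(self, status: int) -> None:
        super().__init__(f"HTTP {status}")
        self.status = status


def is_transient(status: int) -> bool:
    """Return True for statuses worth retrying, False for terminal ones."""
    return status in TRANSIENT_STATUS or 500 <= status <= 599


def backoff_delay(attempt: int, base: float = 1.0) -> float:
    """Delay before retry `attempt` (0-based): doubles each time, capped."""
    return min(base * (2 ** attempt), MAX_DELAY_SECONDS)


def call_with_retry(
    fn: Callable[[], T],
    max_retries: int = 5,  # assumed default, not taken from the PR
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `fn`, retrying transient failures with capped exponential backoff."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except HttpError as exc:
            # Terminal errors and exhausted retries propagate to the caller.
            if not is_transient(exc.status) or attempt == max_retries:
                raise
            sleep(backoff_delay(attempt))
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting; the sketch omits jitter, which a production client may want.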

New Files

dlt/destinations/impl/hotdata/
├── __init__.py
├── _api_client.py       # Hotdata SDK wrapper with retry logic
├── configuration.py     # HotdataCredentials + HotdataClientConfiguration
├── contracts.py         # Identifier normalisation, TableContract
├── errors.py            # Error classification: transient vs terminal
├── factory.py           # hotdata(Destination[...]) + capabilities
├── hotdata.py           # HotdataClient + HotdataLoadJob
├── merge.py             # combine_tables for all write dispositions
└── parquet.py           # Arrow → Parquet writer

tests/load/hotdata/
└── test_hotdata_client.py   # 44 unit tests

Related Issues

N/A — new destination.

Additional Context

Merge is executed client-side: fetch existing data, combine in Arrow, then re-upload. Server-side merge is planned for the next Hotdata API release and will eliminate the need for the fetch step.
loader_parallelism_strategy defaults to table-sequential to prevent concurrent read-modify-write races on the same table.
The Hotdata API requires tables to be declared at database creation time. When a new table appears mid-pipeline, the destination deletes and recreates the database with the union of existing and new declared tables, preserving all data.
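
The combine step of the client-side merge can be illustrated as follows. This sketch uses plain Python dictionaries instead of Arrow tables so it runs without dependencies; `combine_rows` is a hypothetical helper and does not reproduce `combine_tables` in `merge.py`. It assumes that rows are matched on one or more key columns and that, for `upsert`, the incoming row wins.

```python
from typing import Dict, List, Tuple

Row = Dict[str, object]


def combine_rows(
    existing: List[Row],
    incoming: List[Row],
    key_columns: Tuple[str, ...],
    strategy: str = "upsert",
) -> List[Row]:
    """Combine fetched rows with newly loaded rows before re-upload.

    upsert:      an incoming row replaces an existing row with the same key.
    insert-only: an incoming row is added only if its key is not present.
    """
    if strategy not in ("upsert", "insert-only"):
        raise ValueError(f"unsupported strategy: {strategy}")

    # dict preserves insertion order, so existing rows keep their positions.
    merged: Dict[Tuple[object, ...], Row] = {}
    for row in existing:
        merged[tuple(row[c] for c in key_columns)] = row
    for row in incoming:
        key = tuple(row[c] for c in key_columns)
        if strategy == "upsert" or key not in merged:
            merged[key] = row
    return list(merged.values())


existing = [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}]
incoming = [{"id": 2, "v": "B"}, {"id": 3, "v": "c"}]
upserted = combine_rows(existing, incoming, ("id",), "upsert")
inserted = combine_rows(existing, incoming, ("id",), "insert-only")
```

Here `upserted` holds the updated row for `id` 2, while `inserted` leaves it untouched and only adds `id` 3. Because the whole table is read, combined and written back, two concurrent jobs on the same table could overwrite each other, which is the race the `table-sequential` default avoids.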

Ports the hotdata destination into dlt as a first-party destination. Hotdata uses parquet uploads to a managed database API with client-side merge logic implemented in Python/Arrow.

Write dispositions: replace, append, merge, upsert, insert-only
Replace strategies: truncate-and-insert
Merge strategies: upsert, insert-only (dedup via _dlt_id fallback)
Table nesting: unlimited (max_table_nesting=1000, configurable)
State sync: WithStateSync (pipeline state, schema versioning, load tracking)
Metadata: all dlt columns preserved across user and internal tables
Retry: exponential backoff with transient/terminal error classification
Schema evolution: auto-recreates managed database with union of tables
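
The decision behind the schema-evolution step can be sketched as below. `plan_recreate` is a hypothetical helper, not a function from this PR. It only computes which tables would be declared when the database is recreated; how existing data is carried over is not described here and is left out.

```python
from typing import Iterable, List, Optional


def plan_recreate(
    declared: Iterable[str], required: Iterable[str]
) -> Optional[List[str]]:
    """Return the table list to declare on recreation, or None if none is needed."""
    declared_list = list(declared)
    declared_set = set(declared_list)
    missing = [name for name in required if name not in declared_set]
    if not missing:
        return None  # every required table is already declared
    # Union: existing tables first, then new ones in first-seen order, no duplicates.
    return list(dict.fromkeys([*declared_list, *missing]))


no_change = plan_recreate(["users", "orders"], ["orders"])
with_new = plan_recreate(["users"], ["users", "users__tags"])
```

In this example `no_change` is `None`, and `with_new` lists `users` followed by `users__tags`.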

runtimedb.local enforces lowercase [a-z0-9_] identifiers with __ as the nested table separator — exactly snake_case semantics. The direct convention passes identifiers through unchanged and uses ▶ as the separator, which the hotdata API rejects.
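
To make the identifier constraint concrete, here is a simplified normaliser that yields only `[a-z0-9_]` names and joins nested tables with `__`. It is an illustrative approximation, not dlt's `snake_case` naming convention and not the code in `contracts.py`; `normalize_identifier` and `nested_table_name` are hypothetical names, and edge cases such as leading digits are not handled.

```python
import re

NESTED_SEPARATOR = "__"


def normalize_identifier(name: str) -> str:
    """Lowercase a name and reduce it to the [a-z0-9_] character set."""
    # Insert "_" at camelCase boundaries, e.g. "userId" -> "user_Id".
    name = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name)
    name = name.lower()
    # Replace every run of disallowed characters with a single underscore.
    name = re.sub(r"[^a-z0-9_]+", "_", name)
    # Collapse underscore runs so "__" stays reserved for nesting.
    name = re.sub(r"_+", "_", name).strip("_")
    if not name:
        raise ValueError("identifier is empty after normalisation")
    return name


def nested_table_name(*path: str) -> str:
    """Join normalised path segments as parent__child."""
    return NESTED_SEPARATOR.join(normalize_identifier(part) for part in path)


table = nested_table_name("Orders", "lineItems")
```

With these inputs `table` is `orders__line_items`.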

eddietejeda added 2 commits June 1, 2026 11:43

eddietejeda wants to merge 2 commits into dlt-hub:devel from hotdata-dev:feat/hotdata-destination
