Neptune Exporter is a CLI tool to move Neptune experiments (version 2.x or 3.x) to disk as parquet and files, with an option to load them into MLflow or Weights & Biases.
- Streams runs from Neptune to local storage. Artifacts are downloaded alongside the parquet.
- Skips runs that were already exported (presence of `part_0.parquet`), making exports resumable.
- Loads parquet data into MLflow or W&B while preserving run structure (forks, steps, attributes) as closely as possible.
- Prints a human-readable summary of what is on disk.

Requirements:

- Python 3.13, managed via uv.
- Neptune credentials:
  - API token, set with the `NEPTUNE_API_TOKEN` environment variable or the `--api-token` option.
  - Project path, set with the `NEPTUNE_PROJECT` environment variable or the `--project-ids` option.
- Target credentials when loading:
  - MLflow tracking URI, set with `MLFLOW_TRACKING_URI` or `--mlflow-tracking-uri`.
  - W&B entity and API key, set with `WANDB_ENTITY`/`--wandb-entity` and `WANDB_API_KEY`/`--wandb-api-key`.

Important
This project is not published on PyPI. Clone the Git repository and run it directly with uv.
Install dependencies in the repo:

    uv sync

Run the CLI:

    uv run neptune-exporter --help

Export a project to disk:

    uv run neptune-exporter export \
      -p "my-org/my-project" \
      --exporter neptune3 \
      --data-path ./exports/data \
      --files-path ./exports/files

Options:

| Option | Description |
|---|---|
| `--exporter` (required) | `neptune3` or `neptune2`. Use the version corresponding to your workspace. For help, see the migration overview. |
| `-r`/`--runs` | Neptune run ID filter, regex supported. |
| `-a`/`--attributes` | One value is treated as a regex. Multiple values are treated as exact attribute names. |
| `-c`/`--classes` and `--exclude` | Include or exclude certain data types. Arguments: `parameters`, `metrics`, `series`, or `files`. |
| `--include-archived-runs` | Include archived/trashed runs. |
| `--api-token` | Pass the token explicitly instead of using the `NEPTUNE_API_TOKEN` environment variable. |
| `--no-progress`, `-v`/`--verbose`, `--log-file` | Progress and logging controls for the CLI. |

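
The `-a`/`--attributes` rule in the table (one value is a regex, several values are exact names) can be pictured with a small predicate. This is only a sketch of the documented behavior, not the exporter's code: `build_attribute_filter` is a hypothetical name, and matching the regex against the whole attribute path is an assumption.

```python
import re
from typing import Callable, Sequence


def build_attribute_filter(values: Sequence[str]) -> Callable[[str], bool]:
    """Return a predicate that mimics the documented -a/--attributes rule."""
    if not values:
        # No filter given: keep every attribute.
        return lambda path: True
    if len(values) == 1:
        # A single value is treated as a regex.
        # Assumption: the pattern must match the whole attribute path.
        pattern = re.compile(values[0])
        return lambda path: pattern.fullmatch(path) is not None
    # Several values are treated as exact attribute names.
    names = frozenset(values)
    return lambda path: path in names


keep = build_attribute_filter(["metrics/.*"])
print(keep("metrics/loss"), keep("config/lr"))
```

With a single value such as `"metrics/.*"` the predicate keeps every attribute under `metrics/`; with two or more values only paths that are literally equal to one of them are kept.
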
Export everything from a project:

    uv run neptune-exporter export -p "workspace/proj" --exporter neptune3

Export only parameters and metrics from runs matching a pattern:

    uv run neptune-exporter export -p "workspace/proj" --exporter neptune3 -r "RUN-.*" -c parameters -c metrics

Export specific attributes by pattern:

    uv run neptune-exporter export -p "workspace/proj" --exporter neptune3 -a "metrics/accuracy" -a "metrics/loss" -a "config/.*"

Export with the Neptune 2.x client, splitting data and files to different locations:

    uv run neptune-exporter export -p "workspace/proj" --exporter neptune2 --data-path /mnt/fast/exports/data --files-path /mnt/cold/exports/files

Print a summary of the exported data:

    uv run neptune-exporter summary --data-path ./exports/data

Load the exported data into MLflow or W&B:

    # MLflow
    uv run neptune-exporter load \
      --loader mlflow \
      --mlflow-tracking-uri "http://localhost:5000" \
      --data-path ./exports/data \
      --files-path ./exports/files

    # W&B
    uv run neptune-exporter load \
      --loader wandb \
      --wandb-entity my-org \
      --wandb-api-key "$WANDB_API_KEY" \
      --data-path ./exports/data \
      --files-path ./exports/files

Note
MLflow and W&B only accept integer steps. If your Neptune steps contain decimals, use the `--step-multiplier` option to convert the step values to integers. Pick a single multiplier (for example, 1000 for millisteps) and use it consistently for all loads so that every series stays aligned. The default is 1 (no scaling).
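
To see why one consistent multiplier matters, here is a minimal sketch of the scaling described in the note. `to_integer_step` is a hypothetical helper, and raising on a leftover fraction is a choice made for this sketch; how the loader itself rounds or rejects such steps is not stated here.

```python
from decimal import Decimal


def to_integer_step(step: Decimal, multiplier: int = 1) -> int:
    """Scale a decimal Neptune step and return it as an integer."""
    scaled = step * multiplier
    # Sketch-only behavior: refuse steps that are still fractional after scaling.
    if scaled != scaled.to_integral_value():
        raise ValueError(f"step {step} is not an integer after scaling by {multiplier}")
    return int(scaled)


print(to_integer_step(Decimal("1.5"), multiplier=1000))    # 1500
print(to_integer_step(Decimal("0.001"), multiplier=1000))  # 1
```

If one series were loaded with a multiplier of 1000 and another with 100, step `1.5` would land on 1500 in the first and 150 in the second, so the two series would no longer line up on the same axis.
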
- Parquet path:
  - Default: `./exports/data`
  - One directory per project, sanitized for filesystem safety (digest suffix added), but the parquet columns keep the real `project_id`/`run_id`.
  - Each run is split into `run_id_part_<n>.parquet` (Snappy-compressed). Parts roll over around 50 MB compressed.
- Files path:
  - Default: `./exports/files`
  - Mirrors the sanitized project directory. File artifacts and file series are saved relative to that root.
  - Kept separate from the parquet path so you can place potentially large artifacts on different storage.

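
The digest suffix on the project directory exists because sanitizing can make two different project IDs look identical. The sketch below illustrates the idea only: `sanitize_project_dir`, the allowed character set, the SHA-256 hash, and the 8-character suffix are all assumptions, not the exporter's actual scheme.

```python
import hashlib
import re


def sanitize_project_dir(project_id: str, digest_len: int = 8) -> str:
    """Build a filesystem-safe directory name that stays unique per project."""
    # Assumed rule: anything outside [A-Za-z0-9._-] becomes an underscore.
    safe = re.sub(r"[^A-Za-z0-9._-]+", "_", project_id)
    # The digest is taken from the original ID, so IDs that collide after
    # sanitizing still map to different directories.
    digest = hashlib.sha256(project_id.encode("utf-8")).hexdigest()[:digest_len]
    return f"{safe}_{digest}"


print(sanitize_project_dir("my-org/my-project"))
```

Because the parquet columns keep the real `project_id`, the directory name never has to be reversed back into the original ID.
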
All records use `src/neptune_exporter/model.py::SCHEMA`:

| Column | Type | Description |
|---|---|---|
| `project_id` | `string` | Neptune project path, in the form `workspace-name/project-name`. |
| `run_id` | `string` | Neptune run identifier. |
| `attribute_path` | `string` | Full attribute path. For example, `metrics/accuracy`, `metrics/loss`, `files/dataset_desc.json`. |
| `attribute_type` | `string` | One of: `float`, `int`, `string`, `bool`, `datetime`, `string_set`, `float_series`, `string_series`, `histogram_series`, `file`, `file_series`. |
| `step` | `decimal(18,6)` | Decimal step value, per series/metric/file series. |
| `timestamp` | `timestamp(ms, UTC)` | Populated for time-based records (metrics/series/file series). Null for parameters and files. |
| `int_value` / `float_value` / `string_value` / `bool_value` / `datetime_value` / `string_set_value` | typed columns | Payload depending on `attribute_type`. |
| `file_value` | `struct{path}` | Relative path to downloaded file payload. |
| `histogram_value` | `struct{type,edges,values}` | Histogram payload. |

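
When reading records back, the payload column has to be chosen from `attribute_type`. The lookup below is a sketch of one plausible mapping. The scalar types and the file types follow the table, while the assumption that `float_series` and `string_series` reuse `float_value` and `string_value` is inferred from the column list and should be checked against `model.py`. `VALUE_COLUMN` and `payload` are hypothetical names.

```python
# Which column carries the payload for each attribute_type.
VALUE_COLUMN = {
    "float": "float_value",
    "int": "int_value",
    "string": "string_value",
    "bool": "bool_value",
    "datetime": "datetime_value",
    "string_set": "string_set_value",
    "float_series": "float_value",    # assumption: series points reuse the scalar column
    "string_series": "string_value",  # assumption: series points reuse the scalar column
    "histogram_series": "histogram_value",
    "file": "file_value",
    "file_series": "file_value",
}


def payload(record: dict):
    """Return the populated value of one record (a row represented as a dict)."""
    return record[VALUE_COLUMN[record["attribute_type"]]]


row = {"attribute_path": "metrics/loss", "attribute_type": "float_series", "float_value": 0.25}
print(payload(row))  # 0.25
```
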
- Runs are listed per project and streamed in batches. Already-exported runs (those with `part_0.parquet`) are skipped so reruns are resumable.
Warning
Use this with care: if a run was exported and later received new data in Neptune, that new data will not be picked up unless you re-export to a fresh location.
- Data is written per run into parquet parts (~50 MB compressed per part), keeping memory usage low.
- Artifacts and file series are downloaded alongside parquet under `--files-path/<sanitized_project_id>/...`.
Note
A run is considered complete once part_0.parquet exists. If you need a clean re-export, use a fresh --data-path.
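
The resumability rule amounts to a file-existence check. The sketch below assumes the first part of a run is stored as `<run_id>_part_0.parquet` directly inside the sanitized project directory, which is one reading of the naming pattern described earlier. `is_already_exported` and `runs_to_export` are hypothetical names.

```python
from pathlib import Path


def is_already_exported(project_dir: Path, run_id: str) -> bool:
    """A run counts as exported once its first parquet part exists."""
    # Assumed file name layout: <run_id>_part_0.parquet
    return (project_dir / f"{run_id}_part_0.parquet").exists()


def runs_to_export(project_dir: Path, run_ids: list[str]) -> list[str]:
    """Keep only the runs that still need exporting, preserving their order."""
    return [run_id for run_id in run_ids if not is_already_exported(project_dir, run_id)]
```

The check looks only at the first part, which is why a run that gained data after its export is not picked up again and why a clean re-export needs a fresh `--data-path`.
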
- Data is streamed run-by-run from parquet, using the same `--step-multiplier` to turn decimal steps into integers. Keep the multiplier consistent across loads when your Neptune steps are floats.
- MLflow loader:
  - Experiments are named `<project_id>/<experiment_name>`, prefixed with `--name-prefix` if provided.
  - Attribute paths are sanitized to MLflow's key rules (alphanumeric + `_-. /`, max 250 chars).
  - Metrics/series use the integer step. Files are uploaded as artifacts from `--files-path`.
  - MLflow saves parent/fork relationships as tags (no native forks).
- W&B loader:
  - Requires `--wandb-entity`. Project names derive from `project_id`, plus optional `--name-prefix`, sanitized.
  - String series become W&B Tables, histograms use `wandb.Histogram`, files/file series become artifacts. Forked runs from Neptune `3.x` are handled best-effort (W&B has limited preview support).
- If a target run with the same name already exists in the experiment or project, the loader skips uploading that run to avoid duplicates.
- MLflow:
  - Each unique `project_id` + `sys/name` pair becomes an MLflow experiment named `<project_id>/<sys/name>` (prefixed by `--name-prefix` if provided).
  - Runs are created inside that experiment using the Neptune `run_id` (or `custom_run_id` when present) as the run name. MLflow has no native forks, so fork relationships are not recreated there.
- W&B:
  - Neptune `project_id` maps to the W&B project name (sanitized, plus optional `--name-prefix`).
  - `sys/name` becomes the W&B group, so all runs with the same `sys/name` land in the same group.
  - Runs are created with their Neptune `run_id` (or `custom_run_id`) as the run name. Forks from Neptune `3.x` are mapped best-effort via `fork_from`; behavior depends on W&B's fork support.
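
The naming rules above can be summarized in a few lines. This is a sketch under stated assumptions: the function names are hypothetical, joining `--name-prefix` with an underscore is a guess, and so is the character set used to sanitize the W&B project name.

```python
import re
from typing import Optional


def mlflow_experiment_name(project_id: str, sys_name: str, name_prefix: Optional[str] = None) -> str:
    """Experiment name: <project_id>/<sys/name>, optionally prefixed."""
    name = f"{project_id}/{sys_name}"
    # Assumption: the prefix is joined with an underscore.
    return f"{name_prefix}_{name}" if name_prefix else name


def wandb_project_and_group(project_id: str, sys_name: str, name_prefix: Optional[str] = None) -> tuple[str, str]:
    """W&B project comes from project_id, the group from sys/name."""
    project = f"{name_prefix}_{project_id}" if name_prefix else project_id
    # Assumed sanitization: anything outside [A-Za-z0-9_.-] becomes "_".
    project = re.sub(r"[^A-Za-z0-9_.-]", "_", project)
    return project, sys_name


def target_run_name(run_id: str, custom_run_id: Optional[str] = None) -> str:
    """Run name: custom_run_id when present, otherwise run_id."""
    return custom_run_id or run_id


print(mlflow_experiment_name("my-org/my-project", "baseline"))
print(wandb_project_and_group("my-org/my-project", "baseline"))
```

The practical consequence is that runs sharing a `sys/name` end up in one MLflow experiment or one W&B group, so the run name is what tells them apart.
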
- Parameters (`float`, `int`, `string`, `bool`, `datetime`, `string_set`):
  - MLflow: logged as params (values stringified by the client).
  - W&B: logged as config with native types (`string_set` → list).
- Float series (`float_series`):
  - Both targets: logged as metrics using the integer step (`--step-multiplier` applied).
  - Timestamps are forwarded when present.
- String series (`string_series`):
  - MLflow: saved as artifacts (one text file per series).
  - W&B: logged as a Table with columns `step`, `value`, `timestamp`.
- Histogram series (`histogram_series`):
  - MLflow: uploaded as artifacts containing the histogram payload.
  - W&B: logged as `wandb.Histogram`.
- Files (`file`) and file series (`file_series`):
  - Downloaded to `--files-path/<sanitized_project_id>/...` with relative paths stored in `file_value.path`.
  - MLflow/W&B: uploaded as artifacts. File series include the step in the artifact name/path so steps remain distinguishable.
- Attribute names:
  - MLflow: sanitized to allowed chars (alphanumeric + `_-. /`), truncated at 250 chars.
  - W&B: sanitized to allowed pattern (`^[_a-zA-Z][_a-zA-Z0-9]*$`); invalid chars become `_`, and names are forced to start with a letter or underscore.

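
The two attribute-name rules can be sketched as follows. The function names are hypothetical. Replacing disallowed MLflow characters with `_` and prepending `_` to W&B names that start with a digit are assumptions about details the rules leave open, and "alphanumeric" is read here as ASCII letters and digits.

```python
import re

MLFLOW_MAX_KEY_LENGTH = 250


def sanitize_mlflow_key(path: str) -> str:
    """Keep alphanumerics and '_-. /', then truncate to 250 characters."""
    # Assumption: disallowed characters are replaced with "_", not dropped.
    cleaned = re.sub(r"[^A-Za-z0-9_\-. /]", "_", path)
    return cleaned[:MLFLOW_MAX_KEY_LENGTH]


def sanitize_wandb_key(path: str) -> str:
    """Produce a name matching ^[_a-zA-Z][_a-zA-Z0-9]*$."""
    cleaned = re.sub(r"[^_a-zA-Z0-9]", "_", path)
    # Assumption: a leading underscore is added when the first character
    # is not a letter or underscore (or the name is empty).
    if not re.match(r"[_a-zA-Z]", cleaned):
        cleaned = "_" + cleaned
    return cleaned


print(sanitize_mlflow_key("metrics/acc@1"))    # metrics/acc_1
print(sanitize_wandb_key("metrics/accuracy"))  # metrics_accuracy
```

Note that the W&B rule flattens the `/` hierarchy, so `metrics/accuracy` and `metrics_accuracy` would end up with the same name under this sketch.
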
For details on Neptune attribute types, see the documentation.
The uv run neptune-exporter summary command reads parquet files and prints counts of projects and runs, attribute type breakdowns, and basic step stats to help you verify the export before loading.
To learn more about the Neptune acquisition and shutdown, see the transition hub.
Apache 2.0. See LICENSE.txt.