I'm passing the config file with -c config.yaml.
When "attributes" is a list with more than one element, the stat command fails:
uv run dolma -c analyze.yaml stat
attributes:
  - cudo/attributes/c4_v2
  - cudo/attributes/pii_regex_with_counts_fast_v2
bins: 20
debug: false
processes: 1
regex: null
report: stats
seed: 0
total: true
work_dir:
  input: null
  output: null
Traceback (most recent call last):
  File "/home/user/Projects/llm-data-prep-1/.venv/bin/dolma", line 10, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/user/Projects/llm-data-prep-1/.venv/lib/python3.11/site-packages/dolma/cli/__main__.py", line 93, in main
    return cli.run_from_args(args=args, config=config)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/Projects/llm-data-prep-1/.venv/lib/python3.11/site-packages/dolma/cli/__init__.py", line 192, in run_from_args
    return cls.run(parsed_config)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/Projects/llm-data-prep-1/.venv/lib/python3.11/site-packages/dolma/cli/analyzer.py", line 76, in run
    create_and_run_analyzer(
  File "/home/user/Projects/llm-data-prep-1/.venv/lib/python3.11/site-packages/dolma/core/analyzer.py", line 321, in create_and_run_analyzer
    analyzer = AnalyzerProcessor(
               ^^^^^^^^^^^^^^^^^^
  File "/home/user/Projects/llm-data-prep-1/.venv/lib/python3.11/site-packages/dolma/core/parallel.py", line 154, in __init__
    raise ValueError(
ValueError: The number of source and destination prefixes must be the same (got 2 and 1)
When I provide only one attribute, the same command works as expected.
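Judging from the message alone, the failure looks like a length check between two lists. Below is a minimal sketch of that kind of check. It assumes, without confirmation from the dolma source, that each entry in "attributes" becomes one source prefix while only a single destination prefix is derived. The function name `check_prefix_counts` and the destination path are hypothetical and are not part of dolma.

```python
def check_prefix_counts(source_prefixes, destination_prefixes):
    """Hypothetical stand-in for the validation that raises in parallel.py.

    This is NOT dolma's code; it only mirrors the error message from the traceback.
    """
    if len(source_prefixes) != len(destination_prefixes):
        raise ValueError(
            "The number of source and destination prefixes must be the same "
            f"(got {len(source_prefixes)} and {len(destination_prefixes)})"
        )


# The two attribute paths from the config above.
attributes = [
    "cudo/attributes/c4_v2",
    "cudo/attributes/pii_regex_with_counts_fast_v2",
]

# Assumed: a single destination prefix is derived regardless of the attribute count.
destinations = ["hypothetical/destination"]

# One attribute against one destination passes the check.
check_prefix_counts(attributes[:1], destinations)

# Two attributes against one destination fail the check.
message = None
try:
    check_prefix_counts(attributes, destinations)
except ValueError as err:
    message = str(err)
    print(message)
```

If that assumption holds, it would explain why a single attribute works and two do not.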
Env info: Python 3.11.11, Ubuntu 24.04.2 LTS