Add note to start_index parameter
miguelusque committed May 7, 2024
1 parent 37aa823 commit 11412bd
Showing 1 changed file with 2 additions and 1 deletion.
3 changes: 2 additions & 1 deletion nemo_curator/utils/file_utils.py
@@ -222,7 +222,8 @@ def reshard_jsonl(
 output_dir: The output directory where the resharded jsonl files will be written
 output_file_size: Approximate size of output files. Must specify with a string and
 with the unit K, M or G for kilo, mega or gigabytes
-start_index: Starting index for naming the output files
+start_index: Starting index for naming the output files. Note: The indices may not
+be continuous if the sharding process would output an empty file in its place
 file_prefix: Prefix to use to prepend to output file number
 """
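
To make the new note on start_index concrete, here is a minimal sketch of how gaps in the output numbering could arise. It is not the reshard_jsonl implementation: the helper name shard_names, the partition_sizes argument, and the ".jsonl" suffix are assumptions made for illustration only. The idea is that every partition consumes an index, but a partition that would produce an empty file writes nothing, so its index never appears among the output files.

```python
def shard_names(partition_sizes, start_index=0, file_prefix=""):
    """Return the output file names for the non-empty partitions.

    Hypothetical helper for illustration; not part of nemo_curator.
    partition_sizes holds the number of records in each partition.
    """
    names = []
    for offset, size in enumerate(partition_sizes):
        if size == 0:
            # The empty partition still consumes its index,
            # so no output file carries this number.
            continue
        names.append(f"{file_prefix}{start_index + offset}.jsonl")
    return names


# Three partitions, the middle one empty: index 6 is skipped.
print(shard_names([120, 0, 87], start_index=5, file_prefix="shard_"))
# ['shard_5.jsonl', 'shard_7.jsonl']
```

Under these assumptions, callers that chain several resharding runs should derive the next start_index from the files actually written rather than from a count of expected shards.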
