* remove * to include ds_config.json file in job
* add bookcorpus_data folder
* note for large files (docker build context)
* update docker image to upgrade deepspeed
examples_deepspeed/azureml/README.md: 2 additions, 0 deletions
@@ -10,5 +10,7 @@ Setup an AML workspace. Refer to: [set-up doc](https://github.com/Azure/azureml-
Create an AML Dataset. To run a remote AML job, you need to provide an AML FileDataset.
Refer to the [prepare_dataset script](prepare_dataset.py) for how to upload the .bin and .idx files to the blob store and how to create a FileDataset.
+ > Note: The `bookcorpus_data` folder used by the [prepare_dataset script](prepare_dataset.py) should not be placed under the `azureml` directory, because Azure ML does not allow large files in the Docker build context (limit: 100 files or 1048576 bytes).
+
# Training
Run Megatron-DeepSpeed on Azure ML. Refer to [aml_submit script](aml_submit.py).
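
The build-context limit quoted in the added note can be checked locally before submitting a job. The following is a minimal sketch, not part of the PR: the helper names `build_context_usage` and `exceeds_build_context_limit` are hypothetical, the two thresholds are copied from the note, and treating either threshold as a hard failure is an assumption about how Azure ML enforces them.

```python
import os

# Limits as quoted in the README note; the exact enforcement by Azure ML
# is an assumption in this sketch.
MAX_FILES = 100
MAX_BYTES = 1048576  # 1 MiB


def build_context_usage(root):
    """Return (file_count, total_bytes) for all regular files under root."""
    file_count = 0
    total_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip broken symlinks and other non-regular entries.
            if os.path.isfile(path):
                file_count += 1
                total_bytes += os.path.getsize(path)
    return file_count, total_bytes


def exceeds_build_context_limit(root, max_files=MAX_FILES, max_bytes=MAX_BYTES):
    """Return True if the directory exceeds either the file or the byte limit."""
    file_count, total_bytes = build_context_usage(root)
    return file_count > max_files or total_bytes > max_bytes
```

Calling `exceeds_build_context_limit("examples_deepspeed/azureml")` from the repository root would flag a `bookcorpus_data` folder that was left inside the `azureml` directory, which is the situation the note warns about.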