Commit 98bcc50

AzureML v1 (deepspeedai#185)

* remove `*` so the ds_config.json file is included in the job
* add bookcorpus_data folder
* note for large files (Docker build context)
* update Docker image to upgrade DeepSpeed

1 parent: ba79efc

File tree

3 files changed: +7 −2 lines changed

.gitignore

Lines changed: 4 additions & 1 deletion

@@ -12,9 +12,12 @@ dist/
 *.swp
 
 # AML workspace config file
-*config.json
+config.json
 
 .coverage_*
 *~
 slurm*
 logs
+
+# Data folder
+bookcorpus_data/
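The .gitignore change above narrows the pattern from `*config.json` to `config.json` so that `ds_config.json` is no longer ignored and can ship with the job. A small sketch with Python's `fnmatch` illustrates the difference (gitignore semantics are not identical to `fnmatch` — a bare gitignore pattern matches basenames at any depth — but the wildcard behavior shown here is the relevant part):

```python
from fnmatch import fnmatch

# `*config.json` matches ANY basename ending in "config.json",
# so it also swallowed ds_config.json, which the AML job needs.
print(fnmatch("ds_config.json", "*config.json"))  # True: was ignored
print(fnmatch("config.json", "*config.json"))     # True: was ignored

# The narrowed pattern only matches the AML workspace config file itself.
print(fnmatch("ds_config.json", "config.json"))   # False: now included
print(fnmatch("config.json", "config.json"))      # True: still ignored
```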

examples_deepspeed/azureml/Dockerfile.dockerfile

Lines changed: 1 addition & 1 deletion

@@ -1,4 +1,4 @@
-FROM mcr.microsoft.com/azureml/curated/acpt-pytorch-1.11-py38-cuda11.5-gpu
+FROM mcr.microsoft.com/azureml/curated/acpt-pytorch-1.11-cuda11.3:12
 USER root:root
 
 RUN pip install pybind11

examples_deepspeed/azureml/README.md

Lines changed: 2 additions & 0 deletions

@@ -10,5 +10,7 @@ Setup an AML workspace. Refer to: [set-up doc](https://github.com/Azure/azureml-
 Create an AML Dataset. To run a remote AML job, you need to provide an AML FileDataset.
 Refer to the [prepare_dataset script](prepare_dataset.py) for how to upload .bin and .idx files to the blob store and create a FileDataset.
 
+> Note: The folder `bookcorpus_data` used by the [prepare_dataset script](prepare_dataset.py) should not be under the `azureml` directory, because Azure ML does not allow large files in the Docker build context (limit: 100 files or 1048576 bytes).
+
 # Training
 Run Megatron-DeepSpeed on Azure ML. Refer to the [aml_submit script](aml_submit.py).
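The README note above caps the Docker build context at 100 files or 1048576 bytes, which is why `bookcorpus_data/` must live outside the `azureml` directory. As a rough pre-flight check before submitting a job, one could count files and bytes in the context directory; `within_build_context_limits` below is a hypothetical helper, not part of this repo:

```python
import os

# Hypothetical helper: check whether a directory fits inside Azure ML's
# Docker build context limits stated in the README note
# (at most 100 files and at most 1048576 bytes in total).
def within_build_context_limits(path, max_files=100, max_bytes=1048576):
    n_files = 0
    total_bytes = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            n_files += 1
            total_bytes += os.path.getsize(os.path.join(root, name))
    return n_files <= max_files and total_bytes <= max_bytes
```

Running this against `examples_deepspeed/azureml` before `aml_submit.py` would catch an accidentally committed `bookcorpus_data/` folder early, instead of failing in the remote image build.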
