Describe the bug
Situation: GPT2 models, with num-layers=30, pipeline-model-parallel-size=4, and decoder-first-pipeline-num-layers / decoder-last-pipeline-num-layers set.

Segmentation results:
stage1: 0,1,2,3,4,5,6
stage2: 7,8,9,10,11,12,13
stage3: 14,15,16,17,18,19,20
stage4: 21,22,23,24,25,26,27

Sum of layers: 28, which is not equal to the expected 30 layers.
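For comparison, a rough sketch of how an uneven split of this kind would normally be expected to cover all layers: explicit counts on the first and last stages, with the middle stages sharing the remainder. This is not the actual Mcore logic, and the first/last values below are assumptions, since the report does not state them.

```python
def sketch_uneven_split(num_layers, pp_size, first, last):
    """Rough sketch: the first and last stages get explicit layer counts,
    and the remaining layers are divided evenly among the middle stages."""
    middle = pp_size - 2
    per_middle, leftover = divmod(num_layers - first - last, middle)
    return [first] + [per_middle] * middle + [last], leftover


# Assumed values: first=7, last=7, with num_layers=30 and pp_size=4.
counts, leftover = sketch_uneven_split(30, 4, 7, 7)
print(counts, sum(counts), leftover)  # [7, 8, 8, 7] 30 0 -- all 30 layers covered
```

Whatever the intended split, the per-stage counts should sum to num-layers; in the observed segmentation above they sum to 28.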
In the legacy version, there is a check on the total number of model layers. In the Mcore version, only num-layers-per-virtual-pipeline-stage can be used to determine the number of model layers. I think that if users are required to split the model layers themselves because of imbalance, a validation check and a clear warning should be added here (see the sketch below).
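A minimal sketch of the kind of check being suggested; the function name and the way the per-stage counts are passed in are hypothetical, not Megatron-LM's actual API:

```python
import warnings


def validate_pipeline_layer_split(num_layers, layers_per_stage):
    """Verify that a manual pipeline split covers every transformer layer.

    layers_per_stage: number of layers assigned to each pipeline stage
    (hypothetical representation of the result of the Mcore split).
    """
    total = sum(layers_per_stage)
    if total != num_layers:
        raise ValueError(
            f"Pipeline split assigns {total} layers across "
            f"{len(layers_per_stage)} stages, but --num-layers is {num_layers}. "
            f"Adjust --decoder-first-pipeline-num-layers / "
            f"--decoder-last-pipeline-num-layers so the per-stage counts sum "
            f"to --num-layers."
        )
    if len(set(layers_per_stage)) > 1:
        warnings.warn(
            f"Uneven pipeline split {layers_per_stage}: stages will have "
            f"imbalanced compute."
        )


# The reported situation: 4 stages of 7 layers each cover only 28 of 30 layers,
# so this raises instead of silently dropping two layers.
try:
    validate_pipeline_layer_split(num_layers=30, layers_per_stage=[7, 7, 7, 7])
except ValueError as err:
    print(err)
```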
Environment (please complete the following information):