Open
Description
With llama3, when I enable `--context-parallel-size`, TE reports an error.
In TE, `softmax_lse_per_step` and `softmax_lse` both come from `_flash_attn_forward`. With fa > 2.6, however, `softmax_lse_per_step` changes from a 3-dimensional tensor to a 2-dimensional one, so TE fails when it goes through this API. Does TE support context parallelism with fa > 2.6?
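To help reproduce this, here is a minimal sketch for checking which side of the version boundary an environment is on. The helper names are hypothetical and are not part of TE or flash-attn; the 2.6 threshold is taken only from the observation above, and treating 2.6.0 itself as affected is an assumption.

```python
import re
from importlib import metadata


def parse_version(version: str) -> tuple:
    """Extract the leading numeric components, e.g. '2.6.3.post1' -> (2, 6, 3)."""
    match = re.match(r"(\d+(?:\.\d+)*)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def expects_2d_lse(fa_version: str) -> bool:
    """Hypothetical helper: True if the version is 2.6 or newer.

    Assumption: the 3-D -> 2-D change described in this issue starts at 2.6.
    """
    return parse_version(fa_version)[:2] >= (2, 6)


def installed_flash_attn_version():
    """Return the installed flash-attn version string, or None if not installed."""
    try:
        return metadata.version("flash-attn")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_flash_attn_version()
    if version is None:
        print("flash-attn is not installed")
    else:
        print(version, "-> 2-D softmax_lse expected:", expects_2d_lse(version))
```

Running this in the failing environment and printing `softmax_lse_per_step[i].dim()` next to it should confirm whether the shape change lines up with the installed version.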