Commit 0f90b4f

Jyothirmaikottu, jkottu, jessepeq23, jpeq, and zhuofuAMZ authored
Fix EFA test to unblock BuildRCPipeline (#4893)
* Add PT 2.7.1 training images to release
* Validate run_cmd input and polish get_package_list docstring (#4840)
* Validate run_cmd input and polish get_package_list docstring
* Replace dpkg subprocess grep with python-apt cache lookup
* edit TOML file for testing
* resolve conflict in TOML
* udpate TOML
* format code
* add tests
* uncomment build_tag_override
* removed apt import
* format code
* align package list function calls with existing implementation
* reset TOML
* pin_git_modules.py
* reset toml
* comment build_tag_override
* reset arm64
* reset arm64
* retouched get_package_list_using_command
* test pytorch/inference/buildspec-2-6-sm.yml
* pytorch/inference/buildspec-arm64-2-6-sm.yml
* pytorch/inference/buildspec-arm64-2-6-ec2.yml
* reset TOML
* revert changes to use list + test
* reset TOML

---------

Co-authored-by: jpeq <[email protected]>
Co-authored-by: zhuofuAMZ <[email protected]>

* test efa
* test efa
* changed interface name
* removed nccl debug
* removed nccl debug
* revert changes
* do build false
* test commented line
* added debug statements
* added debug statements again
* revert cat result
* revert cat result
* changed log format
* removed buidl changes
* regular tests
* ready for pr
* commented out result cat line
* fixed image type
* revert release

---------

Co-authored-by: jkottu <[email protected]>
Co-authored-by: jessepeq23 <[email protected]>
Co-authored-by: jpeq <[email protected]>
Co-authored-by: zhuofuAMZ <[email protected]>
1 parent 2c70528 commit 0f90b4f

1 file changed, +1 -1 lines changed
test/dlc_tests/container_tests/bin/efa/testEFA

Lines changed: 1 addition & 1 deletion
@@ -31,7 +31,7 @@ fi
 
 validate_all_reduce_performance_logs(){
     grep "aws-ofi-nccl" ${TRAINING_LOG} || { echo "aws-ofi-nccl is not working, please check if it is installed correctly"; exit 1; }
-    grep "NET/OFI Selected Provider is efa" ${TRAINING_LOG} || { echo "efa is not working, please check if it is installed correctly"; exit 1; }
+    grep -i "NET/OFI Selected [Pp]rovider is efa" ${TRAINING_LOG} || { echo "efa is not working, please check if it is installed correctly"; exit 1; }
     # EFA 1.37.0 using "Using network Libfabric" instead of "Using network AWS Libfabric"
     grep -E "Using network (AWS )?Libfabric" ${TRAINING_LOG} || { echo "efa is not working, please check if it is installed correctly"; exit 1; }
     if [[ ${INSTANCE_TYPE} == p4d* || ${INSTANCE_TYPE} == p5* ]]; then
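
A minimal sketch of what the relaxed pattern accepts, for reviewers who want to try it outside the pipeline. The helper name `efa_provider_selected` and the three sample log lines are hypothetical and made up for illustration; testEFA itself greps `${TRAINING_LOG}` directly. The sketch adds `-q` only to suppress output.

```shell
# Hypothetical helper for illustration; not part of testEFA.
# Succeeds when the given line names efa as the selected provider,
# whether the log spells it "Provider" or "provider".
efa_provider_selected() {
    printf '%s\n' "$1" | grep -qi "NET/OFI Selected [Pp]rovider is efa"
}

# Made-up sample lines, not copied from a real training log.
for line in \
    "NCCL INFO NET/OFI Selected Provider is efa" \
    "NCCL INFO NET/OFI Selected provider is efa" \
    "NCCL INFO NET/Socket : Using eth0"
do
    if efa_provider_selected "$line"; then
        echo "match:    $line"
    else
        echo "no match: $line"
    fi
done
```

The first two lines match and the third does not. Since `-i` already makes the whole pattern case-insensitive, the `[Pp]` class is redundant but harmless; either one alone would cover the capitalization difference.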
