PyTorch Benchmark Score Version 0

This file describes how we generate the PyTorch Benchmark Score Version 0, so that users and developers can understand and reproduce the score.

A complete benchmarking environment consists of three parts: the hardware environment, the environment variables and the standard config YAML.

Hardware environment

We use an Amazon EC2 g4dn.metal instance as a self-hosted runner to run the benchmark configuration V0. Before running the benchmark, we tune the instance in a few ways to minimize performance variance.

Disabling Hyperthreading

We disable hyperthreading on all CPUs using the following script:

# Run as root. Takes every hardware thread except the first sibling of each core offline.
for cpunum in $(cat /sys/devices/system/cpu/cpu*/topology/thread_siblings_list | cut -s -d, -f2- | tr ',' '\n' | sort -un)
do
    echo 0 > /sys/devices/system/cpu/cpu$cpunum/online
done
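
To make the selection logic of the script easier to follow, here is a minimal Python sketch of the same filtering step. It is an illustration only, not part of the benchmark tooling: the function name `secondary_siblings` and the sample input are hypothetical, and the sketch works on strings rather than reading `/sys` directly.

```python
def secondary_siblings(sibling_lists):
    """Return the sorted CPU numbers that the shell loop would take offline.

    Each entry is the text of one thread_siblings_list file, e.g. "0,48".
    Like `cut -s -d, -f2-`, entries without a comma are skipped entirely.
    """
    offline = set()
    for entry in sibling_lists:
        fields = entry.strip().split(",")
        if len(fields) < 2:
            continue  # no comma: `cut -s` suppresses the line
        # Keep field 2 onward, i.e. every sibling except the first one.
        offline.update(int(field) for field in fields[1:])
    return sorted(offline)


# Hypothetical sample: four logical CPUs forming two physical cores.
sample = ["0,2", "1,3", "0,2", "1,3"]
print(secondary_siblings(sample))  # [2, 3]
```

Each core's list appears once per logical CPU, which is why the shell version deduplicates with `sort -un` and this sketch collects into a set.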

CPU Isolation and NOHZ

We isolate the CPUs that run the benchmark by setting the following kernel boot parameters:

isolcpus=24-47,72-95 nohz_full=24-47,72-95
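
The CPU list syntax used by these parameters (comma-separated numbers and inclusive ranges) can be expanded with a few lines of Python. This is a sketch for illustration; `parse_cpu_list` is a hypothetical helper, and it handles only the plain `N` and `N-M` forms that appear above.

```python
def parse_cpu_list(spec):
    """Expand a CPU list such as "24-47,72-95" into a set of CPU numbers."""
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start, end = part.split("-", 1)
            cpus.update(range(int(start), int(end) + 1))  # ranges are inclusive
        else:
            cpus.add(int(part))
    return cpus


isolated = parse_cpu_list("24-47,72-95")
print(len(isolated))  # 48
```

On Linux, a set like this could be passed to `os.sched_setaffinity(0, isolated)` to pin the current process to the isolated CPUs. Whether the benchmark pins itself this way is an assumption; this file does not say how the workload is placed on those CPUs.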

Environment variables

All environment variables that could affect the performance score are defined in .github/scripts/config-v0.env.

For more details, please refer to the env file.
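
When reproducing a run, it can help to record which variables were in effect. The sketch below parses env-file text into a dictionary. It assumes the file consists of simple `KEY=VALUE` lines with optional `#` comments and an optional `export` prefix, which this document does not confirm; the variable names in the sample are hypothetical and are not taken from `config-v0.env`.

```python
def parse_env_text(text):
    """Parse simple KEY=VALUE lines into a dict (assumed file format)."""
    env = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("export "):
            line = line[len("export "):]
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not an assignment
        env[key.strip()] = value.strip().strip('"')
    return env


# Hypothetical content, for illustration only.
sample = """
# example settings
EXAMPLE_THREADS=1
export EXAMPLE_MODE="fixed"
"""
print(parse_env_text(sample))
```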

Standard Config YAML

The standard config YAML file is stored here. It was generated by repeatedly running the same benchmark setting on pytorch v1.7.0.dev20200626, torchtext 0.8.0.dev20200626, and torchvision 0.6.1.dev20200626. We manually verified that the performance was stable across those runs. We picked one of the repeated runs at random as the standard execution, and the standard config YAML is a summary of it.
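
Because the standard execution is tied to specific package versions, a reproduction attempt may want to check its own environment against them first. The following is a minimal sketch: `version_mismatches` is a hypothetical helper, the distribution name `torch` is used for pytorch, and ignoring a local build suffix such as `+cu101` is an assumption about how nightly builds are labeled.

```python
# Versions named above for the standard execution.
EXPECTED = {
    "torch": "1.7.0.dev20200626",
    "torchtext": "0.8.0.dev20200626",
    "torchvision": "0.6.1.dev20200626",
}


def version_mismatches(installed, expected=EXPECTED):
    """Return {package: (expected, installed)} for every package that differs."""
    problems = {}
    for name, want in expected.items():
        have = installed.get(name)
        # Drop a local build suffix like "+cu101" before comparing (assumption).
        base = have.split("+", 1)[0] if have else None
        if base != want:
            problems[name] = (want, have)
    return problems


print(version_mismatches({"torch": "1.6.0"}))
```

The `installed` mapping could be built with `importlib.metadata.version(name)` on Python 3.8 or later.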

First, the YAML defines the models that are tested in the standard execution. Below is the complete list of the models we test in V0:

  • pytorch_mobilenet_v3 (succeeded by mobilenet_v3_large in v1)
  • yolov3
  • Background_Matting
  • attention_is_all_you_need_pytorch
  • BERT_pytorch
  • fastNLP
  • dlrm
  • LearningToPaint
  • moco
  • demucs
  • pytorch_struct

Second, the YAML defines the performance score of the standard execution as 1000. All other V0 scores are relative to it. For example, if another benchmark execution scores 900, its performance is 10% slower than the standard execution.
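
The arithmetic behind that example can be sketched as follows. This only illustrates how to read a score relative to the reference value of 1000, following the linear interpretation in the example above; it is not the code that computes the score, and `percent_vs_standard` is a hypothetical name.

```python
STANDARD_SCORE = 1000.0  # score assigned to the standard execution


def percent_vs_standard(score):
    """Return the percentage difference from the standard execution.

    Negative values mean slower than the standard, positive values faster.
    """
    return (score - STANDARD_SCORE) * 100.0 / STANDARD_SCORE


print(percent_vs_standard(900))   # -10.0
print(percent_vs_standard(1000))  # 0.0
```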