Compiler-assisted benchmarking for the study of C++ metaprogram compile times.
- Github project: https://github.com/jpenuchot/ctbench
- Online documentation: https://jpenuchot.github.io/ctbench-docs/
- Discord server: https://discord.gg/NvJFFrdS7p
ctbench allows you to declare and generate compile-time benchmark batches over given size ranges, run them, aggregate and wrangle Clang profiling data, and plot the results.
The project was made to fit the needs of scientific data collection and analysis, thus it is not a one-shot profiler, but a set of tools that enable reproducible data gathering from user-defined, variably sized compile-time benchmarks, using Clang's time-trace feature to understand the impact of metaprogramming techniques on compile time. On top of that, ctbench is also able to measure compiler execution time, to support compilers like GCC that lack built-in profilers.
It has two main components: a C++ plotting toolset that can be used as a CLI program and as a library, and a CMake boilerplate library to generate benchmark and graph targets.
The CMake library contains all the boilerplate code to define benchmark targets compatible with the C++ plotting toolset, called grapher.
Rule of Cheese can be used as an example project for using ctbench.
As an example, here are benchmark curves from the Poacher project. The benchmark case sources are available here.
[Graph: Clang ExecuteCompiler time curve from poacher, generated by the compare_by plotter]
[Graph: Clang Total Frontend time curve from poacher, generated by the compare_by plotter]
ArchLinux and Ubuntu 23.04 are officially supported, as tests are compiled and executed on both of these Linux distributions. Other distributions, including Fedora, should be compatible as long as they provide CMake 3.25 or higher.
- Required ArchLinux packages: boost boost-libs catch2 clang cmake curl fmt git llvm llvm-libs ninja nlohmann-json tar tbb unzip zip
- Required Ubuntu packages: catch2 clang cmake curl git libboost-all-dev libclang-dev libfmt-dev libllvm15 libtbb-dev libtbb12 llvm llvm-dev ninja-build nlohmann-json3-dev pkg-config tar unzip zip
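For reference, these map directly onto each distribution's package manager:

# ArchLinux
sudo pacman -S boost boost-libs catch2 clang cmake curl fmt git llvm llvm-libs ninja nlohmann-json tar tbb unzip zip

# Ubuntu
sudo apt install catch2 clang cmake curl git libboost-all-dev libclang-dev libfmt-dev libllvm15 libtbb-dev libtbb12 llvm llvm-dev ninja-build nlohmann-json3-dev pkg-config tar unzip zip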
The Sciplot library is required too. It can be installed on ArchLinux using the sciplot-git AUR package (NB: the non-git package isn't up-to-date). Otherwise, you can install it for your whole system using CMake, or locally using vcpkg:
git clone https://github.com/Microsoft/vcpkg.git
./vcpkg/bootstrap-vcpkg.sh
./vcpkg/vcpkg install sciplot fmt
cmake --preset release \
-DCMAKE_TOOLCHAIN_FILE=vcpkg/scripts/buildsystems/vcpkg.cmake
Note: the fmt dependency is needed, as vcpkg breaks fmt's CMake integration if you already have it installed.
git clone https://github.com/jpenuchot/ctbench
cd ctbench
cmake --preset release
cmake --build --preset release
sudo cmake --build --preset release --target install
An AUR package is available for easier install and update.
ctbench can be integrated into a CMake project using find_package:
find_package(ctbench REQUIRED)
The example project is provided as a reference project for ctbench integration and usage. For more details, an exhaustive CMake API reference is available.
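For illustration, a consuming project's CMakeLists.txt might boil down to a sketch like the following (the my_benchmark names and the config path are placeholders, not part of ctbench):

find_package(ctbench REQUIRED)

# Hypothetical benchmark case: compiles my_benchmark.cpp with
# BENCHMARK_SIZE going from 1 to 32 in steps of 1, 10 iterations per size.
ctbench_add_benchmark(my_benchmark my_benchmark.cpp 1 32 1 10)

# Hypothetical graph target plotting that case with a JSON config.
ctbench_add_graph(my_benchmark-graph configs/compare.json my_benchmark)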
A benchmark case is represented by a C++ file. It will be "instantiated", i.e. compiled with BENCHMARK_SIZE defined to each value in a range that you provide. BENCHMARK_SIZE is intended to be used by the preprocessor to generate a benchmark instance of the desired size:
#include <boost/preprocessor/repetition/repeat.hpp>

// First we generate foo<int>().
// foo<int>() uses C++20 requirements to dispatch function calls across 16
// of its instances, according to the value of its integer template parameter.

#define FOO_MAX 16

#define DECL(z, i, nope)                                                       \
  template <int N>                                                             \
  requires(N % FOO_MAX == i) constexpr int foo() { return N * i; }

BOOST_PP_REPEAT(BENCHMARK_SIZE, DECL, FOO_MAX);
#undef DECL

// Now we generate the sum() function for instantiation
int sum() {
  int i = 0;
#define CALL(z, n, nop) i += foo<n>();
  BOOST_PP_REPEAT(BENCHMARK_SIZE, CALL, i);
#undef CALL
  return i;
}
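To make the mechanism concrete, here is roughly what the preprocessor generates for BENCHMARK_SIZE set to 2 (with FOO_MAX expanding to 16), reformatted for readability:

// Two constrained foo() overloads, one per repetition index:
template <int N>
requires(N % 16 == 0) constexpr int foo() { return N * 0; }

template <int N>
requires(N % 16 == 1) constexpr int foo() { return N * 1; }

// sum() calls one foo instance per benchmark size unit:
int sum() {
  int i = 0;
  i += foo<0>();
  i += foo<1>();
  return i;
}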
By default, only compiler execution time is measured. If you want to generate plots from Clang's built-in profiler data, add the following to your CMake configuration:
add_compile_options(-ftime-trace -ftime-trace-granularity=1)
Note that plotting profiler data takes more time and will generate a lot of plot files.
Then you can declare a benchmark case target in CMake with the following:
ctbench_add_benchmark(function_selection.requires # Benchmark case name
function_selection-requires.cpp # Benchmark case file
1 # Range begin
32 # Range end
1 # Range step
10) # Iterations per size
Once you have several benchmark cases, you can start writing a graph config.
Example configs can be found here, or by running ctbench-grapher-utils --plotter=<plotter> --command=get-default-config. A list of available plotters can be retrieved by running ctbench-grapher-utils --help.
{
"plotter": "compare_by",
"demangle": true,
"draw_average": true,
"draw_points": true,
"key_ptrs": [
"/name",
"/args/detail"
],
"legend_title": "Timings",
"plot_file_extensions": [
".svg",
".png"
],
"value_ptr": "/dur",
"width": 1500,
"height": 500,
"x_label": "Benchmark size factor",
"y_label": "Time (µs)"
}
This configuration uses the compare_by plotter. It compares the features targeted by the JSON pointers in key_ptrs across all benchmark cases. This is the easiest way to extract and compare several relevant time-trace features at once.
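For reference, Clang's time-trace output follows the Chrome trace event format, so each event carries the fields these pointers refer to. An abridged, illustrative event (durations are in microseconds):

{
  "ph": "X",
  "name": "InstantiateFunction",
  "ts": 5678,
  "dur": 1234,
  "args": { "detail": "foo<0>" }
}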
Back in CMake, you can now declare a graph target that uses this config to compare the time spent in overall compiler execution, the frontend, and the backend across the benchmark cases you declared previously:
ctbench_add_graph(function_selection-feature_comparison-graph # Target name
${CONFIGS}/feature_comparison.json # Config
function_selection.enable_if # First case
function_selection.enable_if_t # Second case
function_selection.if_constexpr # ...
function_selection.control
function_selection.requires)
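Graph targets are regular CMake targets, so the plots are generated simply by building them, for example (the build directory name here is illustrative):

cmake --build build --target function_selection-feature_comparison-graph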
For each group descriptor, a graph will be generated with one curve per benchmark case. In this case, you would then get 3 graphs (ExecuteCompiler, Frontend, and Backend), each with 5 curves (enable_if, enable_if_t, if_constexpr, control, and requires).
- ctbench: compile time benchmarking for Clang at CPPP 2021
- A totally constexpr standard library - Paul Keir, Joel Falcou et al - Meeting C++ 2022
- Pyperf - Tune the system for benchmarks
- Metabench
@article{Penuchot2023,
doi = {10.21105/joss.05165},
url = {https://doi.org/10.21105/joss.05165},
year = {2023},
publisher = {The Open Journal},
volume = {8},
number = {88},
pages = {5165},
author = {Jules Penuchot and Joel Falcou},
title = {ctbench - compile-time benchmarking and analysis},
journal = {Journal of Open Source Software},
}