docs: Add documentation about benchmarks and campaigns #208

Open · wants to merge 17 commits into main
1 change: 1 addition & 0 deletions README.md
@@ -113,6 +113,7 @@ iterate over the variables and their values using the cartesian product
(or other supported or custom methods), run the benchmark and output the
results in a format that is friendly with plotting tools.

For further information, see the [documentation](docs/README.md).

## Getting Started

9 changes: 9 additions & 0 deletions docs/README.md
@@ -0,0 +1,9 @@
# Documentation

**This documentation is incomplete; please feel free to change or expand it.**

* [getting started](getting_started.md)
* [campaign](campaign.md)
* [benchmark](benchmark.md)
* [wrappers](wrappers.md)
* [hooks](hooks.md)
183 changes: 183 additions & 0 deletions docs/benchmark.md
@@ -0,0 +1,183 @@
# Benchmarks

A benchmark is the component that compiles and runs your code, using the variables given by the campaign.
Because each project needs to be compiled and run differently, there is currently no ready-to-use benchmark implementation; for this reason, you should always write a custom benchmark.

To make a custom benchmark, you need to subclass the `Benchmark` class and implement the required methods.
Below is an example implementation of a custom benchmark; for more info, see [building](#building-the-benchmark) and [running](#running-the-benchmark).
```python
from benchkit.campaign import Benchmark
from benchkit.utils.dir import get_curdir
from benchkit.shell.shellasync import AsyncProcess

import pathlib
import shutil
from typing import Any, Dict, List


# The code to benchmark is often located relative to the current file;
# its directory can be obtained with the following helper:
_bench_src_path = get_curdir(__file__)
_build_dir = _bench_src_path / "build"


class MyBenchmark(Benchmark):
    # Init method, set up all of the required variables
    def __init__(
        self,
    ) -> None:
        # Call the `__init__` of the `Benchmark` base class
        super().__init__(
            # See [Command wrappers](https://github.com/open-s4c/benchkit/blob/main/docs/wrappers.md)
            command_wrappers=(),
            command_attachments=(),
            shared_libs=(),
            # See [Benchmark hooks](https://github.com/open-s4c/benchkit/blob/main/docs/hooks.md)
            pre_run_hooks=(),
            post_run_hooks=(),
        )

    @property
    def bench_src_path(self) -> pathlib.Path:
        # Return the path to the source code that is benchmarked
        return _bench_src_path

    # Return all of the campaign variables that are required to build the source code.
    # These are the only variables that will be passed to the `build_bench` method.
    @staticmethod
    def get_build_var_names() -> List[str]:
        # TODO: add your build variables here
        return ["importantBuildVariable", "importantVariable"]

    # Return all of the campaign variables that are required to run the benchmark.
    # These are the only variables that will be passed to the `single_run` method.
    @staticmethod
    def get_run_var_names() -> List[str]:
        # TODO: add your run variables here
        return ["importantRunVariable", "importantVariable"]

    # Build the source code using the variable values of the current experiment
    def build_bench(
        self,
        # The variables defined in `get_build_var_names`, whose values are provided by the campaign
        # TODO: add your build variables (defined in `get_build_var_names`) here
        importantBuildVariable,
        importantVariable,
        # The constants given to the campaign
        constants,
        # Collects all of the variables passed to this method that are not used
        **_kwargs,
    ) -> None:
        # Remove the build directory before rebuilding
        # (the length check is a small safeguard against removing a too-short path)
        if _build_dir.is_dir() and len(str(_build_dir)) > 4:
            shutil.rmtree(str(_build_dir))

        # Create the build directory
        self.platform.comm.makedirs(path=_build_dir, exist_ok=True)

        # The command used to compile the code; each argument must be its own string in the list.
        # e.g. to compile a single file with `gcc`, this would be `["gcc", path_to_file]`
        # TODO: add your build command here
        compile_command = []
        # Run the command inside the build directory
        self.platform.comm.shell(
            command=compile_command,
            current_dir=_build_dir,
        )

    # Run the benchmark once
    def single_run(
        self,
        # The variables defined in `get_run_var_names`, whose values are provided by the campaign
        # TODO: add your run variables (defined in `get_run_var_names`) here
        importantRunVariable,
        importantVariable,
        # The constants given to the campaign
        constants,
        # Collects all of the variables passed to this method that are not used
        **kwargs,
    ) -> str | AsyncProcess:
        # The command used to run the benchmark; each argument must be its own string in the list.
        # TODO: add your run command here
        run_command = []

        # Run the benchmark in the build directory
        output = self.run_bench_command(
            run_command=run_command,
            wrapped_run_command=run_command,
            current_dir=_build_dir,
            environment=None,
            wrapped_environment=None,
            print_output=True,
        )
        return output

    # Parse the output of an experiment into a dictionary; every entry returned by this
    # method is added to the output `csv` file.
    def parse_output_to_results(
        self,
        command_output: str,
        build_variables: Dict[str, Any],
        run_variables: Dict[str, Any],
        **kwargs,
    ) -> Dict[str, Any]:
        # This assumes that each experiment prints a line of the form `<variable>=<value>`,
        # with pairs delimited by `;`, e.g. `var1=5;var2="value for var2"`
        key_seq_values = command_output.strip().split(";")
        result_dict = dict(map(lambda s: s.split("="), key_seq_values))
        return result_dict
```
This benchmark can then be used for a [campaign](campaign.md).

> [!NOTE]
> The above is an incomplete example implementation that should be adapted to your own use case.
> To facilitate this, `TODO`s mark the places where you should change the implementation of the class to fit your needs.

> [!NOTE]
> Because of the definition of `parse_output_to_results`, this benchmark class expects the program to print a single line of `;`-delimited `key=value` pairs as extra information.
> If this is not what your benchmark outputs, either change the output or change the definition of `parse_output_to_results`.
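
For illustration, here is how the parsing logic of the example class turns such a line into a dictionary (the variable names below are made up):
```python
# Hypothetical benchmark output: one line of `;`-delimited `key=value` pairs.
command_output = "nb_threads=4;duration=1.23"

# Same parsing as `parse_output_to_results` in the example class above.
key_seq_values = command_output.strip().split(";")
result_dict = dict(map(lambda s: s.split("="), key_seq_values))

print(result_dict)  # {'nb_threads': '4', 'duration': '1.23'}
```
Note that all parsed values are strings; convert them in `parse_output_to_results` if you need numeric results.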

## Building the benchmark

To build your benchmark code you have to implement the `get_build_var_names` and `build_bench` functions in your `Benchmark` class.

The `get_build_var_names` function should return a list of all of the variables that are used inside of the `build_bench` function.
The values for these variables will be supplied by the [campaign](campaign.md).

The `build_bench` function should compile your benchmarking code.
Note that this function is called every time an experiment with different build variables (as defined by `get_build_var_names`) is started; this means that your build folder might already contain build artifacts and should be cleaned.

It is called with the following arguments:
* `self`
  * The benchmark instance
* `benchmark_duration_seconds`
  * How long a single benchmark run should take, or `None`
* `constants`
  * A dictionary containing the constants of your benchmark; this is given by the [campaign](campaign.md)
* The variables returned by `get_build_var_names`
  * These are the variables that vary between experiments, and are given by the [campaign](campaign.md)

If you don't need some of these arguments, you can rely on `**kwargs`, which collects all of the arguments that you do not list explicitly in the function signature, as sketched below.
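
As a minimal sketch (the `make` invocation and the `OPTION` flag are made up, and `_build_dir` refers to the build directory from the example above), a `build_bench` that only needs one build variable can collect everything else in `**kwargs`:
```python
# This method goes inside your `Benchmark` subclass (like `MyBenchmark` above).
def build_bench(
    self,
    importantBuildVariable,  # example build variable, declared in `get_build_var_names`
    constants,               # the constants given to the campaign
    **kwargs,                # collects `benchmark_duration_seconds` and any unused variables
) -> None:
    # Hypothetical build command; adapt it to your project.
    self.platform.comm.shell(
        command=["make", f"OPTION={importantBuildVariable}"],
        current_dir=_build_dir,
    )
```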

## Running the benchmark

To run your benchmark code you have to implement the `get_run_var_names` and `single_run` functions in your `Benchmark` class.

The `get_run_var_names` function should return a list of all of the variables that are used inside of the `single_run` function.
The values for these variables will be supplied by the [campaign](campaign.md).

The `single_run` function should run a single experiment and return either the console output or the asynchronous process, if the program is run asynchronously.

It is called with the following arguments:
* `self`
  * The benchmark instance
* `benchmark_duration_seconds`
  * How long a single benchmark run should take, or `None`
* `constants`
  * The constants for your benchmark; this is given by the [campaign](campaign.md)
* `build_variables`
  * The variables returned by `get_build_var_names`
* `record_data_dir`
  * The directory where the results of this experiment will be stored
* `other_variables`
  * The variables returned neither by `get_run_var_names` nor by `get_build_var_names`
* The variables returned by `get_run_var_names`
  * These are the variables that vary between experiments, and are given by the [campaign](campaign.md)

If you don't need some of these arguments, you can rely on `**kwargs`, which collects all of the arguments that you do not list explicitly in the function signature, as sketched below.
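
As a minimal sketch (the `./my_benchmark` executable and its `--option`/`--duration` flags are made up), a `single_run` that forwards the requested duration to the program and ignores the remaining arguments could look like this:
```python
# This method goes inside your `Benchmark` subclass (like `MyBenchmark` above).
def single_run(
    self,
    importantRunVariable,        # example run variable, declared in `get_run_var_names`
    benchmark_duration_seconds,  # duration requested by the campaign, or None
    **kwargs,                    # collects constants, build_variables, record_data_dir, ...
) -> str:
    # Hypothetical run command; adapt it to your project.
    run_command = ["./my_benchmark", f"--option={importantRunVariable}"]
    if benchmark_duration_seconds is not None:
        run_command.append(f"--duration={benchmark_duration_seconds}")

    # Run the command in the build directory, as in the example class above.
    output = self.run_bench_command(
        run_command=run_command,
        wrapped_run_command=run_command,
        current_dir=_build_dir,
        environment=None,
        wrapped_environment=None,
        print_output=True,
    )
    return output
```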
186 changes: 186 additions & 0 deletions docs/campaign.md
@@ -0,0 +1,186 @@
# Campaigns

A campaign is what runs your [benchmark](benchmark.md): a single campaign runs a single benchmark, but runs it multiple times, using different build and run variables.

`benchkit` already provides three campaign implementations; all three are explained below.

The first campaign, `CampaignCartesianProduct`, runs an experiment for every set of variable values obtained by taking the Cartesian product of the value lists.
This means that every combination of values is tested when using this campaign.
For example, if you have the variable `var1`, which can equal `1` or `2`, and the variable `var2`, which can equal `3` or `4`, you will get experiments with the following values.
```
var1 = 1; var2 = 3
var1 = 1; var2 = 4
var1 = 2; var2 = 3
var1 = 2; var2 = 4
```
This kind of campaign can be useful when you want to test every possible combination of values, but it can make the number of experiments grow very large.

The second kind of campaign, `CampaignIterateVariables`, runs an experiment for every pre-defined set of values for the given variables.
This gives you more control over which combinations of variables are tested, ensuring that the number of experiments does not grow too large, even when using a large number of variables.

The last campaign is the `CampaignSuite`, which wraps other campaigns instead of directly running experiments.
This campaign takes multiple other campaigns, of any kind, and runs them one after another.
Note that, since `CampaignSuite` itself is not a subclass of `Campaign`, you cannot create a `CampaignSuite` that runs other `CampaignSuite`s.

## Creating a campaign

Creating a `CampaignCartesianProduct` and creating a `CampaignIterateVariables` work in very similar ways, because the two campaigns only differ in how they treat their variables; for this reason, both are explained at the same time.

```python
from benchkit.campaign import CampaignCartesianProduct

campaign = CampaignCartesianProduct(
    # The name of your benchmark
    name="benchmark_name",
    # The benchmark to run (typically an instance of your own `Benchmark` subclass)
    benchmark=Benchmark(),
    # The number of times each experiment should be run
    nb_runs=3,
    # The variables that should be used for the experiments; this is the only thing that
    # differs between `CampaignCartesianProduct` and `CampaignIterateVariables`
    variables={},
    # Variables that remain constant throughout all of the experiments that are
    # run in this campaign
    constants={"importantConstant": 5},
    # Whether debugging should be turned on; the actual implementation of the debugging
    # is handled by the benchmark
    debug=False,
    # Whether gdb should be used; how gdb is used is handled by the benchmark
    gdb=False,
    # Whether to enable data directories for this campaign, see [results](#results) for more info
    enable_data_dir=False,
    # How to pretty print variables: certain variable values are replaced with more meaningful
    # ones. This only affects how those variables are printed.
    pretty={"importantVaryingVariable": {5: "five", 6: "six"}},
    ## Optional variables
    # Can be used to limit how long an experiment is allowed to run; actually limiting the
    # experiment length must be implemented by the benchmark.
    benchmark_duration_seconds=None,  # Set to e.g. 5.
    continuing=False,
)
```

The above code snippet shows how to initialize a campaign; the only thing that differs between `CampaignCartesianProduct` and `CampaignIterateVariables` is the `variables` argument.

For `CampaignCartesianProduct`, this argument requires a dictionary in which each varying variable is assigned a list of all of its possible values.
```python
variables = {
    "var1": [1, 2, 3],
    "var2": [4, 5, 6],
    "var3": ["very", "important", "values"],
}
```
Using this dictionary for the variables in `CampaignCartesianProduct` will run `27` experiments, combining the three variables in every possible way.

For `CampaignIterateVariables`, this argument requires a list of dictionaries, each assigning a value to every varying variable.
```python
variables = [
    {
        "var1": 1,
        "var2": 5,
        "var3": "very",
    },
    {
        "var1": 2,
        "var2": 5,
        "var3": "important",
    },
    {
        "var1": 2,
        "var2": 4,
        "var3": "values",
    },
]
```
Using this array for the variables in `CampaignIterateVariables` will run `3` experiments, each one using one dictionary to assign a value to every variable.

Creating a `CampaignSuite` is done by initializing it with a list of campaigns.
```python
from benchkit.campaign import CampaignSuite

# Add your campaigns here
campaigns = [...]
suite = CampaignSuite(campaigns=campaigns)
```
This will create a new `CampaignSuite` that runs all of the campaigns inside `campaigns` when the suite is run.

## Running a campaign

To run a campaign, you call the `run` method on that campaign.
```python
campaign.run()
```

If you want to run a campaign suite, you call the `run_suite` method on that suite.
```python
suite.run_suite()
```
This method also accepts a `parallel` argument, which is `False` by default; when it is set to `True`, the campaigns inside the suite are run in parallel.
```python
suite.run_suite(parallel=True)
```

You can also call the `print_durations` method on a suite so that, while running, the time it took to run each experiment and the expected time required to finish the campaign suite are printed to the terminal.
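
A minimal sketch of running a suite with duration reporting (the call order shown here, with `print_durations` invoked before `run_suite`, is an assumption):
```python
from benchkit.campaign import CampaignSuite

# `campaigns` is a list of campaigns, built as shown in "Creating a campaign".
suite = CampaignSuite(campaigns=campaigns)

# Print the duration of each experiment and the expected remaining time while running.
suite.print_durations()

# Run all campaigns of the suite, one after another (or pass parallel=True).
suite.run_suite()
```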

## Results

When running a benchmark, all of the results will be put into the `results` folder.
When `enable_data_dir` is disabled, all files are placed directly into this folder; otherwise each campaign gets its own folder, and inside those folders each variable combination and each run also get their own folder with more information about that particular run.

The results are placed inside `csv` files, with information about the system and benchmark configuration stored as comments, and the actual results stored as data inside the `csv`.
This can look like the following:
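
(The example below is purely illustrative: the exact comment header, column names, and separator depend on your system, benchmark, and variables.)
```
# System and benchmark configuration stored as comments, e.g.:
# nb_cpus: 8
# importantConstant: 5
var1;var2;duration
1;3;0.42
1;4;0.45
2;3;0.40
2;4;0.47
```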

### Graphs

`benchkit` also allows you to make graphs from the data that is collected.
To do this, you can call `generate_graph` on a finished campaign, or `generate_graphs` on a campaign suite to create a graph for each campaign; both methods take the same arguments.
Calling `generate_graph` on a campaign suite will generate a single graph using the combined results of all the campaigns inside the suite.

> [!NOTE]
> When making graphs, you are required to enable data directories; this can be done by setting `enable_data_dir` to `True` when creating the campaign. For more info, see [results](#results).

These functions only require a `plot_name` argument, which is the name of the [`seaborn`](https://seaborn.pydata.org/) plot that should be generated.
You can additionally pass any optional arguments accepted by [`seaborn`](https://seaborn.pydata.org/); if the value of such an argument is the name of one of your variables (as [given](#creating-a-campaign) to the campaign), then `benchkit` will automatically pass the correct values for that variable to [`seaborn`](https://seaborn.pydata.org/).
This can be seen in the following example:
```python
suite.generate_graphs(plot_name="lineplot", x="nb_threads", y="duration", hue="elements")
```
This example will generate a [line plot](https://seaborn.pydata.org/generated/seaborn.lineplot.html) for every campaign in the given suite, where the `x`-axis shows the number of threads used, the `y`-axis shows the time it took to finish the experiment, and a separate line, in a different color, is drawn for each value of the variable `elements`.

If you want to generate a different plot after finishing your experiments, without rerunning them, you can use the `generate_chart_from_single_csv` function.
This function takes the same arguments as `generate_graph`, plus the following extra arguments:
* `csv_pathname`
  * Type: `PathType`
  * The path to the `CSV` file from which the data is read
* `output_dir`
  * Type: `PathType`
  * Default: `"/tmp/figs"`
  * The directory in which the new graph is placed
* `prefix`
  * Type: `str`
  * Default: `""`
  * A prefix for the name of the generated file; the eventual filename will be `f"benchkit-{prefix}{timestamp}-{figure_id}.png"`
* `nan_replace`
  * Type: `bool`
  * Default: `True`
  * If `True`, replace all the `None`s in the data with `NaN`
* `process_dataframe`
  * Type: `DataframeProcessor`
  * Default: `identical_dataframe`
  * A function that can modify the dataframe before it is used

This means you can generate a new graph from an existing results file, without
having to rerun your experiment, using the following code:
```python
from benchkit.lwchart import generate_chart_from_single_csv

generate_chart_from_single_csv(
    "results/<benchmark file>.csv",
    plot_name="histplot",
    prefix="important_experiment-",
    output_dir="results/",
)
```
Note that this graph will not include the name of the campaign that was run;
if you want to add it, set the `prefix` argument.