docs: Add documentation about benchmarks and campaigns #208

Open · wants to merge 17 commits into main
1 change: 1 addition & 0 deletions README.md
@@ -113,6 +113,7 @@ iterate over the variables and their values using the cartesian product
(or other supported or custom methods), run the benchmark and output the
results in a format that is friendly with plotting tools.

For further information, see the [documentation](docs/README.md).

## Getting Started

9 changes: 9 additions & 0 deletions docs/README.md
@@ -0,0 +1,9 @@
# Documentation

**This documentation is incomplete; please feel free to change or expand it.**

* [getting started](getting_started.md)
* [campaign](campaign.md)
* [benchmark](benchmark.md)
* [wrappers](wrappers.md)
* [hooks](hooks.md)
183 changes: 183 additions & 0 deletions docs/benchmark.md
@@ -0,0 +1,183 @@
# Benchmarks

A benchmark is the component that compiles and runs your code, using the variables given by the campaign.
Because each project needs to be compiled and run differently, there is currently no ready-to-use benchmark implementation; for this reason, you should always write a custom benchmark.

To make a custom benchmark, you need to subclass the `Benchmark` class and implement the required methods.
Below is an example implementation of a custom benchmark; for more info, see [building](#building-the-benchmark) and [running](#running-the-benchmark).
```python
from benchkit.campaign import Benchmark
from benchkit.utils.dir import get_curdir
from benchkit.shell.shellasync import AsyncProcess

import pathlib
import shutil
from typing import Any, Dict, List


# The code to benchmark is often located relative to the current file;
# its directory can be obtained with the following helper:
_bench_src_path = get_curdir(__file__)
_build_dir = _bench_src_path / "build"


class MyBenchmark(Benchmark):
    # Init method, set up all of the required variables
    def __init__(
        self,
    ) -> None:
        # Call the `__init__` of the `Benchmark` base class
        super().__init__(
            # See [Command wrappers](https://github.com/open-s4c/benchkit/blob/main/docs/wrappers.md)
            command_wrappers=(),
            command_attachments=(),
            shared_libs=(),
            # See [Benchmark hooks](https://github.com/open-s4c/benchkit/blob/main/docs/hooks.md)
            pre_run_hooks=(),
            post_run_hooks=(),
        )

    @property
    def bench_src_path(self) -> pathlib.Path:
        # Return the path to the source code that is benchmarked
        return _bench_src_path

    # Return all of the campaign variables that are required to build the source code.
    # These are the only variables that will be passed to the `build_bench` method.
    @staticmethod
    def get_build_var_names() -> List[str]:
        # TODO: add your build variables here
        return ["importantBuildVariable", "importantVariable"]

    # Return all of the campaign variables that are required to run the benchmark.
    # These are the only variables that will be passed to the `single_run` method.
    @staticmethod
    def get_run_var_names() -> List[str]:
        # TODO: add your run variables here
        return ["importantRunVariable", "importantVariable"]

    # Build the source code using the variable values of the current experiment
    def build_bench(
        self,
        # The variables defined in `get_build_var_names`, whose values are provided by the campaign
        # TODO: add your build variables (defined in `get_build_var_names`) here
        importantBuildVariable,
        importantVariable,
        # The constants given to the campaign
        constants,
        # Collects all of the variables passed to this method that are not used
        **_kwargs,
    ) -> None:
        # Remove the build directory before rebuilding
        # (the length check is a small safeguard against removing a too-short path)
        if _build_dir.is_dir() and len(str(_build_dir)) > 4:
            shutil.rmtree(str(_build_dir))

        # Create the build directory
        self.platform.comm.makedirs(path=_build_dir, exist_ok=True)

        # The command used to compile the code; each argument must be its own string in the list.
        # e.g. to compile a single file with `gcc`, this would be `["gcc", path_to_file]`
        # TODO: add your build command here
        compile_command = []
        # Run the command inside the build directory
        self.platform.comm.shell(
            command=compile_command,
            current_dir=_build_dir,
        )

    # Run the benchmark once
    def single_run(
        self,
        # The variables defined in `get_run_var_names`, whose values are provided by the campaign
        # TODO: add your run variables (defined in `get_run_var_names`) here
        importantRunVariable,
        importantVariable,
        # The constants given to the campaign
        constants,
        # Collects all of the variables passed to this method that are not used
        **kwargs,
    ) -> str | AsyncProcess:
        # The command used to run the benchmark; each argument must be its own string in the list.
        # TODO: add your run command here
        run_command = []

        # Run the benchmark in the build directory
        output = self.run_bench_command(
            run_command=run_command,
            wrapped_run_command=run_command,
            current_dir=_build_dir,
            environment=None,
            wrapped_environment=None,
            print_output=True,
        )
        return output

    # Parse the output of an experiment into a dictionary; every entry returned by this
    # method is added to the output `csv` file.
    def parse_output_to_results(
        self,
        command_output: str,
        build_variables: Dict[str, Any],
        run_variables: Dict[str, Any],
        **kwargs,
    ) -> Dict[str, Any]:
        # This assumes that each experiment prints a line of the form `<variable>=<value>`,
        # with pairs delimited by `;`, e.g. `var1=5;var2="value for var2"`
        key_seq_values = command_output.strip().split(";")
        result_dict = dict(map(lambda s: s.split("="), key_seq_values))
        return result_dict
```
This benchmark can then be used for a [campaign](campaign.md).

> [!NOTE]
> The above is an incomplete example implementation that should be adapted to your own use case.
> To facilitate this, `TODO`s mark the places where you should change the implementation of the class to fit your needs.

> [!NOTE]
> Because of the definition of `parse_output_to_results`, this benchmark class expects the program to print a single line of `;`-delimited `key=value` pairs as extra information.
> If this is not what your benchmark outputs, either change the output or change the definition of `parse_output_to_results`.
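
For illustration, here is how the parsing logic of the example class turns such a line into a dictionary (the variable names below are made up):
```python
# Hypothetical benchmark output: one line of `;`-delimited `key=value` pairs.
command_output = "nb_threads=4;duration=1.23"

# Same parsing as `parse_output_to_results` in the example class above.
key_seq_values = command_output.strip().split(";")
result_dict = dict(map(lambda s: s.split("="), key_seq_values))

print(result_dict)  # {'nb_threads': '4', 'duration': '1.23'}
```
Note that all parsed values are strings; convert them in `parse_output_to_results` if you need numeric results.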

## Building the benchmark

To build your benchmark code you have to implement the `get_build_var_names` and `build_bench` functions in your `Benchmark` class.

The `get_build_var_names` function should return a list of all of the variables that are used inside of the `build_bench` function.
The values for these variables will be supplied by the [campaign](campaign.md).

The `build_bench` function should compile your benchmarking code.
Note that this function is called every time an experiment with different build variables (as defined by `get_build_var_names`) is started; this means that your build folder might already contain build artifacts and should be cleaned.

It is called with the following arguments:
* `self`
  * The benchmark instance
* `benchmark_duration_seconds`
  * How long a single benchmark run should take, or `None`
* `constants`
  * A dictionary containing the constants of your benchmark; this is given by the [campaign](campaign.md)
* The variables returned by `get_build_var_names`
  * These are the variables that vary between experiments, and are given by the [campaign](campaign.md)

If you don't need some of these arguments, you can rely on `**kwargs`, which collects all of the arguments that you do not list explicitly in the function signature, as sketched below.
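
As a minimal sketch (the `make` invocation and the `OPTION` flag are made up, and `_build_dir` refers to the build directory from the example above), a `build_bench` that only needs one build variable can collect everything else in `**kwargs`:
```python
# This method goes inside your `Benchmark` subclass (like `MyBenchmark` above).
def build_bench(
    self,
    importantBuildVariable,  # example build variable, declared in `get_build_var_names`
    constants,               # the constants given to the campaign
    **kwargs,                # collects `benchmark_duration_seconds` and any unused variables
) -> None:
    # Hypothetical build command; adapt it to your project.
    self.platform.comm.shell(
        command=["make", f"OPTION={importantBuildVariable}"],
        current_dir=_build_dir,
    )
```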

## Running the benchmark

To run your benchmark code you have to implement the `get_run_var_names` and `single_run` functions in your `Benchmark` class.

The `get_run_var_names` function should return a list of all of the variables that are used inside of the `single_run` function.
The values for these variables will be supplied by the [campaign](campaign.md).

The `single_run` function should run a single experiment and return either the console output or the asynchronous process, if the program is run asynchronously.

It is called with the following arguments:
* `self`
  * The benchmark instance
* `benchmark_duration_seconds`
  * How long a single benchmark run should take, or `None`
* `constants`
  * The constants for your benchmark; this is given by the [campaign](campaign.md)
* `build_variables`
  * The variables returned by `get_build_var_names`
* `record_data_dir`
  * The directory where the results of this experiment will be stored
* `other_variables`
  * The variables returned neither by `get_run_var_names` nor by `get_build_var_names`
* The variables returned by `get_run_var_names`
  * These are the variables that vary between experiments, and are given by the [campaign](campaign.md)

If you don't need some of these arguments, you can rely on `**kwargs`, which collects all of the arguments that you do not list explicitly in the function signature, as sketched below.
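
As a minimal sketch (the `./my_benchmark` executable and its `--option`/`--duration` flags are made up), a `single_run` that forwards the requested duration to the program and ignores the remaining arguments could look like this:
```python
# This method goes inside your `Benchmark` subclass (like `MyBenchmark` above).
def single_run(
    self,
    importantRunVariable,        # example run variable, declared in `get_run_var_names`
    benchmark_duration_seconds,  # duration requested by the campaign, or None
    **kwargs,                    # collects constants, build_variables, record_data_dir, ...
) -> str:
    # Hypothetical run command; adapt it to your project.
    run_command = ["./my_benchmark", f"--option={importantRunVariable}"]
    if benchmark_duration_seconds is not None:
        run_command.append(f"--duration={benchmark_duration_seconds}")

    # Run the command in the build directory, as in the example class above.
    output = self.run_bench_command(
        run_command=run_command,
        wrapped_run_command=run_command,
        current_dir=_build_dir,
        environment=None,
        wrapped_environment=None,
        print_output=True,
    )
    return output
```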
186 changes: 186 additions & 0 deletions docs/campaign.md
@@ -0,0 +1,186 @@
# Campaigns

A campaign is what runs your [benchmark](benchmark.md): a single campaign runs a single benchmark, but runs it multiple times, using different build and run variables.

`benchkit` already provides three campaign implementations; all three are explained below.

The first campaign, `CampaignCartesianProduct`, runs an experiment for every set of variable values obtained by taking the Cartesian product of the value lists.
This means that every combination of values is tested when using this campaign.
For example, if you have the variable `var1`, which can equal `1` or `2`, and the variable `var2`, which can equal `3` or `4`, you will get experiments with the following values.
```
var1 = 1; var2 = 3
var1 = 1; var2 = 4
var1 = 2; var2 = 3
var1 = 2; var2 = 4
```
This kind of campaign can be useful when you want to test every possible combination of values, but it can make the number of experiments grow very large.

The second kind of campaign, `CampaignIterateVariables`, runs an experiment for every pre-defined set of values for the given variables.
This gives you more control over which combinations of variables are tested, ensuring that the number of experiments does not grow too large, even when using a large number of variables.

The last campaign is the `CampaignSuite`, which wraps other campaigns instead of directly running experiments.
This campaign takes multiple other campaigns, of any kind, and runs them one after another.
Note that, since `CampaignSuite` itself is not a subclass of `Campaign`, you cannot create a `CampaignSuite` that runs other `CampaignSuite`s.

## Creating a campaign

Creating a `CampaignCartesianProduct` and creating a `CampaignIterateVariables` work in very similar ways, because the two campaigns only differ in how they treat their variables; for this reason, both are explained at the same time.

```python
from benchkit.campaign import CampaignCartesianProduct

campaign = CampaignCartesianProduct(
    # The name of your benchmark
    name="benchmark_name",
    # The benchmark to run (typically an instance of your own `Benchmark` subclass)
    benchmark=Benchmark(),
    # The number of times each experiment should be run
    nb_runs=3,
    # The variables that should be used for the experiments; this is the only thing that
    # differs between `CampaignCartesianProduct` and `CampaignIterateVariables`
    variables={},
    # Variables that remain constant throughout all of the experiments that are
    # run in this campaign
    constants={"importantConstant": 5},
    # Whether debugging should be turned on; the actual implementation of the debugging
    # is handled by the benchmark
    debug=False,
    # Whether gdb should be used; how gdb is used is handled by the benchmark
    gdb=False,
    # Whether to enable data directories for this campaign, see [results](#results) for more info
    enable_data_dir=False,
    # How to pretty print variables: certain variable values are replaced with more meaningful
    # ones. This only affects how those variables are printed.
    pretty={"importantVaryingVariable": {5: "five", 6: "six"}},
    ## Optional variables
    # Can be used to limit how long an experiment is allowed to run; actually limiting the
    # experiment length must be implemented by the benchmark.
    benchmark_duration_seconds=None,  # Set to e.g. 5.
    continuing=False,
)
```

The above code snippet shows how to initialize a campaign; the only thing that differs between `CampaignCartesianProduct` and `CampaignIterateVariables` is the `variables` argument.

For `CampaignCartesianProduct`, this argument requires a dictionary in which each varying variable is assigned a list of all of its possible values.
```python
variables = {
    "var1": [1, 2, 3],
    "var2": [4, 5, 6],
    "var3": ["very", "important", "values"],
}
```
Using this dictionary for the variables in `CampaignCartesianProduct` will run `27` experiments, combining the three variables in every possible way.

For `CampaignIterateVariables`, this argument requires a list of dictionaries, each assigning a value to every varying variable.
```python
variables = [
    {
        "var1": 1,
        "var2": 5,
        "var3": "very",
    },
    {
        "var1": 2,
        "var2": 5,
        "var3": "important",
    },
    {
        "var1": 2,
        "var2": 4,
        "var3": "values",
    },
]
```
Using this array for the variables in `CampaignIterateVariables` will run `3` experiments, each one using one dictionary to assign a value to every variable.

Creating a `CampaignSuite` is done by initializing it with a list of campaigns.
```python
from benchkit.campaign import CampaignSuite

# Add your campaigns here
campaigns = [...]
suite = CampaignSuite(campaigns=campaigns)
```
This will create a new `CampaignSuite` that runs all of the campaigns inside `campaigns` when the suite is run.

## Running a campaign

To run a campaign, you call the `run` method on that campaign.
```python
campaign.run()
```

If you want to run a campaign suite, you call the `run_suite` method on that suite.
```python
suite.run_suite()
```
This method also accepts a `parallel` argument, which is `False` by default; when it is set to `True`, the campaigns inside the suite are run in parallel.
```python
suite.run_suite(parallel=True)
```

You can also call the `print_durations` method on a suite so that, while running, the time it took to run each experiment and the expected time required to finish the campaign suite are printed to the terminal.
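
A minimal sketch of running a suite with duration reporting (the call order shown here, with `print_durations` invoked before `run_suite`, is an assumption):
```python
from benchkit.campaign import CampaignSuite

# `campaigns` is a list of campaigns, built as shown in "Creating a campaign".
suite = CampaignSuite(campaigns=campaigns)

# Print the duration of each experiment and the expected remaining time while running.
suite.print_durations()

# Run all campaigns of the suite, one after another (or pass parallel=True).
suite.run_suite()
```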

## Results

When running a benchmark, all of the results will be put into the `results` folder.
When `enable_data_dir` is disabled, all files are placed directly into this folder; otherwise each campaign gets its own folder, and inside those folders each variable combination and each run also get their own folder with more information about that particular run.

The results are placed inside `csv` files, with information about the system and benchmark configuration stored as comments, and the actual results stored as data inside the `csv`.
This can look like the following:
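
(The example below is purely illustrative: the exact comment header, column names, and separator depend on your system, benchmark, and variables.)
```
# System and benchmark configuration stored as comments, e.g.:
# nb_cpus: 8
# importantConstant: 5
var1;var2;duration
1;3;0.42
1;4;0.45
2;3;0.40
2;4;0.47
```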

### Graphs

`benchkit` also allows you to make graphs from the data that is collected.
To do this, you can call `generate_graph` on a finished campaign, or `generate_graphs` on a campaign suite to create a graph for each campaign; both methods take the same arguments.
Calling `generate_graph` on a campaign suite will generate a single graph using the combined results of all the campaigns inside the suite.

> [!NOTE]
> When making graphs, you are required to enable data directories; this can be done by setting `enable_data_dir` to `True` when creating the campaign. For more info, see [results](#results).

These functions only require a `plot_name` argument, which is the name of the [`seaborn`](https://seaborn.pydata.org/) plot that should be generated.
You can additionally pass any optional arguments accepted by [`seaborn`](https://seaborn.pydata.org/); if the value of such an argument is the name of one of your variables (as [given](#creating-a-campaign) to the campaign), then `benchkit` will automatically pass the correct values for that variable to [`seaborn`](https://seaborn.pydata.org/).
This can be seen in the following example:
```python
suite.generate_graphs(plot_name="lineplot", x="nb_threads", y="duration", hue="elements")
```
This example will generate a [line plot](https://seaborn.pydata.org/generated/seaborn.lineplot.html) for every campaign in the given suite, where the `x`-axis shows the number of threads used, the `y`-axis shows the time it took to finish the experiment, and a separate line, in a different color, is drawn for each value of the variable `elements`.

If you want to generate a different plot after finishing your experiments, without rerunning them, you can use the `generate_chart_from_single_csv` function.
This function takes the same arguments as `generate_graph`, plus the following extra arguments:
* `csv_pathname`
  * Type: `PathType`
  * The path to the `CSV` file from which the data is read
* `output_dir`
  * Type: `PathType`
  * Default: `"/tmp/figs"`
  * The directory in which the new graph is placed
* `prefix`
  * Type: `str`
  * Default: `""`
  * A prefix for the name of the generated file; the eventual filename will be `f"benchkit-{prefix}{timestamp}-{figure_id}.png"`
* `nan_replace`
  * Type: `bool`
  * Default: `True`
  * If `True`, replace all the `None`s in the data with `NaN`
* `process_dataframe`
  * Type: `DataframeProcessor`
  * Default: `identical_dataframe`
  * A function that can modify the dataframe before it is used

This means you can generate a new graph from an existing results file, without
having to rerun your experiment, using the following code:
```python
from benchkit.lwchart import generate_chart_from_single_csv

generate_chart_from_single_csv(
    "results/<benchmark file>.csv",
    plot_name="histplot",
    prefix="important_experiment-",
    output_dir="results/",
)
```
Note that this graph will not include the name of the campaign that was run;
if you want to add it, set the `prefix` argument.