Big Documentation Update + align naming of configspace for yaml usage
danrgll committed Feb 10, 2024
1 parent 8a0ebd8 commit 01fe73f
Showing 23 changed files with 525 additions and 116 deletions.
26 changes: 13 additions & 13 deletions README.md
@@ -50,9 +50,10 @@ pip install neural-pipeline-search

Using `neps` always follows the same pattern:

1. Define a `run_pipeline` function capable of evaluating different architectural and/or hyperparameter configurations
   for your problem.
2. Define a search space named `pipeline_space` for those parameters, e.g., via a dictionary.
3. Call `neps.run` to optimize `run_pipeline` over `pipeline_space`.

In code, the usage pattern can look like this:

@@ -69,20 +70,20 @@ def run_pipeline(
model = MyModel(architecture_parameter)

# Train and evaluate the model with your training pipeline
validation_error, training_error = train_and_eval(
model, hyperparameter_a, hyperparameter_b
)

return { # dict or float(validation error)
"loss": validation_error,
"info_dict": {
"test_error": test_error
"training_error": training_error
# + Other metrics
},
}


# 2. Define a search space of parameters; use the same names for the parameters as in run_pipeline
pipeline_space = dict(
hyperparameter_b=neps.IntegerParameter(
lower=1, upper=42, is_fidelity=True
@@ -111,20 +112,19 @@ if __name__ == "__main__":
## Examples

Discover how NePS works through these practical examples:
* **[Pipeline Space via YAML](neps_examples/basic_usage/defining_search_space)**: Explore how to define the `pipeline_space` using a YAML file instead of a dictionary (a minimal sketch of the call follows this list).

* **[Hyperparameter Optimization (HPO)](neps_examples/basic_usage/hyperparameters.py)**: Learn the essentials of hyperparameter optimization with NePS.

* **[Architecture Search with Primitives](neps_examples/basic_usage/architecture.py)**: Dive into architecture search using primitives in NePS.

* **[Multi-Fidelity Optimization](neps_examples/efficiency/multi_fidelity.py)**: Understand how to leverage multi-fidelity optimization for efficient model tuning.

* **[Utilizing Expert Priors for Hyperparameters](neps_examples/efficiency/expert_priors_for_hyperparameters.py)**: Learn how to incorporate expert priors for more efficient hyperparameter selection.

* **[Additional NePS Examples](neps_examples/)**: Explore more examples, including various use cases and advanced configurations in NePS.
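
For the YAML route, here is a minimal sketch of the call site, assuming `run_pipeline` is defined as in the snippet above. The file name `pipeline_space.yaml` is a hypothetical placeholder; the exact YAML schema is shown in the linked example.

```python
import neps

# Hypothetical file name; see the linked example for the exact YAML schema.
neps.run(
    run_pipeline=run_pipeline,             # defined as in the snippet above
    pipeline_space="pipeline_space.yaml",  # path to a YAML file instead of a dict
    root_directory="path/to/save/results",
    max_evaluations_total=100,
)
```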


## Documentation

For more details and features, please have a look at our [documentation](https://automl.github.io/neps/latest/).
41 changes: 28 additions & 13 deletions docs/README.md
@@ -1,18 +1,33 @@
# Neural Pipeline Search (NePS)

[![PyPI version](https://img.shields.io/pypi/v/neural-pipeline-search?color=informational)](https://pypi.org/project/neural-pipeline-search/)
[![Python versions](https://img.shields.io/pypi/pyversions/neural-pipeline-search)](https://pypi.org/project/neural-pipeline-search/)
[![License](https://img.shields.io/pypi/l/neural-pipeline-search?color=informational)](LICENSE)
[![Tests](https://github.com/automl/neps/actions/workflows/tests.yaml/badge.svg)](https://github.com/automl/neps/actions)

Welcome to NePS, a powerful and flexible Python library for hyperparameter optimization (HPO) and neural architecture search (NAS). Its primary goal: enable HPO adoption in practice for deep learners!

NePS houses recently published and well-established algorithms that can all be run massively parallel on any distributed setup, with tools to analyze runs, restart runs, etc.

## Key Features

In addition to the common features offered by traditional HPO and NAS libraries, NePS stands out with the following key features:

1. [**Hyperparameter Optimization (HPO) With Prior Knowledge:**](neps_examples/template/priorband_template.py)
    - NePS excels in efficiently tuning hyperparameters using algorithms that enable users to make use of their prior knowledge within the search space. It leverages the insights presented in:
- [PriorBand: Practical Hyperparameter Optimization in the Age of Deep Learning](https://arxiv.org/abs/2306.12370)
- [πBO: Augmenting Acquisition Functions with User Beliefs for Bayesian Optimization](https://arxiv.org/abs/2204.11051)

2. [**Neural Architecture Search (NAS) With Context-free Grammar Search Spaces:**](neps_examples/basic_usage/architecture.py)
    - NePS is equipped to handle context-free grammar search spaces, providing advanced capabilities for designing and optimizing architectures. It leverages the insights presented in:
- [Construction of Hierarchical Neural Architecture Search Spaces based on Context-free Grammars](https://arxiv.org/abs/2211.01842)

3. [**Easy Parallelization:**](docs/parallelization.md)
    - NePS simplifies the parallelization of optimization tasks, whether experiments run on a single machine or in a distributed computing environment.

4. [**Resume Runs After Termination:**](docs/parallelization.md)
- NePS allows users to easily resume optimization runs after termination, providing a convenient and efficient workflow for long-running experiments.

5. [**Seamless User Code Integration:**](neps_examples/template/)
- NePS's modular design ensures flexibility and extensibility. Integrate NePS effortlessly into existing machine learning workflows.
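
As a rough illustration of the first feature, here is a minimal sketch of encoding prior knowledge over a hyperparameter, assuming the `default`/`default_confidence` arguments used in the PriorBand template (the parameter name is hypothetical):

```python
import neps

# A sketch only: `default` encodes your best guess for a good value, and
# `default_confidence` states how strongly the optimizer should trust that guess.
pipeline_space = dict(
    learning_rate=neps.FloatParameter(
        lower=1e-5,
        upper=1e-1,
        log=True,
        default=1e-3,                 # assumed name for the prior's location
        default_confidence="medium",  # assumed name for the prior's strength
    ),
)
```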
101 changes: 101 additions & 0 deletions docs/getting_started.md
@@ -0,0 +1,101 @@
# Getting Started

Getting started with NePS involves a straightforward yet powerful process, centered on its three main components.
This approach ensures flexibility and efficiency in evaluating different architecture and hyperparameter configurations
for your problem.

## The 3 Main Components
1. **Define a [`run_pipeline`](https://automl.github.io/neps/latest/run_pipeline) Function**: This function is essential
for evaluating different configurations. You'll implement the specific logic for your problem within this function.
For detailed instructions on initializing and effectively using `run_pipeline`, refer to the guide.

2. **Establish a [`pipeline_space`](https://automl.github.io/neps/latest/pipeline_space)**: Your search space for
defining parameters. You can structure this in various formats, including dictionaries, YAML, or ConfigSpace.
The guide offers insights into defining and configuring your search space.

3. **Execute with [`neps.run`](https://automl.github.io/neps/latest/neps_run)**: Optimize your `run_pipeline` over
the `pipeline_space` using this function. For a thorough overview of the arguments and their explanations,
check out the detailed documentation.

By following these steps and utilizing the extensive resources provided in the guides, you can tailor NePS to meet
your specific requirements, ensuring a streamlined and effective optimization process.

## Basic Usage
In code, the usage pattern can look like this:

```python
import neps
import logging


# 1. Define a function that accepts hyperparameters and computes the validation error
def run_pipeline(
    hyperparameter_a: float, hyperparameter_b: int, architecture_parameter: str
) -> dict:
    # insert here your own model
    model = MyModel(architecture_parameter)

    # insert here your training/evaluation pipeline
    validation_error, training_error = train_and_eval(
        model, hyperparameter_a, hyperparameter_b
    )

    return {  # dict or float (validation error)
        "loss": validation_error,
        "info_dict": {
            "training_error": training_error
            # + Other metrics
        },
    }


# 2. Define a search space of the parameters of interest; ensure that the names
# are consistent with those defined in the run_pipeline function
pipeline_space = dict(
    hyperparameter_b=neps.IntegerParameter(
        lower=1, upper=42, is_fidelity=True
    ),  # Mark 'is_fidelity' as True for a multi-fidelity approach.
    hyperparameter_a=neps.FloatParameter(
        lower=0.001, upper=0.1, log=True
    ),  # If True, the search space is sampled in log space.
    architecture_parameter=neps.CategoricalParameter(
        ["option_a", "option_b", "option_c"]
    ),
)

if __name__ == "__main__":
    # 3. Run the NePS optimization
    logging.basicConfig(level=logging.INFO)
    neps.run(
        run_pipeline=run_pipeline,
        pipeline_space=pipeline_space,
        root_directory="path/to/save/results",  # Replace with the actual path.
        max_evaluations_total=100,
        searcher="hyperband",  # Optional: specifies the search strategy;
        # otherwise NePS decides based on your search space.
    )
```

## Examples

Discover the features of NePS through these practical examples:

* **[Hyperparameter Optimization (HPO)](https://github.com/automl/neps/blob/master/neps_examples/template/basic_template.py)**: Learn the essentials of hyperparameter optimization with NePS.

* **[Architecture Search with Primitives](https://github.com/automl/neps/tree/master/neps_examples/basic_usage/architecture.py)**: Dive into architecture search using primitives in NePS.

* **[Multi-Fidelity Optimization](https://github.com/automl/neps/tree/master/neps_examples/efficiency/multi_fidelity.py)**: Understand how to leverage multi-fidelity optimization for efficient model tuning.

* **[Utilizing Expert Priors for Hyperparameters](https://github.com/automl/neps/blob/master/neps_examples/template/priorband_template.py)**: Learn how to incorporate expert priors for more efficient hyperparameter selection.

* **[Additional NePS Examples](https://github.com/automl/neps/tree/master/neps_examples/)**: Explore more examples, including various use cases and advanced configurations in NePS.
24 changes: 24 additions & 0 deletions docs/installation.md
@@ -0,0 +1,24 @@
# Installation

## Prerequisites

Ensure you have Python version 3.8, 3.9, 3.10, or 3.11 installed. NePS installation will automatically handle
any additional dependencies via pip.

## Install from pip

```bash
pip install neural-pipeline-search
```
> Note: As indicated with the `v0.x.x` version number, NePS is early-stage code and APIs might change in the future.

## Install from source

!!! note
    We use [poetry](https://python-poetry.org/docs/) to manage dependencies.

```bash
git clone https://github.com/automl/neps.git
cd neps
poetry install --no-dev
```
100 changes: 100 additions & 0 deletions docs/neps_run.md
@@ -0,0 +1,100 @@
# Configuring and Running Optimizations

The `neps.run` function is the core of the NePS optimization process, where the search for the best hyperparameters
and architectures takes place. This document outlines the arguments and options available within this function,
providing a detailed guide to customize the optimization process to your specific needs.

## Search Strategy
By default, NePS intelligently selects the most appropriate search strategy based on the configurations you
define in `pipeline_space`. The characteristics of your search space play a crucial role in determining which
optimizer NePS chooses, ensuring that the strategy aligns with the specific requirements and nuances of your
hyperparameter and/or architecture optimization. You can also manually select a specific or custom optimizer
that better matches your needs. For more information, refer [here](https://automl.github.io/neps/latest/optimizers).

## Arguments

### Mandatory Arguments
- **`run_pipeline`** (function): The objective function, which NePS minimizes by evaluating various
  configurations. It receives a configuration as input and should return either a dictionary or a sole loss
  value as output. For correct setup instructions, refer to [here](https://automl.github.io/neps/latest/run_pipeline).
- **`pipeline_space`** (dict | yaml | configspace): This defines the search space from which the optimizer
  samples configurations. It accepts either a dictionary with the configuration names as keys, a path to a YAML
  configuration file, or a `ConfigSpace.ConfigurationSpace` object (see the sketch after the budget options
  below). For comprehensive information and examples, please refer to the detailed guide available
  [here](https://automl.github.io/neps/latest/pipeline_space).

- **`root_directory`** (str): The directory path where the information about the optimization and its progress gets
stored. This is also used to synchronize multiple calls to run(.) for parallelization.

- **Budget**:
To define a budget, provide either or both of the following parameters:

- **`max_evaluations_total`** (int, default: None): Specifies the total number of evaluations to conduct before
halting the optimization process.
    - **`max_cost_total`** (int, default: None): Prevents the initiation of new evaluations once this cost
    threshold is surpassed. This requires adding a cost value to the output of the `run_pipeline` function,
    for example, `return {'loss': loss, 'cost': cost}`. For more details, please refer
    [here](https://automl.github.io/neps/latest/run_pipeline).
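
Here is a sketch tying these mandatory arguments together, with `pipeline_space` given as a
`ConfigSpace.ConfigurationSpace` object and a cost budget. The objective and all paths are stand-ins,
not the definitive API surface:

```python
import time

import neps
from ConfigSpace import ConfigurationSpace

# One of the three accepted pipeline_space formats; a dict or a YAML path works too.
pipeline_space = ConfigurationSpace({"learning_rate": (1e-4, 1e-1)})


def run_pipeline(learning_rate: float) -> dict:
    start = time.time()
    loss = (learning_rate - 0.01) ** 2  # stand-in objective; replace with training
    # Returning a cost alongside the loss is what enables max_cost_total.
    return {"loss": loss, "cost": time.time() - start}


neps.run(
    run_pipeline=run_pipeline,
    pipeline_space=pipeline_space,
    root_directory="results/cost_budget_run",  # placeholder path
    max_cost_total=60,  # stop starting evaluations once accumulated cost exceeds 60
)
```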

### Optional Arguments
##### Further Monitoring Options
- **`overwrite_working_directory`** (bool, default: False): When set to True, the working directory specified
  by `root_directory` will be cleared at the beginning of the run. This is useful, for example, when debugging
  a `run_pipeline` function.
- **`post_run_summary`** (bool, default: False): When enabled, this option generates a summary CSV file upon
  completion of the optimization process. The summary includes details of the optimization procedure, such as
  the best configuration, the number of errors that occurred, and the final performance metrics.
- **`development_stage_id`** (int | float | str, default: None): An optional identifier used when working with
multiple development stages. Instead of creating new root directories, use this identifier to save the results
of an optimization run in a separate dev_id folder within the root_directory.
- **`task_id`** (int | float | str, default: None): An optional identifier used when the optimization process
involves multiple tasks. This functions similarly to `development_stage_id`, but it creates a folder named
after the task_id instead of dev_id, providing an organized way to separate results for different tasks within
the `root_directory`.
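
A sketch of these monitoring options in use, reusing `run_pipeline` and `pipeline_space` from the sketch
above (the paths and stage identifier are hypothetical):

```python
neps.run(
    run_pipeline=run_pipeline,
    pipeline_space=pipeline_space,
    root_directory="results/my_experiment",  # placeholder path
    max_evaluations_total=20,
    overwrite_working_directory=True,  # clear the directory first, e.g. while debugging
    post_run_summary=True,             # write a summary CSV once the run finishes
    development_stage_id="stage_1",    # results land in a separate dev_id subfolder
)
```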
##### Parallelization Setup
- **`max_evaluations_per_run`** (int, default: None): Limits the number of evaluations for this specific call of
`neps.run`.
- **`continue_until_max_evaluation_completed`** (bool, default: False): In parallel setups, pending evaluations
  normally count towards `max_evaluations_total`, halting new ones when this limit is reached. Setting this to
  True enables continuous sampling of new evaluations until the total of completed ones meets
  `max_evaluations_total`, optimizing resource use in time-sensitive scenarios.

For an overview and further resources on how NePS supports parallelization in distributed systems, refer to
the [Parallelization Overview](#parallelization).
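
A sketch of these two options for one worker in a parallel setup, reusing the earlier definitions:

```python
neps.run(
    run_pipeline=run_pipeline,
    pipeline_space=pipeline_space,
    root_directory="results/parallel_run",  # placeholder; shared across workers
    max_evaluations_total=100,                     # global target across all workers
    max_evaluations_per_run=10,                    # this worker stops after 10 evaluations
    continue_until_max_evaluation_completed=True,  # only completed evaluations count
)
```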
##### Handling Errors
- **`loss_value_on_error`** (float, default: None): When set, any error encountered in an evaluated configuration
will not halt the process; instead, the specified loss value will be used for that configuration.
- **`cost_value_on_error`** (float, default: None): Similar to `loss_value_on_error`, but for the cost value.
- **`ignore_errors`** (bool, default: False): If True, errors encountered during the evaluation of configurations
  will be ignored, and the optimization will continue. Note: these error configurations still count towards
  `max_evaluations_total`.
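
A sketch of the error-handling options, assuming `run_pipeline` may raise (e.g., an out-of-memory error) for
some configurations:

```python
neps.run(
    run_pipeline=run_pipeline,  # as defined above; may raise for some configs
    pipeline_space=pipeline_space,
    root_directory="results/robust_run",  # placeholder path
    max_evaluations_total=100,
    loss_value_on_error=1.0,  # imputed loss for configurations that raise
    cost_value_on_error=0.0,  # imputed cost when a cost budget is in use
)
```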
##### Search Strategy Customization
- **`searcher`** (Literal["bayesian_optimization", "hyperband", ...] | BaseOptimizer, default: "default"):
  Manually specifies which optimization strategy to use. Provide a string identifying one of the built-in
  search strategies or an instance of a custom `BaseOptimizer`.
- **`searcher_path`** (Path | str, default: None): A path to a custom searcher implementation.
- **`**searcher_kwargs`**: Additional keyword arguments to be passed to the searcher.

For more information about the available searchers and how to customize your own, refer
[here](https://automl.github.io/neps/latest/optimizers).
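
For instance, a minimal sketch of pinning the strategy to one of the built-in searchers named above, reusing
the earlier definitions:

```python
neps.run(
    run_pipeline=run_pipeline,
    pipeline_space=pipeline_space,
    root_directory="results/bo_run",  # placeholder path
    max_evaluations_total=50,
    searcher="bayesian_optimization",  # one of the built-in strategies
)
```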
##### Others
- **`pre_load_hooks`** (Iterable, default: None): A list of hook functions to be called before loading results.

## Parallelization

`neps.run` can be called multiple times, from multiple processes or machines, to parallelize the optimization
process. Ensure that `root_directory` points to a shared location across all instances to synchronize the
optimization efforts. For more information, [look here](https://automl.github.io/neps/latest/parallelization).
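
Below is a minimal, self-contained sketch of such a worker script: launching this same script several times
(as separate processes or on separate machines) parallelizes the run, as long as every instance points at the
same shared `root_directory`. Paths and the toy objective are placeholders:

```python
# optimize.py -- start this script once per worker; NePS coordinates the workers
# through the state it keeps in the shared root_directory.
import neps


def run_pipeline(learning_rate: float) -> float:
    # Placeholder objective; replace with your training/evaluation code.
    return (learning_rate - 0.01) ** 2


pipeline_space = dict(
    learning_rate=neps.FloatParameter(lower=1e-4, upper=1e-1, log=True),
)

if __name__ == "__main__":
    neps.run(
        run_pipeline=run_pipeline,
        pipeline_space=pipeline_space,
        root_directory="/shared/fs/neps_results",  # identical for every worker
        max_evaluations_total=50,                  # global budget across workers
    )
```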

## Customization

The `neps.run` function allows for extensive customization through its arguments, enabling you to adapt the
optimization process to the complexities of your specific problems.

For a deeper understanding of how to use `neps.run` in a practical scenario, take a look at our
[examples and templates](https://github.com/automl/neps/tree/master/neps_examples).
