Pipeline_space as yaml input #46

Merged
merged 24 commits into from
Dec 27, 2023
4cf0981
add yaml to dict function and make it callable in api run
danrgll Nov 20, 2023
c1a0402
add tests, adapt pipeline_space_from_yaml(), fix some errors raised b…
danrgll Nov 25, 2023
65e340f
enable to run tests for yaml
danrgll Nov 25, 2023
155ed07
Merge branch 'master' into yaml_config_space
danrgll Nov 25, 2023
a62c1cc
adapt tests and documentation for yaml_search_space
danrgll Nov 25, 2023
c41a346
adapt tests and documentation for yaml_search_space
danrgll Nov 25, 2023
505519b
add documentation for config_space
danrgll Nov 25, 2023
2ac8118
Merge branch 'master' into yaml_config_space
danrgll Nov 27, 2023
5096129
clean up tests
danrgll Nov 27, 2023
e8d586c
add usage of e and 10^ for exponent format as input for yaml file + n…
danrgll Dec 3, 2023
f02e8da
fix issue regarding tests and search space from yaml file
danrgll Dec 4, 2023
3935c21
change tests to marker neps-api and resolve merge conflicts
danrgll Dec 4, 2023
04288ff
resolve merge conflicts
danrgll Dec 4, 2023
65a82e5
add examples + test adaptation to new functionalities + outsorcing ut…
danrgll Dec 5, 2023
9e1bff8
changes in yaml_search_space examples
danrgll Dec 5, 2023
40ce160
remove 10^ notation + introduce key checking for parameters + enable…
danrgll Dec 5, 2023
2723c09
made code more readable for validate parameter inputs + add tests + c…
danrgll Dec 6, 2023
5bcf6cf
merge master into branch
danrgll Dec 6, 2023
15d4cc8
fix naming of parameters in test
danrgll Dec 6, 2023
c70b85e
enable usage of Path object for yaml_file config_space
danrgll Dec 6, 2023
91d8a45
add type specification for arguments + add more detailed DocStrings f…
danrgll Dec 6, 2023
5d11639
fix format of Pipeline Space Documentation for mkdocs
danrgll Dec 7, 2023
075c79b
fix test cases
danrgll Dec 8, 2023
676ac4d
update to current master
danrgll Dec 27, 2023
158 changes: 158 additions & 0 deletions docs/pipeline_space.md
@@ -0,0 +1,158 @@
# Initializing the Search Space

In NePS, defining the Search Space is one of two essential tasks. You can define it through a Python dictionary,
a YAML file, or ConfigSpace. This section provides examples and instructions for all three methods.

## Option 1: Using a Python Dictionary

To define the Search Space using a Python dictionary, map each parameter name to a parameter object that specifies its range or value. For example:

```python
import neps

search_space = {
    "learning_rate": neps.FloatParameter(lower=0.00001, upper=0.1, log=True),
    "num_epochs": neps.IntegerParameter(lower=3, upper=30, is_fidelity=True),
    "optimizer": neps.CategoricalParameter(choices=["adam", "sgd", "rmsprop"]),
    "dropout_rate": neps.ConstantParameter(value=0.5),  # fixed value, not searched
}
```

## Option 2: Using a YAML File

Create a YAML file (e.g., `search_space.yaml`) with the parameter definitions, following this structure:

```yaml
search_space: # important to start with
  learning_rate:
    lower: 2e-3
    upper: 0.1
    log: true

  num_epochs:
    type: int # or "integer"
    lower: 3
    upper: 30
    is_fidelity: True

  optimizer:
    choices: ["adam", "sgd", "rmsprop"]

  dropout_rate:
    value: 0.5
...
```

Ensure your YAML file starts with `search_space:`.
This is the root key under which all parameter configurations are defined.
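
As a rough sketch of how such a file might be used: this PR lets `neps.run` accept the file location through `pipeline_space`, either as a string or as a `Path`. The helpers `write_search_space` and `launch` are hypothetical names for illustration, and the toy objective and the `max_evaluations_total` budget are assumptions about the surrounding API rather than something this section defines.

```python
import tempfile
from pathlib import Path

SEARCH_SPACE_YAML = """\
search_space:
  learning_rate:
    lower: 2e-3
    upper: 0.1
    log: true
  optimizer:
    choices: ["adam", "sgd", "rmsprop"]
"""


def write_search_space(directory: Path) -> Path:
    """Write the example YAML file and return its path (hypothetical helper)."""
    yaml_path = directory / "search_space.yaml"
    yaml_path.write_text(SEARCH_SPACE_YAML, encoding="utf-8")
    return yaml_path


def run_pipeline(learning_rate: float, optimizer: str) -> float:
    """Toy objective; a real pipeline would train a model and return its loss."""
    return learning_rate


def launch(yaml_path: Path) -> None:
    """Start a run from the YAML file; requires neps to be installed."""
    import neps  # imported lazily so the helpers above work without neps

    neps.run(
        run_pipeline=run_pipeline,
        root_directory="results/yaml_example",
        pipeline_space=yaml_path,  # a plain string path should work as well
        max_evaluations_total=5,  # assumed budget argument
    )


example_path = write_search_space(Path(tempfile.mkdtemp()))
# launch(example_path)  # uncomment once neps is installed
```

Keeping the search space in a file like this separates the experiment configuration from the training code, so the same script can be reused with different spaces.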

## Option 3: Using ConfigSpace

Users familiar with the ConfigSpace library can also define the Search Space through a
`ConfigurationSpace` object:

```python
from ConfigSpace import ConfigurationSpace, UniformFloatHyperparameter

configspace = ConfigurationSpace()
configspace.add_hyperparameter(
    UniformFloatHyperparameter("learning_rate", 0.00001, 0.1, log=True)
)
```

For additional information on ConfigSpace and its features, please visit the following link:
https://github.com/automl/ConfigSpace

## Supported Hyperparameter Types using a YAML File

### Float/Integer Parameter

- **Expected Arguments:**
    - `lower`: The minimum value of the parameter.
    - `upper`: The maximum value of the parameter.
        - Accepted Values: Int or Float, depending on the parameter type you wish to use.
- **Optional Arguments:**
    - `type`: Specifies the data type of the parameter.
        - Accepted Values: 'int', 'integer', or 'float'.
        - Note: If `type` is not specified, values in e notation are converted to float.
    - `log`: Boolean that indicates whether the parameter uses a logarithmic scale (default: False).
        - [Details on how YAML interprets Boolean Values](#important-note-on-yaml-data-type-interpretation)
    - `is_fidelity`: Boolean that marks the parameter as a fidelity parameter (default: False).
    - `default`: Sets a prior central value for the parameter (default: None).
        - Note: Currently, if you define a prior for one parameter, you must do so for all your variables.
    - `default_confidence`: Specifies the confidence level of the default value,
      indicating how strongly the prior
      should be considered (default: "low").
        - Accepted Values: 'low', 'medium', or 'high'.
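
The note on e notation can be illustrated with a small sketch. It assumes the YAML loader hands a value such as `2e-3` over as a string (PyYAML, for instance, does so because the number has no decimal point), so the value has to be converted explicitly. `convert_e_notation` is a hypothetical helper, not the function added in this PR.

```python
import re

# Digits, optional fraction, mandatory exponent: 2e-3, 1E5, 3.5e+2
_E_NOTATION = re.compile(r"[+-]?\d+(?:\.\d+)?[eE][+-]?\d+")


def convert_e_notation(value, as_int=False):
    """Convert an e-notation string to a number; pass other values through."""
    if isinstance(value, str) and _E_NOTATION.fullmatch(value.strip()):
        number = float(value)
        if as_int:
            # Only whole numbers can become integer bounds
            if not number.is_integer():
                raise ValueError(f"{value!r} is not an integer value")
            return int(number)
        return number
    return value


print(convert_e_notation("2e-3"))              # 0.002
print(convert_e_notation("1e3"))               # 1000.0 (float by default)
print(convert_e_notation("1e3", as_int=True))  # 1000
print(convert_e_notation("adam"))              # adam (left untouched)
```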

### Categorical Parameter

- **Expected Arguments:**
    - `choices`: A list of discrete options (int | float | str) that the parameter can take.
- **Optional Arguments:**
    - `type`: Specifies the data type of the parameter.
        - Accepted Values: 'cat' or 'categorical'.
    - `is_fidelity`: Marks the parameter as a fidelity parameter (default: False).
        - [Details on how YAML interprets Boolean Values](#important-note-on-yaml-data-type-interpretation)
    - `default`: Sets a prior central value for the parameter (default: None).
        - Note: Currently, if you define a prior for one parameter, you must do so for all your variables.
    - `default_confidence`: Specifies the confidence level of the default value,
      indicating how strongly the prior
      should be considered (default: "low").

### Constant Parameter

- **Expected Arguments:**
    - `value`: The fixed value (int | float | str) for the parameter.
- **Optional Arguments:**
    - `type`: Specifies the data type of the parameter.
        - Accepted Values: 'const' or 'constant'.
    - `is_fidelity`: Marks the parameter as a fidelity parameter (default: False).
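
Since `type` is optional for every parameter kind, the kind presumably has to be deduced from which keys are present. The following is a minimal sketch of that idea under stated assumptions: `infer_parameter_kind` and its return labels are hypothetical names, and the rule that two integer bounds imply an integer parameter is an assumption rather than the PR's confirmed behaviour.

```python
def infer_parameter_kind(name: str, spec: dict) -> str:
    """Return 'int', 'float', 'categorical' or 'constant' for one YAML entry."""
    aliases = {
        "int": "int", "integer": "int", "float": "float",
        "cat": "categorical", "categorical": "categorical",
        "const": "constant", "constant": "constant",
    }
    declared = spec.get("type")
    if declared is not None:
        if declared not in aliases:
            raise ValueError(f"{name}: unsupported type {declared!r}")
        return aliases[declared]
    if "lower" in spec and "upper" in spec:
        # Assumption: without an explicit type, two int bounds mean an integer parameter
        both_int = all(
            isinstance(spec[key], int) and not isinstance(spec[key], bool)
            for key in ("lower", "upper")
        )
        return "int" if both_int else "float"
    if "choices" in spec:
        return "categorical"
    if "value" in spec:
        return "constant"
    raise ValueError(f"{name}: cannot infer a parameter type from keys {sorted(spec)}")


print(infer_parameter_kind("learning_rate", {"lower": 0.002, "upper": 0.1, "log": True}))  # float
print(infer_parameter_kind("num_epochs", {"type": "integer", "lower": 3, "upper": 30}))    # int
print(infer_parameter_kind("optimizer", {"choices": ["adam", "sgd", "rmsprop"]}))          # categorical
print(infer_parameter_kind("dropout_rate", {"value": 0.5}))                                # constant
```

Raising an error for entries that match no pattern surfaces typos in key names early, instead of silently dropping a parameter.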

### Important Note on YAML Data Type Interpretation

When working with YAML files, it's essential to understand how the format interprets different data types:

1. **Strings in Quotes:**

    - Any value enclosed in single (`'`) or double (`"`) quotes is treated as a string.
    - Example: `"true"` and `'123'` are read as strings.

2. **Boolean Interpretation:**

    - Specific unquoted values are interpreted as booleans. This includes:
        - `true`, `True`, `TRUE`
        - `false`, `False`, `FALSE`
        - `on`, `On`, `ON`
        - `off`, `Off`, `OFF`
        - `yes`, `Yes`, `YES`
        - `no`, `No`, `NO`

3. **Numbers:**

    - Unquoted numeric values are interpreted as integers or floating-point numbers, depending on their format.
    - Example: `123` is an integer and `4.56` is a float. `1e3` can be read as either an integer or a
      floating-point number, depending on the `type` specified by the user. By default, `1e3` is treated
      as a floating-point number. This handling of e notation is specific to our system.

4. **Empty Values:**

    - A key with no value is treated as `null` in YAML. A quoted empty string `""` is still a string, as described in point 1.

5. **Unquoted Non-Boolean, Non-Numeric Strings:**

    - Unquoted values that don't match boolean patterns or numeric formats are treated as strings.
    - Example: `example` is a string.

Remember to use appropriate quotes and formats to ensure values are interpreted as intended.
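
The rules above can be summarised in a short sketch. This is a plain-Python illustration of the listed rules for a single raw scalar, not the resolver of any YAML library, and `interpret_scalar` is a hypothetical name.

```python
import re

_BOOLEANS = {"true": True, "false": False, "on": True, "off": False,
             "yes": True, "no": False}
_INT = re.compile(r"[+-]?\d+")
_FLOAT = re.compile(r"[+-]?(?:\d+\.\d*|\.\d+)")
_E_NOTATION = re.compile(r"[+-]?\d+(?:\.\d+)?[eE][+-]?\d+")


def interpret_scalar(text: str):
    """Apply the rules listed above to one raw YAML scalar."""
    text = text.strip()
    if len(text) >= 2 and text[0] == text[-1] and text[0] in "'\"":
        return text[1:-1]        # rule 1: quoted values stay strings
    if text == "":
        return None              # rule 4: a key with no value is null
    lowered = text.lower()
    # Rule 2: only the lower, Title and UPPER spellings count as booleans
    if lowered in _BOOLEANS and text in (lowered, text.title(), text.upper()):
        return _BOOLEANS[lowered]
    if _INT.fullmatch(text):
        return int(text)         # rule 3: integers
    if _FLOAT.fullmatch(text) or _E_NOTATION.fullmatch(text):
        return float(text)       # rule 3: floats, e notation defaults to float
    return text                  # rule 5: everything else is a string


for raw in ['"true"', "Yes", "123", "4.56", "1e3", "", "example"]:
    print(repr(raw), "->", repr(interpret_scalar(raw)))
```

A practical consequence is that a categorical choice such as `no` or `off` needs quotes if it is meant as a string.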

## Supported ArchitectureParameter Types

**Note**: Defining the Search Space from a YAML file currently supports Hyperparameter Types only.

If you are interested in exploring Architecture, particularly Hierarchical parameters, you can find detailed examples and usage in the following resources:

- [Basic Usage Examples](https://github.com/automl/neps/tree/master/neps_examples/basic_usage) - Basic usage
examples that can help you understand the fundamentals of Architecture parameters.

- [Experimental Examples](https://github.com/automl/neps/tree/master/neps_examples/experimental) - For more advanced and experimental use cases, including Hierarchical parameters, check out this collection of examples.
14 changes: 10 additions & 4 deletions neps/api.py
@@ -14,7 +14,11 @@
 from .optimizers import BaseOptimizer, SearcherMapping
 from .plot.tensorboard_eval import tblogger
 from .search_spaces.parameter import Parameter
-from .search_spaces.search_space import SearchSpace, pipeline_space_from_configspace
+from .search_spaces.search_space import (
+    SearchSpace,
+    pipeline_space_from_configspace,
+    pipeline_space_from_yaml,
+)
 from .status.status import post_run_csv
 from .utils.common import get_searcher_data, get_value
 from .utils.result_utils import get_loss
@@ -94,9 +98,8 @@ def write_loss_and_config(file_handle, loss_, config_id_, config_):
 def run(
     run_pipeline: Callable,
     root_directory: str | Path,
-    pipeline_space: dict[str, Parameter | CS.ConfigurationSpace]
-    | CS.ConfigurationSpace
-    | None = None,
+    pipeline_space: dict[str, Parameter | CS.ConfigurationSpace] | str | Path |
+    CS.ConfigurationSpace | None = None,
     overwrite_working_directory: bool = False,
     post_run_summary: bool = False,
     development_stage_id=None,
@@ -311,6 +314,9 @@ def _run_args(
         # Support pipeline space as ConfigurationSpace definition
         if isinstance(pipeline_space, CS.ConfigurationSpace):
             pipeline_space = pipeline_space_from_configspace(pipeline_space)
+        # Support pipeline space as YAML file
+        elif isinstance(pipeline_space, (str, Path)):
+            pipeline_space = pipeline_space_from_yaml(pipeline_space)

         # Support pipeline space as mix of ConfigurationSpace and neps parameters
         new_pipeline_space: dict[str, Parameter] = dict()
5 changes: 1 addition & 4 deletions neps/search_spaces/hyperparameters/categorical.py
@@ -2,11 +2,10 @@

 import random
 from copy import copy, deepcopy
-from typing import Iterable
+from typing import Iterable, Literal

 import numpy as np
 import numpy.typing as npt
-from typing_extensions import Literal

 from ..parameter import Parameter

@@ -32,9 +31,7 @@ def __init__(
         self.upper = default
         self.default_confidence_score = CATEGORICAL_CONFIDENCE_SCORES[default_confidence]
         self.has_prior = self.default is not None
-
         self.is_fidelity = is_fidelity
-
         self.choices = list(choices)
         self.num_choices = len(self.choices)
         self.probabilities: list[npt.NDArray] = list(
3 changes: 1 addition & 2 deletions neps/search_spaces/hyperparameters/float.py
@@ -2,10 +2,10 @@

 import math
 from copy import deepcopy
+from typing import Literal

 import numpy as np
 import scipy.stats
-from typing_extensions import Literal

 from .numerical import NumericalParameter

@@ -37,7 +37,6 @@ def __init__(

         if self.lower >= self.upper:
             raise ValueError("Float parameter: bounds error (lower >= upper).")
-
         self.log = log

         if self.log: