# Benchmark Metadata Reference

This document is the reference for fields supported in:

```python
client.alpha.benchmarks.register(..., metadata={...})
```

It covers:

- `garak_config` (detailed command config)
- shield fields (`shield_ids`, `shield_config`)
- runtime controls (`timeout`, remote-only retry/GPU keys)
- deep-merge behavior when updating predefined/existing benchmarks

## 1) Metadata Shape

```python
metadata = {
    "garak_config": {
        "system": {...},
        "run": {...},
        "plugins": {...},
        "reporting": {...},
    },
    "timeout": 1800,
    "shield_ids": ["Prompt-Guard-86M"],  # or use shield_config
    "max_retries": 3,  # remote mode only
    "use_gpu": False,  # remote mode only
}
```

If `garak_config` is omitted, the provider falls back to the default Garak configuration (effectively a broad/default probe selection), which can be very slow.

### 1.1 Build `garak_config` via Python models (optional)

You can construct the config using typed models exported by this package:

```python
from llama_stack_provider_trustyai_garak import (
    GarakCommandConfig,
    GarakSystemConfig,
    GarakRunConfig,
    GarakPluginsConfig,
    GarakReportingConfig,
)
```

Example:

```python
garak_cfg = GarakCommandConfig(
    system=GarakSystemConfig(parallel_attempts=20),
    run=GarakRunConfig(generations=2, eval_threshold=0.5),
    plugins=GarakPluginsConfig(probe_spec=["promptinject.HijackHateHumans"]),
    reporting=GarakReportingConfig(taxonomy="owasp"),
)

metadata = {
    "garak_config": garak_cfg.to_dict(),
    "timeout": 900,
}
```

## 2) Top-Level Metadata Keys

| Key | Type | Default | Mode | Notes |
|---|---|---|---|---|
| `garak_config` | `dict` | default `GarakCommandConfig()` | inline + remote | Main Garak command schema. Setting it explicitly is recommended. |
| `timeout` | `int` (seconds) | provider default (`10800`) | inline + remote | Maximum scan runtime for a benchmark run. |
| `shield_ids` | `list[str]` | `[]` | inline + remote | Shortcut for input shields only. |
| `shield_config` | `dict` | `{}` | inline + remote | Explicit mapping: `{"input": [...], "output": [...]}`. |
| `max_retries` | `int` | `3` | remote only | KFP pipeline retry count for the scan step. |
| `use_gpu` | `bool` | `False` | remote only | Requests GPU scheduling in the KFP pipeline. |

Notes:

- If both `shield_ids` and `shield_config` are provided, `shield_ids` takes precedence.
- Unknown top-level keys are passed through as provider params but are ignored unless adapter logic consumes them.

## 3) Shield Metadata Rules

### `shield_ids`

```python
"shield_ids": ["Prompt-Guard-86M"]
```

- Must be a list.
- Treated as input shields.
- Easier syntax for common cases.

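Conceptually, `shield_ids` is shorthand for an input-only `shield_config`, and it wins when both keys are present. A minimal sketch of that normalization (the function name is illustrative, not the provider's actual code):

```python
def normalize_shields(metadata: dict) -> dict:
    """Illustrative only: expand the shield_ids shorthand into the explicit
    input/output mapping, honoring shield_ids precedence over shield_config."""
    if "shield_ids" in metadata:
        # shield_ids entries are treated as input shields; no output shields.
        return {"input": list(metadata["shield_ids"]), "output": []}
    return metadata.get("shield_config", {})

print(normalize_shields({"shield_ids": ["Prompt-Guard-86M"]}))
# {'input': ['Prompt-Guard-86M'], 'output': []}
```
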
### `shield_config`

```python
"shield_config": {
    "input": ["Prompt-Guard-86M"],
    "output": ["Llama-Guard-3-8B"]
}
```

- Must be a dictionary.
- Use when you need separate input/output shield chains.

Validation behavior:

- The provider validates shield IDs against the Shields API.
- If the Shields API is not enabled and shield metadata is present, the run fails.

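Because an unknown shield ID fails the run, a pre-flight check can surface the problem before registering. A sketch, assuming a llama-stack client whose Shields API exposes `client.shields.list()` returning objects with an `identifier` attribute (verify against your client version):

```python
def validate_shield_ids(client, shield_ids):
    """Illustrative pre-flight check (not provider code): raise a clear
    error for unknown shield IDs instead of letting the scan run fail."""
    registered = {s.identifier for s in client.shields.list()}
    missing = [sid for sid in shield_ids if sid not in registered]
    if missing:
        raise ValueError(
            f"Unknown shield IDs: {missing}; registered: {sorted(registered)}"
        )
```

Call it with your metadata before `client.alpha.benchmarks.register(...)`, e.g. `validate_shield_ids(client, metadata.get("shield_ids", []))`.
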
## 4) `garak_config` Detailed Schema

`garak_config` has four primary sections:

- `system`
- `run`
- `plugins`
- `reporting`

### 4.1 `garak_config.system`

| Field | Type | Default | Description |
|---|---|---|---|
| `parallel_attempts` | `bool \| int` | `16` | Parallel prompt attempts where supported. |
| `max_workers` | `int` | `500` | Upper bound for requested worker count. |
| `parallel_requests` | `bool \| int` | `False` | Parallel requests for generators lacking multi-response support. |
| `verbose` | `int` (`0..2`) | `0` | CLI verbosity. |
| `show_z` | `bool` | `False` | Show Z-scores in CLI output. |
| `narrow_output` | `bool` | `False` | Improve output for narrow terminals. |
| `lite` | `bool` | `True` | Lite mode caution output behavior. |
| `enable_experimental` | `bool` | `False` | Enable experimental Garak flags. |

### 4.2 `garak_config.run`

| Field | Type | Default | Description |
|---|---|---|---|
| `generations` | `int` | `1` | Number of generations per prompt. |
| `probe_tags` | `str \| None` | `None` | Tag-based probe selection (e.g. `owasp:llm`). |
| `eval_threshold` | `float` (`0..1`) | `0.5` | Detector threshold for the hit/vulnerable decision. |
| `soft_probe_prompt_cap` | `int` | `256` | Preferred prompt cap for autoscaling probes. Lower values reduce prompts per probe and make runs faster, at the cost of coverage. |
| `target_lang` | `str \| None` | `None` | BCP47 language target. |
| `langproviders` | `list[str] \| None` | `None` | Providers for language conversion. |
| `system_prompt` | `str \| None` | `None` | Default system prompt where applicable. |
| `seed` | `int \| None` | `None` | Reproducibility seed. |
| `deprefix` | `bool` | `True` | Remove the prompt prefix echoed by model outputs. |

Performance tuning tip:

- Predefined benchmarks are comprehensive by default.
- To speed up exploratory runs, override `garak_config.run.soft_probe_prompt_cap` with a smaller value.
- For full security assessments or cross-run comparability, keep the defaults (or use a consistent cap across the runs you compare).

### 4.3 `garak_config.plugins`

| Field | Type | Default | Description |
|---|---|---|---|
| `probe_spec` | `list[str] \| str` | `"all"` | Probe/module/class selection. |
| `detector_spec` | `list[str] \| str \| None` | `None` | Detector override (`None` uses probe defaults). |
| `extended_detectors` | `bool` | `True` | Include extended detector set. |
| `buff_spec` | `list[str] \| str \| None` | `None` | Buff/module selection. |
| `buffs_include_original_prompt` | `bool` | `True` | Keep original prompt when buffing. |
| `buff_max` | `int \| None` | `None` | Cap output count from buffs. |
| `target_type` | `str` | auto-managed | Provider sets this for openai/function mode. |
| `target_name` | `str \| None` | auto-managed | Provider sets this to the model or the shield orchestrator. |
| `probes` | `dict \| None` | `None` | Probe plugin config tree. |
| `detectors` | `dict \| None` | `None` | Detector plugin config tree. |
| `generators` | `dict \| None` | `None` | Generator plugin config tree. |
| `buffs` | `dict \| None` | `None` | Buff plugin config tree. |
| `harnesses` | `dict \| None` | `None` | Harness plugin config tree. |

Provider behavior worth knowing:

- `probe_spec`, `detector_spec`, and `buff_spec` accept a string or a list, and are normalized before the run.
- If shield metadata is present, the provider automatically switches the generator to function-based shield orchestration.
- Otherwise the provider uses OpenAI-compatible generator mode.

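The string-or-list normalization can be pictured as follows (a sketch under the assumption that Garak ultimately consumes a comma-separated spec string; the provider's internal logic may differ):

```python
def normalize_spec(spec):
    """Illustrative: accept either a string spec or a list of probe/detector
    names and produce a single comma-separated spec string."""
    if spec is None:
        return None
    if isinstance(spec, str):
        return spec
    return ",".join(spec)

print(normalize_spec(["promptinject", "dan.Dan_11_0"]))
# promptinject,dan.Dan_11_0
```
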
### 4.4 `garak_config.reporting`

| Field | Type | Default | Description |
|---|---|---|---|
| `taxonomy` | `str \| None` | `None` | Grouping taxonomy (`owasp`, `avid-effect`, `quality`, `cwe`). |
| `show_100_pass_modules` | `bool` | `True` | Include fully passing entries in HTML report details. |
| `show_top_group_score` | `bool` | `True` | Show top-level aggregate in grouped report sections. |
| `group_aggregation_function` | `str` | `"lower_quartile"` | Group aggregation strategy in the report. |
| `report_dir` | `str \| None` | auto-managed | Provider-managed output location; usually leave unset. |
| `report_prefix` | `str \| None` | auto-managed | Provider-managed output prefix; usually leave unset. |

Please refer to the [Garak configuration docs](https://reference.garak.ai/en/latest/configurable.html#config-files-yaml-and-json) for details about these controls.

## 5) Deep-Merge Behavior (Updating Predefined/Existing Benchmarks)

When registering with `provider_benchmark_id`, metadata is deep-merged:

- base metadata comes from:
  - a predefined profile (`trustyai_garak::...`), or
  - the existing benchmark's metadata
- your new metadata overrides only the keys you specify

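The merge semantics resemble a recursive dictionary merge: nested dicts merge key by key, while any other value is replaced outright. A minimal sketch (not the provider's actual implementation):

```python
def deep_merge(base: dict, override: dict) -> dict:
    """Illustrative deep merge: nested dicts merge key by key; any other
    value in `override` replaces the `base` value outright."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged

base = {
    "garak_config": {"run": {"generations": 1}, "system": {"lite": True}},
    "timeout": 10800,
}
override = {"garak_config": {"run": {"generations": 2}}, "timeout": 1200}
merged = deep_merge(base, override)
# merged keeps system.lite from the base, while run.generations and
# timeout come from the override.
```
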
Example:

```python
client.alpha.benchmarks.register(
    benchmark_id="quick_promptinject_tuned",
    dataset_id="garak",
    scoring_functions=["garak_scoring"],
    provider_id=garak_provider_id,
    provider_benchmark_id="trustyai_garak::quick",
    metadata={
        "garak_config": {
            "plugins": {"probe_spec": ["promptinject"]},
            "system": {"parallel_attempts": 20},
        },
        "timeout": 1200,
    },
)
```

## 6) Practical Examples

### Example A: Minimal custom benchmark

```python
metadata = {
    "garak_config": {
        "plugins": {"probe_spec": ["promptinject.HijackHateHumans"]},
        "run": {"generations": 2, "eval_threshold": 0.5},
        "reporting": {"taxonomy": "owasp"},
    },
    "timeout": 900,
}
```

### Example B: Explicit input/output shield mapping

```python
metadata = {
    "garak_config": {
        "plugins": {"probe_spec": ["promptinject.HijackHateHumans"]},
    },
    "shield_config": {
        "input": ["Prompt-Guard-86M"],
        "output": ["Llama-Guard-3-8B"],
    },
    "timeout": 600,
}
```

### Example C: Remote retry/GPU controls

```python
metadata = {
    "garak_config": {
        "run": {"probe_tags": "owasp:llm"},
    },
    "timeout": 7200,
    "max_retries": 2,
    "use_gpu": True,
}
```

### Example D: Faster predefined benchmark variant

```python
metadata = {
    "garak_config": {
        "run": {
            "soft_probe_prompt_cap": 100,
        },
    },
    "timeout": 7200,
}

# Register as a tuned variant of a predefined benchmark
client.alpha.benchmarks.register(
    benchmark_id="owasp_fast",
    dataset_id="garak",
    scoring_functions=["garak_scoring"],
    provider_id=garak_provider_id,
    provider_benchmark_id="trustyai_garak::owasp_llm_top10",
    metadata=metadata,
)
```

## 7) Legacy / Compatibility Notes

- Prefer `metadata.garak_config.plugins.probe_spec` over the old top-level `metadata.probes`.
- Prefer `metadata.garak_config.run.eval_threshold` for threshold control.
- Keep benchmark metadata focused on benchmark/run concerns.
  KFP control-plane settings such as `experiment_name` belong in provider config (`kubeflow_config.experiment_name`, environment: `KUBEFLOW_EXPERIMENT_NAME`), not benchmark metadata.
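As a side-by-side sketch of the first two bullets (the legacy key shape shown here is illustrative; only `metadata.probes` is named above as the old key):

```python
# Legacy shape (illustrative): probe selection as a top-level metadata key.
legacy_metadata = {"probes": ["promptinject"]}

# Preferred shape: probe selection and threshold live under garak_config.
metadata = {
    "garak_config": {
        "plugins": {"probe_spec": ["promptinject"]},
        "run": {"eval_threshold": 0.5},
    },
}
```
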