Commit d5f489d

Merge pull request #67 from trustyai-explainability/main
[pull] main from trustyai-explainability:main
2 parents d71ccd9 + fbd360a commit d5f489d

63 files changed

+14871
-4735
lines changed

BENCHMARK_METADATA_REFERENCE.md

Lines changed: 288 additions & 0 deletions
@@ -0,0 +1,288 @@
1+
# Benchmark Metadata Reference
2+
3+
This document is the reference for fields supported in:
4+
5+
```python
6+
client.alpha.benchmarks.register(..., metadata={...})
7+
```
8+
9+
It covers:
10+
11+
- `garak_config` (detailed command config)
12+
- shield fields (`shield_ids`, `shield_config`)
13+
- runtime controls (`timeout`, remote-only retry/GPU keys)
14+
- deep-merge behavior when updating predefined/existing benchmarks
15+
16+
## 1) Metadata Shape
17+
18+
```python
19+
metadata = {
20+
    "garak_config": {
21+
        "system": {...},
22+
        "run": {...},
23+
        "plugins": {...},
24+
        "reporting": {...},
25+
    },
26+
    "timeout": 1800,
27+
    "shield_ids": ["Prompt-Guard-86M"],  # or use shield_config
28+
    "max_retries": 3,  # remote mode only
29+
    "use_gpu": False,  # remote mode only
30+
}
31+
```
32+
33+
If `garak_config` is omitted, the provider falls back to the default Garak config, which selects probes broadly (the default `probe_spec` is `"all"`) and can be very slow.
34+
35+
### 1.1 Build `garak_config` via Python models (optional)
36+
37+
You can construct config using typed models exported by this package:
38+
39+
```python
40+
from llama_stack_provider_trustyai_garak import (
41+
    GarakCommandConfig,
42+
    GarakSystemConfig,
43+
    GarakRunConfig,
44+
    GarakPluginsConfig,
45+
    GarakReportingConfig,
46+
)
47+
```
48+
49+
Example:
50+
51+
```python
52+
garak_cfg = GarakCommandConfig(
53+
    system=GarakSystemConfig(parallel_attempts=20),
54+
    run=GarakRunConfig(generations=2, eval_threshold=0.5),
55+
    plugins=GarakPluginsConfig(probe_spec=["promptinject.HijackHateHumans"]),
56+
    reporting=GarakReportingConfig(taxonomy="owasp"),
57+
)
58+
59+
metadata = {
60+
    "garak_config": garak_cfg.to_dict(),
61+
    "timeout": 900,
62+
}
63+
```
64+
65+
## 2) Top-Level Metadata Keys
66+
67+
| Key | Type | Default | Mode | Notes |
68+
|---|---|---|---|---|
69+
| `garak_config` | `dict` | default `GarakCommandConfig()` | inline + remote | Main Garak command schema. Recommended to always set. |
70+
| `timeout` | `int` (seconds) | provider default (`10800`) | inline + remote | Max scan runtime for a benchmark run. |
71+
| `shield_ids` | `list[str]` | `[]` | inline + remote | Shortcut for input shields only. |
72+
| `shield_config` | `dict` | `{}` | inline + remote | Explicit mapping: `{"input": [...], "output": [...]}`. |
73+
| `max_retries` | `int` | `3` | remote only | KFP pipeline retry count for scan step. |
74+
| `use_gpu` | `bool` | `False` | remote only | Requests GPU scheduling in KFP pipeline. |
75+
76+
Notes:
77+
78+
- If both `shield_ids` and `shield_config` are provided, `shield_ids` takes precedence.
79+
- Unknown top-level keys are passed as provider params but are ignored unless consumed by adapter logic.
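
The precedence rule above can be illustrated with a small sketch. `resolve_shields` is a hypothetical helper, not the provider's actual code, and it assumes that when `shield_ids` is set, `shield_config` is ignored entirely (including its `output` chain).

```python
from typing import Any


def resolve_shields(metadata: dict[str, Any]) -> dict[str, list[str]]:
    """Return the effective input/output shield chains (illustrative only)."""
    shield_ids = metadata.get("shield_ids") or []
    shield_config = metadata.get("shield_config") or {}

    if shield_ids:
        # Assumption: shield_ids wins outright and maps to input shields only.
        return {"input": list(shield_ids), "output": []}

    return {
        "input": list(shield_config.get("input", [])),
        "output": list(shield_config.get("output", [])),
    }


effective = resolve_shields(
    {
        "shield_ids": ["Prompt-Guard-86M"],
        "shield_config": {"output": ["Llama-Guard-3-8B"]},
    }
)
print(effective)
```

To keep both input and output chains, pass only `shield_config` and leave `shield_ids` unset.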
80+
81+
## 3) Shield Metadata Rules
82+
83+
### `shield_ids`
84+
85+
```python
86+
"shield_ids": ["Prompt-Guard-86M"]
87+
```
88+
89+
- Must be a list.
90+
- Treated as input shields.
91+
- Easier syntax for common cases.
92+
93+
### `shield_config`
94+
95+
```python
96+
"shield_config": {
97+
    "input": ["Prompt-Guard-86M"],
98+
    "output": ["Llama-Guard-3-8B"]
99+
}
100+
```
101+
102+
- Must be a dictionary.
103+
- Use when you need separate input/output shield chains.
104+
105+
Validation behavior:
106+
107+
- The provider validates shield IDs against the Shields API.
108+
- If the Shields API is not enabled and shield metadata is present, the run fails.
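
A client-side pre-check along the lines of these rules might look like the sketch below. `validate_shield_metadata` and its `available_shield_ids` argument are hypothetical; the provider's real validation calls the Shields API and may differ in detail. Here `None` stands in for "Shields API not enabled", and a key is treated as present only when it is not `None`.

```python
from typing import Any, Iterable, Optional


def validate_shield_metadata(
    metadata: dict[str, Any],
    available_shield_ids: Optional[Iterable[str]],
) -> None:
    """Raise ValueError when shield metadata breaks the rules in this section."""
    shield_ids = metadata.get("shield_ids")
    shield_config = metadata.get("shield_config")

    if shield_ids is None and shield_config is None:
        return  # no shield metadata, nothing to check

    if shield_ids is not None and not isinstance(shield_ids, list):
        raise ValueError("shield_ids must be a list")
    if shield_config is not None and not isinstance(shield_config, dict):
        raise ValueError("shield_config must be a dictionary")

    if available_shield_ids is None:
        # Mirrors "Shields API not enabled and shield metadata present".
        raise ValueError("shield metadata present but Shields API is not enabled")

    # Collect every requested shield ID from both forms.
    requested = list(shield_ids or [])
    for chain in (shield_config or {}).values():
        requested.extend(chain)

    unknown = sorted(set(requested) - set(available_shield_ids))
    if unknown:
        raise ValueError(f"unknown shield IDs: {unknown}")
```

Running a check like this before `register(...)` surfaces typos in shield names earlier than a failed run would.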
109+
110+
## 4) `garak_config` Detailed Schema
111+
112+
`garak_config` has four primary sections:
113+
114+
- `system`
115+
- `run`
116+
- `plugins`
117+
- `reporting`
118+
119+
### 4.1 `garak_config.system`
120+
121+
| Field | Type | Default | Description |
122+
|---|---|---|---|
123+
| `parallel_attempts` | `bool \| int` | `16` | Parallel prompt attempts where supported. |
124+
| `max_workers` | `int` | `500` | Upper bound for requested worker count. |
125+
| `parallel_requests` | `bool \| int` | `False` | Parallel requests for generators lacking multi-response support. |
126+
| `verbose` | `int` (`0..2`) | `0` | CLI verbosity. |
127+
| `show_z` | `bool` | `False` | Show Z-scores in CLI output. |
128+
| `narrow_output` | `bool` | `False` | Improve output for narrow terminals. |
129+
| `lite` | `bool` | `True` | Lite mode caution output behavior. |
130+
| `enable_experimental` | `bool` | `False` | Enable experimental Garak flags. |
131+
132+
### 4.2 `garak_config.run`
133+
134+
| Field | Type | Default | Description |
135+
|---|---|---|---|
136+
| `generations` | `int` | `1` | Number of generations per prompt. |
137+
| `probe_tags` | `str \| None` | `None` | Tag-based probe selection (e.g. `owasp:llm`). |
138+
| `eval_threshold` | `float` (`0..1`) | `0.5` | Detector threshold for hit/vulnerable decision. |
139+
| `soft_probe_prompt_cap` | `int` | `256` | Preferred prompt cap for autoscaling probes. Lower values reduce prompts per probe and make runs faster (with reduced coverage/comprehensiveness). |
140+
| `target_lang` | `str \| None` | `None` | BCP47 language target. |
141+
| `langproviders` | `list[str] \| None` | `None` | Providers for language conversion. |
142+
| `system_prompt` | `str \| None` | `None` | Default system prompt where applicable. |
143+
| `seed` | `int \| None` | `None` | Reproducibility seed. |
144+
| `deprefix` | `bool` | `True` | Remove prompt prefix echoed by model outputs. |
145+
146+
Performance tuning tip:
147+
148+
- Predefined benchmarks are comprehensive by default.
149+
- To speed up exploratory runs, override `garak_config.run.soft_probe_prompt_cap` with a smaller value.
150+
- For full security assessment/comparability, keep defaults (or use consistent cap across compared runs).
151+
152+
### 4.3 `garak_config.plugins`
153+
154+
| Field | Type | Default | Description |
155+
|---|---|---|---|
156+
| `probe_spec` | `list[str] \| str` | `"all"` | Probe/module/class selection. |
157+
| `detector_spec` | `list[str] \| str \| None` | `None` | Detector override (`None` uses probe defaults). |
158+
| `extended_detectors` | `bool` | `True` | Include extended detector set. |
159+
| `buff_spec` | `list[str] \| str \| None` | `None` | Buff/module selection. |
160+
| `buffs_include_original_prompt` | `bool` | `True` | Keep original prompt when buffing. |
161+
| `buff_max` | `int \| None` | `None` | Cap output count from buffs. |
162+
| `target_type` | `str` | auto-managed | Provider sets this for openai/function mode. |
163+
| `target_name` | `str \| None` | auto-managed | Provider sets this to model or shield orchestrator. |
164+
| `probes` | `dict \| None` | `None` | Probe plugin config tree. |
165+
| `detectors` | `dict \| None` | `None` | Detector plugin config tree. |
166+
| `generators` | `dict \| None` | `None` | Generator plugin config tree. |
167+
| `buffs` | `dict \| None` | `None` | Buff plugin config tree. |
168+
| `harnesses` | `dict \| None` | `None` | Harness plugin config tree. |
169+
170+
Provider behavior worth knowing:
171+
172+
- `probe_spec`, `detector_spec`, and `buff_spec` each accept a string or a list and are normalized before the run.
173+
- If shield metadata is present, the provider automatically switches the generator mode to function-based shield orchestration.
174+
- Otherwise the provider uses the OpenAI-compatible generator mode.
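
As a rough illustration of the string-or-list handling, the sketch below collapses either form into a comma-separated string, which is the form Garak accepts for plugin specs on its command line. `normalize_spec` is a hypothetical name, and treating "normalized" as "comma-joined" is an assumption; the provider's actual normalization may produce a different shape.

```python
from typing import Optional, Sequence, Union


def normalize_spec(spec: Union[str, Sequence[str], None]) -> Optional[str]:
    """Collapse a string-or-list plugin spec into one comma-separated string."""
    if spec is None:
        return None  # e.g. detector_spec=None keeps the probe defaults

    parts = spec.split(",") if isinstance(spec, str) else list(spec)
    # Drop surrounding whitespace and empty entries.
    cleaned = [part.strip() for part in parts if part and part.strip()]
    return ",".join(cleaned) if cleaned else None


print(normalize_spec(["promptinject", "dan"]))
print(normalize_spec("promptinject, dan"))
```

Both calls yield the same value, so callers can use whichever form reads better in their metadata.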
175+
176+
### 4.4 `garak_config.reporting`
177+
178+
| Field | Type | Default | Description |
179+
|---|---|---|---|
180+
| `taxonomy` | `str \| None` | `None` | Grouping taxonomy (`owasp`, `avid-effect`, `quality`, `cwe`). |
181+
| `show_100_pass_modules` | `bool` | `True` | Include fully passing entries in HTML report details. |
182+
| `show_top_group_score` | `bool` | `True` | Show top-level aggregate in grouped report sections. |
183+
| `group_aggregation_function` | `str` | `"lower_quartile"` | Group aggregation strategy in report. |
184+
| `report_dir` | `str \| None` | auto-managed | Provider-managed output location; usually leave unset. |
185+
| `report_prefix` | `str \| None` | auto-managed | Provider-managed output prefix; usually leave unset. |
186+
187+
Please refer to [Garak configuration docs](https://reference.garak.ai/en/latest/configurable.html#config-files-yaml-and-json) for details about these controls.
188+
189+
## 5) Deep-Merge Behavior (Updating Predefined/Existing Benchmarks)
190+
191+
When registering with `provider_benchmark_id`, metadata is deep-merged:
192+
193+
- base metadata comes from:
194+
  - predefined profile (`trustyai_garak::...`), or
195+
  - existing benchmark metadata
196+
- your new metadata overrides only specified keys
197+
198+
Example:
199+
200+
```python
201+
client.alpha.benchmarks.register(
202+
    benchmark_id="quick_promptinject_tuned",
203+
    dataset_id="garak",
204+
    scoring_functions=["garak_scoring"],
205+
    provider_id=garak_provider_id,
206+
    provider_benchmark_id="trustyai_garak::quick",
207+
    metadata={
208+
        "garak_config": {
209+
            "plugins": {"probe_spec": ["promptinject"]},
210+
            "system": {"parallel_attempts": 20},
211+
        },
212+
        "timeout": 1200,
213+
    },
214+
)
215+
```
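
The merge semantics described in this section can be sketched as a recursive dictionary merge. `deep_merge` below is a hypothetical stand-in for the provider's logic, and the `base` values are made up for illustration; they are not the real contents of `trustyai_garak::quick`. The sketch assumes that non-dict values, including lists, are replaced wholesale rather than concatenated.

```python
import copy
from typing import Any


def deep_merge(base: dict[str, Any], override: dict[str, Any]) -> dict[str, Any]:
    """Return a new dict in which override wins key by key, recursing into dicts."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars and lists replace the base value entirely (assumption).
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical base metadata, for illustration only.
base = {
    "garak_config": {
        "run": {"generations": 1},
        "plugins": {"probe_spec": ["dan"]},
    },
    "timeout": 3600,
}
override = {
    "garak_config": {
        "plugins": {"probe_spec": ["promptinject"]},
        "system": {"parallel_attempts": 20},
    },
    "timeout": 1200,
}
result = deep_merge(base, override)
print(result)
```

In this sketch `run.generations` survives from the base because the override never mentions it, while `probe_spec` and `timeout` take the new values.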
216+
217+
## 6) Practical Examples
218+
219+
### Example A: Minimal custom benchmark
220+
221+
```python
222+
metadata = {
223+
    "garak_config": {
224+
        "plugins": {"probe_spec": ["promptinject.HijackHateHumans"]},
225+
        "run": {"generations": 2, "eval_threshold": 0.5},
226+
        "reporting": {"taxonomy": "owasp"},
227+
    },
228+
    "timeout": 900,
229+
}
230+
```
231+
232+
### Example B: Explicit input/output shield mapping
233+
234+
```python
235+
metadata = {
236+
    "garak_config": {
237+
        "plugins": {"probe_spec": ["promptinject.HijackHateHumans"]},
238+
    },
239+
    "shield_config": {
240+
        "input": ["Prompt-Guard-86M"],
241+
        "output": ["Llama-Guard-3-8B"],
242+
    },
243+
    "timeout": 600,
244+
}
245+
```
246+
247+
### Example C: Remote retry/GPU controls
248+
249+
```python
250+
metadata = {
251+
    "garak_config": {
252+
        "run": {"probe_tags": "owasp:llm"},
253+
    },
254+
    "timeout": 7200,
255+
    "max_retries": 2,
256+
    "use_gpu": True,
257+
}
258+
```
259+
260+
### Example D: Faster predefined benchmark variant
261+
262+
```python
263+
metadata = {
264+
    "garak_config": {
265+
        "run": {
266+
            "soft_probe_prompt_cap": 100
267+
        }
268+
    },
269+
    "timeout": 7200,
270+
}
271+
272+
# Register as a tuned variant of a predefined benchmark
273+
client.alpha.benchmarks.register(
274+
    benchmark_id="owasp_fast",
275+
    dataset_id="garak",
276+
    scoring_functions=["garak_scoring"],
277+
    provider_id=garak_provider_id,
278+
    provider_benchmark_id="trustyai_garak::owasp_llm_top10",
279+
    metadata=metadata,
280+
)
281+
```
282+
283+
## 7) Legacy / Compatibility Notes
284+
285+
- Prefer `metadata.garak_config.plugins.probe_spec` over old top-level `metadata.probes`.
286+
- Prefer `metadata.garak_config.run.eval_threshold` for threshold control.
287+
- Keep benchmark metadata focused on benchmark/run concerns.
288+
KFP control-plane settings such as `experiment_name` belong in provider config (`kubeflow_config.experiment_name`, environment: `KUBEFLOW_EXPERIMENT_NAME`), not benchmark metadata.
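
For callers still holding metadata in the old shape, a one-off migration could look like the sketch below. `migrate_legacy_metadata` is a hypothetical helper, not part of the package, and it assumes the legacy `probes` value can be carried over unchanged as `probe_spec`. An existing `probe_spec` is kept if one is already set.

```python
import copy
from typing import Any


def migrate_legacy_metadata(metadata: dict[str, Any]) -> dict[str, Any]:
    """Move a legacy top-level `probes` key under garak_config.plugins.probe_spec."""
    migrated = copy.deepcopy(metadata)
    legacy_probes = migrated.pop("probes", None)
    if legacy_probes is not None:
        plugins = migrated.setdefault("garak_config", {}).setdefault("plugins", {})
        # Keep an explicit new-style probe_spec if the caller already set one.
        plugins.setdefault("probe_spec", legacy_probes)
    return migrated


old_style = {"probes": ["promptinject.HijackHateHumans"], "timeout": 900}
new_style = migrate_legacy_metadata(old_style)
print(new_style)
```

The input dict is left untouched, so the old and new shapes can be compared before registering the benchmark.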

COMPATIBILITY.md

Lines changed: 22 additions & 1 deletion
@@ -6,14 +6,25 @@ This document tracks the compatibility of `llama-stack-provider-trustyai-garak`
66

77
| Provider Version | Llama-Stack Version | Python Version | Key Dependencies | Status | Notes |
88
|------------------|---------------------|----------------|------------------|---------|-------|
9-
| 0.1.3 | ==0.2.18 | >=3.12 | greenlet, httpx[http2], kfp, kfp-kubernetes, kfp-server-api, boto3, garak | Current | Latest stable release with thin dependencies and lazy kfp & s3 client init for remote mode |
9+
| 0.2.0 | >=0.5.0 | >=3.12 | kfp>=2.14.6, kfp-kubernetes>=2.14.6, kfp-server-api>=2.14.6, boto3>=1.35.88 | Current | Current release with updated metadata schema (`metadata.garak_config`) and remote/inline support |
10+
| 0.1.3 | ==0.2.18 | >=3.12 | greenlet, httpx[http2], kfp, kfp-kubernetes, kfp-server-api, boto3, garak | | Latest stable release with thin dependencies and lazy kfp & s3 client init for remote mode |
1011
| 0.1.2 | >=0.2.15 | >=3.12 | fastapi, opentelemetry-api, opentelemetry-exporter-otlp, aiosqlite, greenlet, uvicorn, ipykernel, httpx[http2], kfp, kfp-kubernetes, kfp-server-api, boto3, garak | | Release with both remote and inline implementation |
1112
| 0.1.1 | >=0.2.15 | >=3.12 | fastapi, opentelemetry-api, opentelemetry-exporter-otlp, aiosqlite, greenlet, uvicorn, ipykernel, httpx[http2], garak | | Initial stable release with inline implementation |
1213

1314
## Dependency Details
1415

1516
### Core Dependencies
1617

18+
#### Version 0.2.0 (latest)
19+
- **llama-stack-client**: >=0.5.0
20+
- **llama-stack-api**: >=0.5.0
21+
- **llama-stack** (server extra): >=0.5.0
22+
- **garak** (inline extra): ==0.14.0
23+
- **kfp**: >=2.14.6
24+
- **kfp-kubernetes**: >=2.14.6
25+
- **kfp-server-api**: >=2.14.6
26+
- **boto3**: >=1.35.88
27+
1728
#### Version 0.1.3
1829
- **llama-stack**: == 0.2.18
1930
- **greenlet**: Latest compatible (3.2.4)
@@ -68,6 +79,16 @@ The provider is built and compatible with:
6879
- **Llama-Stack Version**: 0.2.18 (in container builds)
6980
- **Additional Runtime Dependencies**: torch, transformers, sqlalchemy, and others as specified in the Containerfile
7081

82+
## Image Compatibility (Latest Deployments)
83+
84+
Use the table below as a quick reference for image fields used in current remote deployments.
85+
86+
| Use Case | Config Key / Field | Where to Set | Recommended Image | Alternative | Notes |
87+
|---|---|---|---|---|---|
88+
| LLS distro image (total remote) | `spec.distribution.image` | `lsd_remote/llama_stack_distro-setup/lsd-garak.yaml` | `quay.io/opendatahub/llama-stack@sha256:cf21d3919d265f8796ed600bfe3d2eb3ce797b35ab8e60ca9b6867e0516675e5` | `quay.io/rhoai/odh-llama-stack-core-rhel9:rhoai-3.4` | Pick image matching your RHOAI/ODH release stream |
89+
| Garak KFP base image (total remote) | `KUBEFLOW_GARAK_BASE_IMAGE` | `lsd_remote/llama_stack_distro-setup/lsd-config.yaml` | `quay.io/opendatahub/odh-trustyai-garak-lls-provider-dsp:dev` | `quay.io/rhoai/odh-trustyai-garak-lls-provider-dsp-rhel9:rhoai-3.4` | Injected into LSD env via `lsd-garak.yaml` |
90+
| Garak KFP base image (partial remote) | `kubeflow_config.garak_base_image` (env: `KUBEFLOW_GARAK_BASE_IMAGE`) | `demos/2-partial-remote/partial-remote.yaml` | `quay.io/opendatahub/odh-trustyai-garak-lls-provider-dsp:dev` | `quay.io/rhoai/odh-trustyai-garak-lls-provider-dsp-rhel9:rhoai-3.4` | Used by KFP components for scan/parse/validate steps |
91+
7192
## Breaking Changes
7293

7394
### Version 0.1.3

Containerfile

Lines changed: 18 additions & 15 deletions
@@ -15,22 +15,25 @@ COPY . .
1515
# Build argument to specify architecture
1616
ARG TARGETARCH=x86_64
1717

18-
# Install dependencies
19-
RUN if [ "$TARGETARCH" = "amd64" ] || [ "$TARGETARCH" = "x86_64" ]; then \
20-
echo "Installing x86_64 dependencies ..."; \
21-
pip install --no-cache-dir -r requirements-x86_64.txt; \
22-
elif [ "$TARGETARCH" = "arm64" ] || [ "$TARGETARCH" = "aarch64" ]; then \
23-
echo "Installing ARM64 dependencies ..."; \
24-
pip install --no-cache-dir -r requirements-aarch64.txt; \
25-
else \
26-
echo "ERROR: Unsupported architecture: $TARGETARCH"; \
27-
exit 1; \
28-
fi
29-
30-
# Install the package itself (--no-deps since dependencies already installed)
18+
# # Install dependencies
19+
# RUN if [ "$TARGETARCH" = "amd64" ] || [ "$TARGETARCH" = "x86_64" ]; then \
20+
# echo "Installing x86_64 dependencies ..."; \
21+
# pip install --no-cache-dir -r requirements-x86_64.txt; \
22+
# elif [ "$TARGETARCH" = "arm64" ] || [ "$TARGETARCH" = "aarch64" ]; then \
23+
# echo "Installing ARM64 dependencies ..."; \
24+
# pip install --no-cache-dir -r requirements-aarch64.txt; \
25+
# else \
26+
# echo "ERROR: Unsupported architecture: $TARGETARCH"; \
27+
# exit 1; \
28+
# fi
29+
30+
# Install cpu torch to reduce image size
31+
RUN pip install torch --index-url https://download.pytorch.org/whl/cpu
32+
33+
# Install the package itself
3134
# Use [inline] to get garak dependency
32-
RUN pip install --no-cache-dir --no-deps -e ".[inline]"
33-
35+
RUN pip install --no-cache-dir ".[inline]"
36+
RUN pip install --no-cache-dir -r requirements-inline-extra.txt
3437
# Set XDG environment variables to use /tmp (always writable) for garak to write to
3538
ENV XDG_CACHE_HOME=/tmp/.cache
3639
ENV XDG_DATA_HOME=/tmp/.local/share
