Skip to content

Commit 355f536

Browse files
Checks Customization RFC (#62)
* Add Checks Customization RFC * Specifiy that we want the user to know which is the value currently being taken into account given the context * Highlight that storing customization in web would make us miss the opportunity to have such a valuable feature in a standalone Check Compliance engine scenario * Clarify that specific input needs for customization operations will be determined at due time * Add Authorization related considerations and solutions * Clarify reasoning about standalone compliance check engine scenario * Fix typo * Add considerations about snapshotting custom values with execution results * Fix typo * Link RFC Pull Request
1 parent 32cab5d commit 355f536

File tree

1 file changed

+324
-0
lines changed

1 file changed

+324
-0
lines changed

rfc/0001-checks-customization.md

Lines changed: 324 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,324 @@
1+
| | |
2+
| :----------- | :------------------------------------------------------|
3+
| Feature Name | Checks Customization |
4+
| Start Date | Jan 15th, 2025 |
5+
| Category | Architecture |
6+
| PR | [#62](https://github.com/trento-project/docs/pull/62) |
7+
8+
# Summary
9+
10+
Certain checks in the Trento checks catalog should be customized by customers on a target basis to adjust the corresponding expected values to ones that work better in their environment.
11+
12+
### Use Cases outline
13+
14+
- As a user, I want to override a Check's expected values on a target basis (host or cluster at the moment) so that it matches my specific environment needs
15+
- As a user, I want the overridden values to be used in a checks execution
16+
- As a user, I want to reset a customized check to default so that I can return to using SUSE's suggested best practices
17+
- As a user, I want check customization attempts to be tracked in the Activity Log
18+
19+
# Motivation
20+
21+
The purpose of this RFC is to define what Checks customization entails in overall Trento Architecture:
22+
- where data and operations belong
23+
- the interactions involved between the components (mainly server and checks engine)
24+
- authorizing customization operations
25+
- tracking customization activities
26+
27+
# Detailed design
28+
29+
Considering the outlined [use cases](#use-cases-outline) we need to:
30+
- define [when a check can be customized](#customizable-check)
31+
- persist custom values (basically CRUD-ish operations) on a check+target basis
32+
- use custom values in the checks execution by the Check Engine (Wanda)
33+
34+
Since check engine is meant to deal with Checks, it looks naturally reasonable considering it as the component who's responsible to deal with [custom values operations and data](#operations-on-custom-values).
35+
36+
This makes sure also that [during checks execution](#custom-values-usage-during-checks-execution) the engine already has the information needed to override built-in values.
37+
38+
## Customizable check
39+
40+
A Check can be considered *customizable*, meaning a customer is allowed to modify its expected values, if it's specification contains the `values` entry and if that entry is not empty.
41+
42+
**Customizable Check excerpt**
43+
```yaml
44+
id: "156F64"
45+
name: Check Corosync token_timeout value
46+
# ...
47+
values:
48+
- name: expected_token_timeout
49+
default: 5000
50+
conditions:
51+
- value: 30000
52+
when: env.provider == "azure" || env.provider == "aws"
53+
- value: 20000
54+
when: env.provider == "gcp"
55+
```
56+
57+
In order to allow customization of checks where hardcoded values are used in expectation rhai scripts, those checks need to be adjusted by moving hardcoded values to DSL specified values.
58+
59+
**Not Customizable Check excerpt**
60+
```yaml
61+
# not customizable check
62+
id: "AAA000"
63+
name: Lorem ipsum dolor sit amet
64+
# ...
65+
# missing `values` entry
66+
expectations:
67+
- name: foo_expectation
68+
expect: facts.some_fact == "yes"
69+
```
70+
71+
Need to be refactored to something like the following
72+
```yaml
73+
# refactored to be customizable
74+
id: "AAA000"
75+
name: Lorem ipsum dolor sit amet
76+
# ...
77+
values:
78+
- name: expected_yes_value
79+
default: "yes"
80+
81+
expectations:
82+
- name: foo_expectation
83+
expect: facts.some_fact == values.expected_yes_value
84+
```
85+
86+
This is necessary because operating on the rhai script is currently not an option.
87+
88+
### Caveats
89+
90+
After some deeper research there are some caveates to be take into account for check customizability
91+
- check [3A59DC](https://github.com/trento-project/checks/blob/main/checks/3A59DC.yaml) will be not allowed for customization because of its nature (what it is attempting to do and how it is written)
92+
- if one of the values of a check is a list, that specific value won't be allowed for modification (deferred implementation)
93+
94+
## Operations on custom values
95+
96+
In order to support applying custom values, changing them after they've been set and also reset to SUSE's default ones, the following new operations would be introduced:
97+
- [Apply custom values](#apply-custom-values)
98+
- [Change custom values](#change-custom-values)
99+
- [Reset custom values](#reset-check-to-defaults)
100+
- [Reading custom values](#reading-customization)
101+
102+
**Disclaimer:** all following endpoints, methods, paths, query strings, parameters are indicative at this point and up for further refinement.
103+
The intent is to give the general feeling of the moving parts and what is needed.
104+
105+
### Apply custom values
106+
This operation allows to apply custom values for a given customizable check that has not been customized yet.
107+
The operation allows to apply a custom value for at least one customizable value, meaning that if a check defines more than one **expected value**, this operation allows to customize one, many or all of them (ie. if a check has 3 expected values defined, a user might be interested in customizing only one of them and use the default ones for the others)
108+
109+
#### Endpoint
110+
111+
`POST /checks/:check_id/values`
112+
```json
113+
{
114+
"target_id": "target-uuid",
115+
"group_id": "group-uuid",
116+
"values": [
117+
{
118+
"name": "expected_token_timeout",
119+
"value": 42
120+
},
121+
// possibly other entries
122+
]
123+
}
124+
```
125+
126+
### Change custom values
127+
This operation allows to change custom values on an already customized check.
128+
The operation allows to change custom values for at least one customizable value, meaning that if a check defines more than one **expected value**, this operation allows to change customization for one, many or all of them, independently from the fact that they've been previously customized or not (ie. if a check has 3 expected values defined, and a user has customized one of them, with this operation a user might be interested in customizing only one of them and use the default ones for the others)
129+
130+
#### Endpoint
131+
132+
`PATCH /checks/:check_id/values`
133+
```json
134+
{
135+
"target_id": "target-uuid",
136+
"group_id": "group-uuid",
137+
"values": [
138+
{
139+
"name": "expected_token_timeout",
140+
"value": 42
141+
},
142+
// possibly other entries
143+
]
144+
}
145+
```
146+
147+
### Reset check to defaults
148+
This operation clears any previously set custom value, effectively resulting in checks execution considering built-in values defined in check's specification.
149+
150+
#### Endpoint
151+
152+
`DELETE /checks/:check_id/values`
153+
```json
154+
{
155+
"target_id": "target-uuid",
156+
"group_id": "group-uuid"
157+
}
158+
```
159+
160+
Whether such operations require both `target_id` and `group_id` as input for the operations will be defined at due time.
161+
162+
### Reading customization
163+
164+
A target's checks selection workflow gets extended with customization capabilities, hence the following extra information is needed:
165+
- whether a check is customizable
166+
- whether a value of a customizable check can be customized (ie a value which is a list cannot be customized, yet)
167+
- identify which is the value being used based on the context (requires evaluating `when` conditions) so that the user know what actually is going to be overridden
168+
- whether a check has been already customized
169+
- which are the custom values that have been applied
170+
171+
Since reading the checks catalog alone wouldn't be enough anymore, options are:
172+
- the current read operation on the catalog is extended to carry the customization data
173+
- a read operation is added specifically targeting custom values
174+
- a read operation is added to fulfill the overall Checks Selection (meaning the read catalog operation remains as such while an *extended catalog* representation, as depicted in the first option, becomes an operation on its own)
175+
176+
#### Option 1: enriching the catalog
177+
178+
This option, besides requiring the addition of extra field to the catalog's representation, also demands for the target identifier, at least, to be part of the operation input so that the correct overriding values are retrieved.
179+
180+
`GET /checks/catalog?provider=...&target_type=...&target_id=uuid`
181+
182+
Sample response would be what the catalog currently exposes plus the extra information
183+
```json
184+
[
185+
{
186+
"id": "check1",
187+
// other check fields
188+
"values": [
189+
//...
190+
],
191+
"customizable": true,
192+
"customized": true,
193+
"custom_values": [
194+
//...
195+
]
196+
},
197+
// other checks
198+
]
199+
200+
// Note: customization related new fields could be grouped into their own entry, rather than adding all of them at the root level
201+
```
202+
203+
This option adds perhaps too much responsibilities to the catalog operation which is also used when non in a checks selection workflow.
204+
205+
#### Option 2: exposing a read operation for target based customized values of a check
206+
207+
This option entails the introduction of a new operation to get all customized values information for a specific target. Better not do it on a per check basis as we usually need info for a bulck of checks.
208+
209+
`GET /checks/:target_id/values`
210+
```json
211+
[
212+
{
213+
"id": "check1",
214+
"default_values": [
215+
//...
216+
],
217+
"custom_values": [ // having both original ones + overriding ones allows exposing the difference
218+
//...
219+
]
220+
},
221+
// other checks customized values
222+
]
223+
```
224+
225+
Such an options requires to delegate to a client aggregating information from the catalog and this new data to get the full picture.
226+
227+
#### Option 3: exposing a read operation for target based catalog with customization information
228+
229+
This option proposes a new operation which effectively is a narrowed version of the catalog specific for the target containing also customization information, as in option 1, keeping the actual catalog operation untouched and scoped only for generic consultation.
230+
231+
`GET /checks/:target_id/catalog?qs...`
232+
```json
233+
[
234+
{
235+
"id": "check1",
236+
// other check fields
237+
"values": [
238+
//...
239+
],
240+
"customizable": true,
241+
"customized": true,
242+
"custom_values": [ // having both original ones + overriding ones allows exposing the difference
243+
//...
244+
]
245+
},
246+
{
247+
"id": "check2",
248+
"values": [
249+
//...
250+
],
251+
"customizable": true,
252+
"customized": false,
253+
"custom_values": [] // empty list | null | absent
254+
},
255+
{
256+
"id": "check3",
257+
// missing or empty values entry
258+
"customizable": false,
259+
"customized": false,
260+
"custom_values": [] // empty list | null | absent
261+
}
262+
]
263+
264+
// Notes:
265+
// - delegating detection of whether a check is customizable (ie it has values) to a client sounds like logic leakage
266+
// - delegating detection of whether a check is customized (ie it has custom values) to a client sounds less of a leakage but if we decide to expose `customizable` it is just trivial exposing also a `customized` entry
267+
```
268+
269+
Somewhat related to [this hackweek exploration](https://github.com/trento-project/web/pull/3160).
270+
271+
## Custom values usage during checks execution
272+
273+
When there are custom values for a check on a specific target, those need to be used instead of the built-in ones during a [checks execution](https://github.com/trento-project/wanda/blob/main/guides/specification.md#checks-execution).
274+
275+
By having the custom values available in its state, wanda can simply query and use them instead of the built-in ones.
276+
277+
### Notes on execution results
278+
279+
Since custom values at a certain point in time might differ from custom values used during an execution it becomes necessary snapshotting the specific custom values used during a specific execution, that is storing the custom values along with the execution result they're being used in.
280+
281+
Then, to get a proper overview of a checks execution results, data from the catalog and from the extended checks execution results keep being aggregated together as we already do.
282+
283+
## Authorizing and Logging customization activities
284+
285+
### Authorization
286+
287+
Currently Wanda only supports checking for an authenticated token, it does not check whether user is authorized to perform an action.
288+
289+
We can make sure that the JWT generated by the auth server (aka web) also contains abilities, so that a service provider (like wanda in this case) can allow/disallow certain operations.
290+
291+
### Logging
292+
293+
Currently, [Activity Logging](https://github.com/trento-project/docs/blob/main/adr/0015-activity-logging.md) is pretty much scoped to server component tracking activities of the following nature:
294+
- API based operations, that is calls to specific http requests
295+
- domain events emitted by the system
296+
297+
Having checks customization operations in wanda adds a challenge since those interesting actions are not passing through the Activity Logging subsystem.
298+
299+
Options available to get a valuable outcome are:
300+
- checks customization operations are proxied via server component (meaning we have twin operations exposed by trento web that actually just call wanda)
301+
- wanda emits messages after it applies customization so that interested parties - web - acknowledges the operation was successfully completed and tracks relevant entries in the activity log
302+
303+
Activity Logging needs to evolve to support logging actions in a distributed system like Trento, however this is out of scope for this specific RFC.
304+
305+
# Drawbacks
306+
307+
# Alternatives
308+
309+
The main alternative point is storing custom values in Web rather than in Wanda.
310+
311+
Generally speaking all the considerations made previously keep their validity.
312+
313+
Here's the main points considered about storing custom values in web:
314+
- having web responsible for checks customization data and operations means leaking a responsibility where it does not naturally belong
315+
- in the context of a standalone Compliance Check Engine, Checks Customization feature would be clunky to use because custom values would not be organically part of the check engine, but would need to be provided every time even if they did not change
316+
- using the overriding values in a checks execution requires sending those from web to wanda via the `ExecutionRequested` message hence either:
317+
- we change the message contract
318+
- we inappropriately send the overriding values in `ExecutionRequested` env entry
319+
- also the start endpoint in Wanda needs to be changed
320+
- having checks and their customization in different places makes it harder to operate when checks change (meaning that if a check changes its spec we need to react to that and possibly invalidate a previously made customization for instance. We have a similar situation with selected checks stored in web and the possibility that a selected check is removed)
321+
322+
# Unresolved questions
323+
324+
- consider the difference between customizing checks for a host vs customizing checks for a cluster (host checks execution vs cluster checks execution) target_id might not be sufficient, we might need to take into account the group id as well

0 commit comments

Comments
 (0)