Commit 878deaf

committedApr 15, 2021
Add da docs
1 parent b2d5298 commit 878deaf

4 files changed

+154
-33
lines changed


‎Docs/features/canary-deploys.md

+137
@@ -0,0 +1,137 @@
1+
## Canary Deploys
2+
3+
As of `1.5.0`, a new implementation of canary deploys replaces the old incremental deploys in Singularity. Some initial notes on the update:
4+
5+
- Previous behavior will be preserved if using `incrementalDeploys` and not specifying a `canarySettings` object in your SingularityDeploy json
6+
- Default deploy behavior remains unchanged; canary settings must be explicitly enabled per deploy
7+
8+
### Starting a Canary Deploy
9+
10+
To run a canary deploy, simply specify a `canarySettings` object in your SingularityDeploy json (defaults with canary disabled are shown in the json below):
11+
12+
```
13+
{
14+
"deploy": {
15+
"canarySettings": {
16+
"enableCanaryDeploy":false,
17+
"instanceGroupSize":1,
18+
"acceptanceMode":"NONE",
19+
"waitMillisBetweenGroups":0,
20+
"allowedTasksFailuresPerGroup":0,
21+
"canaryCycleCount":3
22+
},
23+
...
24+
}
25+
}
26+
```
27+
28+
- `acceptanceMode` - defaults to `NONE`
29+
- `NONE` - No additional checks are run against a deploy
30+
- `TIMED` - Wait a set amount of time (`waitMillisBetweenGroups`) between deploy steps. Only relevant if `enableCanaryDeploy` is `true`
31+
- `CHECKS` - Run all bound implementations of `DeployAcceptanceHook` (see more info below) after each deploy step. Applies to all tasks at once if `enableCanaryDeploy` is `false` and will run on each individual canary step if `enableCanaryDeploy` is `true`
32+
- `enableCanaryDeploy` - Defaults to `false`. If `true`, enables a step-wise deploy and the use of the other canary settings fields (see below for more details). If `false`, performs a normal atomic deploy where all new instances are spun up and all old ones are taken down once the new ones are healthy.
33+
- _Load balancer note_: If `false`, all new instances are added to the load balancer and all old instances are removed from it in one atomic operation during a deploy. If `true`, new instances are added to the load balancer alongside old ones, and old ones are cleaned up after the deploy has fully succeeded.
34+
- `instanceGroupSize` - The number of instances to start per canary group. e.g. if set to `1`, the canary deploy will start `1` instance -> health/acceptance check -> spin down `1` old instance -> start `1` new -> etc
35+
- `waitMillisBetweenGroups` - If `acceptanceMode` is set to `TIMED`, wait this long between groups of new instances of size `instanceGroupSize` (e.g. launch `1`, wait 10 minutes, launch `1` and so on)
36+
- `canaryCycleCount` - Run this many rounds of canary steps before skipping to the full request scale. e.g. if deploying a request of scale `10` and `canaryCycleCount` is set to `3`, 3 instances will be launched one at a time, then the remaining 7 will be launched all at once in a final step
37+
- `allowedTasksFailuresPerGroup` - Replaces the global configuration for allowed task failures in a deploy. For each canary deploy step, this many tasks are allowed to fail and retry before the deploy is considered to have failed
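
The interaction between `instanceGroupSize` and `canaryCycleCount` described above can be sketched as a small calculation. This is a minimal illustration only: `CanarySchedule` and `targetCounts` are hypothetical names that are not part of Singularity, and the handling of edge cases (for example a group size that does not divide the scale evenly) is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Singularity: models the documented step schedule only.
class CanarySchedule {

  // Returns the cumulative number of new instances running after each deploy step.
  static List<Integer> targetCounts(int requestScale, int instanceGroupSize, int canaryCycleCount) {
    if (requestScale < 1 || instanceGroupSize < 1 || canaryCycleCount < 0) {
      throw new IllegalArgumentException("scale and group size must be >= 1, cycle count >= 0");
    }
    List<Integer> targets = new ArrayList<>();
    int launched = 0;
    // Canary rounds: add one group per round, never exceeding the request scale (assumption).
    for (int cycle = 0; cycle < canaryCycleCount && launched < requestScale; cycle++) {
      launched = Math.min(requestScale, launched + instanceGroupSize);
      targets.add(launched);
    }
    // Final step: jump straight to the full request scale.
    if (launched < requestScale) {
      targets.add(requestScale);
    }
    return targets;
  }

  public static void main(String[] args) {
    System.out.println(targetCounts(10, 1, 3));
  }
}
```

For a request of scale `10` with `instanceGroupSize` `1` and `canaryCycleCount` `3`, this prints `[1, 2, 3, 10]`, matching the example in the `canaryCycleCount` description.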
38+
39+
### Custom Deploy Hooks
40+
41+
Applies when `acceptanceMode` is set to `CHECKS`. For those extending SingularityService, you can bind any number of additional implementations of `DeployAcceptanceHook` in guice modules like:
42+
43+
```
44+
Multibinder
45+
.newSetBinder(binder, DeployAcceptanceHook.class)
46+
.addBinding()
47+
.to(MyAcceptanceHook.class);
48+
```
49+
50+
Each implementation of an acceptance hook should look like:
51+
52+
```java
53+
public class MyAcceptanceHook implements DeployAcceptanceHook {
54+
55+
@Inject
56+
public MyAcceptanceHook() {}
57+
58+
@Override
59+
public boolean isFailOnUncaughtException() {
60+
// If `true` an uncaught exception fails a deploy,
61+
// if `false` the deploy can still succeed. Useful for testing
62+
return false;
63+
}
64+
65+
@Override
66+
public String getName() {
67+
// Should be unique per hook
68+
return "My-Test-Hook";
69+
}
70+
71+
@Override
72+
public DeployAcceptanceResult getAcceptanceResult(
73+
SingularityRequest request, // request object. Reflects any updates made during deploy
74+
SingularityDeploy deploy, // Full deploy json object
75+
SingularityPendingDeploy pendingDeploy, // Pending deploy state
76+
Collection<SingularityTaskId> activeTasksForPendingDeploy, // Tasks that are part of the current pending deploy
77+
Collection<SingularityTaskId> inactiveTasksForPendingDeploy, // Tasks from the pending deploy which may have shut down or crashed
78+
Collection<SingularityTaskId> otherActiveTasksForRequest // Tasks from other deploys (e.g. the previous active one)
79+
) {
80+
// Do stuff here
81+
return new DeployAcceptanceResult(
82+
DeployAcceptanceState.SUCCEEDED,
83+
"Test hook passed"
84+
);
85+
}
86+
}
87+
```
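
The effect of `isFailOnUncaughtException` described in the comments above can be sketched in isolation. The types here are stand-ins written for this example (`HookRunnerSketch`, its `State` enum, and `run` are hypothetical); how Singularity actually invokes hooks internally is not shown in this document, so treat this as an assumption about the contract rather than the real implementation.

```java
import java.util.function.Supplier;

// Stand-in types for illustration only; the real classes live in Singularity.
class HookRunnerSketch {

  // Simplified stand-in for DeployAcceptanceState (only the two states used here).
  enum State { SUCCEEDED, FAILED }

  // Runs a hook check and maps an uncaught exception to a state,
  // following the documented isFailOnUncaughtException() contract.
  static State run(Supplier<State> check, boolean failOnUncaughtException) {
    try {
      return check.get();
    } catch (RuntimeException e) {
      // true: the exception fails the deploy; false: the exception is ignored.
      return failOnUncaughtException ? State.FAILED : State.SUCCEEDED;
    }
  }

  public static void main(String[] args) {
    Supplier<State> broken = () -> { throw new IllegalStateException("boom"); };
    System.out.println(run(broken, true));
    System.out.println(run(broken, false));
  }
}
```

Running `main` prints `FAILED` and then `SUCCEEDED`, which is why returning `false` is mainly useful while testing a new hook.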
88+
89+
The `canarySettings` object will change the state/time during which `getAcceptanceResult` is called:
90+
- If `enableCanaryDeploy` is set to `false`, the state of tasks will be:
91+
- All new tasks in `activeTasksForPendingDeploy` are launched and health checked
92+
- If the deploy is load balanced, tasks in `otherActiveTasksForRequest` are no longer in the load balancer. Only the new deploy tasks in `activeTasksForPendingDeploy` are active in the load balancer
93+
- *Note* - Singularity will re-add the old tasks back to the load balancer if deploy acceptance checks fail
94+
- If `enableCanaryDeploy` is set to `true`, `getAcceptanceResult` is called after each deploy step
95+
- `activeTasksForPendingDeploy` contains _all_ active tasks launched so far, not just those for the current canary step. These tasks are in running state and have passed initial health checks
96+
- If load balanced, all tasks in `activeTasksForPendingDeploy` as well as all in `otherActiveTasksForRequest` are active in the load balancer at once
97+
98+
#### Available Data For Hooks
99+
100+
Since hooks are compiled into the Singularity jar and are extensions of SingularityService, all classes available in guice are also available to the hook. In particular:
101+
- `TaskManager` - On the leader, most calls here are in-memory lookups and can be used to fetch the full data for a task (ports, environment, etc)
102+
- `AsyncHttpClient` - Singularity's default http client
103+
- `@Singularity ObjectMapper` - pre-configured object mapper for Singularity objects
104+
105+
### Incremental Deploys (Deprecated)
106+
107+
_Deprecated_: behavior will be preserved, but prefer using the newer `canarySettings` documented above. Incremental deploys are essentially equivalent to using an `acceptanceMode` of `TIMED`.
108+
109+
As of `0.5.0` Singularity supports an incremental deploy for finer-grained control when rolling out new changes. This deploy is enabled via a few extra fields on the `SingularityDeploy` object when starting a deploy:
110+
111+
- `deployInstanceCountPerStep`: Deploy this many instances at a time until the total instance count for the request is reached (`Optional<Integer>`, default is all instances at once)
112+
- `deployStepWaitTimeMs`: Wait this many milliseconds between deploy steps before continuing to deploy the next `deployInstanceCountPerStep` instances (`Optional<Integer>`, default is 0, i.e. continue immediately)
113+
- `autoAdvanceDeploySteps`: Automatically advance to the next target instance count after `deployStepWaitTimeMs` milliseconds (`Optional<Boolean>`, defaults to `true`). If this is `false`, manual confirmation is needed to move to the next target instance count. This can be done via the UI.
114+
115+
116+
#### Example
117+
118+
`TestService` is currently running `3` instances. During the next deploy, you want to replace only `1` of these instances at a time and have Singularity wait at least a minute after deploying one so you can verify that everything works as expected. The following fields can be added to the deploy json to accomplish this:
119+
120+
```
121+
deployInstanceCountPerStep: 1
122+
deployStepWaitTimeMs: 60000
123+
autoAdvanceDeploySteps: true
124+
```
125+
126+
When the deploy starts, Singularity will start `1` (`deployInstanceCountPerStep`) instance from the new deploy (the `3` old instances will still be running). Once the new task is determined to be healthy, a few things happen:
127+
128+
- Singularity will add the instance from the new deploy to the load balancer (if applicable)
129+
- Singularity will shut down `1` (`deployInstanceCountPerStep`) of the instances from the old deploy after removing it from the load balancer (if applicable)
130+
- Singularity will start counting down the `60000 ms` until it launches the next `deployInstanceCountPerStep` instances
131+
132+
Once the `deployStepWaitTimeMs` of wait time has elapsed, Singularity will start this process again, launching a second task for the new deploy, waiting until it is healthy, then shutting down a task from the old deploy. This will continue until the deploy fails, the deploy is cancelled, or all instances are part of the new deploy and it succeeds.
133+
134+
A few more things to note about the incremental deploy process:
135+
- If the deploy fails or is cancelled, Singularity replaces any missing instances from the old deploy and makes sure they are healthy before shutting down active/healthy instances from the new deploy. (i.e. you will never be under capacity)
136+
- At any time, it is possible to advance the deploy to another target instance count via the UI or API. In other words, you can skip the remaining `deployStepWaitTimeMs`, skip steps of the deploy, or even decrease the instance count to roll back a step.
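
The `TestService` walkthrough above can be traced with a short simulation of how many new and old instances remain after each completed step. `IncrementalDeploySketch` and `steps` are hypothetical names for this example, and the model ignores failures, cancellation, and the wait time between steps.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the incremental deploy example; not Singularity code.
class IncrementalDeploySketch {

  // Returns one {newInstances, oldInstances} pair per completed deploy step.
  static List<int[]> steps(int totalInstances, int deployInstanceCountPerStep) {
    if (totalInstances < 1 || deployInstanceCountPerStep < 1) {
      throw new IllegalArgumentException("both arguments must be >= 1");
    }
    List<int[]> result = new ArrayList<>();
    int newCount = 0;
    while (newCount < totalInstances) {
      // Launch the next group, then retire the same number of old instances.
      newCount = Math.min(totalInstances, newCount + deployInstanceCountPerStep);
      result.add(new int[] {newCount, totalInstances - newCount});
    }
    return result;
  }

  public static void main(String[] args) {
    List<int[]> steps = steps(3, 1);
    for (int i = 0; i < steps.size(); i++) {
      System.out.printf("step %d: new=%d old=%d%n", i + 1, steps.get(i)[0], steps.get(i)[1]);
    }
  }
}
```

With `3` instances and `deployInstanceCountPerStep` of `1`, the model produces three steps and ends with `3` new instances and `0` old ones.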
137+

‎Docs/features/incremental-deploys.md

-31
This file was deleted.

‎SUMMARY.md

+2-2
@@ -21,7 +21,7 @@
2121
* [Expiring Actions](Docs/features/expiring-actions.md)
2222
* [Shell Commands](Docs/features/shell-commands.md)
2323
* [Task Search](Docs/features/task-search.md)
24-
* [Incremental Deploys](Docs/features/incremental-deploys.md)
24+
* [Canary Deploys](Docs/features/canary-deploys.md)
2525
* [Disaster Detection & Disabled Actions](Docs/features/disaster-detection.md)
2626
* [Cluster Coordinator](Docs/features/cluster-coordinator.md)
2727
* [Upgrading to Mesos 1](Docs/features/mesos-1.md)
@@ -35,4 +35,4 @@
3535
* [Deploy Defaults](Docs/reference/deploy-defaults.md)
3636
* [Health Checks](Docs/reference/healthchecks.md)
3737
* [API Reference](Docs/reference/api.html)
38-
* [OpenAPI JSON](Docs/reference/openapi.json)
38+
* [OpenAPI JSON](Docs/reference/openapi.json)

‎SingularityService/src/main/java/com/hubspot/singularity/hooks/DeployAcceptanceHook.java

+15
@@ -8,12 +8,27 @@
88
import java.util.Collection;
99

1010
public interface DeployAcceptanceHook {
11+
// Unique per hook. Used to track results in SingularityDeployChecker
1112
String getName();
1213

14+
/*
15+
* If `true` any uncaught exception will cause a DeployAcceptanceState of FAILED.
16+
* If `false` an uncaught exception will be ignored and a DeployAcceptanceState of SUCCEEDED used. Useful for testing
17+
*/
1318
default boolean isFailOnUncaughtException() {
1419
return true;
1520
}
1621

22+
/*
23+
* The `canarySettings` object will change the state/time during which `getAcceptanceResult` is called:
24+
* - If `enableCanaryDeploy` is set to `false`, the state of tasks will be:
25+
* - All new tasks in `activeTasksForPendingDeploy` are launched and health checked
26+
* - If the deploy is load balanced, tasks in `otherActiveTasksForRequest` are no longer in the load balancer. Only the new deploy tasks in `activeTasksForPendingDeploy` are active in the load balancer
27+
* - *Note* - Singularity will re-add the old tasks back to the load balancer if deploy acceptance checks fail
28+
* - If `enableCanaryDeploy` is set to `true`, `getAcceptanceResult` is called after each deploy step
29+
* - `activeTasksForPendingDeploy` contains _all_ active tasks launched so far, not just those for the current canary step. These tasks are in running state and have passed initial health checks
30+
* - If load balanced, all tasks in `activeTasksForPendingDeploy` as well as all in `otherActiveTasksForRequest` are active in the load balancer at once
31+
*/
1732
DeployAcceptanceResult getAcceptanceResult(
1833
SingularityRequest request,
1934
SingularityDeploy deploy,
