Implements mermaid and adds mermaid shortcode to implement diagram markup from markdown processor
lougroshek committed Jun 18, 2020
1 parent 436ea9e commit 6fc428c
Showing 18 changed files with 619 additions and 506 deletions.
1 change: 1 addition & 0 deletions .eslintignore
@@ -0,0 +1 @@
./themes/**
46 changes: 26 additions & 20 deletions README.md
@@ -10,29 +10,35 @@ Documentation is versioned using subdirectories in the `./content/{lang}/` direc

At present, there is only one language in the `./content` directory. Docsy assumes lang-en and uses this language automatically, but you can add additional directories with different contents. There is also a language switcher in the navbar that can be enabled by adding that language to the `[languages]` map in `.config.toml`.


## TODO

- [ ] Newsletter signup (?)
- [ ] Format for news and videos list pages? Use docs template or other?
- [ ] Theming
- [ ] Layout for landing page
- [ ] Layout for getting started page

## Notes/Questions

- [ ] Please review the content hierarchy for the docs, since I'm combining what were a few separate directories from the Jekyll site. Just to make sure this is organized the way you want.
- [ ] Want Algolia search in nav bar (default) or get Algolia search to work with side nav in docs template?
- [ ] Docsy has a built-in version selector for documentation, will it meet the stated need? I can also create a new shortcode in order to place it elsewhere...
- [ ] Header hierarchy in some of the docs pages needs to be changed so that the page menu will work. Specifically, and for a11y reasons as well, there should never be more than one h1 on a page. I assume that can be done internally?
- [ ] You may also want to shorten the `linkTitle` frontmatter variable for docs `_index.md` pages. Right now the long page titles make the content menu very long.
- Docsy automatically renders a `description` frontmatter string below section and page titles. You may want to move the short descriptions at the beginning of docs pages into that frontmatter variable and out of page content. Ex: "The goal of this codelab is to trigger a Spinnaker pipeline with a Pub/Sub message from GCS upon upload of a tarball."
- Docsy also automatically renders a list of contents for each directory with an `_index.*` file in the `./content` dir. So where pages have a table of contents written into the landing page, that TOC could be removed. (This would make the docs easier to maintain in the long run.)
- [ ] Hugo's new markdown processor, goldmark, doesn't support some of the markdown conventions I'm seeing. For example, link attribute assignment: `{:target="\_blank"}`. I'm leaving these things in for now.

## Docs Frontmatter Variables

`title`: Displayed on the content page
`linkTitle`: displayed where a link to the page appears (in the docs menu)
`weight`: Determines the order of appearance in lists of content in the same directory, lowest first. To let all titles appear in alphabetical order, remove all weights.
`description`: Short description, appears in lists of directory contents and on content page.
`mermaid`: Boolean `true` indicates that MermaidJS should be loaded on the page.

## Mermaid

Mermaid is loaded into content pages only when the boolean frontmatter variable `mermaid` is set to `true`.

1. Use the `mermaid` shortcode to make sure your graph isn't processed as markdown:

```
{{< mermaid >}}
graph TB
clouddriver(Clouddriver) --> clouddriver-caching(Clouddriver-Caching);
clouddriver --> clouddriver-rw(Clouddriver-RW);
clouddriver --> clouddriver-ro(Clouddriver-RO);
clouddriver --> clouddriver-ro-deck(Clouddriver-RO-Deck)
classDef default fill:#d8e8ec,stroke:#39546a;
linkStyle default stroke:#39546a,stroke-width:1px,fill:none;
classDef split fill:#42f4c2,stroke:#39546a;
class clouddriver-caching,clouddriver-ro,clouddriver-ro-deck,clouddriver-rw,echo-scheduler,echo-worker split
{{< /mermaid >}}
```

2. Add the frontmatter variable to the page: `mermaid: true`.
115 changes: 56 additions & 59 deletions content/en/docs/v1.19/guides/runbooks/orca-quality-of-service.md
@@ -1,13 +1,12 @@
---
title: "Orca Quality of Service"
linkTitle: "Orca Quality of Service"
title: 'Orca Quality of Service'
linkTitle: 'Orca Quality of Service'
weight: 2
description:
description:
mermaid: true
---



**EXPERIMENTAL**: This feature is still in an early adoption / experimental phase.
**EXPERIMENTAL**: This feature is still in an early adoption / experimental phase.
While you can use it today (Orca v6.71.0), Netflix is currently running this in learning mode / judiciously enabling in response to on-call events.

Spinnaker ships with an optional Quality of Service (QoS) module that can be used to manage the number of active executions running at any given time.
@@ -22,17 +21,17 @@ The rest of this section assumes that QoS is enabled.

When an execution is submitted to Orca (either manually via the API or UI, or through an automated trigger), Orca will first emit a synchronous `BeforeExecutionPersist` event which the QoS [ExecutionBufferActuator][actuator] is listening on.
The behavior of the `ExecutionBufferActuator` depends firstly on the result of a
[BufferStateSupplier][buffer-state-supplier].
The `BufferStateSupplier` can perform whatever heuristics are necessary to determine whether any new execution should go through the QoS process.
[BufferStateSupplier][buffer-state-supplier].
The `BufferStateSupplier` can perform whatever heuristics are necessary to determine whether any new execution should go through the QoS process.
If the `BufferStateSupplier` returns `false`, no other QoS actions occur and the execution is started as normal.

In the event `BufferStateSupplier` returns `true`, the execution is passed through a chain of ordered [BufferPolicy][buffer-policy] functions.
These `BufferPolicy` functions return a result indicating whether to `BUFFER` or `ENQUEUE` the execution.
These `BufferPolicy` functions return a result indicating whether to `BUFFER` or `ENQUEUE` the execution.
All `BufferPolicy` functions must return `ENQUEUE`, otherwise the execution will be assigned a status of `BUFFERED`, delaying the initialization of the execution.
When an execution is `BUFFERED`, it will effectively stay in a waiting state until it is unbuffered, which we'll go over later.

`BufferPolicy` functions are pluggable and can contain arbitrary logic.
For example, one `BufferPolicy` that is always enabled is [EnqueueDeckOrchestrationBufferPolicy][deck-buffer-policy], which will always `ENQUEUE` an execution if it is an Orchestration and from the UI.
`BufferPolicy` functions are pluggable and can contain arbitrary logic.
For example, one `BufferPolicy` that is always enabled is [EnqueueDeckOrchestrationBufferPolicy][deck-buffer-policy], which will always `ENQUEUE` an execution if it is an Orchestration and from the UI.
This specific policy forces the `ENQUEUE` status, even if other policies call for the execution to be buffered; this is done through a `force` flag that policies can return.
An example of other pluggable behavior is determining buffering action based on criticality of the execution: At Netflix we have a custom concept of application criticality, so we can buffer low criticality executions to allow capacity for higher criticality executions.
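
The buffering decision described above can be sketched roughly as follows. This is a minimal Python illustration, not Orca's Kotlin implementation: the names `Action`, `BufferResult`, `decide`, and the dictionary-based execution are hypothetical, and the behavior of the deck policy for executions it does not match is an assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, List


class Action(Enum):
    BUFFER = "BUFFER"
    ENQUEUE = "ENQUEUE"


@dataclass(frozen=True)
class BufferResult:
    action: Action
    force: bool   # a forced ENQUEUE overrides the other policies
    reason: str   # human-readable, logged so decisions can be traced


Execution = Dict[str, str]  # hypothetical stand-in for an Orca execution
BufferPolicy = Callable[[Execution], BufferResult]


def deck_orchestration_policy(execution: Execution) -> BufferResult:
    """Force ENQUEUE for orchestrations submitted from the UI."""
    if execution.get("type") == "orchestration" and execution.get("origin") == "deck":
        return BufferResult(Action.ENQUEUE, True, "orchestration from the UI")
    # Assumption: when the policy does not apply, it does not object.
    return BufferResult(Action.ENQUEUE, False, "not a UI orchestration")


def naive_policy(execution: Execution) -> BufferResult:
    """Always ask for the execution to be buffered."""
    return BufferResult(Action.BUFFER, False, "naive policy always buffers")


def decide(execution: Execution, buffering_enabled: bool,
           policies: List[BufferPolicy]) -> Action:
    # If the state supplier says "no buffering", start the execution as normal.
    if not buffering_enabled:
        return Action.ENQUEUE
    results = [policy(execution) for policy in policies]
    # A forced ENQUEUE wins even if other policies call for buffering.
    if any(r.force and r.action is Action.ENQUEUE for r in results):
        return Action.ENQUEUE
    # Otherwise every policy must agree on ENQUEUE.
    if all(r.action is Action.ENQUEUE for r in results):
        return Action.ENQUEUE
    return Action.BUFFER
```

With both policies in the chain, a UI orchestration is enqueued through the `force` flag, while any other execution is buffered whenever buffering is enabled.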

@@ -44,36 +43,36 @@ This promotion process happens (by default) on a 5-second interval on every Orca

Each `BufferPolicy` and `PromotionPolicy` function returns a result with a human-readable "reason", which is logged for each execution that is evaluated so that decisions are easy to trace.

<div class="mermaid">
sequenceDiagram
participant ExecutionPersister
participant ExecutionBufferActuator
participant BufferPolicy
participant ExecutionPromoter
participant PromotionPolicy
participant ExecutionLauncher
ExecutionPersister->>ExecutionBufferActuator: BeforeExecutionPersistEvent
ExecutionBufferActuator->>BufferPolicy: Execution
loop Buffer Chain
BufferPolicy->BufferPolicy: Evaluate if BUFFERED or ENQUEUED
end
alt ENQUEUED
BufferPolicy->>ExecutionLauncher: Start Execution
else BUFFERED
Note over BufferPolicy: Set BUFFERED status
end
note right of ExecutionPromoter: Every n seconds
ExecutionPromoter->>PromotionPolicy: All BUFFERED executions "candidates"
loop PromoteCycle
PromotionPolicy->PromotionPolicy: Reduce candidates
end
PromotionPolicy->>ExecutionPromoter: Final promotion candidates
loop Promote Executions
note over ExecutionPromoter: For each promoted execution
ExecutionPromoter->>ExecutionPersister: Update Execution status to NOT_STARTED
ExecutionPromoter->>ExecutionLauncher: Start Execution
end
</div>
{{< mermaid >}}
sequenceDiagram
participant ExecutionPersister
participant ExecutionBufferActuator
participant BufferPolicy
participant ExecutionPromoter
participant PromotionPolicy
participant ExecutionLauncher
ExecutionPersister->>ExecutionBufferActuator: BeforeExecutionPersistEvent
ExecutionBufferActuator->>BufferPolicy: Execution
loop Buffer Chain
BufferPolicy->BufferPolicy: Evaluate if BUFFERED or ENQUEUED
end
alt ENQUEUED
BufferPolicy->>ExecutionLauncher: Start Execution
else BUFFERED
Note over BufferPolicy: Set BUFFERED status
end
note right of ExecutionPromoter: Every n seconds
ExecutionPromoter->>PromotionPolicy: All BUFFERED executions "candidates"
loop PromoteCycle
PromotionPolicy->PromotionPolicy: Reduce candidates
end
PromotionPolicy->>ExecutionPromoter: Final promotion candidates
loop Promote Executions
note over ExecutionPromoter: For each promoted execution
ExecutionPromoter->>ExecutionPersister: Update Execution status to NOT_STARTED
ExecutionPromoter->>ExecutionLauncher: Start Execution
end
{{< /mermaid >}}

**Note**: This is the first implementation of the QoS system; we plan to iterate on this concept and make it more advanced over time.
You can read the [original proposal][proposal] to get an idea of a potential roadmap.
@@ -83,53 +82,51 @@ You can read the [original proposal][proposal] to get an idea of a potential roa
These configurations are not guaranteed to be fully inclusive of all knobs.
A definitive list is available via the codebase.

* `qos.enabled`: Boolean (default `false`). Global flag controlling whether or not the system is enabled. This flag will not disable the `ExecutionPromoter`.
* `qos.learningMode.enabled`: Boolean (default `true`). If enabled, executions will always be `ENQUEUED`, but log messages & metrics will be emitted saying what the system would have done. This flag has no effect on `ExecutionPromoter`.
* `pollers.qos.promoteIntervalMs`: Integer (default `5000`). The interval (in milliseconds) at which the promotion process runs.
- `qos.enabled`: Boolean (default `false`). Global flag controlling whether or not the system is enabled. This flag will not disable the `ExecutionPromoter`.
- `qos.learningMode.enabled`: Boolean (default `true`). If enabled, executions will always be `ENQUEUED`, but log messages & metrics will be emitted saying what the system would have done. This flag has no effect on `ExecutionPromoter`.
- `pollers.qos.promoteIntervalMs`: Integer (default `5000`). The interval (in milliseconds) at which the promotion process runs.
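
As a rough illustration of what `qos.learningMode.enabled` implies, the Python sketch below wraps a buffering decision so that it is only logged, never applied. The function name `apply_learning_mode` and the string-based decisions are hypothetical and are not taken from the Orca codebase.

```python
import logging

log = logging.getLogger("qos")


def apply_learning_mode(decision: str, learning_mode: bool, execution_id: str) -> str:
    """Return the action to take for a decision of "BUFFER" or "ENQUEUE"."""
    if learning_mode and decision == "BUFFER":
        # In learning mode, only report what the system would have done.
        log.info("learning mode: would have buffered execution %s", execution_id)
        return "ENQUEUE"
    return decision
```

Because the default is `true`, enabling QoS alone would, under this reading, change only logs and metrics until learning mode is switched off.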

## BufferPolicy: Naive

The `NaiveBufferPolicy` will always buffer executions when enabled.

* `qos.bufferPolicy.naive.enabled`: Boolean (default `true`).
- `qos.bufferPolicy.naive.enabled`: Boolean (default `true`).

## BufferStateSupplier: ActiveExecutions

The `ActiveExecutionsBufferStateSupplier` will enable/disable the buffering state based on the number of active executions in the system.

* `qos.bufferingState.supplier` must be set to `activeExecutions`.
* `qos.bufferingState.activeExecutions.threshold`: Integer (default `100`). The high threshold of active executions before QoS will start actuating on executions.
* `pollers.qos.updateStateIntervalMs`: Integer (default `5000`). The interval (in milliseconds) at which the supplier updates its internal count of executions running in the system.
- `qos.bufferingState.supplier` must be set to `activeExecutions`.
- `qos.bufferingState.activeExecutions.threshold`: Integer (default `100`). The high threshold of active executions before QoS will start actuating on executions.
- `pollers.qos.updateStateIntervalMs`: Integer (default `5000`). The interval (in milliseconds) at which the supplier updates its internal count of executions running in the system.

## BufferStateSupplier: KillSwitch

The `KillSwitchBufferStateSupplier` will enable/disable the buffering state based on configuration only.
This is handy if you're evaluating the fundamentals of the QoS system, or you want a break-the-glass operator knob to control QoS.

* `qos.bufferingState.supplier` must be set to `killSwitch`.
* `qos.bufferingState.killSwitch.enabled`: Boolean (default `false`). If `true`, QoS will be enabled.
- `qos.bufferingState.supplier` must be set to `killSwitch`.
- `qos.bufferingState.killSwitch.enabled`: Boolean (default `false`). If `true`, QoS will be enabled.
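
The two state suppliers described above might be modeled as in the Python sketch below. The class and method names are hypothetical, and whether the real threshold comparison is strict or inclusive is an assumption; check the Orca source for the actual behavior.

```python
from dataclasses import dataclass


@dataclass
class KillSwitchSupplier:
    # Mirrors qos.bufferingState.killSwitch.enabled (default false).
    enabled: bool = False

    def buffering_enabled(self) -> bool:
        # Driven by configuration only.
        return self.enabled


@dataclass
class ActiveExecutionsSupplier:
    # Mirrors qos.bufferingState.activeExecutions.threshold (default 100).
    threshold: int = 100
    active_count: int = 0

    def update(self, active_count: int) -> None:
        # Would be called on the pollers.qos.updateStateIntervalMs interval.
        self.active_count = active_count

    def buffering_enabled(self) -> bool:
        # Assumption: buffering starts once the count exceeds the threshold.
        return self.active_count > self.threshold
```

Since the active count is refreshed only on a polling interval, the buffering state in this sketch can lag the true load by up to one interval.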

## PromotionPolicy: Naive

The `NaivePromotionPolicy` will promote _N_ executions every promotion cycle.

* `qos.promotionPolicy.naive.enabled`: Boolean (default `true`). Whether or not this policy is enabled.
* `qos.promotionPolicy.naive.size`: Integer (default `1`). The max number of executions to promote.
- `qos.promotionPolicy.naive.enabled`: Boolean (default `true`). Whether or not this policy is enabled.
- `qos.promotionPolicy.naive.size`: Integer (default `1`). The max number of executions to promote.
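
A promotion cycle that reduces the buffered candidates through a chain of policies can be sketched like this. The function names and the use of plain execution IDs are hypothetical, and the order in which candidates are considered is an assumption.

```python
from typing import Callable, List

PromotionPolicy = Callable[[List[str]], List[str]]


def naive_promotion(size: int = 1) -> PromotionPolicy:
    """Build a policy that keeps at most `size` candidates per cycle."""
    def policy(candidates: List[str]) -> List[str]:
        return candidates[:max(size, 0)]
    return policy


def promote_cycle(candidates: List[str], policies: List[PromotionPolicy]) -> List[str]:
    # Each policy narrows the candidate list produced by the previous one.
    for policy in policies:
        candidates = policy(candidates)
    return candidates
```

With the default `size` of `1` and a 5-second interval, this sketch would promote at most one buffered execution per Orca instance every five seconds.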

# Monitoring

* `qos.executionsBuffered`: Counter. The number of executions that have been buffered.
* `qos.executionsEnqueued`: Counter. The number of executions that have been enqueued (i.e., passed through the system and were judged not to need buffering).
* `qos.actuator.elapsedTime`: Timer. The amount of time that is spent passing an execution through all enabled `BufferPolicy`s.
* `qos.promoter.elapsedTime`: Timer. The amount of time that is spent passing an execution through all enabled `PromotionPolicy`s. Since the promoter is run on a static interval, this should usually be a relatively high, yet constant, number.
* `qos.promoter.executionsPromoted`: Counter. The number of executions that have been promoted.
- `qos.executionsBuffered`: Counter. The number of executions that have been buffered.
- `qos.executionsEnqueued`: Counter. The number of executions that have been enqueued (i.e., passed through the system and were judged not to need buffering).
- `qos.actuator.elapsedTime`: Timer. The amount of time that is spent passing an execution through all enabled `BufferPolicy`s.
- `qos.promoter.elapsedTime`: Timer. The amount of time that is spent passing an execution through all enabled `PromotionPolicy`s. Since the promoter is run on a static interval, this should usually be a relatively high, yet constant, number.
- `qos.promoter.executionsPromoted`: Counter. The number of executions that have been promoted.

# Additional Notes

The QoS system currently uses shared-nothing state. Each Orca instance maintains its own state (aside from configuration) about whether it should be buffering executions and when it should be running pollers.

{% include mermaid %}

[module]: https://github.com/spinnaker/orca/tree/master/orca-qos
[actuator]: https://github.com/spinnaker/orca/blob/master/orca-qos/src/main/kotlin/com/netflix/spinnaker/orca/qos/ExecutionBufferActuator.kt
[buffer-state-supplier]: https://github.com/spinnaker/orca/blob/master/orca-qos/src/main/kotlin/com/netflix/spinnaker/orca/qos/BufferStateSupplier.kt