Releases: uber/cadence

v1.2.10

04 Jun 19:57
02c7efb

What's Changed

v1.2.9

01 May 17:46
ba39678

What's Changed

v1.2.8

26 Mar 18:46
3f64176

What's Changed

Added

Changed

Fixed

  • Set proper max reset points by @neil-xie in #5623
  • Put a timeout for timer task deletion loop during shutdown by @taylanisikdemir in #5626
  • Catch unit test failures in make test by @Groxx in #5635
  • fix: get messages between query over message_id typo by @zedongh in #5607
  • Fix context leak in tests by @munahaf in #5377
  • Make sure task processing rate limiter is only done in the active side by @sankari165 in #5654
  • Fix Pinot query validator bug when user pass in not equal query with value missing by @neil-xie in #5662
  • Update Pinot query validator failed log, minor refactor pinot visibility store to remove panics by @neil-xie in #5664
  • Fix context leak in pinot integration test by @neil-xie in #5682
  • Fix SignalWithStartWorkflow API by @Shaddoll in #5671
  • Fix wrong migration paths in example by @kotcrab in #5668
  • Fix comment in workflow id cache config by @sankari165 in #5661
  • Fix the local integration test docker-compose file by @jakobht in #5695
  • Do not get workflow execution from database when shard is closed by @Shaddoll in #5697

Removed

  • Removed useless metrics tag from the workflowIDcache by @jakobht in #5651
  • Removed the shadower service for cadence-server by @agautam478 in #5660

New Contributors

Full Changelog: v1.2.7...v1.2.8

v1.2.7

09 Feb 19:00
08d5994

What's Changed

Added

Fixed

Changed

v1.2.6

14 Dec 22:11
558780b

What's Changed

Added

Fixed

Changed

  • Cassandra version is changed from 3.11 to 4.1.3 by @taylanisikdemir (#5461)
    • If your machine already has the ubercadence/server:master-auto-setup image, re-pull it so that it works with the latest docker-compose*.yml files.
  • Move dynamic ratelimiter to its own file by @jakobht (#5451)
  • Create and use a limiter struct instead of just passing a function by @jakobht (#5454)
  • Dynamic ratelimiter factories by @jakobht (#5455)
  • Update github action for image publishing to released by @3vilhamster (#5460)
  • Update matching to emit metric for tasklist backlog size by @Shaddoll (#5448)
  • Change variable name from SecondsSinceEpoch into EventTimeMs by @bowenxia (#5463)

Removed

New Contributors

Full Changelog: v1.2.5...v1.2.6

v1.2.5

02 Nov 19:07
eb8eea9

What's Changed

Added

  • Scanner / Fixer changes by @Groxx in #5361
    • Stale-workflow detection and cleanup added to shardscanner, disabled by default.
    • New dynamic config to better control scanner and fixer, particularly for concrete executions.
    • Documentation about how scanner/fixer work and how to control them, see the scanner readme.md
    • This also includes example config to enable the new fixer.
  • MigrationChecker interface to expose migration CLI by @abhishekj720 in #5424
  • Added Pinot as new visibility store option by @neil-xie in #5201
    • Added pinot visibility triple manager to provide options to write to both ES and Pinot.
    • Added pinotVisibilityStore and pinotClient to support CRUD operations for Pinot.
    • Added pinot integration test to set up Pinot test cluster and test Pinot functionality.

Fixed

Full Changelog: v1.2.4...v1.2.5-prerelease3

v1.2.4

27 Sep 19:03
c93d6af

What's Changed

Full Changelog: v1.2.3...v1.2.4

v1.2.3 (Retracted, please use v1.2.4)

15 Sep 22:10
4a16136
Pre-release

Added

  • Expose workflow history size and count to client by @timl3136 (#5392)

Fixed

  • [cadence-cli] fix typo in input flag for parallelism by @sankari165 (#5397)

Changed

  • Update config store client to support SQL database by @Shaddoll (#5395)
  • Scaffold config store for sql plugins by @Shaddoll (#5396)
  • Improve poller detection for isolation by @Shaddoll (#5399)

v1.2.2

19 Sep 16:34
e5f605c

What's Changed

Full Changelog: v1.2.1...v1.2.2

v1.2.1

19 Sep 03:56
0e17485

Project release: Zonal isolation

This version introduces a few resiliency concepts into customers' worker task processing such that they can detect deployment or configuration failures earlier. These features are opt-in.

The high-level concept is to provide a means to subdivide work for workers into units called 'isolation-groups', along whatever partitioning mechanism your service requires.

By default, the partitioning mechanism provided will attempt to keep workflows running in the location they are started, so that customers may identify broken changes earlier rather than waiting for the deployment of an entire region. However, if there are no pollers available in that subdivision, it'll route the work elsewhere.

Nomenclature

Partitioning: A means to subdivide the tasks given to workflows. Many schemes are possible, and one default is provided. When a workflow is started, a group of partition keys is provided in the request headers. The partition keys determine which isolation-group of workers should process the workflow.
Workflow pinning: A partitioning scheme which emphasizes keeping workflows running in the location they were started.
Isolation-groups: A division of work within a customer region in which customers can subdivide their workers and pin workflows. This was originally intended as a synonym for 'zone' in the site-reliability sense, that is, a subdivision of a region. However, the important point is that this is a failure domain for customer workflows, so it may be an arbitrary subdivision of your cluster's traffic.
Isolation-group drain: A means of excluding work from an isolation-group. If an isolation-group is drained, workers in that isolation-group won't receive any tasks, and customers cannot start workflows from it.

Default concepts and approaches

The partitioning and isolation concepts are intended to be flexible, general-purpose orchestration concepts, with some basic defaults provided. By default, the following behaviour is given:

  • Partition data is persisted with workflow execution records by the provided middleware if the provided header is passed when workflows are created.
  • The Cadence client and worker Go libraries will pass these as headers if they are provided in the client options.

Pinning behaviour

The workflow's original zone is captured on workflow start and is used during workflow processing.

The default partitioner provides the following behaviour: it will attempt to dispatch work in the zone where the workflow was started. However, workers may not be available in that zone, or may no longer be available for some reason. So the partitioner looks back over recent poller information and uses this lookback data to ensure that the workflow can be processed. If the start isolation-group is not available, it'll pick another healthy one at random.

'Health', here, is determined as the presence of pollers and the absence of drains.

The 'unpinning' is important for two main reasons. Firstly, it's quite possible to start a workflow from an isolation-group unrelated to the one in which the pollers are running, and suddenly blackholing that work would likely not be the desired behaviour. Secondly, and probably more importantly, it prevents a head-of-line blocking problem internally for Cadence. At the database level (in this release, anyway) tasks need to be dispatched in order, so if an isolation-group were not processed it would block task processing.

Drains

This release also introduces a simple notion of drains, which allow isolation-groups to be excluded from traffic processing, should that be required. Drains can be issued via the Admin API or via the CLI.

e.g.:

cadence admin isolation-groups update-global --set-drains zone-1
cadence admin isolation-groups get-global

This information is stored in the config-store and is not part of dynamic configuration.

Configuration

To use this feature, the following configuration is required:

system.allIsolationGroups: This is a list of all the possible isolation-groups.
system.enableTasklistIsolation: This is the boolean flag that enables the feature for a domain.

Implementation

The changes for this feature are largely in Matching and can be (reductively) described as follows: sync and async match in Cadence are made aware of a new dimension, their associated isolation-group. Tasks piped through the Matching service are matched on the appropriate isolation-group channel.
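
As a rough illustration of the idea of one channel per isolation-group, the sketch below routes tasks and pollers by group. It is an assumption-laden toy, not the Matching service's code: the task and matcher types and the offer and poll methods are hypothetical names, and both operations are non-blocking for simplicity.

```go
package main

import "fmt"

// task is a hypothetical stand-in for a task flowing through matching.
type task struct {
	ID             string
	IsolationGroup string
}

// matcher holds one buffered channel per isolation-group (hypothetical type).
type matcher struct {
	channels map[string]chan task
}

// newMatcher creates a channel with the given buffer size for every group.
func newMatcher(groups []string, buffer int) *matcher {
	m := &matcher{channels: make(map[string]chan task, len(groups))}
	for _, g := range groups {
		m.channels[g] = make(chan task, buffer)
	}
	return m
}

// offer places a task on its group's channel without blocking. It reports
// false when the group is unknown or the channel buffer is full.
func (m *matcher) offer(t task) bool {
	ch, ok := m.channels[t.IsolationGroup]
	if !ok {
		return false
	}
	select {
	case ch <- t:
		return true
	default:
		return false
	}
}

// poll returns a waiting task for the poller's own group, if there is one.
// A poller never receives a task that was offered to a different group.
func (m *matcher) poll(group string) (task, bool) {
	ch, ok := m.channels[group]
	if !ok {
		return task{}, false
	}
	select {
	case t := <-ch:
		return t, true
	default:
		return task{}, false
	}
}

func main() {
	m := newMatcher([]string{"zone-1", "zone-2"}, 8)
	m.offer(task{ID: "t1", IsolationGroup: "zone-1"})

	// A zone-2 poller finds nothing, because t1 sits on the zone-1 channel.
	_, ok := m.poll("zone-2")
	fmt.Println(ok)

	// A zone-1 poller receives t1.
	t, ok := m.poll("zone-1")
	fmt.Println(t.ID, ok)
}
```

The sketch leaves out the unpinning step entirely; in the behaviour described earlier, a task whose group has no healthy pollers would be re-routed to another group rather than left on its original channel.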

What's Changed
