Releases · uber/cadence

04 Jun 19:57

neil-xie

v1.2.10

02c7efb

v1.2.10 Latest

What's Changed

Update duplicate request error to include request type by @Shaddoll in #5910
Update mutable state to generate workflow requests by @Shaddoll in #5821
Add AsDuplicateRequestError function by @Shaddoll in #5914
Bugfix for enumer in go 1.22 by @Groxx in #5915
Add tests for common/persistence/retryer.go by @natemort in #5911
Add tests for common/persistence/shardManager.go by @natemort in #5916
Add tests for persistence/workflow_execution_info.go by @natemort in #5918
Add more unit test to history handler by @timl3136 in #5897
Get rid of mutex in matching/liveness and reduce test duration by @taylanisikdemir in #5917
Add memo in pinot by @bowenxia in #5902
Added Executor Interface and TimerTaskExecutorBase with stop() Method and improve context management in TimerQueueProcessor by @timl3136 in #5920
[code-coverage] Add more tests for service/history/decision package by @ketsiambaku in #5909
Add document explaining the schema of Cassandra executions table by @Shaddoll in #5921
Add tests for ReadHistoryBranch by @jakobht in #5899
Fix failover error causing child workflows to get stuck by @davidporter-id-au in #5919
Adding tests for nosqlQueueStore by @dkrotx in #5924
Changed the error to DomainNotActive for Deprecated domains by @abhishekj720 in #5929
[code-coverage] clean up tests in history/decision/handler by @ketsiambaku in #5932
[code-coverage] add tests for HandleDecisionTaskCompleted() by @ketsiambaku in #5934
Fix bug when passing close status as an integer string by @neil-xie in #5935
Workaround for query-consistency-strong which is presently partially broken by @davidporter-id-au in #5928
Fix GetListWorkflowExecutionsByStatusQuery to set status as int by @neil-xie in #5936
Upgrade Apache Thrift to v0.17.0 by @3vilhamster in #5814
[cassandra] Expose timeout and consistency level configuration by @mantas-sidlauskas in #5675
Fix slice reuse in cassandra/domain.go by @natemort in #5937
Add double read for latency comparison for Pinot Migration by @bowenxia in #5927
Add missing metric tag for GetTaskListSizeRequest by @Shaddoll in #5939
Add tests for ForkHistoryBranch by @jakobht in #5922
Migrate Buildkite CI from AWS to GKE agent queues by @mstifflin in #5912
Fix checksum validation for SQL by @Shaddoll in #5940
Global ratelimiter, part 2: Any-typed RPCs, mappers, and stub handler by @Groxx in #5817
Integration test for workflow ID based rate limiting task processing by @sankari165 in #5933
[code-coverage] Add more tests for HandleDecisionTaskCompleted by @ketsiambaku in #5945
Update internal types to adopt new IDL changes by @Shaddoll in #5946
[Pinot] fix bug when querying a string field in attr with an empty value by @bowenxia in #5941
Add tests for DeleteHistoryBranch by @jakobht in #5943
We now wait 10 seconds before we start returning shard closed errors, also stop retrying on shard closed errors by @jakobht in #5938
Revert lowering the new line check by @jakobht in #5954
Increase timeouts to prevent flakiness by @sankari165 in #5953
Added tests for GetAllHistoryTreeBranches by @jakobht in #5944
Bugfix: we address hosts using string(rune(shardID)), not by itoa(shardID) by @dkrotx in #5952
Add staleness check to RecordChildExecutionCompleted by @Shaddoll in #5955
[code-coverage] Add more test cases for HandleDecisionTaskCompleted by @ketsiambaku in #5950
Adding unit tests for client/matching/client.go by @sankari165 in #5959
[code-coverage] Introduced first set of tests for taskHandler in service/history/decision by @ketsiambaku in #5960
Fix a bug when set memo in pinot visibility store by @neil-xie in #5961
unit test for cassandra/visibility.go by @d-vignesh in #5948
[code-coverage] Tests for Decision taskHandler by @ketsiambaku in #5951
Publish multiple platform docker image when release server by @neil-xie in #5962
Updated the changelog for release 1.2.9 by @jakobht in #5963
Update task executor to handle WorkflowAlreadyCompletedError for signal and cancel workflow by @Shaddoll in #5956
Fix wrong comment on enableAsyncWorkflowConsumption dynamic config by @taylanisikdemir in #5964
Add metric for async request payload size by @Shaddoll in #5965
Async wf consumer manager should watch its enabled/disabled state instead of relying on restart by @taylanisikdemir in #5966
chore: fix function names in comment by @verytrap in #5894
Replace wurstmeister kafka/zookeeper images with bitnami kafka image by @taylanisikdemir in #5975
Split historyEngine.go into small files by @taylanisikdemir in #5972
Added unit tests for service/history/handler by @timl3136 in #5970
Add unit tests for mutable state task refresher by @Shaddoll in #5971
Revert codecov patch threshold to 85% by @taylanisikdemir in #5982
Api handler test respond activity task failed alternate by @ibarrajo in #5980
Move shardscanner workflow tests to the shardscanner package by @natemort in #5981
Add tests for service/frontend/config/config.go by @natemort in #5968
Added tests for the history_events.go by @agautam478 in #5978
Added additional unit tests for service/history/handler.go by @timl3136 in #5984
Reduce flakiness on workflow-ID-specific ratelimit test by @Groxx in #5986
Enforcing go vet -copylocks and fixing current violations by @Groxx in #5967
Added new tests to config_Store_client_test.go by @agautam478 in #5983
Add tests for history/execution/history_builder.go by @natemort in #5977
History engine start/stop unit tests by @taylanisikdemir in #5985
Added tests to history_events.go. by @agautam478 in #5988
Added unit tests for history handler by @timl3136 in #5987
Add unit test for open search client bulk requests by @neil-xie in #5974
Add tests for history/engine/engineimpl/describe_workflow_execution.go by @natemort in #5992
Add test for NewHistoryReplicator in history_replicator.go by @bowenxia in #5994
Added additional unit tests for methods history/handler.go by @timl3136 in #5993
lowering threshold for PRs for a one-time refactor/split by @davidporter-id-au in #5997
Add unit test for frontend/admin/handler - part 1 by @neil-xie in #5991
Minor splitting of mutable state builder file by @davidporter-id-au in #5990
Write tests for history engine's RefreshWorkflowTasks by @taylanisikdemir in #5995
Update coverage exclusions by @taylanisikdemir in #5999
Replication task processor shutdown improvements and start/stop unit tests by @taylanisikdemir in #5996
Added additional unit tests testing history handler by @timl3136 in #6001
Add test coverage for service/history/engine/engineimpl/reset_workflow_execution.go by @natemort in #6002
mutable-state: copy to persistence round-trip test by @davidporter-id-au in #5998
Added tests for GetResurrected timers in integrity for hi...

Contributors

Groxx, jakobht, and 18 other contributors

01 May 17:46

jakobht

v1.2.9

ba39678

v1.2.9

What's Changed

Addition of tests for ArchivalConfigStateMachine in common/domain by @abhishekj720 in #5698
Introduce new dynamic config for enabling wfID based ratelimiting by @jakobht in #5703
Add unit tests for sql plugin registration by @Shaddoll in #5705
Add unit tests for sql helper functions by @Shaddoll in #5706
Add unit test for helper function of sql execution store by @Shaddoll in #5707
Generate a metadata file artifact in unit test buildkite job by @taylanisikdemir in #5708
Write tests for cdb.UpdateWorkflowExecutionWithTasks by @taylanisikdemir in #5709
Add unit tests for helper functions in sql execution store util by @Shaddoll in #5710
Add unit tests for CreateWorkflowExecution by @Shaddoll in #5715
Test: Addition of tests for replicationQueue publish and publish to dlq by @abhishekj720 in #5700
Implemented ratelimiting for external calls per wfid (guarded by feature flag) by @jakobht in #5704
remove old metrics wrappers and use new generated metered wrappers by @3vilhamster in #5717
Proper shutdown of kafka consumer impl and fix test by @taylanisikdemir in #5712
Add additional unit tests for functions in constants.go by @timl3136 in #5713
Initial codecov integration by @taylanisikdemir in #5711
Add tests for UpdateWorkflowExecution by @Shaddoll in #5718
Tests for UpdateWorkflowExecution in nosql store-Part1 by @agautam478 in #5719
Add unit tests for ConflictResolveWorkflowExecution by @Shaddoll in #5721
Add tests for elasticsearch v6 client by @neil-xie in #5716
Add unit tests for persistence task types in DataManagerInterfaces by @timl3136 in #5720
Add unit tests for CreateFailoverMarkerTasks by @Shaddoll in #5724
Change noisy frontend poll timeout log to debug level by @taylanisikdemir in #5725
Added unit tests for nosql_execution_Store_util.go - Part1 by @agautam478 in #5723
Straightforwardly fixes a few minor copy bugs and adds a small fuzz util by @davidporter-id-au in #5572
Add test for ES v6 client Search method by @neil-xie in #5727
Tests for Common/Domain: Adding tests for replication queue message handling and ack update by @abhishekj720 in #5730
Add more unit tests for persistence task types in DataManagerInterfaces by @timl3136 in #5726
Added two more test cases for the updateworkflowexecution by @agautam478 in #5722
[history] refactor history client with timeout wrapper by @shijiesheng in #5728
Add unit tests for PinotVisibilityStore by @bowenxia in #5714
Removed errors file from test coverage by @abhishekj720 in #5735
Test for Common/domain/replication_queue: GetMessagesfromDLQ & AckLevel by @abhishekj720 in #5734
Added unit tests for Delete current and workflow execution, list all … by @agautam478 in #5733
Added unit tests for PrepareResetWorkflowExecutionRequestWithMapsAndE… by @agautam478 in #5731
Adding more unit tests for ES v6 client by @neil-xie in #5739
Tests for GetDLQAckLevel and UpdateDLQAckLevel by @abhishekj720 in #5740
Add unit tests for TaskInfo types and utility functions by @timl3136 in #5732
Tests for common/domain: tests TestGetDLQSize, TestRangeDeleteMessagesFromDLQ and TestDeleteMessageFromDLQ by @abhishekj720 in #5741
Add error case tests for pinot_visibility_store by @bowenxia in #5746
Add unit test for util methods in es v6 client bulk processor by @neil-xie in #5748
Add unit tests for GetWorkflowExecution by @Shaddoll in #5736
Adds test for execution/mutable_state_builder.go by @davidporter-id-au in #5744
Add unit tests for the util functions in data_manager_interface by @timl3136 in #5742
Very minor nil-or-empty cleanup by @Groxx in #5745
Added more tests for nosql_execution_store.go by @agautam478 in #5738
Write more tests for cassandra/workflows.go by @taylanisikdemir in #5750
Added more tests for nosql_execution_store_util.go by @agautam478 in #5752
Enforce leading space on comments by @Groxx in #5747
Add unit tests for common/persistence/sql/factory.go by @Shaddoll in #5751
[history] fix generated timeout wrapper by @shijiesheng in #5737
Add unit tests for functions in gocql/batch.go by @timl3136 in #5759
Add test for es v6 bulk processor by @neil-xie in #5758
Added test for replicationTaskExecutor: execute by @abhishekj720 in #5754
Add unit test for ES v7 client by @neil-xie in #5760
Added test cases for more util methods by @agautam478 in #5755
More unit tests for nosql_execution_store_test.go by @agautam478 in #5753
Add unit test for pinot folder with coverage to 93.4% by @bowenxia in #5761
[code-coverage] update admin and frontend client to use generated code by @ketsiambaku in #5702
Tests for PurgeAckedMessages and replicationMessage in common/domain/replication_queue by @abhishekj720 in #5749
Code cleanup for sql package by @Shaddoll in #5756
Add unit test for es v7 bulk processor by @neil-xie in #5764
Added test for pinot_visibility_metric_clients.go by @bowenxia in #5767
adding mutable state builder tests - adding continue-as-new events by @davidporter-id-au in #5768
Refactor/adding mutable state builder tests iv by @davidporter-id-au in #5769
Add unit test for open search client part 1 by @neil-xie in #5774
minor mutable-state log fix by @davidporter-id-au in #5776
refactor common/persistence/pinot tests by @bowenxia in #5777
Addition of tests for archivalConfigStateMachine in common/domain by @abhishekj720 in #5778
Re-enable sql unit test by @Shaddoll in #5779
Test: Validate domain config test for attrValidator by @abhishekj720 in #5699
refactor pinot_visibility_store_test by @bowenxia in #5780
[code-coverage] Generate code for matching client timeout wrapper by @ketsiambaku in #5771
Fix data race in matching test suite by @taylanisikdemir in #5781
hot fix for unit test cases that might cause a failure by @bowenxia in #5787
Adding unit tests for TestPrepareTransferTasksForWorkflowTxn by @agautam478 in #5763
Ignore requests send from pinot response comparator by @bowenxia in #5788
Coverage for dataStoreInterfaces by @Groxx in #5743
Retryable error for workflow rate limits in task processing by @sankari165 in #5782
Re-enable kafka consumer test by @taylanisikdemir in #5791
Global ratelimiter, part 1: core algorithm for computing weights by @Groxx in #5689
Write tests for cassandra SelectWorkflowExecution by @taylanisikdemir in #5792
Fix workflow deletion by @Shaddoll in #5793
Fix checksum validation for SQL implementation by @Shaddoll in #5790
added unit test for function in mapper-thrift-configstore file by @d-vignesh in #5789
Error mapper tests by @jakobht in #5795
Add a benchmark test for crc checksum by @Shaddoll in #5798
Add metric and retry backoff for checksum failure by @Shaddoll in #5797
Added new er...

Contributors

Groxx, jakobht, and 15 other contributors

26 Mar 18:46

neil-xie

v1.2.8

3f64176

v1.2.8

What's Changed

Added

Adding unit-test for matching:newTaskListID by @dkrotx in #5513
Get/Update DomainAsyncWorkflowConfiguration methods in admin API and CLI by @taylanisikdemir in #5616
Workflow ID cache size metric by @jakobht in #5619
Add a helper script to run cassandra and execute tests by @taylanisikdemir in #5620
Scaffold StartWorkflowExecutionAsync API by @Shaddoll in #5621
Scaffold async workflow queue provider component by @Shaddoll in #5627
Update run_cass_and_test.sh script to setup cassandra schemas by @taylanisikdemir in #5628
Add debug logs in PinotTripleVisibilityManager for response comparator testing by @bowenxia in #5631
Adding a sample call to TaskValidator in update workflow cycle by @agautam478 in #5634
Add a middleware for comparator to use by @bowenxia in #5637
Generate rate limit frontend api handler by @Shaddoll in #5636
Add generic OAuth support by @mantas-sidlauskas in #5638
Added metrics for when we rate limit by @jakobht in #5640
Implement StartWorkflowExecutionAsync API by @Shaddoll in #5642
Added 2 more tags in log for comparator to use. by @bowenxia in #5646
Async workflow request consumer manager in worker by @taylanisikdemir in #5655
Add async workflow request consumer for Start/SignalWithStart support by @taylanisikdemir in #5658
Set rate limit on Async APIs by @Shaddoll in #5659
Implement SignalWithStartWorkflowExecutionAsync API by @Shaddoll in #5657
Docker compose setup for async workflows with kafka queue by @taylanisikdemir in #5663
Add a make pr target for an easy "do automated checks for PR" command by @Groxx in #5670
Added debug information for decision timeout handling by @3vilhamster in #5674
Async workflows integration test with kafka by @taylanisikdemir in #5678
Add missing IsolationGroups field in domain cache entry by @taylanisikdemir in #5679
Add close status parse method in pinot query validator by @neil-xie in #5680
Add async workflow integration test step to CI by @taylanisikdemir in #5681
Add metrics for external calls for the workflow ID specific rate limits by @jakobht in #5684
Write tests for cdb (Cassandra DB wrapper) basic functions by @taylanisikdemir in #5686
Added a unit test for nosql execution store - createworkflowexecution by @agautam478 in #5687
Write tests for cdb.InsertWorkflowExecutionWithTasks by @taylanisikdemir in #5688
Added more scenarios to createworkflowexecution test- Part1 by @agautam478 in #5690
Added a test for the GetworkflowExecution in the nosql_execution_store.go file. by @agautam478 in #5692
Write tests for cdb.SelectCurrentWorkflow by @taylanisikdemir in #5693
Support AsyncWorkflowConfiguration decoding in admin CLI by @taylanisikdemir in #5694

Changed

Replace JWT validation library by @mantas-sidlauskas in #5592
feat: pprof support config host by @zedongh in #5601
Refactor persistence serializer tests and add more cases by @taylanisikdemir in #5625
Upgrade domain_config type in cassandra schema to add async wf config by @taylanisikdemir in #5630
Refactor frontend API handler and use generated code to emit metrics by @Shaddoll in #5639
Enable the workflow ID cache in shadow mode for start workflow by @jakobht in #5641
Filtering the prefix in custom query log for pinot response comparator by @bowenxia in #5643
The ratelimiter needs to be created with the domain name not the ID by @jakobht in #5644
Update async workflow queue idl change by @Shaddoll in #5645
Rewrite async workflow queue provider component by @Shaddoll in #5648
Store mutable state checksum in SQL storage by @Shaddoll in #5649
Splitting wfCacheEnabled config for internal and external requests by @sankari165 in #5647
Convert pinot query to use unix milliseconds instead of nano by @neil-xie in #5650
Emit metrics when transfer tasks could be ratelimited by @sankari165 in #5652
Update change log for v1.2.7 release by @neil-xie in #5653
Update pinot query validator to handle raw time string by @neil-xie in #5656
Emit metrics when transfer tasks for decisions could be ratelimited by @sankari165 in #5665
Upgrade pinot client version by @neil-xie in #5666
Update the build-changed message failure by @Groxx in #5667
Improve error message for membership resolver by @Shaddoll in #5669
Emits a counter value for every unique view of the hashring by @davidporter-id-au in #5672
Refactor history packages by @jakobht in #5673
Improve test coverage for sql_execution_store_util by @Shaddoll in #5676
Improve test coverage for sql_execution_store by @Shaddoll in #5677
Improve test coverage for constants.go by @timl3136 in #5685
Enable retry on mutable state checksum verification failure by @Shaddoll in #5691

Fixed

Set proper max reset points by @neil-xie in #5623
Put a timeout for timer task deletion loop during shutdown by @taylanisikdemir in #5626
Catch unit test failures in make test by @Groxx in #5635
fix: get messages between query over message_id typo by @zedongh in #5607
Fix context leak in tests by @munahaf in #5377
Make sure task processing rate limiter is only done in the active side by @sankari165 in #5654
Fix Pinot query validator bug when user pass in not equal query with value missing by @neil-xie in #5662
Update Pinot query validator failed log, minor refactor pinot visibility store to remove panics by @neil-xie in #5664
Fix context leak in pinot integration test by @neil-xie in #5682
Fix SignalWithStartWorkflow API by @Shaddoll in #5671
Fix wrong migration paths in example by @kotcrab in #5668
Fix comment in workflow id cache config by @sankari165 in #5661
Fix the local integration test docker-compose file by @jakobht in #5695
Do not get workflow execution from database when shard is closed by @Shaddoll in #5697

Removed

Removed useless metrics tag from the workflowIDcache by @jakobht in #5651
Removed the shadower service for cadence-server by @agautam478 in #5660

New Contributors

@zedongh made their first contribution in #5607
@munahaf made their first contribution in #5377
@kotcrab made their first contribution in #5668

Full Changelog: v1.2.7...v1.2.8

Contributors

Groxx, jakobht, and 14 other contributors

09 Feb 19:00

neil-xie

v1.2.7

08d5994

v1.2.7

What's Changed

Added

Add metrics to monitor task validation. by @agautam478 in #5466
Add an "all results" query to scanner/fixer workflows by @Groxx in #5470
Add retries into Scanner BlobWriter by @agautam478 in #5471
Added a unit test for the BlobStoreWriter. by @agautam478 in #5472
Add Debugf and some minor updates to timer queue processor base by @taylanisikdemir in #5475
Add unit tests for cassandra workflow utils part-1 by @taylanisikdemir in #5476
Add workflow query-types command to CLI by @arzonus in #5456
Add unit test for cassandra workflow utils part-2 by @taylanisikdemir in #5480
Unit tests for admin cli decode_thrift command by @taylanisikdemir in #5485
Add unit test for sqlConfigStore by @Shaddoll in #5491
Add unit test for mysql configstore by @Shaddoll in #5502
Add persistence serialization unit tests by @3vilhamster in #5507
Adding unit tests to workflowHandler_test.go by @sankari165 in #5500
Add unit tests for AwaitWaitGroup by @arzonus in #5512
Add unit test for sql domain store by @Shaddoll in #5508
Add unit test for cassandra workflow utils part-3 by @taylanisikdemir in #5506
Adding unit tests for RecordActivityTaskHeartbeat by @sankari165 in #5511
add unit tests for ValidIDLength by @arzonus in #5520
Test for rate limited wrappers around persistence clients by @3vilhamster in #5518
Test for error injection clients by @3vilhamster in #5515
Add unit test for sql history store by @Shaddoll in #5524
Adding unit tests to RespondActivityTaskCompleted and RecordActivityT… by @sankari165 in #5521
Add unit tests for IsEntityNotExistsError by @arzonus in #5528
Add unit tests for CreateXXXRetryPolicy by @arzonus in #5527
Add unit tests for ValidateRetryPolicy by @arzonus in #5529
Add unit tests for ConvertGetTaskFailedCauseToErr by @arzonus in #5531
Add unit tests for WorkflowIDToHistoryShard and DomainIDToHistoryShard by @arzonus in #5533
Added a unit test for the timer.go file in reconciliation folder. by @agautam478 in #5505
Adding logging to scanner.go by @agautam478 in #5535
Adding a metric for hosts not being found in resolver by @davidporter-id-au in #5414
Added logs to concrete_execution.go by @agautam478 in #5536
Add unit tests for sql queue store by @Shaddoll in #5541
Unit tests for timer/transfer queue processor pump loops by @taylanisikdemir in #5540
Add unit tests for sql shard store by @Shaddoll in #5543
Add unit test for kafka partition ack manager by @neil-xie in #5545
Add unit tests for GenerateRandomString by @arzonus in #5532
Add unit tests for IsValidContext by @arzonus in #5546
Add unit tests for CreateChildContext by @arzonus in #5547
Add unit tests for DeserializeSearchAttributeValue by @arzonus in #5548
Add unit tests for GetSizeOfHistoryEvent by @arzonus in #5550
Add unit tests for thrift mappers by @taylanisikdemir in #5542
Add unit tests for sql task store by @Shaddoll in #5558
Added logs into the current execution.go and a unit test by @agautam478 in #5555
Add unit test for kafka producer impl by @neil-xie in #5559
Add shard id to queue processor related metrics by @taylanisikdemir in #5557
Add unit tests for sql execution store by @Shaddoll in #5565
Add unit test for new Kafka client by @neil-xie in #5570
Add unit tests for helper functions in sql execution store util by @Shaddoll in #5571
Added tests for visibility sampling wrapper by @3vilhamster in #5564
Add unit test for consumer impl by @neil-xie in #5573
Add unit tests for workflow state non maps by @Shaddoll in #5578
Add logs to debug timer tasks by @Shaddoll in #5581
Added deprecated domain check to the taskvalidator by @agautam478 in #5580
Add unit tests for IsServiceTransientError by @arzonus in #5551
Add unit tests for IsAdvancedVisibilityWritingEnabled by @arzonus in #5552
Add unit tests for ValidateLongPollXXX by @arzonus in #5553
Add grafana dashboard to visualize persistence metrics for default docker-compose setup by @taylanisikdemir in #5582
Add missing exclude-query support to list-workflows on the CLI by @Groxx in #5583
Add unit tests for DurationToXXX and XXXToDuration by @arzonus in #5530
Add more debug logs for user timer task execution by @taylanisikdemir in #5595
Add cache for workflow specific in memory data by @jakobht in #5594
Added three dynamic config properties by @jakobht in #5602
add ContextKey Struct by @bowenxia in #5606
Adding a stale workflow check to the taskvalidator and code cleanup. by @agautam478 in #5604
Added more error handling in workflow cache by @jakobht in #5611

Fixed

Improves metric and error handling for history by @davidporter-id-au in #5469
Address map access data race in matching engine by @taylanisikdemir in #5477
fix docker compose tests by @3vilhamster in #5479
Fix copying suite.Suite in integration tests by @3vilhamster in #5481
fix scavenger test suite by @3vilhamster in #5490
fix scavenger suite by @3vilhamster in #5498
Fixing matching:TestCheckIdleTaskList test flakiness by @dkrotx in #5494
fix leaky goroutines in matching by @3vilhamster in #5499
Unit test for the fetcher/current.go. by @agautam478 in #5504
More fixes for golint.sh by @Groxx in #5519
Fix race between startup and shutdown in task reader by @Groxx in #5522
Ensure scanner scavenger stops in tests by @3vilhamster in #5510
Bugfix/debugging stuck tasklist by @davidporter-id-au in #5436
Fix multiple lock acquire on membership update by @3vilhamster in #5576
Properly catch errors in ldflag-gathering and fail the build by @Groxx in #5539
Addressed sync issue in workflow cache by @jakobht in #5605
fix a comment by @bowenxia in #5610
Fixed lint errors introduced in previous PR by @jakobht in #5613

Changed

Update kafka config to have isSecure option by @neil-xie in #5473
Minor change to include domainTag and pass domainName. by @agautam478 in #5468
Wrap isSecure config in config map for kafka topic by @neil-xie in #5474
Update changelog for v1.2.6 release by @neil-xie in #5478
Unify cassandra setup in docker-compose by @3vilhamster in #5482
Unify logging in tests by @3vilhamster in #5487
Updated the unit test for BlobstoreIterator into a table format by @agautam478 in #5488
update cassandra dev setup by @3vilhamster in #5501
Converted the existing test for concrete.go execution into a table test by @agautam478 in #5503
Improve logs/metrics of HandleDecisionTaskCompleted by @taylanisikdemir in #5497
Revert gofuzz us...

Contributors

Groxx, jakobht, and 11 other contributors

14 Dec 22:11

neil-xie

v1.2.6

558780b

v1.2.6

What's Changed

Added

Added range query support for Pinot json index by @bowenxia (#5426)
Implemented GetTaskListSize method at persistence layer by @Shaddoll (#5442, #5447)
Added a framework for the Task validator service by @agautam478 (#5446)
Added nit comments describing the Update workflow cycle by @agautam478 (#5432)
Added log user query param by @bowenxia (#5437)
Added CODEOWNERS file by @taylanisikdemir (#5453)
Added a function to evict all elements older than the cache TTL by @jakobht (#5464)

Fixed

Fixed workflow replication for reset workflow by @Shaddoll (#5412)
Fixed visibility mode for admin when use Pinot visibility by @neil-xie (#5441)
Fixed workflow started metric by @ketsiambaku (#5443)
Fixed timer-fixer, unfortunately broken in 1.2.5 by @Groxx (#5433)
Fixed confusing comment in matching handler by @jakobht (#5450)

Changed

Cassandra version is changed from 3.11 to 4.1.3 by @taylanisikdemir (#5461)
- If your machine already has ubercadence/server:master-auto-setup image then you need to repull so it works with latest docker-compose*.yml files
Move dynamic ratelimiter to its own file by @jakobht (#5451)
Create and use a limiter struct instead of just passing a function by @jakobht (#5454)
Dynamic ratelimiter factories by @jakobht (#5455)
Update github action for image publishing to released by @3vilhamster (#5460)
Update matching to emit metric for tasklist backlog size by @Shaddoll (#5448)
Change variable name from SecondsSinceEpoch into EventTimeMs by @bowenxia (#5463)

Removed

Get rid of noisy task adding failure log in matching service by @taylanisikdemir (#5445)

New Contributors

@jakobht made their first contribution in #5450

Full Changelog: v1.2.5...v1.2.6

Contributors

Groxx, jakobht, and 7 other contributors

02 Nov 19:07

sankari165

v1.2.5

eb8eea9

v1.2.5

What's Changed

Added

Scanner / Fixer changes by @Groxx in #5361
- Stale-workflow detection and cleanup added to shardscanner, disabled by default.
- New dynamic config to better control scanner and fixer, particularly for concrete executions.
- Documentation about how scanner/fixer work and how to control them, see the scanner readme.md
- This also includes example config to enable the new fixer.
MigrationChecker interface to expose migration CLI by @abhishekj720 in #5424
Added Pinot as new visibility store option by @neil-xie in #5201
- Added pinot visibility triple manager to provide options to write to both ES and Pinot.
- Added pinotVisibilityStore and pinotClient to support CRUD operations for Pinot.
- Added pinot integration test to set up Pinot test cluster and test Pinot functionality.

Fixed

Fix CreateWorkflowModeContinueAsNew for SQL by @Shaddoll in #5413
Fix CLI count&list workflows error message by @ketsiambaku in #5417
Hotfix for async matching for isolation-group redirection by @davidporter-id-au in #5423
Fix closeStatus for --format flag by @ketsiambaku in #5422

Full Changelog: v1.2.4...v1.2.5-prerelease3

Contributors

Groxx, davidporter-id-au, and 4 other contributors

27 Sep 19:03

neil-xie

v1.2.4

c93d6af

v1.2.4

What's Changed

Remove database check for config store tests by @Shaddoll in #5401
Fix persistence tests setup by @Shaddoll in #5402
Implement config store for MySQL by @Shaddoll in #5403
Retract v1.2.3 by @sankari165 in #5406
Implement config store for PostgreSQL by @Shaddoll in #5405
Release v1.2.4 by @Shaddoll in #5407

Full Changelog: v1.2.3...v1.2.4

Contributors

Shaddoll and sankari165

15 Sep 22:10

Shaddoll

v1.2.3

4a16136

v1.2.3 (Retracted, please use v1.2.4) Pre-release

Added

Expose workflow history size and count to client by @timl3136 (#5392)

Fixed

[cadence-cli] fix typo in input flag for parallelism by @sankari165 (#5397)

Changed

Update config store client to support SQL database by @Shaddoll (#5395)
Scaffold config store for sql plugins by @Shaddoll (#5396)
Improve poller detection for isolation by @Shaddoll (#5399)

Contributors

Shaddoll, timl3136, and sankari165

19 Sep 16:34

sankari165

v1.2.2

e5f605c

v1.2.2

What's Changed

add a update workflow execution count metric for RI by @allenchen2244 in #5386
Pass partition config and isolation group to history/matching even if isolation is disabled by @Shaddoll in #5385
[CLI] fix nil pointer issue in domain migration command rendering by @shijiesheng in #5378
Release v1.2.2 by @shijiesheng in #5388

Full Changelog: v1.2.1...v1.2.2

Contributors

shijiesheng, Shaddoll, and allenchen2244

19 Sep 03:56

davidporter-id-au

v1.2.1

0e17485

v1.2.1

Project release: Zonal isolation

This version introduces a few resiliency concepts into customers' worker task processing such that they can detect deployment or configuration failures earlier. These features are opt-in.

The high-level concept is to provide a means to subdivide work for workers (into what are called 'isolation-groups') along whatever partitioning mechanism your service requires.

By default, the partitioning mechanism provided will attempt to keep workflows running in the location they are started, so that customers can identify breaking changes earlier rather than waiting for the deployment of an entire region. However, if there are no pollers available in that subdivision, it'll route the work elsewhere.

Nomenclature

Partitioning: A means to subdivide the tasks given to workflows. Many schemes are possible, and one default is provided. When a workflow is started, a group of partition keys is provided by request headers. The partition keys are used to determine which isolation group of workers should process these workflows.
Workflow pinning: A partitioning scheme which emphasizes keeping workflows running in the location they were started.
Isolation-groups: A division of work within a customer region in which they can subdivide their workers and pin the workflows. This was originally intended as a synonym for 'zone' in the site-reliability sense, as a subdivision of a region. However, the important point is that this is a failure domain for customer workflows, so this may be an arbitrary subdivision of your cluster's traffic.
Isolation-group drain: A means of excluding work from an isolation-group. If an isolation group is drained, workers from that isolation group won't be able to get any tasks, and customers cannot start workflows from that isolation group.

Default concepts and approaches

The partitioning and isolation concepts are intended to be flexible, general-purpose orchestration concepts, with some basic defaults provided. By default, the following behaviour is given:

Partition data is persisted with workflow execution records by the provided middleware if the provided header is passed when workflows are created.
The Cadence client and worker Go libraries will pass these as headers if they are provided in client options.

Pinning behaviour

The workflow's original zone is captured on workflow start and is used during workflow processing.

The default partitioner provides the following behaviour: it will attempt to dispatch work in the zone where the workflow was started. However, workers may not be available in that zone, or may no longer be available for some reason. So the partitioner uses a lookback of poller information to ensure that the workflow can be processed. If the start isolation-group is not available, it'll pick another healthy one at random.

'Health', here, is determined as the presence of pollers and the absence of drains.
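The pinning and fallback behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the Cadence implementation: the `groupState` type, the `healthy` and `pickIsolationGroup` functions, and the `choose` callback are all hypothetical names introduced here, and the real partitioner derives poller presence from its lookback data rather than from a prebuilt map.

```go
package main

import (
	"fmt"
	"math/rand"
	"sort"
)

// groupState is a hypothetical snapshot of one isolation-group, built from
// recent poller history and the current drain list.
type groupState struct {
	HasPollers bool // at least one poller was seen within the lookback window
	Drained    bool // the group has been drained by an operator
}

// healthy mirrors the definition in the text: pollers present, no drain.
func healthy(s groupState) bool {
	return s.HasPollers && !s.Drained
}

// pickIsolationGroup returns the group a task should be dispatched to.
// It prefers the group the workflow started in and otherwise falls back to a
// healthy group selected by choose, which must return an index in [0, n).
// The boolean is false when no group is healthy.
func pickIsolationGroup(start string, groups map[string]groupState, choose func(n int) int) (string, bool) {
	if s, ok := groups[start]; ok && healthy(s) {
		return start, true
	}
	var candidates []string
	for name, s := range groups {
		if healthy(s) {
			candidates = append(candidates, name)
		}
	}
	if len(candidates) == 0 {
		return "", false
	}
	// Map iteration order is not stable, so sort before indexing.
	sort.Strings(candidates)
	return candidates[choose(len(candidates))], true
}

func main() {
	groups := map[string]groupState{
		"zone-1": {HasPollers: true, Drained: true}, // drained, so unhealthy
		"zone-2": {HasPollers: true},
		"zone-3": {HasPollers: false}, // no pollers, so unhealthy
	}
	// The workflow started in zone-1, which is drained, so it is unpinned.
	group, ok := pickIsolationGroup("zone-1", groups, rand.Intn)
	fmt.Println(group, ok) // prints "zone-2 true": zone-2 is the only healthy group
}
```

Passing the random choice in as a function keeps the sketch deterministic under test; a real partitioner would not necessarily be structured this way.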

The 'unpinning' is important for two main reasons. Firstly, it's quite possible to start a workflow from an isolation-group unrelated to the one in which the pollers are created, and suddenly blackholing that work would likely not be the desired behaviour. Secondly, and probably more importantly, it prevents a head-of-line blocking problem internally for Cadence. At the database level (in this release, anyway) tasks need to be dispatched in order, so if an isolation-group were not being processed, it would block task processing.

Drains

This release also introduces a simplistic notion of drains, which allow isolation-groups to be excluded from traffic processing, should that be required. Drains can be issued via the Admin API or via the CLI:

eg:

cadence admin isolation-groups update-global --set-drains zone-1
cadence admin isolation-groups get-global

This information is stored in the config-store and is not part of dynamic configuration.

Configuration

To use this feature, the following configuration must be set:

system.allIsolationGroups: A list of all the possible isolation-groups
system.enableTasklistIsolation: A boolean flag that enables the feature for a domain

Implementation

The changes for this feature are largely in Matching and can be (reductively) described as follows: sync and async match in Cadence are made aware of a new dimension, their associated isolation-group. Tasks piped through the Matching service are matched on the appropriate isolation-group channel.

What's Changed

Set config for shardscanner fixer by @mantas-sidlauskas in #3844
Fix get raw history for transient decision by @yycptt in #3847
Fix error handling when processing parent close policy by @yycptt in #3845
Add logging/metrics for decision attempts by @yycptt in #3849
Switch to gocql interface by @yycptt in #3837
Fix NPE in DescribeMutableState by @yycptt in #3850
Switch the remaining history component to internal types by @vytautas-karpavicius in #3843
Switch Health status endpoints to internal types by @vytautas-karpavicius in #3842
reset workflow with no decision task complete by @yux0 in #3687
error check before return the ActivityLocalDispatchInfo by @mkolodezny in #3853
Delete unused dynamic configs that have no referrence anymore by @longquanzheng in #3859
Merge sql updates: Blob size increase by @yux0 in #3858
Handle matching task list conditional error by @yux0 in #3867
Fix go-generate by @yycptt in #3864
Support visibility query with close status represented in string by @yycptt in #3865
Add timers shardscanner by @mantas-sidlauskas in #3846
replace string based logging with tagged logs by @mantas-sidlauskas in #3871
Downgrade golang tools version by @yycptt in #3876
Add instructions to setup local MySQL and Postgres by @yux0 in #3868
Make max activity schedule to start timeout for retry configurable by domain by @yycptt in #3878
Task processing debug logs by @yycptt in #3877
Transfer queue validator by @yycptt in #3875
Pick sql index changes by @yux0 in #3866
Remove strict sanity check to allow reset by @yux0 in #3879
Improve shard context timeout handling by @yycptt in #3881
Add domain name tag in failover metrics by @yux0 in #3882
break out when response is nil by @mantas-sidlauskas in #3886
Allow using Kafka TLS without cert ca and key by @longquanzheng in #3862
Fix dynamic config collection logValue function by @yycptt in #3880
Update read DLQ messages API to return raw task info by @yux0 in #3869
break if adminClient returns error by @mantas-sidlauskas in #3887
Latest idl by @yux0 in #3888
Fix activity lost metrics by @yycptt in #3889
Add replication error logging and metrics by @yux0 in #3891
Simplify templateGetLastMessageIDQuery sql query by @andrewjdawson2016 in #3890
Add task processing workflow busy metric by @yycptt in #3892
CLI 0.18.0 release by @yycptt in #3896
Handle data corruption error in replication by @yux0 in #3895
Add a "help" target to the makefile by @Groxx in #3898
Initial protobuf types and API by @vytautas-karpavicius in #3863
Fix workflow reset command by @yycptt in #3904
CLI 0.18.1 patch release by @yycptt in #3908
Use GetDomainName instead of GetDomainByID for retrieving domain names by @yycptt in #3899
Start enabled shardscanner fixers by @mantas-sidlauskas in #3906
Switch to protoc-gen-go by @vytautas-karpavicius in #3905
Fix scan unsupported workflow in SQl DB by @yux0 in #3909
Makefile cleanup / thrift revamp / gobin removed by @Groxx in #3903
Version goveralls, remove unused go bins from docker setup by @Groxx in #3913
Remove duplicate doc...