In the past we had some issues testing such a scenario on multi-DC, like scylladb/scylladb#20282. In PR #7625 we added a dtest for a single-DC scenario; perhaps we should also enhance dtest to cover multi-DC.
Maybe the coredump was caused by the fact that you dropped a whole region out of replication?
Regardless of whether it can be tested or not, it needs to be solved, and it doesn't have anything to do with dtest.
fruch added a commit to fruch/scylla-cluster-tests that referenced this issue on Feb 20, 2025:
The code in `_alter_keyspace_rf` was referring to only one DC in the ALTER command, which led to RF=0 on the other DCs. This complicated the situation for Scylla and surfaced multiple bugs in operations taking place during this change. This kind of change is also not the intended behavior of this code. Now it updates the target DC's RF as needed, while leaving the other DCs' RF the same as they were.

Fix: scylladb#10136
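For reference, a minimal sketch of the approach the commit describes (the helper name, session handling, and schema-table access here are assumptions, not the actual scylla-cluster-tests code): read the keyspace's current per-DC replication map, change only the target DC, and write the full map back so no other DC silently falls to RF=0.

```python
from cassandra.cluster import Cluster  # pip install cassandra-driver


def alter_keyspace_rf(session, keyspace: str, dc: str, new_rf: int) -> None:
    # Fetch the current replication options, e.g.
    # {'class': '...NetworkTopologyStrategy', 'dc1': '3', 'dc2': '3'}
    row = session.execute(
        "SELECT replication FROM system_schema.keyspaces WHERE keyspace_name = %s",
        (keyspace,),
    ).one()
    replication = dict(row.replication)
    replication[dc] = str(new_rf)  # touch only the requested DC
    options = ", ".join(f"'{k}': '{v}'" for k, v in replication.items())
    # All DCs are spelled out again, so none of them gets dropped to RF=0.
    session.execute(f"ALTER KEYSPACE {keyspace} WITH replication = {{{options}}}")


if __name__ == "__main__":
    session = Cluster(["127.0.0.1"]).connect()
    alter_keyspace_rf(session, "ks", "us-east-1", 2)  # hypothetical names
```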
Packages
Scylla version:
2025.1.0~rc2-20250216.6ee17795783f with build-id 8fc682bcfdf0a8cd9bc106a5ecaa68dce1c63ef6
Kernel Version:
6.8.0-1021-aws
Issue description
The `decrease_keyspaces_rf` logic isn't aware of multiple regions, and drops a whole region out of the replication: the resulting command can get stuck and time out, and regardless, it's the wrong thing to do, since it would drop all the data from one region.
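To illustrate the failure mode (keyspace and DC names here are hypothetical): with `NetworkTopologyStrategy`, an `ALTER KEYSPACE` statement replaces the entire replication map, so naming only one DC implicitly sets every other DC to RF=0:

```python
from cassandra.cluster import Cluster  # pip install cassandra-driver

session = Cluster(["127.0.0.1"]).connect()

# Start with both regions replicated.
session.execute(
    "CREATE KEYSPACE IF NOT EXISTS ks WITH replication = "
    "{'class': 'NetworkTopologyStrategy', 'us-east-1': 3, 'us-west-2': 3}"
)

# The buggy ALTER omits 'us-west-2', so its RF silently becomes 0 and the
# whole region is dropped out of the replication, discarding its replicas.
session.execute(
    "ALTER KEYSPACE ks WITH replication = "
    "{'class': 'NetworkTopologyStrategy', 'us-east-1': 2}"
)
```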
Impact
All cases using multiple regions might fail on any nemesis that removes a node first.
How frequently does it reproduce?
100%
Installation details
Cluster size: 6 nodes (i4i.xlarge)
Scylla Nodes used in this run:
OS / Image: ami-089e047033a16995a ami-0c34f939e95d0c640 (aws: undefined_region)
Test: EaR-longevity-kms-20gb-6h-multidc-test
Test id: 9f46998d-096f-4bf9-81b9-ca36f3002489
Test name: scylla-2025.1/features/EncryptionAtRest/EaR-longevity-kms-20gb-6h-multidc-test
Test method: longevity_test.LongevityTest.test_custom_time
Test config file(s):
Logs and commands
$ hydra investigate show-monitor 9f46998d-096f-4bf9-81b9-ca36f3002489
$ hydra investigate show-logs 9f46998d-096f-4bf9-81b9-ca36f3002489
Logs:
Jenkins job URL
Argus