- Data Domiciling Demo
- Setup
- Global Tables
- Regional Tables - Zone Survival Goal
- Regional-By-Row Tables
- Limitations
- Appendix
** All data in this demo was created with Python/Faker. None of the names, national IDs, etc. are real.
AWS EC2 Hosted 9 Node, 3 Region Cluster with 3 nodes in each region, all nodes in separate AZs.
The data you enter here, must map to the localities you used when starting each node. If you are using CockroachDB Dedicated, you do not need to perform this step. These are the localities I used when starting the nodes in the cluster.
INSERT into system.locations VALUES ('region', 'aws-us-east-1', 37.478397, -76.453077);
INSERT into system.locations VALUES ('region', 'aws-us-east-2', 40.417287, -76.453077);
INSERT into system.locations VALUES ('region', 'aws-us-west-1', 38.837522, -120.895824);
INSERT into system.locations VALUES ('region', 'aws-us-west-2', 43.804133, -120.554201);
INSERT into system.locations VALUES ('region', 'aws-ca-central-1', 56.130366, -106.346771);
INSERT into system.locations VALUES ('region', 'aws-eu-central-1', 50.110922, 8.682127);
INSERT into system.locations VALUES ('region', 'aws-eu-west-1', 53.142367, -7.692054);
INSERT into system.locations VALUES ('region', 'aws-eu-west-2', 51.507351, -0.127758);
INSERT into system.locations VALUES ('region', 'aws-eu-west-3', 48.856614, 2.352222);
INSERT into system.locations VALUES ('region', 'aws-ap-northeast-1', 35.689487, 139.691706);
INSERT into system.locations VALUES ('region', 'aws-ap-northeast-2', 37.566535, 126.977969);
INSERT into system.locations VALUES ('region', 'aws-ap-northeast-3', 34.693738, 135.502165);
INSERT into system.locations VALUES ('region', 'aws-ap-southeast-1', 1.352083, 103.819836);
INSERT into system.locations VALUES ('region', 'aws-ap-southeast-2', -33.86882, 151.209296);
INSERT into system.locations VALUES ('region', 'aws-ap-south-1', 19.075984, 72.877656);
INSERT into system.locations VALUES ('region', 'aws-sa-east-1', -23.55052, -46.633309);
In order to use multi-region you'll need to be running enterprise edition. If you're using CockroachDB Dedicated or Serverless, these will be automatically set for you. If you're self-hosting the cluster, then you need to update your cluster settings. The cluster settings are not necessarily "production settings", but great for a demo.
SET CLUSTER SETTING cluster.organization = {your org};
SET CLUSTER SETTING enterprise.license = {your license};
SET CLUSTER SETTING kv.snapshot_rebalance.max_rate = '1g';
SET CLUSTER SETTING kv.snapshot_recovery.max_rate = '1g';
SET CLUSTER SETTING server.time_until_store_dead = '1m15s';
select node_id, locality, is_live from crdb_internal.gossip_nodes;
node_id | locality |
---|---|
1 | region=aws-us-west-2,zone=aws-us-west-2a |
2 | region=aws-us-west-2,zone=aws-us-west-2b |
3 | region=aws-us-west-2,zone=aws-us-west-2c |
4 | region=aws-eu-central-1,zone=aws-eu-central-1a |
5 | region=aws-eu-central-1,zone=aws-eu-central-1b |
6 | region=aws-eu-central-1,zone=aws-eu-central-1c |
7 | region=aws-ap-southeast-1,zone=aws-ap-southeast-1a |
8 | region=aws-ap-southeast-1,zone=aws-ap-southeast-1b |
9 | region=aws-ap-southeast-1,zone=aws-ap-southeast-1c |
create database db_with_abstractions;
alter database db_with_abstractions set primary region "aws-us-west-2";
alter database db_with_abstractions add region "aws-ap-southeast-1";
alter database db_with_abstractions add region "aws-eu-central-1";
The survival goal for the database db_with_abstractions
should be zone
.
show survival goal from database db_with_abstractions;
database | survival_goal |
---|---|
db_with_abstractions | zone |
You can read about global tables in the Cockroach Documentation.
Global tables rely on a database configured with the multi-region abstractions. I'll be using the db_with_abstractions
database for this part of the demo.
use db_with_abstractions;
create table postal_codes (
id uuid default gen_random_uuid() primary key,
postal_code string
) locality global;
insert into postal_codes (postal_code) values ('92270'), ('92271'), ('92272'),('97773'),('97774'),('97775'),('97776'),('97777'),('97778'),('97779'),('97780'),('97781'),('97782'),('97783'),('97784'),('97785'),('97786'),('97787'),('97788'),('97789'),('97790'),('97791'),('97792'),('97793'),('97794'),('97795'),('97796'),('97797'),('97798'),('97799'),('97800'),('97801'),('97802'),('97803'),('97804'),('97805'),('97806'),('97807'),('97808'),('97809'),('97810'),('97811'),('97812'),('97813'),('97814'),('97815'),('97816'),('97817'),('97818'),('97819'),('97820'),('97821'),('97822'),('97823'),('97824'),('97825'),('97826'),('97827'),('97828'),('97829'),('97830'),('97831'),('97832'),('97833'),('97834'),('97835'),('97836'),('97837'),('97838'),('97839'),('97840'),('97841'),('97842'),('97843'),('97844'),('97845'),('97846'),('97847'),('97848'),('97849'),('97850'),('97851'),('97852'),('97853'),('97854'),('97855'),('97856'),('97857'),('97858'),('97859'),('97860'),('97861'),('97862'),('97863'),('97864'),('97865'),('97866'),('97867'),('97868'),('97869') returning id;
Where are the ranges for the postal_codes
table?
select lease_holder, lease_holder_locality, replicas from [show ranges from table postal_codes];
select lease_holder, replicas, voting_replicas, non_voting_replicas
from crdb_internal.ranges where table_id = (
select table_id
from crdb_internal.tables
where name = 'postal_codes' and database_name = 'db_with_abstractions' and state != 'DROP');
lease_holder | lease_holder_locality | replicas |
---|---|---|
2 | region=aws-us-west-2,zone=aws-us-west-2b | {1,2,3,5,9} |
lease_holder | replicas | voting_replicas | non_voting_replicas |
---|---|---|---|
3 | {1,2,3,6,8} | {3,1,2} | {8,6} |
Notice that there are ranges in all regions of the cluster. Whenever you read from a global table, the read will be from a "local" replica -- a follower read. Therefore reads of a global table will always be fast in CockroachDB, however write will be very slow. Reads from Global Tables are always current.
Let's read this table from across the different nodes of the cluster
To find the node and locality of the node I am working on I'll issue the following:
select node_id, locality from crdb_internal.gossip_nodes where node_id = crdb_internal.node_id();
And I'm going to run the following query in each region to find the latency.
select postal_code from postal_codes where id = '00f37184-1fd9-4f44-9eee-74f62473a029';
node_id | locality | latency |
---|---|---|
1 | region=aws-us-west-2,zone=aws-us-west-2a | 4ms |
4 | region=aws-eu-central-1,zone=aws-eu-central-1a | 6ms |
7 | region=aws-ap-southeast-1,zone=aws-ap-southeast-1a | 2ms |
No matter where we read the global table from, we're going to be reading from the local region.
use db_with_abstractions;
CREATE TABLE public.users_regional_uswest2 (
id UUID NOT NULL DEFAULT gen_random_uuid(),
auth_id VARCHAR(100) NOT NULL,
first_name VARCHAR(50) NOT NULL,
last_name VARCHAR(50) NULL,
email VARCHAR NOT NULL,
profile_picture_id VARCHAR NULL,
default_picture VARCHAR NULL,
preferences JSONB NULL,
metadata JSONB NULL,
created_at TIMESTAMPTZ NULL DEFAULT now():::TIMESTAMPTZ,
updated_at TIMESTAMPTZ NULL DEFAULT now():::TIMESTAMPTZ,
CONSTRAINT users_rbr_pkey PRIMARY KEY (id ASC)
-- UNIQUE INDEX users_auth_id_key (auth_id ASC),
-- UNIQUE INDEX users_email_key (email ASC),
-- INDEX idx_users_auth_id (auth_id ASC)
) locality regional in "aws-us-west-2";
INSERT INTO public."users_regional_uswest2" (id,auth_id,first_name,last_name,email,profile_picture_id,default_picture,preferences,metadata,created_at,updated_at) VALUES ('2a231114-fd02-4e8c-bc9e-95fd5c132b96','adasdfadsf','adfa','poinc','[email protected]',NULL,NULL,NULL,NULL,'2022-06-07 11:21:11.618178-07','2022-06-07 11:21:11.618178-07');
INSERT INTO public."users_regional_uswest2" (id,auth_id,first_name,last_name,email,profile_picture_id,default_picture,preferences,metadata,created_at,updated_at) VALUES ('58dc4051-3acb-4c61-88b7-2cbcca459abb','balkdsfhha','boiuadsaf','oiubab','[email protected]',NULL,NULL,NULL,NULL,'2022-06-07 11:21:11.67236-07','2022-06-07 11:21:11.67236-07');
INSERT INTO public."users_regional_uswest2" (id,auth_id,first_name,last_name,email,profile_picture_id,default_picture,preferences,metadata,created_at,updated_at) VALUES ('91a0ea14-4c44-4854-9438-af823bba2617','weuhiouasug','cdapoifha','sapdoifhhiu','[email protected]',NULL,NULL,NULL,NULL,'2022-06-07 11:21:11.723185-07','2022-06-07 11:21:11.723185-07');
Let's look at the ranges associated with the table:
select lease_holder, replicas, voting_replicas, non_voting_replicas
from crdb_internal.ranges where table_id = (
select table_id
from crdb_internal.tables
where name = 'users_regional_uswest2' and database_name = 'db_with_abstractions');
lease_holder | replicas | voting_replicas | non_voting_replicas |
---|---|---|---|
2 | {1,2,3,5,9} | {2,1,3} | {9,5} |
We can see that the voting replicas are all in the expected regions (1,2, & 3), but that there are non-voting replicas in both the EMEA (node 5) and APAC (node 9).
To use the non-voting replicas we would have to do a follower read from that region:
SELECT email FROM users_regional_uswest2 AS OF SYSTEM TIME follower_read_timestamp() where id = '91a0ea14-4c44-4854-9438-af823bba2617';
Now what happens if we change the database to be placement restricted for data domiciling?
set session enable_multiregion_placement_policy=on;
show session enable_multiregion_placement_policy;
alter database db_with_abstractions placement restricted;
lease_holder | replicas | voting_replicas | non_voting_replicas |
---|---|---|---|
2 | {1,2,3} | {3,2,1} | {} |
The non-voting replicas have been removed -- which is what we would expect.
Finally, if we take one of the nodes down in the US region (node 2 which contains the lease holder), what happens to the remaining ranges?
lease_holder | replicas | voting_replicas | non_voting_replicas |
---|---|---|---|
3 | {1,2,3} | {3,2,1} | {} |
Notice that the lease holder moved, but ranges were not moved out of us-west-2.
There are now under-replicated ranges:
select zone_id, total_ranges, unavailable_ranges, under_replicated_ranges from system.replication_stats where zone_id = (select table_id from crdb_internal.tables where name = 'users_regional_uswest2' and database_name = 'db_with_abstractions');
zone_id | total_ranges | unavailable_ranges | under_replicated_ranges |
---|---|---|---|
117 | 1 | 0 | 1 |
What happens if I kill the entire region?
node_id | locality | is_live |
---|---|---|
1 | region=aws-us-west-2,zone=aws-us-west-2a | f |
2 | region=aws-us-west-2,zone=aws-us-west-2b | f |
3 | region=aws-us-west-2,zone=aws-us-west-2c | f |
4 | region=aws-eu-central-1,zone=aws-eu-central-1a | t |
5 | region=aws-eu-central-1,zone=aws-eu-central-1b | t |
6 | region=aws-eu-central-1,zone=aws-eu-central-1c | t |
7 | region=aws-ap-southeast-1,zone=aws-ap-southeast-1a | t |
8 | region=aws-ap-southeast-1,zone=aws-ap-southeast-1b | t |
9 | region=aws-ap-southeast-1,zone=aws-ap-southeast-1c | t |
The query for ranges will not respond; it will just "hang".
The query for unavilable ranges does run and indicates that we have unavailable ranges:
zone_id | total_ranges | unavailable_ranges | under_replicated_ranges |
---|---|---|---|
117 | 1 | 1 | 1 |
Once I bring the nodes back up, the system resumes.
With regional tables, I can choose a region in which both the lease holder and the voting replicas should be placed (this is the locality regional in "aws-us-west-2"
clause of ). By default, the database will place non-voting replicas in all additional regions of the database. If we restrict the placement of replicas, then the ranges will be constrained to the tables region.
Regional-By-Row (RBR) tables allows the architect to place the data in a single logical table close to the user which adds a data-domiciling component. RBR tables act as one logical table, but the data is distributed across the cluster.
You can read the Cockroach Documentation on Regional-By-Row tables to find out more.
We're going to create an RBR table, populate some data and confirm that the data is constrained within the confines we created.
We will use the same table definition as we did for the regional table, but the locality clause will be different:
use db_with_abstractions;
CREATE TABLE public.users (
id UUID NOT NULL DEFAULT gen_random_uuid(),
auth_id VARCHAR(100) NOT NULL,
first_name VARCHAR(50) NOT NULL,
last_name VARCHAR(50) NULL,
email VARCHAR NOT NULL,
profile_picture_id VARCHAR NULL,
default_picture VARCHAR NULL,
preferences JSONB NULL,
metadata JSONB NULL,
national_id STRING NULL,
created_at TIMESTAMPTZ NULL DEFAULT now():::TIMESTAMPTZ,
updated_at TIMESTAMPTZ NULL DEFAULT now():::TIMESTAMPTZ,
CONSTRAINT users_rbr_pkey PRIMARY KEY (id ASC),
INDEX data_domiciling_violation (national_id ASC) STORING (auth_id, first_name, last_name, email)
) locality regional by row;
The stored DDL has some subtle changes from what we submitted to the database:
select create_statement from [show create table users];
CREATE TABLE public.users (
id UUID NOT NULL DEFAULT gen_random_uuid(),
auth_id VARCHAR(100) NOT NULL,
first_name VARCHAR(50) NOT NULL,
last_name VARCHAR(50) NULL,
email VARCHAR NOT NULL,
profile_picture_id VARCHAR NULL,
default_picture VARCHAR NULL,
preferences JSONB NULL,
metadata JSONB NULL,
created_at TIMESTAMPTZ NULL DEFAULT now():::TIMESTAMPTZ,
updated_at TIMESTAMPTZ NULL DEFAULT now():::TIMESTAMPTZ,
crdb_region db_with_abstractions.public.crdb_internal_region NOT VISIBLE NOT NULL DEFAULT default_to_database_primary_region(gateway_region())::db_with_abstractions.public.crdb_internal_region,
CONSTRAINT users_rbr_pkey PRIMARY KEY (id ASC),
) LOCALITY REGIONAL BY ROW
Notice that an additional column crdb_region
was added to the table:
- it is a hidden column
- the type is
db_with_abstractions.public.crdb_internal_region
- the defualt value is
default_to_database_primary_region(gateway_region())
In addition to the new column the primary key has also been changed:
show indexes from users;
table_name | index_name | non_unique | seq_in_index | column_name | direction | storing | implicit | visible |
---|---|---|---|---|---|---|---|---|
users | users_rbr_pkey | f | 1 | crdb_region | ASC | f | t | t |
users | users_rbr_pkey | f | 2 | id | ASC | f | f | t |
users | users_rbr_pkey | f | 3 | auth_id | N/A | t | f | t |
users | users_rbr_pkey | f | 4 | first_name | N/A | t | f | t |
users | users_rbr_pkey | f | 5 | last_name | N/A | t | f | t |
users | users_rbr_pkey | f | 6 | N/A | t | f | t | |
users | users_rbr_pkey | f | 7 | profile_picture_id | N/A | t | f | t |
users | users_rbr_pkey | f | 8 | default_picture | N/A | t | f | t |
users | users_rbr_pkey | f | 9 | preferences | N/A | t | f | t |
users | users_rbr_pkey | f | 10 | metadata | N/A | t | f | t |
users | users_rbr_pkey | f | 11 | created_at | N/A | t | f | t |
users | users_rbr_pkey | f | 12 | updated_at | N/A | t | f | t |
If I create a secondary index on users, say on email
, I'll get the following:
create index my_second_index on users (email) storing (first_name, last_name);
table_name | index_name | non_unique | seq_in_index | column_name | direction | storing | implicit | visible |
---|---|---|---|---|---|---|---|---|
users | my_second_index | t | 1 | crdb_region | ASC | f | t | t |
users | my_second_index | t | 2 | ASC | f | f | t | |
users | my_second_index | t | 3 | id | ASC | f | t | t |
Notice that crdb_region
was added to my non-unique index automatically.
select lease_holder, replicas, voting_replicas, non_voting_replicas
from crdb_internal.ranges where table_id = (
select table_id
from crdb_internal.tables
where name = 'users' and database_name = 'db_with_abstractions');
Now let's add some data. I'm going to add the data in 3 batches into the database. What I'm going to do is log into a region insert some rows and repeat that for all my regions.
us-west-2
use db_with_abstractions;
select crdb_internal.node_id(), gateway_region();
INSERT INTO public."users" (id,auth_id,first_name,last_name,email,profile_picture_id,default_picture,preferences,metadata,created_at,updated_at, national_id) VALUES ('a282a39f-cc9a-461c-94aa-a88fdbae651c','us-west-2','hqfwrqaF','vqweasfdSF','[email protected]',NULL,NULL,NULL,NULL,'2022-06-07 11:21:12.031851-07','2022-06-07 11:21:12.031851-07','100-00-0000');
INSERT INTO public."users" (id,auth_id,first_name,last_name,email,profile_picture_id,default_picture,preferences,metadata,created_at,updated_at, national_id) VALUES ('c16a6f4a-cae3-4125-876c-741d858d0cbc','us-west-2','dadfdffa','dafadfwe','[email protected]',NULL,NULL,NULL,NULL,'2022-06-07 11:21:11.827312-07','2022-06-07 11:21:11.827312-07','200-00-0000');
INSERT INTO public."users" (id,auth_id,first_name,last_name,email,profile_picture_id,default_picture,preferences,metadata,created_at,updated_at, national_id) VALUES ('da01effd-22af-4eda-8fcc-c0ace2dbec59','us-west-2','eqrtawefased','qrgwrggw','[email protected]',NULL,NULL,NULL,NULL,'2022-06-07 11:21:11.878483-07','2022-06-07 11:21:11.878483-07','300-00-0000');
select count(*) from users where crdb_region='aws-us-west-2';
eu-central-1
use db_with_abstractions;
select crdb_internal.node_id(), gateway_region();
INSERT INTO public."users" (id,auth_id,first_name,last_name,email,profile_picture_id,default_picture,preferences,metadata,created_at,updated_at, national_id) VALUES ('e106df5a-30f4-44a4-8f17-1c57e145f78f','eu-central-1','fabgsfteradfa','poqeqerinc','[email protected]',NULL,NULL,NULL,NULL,'2022-06-07 11:21:11.9297-07','2022-06-07 11:21:11.9297-07','400-00-0000');
INSERT INTO public."users" (id,auth_id,first_name,last_name,email,profile_picture_id,default_picture,preferences,metadata,created_at,updated_at, national_id) VALUES ('d7ee5cda-570b-4633-b7e4-253fe52fe5d0','eu-central-1','gadgrwtaewtfa','agretaewrQ2','[email protected]',NULL,NULL,NULL,NULL,'2022-06-07 11:21:11.981086-07','2022-06-07 11:21:11.981086-07','500-00-0000');
INSERT INTO public."users" (id,auth_id,first_name,last_name,email,profile_picture_id,default_picture,preferences,metadata,created_at,updated_at, national_id) VALUES ('aa4a7222-6962-44ad-ad90-301226e31128','eu-central-1','iawetataet','gqgeqwarfe','[email protected]',NULL,NULL,NULL,NULL,'2022-06-07 11:21:12.08442-07','2022-06-07 11:21:12.08442-07','600-00-0000');
select count(*) from users where crdb_region='aws-eu-central-1';
ap-southeast-1
use db_with_abstractions;
select crdb_internal.node_id(), gateway_region();
INSERT INTO public."users" (id,auth_id,first_name,last_name,email,profile_picture_id,default_picture,preferences,metadata,created_at,updated_at, national_id) VALUES ('88404e06-91d2-465d-ae41-d1a7d4b4abc9','ap-southeast-1','jqaytagzasg','graasgsafaA','[email protected]',NULL,NULL,NULL,NULL,'2022-06-07 11:21:12.136411-07','2022-06-07 11:21:12.136411-07','700-00-0000');
INSERT INTO public."users" (id,auth_id,first_name,last_name,email,profile_picture_id,default_picture,preferences,metadata,created_at,updated_at, national_id) VALUES ('e6fe85c3-1113-4614-b73b-8f36f67e19f9','ap-southeast-1','kgfsfsdf','gwrsegrs','[email protected]',NULL,NULL,NULL,NULL,'2022-06-07 11:21:12.188812-07','2022-06-07 11:21:12.188812-07','800-00-0000');
INSERT INTO public."users" (id,auth_id,first_name,last_name,email,profile_picture_id,default_picture,preferences,metadata,created_at,updated_at, national_id) VALUES ('2d7be3f4-f104-482e-8a47-b2b9e811ccd8','ap-southeast-1','lWAReefDSDSF','aeweafgGDSGD','[email protected]',NULL,NULL,NULL,NULL,'2022-06-07 11:21:12.243045-07','2022-06-07 11:21:12.243045-07','900-00-0000');
select count(*) from users where crdb_region='aws-ap-southeast-1';
select 'select '''||crdb_region::string||''' as data_region, range_id, lease_holder, replicas from [show range from table users for row ('''||crdb_region::string||''', '''||id||''')] union all'
from users;
data_region | range_id | lease_holder | replicas |
---|---|---|---|
aws-us-west-2 | 118 | 1 | {1,2,3} |
aws-us-west-2 | 118 | 1 | {1,2,3} |
aws-us-west-2 | 118 | 1 | {1,2,3} |
aws-eu-central-1 | 116 | 4 | {4,5,6} |
aws-eu-central-1 | 116 | 4 | {4,5,6} |
aws-eu-central-1 | 116 | 4 | {4,5,6} |
aws-ap-southeast-1 | 114 | 9 | {7,8,9} |
aws-ap-southeast-1 | 114 | 9 | {7,8,9} |
aws-ap-southeast-1 | 114 | 9 | {7,8,9} |
Let's look at a record in our users
table, see where it currently lives and then update the crdb_location
.
select * from users where id = 'a282a39f-cc9a-461c-94aa-a88fdbae651c';
select 'select '''||crdb_region::string||''' as data_region, range_id, lease_holder, replicas from [show range from table users for row ('''||crdb_region::string||''', '''||id||''')]; '
from users where first_name = 'Ron';
update users set first_name = 'Ron' where id = 'a282a39f-cc9a-461c-94aa-a88fdbae651c';
update users set crdb_region = 'aws-ap-southeast-1' where first_name = 'Ron';
select 'select '''||crdb_region::string||''' as data_region, range_id, lease_holder, replicas from [show range from table users for row ('''||crdb_region::string||''', '''||id||''')]; '
from users where first_name = 'Ron';
data_region | range_id | lease_holder | replicas |
---|---|---|---|
aws-us-west-2 | 118 | 1 | {1,2,3} |
data_region | range_id | lease_holder | replicas |
---|---|---|---|
aws-ap-southeast-1 | 114 | 9 | {7,8,9} |
BACKUP INTO (
's3://nollen-cluster-backup-bucket/aws-default-location-folder/?COCKROACH_LOCALITY=default&AWS_ACCESS_KEY_ID={ID}&AWS_SECRET_ACCESS_KEY={secret}',
's3://nollen-cluster-backup-bucket/aws-ap-southeast-1-folder/?COCKROACH_LOCALITY=region=aws-ap-southeast-1&AWS_ACCESS_KEY_ID={ID}&AWS_SECRET_ACCESS_KEY={secret}',
's3://nollen-cluster-backup-bucket/aws-eu-central-1-folder/?COCKROACH_LOCALITY=region=aws-eu-central-1&AWS_ACCESS_KEY_ID={ID}&AWS_SECRET_ACCESS_KEY={secret}',
's3://nollen-cluster-backup-bucket/aws-us-west-2-folder/?COCKROACH_LOCALITY=region=aws-us-west-2&AWS_ACCESS_KEY_ID={ID}&AWS_SECRET_ACCESS_KEY={secret}'
)
AS OF SYSTEM TIME '-60s';
RESTORE FROM LATEST IN (
's3://nollen-cluster-backup-bucket/aws-default-location-folder/?AWS_ACCESS_KEY_ID={ID}&AWS_SECRET_ACCESS_KEY={secret}',
's3://nollen-cluster-backup-bucket/aws-ap-southeast-1-folder/?AWS_ACCESS_KEY_ID={ID}&AWS_SECRET_ACCESS_KEY={secret}',
's3://nollen-cluster-backup-bucket/aws-eu-central-1-folder/?AWS_ACCESS_KEY_ID={ID}&AWS_SECRET_ACCESS_KEY={secret}',
's3://nollen-cluster-backup-bucket/aws-us-west-2-folder/?AWS_ACCESS_KEY_ID={ID}&AWS_SECRET_ACCESS_KEY={secret}'
);
All of the SSN data used in this demo was created with faker.
alter table users add column national_id string;
update users set national_id = '100-00-0000' where id = 'a282a39f-cc9a-461c-94aa-a88fdbae651c';
update users set national_id = '200-00-0000' where id = 'c16a6f4a-cae3-4125-876c-741d858d0cbc';
update users set national_id = '300-00-0000' where id = 'da01effd-22af-4eda-8fcc-c0ace2dbec59';
update users set national_id = '400-00-0000' where id = 'e106df5a-30f4-44a4-8f17-1c57e145f78f';
update users set national_id = '500-00-0000' where id = 'd7ee5cda-570b-4633-b7e4-253fe52fe5d0';
update users set national_id = '600-00-0000' where id = 'aa4a7222-6962-44ad-ad90-301226e31128';
update users set national_id = '700-00-0000' where id = '88404e06-91d2-465d-ae41-d1a7d4b4abc9';
update users set national_id = '800-00-0000' where id = 'e6fe85c3-1113-4614-b73b-8f36f67e19f9';
update users set national_id = '900-00-0000' where id = '2d7be3f4-f104-482e-8a47-b2b9e811ccd8';
create index data_domiciling_violation on users(national_id) storing (auth_id, first_name, last_name, email);
I'm going to split the range with the ssn 554-33-7340
alter index data_domiciling_violation split at values ('aws-eu-central-1', '600-00-0000');
select start_pretty,
end_pretty,
replicas,
voting_replicas,
lease_holder,
range_size
from crdb_internal.ranges
where database_name = 'db_with_abstractions'
and table_name = 'users'
and index_name = 'data_domiciling_violation'
;
start_pretty | end_pretty | replicas | voting_replicas | lease_holder | range_size |
---|---|---|---|---|---|
/Table/109/8/"@" | /Table/109/8/"@"/PrefixEnd | {7,8,9} | {9,8,7} | 7 | 111876122 |
/Table/109/8/"@"/PrefixEnd | /Table/109/8/"`" | {1,2,3} | {2,3,1} | 2 | 23 |
/Table/109/8/"`" | /Table/109/8/"`"/PrefixEnd | {4,5,6} | {5,4,6} | 4 | 281415996 |
/Table/109/8/"`"/PrefixEnd | /Table/109/8/"\x80" | {1,2,3} | {2,3,1} | 2 | 23 |
/Table/109/8/"\x80" | /Table/109/8/"\x80"/PrefixEnd | {1,2,3} | {2,3,1} | 2 | 106869456 |
/Table/109/8/"\x80"/PrefixEnd | /Table/110 | {1,2,3} | {2,3,1} | 1 | 18 |
/Table/110/8/"@" | /Table/110/8/"@"/"555-02-2531" | {7,8,9} | {9,8,7} | 7 | 69562804 |
/Table/110/8/"@"/"555-02-2531" | /Table/110/8/"@"/PrefixEnd | {7,8,9} | {9,8,7} | 7 | 43309895 |
/Table/110/8/"@"/PrefixEnd | /Table/110/8/"`" | {1,2,3} | {2,3,1} | 1 | 0 |
/Table/110/8/"`" | /Table/110/8/"`"/PrefixEnd | {4,5,6} | {5,6,4} | 4 | 110874890 |
/Table/110/8/"`"/PrefixEnd | /Table/110/8/"\x80" | {1,2,3} | {2,3,1} | 1 | 0 |
/Table/110/8/"\x80" | /Table/110/8/"\x80"/PrefixEnd | {1,2,3} | {2,3,1} | 1 | 107869433 |
/Table/110/8/"\x80"/PrefixEnd | /Max | {1,2,3} | {2,3,1} | 1 | 0 |
The meta-data (which is what we're looking at in the table above) is distributed to all ranges in the cluster. Be aware that the national_id 555-02-2531
is clear text visible on very node in the cluster. The point is to be very careful of the data we use as PKeys or in indexes.
Multi-Region and Auto-Rehoming Feature
Locality Aware Backups - Manually restoring zone configurations