[DocDB] Support creating a backup in the middle of tablet splitting #29002

@yamen-haddad

Jira Link: DB-18752

Description

The new backup workflow (which supports creating backups in the middle of a DDL) does not correctly handle a backup that is created in the middle of a tablet split. Take the following example:

  • CREATE TABLE mytbl (k INT PRIMARY KEY, v INT)
  • Insert 200 rows.
  • Check we have 1 tablet only:
  • Split the tablet into 2 tablets and wait until we have 2 tablets (the parent tablet is not deleted).
  • Create a backup.
  • Restore the backup.

Restore will fail at import_snapshot with the following error:

[m-1] E1017 18:33:53.598845 77952 catalog_entity_info.cc:731] mytbl [id=00004000000030008000000000004000]: Two tablets {mytbl_tablet_id3}, {mytbl_tablet_id2} with the same partition key start and split depth: committed_consensus_state { current_term: 0 config { opid_index: -1 } } state: PREPARING table_id: "00004000000030008000000000004000" partition { partition_key_start: "" partition_key_end: "v\237" } colocated: false hosted_tables_mapped_by_parent_id: true and 

The failure occurs because the created snapshot includes both the parent tablet and its two child tablets.
As a result, import_snapshot reports that 2 tablets have the same partition key start and split depth.
The two conflicting tablets were created as part of the repartition step of import_snapshot:

[m-1] I1017 18:33:53.598722 77952 catalog_manager_ext.cc:2199] ImportTableEntry: Found existing table 00004000000030008000000000004000 for 00004000000030008000000000000000/mytbl (old table 000034d4000030008000000000004000) with schema public
[m-1] I1017 18:33:53.598752 77952 catalog_manager_ext.cc:1937] RepartitionTable: Repartition table 00004000000030008000000000004000 using external snapshot table 000034d4000030008000000000004000
[m-1] I1017 18:33:53.598778 77952 catalog_manager_ext.cc:1994] Created tablet {mytbl_tablet_id2} to replace tablet {mytbl_tablet_id1} in repartitioning of table 00004000000030008000000000004000
[m-1] I1017 18:33:53.598784 77952 catalog_manager_ext.cc:1994] Created tablet {mytbl_tablet_id3} to replace tablet f29cbec9fea8448b808f838010e8177e in repartitioning of table 00004000000030008000000000004000
[m-1] I1017 18:33:53.598794 77952 catalog_manager_ext.cc:1994] Created tablet {mytbl_tablet_id4} to replace tablet f4d7c73a92a846dc9bb73673b54b417a in repartitioning of table 00004000000030008000000000004000
[m-1] E1017 18:33:53.598845 77952 catalog_entity_info.cc:731] mytbl [id=00004000000030008000000000004000]: Two tablets {mytbl_tablet_id3}, {mytbl_tablet_id2} with the same partition key start and split depth
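
To make the collision in the log above concrete, here is a minimal Python model of a uniqueness check keyed on (partition key start, split depth). It is only an illustration, not the catalog manager code: the `Tablet` class and `find_conflicts` function are hypothetical, the mapping of tablet IDs to parent and children is inferred from the log, and the sketch assumes that every tablet recreated during repartitioning carries the same split depth (0 here).

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Tablet:
    """Hypothetical, simplified view of a tablet's partition metadata."""
    tablet_id: str
    partition_key_start: bytes
    partition_key_end: bytes
    split_depth: int


def find_conflicts(tablets):
    """Return groups of tablet IDs sharing (partition_key_start, split_depth)."""
    groups = defaultdict(list)
    for t in tablets:
        groups[(t.partition_key_start, t.split_depth)].append(t.tablet_id)
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


# ASSUMPTION: all recreated tablets get split depth 0; the real values may differ.
# The bound b"v\x9f" corresponds to the "v\237" partition key shown in the log.
recreated = [
    Tablet("mytbl_tablet_id2", b"", b"", 0),        # replaces the parent
    Tablet("mytbl_tablet_id3", b"", b"v\x9f", 0),   # replaces the first child
    Tablet("mytbl_tablet_id4", b"v\x9f", b"", 0),   # replaces the second child
]

conflicts = find_conflicts(recreated)
print(conflicts)
```

Under these assumptions, the parent's replacement and the first child's replacement both start at the empty partition key, which is the pair the error message names.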

We should avoid including the parent tablet in the snapshot when both of its child tablets are already registered.
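
The proposal above could be sketched as a filter over the tablets selected for the snapshot. This is a minimal Python illustration only; `SnapshotTablet`, `split_parent_id`, and `tablets_to_snapshot` are hypothetical names and do not reflect the actual catalog manager API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SnapshotTablet:
    """Hypothetical record: a tablet and, if it came from a split, its parent."""
    tablet_id: str
    split_parent_id: Optional[str] = None


def tablets_to_snapshot(tablets, children_per_split=2):
    """Drop any parent tablet whose child tablets are all registered."""
    child_count = {}
    for t in tablets:
        if t.split_parent_id is not None:
            child_count[t.split_parent_id] = child_count.get(t.split_parent_id, 0) + 1
    # Keep a tablet unless both of its children are present in the list.
    return [t for t in tablets
            if child_count.get(t.tablet_id, 0) < children_per_split]


# State after the split in the example: the parent is not yet deleted.
mid_split = [
    SnapshotTablet("parent"),
    SnapshotTablet("child_a", split_parent_id="parent"),
    SnapshotTablet("child_b", split_parent_id="parent"),
]
selected = [t.tablet_id for t in tablets_to_snapshot(mid_split)]
print(selected)
```

With only the two children in the snapshot, the repartition step would no longer recreate a tablet for the parent, so the duplicate partition key start would not arise in this model.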

Issue Type

kind/bug

Warning: Please confirm that this issue does not contain any sensitive information

  • I confirm this issue does not contain any sensitive information.
