fix: Handle deletes and updates properly in secondary index #14090

yihua · 2025-10-14T23:39:37Z

Describe the issue this Pull Request addresses

This PR fixes a correctness issue in how secondary index records are generated from update operations based on the commit metadata. In some cases, e.g., a partition path update, both a delete and an update are generated for the same metadata record key in the secondary index. The secondary index is then corrupted after the partition path update because the delete on the secondary index record takes precedence.
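To make the collision concrete, here is a minimal sketch of how a partition path update can yield a delete and an upsert with the same secondary index key. It assumes the index records are derived per file group by comparing the previous and new file slices, as the review discussion describes. `PartitionPathUpdateSketch`, `SiRecord`, and `diffFileGroup` are hypothetical names for illustration, not Hudi classes, and the `secondaryKey$recordKey` layout follows the key format mentioned in the review thread.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Illustration only: these are hypothetical types, not Hudi classes.
class PartitionPathUpdateSketch {

  // Simplified secondary index record, keyed by "secondaryKey$recordKey".
  static final class SiRecord {
    final String key;
    final boolean isDelete;

    SiRecord(String key, boolean isDelete) {
      this.key = key;
      this.isDelete = isDelete;
    }

    @Override
    public String toString() {
      return (isDelete ? "DELETE " : "UPSERT ") + key;
    }
  }

  // Compares the previous and new file slice of ONE file group
  // (each given as recordKey -> secondaryKey) and emits index changes.
  static List<SiRecord> diffFileGroup(Map<String, String> before, Map<String, String> after) {
    List<SiRecord> out = new ArrayList<>();
    for (Map.Entry<String, String> e : before.entrySet()) {
      // Entry disappeared or its secondary key changed: delete the old index key.
      if (!e.getValue().equals(after.get(e.getKey()))) {
        out.add(new SiRecord(e.getValue() + "$" + e.getKey(), true));
      }
    }
    for (Map.Entry<String, String> e : after.entrySet()) {
      // Entry is new or its secondary key changed: upsert the new index key.
      if (!e.getValue().equals(before.get(e.getKey()))) {
        out.add(new SiRecord(e.getValue() + "$" + e.getKey(), false));
      }
    }
    return out;
  }

  // Record rk1 (secondary key "a") moves from one partition to another.
  static List<SiRecord> partitionPathUpdate() {
    Map<String, String> row = Collections.singletonMap("rk1", "a");
    Map<String, String> empty = Collections.emptyMap();
    List<SiRecord> all = new ArrayList<>();
    all.addAll(diffFileGroup(row, empty)); // file group in the old partition
    all.addAll(diffFileGroup(empty, row)); // file group in the new partition
    return all;
  }

  public static void main(String[] args) {
    partitionPathUpdate().forEach(System.out::println);
  }
}
```

Both emitted records carry the key `a$rk1`. If they reach the metadata table unreconciled and the delete takes precedence, the entry is lost even though the row still exists in the table.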

Summary and Changelog

The fix introduces a reduceByKeys operation on the secondary index records before they are sent to the MDT partition for updates, similar to what is done for the record index (RLI).

Refactors HoodieTableMetadataUtil#reduceByKeys so that it can be reused by the secondary index.
For streaming MDT writes: changes HoodieBackedTableMetadataWriter#streamWriteToMetadataPartitions to apply the reduceByKeys operation to secondary index records only, so that shuffling is limited to the secondary index records. MetadataIndexMapper is refactored, and implementation classes are added for the record index and the secondary index. Before this PR, each Spark task generated records for both the record index and the secondary index. After this PR, one transformation has Spark tasks generating record index records without reduceByKey, and another transformation has Spark tasks generating secondary index records, which requires reduceByKey. Streaming MDT writes need reduceByKeys because the secondary key can change in addition to the partition path, so the reduction is needed to detect and resolve such cases.
For non-streaming MDT writes: SecondaryIndexRecordGenerationUtils#convertWriteStatsToSecondaryIndexRecords is changed to apply the reduceByKeys operation to the secondary index records before returning them.
Cleans up unused interfaces and methods.
Adds TestSecondaryIndexPruning#testSecondaryIndexWithPartitionPathUpdateUsingGlobalIndex as a functional test to cover the changes.
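The reduce-by-key idea in the changelog above can be sketched with plain Java collections. This is not the Hudi or Spark implementation: `SecondaryIndexReduceSketch`, `SiRecord`, `merge`, and `reduceByKey` are hypothetical names, and the merge rule (a non-delete record wins over a delete for the same key) is an assumption inferred from the issue description, which attributes the corruption to the delete taking precedence.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustration only: hypothetical types, not Hudi or Spark classes.
class SecondaryIndexReduceSketch {

  // Simplified secondary index record, keyed by "secondaryKey$recordKey".
  static final class SiRecord {
    final String key;
    final boolean isDelete;

    SiRecord(String key, boolean isDelete) {
      this.key = key;
      this.isDelete = isDelete;
    }

    @Override
    public String toString() {
      return (isDelete ? "DELETE " : "UPSERT ") + key;
    }
  }

  // Assumed merge rule: when a delete and a non-delete share a key,
  // keep the non-delete. If both are the same kind, keep the later one.
  static SiRecord merge(SiRecord first, SiRecord second) {
    if (first.isDelete == second.isDelete) {
      return second;
    }
    return first.isDelete ? second : first;
  }

  // Collapses all records that share a key into one, independent of arrival order.
  static List<SiRecord> reduceByKey(List<SiRecord> records) {
    Map<String, SiRecord> byKey = new LinkedHashMap<>();
    for (SiRecord r : records) {
      byKey.merge(r.key, r, SecondaryIndexReduceSketch::merge);
    }
    return new ArrayList<>(byKey.values());
  }

  public static void main(String[] args) {
    List<SiRecord> input = Arrays.asList(
        new SiRecord("a$rk1", true),   // delete from the old partition's file group
        new SiRecord("a$rk1", false),  // upsert from the new partition's file group
        new SiRecord("b$rk2", true));  // unrelated delete, passes through unchanged
    reduceByKey(input).forEach(System.out::println);
  }
}
```

Because the rule depends only on the record kinds and not on their order, the result is the same whichever record a task emits first, which matters when the records come from different file groups.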

Impact

Fixes a correctness issue in handling deletes and updates in the secondary index.

Risk Level

low

Documentation Update

N/A

Contributor's checklist

Read through contributor's guide
Enough context is provided in the sections above
Adequate tests were added if applicable

hudi-common/src/main/java/org/apache/hudi/metadata/HoodieMetadataPayload.java

hudi-bot · 2025-10-16T13:41:07Z

CI report:

6b4601a UNKNOWN
9054a9a Azure: SUCCESS

Bot commands

@hudi-bot supports the following commands:

@hudi-bot run azure: re-run the last Azure build

nsivabalan · 2025-10-16T19:33:53Z

PR description please

nsivabalan · 2025-10-16T19:37:42Z

...asource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestSecondaryIndexPruning.scala

+   */
+  @ParameterizedTest
+  @CsvSource(Array("COPY_ON_WRITE,true", "COPY_ON_WRITE,false", "MERGE_ON_READ,true", "MERGE_ON_READ,false"))
+  def testSecondaryIndexWithPartitionPathUpdateUsingGlobalIndex(tableType: HoodieTableType,


do we need both table types?
can we just keep it to the COW table type?

Partition path updates on a MERGE_ON_READ table add log files for deletes and inserts after the global index, which also reads the file groups. So it would be good to have test coverage for the MERGE_ON_READ table type.

nsivabalan · 2025-10-16T19:44:45Z

hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/MetadataIndexMapper.java

+   * @param writeStatus the write status to process
+   * @return list of metadata records
+   */
+  protected abstract List<HoodieRecord> generateRecords(WriteStatus writeStatus);


can we file a follow-up ticket to change this to an iterator?
not sure how much benefit we might get.
to generate SI records for one file group, we have to read the previous version of the file slice and the new version of the file slice, and then compare them to generate the SI records. not sure if we can do much by converting this to an iterator.
but we can file a follow-up to attend to it.

I don't expect it to give us any material gains, though.

I'm only refactoring the code here. We can improve the logic by using the iterator if needed in a separate PR.

nsivabalan · 2025-10-16T19:45:04Z

LGTM for the most part.
guess you plan to add more tests.

yihua · 2025-10-17T08:19:49Z

LGTM for the most part. guess you plan to add more tests.

Yes, I'll add more tests and also fill out the PR description. Thanks for the initial review.

nsivabalan · 2025-10-17T22:34:16Z

sure. lmk once the patch is ready for review.

nsivabalan

hey @yihua : what more tests are you planning to add?
for regular RLI and SI, existing test should suffice.
for update partition path, I see you have added testSecondaryIndexWithPartitionPathUpdateUsingGlobalIndex in this patch.

nsivabalan · 2025-10-20T06:00:23Z

also, please open up the PR from draft state once you feel the patch is ready

yihua · 2025-10-20T06:11:31Z

hey @yihua : what more tests are you planning to add? for regular RLI and SI, existing test should suffice. for update partition path, I see you have added testSecondaryIndexWithPartitionPathUpdateUsingGlobalIndex in this patch.

We can add more unit tests if needed; currently, the functional tests I added already cover the new changes.

yihua · 2025-10-20T06:20:56Z

hudi-common/src/main/java/org/apache/hudi/metadata/HoodieTableMetadataUtil.java

+      metadataRecordsPair = metadataRecords.mapToPair(r -> Pair.of(r.getKey(), r));
    }
-    return recordIndexRecordsPair.reduceByKey((SerializableBiFunction<HoodieRecord, HoodieRecord, HoodieRecord>) (record1, record2) -> {
+    return metadataRecordsPair.reduceByKey((SerializableBiFunction<HoodieRecord, HoodieRecord, HoodieRecord>) (record1, record2) -> {


@nsivabalan actually, for the secondary index, if there are a delete and an update on the same metadata record key (e.g., secondary_key$record_key), we can remove both records, correct? This is because no other information is stored beyond secondary_key$record_key, unlike RLI, which stores the location in the record.

github-actions bot added the size:XS PR with lines of changes in <= 10 label Oct 14, 2025

yihua added 3 commits October 15, 2025 16:47

fix!: Change how deletes are encoded in record index and secondary index

251529b

Add tests

0dda8b8

Enhance tests

de876d8

nsivabalan reviewed Oct 15, 2025


yihua added 2 commits October 16, 2025 10:55

Add reduce by key

d7a224a

Enhance tests

4c4f403

yihua force-pushed the perf-si-deletes branch from 090e3b7 to 4c4f403 Compare October 16, 2025 03:10

github-actions bot added size:M PR with lines of changes in (100, 300] and removed size:XS PR with lines of changes in <= 10 labels Oct 16, 2025

yihua added 2 commits October 16, 2025 12:38

Add more test case

4dfa6db

Revert delete change

e7acf76

yihua changed the title ~~fix!: Change how deletes are encoded in record index and secondary index~~ fix!: Handle deletes and updates properly in secondary index Oct 16, 2025

Fix streaming DAG on secondary index

36bbeb1

github-actions bot added size:L PR with lines of changes in (300, 1000] and removed size:M PR with lines of changes in (100, 300] labels Oct 16, 2025

yihua added 3 commits October 16, 2025 17:30

Fix secondary index in streaming DAG properly

cfd2cfa

Fix scala style

6b4601a

Fix scalastyle

b6885ba

yihua changed the title ~~fix!: Handle deletes and updates properly in secondary index~~ fix: Handle deletes and updates properly in secondary index Oct 16, 2025

Fix Spark 4

9054a9a

nsivabalan reviewed Oct 16, 2025

nsivabalan reviewed Oct 20, 2025

yihua commented Oct 20, 2025

yihua marked this pull request as ready for review October 20, 2025 06:42

4 participants
