Skip to content

Skip updating partition stats when adding empty stats partitions to Glue #25970

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

chenjian2664
Copy link
Contributor

@chenjian2664 chenjian2664 commented Jun 10, 2025

Description

Additional context and related issues

Release notes

(x) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
( ) Release notes are required, with the following suggested text:

## Section
* Fix some things. ({issue}`issuenumber`)

@cla-bot cla-bot bot added the cla-signed label Jun 10, 2025
@github-actions github-actions bot added the hive Hive connector label Jun 10, 2025
@chenjian2664 chenjian2664 force-pushed the sep17064-sync-calls branch from 716ef21 to 87432b7 Compare June 10, 2025 08:46
@chenjian2664 chenjian2664 changed the title Skip updating partition statistics when adding partitions without sta… Skip updating partition stats when adding partitions without stats Jun 10, 2025
@chenjian2664 chenjian2664 changed the title Skip updating partition stats when adding partitions without stats Skip updating partition stats when adding empty stats partitions to Glue Jun 10, 2025
@@ -1106,6 +1106,7 @@ public void addPartitions(String databaseName, String tableName, List<PartitionW

// statistics are created after partitions because it is not clear if ordering matters in Glue
var createStatisticsTasks = partitionsWithStatistics.stream()
.filter(partitionWithStatistics -> partitionWithStatistics.getStatistics() != PartitionStatistics.empty())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this impact cases where we actually want statistics to be reset to empty ?
E.g. Insert overwrite on a partition with hive statistics collection flag disabled

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PartitionStatistics contains HiveBasicStatistics, this won't equal to the empty().

@ebyhr
Copy link
Member

ebyhr commented Jun 10, 2025

/test-with-secrets sha=87432b76aeba578904a9f573f8af77b960ec18ae

Copy link

github-actions bot commented Jun 10, 2025

The CI workflow run with tests that require additional secrets finished as failure: https://github.com/trinodb/trino/actions/runs/15571937661

@chenjian2664 chenjian2664 force-pushed the sep17064-sync-calls branch from 87432b7 to 9e0a289 Compare June 11, 2025 04:32
@@ -87,6 +95,40 @@ public void cleanUpSchema()
getQueryRunner().execute("DROP SCHEMA " + testSchema + " CASCADE");
}

@Test
void testInsertOverwriteStatisticsDisabled()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@raunaqmorarka I add the test to show case that the last commit won't influence the metastore operations

@chenjian2664
Copy link
Contributor Author

The test failures in https://github.com/trinodb/trino/actions/runs/15571937661 looks like not relates to the change

@chenjian2664 chenjian2664 force-pushed the sep17064-sync-calls branch 2 times, most recently from 4ba1e08 to 2bd5ceb Compare June 11, 2025 06:05
Add Glue metastore access operations test for insert overwrite with stats disabled
Add Glue metastore access operations test for `sync_partition_metadata` procedure
in `TestHiveGlueMetastoreAccessOperations`
@chenjian2664 chenjian2664 force-pushed the sep17064-sync-calls branch from 2bd5ceb to bef0465 Compare June 11, 2025 09:21
@ebyhr
Copy link
Member

ebyhr commented Jun 12, 2025

/test-with-secrets sha=bef046541465899f32b9300e7d041ec8e2c25c57

Copy link

github-actions bot commented Jun 12, 2025

The CI workflow run with tests that require additional secrets finished as failure: https://github.com/trinodb/trino/actions/runs/15599241399

@chenjian2664
Copy link
Contributor Author

@ebyhr
Copy link
Member

ebyhr commented Jun 12, 2025

Yes, @wendigo is working on #25942

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla-signed hive Hive connector
Development

Successfully merging this pull request may close these issues.

4 participants