Skip to content

Conversation

@davve14
Copy link
Contributor

@davve14 davve14 commented Oct 17, 2025

Move optimizer.optimize-hash-generation from system-wide property to
JavaWorkerSessionPropertyProvider to ensure it's automatically hidden
and disabled for native execution.
Changes:

  • Move OPTIMIZE_HASH_GENERATION from SystemSessionProperties to JavaWorkerSessionPropertyProvider
  • Update HashGenerationOptimizer to read from JavaWorkerSessionPropertyProvider
  • Add conditional in PlanOptimizers to skip HashGenerationOptimizer for native execution
  • Update HashBuildAndJoinBenchmark to use new property location
  • Update tests including expected plan for tests running on native engine

Description

Motivation and Context

#25492

Impact

Removes session and config property OPTIMIZE_HASH_GENERATION when running on native worker.

Test Plan

Test to make sure that when running native worker, the above property should not be allowed to be set and not showing in the show session output. Tests checking expected vs generated plan should be without the HashGenerationOptimizer optimizer in the expected plan.

Contributor checklist

  • Please make sure your submission complies with our contributing guide, in particular code style and commit standards.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.
  • If adding new dependencies, verified they have an OpenSSF Scorecard score of 5.0 or higher (or obtained explicit TSC approval for lower scores).

Release Notes

Please follow release notes guidelines and fill in the release notes below.

== RELEASE NOTES ==

General Changes
* Fix optimizer.optimize-hash-generation configuration property to always be set to false when native execution is enabled.
* Fix session property optimize_hash_generation to prevent it from being set when native execution is enabled.

@sourcery-ai
Copy link
Contributor

sourcery-ai bot commented Oct 17, 2025

Reviewer's Guide

This PR refactors the OPTIMIZE_HASH_GENERATION session property from a system-wide setting to a Java-worker-specific property, ensuring it is hidden and disabled for native execution. The implementation moves property registration and access into JavaWorkerSessionPropertyProvider, updates the HashGenerationOptimizer and PlanOptimizers to respect native execution, and adjusts benchmarks and tests to use the new property location and skip hash generation on native workers.

Class diagram for session property refactor (OPTIMIZE_HASH_GENERATION)

classDiagram
    class SystemSessionProperties {
        - (removed) OPTIMIZE_HASH_GENERATION : String
        - (removed) isOptimizeHashGenerationEnabled(Session) : boolean
    }
    class JavaWorkerSessionPropertyProvider {
        + OPTIMIZE_HASH_GENERATION : String
        + isOptimizeHashGenerationEnabled(Session) : boolean
    }
    SystemSessionProperties <.. JavaWorkerSessionPropertyProvider : refactor property location
Loading

Class diagram for HashGenerationOptimizer and PlanOptimizers update

classDiagram
    class HashGenerationOptimizer {
        + isEnabled(Session) : boolean
        - (updated) now uses JavaWorkerSessionPropertyProvider.isOptimizeHashGenerationEnabled
    }
    class PlanOptimizers {
        - (updated) builder.add(HashGenerationOptimizer) only if !nativeExecution
    }
    HashGenerationOptimizer <.. PlanOptimizers : used conditionally
Loading

File-Level Changes

Change Details Files
Migrate optimize-hash-generation property to JavaWorkerSessionPropertyProvider
  • Removed OPTIMIZE_HASH_GENERATION constant and registration from SystemSessionProperties
  • Added OPTIMIZE_HASH_GENERATION constant, booleanProperty registration, and accessor in JavaWorkerSessionPropertyProvider
SystemSessionProperties.java
JavaWorkerSessionPropertyProvider.java
Skip HashGenerationOptimizer on native execution
  • Wrapped HashGenerationOptimizer registration in PlanOptimizers with native-execution check
PlanOptimizers.java
Update HashGenerationOptimizer to use new property accessor
  • Changed isEnabled to call JavaWorkerSessionPropertyProvider.isOptimizeHashGenerationEnabled instead of SystemSessionProperties
HashGenerationOptimizer.java
Adjust benchmarks and native test config for new property location
  • Updated imports in BenchmarkSuite and HashBuildAndJoinBenchmark to reference JavaWorkerSessionPropertyProvider
  • Overrode createFeaturesConfig in AbstractTestAggregationsNative to enable native execution
BenchmarkSuite.java
HashBuildAndJoinBenchmark.java
AbstractTestAggregationsNative.java
Modify tests to remove old property references and update expected plans
  • Removed setSystemProperty calls for OPTIMIZE_HASH_GENERATION in native tests
  • Updated test imports and expected plan assertions to reflect skipped hash generation
TestAddExchangesPlansWithFunctions.java
TestPrecomputedHashes.java
TestNativeIndexJoinLogicalPlanner.java

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

@davve14
Copy link
Contributor Author

davve14 commented Oct 17, 2025

@pramodsatya please review the changes to the expected plans in the tests as I am a bit unsure on those.

@tdcmeehan tdcmeehan added the from:IBM PR from IBM label Oct 20, 2025
@prestodb-ci
Copy link
Contributor

Saved that user @davve14 is from IBM

@davve14 davve14 force-pushed the 25492-disable-native-optimize-hash-generation branch from fa93331 to 05017aa Compare October 26, 2025 11:25
@tdcmeehan tdcmeehan self-assigned this Oct 26, 2025
@davve14 davve14 force-pushed the 25492-disable-native-optimize-hash-generation branch from 05017aa to 9916172 Compare November 2, 2025 14:58
@davve14
Copy link
Contributor Author

davve14 commented Nov 3, 2025

@pramodsatya pushed a new commit. I restarted from scratch on the expected plans and it makes more sense now I believe. I should now have only removed the project node that was there to compute the hash early on. My understanding is that that will not happen in the same way now that we have optimize_hash_generation=false. Everything else should remain the same. I hope that makes sense.

@pramodsatya pramodsatya marked this pull request as ready for review November 3, 2025 23:27
@prestodb-ci prestodb-ci requested review from a team, pratyakshsharma and wanglinsong and removed request for a team November 3, 2025 23:27
Copy link
Contributor

@pramodsatya pramodsatya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @davve14, LGTM overall, please change the PR title to match the commit message, and update the release note.
@aditi-pandit, @tdcmeehan, could you please help review this change?

Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey there - I've reviewed your changes and they look great!

Prompt for AI Agents
Please address the comments from this code review:

## Individual Comments

### Comment 1
<location> `presto-tests/src/test/java/com/facebook/presto/tests/TestNativeIndexJoinLogicalPlanner.java:61` </location>
<code_context>
                 .setCatalog("tpch_indexed")
                 .setSchema(TINY_SCHEMA_NAME)
                 .setSystemProperty(NATIVE_EXECUTION_ENABLED, "true")
-                .setSystemProperty(OPTIMIZE_HASH_GENERATION, "false")
                 .build();

</code_context>

<issue_to_address>
**suggestion (testing):** Test removed setting of OPTIMIZE_HASH_GENERATION for native execution.

Verify that a test exists to confirm OPTIMIZE_HASH_GENERATION is inaccessible when native execution is enabled, and reference it if located elsewhere.

Suggested implementation:

```java
                .setCatalog("tpch_indexed")
                .setSchema(TINY_SCHEMA_NAME)
                .setSystemProperty(NATIVE_EXECUTION_ENABLED, "true")
                .build();

// Test to verify OPTIMIZE_HASH_GENERATION is inaccessible when native execution is enabled
@org.testng.annotations.Test
public void testOptimizeHashGenerationInaccessibleWithNativeExecutionEnabled()
{
    // Attempt to set OPTIMIZE_HASH_GENERATION with native execution enabled
    Map<String, String> sessionProperties = Map.of(
            NATIVE_EXECUTION_ENABLED, "true",
            "OPTIMIZE_HASH_GENERATION", "true"
    );

    // Build session with these properties
    var session = testSessionBuilder()
            .setCatalog("tpch_indexed")
            .setSchema(TINY_SCHEMA_NAME)
            .setSystemProperties(sessionProperties)
            .build();

    // Assert that OPTIMIZE_HASH_GENERATION is ignored or inaccessible
    // (Replace with actual assertion logic based on how the system handles this property)
    // For example:
    // assertFalse(session.getSystemProperty("OPTIMIZE_HASH_GENERATION").isPresent());
    // or
    // assertEquals(session.getSystemProperty("OPTIMIZE_HASH_GENERATION"), "false");
}

```

- You may need to adjust the assertion logic to match how your system exposes session properties.
- If a test already exists elsewhere, replace the new test with a comment referencing the location of the existing test, e.g.:
// See TestSessionProperties.testOptimizeHashGenerationInaccessibleWithNativeExecutionEnabled for coverage.
</issue_to_address>

### Comment 2
<location> `presto-main-base/src/test/java/com/facebook/presto/sql/query/TestPrecomputedHashes.java:21` </location>
<code_context>
 import java.util.List;
 import java.util.Map;

-import static com.facebook.presto.SystemSessionProperties.OPTIMIZE_HASH_GENERATION;
+import static com.facebook.presto.sessionpropertyproviders.JavaWorkerSessionPropertyProvider.OPTIMIZE_HASH_GENERATION;
 import static java.util.Objects.requireNonNull;

</code_context>

<issue_to_address>
**suggestion (testing):** Test updated to use new property location for OPTIMIZE_HASH_GENERATION.

Please verify that tests cover both Java worker and native execution scenarios for this property. If not, add a test to confirm the property is hidden or disabled in native execution.

Suggested implementation:

```java
public class TestPrecomputedHashes

```

```java
import static com.facebook.presto.sessionpropertyproviders.JavaWorkerSessionPropertyProvider.OPTIMIZE_HASH_GENERATION;
import static com.facebook.presto.testing.TestingSession.testSessionBuilder;

import static org.testng.Assert.assertFalse;
import static org.testng.Assert.assertTrue;

public class TestPrecomputedHashes

```

```java
public class TestPrecomputedHashes
{
    // Existing tests...

    @Test
    public void testOptimizeHashGenerationPropertyVisibleForJavaWorker()
    {
        // Simulate a session with Java worker enabled
        var session = testSessionBuilder()
                .setSystemProperty("java-worker-enabled", "true")
                .setSystemProperty(OPTIMIZE_HASH_GENERATION, "true")
                .build();

        // The property should be visible/enabled
        assertTrue(session.getSystemProperties().containsKey(OPTIMIZE_HASH_GENERATION),
                "OPTIMIZE_HASH_GENERATION should be visible for Java worker execution");
    }

    @Test
    public void testOptimizeHashGenerationPropertyHiddenForNativeExecution()
    {
        // Simulate a session with native execution enabled
        var session = testSessionBuilder()
                .setSystemProperty("native-execution-enabled", "true")
                .build();

        // The property should be hidden/disabled
        assertFalse(session.getSystemProperties().containsKey(OPTIMIZE_HASH_GENERATION),
                "OPTIMIZE_HASH_GENERATION should be hidden for native execution");
    }
}

```

- You may need to adjust the session property keys ("java-worker-enabled", "native-execution-enabled") to match your actual configuration.
- If your test framework or session builder uses a different way to simulate native vs Java worker execution, update the test setup accordingly.
- Ensure that the logic for hiding/disabling the property in native execution is implemented in the main codebase if not already present.
</issue_to_address>

### Comment 3
<location> `presto-main-base/src/test/java/com/facebook/presto/sql/planner/optimizations/TestAddExchangesPlansWithFunctions.java:233` </location>
<code_context>
     public void testJoinWithCppFunctionDoesNotGetPushedIntoSystemTableScan()
     {
         // java_bar is a java function, it is pushed down past the exchange
+        // Verify filter is below GATHER exchange, allowing it to execute in the system table scan fragment
         assertNativeDistributedPlan(
                 "SELECT c1.table_name FROM information_schema.columns c1 JOIN information_schema.columns c2 ON c1.ordinal_position = c2.ordinal_position WHERE java_bar(c1.ordinal_position) = 1",
</code_context>

<issue_to_address>
**suggestion (testing):** Added clarifying comment to test plan structure.

Please add assertions or helper methods to verify that HashGenerationOptimizer is absent from the plan during native execution, as required by this PR.

Suggested implementation:

```java
    public void testJoinWithCppFunctionDoesNotGetPushedIntoSystemTableScan()
    {
        // java_bar is a java function, it is pushed down past the exchange
        // Verify filter is below GATHER exchange, allowing it to execute in the system table scan fragment
        assertNativeDistributedPlan(
                "SELECT c1.table_name FROM information_schema.columns c1 JOIN information_schema.columns c2 ON c1.ordinal_position = c2.ordinal_position WHERE java_bar(c1.ordinal_position) = 1",
                anyTree(
                                join(INNER, ImmutableList.of(equiJoinClause("ordinal_position", "ordinal_position_4")),
                                        anyTree(
                                                exchange(REMOTE_STREAMING, GATHER,
                                                        filter("java_bar(ordinal_position) = BIGINT'1'",
                                                        // Additional assertion to verify HashGenerationOptimizer is absent
                                                        ),
                                                // Add helper to check optimizer absence
                                                noHashGenerationOptimizerInPlan()
                                        )
                                )
                        )
        );
    }

    private static PlanMatchPattern noHashGenerationOptimizerInPlan()
    {
        // This helper asserts that HashGenerationOptimizer is not present in the plan
        // (Assuming PlanMatchPattern.withoutOptimizer(String optimizerName) exists or similar utility)
        return PlanMatchPattern.noOptimizer("HashGenerationOptimizer");
    }

```

If `PlanMatchPattern.noOptimizer(String optimizerName)` or a similar utility does not exist, you will need to implement a helper that traverses the plan and asserts the absence of `HashGenerationOptimizer`. This may require additional code in your test utilities.
</issue_to_address>

### Comment 4
<location> `presto-native-tests/src/test/java/com/facebook/presto/nativetests/AbstractTestAggregationsNative.java:49-50` </location>
<code_context>
     @Override
     public boolean isEnabled(Session session)
     {
</code_context>

<issue_to_address>
**suggestion (testing):** Native test now explicitly enables native execution.

Please confirm there is a test verifying that OPTIMIZE_HASH_GENERATION is not accessible in this context, as required by the PR's impact statement.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

@davve14 davve14 changed the title refactor: move optimize-hash-generation to Java-worker-specific property fix(native): move optimize-hash-generation to Java-worker-specific property Nov 4, 2025
tdcmeehan
tdcmeehan previously approved these changes Nov 14, 2025
@davve14 davve14 changed the title fix(native): move optimize-hash-generation to Java-worker-specific property Fix(native): move optimize-hash-generation to Java-worker-specific property Nov 15, 2025
@davve14 davve14 changed the title Fix(native): move optimize-hash-generation to Java-worker-specific property fix(native): move optimize-hash-generation to Java-worker-specific property Nov 15, 2025
@davve14 davve14 changed the title fix(native): move optimize-hash-generation to Java-worker-specific property fix(native): Move optimize-hash-generation to Java-worker-specific property Nov 15, 2025
@davve14
Copy link
Contributor Author

davve14 commented Nov 17, 2025

Does the documentation for these config and session properties need any update or clarification added as a result of this PR?

https://github.com/prestodb/presto/blob/master/presto-docs/src/main/sphinx/admin/properties.rst#optimizeroptimize-hash-generation

https://github.com/prestodb/presto/blob/master/presto-docs/src/main/sphinx/presto_cpp/properties.rst#optimizeroptimize-hash-generation

https://github.com/prestodb/presto/blob/master/presto-docs/src/main/sphinx/admin/properties-session.rst#optimize_hash_generation

https://github.com/prestodb/presto/blob/master/presto-docs/src/main/sphinx/presto_cpp/properties-session.rst

Yes, I believe for the presto_cpp config property one the property should be removed altogether as it's not exposed anymore neither as a config nor a session property (I see it hasn't made its way into the session property documentation so all good there I think). And the section up top at the first presto_cpp link should have it removed as well.

For the java ones we can propably just keep as is. I can make the changes but maybe @pramodsatya can just confirm that it makes sense?

@pramodsatya
Copy link
Contributor

Thanks @steveburnett, would it be better a note in https://github.com/prestodb/presto/blob/master/presto-docs/src/main/sphinx/admin/properties-session.rst#optimize_hash_generation to clarify that this session property is not supported in Presto C++, similar to the note for config https://github.com/prestodb/presto/blob/master/presto-docs/src/main/sphinx/presto_cpp/properties.rst#optimizeroptimize-hash-generation ?

This session property is not present in https://github.com/prestodb/presto/blob/master/presto-docs/src/main/sphinx/presto_cpp/properties-session.rst, so I am also fine leaving the docs unchanged as @davve14 suggested. What do you recommend?

@steveburnett
Copy link
Contributor

Thanks @pramodsatya and @davve14! I like both of your ideas: add a note in the java doc that the property isn't supported in presto_cpp, and removal from the presto_cpp doc. Let me know if I've misunderstood.

@pramodsatya
Copy link
Contributor

pramodsatya commented Nov 17, 2025

Thanks @pramodsatya and @davve14! I like both of your ideas: add a note in the java doc that the property isn't supported in presto_cpp, and removal from the presto_cpp doc. Let me know if I've misunderstood.

Thanks @steveburnett, this PR only removes the session property optimize_hash_generation and not the coordinator config optimizer.optimize-hash-generation, so the removal from presto_cpp doc is not necessary. Also, the presto_cpp doc clarifies that optimizer.optimize-hash-generation should be set to false on the coordinator for C++ clusters, which is helpful.
@davve14, could you please add a note in https://github.com/prestodb/presto/blob/master/presto-docs/src/main/sphinx/admin/properties-session.rst#optimize_hash_generation to clarify that this session property is not supported in Presto C++?

@steveburnett, maybe in a follow-up PR, we can update documentation for Presto general session properties (in https://github.com/prestodb/presto/blob/master/presto-docs/src/main/sphinx/admin/properties-session.rst#general-properties), and separate out the Java worker only session properties from JavaWorkerSessionPropertyProvider.java into a separate doc, similar to the one for C++ (https://github.com/prestodb/presto/blob/master/presto-docs/src/main/sphinx/presto_cpp/properties-session.rst) ?

@steveburnett
Copy link
Contributor

Thanks @pramodsatya and @davve14! I like both of your ideas: add a note in the java doc that the property isn't supported in presto_cpp, and removal from the presto_cpp doc. Let me know if I've misunderstood.

Thanks @steveburnett, this PR only removes the session property optimize_hash_generation and not the coordinator config optimizer.optimize-hash-generation, so the removal from presto_cpp doc is not necessary. Also, the presto_cpp doc clarifies that optimizer.optimize-hash-generation should be set to false on the coordinator for C++ clusters, which is helpful. @davve14, could you please add a note in https://github.com/prestodb/presto/blob/master/presto-docs/src/main/sphinx/admin/properties-session.rst#optimize_hash_generation to clarify that this session property is not supported in Presto C++?

Makes sense @pramodsatya. Thank you for explaining!

@steveburnett, maybe in a follow-up PR, we can update documentation for Presto general session properties (in https://github.com/prestodb/presto/blob/master/presto-docs/src/main/sphinx/admin/properties-session.rst#general-properties), and separate out the Java worker only session properties from JavaWorkerSessionPropertyProvider.java into a separate doc, similar to the one for C++ (https://github.com/prestodb/presto/blob/master/presto-docs/src/main/sphinx/presto_cpp/properties-session.rst) ?

That's an interesting idea, I think it's worth exploring. Thanks!

…operty

  Move optimizer.optimize-hash-generation from system-wide property to
  JavaWorkerSessionPropertyProvider to ensure it's automatically hidden
  and disabled for native execution.
  Changes:
  - Move OPTIMIZE_HASH_GENERATION from SystemSessionProperties to JavaWorkerSessionPropertyProvider
  - Update HashGenerationOptimizer to read from JavaWorkerSessionPropertyProvider
  - Add conditional in PlanOptimizers to skip HashGenerationOptimizer for native execution
  - Update LocalQueryRunner.getPlanOptimizers to accept both FeaturesConfig and JavaFeaturesConfig
  - Update HashBuildAndJoinBenchmark to use new property location
  - Update tests including expected plan for tests running on native engine
  - Override createFeaturesConfig in AbstractTestAggregationsNative.java to not use optimize_hash_generation in native tests
@davve14 davve14 force-pushed the 25492-disable-native-optimize-hash-generation branch from 9916172 to 22add09 Compare November 19, 2025 09:29
@github-actions
Copy link

Codenotify: Notifying subscribers in CODENOTIFY files for diff 92a6920...22add09.

Notify File(s)
@aditi-pandit presto-parser/src/main/antlr4/com/facebook/presto/sql/parser/SqlBase.g4
@elharo presto-parser/src/main/antlr4/com/facebook/presto/sql/parser/SqlBase.g4
@kaikalur presto-parser/src/main/antlr4/com/facebook/presto/sql/parser/SqlBase.g4
@rschlussel presto-parser/src/main/antlr4/com/facebook/presto/sql/parser/SqlBase.g4

Copy link
Contributor

@steveburnett steveburnett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! (docs)

Pull branch, local doc build, looks good. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

from:IBM PR from IBM

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants