-
Notifications
You must be signed in to change notification settings - Fork 5.5k
fix(native): Move optimize-hash-generation to Java-worker-specific property #26348
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
fix(native): Move optimize-hash-generation to Java-worker-specific property #26348
Conversation
Reviewer's GuideThis PR refactors the OPTIMIZE_HASH_GENERATION session property from a system-wide setting to a Java-worker-specific property, ensuring it is hidden and disabled for native execution. The implementation moves property registration and access into JavaWorkerSessionPropertyProvider, updates the HashGenerationOptimizer and PlanOptimizers to respect native execution, and adjusts benchmarks and tests to use the new property location and skip hash generation on native workers. Class diagram for session property refactor (OPTIMIZE_HASH_GENERATION)classDiagram
class SystemSessionProperties {
- (removed) OPTIMIZE_HASH_GENERATION : String
- (removed) isOptimizeHashGenerationEnabled(Session) : boolean
}
class JavaWorkerSessionPropertyProvider {
+ OPTIMIZE_HASH_GENERATION : String
+ isOptimizeHashGenerationEnabled(Session) : boolean
}
SystemSessionProperties <.. JavaWorkerSessionPropertyProvider : refactor property location
Class diagram for HashGenerationOptimizer and PlanOptimizers updateclassDiagram
class HashGenerationOptimizer {
+ isEnabled(Session) : boolean
- (updated) now uses JavaWorkerSessionPropertyProvider.isOptimizeHashGenerationEnabled
}
class PlanOptimizers {
- (updated) builder.add(HashGenerationOptimizer) only if !nativeExecution
}
HashGenerationOptimizer <.. PlanOptimizers : used conditionally
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
|
@pramodsatya please review the changes to the expected plans in the tests as I am a bit unsure on those. |
|
Saved that user @davve14 is from IBM |
fa93331 to
05017aa
Compare
05017aa to
9916172
Compare
|
@pramodsatya pushed a new commit. I restarted from scratch on the expected plans and it makes more sense now I believe. I should now have only removed the project node that was there to compute the hash early on. My understanding is that that will not happen in the same way now that we have optimize_hash_generation=false. Everything else should remain the same. I hope that makes sense. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @davve14, LGTM overall, please change the PR title to match the commit message, and update the release note.
@aditi-pandit, @tdcmeehan, could you please help review this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey there - I've reviewed your changes and they look great!
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location> `presto-tests/src/test/java/com/facebook/presto/tests/TestNativeIndexJoinLogicalPlanner.java:61` </location>
<code_context>
.setCatalog("tpch_indexed")
.setSchema(TINY_SCHEMA_NAME)
.setSystemProperty(NATIVE_EXECUTION_ENABLED, "true")
- .setSystemProperty(OPTIMIZE_HASH_GENERATION, "false")
.build();
</code_context>
<issue_to_address>
**suggestion (testing):** Test removed setting of OPTIMIZE_HASH_GENERATION for native execution.
Verify that a test exists to confirm OPTIMIZE_HASH_GENERATION is inaccessible when native execution is enabled, and reference it if located elsewhere.
Suggested implementation:
```java
.setCatalog("tpch_indexed")
.setSchema(TINY_SCHEMA_NAME)
.setSystemProperty(NATIVE_EXECUTION_ENABLED, "true")
.build();
// Test to verify OPTIMIZE_HASH_GENERATION is inaccessible when native execution is enabled
@org.testng.annotations.Test
public void testOptimizeHashGenerationInaccessibleWithNativeExecutionEnabled()
{
// Attempt to set OPTIMIZE_HASH_GENERATION with native execution enabled
Map<String, String> sessionProperties = Map.of(
NATIVE_EXECUTION_ENABLED, "true",
"OPTIMIZE_HASH_GENERATION", "true"
);
// Build session with these properties
var session = testSessionBuilder()
.setCatalog("tpch_indexed")
.setSchema(TINY_SCHEMA_NAME)
.setSystemProperties(sessionProperties)
.build();
// Assert that OPTIMIZE_HASH_GENERATION is ignored or inaccessible
// (Replace with actual assertion logic based on how the system handles this property)
// For example:
// assertFalse(session.getSystemProperty("OPTIMIZE_HASH_GENERATION").isPresent());
// or
// assertEquals(session.getSystemProperty("OPTIMIZE_HASH_GENERATION"), "false");
}
```
- You may need to adjust the assertion logic to match how your system exposes session properties.
- If a test already exists elsewhere, replace the new test with a comment referencing the location of the existing test, e.g.:
// See TestSessionProperties.testOptimizeHashGenerationInaccessibleWithNativeExecutionEnabled for coverage.
</issue_to_address>
### Comment 2
<location> `presto-main-base/src/test/java/com/facebook/presto/sql/query/TestPrecomputedHashes.java:21` </location>
<code_context>
import java.util.List;
import java.util.Map;
-import static com.facebook.presto.SystemSessionProperties.OPTIMIZE_HASH_GENERATION;
+import static com.facebook.presto.sessionpropertyproviders.JavaWorkerSessionPropertyProvider.OPTIMIZE_HASH_GENERATION;
import static java.util.Objects.requireNonNull;
</code_context>
<issue_to_address>
**suggestion (testing):** Test updated to use new property location for OPTIMIZE_HASH_GENERATION.
Please verify that tests cover both Java worker and native execution scenarios for this property. If not, add a test to confirm the property is hidden or disabled in native execution.
Suggested implementation:
```java
public class TestPrecomputedHashes
```
```java
import static com.facebook.presto.sessionpropertyproviders.JavaWorkerSessionPropertyProvider.OPTIMIZE_HASH_GENERATION;
import static com.facebook.presto.testing.TestingSession.testSessionBuilder;
import static org.testng.Assert.assertFalse;
import static org.testng.Assert.assertTrue;
public class TestPrecomputedHashes
```
```java
public class TestPrecomputedHashes
{
// Existing tests...
@Test
public void testOptimizeHashGenerationPropertyVisibleForJavaWorker()
{
// Simulate a session with Java worker enabled
var session = testSessionBuilder()
.setSystemProperty("java-worker-enabled", "true")
.setSystemProperty(OPTIMIZE_HASH_GENERATION, "true")
.build();
// The property should be visible/enabled
assertTrue(session.getSystemProperties().containsKey(OPTIMIZE_HASH_GENERATION),
"OPTIMIZE_HASH_GENERATION should be visible for Java worker execution");
}
@Test
public void testOptimizeHashGenerationPropertyHiddenForNativeExecution()
{
// Simulate a session with native execution enabled
var session = testSessionBuilder()
.setSystemProperty("native-execution-enabled", "true")
.build();
// The property should be hidden/disabled
assertFalse(session.getSystemProperties().containsKey(OPTIMIZE_HASH_GENERATION),
"OPTIMIZE_HASH_GENERATION should be hidden for native execution");
}
}
```
- You may need to adjust the session property keys ("java-worker-enabled", "native-execution-enabled") to match your actual configuration.
- If your test framework or session builder uses a different way to simulate native vs Java worker execution, update the test setup accordingly.
- Ensure that the logic for hiding/disabling the property in native execution is implemented in the main codebase if not already present.
</issue_to_address>
### Comment 3
<location> `presto-main-base/src/test/java/com/facebook/presto/sql/planner/optimizations/TestAddExchangesPlansWithFunctions.java:233` </location>
<code_context>
public void testJoinWithCppFunctionDoesNotGetPushedIntoSystemTableScan()
{
// java_bar is a java function, it is pushed down past the exchange
+ // Verify filter is below GATHER exchange, allowing it to execute in the system table scan fragment
assertNativeDistributedPlan(
"SELECT c1.table_name FROM information_schema.columns c1 JOIN information_schema.columns c2 ON c1.ordinal_position = c2.ordinal_position WHERE java_bar(c1.ordinal_position) = 1",
</code_context>
<issue_to_address>
**suggestion (testing):** Added clarifying comment to test plan structure.
Please add assertions or helper methods to verify that HashGenerationOptimizer is absent from the plan during native execution, as required by this PR.
Suggested implementation:
```java
public void testJoinWithCppFunctionDoesNotGetPushedIntoSystemTableScan()
{
// java_bar is a java function, it is pushed down past the exchange
// Verify filter is below GATHER exchange, allowing it to execute in the system table scan fragment
assertNativeDistributedPlan(
"SELECT c1.table_name FROM information_schema.columns c1 JOIN information_schema.columns c2 ON c1.ordinal_position = c2.ordinal_position WHERE java_bar(c1.ordinal_position) = 1",
anyTree(
join(INNER, ImmutableList.of(equiJoinClause("ordinal_position", "ordinal_position_4")),
anyTree(
exchange(REMOTE_STREAMING, GATHER,
filter("java_bar(ordinal_position) = BIGINT'1'",
// Additional assertion to verify HashGenerationOptimizer is absent
),
// Add helper to check optimizer absence
noHashGenerationOptimizerInPlan()
)
)
)
);
}
private static PlanMatchPattern noHashGenerationOptimizerInPlan()
{
// This helper asserts that HashGenerationOptimizer is not present in the plan
// (Assuming PlanMatchPattern.withoutOptimizer(String optimizerName) exists or similar utility)
return PlanMatchPattern.noOptimizer("HashGenerationOptimizer");
}
```
If `PlanMatchPattern.noOptimizer(String optimizerName)` or a similar utility does not exist, you will need to implement a helper that traverses the plan and asserts the absence of `HashGenerationOptimizer`. This may require additional code in your test utilities.
</issue_to_address>
### Comment 4
<location> `presto-native-tests/src/test/java/com/facebook/presto/nativetests/AbstractTestAggregationsNative.java:49-50` </location>
<code_context>
@Override
public boolean isEnabled(Session session)
{
</code_context>
<issue_to_address>
**suggestion (testing):** Native test now explicitly enables native execution.
Please confirm there is a test verifying that OPTIMIZE_HASH_GENERATION is not accessible in this context, as required by the PR's impact statement.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
|
Does the documentation for these config and session properties need any update or clarification added as a result of this PR? |
Yes, I believe for the presto_cpp config property one the property should be removed altogether as it's not exposed anymore neither as a config nor a session property (I see it hasn't made its way into the session property documentation so all good there I think). And the section up top at the first presto_cpp link should have it removed as well. For the java ones we can propably just keep as is. I can make the changes but maybe @pramodsatya can just confirm that it makes sense? |
|
Thanks @steveburnett, would it be better a note in https://github.com/prestodb/presto/blob/master/presto-docs/src/main/sphinx/admin/properties-session.rst#optimize_hash_generation to clarify that this session property is not supported in Presto C++, similar to the note for config https://github.com/prestodb/presto/blob/master/presto-docs/src/main/sphinx/presto_cpp/properties.rst#optimizeroptimize-hash-generation ? This session property is not present in https://github.com/prestodb/presto/blob/master/presto-docs/src/main/sphinx/presto_cpp/properties-session.rst, so I am also fine leaving the docs unchanged as @davve14 suggested. What do you recommend? |
|
Thanks @pramodsatya and @davve14! I like both of your ideas: add a note in the java doc that the property isn't supported in presto_cpp, and removal from the presto_cpp doc. Let me know if I've misunderstood. |
Thanks @steveburnett, this PR only removes the session property @steveburnett, maybe in a follow-up PR, we can update documentation for Presto general session properties (in https://github.com/prestodb/presto/blob/master/presto-docs/src/main/sphinx/admin/properties-session.rst#general-properties), and separate out the Java worker only session properties from JavaWorkerSessionPropertyProvider.java into a separate doc, similar to the one for C++ (https://github.com/prestodb/presto/blob/master/presto-docs/src/main/sphinx/presto_cpp/properties-session.rst) ? |
Makes sense @pramodsatya. Thank you for explaining!
That's an interesting idea, I think it's worth exploring. Thanks! |
…operty Move optimizer.optimize-hash-generation from system-wide property to JavaWorkerSessionPropertyProvider to ensure it's automatically hidden and disabled for native execution. Changes: - Move OPTIMIZE_HASH_GENERATION from SystemSessionProperties to JavaWorkerSessionPropertyProvider - Update HashGenerationOptimizer to read from JavaWorkerSessionPropertyProvider - Add conditional in PlanOptimizers to skip HashGenerationOptimizer for native execution - Update LocalQueryRunner.getPlanOptimizers to accept both FeaturesConfig and JavaFeaturesConfig - Update HashBuildAndJoinBenchmark to use new property location - Update tests including expected plan for tests running on native engine - Override createFeaturesConfig in AbstractTestAggregationsNative.java to not use optimize_hash_generation in native tests
9916172 to
22add09
Compare
|
Codenotify: Notifying subscribers in CODENOTIFY files for diff 92a6920...22add09.
|
steveburnett
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! (docs)
Pull branch, local doc build, looks good. Thanks!
Move optimizer.optimize-hash-generation from system-wide property to
JavaWorkerSessionPropertyProvider to ensure it's automatically hidden
and disabled for native execution.
Changes:
Description
Motivation and Context
#25492
Impact
Removes session and config property OPTIMIZE_HASH_GENERATION when running on native worker.
Test Plan
Test to make sure that when running native worker, the above property should not be allowed to be set and not showing in the show session output. Tests checking expected vs generated plan should be without the HashGenerationOptimizer optimizer in the expected plan.
Contributor checklist
Release Notes
Please follow release notes guidelines and fill in the release notes below.