-
Notifications
You must be signed in to change notification settings - Fork 245
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support to retry the host allocation for hybrid converters #12251
base: branch-25.04
Are you sure you want to change the base?
Support to retry the host allocation for hybrid converters #12251
Conversation
Signed-off-by: Firestarman <[email protected]>
Signed-off-by: Firestarman <[email protected]>
build |
Signed-off-by: Firestarman <[email protected]>
build |
tests/pom.xml
Outdated
<dependency> | ||
<groupId>com.nvidia</groupId> | ||
<artifactId>rapids-4-spark-hybrid_${scala.binary.version}</artifactId> | ||
<version>${project.version}</version> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use the version of hybrid itself like other places:
<dependency>
<groupId>com.nvidia</groupId>
<artifactId>rapids-4-spark-hybrid_${scala.binary.version}</artifactId>
<version>${spark-rapids-hybrid.version}</version>
<scope>provided</scope>
</dependency>
Like private jar, it has its own version.
During the release process, the project.version
is not always equal to spark-rapids-hybrid.version
.
And set scope as <scope>test</scope>
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice catch
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated. I created an item for hyrid jar in the dependencyManagement
section in the root pom to unify its version for sub modules.
Signed-off-by: Firestarman <[email protected]>
Signed-off-by: Firestarman <[email protected]>
build |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Signed-off-by: Firestarman <[email protected]>
Signed-off-by: Firestarman <[email protected]>
build |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Contributes to #8874
This PR adds the retry protection to host memory allocations used by the C2C converters in Hybrid scans. This is done by introducing a new class named
HybridHostRetryAllocator
who implements the Hybrid host allocator interface with retry support. It will close the allocated buffers just before each retry starts, to leave more memory for higher priority tasks.This change also introduces another new trait named
HostRetryAllocator
to extract the retry code from the hybrid things to avoid loading hybrid jar when running the new unit tests.This is to try to resolve the following kind of OOMs: