support unspill for SpillableHostBuffer #12186

Open
wants to merge 10 commits into base: branch-25.04

Conversation

binmahone
Collaborator

@binmahone binmahone commented Feb 20, 2025

This PR closes #12184.

It only takes care of SpillableHostBuffer, by enhancing SpillableHostBufferHandle: materialize() is replaced by materialize(unspill: Boolean = false), so that the caller can express the intention to guarantee the host buffer is unspilled back to host memory (and thus avoid paying the price of reading this buffer from disk again).

Internally, if unspill is set to true, an unspill step is added to put the buffer back into the host store after reading it from disk (if applicable). The tricky part is when materialize(true) and spill() are called at the same time; to be more specific, when materialize(true) happens after spilling is set to true but the disk write has not finished. In this PR we use a flag called spillingInterrupted to indicate that the host memory buffer has been rescued; on seeing spillingInterrupted == true, the spilling thread should abandon what has already been written to disk and abort the current spilling procedure.
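To make the race concrete, here is a minimal, self-contained sketch of the idea. Buffer and the read/write helpers are simplified stand-ins for the real HostMemoryBuffer and DiskHandle machinery; this is not the PR's exact code:

```scala
// Simplified model of a spillable host buffer handle; Buffer stands in for
// HostMemoryBuffer, and disk I/O is simulated by cloning the byte array.
final case class Buffer(bytes: Array[Byte])

class HandleModel(initial: Buffer) {
  private var host: Option[Buffer] = Some(initial)
  private var disk: Option[Buffer] = None
  private var spilling = false
  private var spillingInterrupted = false

  private def writeToDisk(b: Buffer): Buffer = Buffer(b.bytes.clone())
  private def readFromDisk(b: Buffer): Buffer = Buffer(b.bytes.clone())

  def materialize(unspill: Boolean = false): Buffer = synchronized {
    host match {
      case Some(buf) =>
        if (spilling && unspill) {
          // A spill is in flight: rescue the buffer and tell the spilling
          // thread to abandon whatever it has staged to disk so far.
          spillingInterrupted = true
        }
        buf
      case None =>
        val buf = readFromDisk(disk.get)
        if (unspill) {
          host = Some(buf) // cache back on host so later reads skip the disk
        }
        buf
    }
  }

  def spill(): Unit = {
    val thisThreadSpills = synchronized {
      if (host.isDefined && !spilling) { spilling = true; true } else false
    }
    if (thisThreadSpills) {
      // The disk write happens outside the lock; this is exactly the window
      // in which materialize(true) can race with the spill.
      val staged = writeToDisk(host.get)
      synchronized {
        if (spillingInterrupted) {
          // materialize(true) rescued the buffer mid-spill: abandon the
          // staged disk copy and keep the host buffer.
          spillingInterrupted = false
        } else {
          disk = Some(staged)
          host = None
        }
        spilling = false
      }
    }
  }
}
```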

In this PR I also tried upgrading the mockito version from 3.12.4 to 4.11.0, in the hope that it would fix a static-mocking problem for me. Unfortunately it didn't work out. However, I think the upgrade is still worth keeping, so that our mockito version stays consistent with org.scalatestplus/mockito-4-11, which, as its name suggests, requires mockito 4.11.

Signed-off-by: Hongbin Ma (Mahone) <[email protected]>
@binmahone binmahone requested a review from a team as a code owner February 20, 2025 14:46
@binmahone binmahone marked this pull request as draft February 20, 2025 14:46
@binmahone binmahone changed the title support unspill for SpillableHostBuffer [DO NOT REVIEW] support unspill for SpillableHostBuffer Feb 20, 2025
@abellina
Collaborator

Let's add some details on how unspill works in this PR when you get a chance. I would also like to see a benchmark that shows the performance change from this PR: what is the impact of unspill in your scenario? Mostly I want to understand how bad things really are today.

Signed-off-by: Hongbin Ma (Mahone) <[email protected]>
Signed-off-by: Hongbin Ma (Mahone) <[email protected]>
Signed-off-by: Hongbin Ma (Mahone) <[email protected]>
@binmahone
Collaborator Author

binmahone commented Feb 25, 2025

Let's add some details on how unspill works in this PR when you get a chance. I would also like to see a benchmark that shows the performance change from this PR: what is the impact of unspill in your scenario? Mostly I want to understand how bad things really are today.

Hi @abellina, I have refined my PR and updated the PR description with how it works. Please take a look.

As to the benchmark: take the example in #12184 (comment) (we need #12215 fixed to run this example, though that fix in turn depends on this issue). If you limit the host memory to a relatively low threshold, you'll find it takes forever for the shuffle concat to finish, because its input pieces are constantly churning between host and disk.

@binmahone binmahone requested a review from abellina February 25, 2025 02:21
@binmahone binmahone self-assigned this Feb 25, 2025
@binmahone binmahone changed the title [DO NOT REVIEW] support unspill for SpillableHostBuffer support unspill for SpillableHostBuffer Feb 25, 2025
Signed-off-by: Hongbin Ma (Mahone) <[email protected]>
@binmahone binmahone marked this pull request as ready for review February 25, 2025 02:26
@binmahone
Collaborator Author

build

Signed-off-by: Hongbin Ma (Mahone) <[email protected]>
Signed-off-by: Hongbin Ma (Mahone) <[email protected]>
@abellina
Collaborator

abellina commented Feb 25, 2025

@binmahone please see my comment here: #12184 (comment). Is there a reason why the pattern change suggested there doesn't work?

@revans2
Collaborator

revans2 commented Feb 26, 2025

As to the benchmark: take the example in #12184 (comment) (we need #12215 fixed to run this example, though that fix in turn depends on this issue). If you limit the host memory to a relatively low threshold, you'll find it takes forever for the shuffle concat to finish, because its input pieces are constantly churning between host and disk.

So this brings up a very interesting point. Do we want to be able to run the concat operation when we do not have enough host memory to store the entire input shuffle table and the output?

My idea to stop the churn is in #12236 (comment), but it assumes that we want to have enough host memory to do the entire operation. If we have some kind of requirement to do it with less memory, I would prefer that we handle the split-and-retry exception and then try to concat fewer KudoTable buffers instead.
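A rough sketch of that split-and-retry alternative; SplitAndRetryOOM and the concat parameter here are stand-ins for the real spark-rapids retry machinery, not its actual API:

```scala
object ConcatRetryExample {
  // Stand-in for the plugin's real split-and-retry OOM signal.
  class SplitAndRetryOOM extends RuntimeException

  // On OOM, split the inputs in half and concat each half separately,
  // recursing further if a half still does not fit.
  def concatWithRetry[T](buffers: Seq[T])(concat: Seq[T] => T): Seq[T] =
    try {
      Seq(concat(buffers))
    } catch {
      case _: SplitAndRetryOOM if buffers.size > 1 =>
        val (left, right) = buffers.splitAt(buffers.size / 2)
        concatWithRetry(left)(concat) ++ concatWithRetry(right)(concat)
    }
}
```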

Collaborator

@revans2 revans2 left a comment

@binmahone I think your idea of unspill and my idea of unspill are a bit different.

For me, unspill implies that we can have both a DiskHandle and a HostMemoryBuffer at the same time inside the SpillableHostBufferHandle.

So if we spill data to disk and then read it back in again, we keep a copy of it cached on host.

If we go to spill and we have both disk and host, then it is simple: we just release the host buffer so it can be cleaned up; no writing to disk is needed.

If we run into a situation where we are in the middle of spilling and we want to materialize the buffer, then ideally we let the spill code know that it can finish writing to disk, but that it should not delete the host buffer, because someone grabbed it.

For me the only downside to this approach is the added complexity in the SpillFramework code, and the slightly higher probability that we might need to call spill more frequently if the lifetime of the SpillableHostBufferHandle is much longer than the lifetime of the HostMemoryBuffer that is returned by materialize.
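In code, this model looks roughly like the following self-contained sketch (simplified stand-in types, not the plugin's real classes):

```scala
// "Keep both copies" model: after an unspill the handle holds both a disk
// copy and a host copy, so a later spill just drops the host copy.
final case class Buf(bytes: Array[Byte])

class BothCopiesHandle(initial: Buf) {
  private var host: Option[Buf] = Some(initial)
  private var disk: Option[Buf] = None

  def materialize(): Buf = synchronized {
    host.getOrElse {
      val buf = Buf(disk.get.bytes.clone()) // read back from disk
      host = Some(buf) // keep a cached host copy; note that disk stays set
      buf
    }
  }

  // Returns the number of host bytes freed.
  def spill(): Long = synchronized {
    (host, disk) match {
      case (Some(buf), Some(_)) =>
        // A disk copy already exists: spilling is just releasing host.
        host = None
        buf.bytes.length.toLong
      case (Some(buf), None) =>
        // Host-only: write to disk first, then release host.
        disk = Some(Buf(buf.bytes.clone()))
        host = None
        buf.bytes.length.toLong
      case _ => 0L
    }
  }
}
```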

@binmahone
Collaborator Author

Do we want to be able to run the concat operation when we do not have enough host memory to store the entire input shuffle table and the output?

I think the answer is NO. That's why the code pattern will be "lock all input buffers first" (check out the snapshot in #12184 (comment)); it will make sure all inputs are already in memory, roughly as in the sketch below.
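A hedged sketch of that caller pattern, where handles, concat, and the close calls are hypothetical names (the real snapshot is in #12184 (comment)):

```scala
// Pin every input on host before starting; with unspill = true, none of
// the pieces can churn back to disk in the middle of the concat.
val hostBuffers = handles.map(_.materialize(unspill = true))
try {
  concat(hostBuffers) // all inputs are guaranteed resident on host here
} finally {
  hostBuffers.foreach(_.close())
}
```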

@binmahone
Collaborator Author

@binmahone I think your idea of unspill and my idea of unspill are a bit different.

For me, unspill implies that we can have both a DiskHandle and a HostMemoryBuffer at the same time inside the SpillableHostBufferHandle.

So if we spill data to disk and then read it back in again, we keep a copy of it cached on host.

If we go to spill and we have both disk and host, then it is simple: we just release the host buffer so it can be cleaned up; no writing to disk is needed.

If we run into a situation where we are in the middle of spilling and we want to materialize the buffer, then ideally we let the spill code know that it can finish writing to disk, but that it should not delete the host buffer, because someone grabbed it.

For me the only downside to this approach is the added complexity in the SpillFramework code, and the slightly higher probability that we might need to call spill more frequently if the lifetime of the SpillableHostBufferHandle is much longer than the lifetime of the HostMemoryBuffer that is returned by materialize.

Sorry @revans2, I'm a little confused; it appears to me that this PR's implementation matches your description. I need some advice on the next steps for this PR.

@sameerz sameerz added the reliability (Features to improve reliability or bugs that severely impact the reliability of the plugin) label Feb 28, 2025
@revans2
Collaborator

revans2 commented Mar 3, 2025

Sorry @revans2, I'm a little confused; it appears to me that this PR's implementation matches your description. I need some advice on the next steps for this PR.

There are a lot of places where it is not doing what I called out.

For me, unspill implies that we can have both a DiskHandle and a HostMemoryBuffer at the same time inside the SpillableHostBufferHandle.

So if we spill data to disk and then read it back in again, we keep a copy of it cached on host.

```scala
if (!closed && disk.isDefined) {
  disk = None
  host = Some(materialized)
  materialized.incRefCount()
  SpillFramework.stores.hostStore.trackNoSpill(this)
}
```

If we want to keep the disk copy around along with the host memory, then why are we setting disk to None here? If anyone asks to "unspill" this buffer again, the disk copy is gone and cannot be used.

If we go to spill and we have both disk and host, then it is simple: we just release the host buffer so it can be cleaned up; no writing to disk is needed.

That code is not implemented.

```scala
if (!closed && disk.isEmpty && host.isDefined && !spilling) {
```

If we have both a disk and a host, then line 413 will be false, which in turn sets thisThreadSpills to false. That also means that once we are in a situation with both host and disk, the host buffer will never be released, because disk.isEmpty is used as the gate to decide whether it can be spilled.
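For illustration, the gate would need something shaped like the following sketch (releaseHostCopy is a hypothetical helper, not the actual fix) so that the both-copies case stays spillable:

```scala
// When a disk copy already exists, "spilling" is just releasing the host
// copy; only the host-only case needs an actual disk write.
if (!closed && host.isDefined && !spilling) {
  if (disk.isDefined) {
    releaseHostCopy() // hypothetical helper: free host, keep disk
  } else {
    thisThreadSpills = true // normal path: stage a write to disk
  }
}
```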

If we run into a situation where we are in the middle of spilling and we want to materialize the buffer, then ideally we let the spill code know that it can finish writing to disk, but that it should not delete the host buffer, because someone grabbed it.

```scala
staging.foreach(_.close())
spillingInterrupted = false
0
```

Technically that is true of this code, but it also does not set disk, so if we hit a situation where we want to spill this memory again, we need to write it out to disk yet again. We don't have both at the same time.

As for the next steps with this PR: you should retest with the changes that were made in #12236 and no unspilling, and see if this is even needed at all. My guess is that those changes cut the churn enough that this is not really needed. If the performance shows that something like this would still help, then please look at the places I pointed out above so that unspill does what I want/expect. Then we would need follow-on issues to look at all of the handles and see if similar changes would help them too.
