Implementation in CuPy with stream= and dl_device=cpu choices #152

Open
seberg opened this issue Oct 29, 2024 · 0 comments

seberg commented Oct 29, 2024

I implemented DLPack v1 for CuPy (see cupy/cupy#8683), and made two choices that matter for other implementations and maybe for the spec:

  1. We chose to export the cudaManaged device when possible, even if dl_device=(CPU, 0) was requested. I.e. we promise that the data can be used on the CPU device, but CuPy will currently still give you the actual (compatible) device!
    • Note: NumPy is OK with this in the case of CUDA managed memory, but it may not yet be OK with it for future or other similar devices. (I.e. NumPy may need to trust the producer in this case, or we keep it a bit fuzzy and assume the consumer should know the device, possibly based on version.)
  2. If the user passes dl_device=(CPU, 0), stream=...: I think we had discussed that the stream semantics must relate to the device that the data is on. CuPy supports this:
    • stream=None (or nothing passed) will synchronize the device-to-host copy (i.e. wait until the data is available on the CPU).
    • stream=consumer_stream will not synchronize. The user could in theory keep working with the data on consumer_stream (e.g. another cudaMemcpyAsync), or synchronize themselves (e.g. if multiple copies are needed).
    • REASON: Synchronizing in the second case would achieve nothing that stream=None doesn't already achieve. It would effectively do the same as stream=None and also synchronize consumer_stream. (But that stream does not need to be synchronized!)
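
To make the two choices above easier to compare against other producers, here is a minimal pure-Python sketch of the decision table. It is not CuPy code: `plan_cpu_export`, `ExportPlan`, and the `is_managed` flag are hypothetical names made up for illustration, and only the enum values are taken from dlpack.h. Two things are assumptions rather than statements from this issue: that the copied (non-managed) case is reported as kDLCPU, and that the same stream rule applies to managed memory. Special stream integers defined by the protocol are ignored here.

```python
from enum import IntEnum
from typing import NamedTuple, Optional


class DLDeviceType(IntEnum):
    """Subset of the DLDeviceType values from dlpack.h."""
    kDLCPU = 1
    kDLCUDA = 2
    kDLCUDAManaged = 13


class ExportPlan(NamedTuple):
    device_type: DLDeviceType    # device reported in the exported tensor
    copies_to_host: bool         # whether a device-to-host copy is made
    producer_synchronizes: bool  # whether the producer waits before returning


def plan_cpu_export(is_managed: bool, stream: Optional[int] = None) -> ExportPlan:
    """Hypothetical model of a dl_device=(CPU, 0) request (not CuPy's code)."""
    # Choice 2: stream=None means the producer waits until the data is
    # usable on the CPU; a consumer stream means the consumer handles ordering.
    sync = stream is None
    if is_managed:
        # Choice 1: keep the actual (CPU-compatible) device, no copy.
        # Assumption: the stream rule above is applied here as well.
        return ExportPlan(DLDeviceType.kDLCUDAManaged, False, sync)
    # Plain device memory: a device-to-host copy is needed.
    # Assumption: the copy is reported as kDLCPU.
    return ExportPlan(DLDeviceType.kDLCPU, True, sync)


if __name__ == "__main__":
    consumer_stream = 0x1234  # hypothetical stream handle
    for managed in (True, False):
        for stream in (None, consumer_stream):
            print(managed, stream, plan_cpu_export(managed, stream))
```

The point of the table is that passing a consumer stream only ever removes a wait; it never adds a synchronization of consumer_stream itself.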

CC @leofang.
