Writing / reading to / from file descriptor or memory directly #12
Comments
Yep, I was looking at pipes for the next version :) |
Great! For my use case, the best option would be to write to a file
descriptor, or a HANDLE on Windows.
|
I've added two new functions, Writing to R connections seems to be disallowed by CRAN normally (e.g. tidyverse/readr#856 (comment)), but can be enabled when compiling. I'm going to test it out a bit more and submit to CRAN. |
Thanks! Unfortunately for my use case R connections are not very good, just a simple Unix fd or a Windows HANDLE would be much better. |
I have this set up in two ways -- one way using R connections, the other way using FILE pointers created by
On the C++ side, this looks something like this:
Is that what you had in mind? Working with Windows handles or even Unix fds (which I understand are wrapped by FILE * pointers) is a bit beyond my current expertise, and Google isn't being particularly helpful. But I am happy to learn if you could give some tips or pointers on implementation. |
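For readers following the FILE pointer versus file descriptor distinction, here is a minimal POSIX sketch of wrapping an existing fd in a `FILE *` with `fdopen`. The helper name `write_via_file_pointer` is hypothetical and is not part of the package discussed in this thread; it only illustrates the relationship between the two kinds of handle.

```cpp
#include <cassert>
#include <stdio.h>
#include <string>
#include <unistd.h>

// Hypothetical helper (an assumption for illustration, not the package's API):
// write `data` to an already-open POSIX file descriptor through a FILE* stream.
// Returns the number of bytes written, or -1 on failure.
long write_via_file_pointer(int fd, const std::string& data) {
    assert(fd >= 0);
    // Duplicate the descriptor so that fclose() below closes only the copy
    // and the caller keeps ownership of `fd`.
    int dup_fd = dup(fd);
    if (dup_fd < 0) return -1;
    FILE* stream = fdopen(dup_fd, "wb");
    if (stream == nullptr) {
        close(dup_fd);
        return -1;
    }
    size_t written = fwrite(data.data(), 1, data.size(), stream);
    // fclose() flushes the stdio buffer to the descriptor and closes dup_fd.
    if (fclose(stream) != 0) return -1;
    return static_cast<long>(written);
}
```

The same function works for any descriptor that accepts writes (a regular file, a pipe, or a socket), which is why an fd-based interface is more general than one that takes a file name.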
Thanks! Well, almost. :) The best for us would be file descriptors, i.e. the integers returned by Then we could use |
Hi @gaborcsardi, I have a short toy example using file descriptors: *nix version: https://gist.github.com/traversc/e04911a86c8d581b058815d4aa7e7366 Do you mind looking it over and seeing if it's what you had in mind? Some questions for you: Since we can use file descriptors on both Windows and Unix-like systems, which would simplify things, do you think there is still a need to use Windows I'm still not quite clear how |
That's a good start! Unfortunately I don't think we can use the integer file descriptors on Windows: not everything is a file there, and shared memory handles, for example, will not work. But I am actually not completely sure about this. Re. mmap, we will do this:
Then we pass the fd to subprocesses, and they do an mmap on it as well, and unserialize. Some bits of this are in r-lib/processx#201 but it still needs quite a bit of rewriting. This has something like a serialization that only works for a list of atomic, non-character vectors. But it does have the advantage that the subprocesses do not need to unserialize; they can create the objects "within" the serialized data. This is something we probably lose with a proper serialization, unless we design a serialization format that explicitly supports it. |
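The mmap-over-a-shared-fd idea described above can be sketched roughly as follows on POSIX. This is not the processx code: the helper names and the unlinked temp file in `/tmp` are assumptions made for illustration, and both "sides" run in one process here, whereas in the described design the reader would be a subprocess that inherited the descriptor.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdlib.h>
#include <string>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical helper: create an unlinked temp file of `size` bytes and
// return its descriptor (-1 on failure). Assumes /tmp is writable.
int create_backing_fd(std::size_t size) {
    char path[] = "/tmp/mmap-sketch-XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0) return -1;
    unlink(path);  // storage stays alive while an fd or mapping refers to it
    if (ftruncate(fd, static_cast<off_t>(size)) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}

// Producer side: map the fd read/write and copy the payload into it.
bool write_through_mapping(int fd, const std::string& payload) {
    assert(!payload.empty());  // mmap rejects zero-length mappings
    void* addr = mmap(nullptr, payload.size(), PROT_READ | PROT_WRITE,
                      MAP_SHARED, fd, 0);
    if (addr == MAP_FAILED) return false;
    std::memcpy(addr, payload.data(), payload.size());
    return munmap(addr, payload.size()) == 0;
}

// Consumer side: map the same fd read-only. The copy into a std::string is
// only for demonstration; a zero-copy consumer would use `addr` in place.
std::string read_through_mapping(int fd, std::size_t size) {
    void* addr = mmap(nullptr, size, PROT_READ, MAP_SHARED, fd, 0);
    if (addr == MAP_FAILED) return std::string();
    std::string out(static_cast<const char*>(addr), size);
    munmap(addr, size);
    return out;
}
```

Because the mapping is `MAP_SHARED`, bytes written by the producer are visible to any process that maps the same descriptor, which is what lets a consumer build objects directly on top of the serialized data instead of copying it.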
Hi @gaborcsardi, I think I've put together all the requests in the latest commit. I had to do a bunch of re-factoring to use templates instead of assuming I have the following new functions:
I also have the following helper functions:
qsave and its variants now also invisibly return the number of bytes written (as a double; an int is too small for large data). Here are some examples: Data:
On Linux/Mac:
On Windows:
Serialize to raw vector:
Anyway, lmk what you think. Thanks. |
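On the remark that the byte count is returned as a double because an int is too small: R integers are 32-bit, so they top out at 2^31 - 1 (just under 2 GiB), while a double represents every integer exactly up to 2^53. A small sketch of that arithmetic follows; the function names are hypothetical and only illustrate the limits.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Largest value an R integer (32-bit signed) can hold: 2^31 - 1.
constexpr std::uint64_t kIntMax =
    static_cast<std::uint64_t>(std::numeric_limits<std::int32_t>::max());

// Doubles have a 53-bit significand, so every integer up to 2^53 is exact.
constexpr std::uint64_t kDoubleExactMax =
    std::uint64_t{1} << std::numeric_limits<double>::digits;

// Hypothetical check: would this byte count fit in a 32-bit integer?
bool fits_in_r_integer(std::uint64_t n_bytes) {
    return n_bytes <= kIntMax;
}

// Hypothetical conversion: report a byte count as a double without loss.
double byte_count_as_double(std::uint64_t n_bytes) {
    assert(n_bytes <= kDoubleExactMax);  // beyond 2^53 precision would be lost
    return static_cast<double>(n_bytes);
}
```

A 3 GB object, for example, overflows the 32-bit range but converts to a double exactly.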
Awesome! Thanks for doing this. I'll take a good look very soon, sorry for the delay. |
Can I use this features ( |
@artemklevtsov Technically, yes. You would have to open a file descriptor in append mode: https://stackoverflow.com/questions/7136416/opening-file-in-append-mode-using-open-api But I don't recommend doing this, as I can't guarantee that data will deserialize correctly if there are extra bytes at the end of a file. |
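For reference, opening a descriptor in append mode on POSIX means passing `O_APPEND` to `open`. The sketch below uses a hypothetical helper, `append_bytes`, to show only the descriptor mechanics; it says nothing about whether the serialization format tolerates appended data, which is the caveat raised above.

```cpp
#include <cassert>
#include <cstddef>
#include <fcntl.h>
#include <string>
#include <unistd.h>

// Hypothetical helper: append `data` to the file at `path` through a
// descriptor opened with O_APPEND. Returns bytes written, or -1 on failure.
long append_bytes(const char* path, const std::string& data) {
    assert(path != nullptr);
    // O_APPEND makes every write() land at the current end of the file.
    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd < 0) return -1;
    std::size_t done = 0;
    while (done < data.size()) {
        // write() may be partial, so loop until everything is written.
        ssize_t n = write(fd, data.data() + done, data.size() - done);
        if (n < 0) {
            close(fd);
            return -1;
        }
        done += static_cast<std::size_t>(n);
    }
    close(fd);
    return static_cast<long>(done);
}
```

In a real setup the descriptor would be kept open and handed to the fd-based writer rather than closed inside the helper; the sketch closes it only to stay self-contained.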
@traversc thank you for the explanation. Do you have any plans to add a feature like that? I'm looking for an alternative to the |
@artemklevtsov No plans for that feature as the format isn't set up for that, sorry. I know that with the Alternatively, you can save two separate data.frame objects and use |
Do you think it would be possible to add support for this? It would be great to be able to use a pipe/socket and also memory directly.