Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Read scenario fixed #2

Open
wants to merge 4 commits into
base: master
Choose a base branch
from
Open

Read scenario fixed #2

wants to merge 4 commits into from

Conversation

Bylon2
Copy link

@Bylon2 Bylon2 commented Dec 26, 2020

Fixed a bug in the callback (memcpy offset in the user buffer was missing) plus minor things to make it work for a "read" scenario

src/fcurl.c Outdated
return 1; /* no buffer data left and transfer complete == done */

if(!h->transfer_complete) {
do {
mc = curl_multi_wait(h->mh, NULL, 0, 5000, &numfds);
if (h->paused) {
/* Unpause + no 'wait' because it was paused with excess data */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be indented two spaces?

Copy link
Author

@Bylon2 Bylon2 Dec 30, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, it should! My bad, "styling" issue, I used 8 char TABs instead of spaces as in the original code. Fixed it with dec60b9

@@ -29,6 +29,7 @@ size_t fcurl_read(void *ptr, size_t size, size_t nmemb, FCURL *stream);
size_t fcurl_write(const void *ptr, size_t size, size_t nmemb,
FCURL *stream);
int fcurl_flush(FCURL *stream);
int fcurl_eof(FCURL *stream);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't really "fix the read scenario" but is unrelated.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, possibly mixed up with PR title that gets all the changes, the commit says "Provides a definition for fcurl_feof", which was missing. Should have made 2 PRs!

src/fcurl.c Outdated

return ret;
/* Still does not make difference between EOF and errors */
return (h->transfer_complete && 0 == h->b.used);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This adds an feof() version

Copy link
Author

@Bylon2 Bylon2 Dec 30, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, a basic one, still does not make difference eof/error, but is needed in a classic while(eof) { read; write; } loop. Without an "eof", the "read scenario" (in which I include the loop, so it is a question of how you define "scenario" !) does not work. But agreed, a single read would work without this addition.

@Bylon2
Copy link
Author

Bylon2 commented Dec 30, 2020

Sorry, I'm not completely at ease with PR! I should probably have made 2 different PR for the first 2 commits.
There is a 3rd commit not in the PR (or is it? it was not my intention!) since it is not "fix" but a different and shorter algorithm that does not pull all the data into the client and uses "pause/unpause" instead to let libcurl perform its magic.

@Bylon2
Copy link
Author

Bylon2 commented Dec 30, 2020

Discussion: the pause/unpause algorithm could have a negative impact when used with http/2, because current libcurl behaviour is (as far I understood the code) to close/open the http/2 window when pausing/unpausing.
Would it be an option to be able to "pause" without changing the http/2 window (CURLPAUSE_LOCAL_RECV or CURLOPT_KEEP_WINDOW for http/2)?
Is it already there and I missed it in the code?
Wouldn't then nghttp2 (or libcurl) do the buffering?
If the client reads too slowly, in the end it will be the same because the http/2 window will end up being full whether it has been closed or not. So the change would be useful only when you expect that the client's code is fast enough with the reads.

@bagder
Copy link
Member

bagder commented Dec 30, 2020

could have a negative impact when used with http/2, because current libcurl behaviour is (as far I understood the code) to close/open the http/2 window when pausing/unpausing.

I don't understand this comment. Are you saying that curl should maintain the window at the current size even though the transfer is paused? What would be the purpose of that and why?

Would it be an option to be able to "pause" without changing the http/2 window (CURLPAUSE_LOCAL_RECV or CURLOPT_KEEP_WINDOW for http/2)?

... and then do unlimited amounts of data buffered? That seems insane?

@Bylon2
Copy link
Author

Bylon2 commented Dec 30, 2020

Maybe my understanding is incorrect.
I understand the semantics of curl_easy_pause() as such:

  • The client (caller) asks the library not to send any more data at the moment, which means stop invoking the write_callback.
  • It does not say that the server itself should be ordered to stop sending data (which is currently implied with http/2).

I don't know http/2 well enough to differentiate the impact if libcurl only stops calling recv() on the socket or really sends the close window.
If you think that makes no difference, no change is required!

You are correct, buffering is another matter, and probably not needed.

@bagder
Copy link
Member

bagder commented Dec 30, 2020

The client (caller) asks the library not to send any more data at the moment, which means stop invoking the write_callback.

It also stops calling recv() on the socket if there's no other transfers going on over the same socket.

It does not say that the server itself should be ordered to stop sending data (which is currently implied with http/2).

TCP and HTTP/2 have "windows" for how much data the server is allowed to send and the server can keep on sending that amount. curl cannot change that. For TCP, that window handling and buffering is done in the kernel. For HTTP/2 and HTTP/3, the handling of that data needs to be done in the application if that data is read off the socket. curl cannot tell the server to "stop sending", but it can cease to update the window so that the server will eventually stop sending when the window is all used up.

@Bylon2
Copy link
Author

Bylon2 commented Dec 30, 2020

Understood, I remember my "old" TCP/IP class with the anticipation window to avoid network latency!

Separation of concern principle.

The caller's perspective when using curl_easy_pause() is that it does not want to be called on the callback at the moment.

The library should do "what's best", and that is not the concern of the caller... unless the library can't determine "what's best" without a clue from the caller, in which case there should be a way to give that clue.

With the current pause/unpause code, if the caller uses "reasonable" buffers like 32k on top of https, in fact pause is never called because the 16k TLS buffer fit nicely in the caller's buffers.

But if you try with "stupid buffers" like say 12371 byte, what happens is:
1- write_callback(16k)
2- caller consumes 12371 bytes
3- caller issues "pause"
4- (http/2) libcurl sends window close
5- caller writes data received from libcurl to local file (or whatever)
6- caller reads next buffer which issues "unpause"
7- (http/2) libcurl sends window re-open
8- write_callback(16k) -same as 1 because it was "paused"
9- caller consumes the 4013 bytes remaining
10- write_callback(16k) -new buffer
11- caller consumes 8358 bytes (its buffer is now full)
12- goto 3 (while not eof)

So if this close/reopen to the server would have an effect on network performance, shouldn't there be a way the caller tells libcurl how to best do it's network magic?

So indeed, as you phrase is: "cease to update the window", if given the clue. Something like CURLPAUSE_SHORT_RECV, if that could be useful for the library, when the caller anticipates step 5 above is "short".

[Edit note:] I have not made a "benchmark", when the client is writing quickly enough, to see whether this close/re-open in the case of http/2 would have a noticeable network performance effect, or none at all. I would have had to change curl's code for that, and network is the wisdom of the library, not of the client's code!

@bagder
Copy link
Member

bagder commented Dec 30, 2020

4- (http/2) libcurl sends window close

No. There's no such thing. curl has already announced to the server that the window is of size N. It can't shrink the window, it can only cease to increase it, which means the server can keep sending the number of bytes left in the window.

7- (http/2) libcurl sends window re-open

There's no "reopen". When the application starts reading from the stream again, curl will give the server more window again to start sending data.

@Bylon2
Copy link
Author

Bylon2 commented Dec 30, 2020

Perfect, so that should be all fine with http/2 too, thanks for the explanations!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants