multi: event based rework
Rework the event based handling of transfers and connections to
be "localized" into a single source file with clearer dependencies.

- add multi_ev.c and multi_ev.h
- add docs/internal/MULTI-EV.md to explain the overall workings
- only do event handling bookkeeping when the socket callback
  is set
- add handling for "connection only" event tracking, when internal
  easy handles are used that are not really tied to a connection.
  Used in connection pool.
- remove transfer member "last_poll" and connections "shutdown_poll"
  and keep all that internal to multi_ev.c
- add CURL_TRC_M() for tracing of "multi" related things, including
  event handling and connection pool operations. Add new trace
  feature "multi" for trace config.
  multi traces will show exactly what is going on in regard to
  event handling.
- multi: trace transfers "mstate" in every CURL_TRC_M() call
- make internal trace buffer 2048 bytes and end the silliness
  with +n here -m there. Adjust test 1652 expectations of resulting
  length and input edge cases.
- add trace feature "lib-ids" to prefix libcurl traces with transfer
  and connection ids. Useful for debugging libcurl applications.

Closes curl#16308
icing authored and bagder committed Feb 22, 2025
1 parent 886f5de commit cfc657a
Showing 24 changed files with 1,305 additions and 764 deletions.
1 change: 1 addition & 0 deletions docs/Makefile.am
@@ -52,6 +52,7 @@ INTERNALDOCS = \
internals/HASH.md \
internals/LLIST.md \
internals/MQTT.md \
internals/MULTI-EV.md \
internals/NEW-PROTOCOL.md \
internals/README.md \
internals/SPLAY.md \
127 changes: 127 additions & 0 deletions docs/internals/MULTI-EV.md
@@ -0,0 +1,127 @@
<!--
Copyright (C) Daniel Stenberg, <[email protected]>, et al.
SPDX-License-Identifier: curl
-->

# Multi Event Based

A libcurl multi is operating "event based" when the application uses
an event library like `libuv` to monitor the sockets and file descriptors
libcurl uses to trigger transfer operations. How that works from the
application's point of view is described in libcurl-multi(3).

This document is about the internal handling.

## Source Locations

All code related to event based handling is found in `lib/multi_ev.c`
and `lib/multi_ev.h`. The header defines a set of internal functions
and `struct curl_multi_ev` that is embedded in each multi handle.

`Curl_multi_ev_init()` and `Curl_multi_ev_cleanup()` manage the overall
life cycle. They are called on creation and destruction of the multi
handle.

## Tracking Events

First, the various functions in `lib/multi_ev.h` only ever really do
something when the libcurl application has registered its callback
in `multi->socket_cb`.

This is important as this callback gets informed about *changes* to sockets.
When a new socket is added, an existing one is removed, or the `POLLIN/OUT`
flags change, `multi->socket_cb` needs to be invoked. `multi_ev` has to
track what it already reported to detect changes.

Most applications are expected to go "event based" right from the start,
but the libcurl API does not prohibit an application from starting another
way and then switching to events later on, even in the middle of a transfer.

### Transfer Events

Most events that happen are in connection with a transfer. A transfer
opens a connection, which opens a socket, and waits for this socket
to become writable (`POLLOUT`) when using TCP, for example.

The multi then calls `Curl_multi_ev_assess_xfer(multi, data)` to
let the multi event code detect what sockets the transfer is interested in.
If indeed a `multi->socket_cb` is set, the *current* transfer pollset is
retrieved via `Curl_multi_getsock()`. This current pollset is then
compared to the *previous* pollset. If relevant changes are detected,
`multi->socket_cb` gets informed about those. These can be:

* a socket is in the current set, but not the previous one
* a socket was also in the previous one, but IN/OUT flags changed
* a socket in the previous one is no longer part of the current
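
The three cases above can be sketched in a few lines of C. This is an illustration only and not the code in `lib/multi_ev.c`: the `sk_` types, constants and functions are hypothetical, and the sketch assumes that a socket in a pollset always carries at least one of the IN/OUT flags.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified types for illustration; not libcurl's own. */
#define SK_POLL_IN  0x1
#define SK_POLL_OUT 0x2
#define SK_MAX 4

struct sk_pollset {
  int sockets[SK_MAX];           /* socket identifiers */
  unsigned char actions[SK_MAX]; /* IN/OUT flags per socket */
  size_t num;                    /* number of used slots */
};

enum sk_change { SK_UNCHANGED, SK_ADDED, SK_FLAGS_CHANGED, SK_REMOVED };

/* Return the flags of `s` in `ps`, or 0 when `s` is not a member. */
static unsigned char sk_flags(const struct sk_pollset *ps, int s)
{
  size_t i;
  assert(ps);
  for(i = 0; i < ps->num; ++i) {
    if(ps->sockets[i] == s)
      return ps->actions[i];
  }
  return 0;
}

/* Classify how socket `s` differs between the previous and the
 * current pollset. Anything but SK_UNCHANGED would be reported
 * to the application's socket callback. */
static enum sk_change sk_classify(const struct sk_pollset *prev,
                                  const struct sk_pollset *cur, int s)
{
  unsigned char before = sk_flags(prev, s);
  unsigned char after = sk_flags(cur, s);

  if(!before && after)
    return SK_ADDED;
  if(before && !after)
    return SK_REMOVED;
  if(before != after)
    return SK_FLAGS_CHANGED;
  return SK_UNCHANGED;
}
```

A caller would run `sk_classify()` for every socket that appears in either pollset and invoke the callback only for the sockets that changed.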

`multi_ev.c` keeps a `struct mev_sh_entry` for each socket in a hash
with the socket as key. Each entry tracks which transfers are
interested in this particular socket, how many transfers want to read
and/or write, and what summarized `POLLIN/POLLOUT` action was last
reported to `multi->socket_cb`.

This is necessary as a socket may be in use by several transfers
at the same time (think HTTP/2 on the same connection). When a transfer
is done and gets removed from the socket entry, it decrements
the reader and/or writer count (depending on what it was last
interested in). This *may* or may not change the entry's summarized
action.
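
A minimal sketch of this reader/writer counting follows. The `ev_` names are hypothetical and the real `struct mev_sh_entry` holds more state; the sketch only shows why removing one transfer sometimes changes the summarized action and sometimes does not.

```c
#include <assert.h>

/* Hypothetical flags and entry layout, for illustration only. */
#define EV_POLL_IN  0x1
#define EV_POLL_OUT 0x2

struct ev_entry {
  unsigned int readers;   /* transfers that want to read */
  unsigned int writers;   /* transfers that want to write */
  unsigned char reported; /* action last reported to the callback */
};

/* Summarize the interest of all transfers into one action. */
static unsigned char ev_summary(const struct ev_entry *e)
{
  unsigned char action = 0;
  assert(e);
  if(e->readers)
    action |= EV_POLL_IN;
  if(e->writers)
    action |= EV_POLL_OUT;
  return action;
}

/* Remove one transfer's interest; `flags` is what it last wanted.
 * Returns 1 when the summarized action changed, meaning the
 * callback would have to be informed, and 0 otherwise. */
static int ev_remove_xfer(struct ev_entry *e, unsigned char flags)
{
  unsigned char now;
  assert(e);
  if((flags & EV_POLL_IN) && e->readers)
    --e->readers;
  if((flags & EV_POLL_OUT) && e->writers)
    --e->writers;
  now = ev_summary(e);
  if(now == e->reported)
    return 0;
  e->reported = now;
  return 1;
}
```

With two readers and one writer on a socket, removing one reader leaves the action at IN and OUT, so nothing needs to be reported. Removing the only writer drops the action to IN and triggers a report.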

### Connection Events

There are also events not connected to any transfer that need to be tracked.
The multi connection cache, concerned with clean shutdowns of connections,
is interested in socket events during the shutdown.

To allow use of the libcurl infrastructure, the connection cache operates
using an *internal* easy handle that is not a transfer as such. The
internal handle is used for all connection shutdown operations, being tied
to a particular connection only for a short time. This means tracking
the last pollset for an internal handle is useless.

Instead, the connection cache uses `Curl_multi_ev_assess_conn()` to have
multi event handling check the connection and track a "last pollset"
for the connection alone.

## Event Processing

When the libcurl application is informed by the event library that
a particular socket has an event, it calls `curl_multi_socket_action()`
to make libcurl react to it. This internally invokes
`Curl_multi_ev_expire_xfers()` which expires all transfers that
are interested in the given socket, so the multi handle runs them.

In addition `Curl_multi_ev_expire_xfers()` returns a `bool` to let
the multi know that connections are also interested in the socket, so
the connection pool should be informed as well.

## All Things Pass

When a transfer is done, e.g. removed from its multi handle, the
multi calls `Curl_multi_ev_xfer_done()`. This cleans up the pollset
tracking for the transfer.

When a connection is done, and before it is destroyed,
`Curl_multi_ev_conn_done()` is called. This cleans up the pollset
tracking for this connection.

When a socket is about to be closed, `Curl_multi_ev_socket_done()`
is called to clean up the socket entry and all information kept there.

These calls do not have to happen in any particular order. A transfer's
socket may be around while the transfer is ongoing. Or it might disappear
in the middle of things. Also, a transfer might be interested in several
sockets at the same time (resolving, eyeballing, and FTP are all examples
of this).

### And Come Again

While transfer and connection identifiers are practically unique in a
libcurl application, sockets are not. Operating systems are keen on reusing
their resources, and the next socket is highly likely to get the same
identifier as one that was just closed.

This means that multi event handling needs to be informed *before* a close,
clean up all its tracking and be ready to see that same socket identifier
again right after.
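
The ordering requirement can be illustrated with a small tracking table keyed by socket identifier. Everything here is a hypothetical sketch (the `trk_` names, the fixed-size table); libcurl itself uses a hash and different entry contents. The point is only that the entry is dropped before the close, so a reused identifier starts with a clean slate.

```c
#include <assert.h>
#include <stddef.h>

#define TRK_MAX 8 /* hypothetical fixed capacity */

struct trk_entry {
  int fd;                 /* socket identifier */
  int in_use;             /* slot is occupied */
  unsigned char reported; /* action last reported for this socket */
};

struct trk_table {
  struct trk_entry slots[TRK_MAX];
};

/* Look up the entry for `fd`, or NULL when it is not tracked. */
static struct trk_entry *trk_find(struct trk_table *t, int fd)
{
  size_t i;
  assert(t);
  for(i = 0; i < TRK_MAX; ++i) {
    if(t->slots[i].in_use && t->slots[i].fd == fd)
      return &t->slots[i];
  }
  return NULL;
}

/* Return the entry for `fd`, creating a fresh one when needed.
 * Returns NULL when the table is full. */
static struct trk_entry *trk_add(struct trk_table *t, int fd)
{
  size_t i;
  struct trk_entry *e = trk_find(t, fd);
  if(e)
    return e;
  for(i = 0; i < TRK_MAX; ++i) {
    if(!t->slots[i].in_use) {
      t->slots[i].fd = fd;
      t->slots[i].in_use = 1;
      t->slots[i].reported = 0;
      return &t->slots[i];
    }
  }
  return NULL;
}

/* To be called *before* the socket is closed: forget everything
 * tracked for `fd`. */
static void trk_will_close(struct trk_table *t, int fd)
{
  struct trk_entry *e = trk_find(t, fd);
  if(e) {
    e->in_use = 0;
    e->reported = 0;
  }
}
```

If `trk_will_close()` were skipped, a new socket that received the same identifier would inherit the stale `reported` value and changes to it might never be reported.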
23 changes: 22 additions & 1 deletion docs/libcurl/curl_global_trace.md
@@ -101,13 +101,34 @@ trace.
Tracing of DNS operations to resolve hostnames and HTTPS records.
## `lib-ids`
Adds transfer and connection identifiers as prefix to every call to
CURLOPT_DEBUGFUNCTION(3). The format is `[n-m]` where `n` is the identifier
of the transfer and `m` is the identifier of the connection. A literal `x`
is used for internal transfers or when no connection is assigned.
For example, `[5-x]` is the prefix for transfer 5 that has no
connection. The command line tool `curl` uses the same format for its
`--trace-ids` option.
`lib-ids` is intended for libcurl applications that handle multiple
transfers but have no way of their own to identify in trace output which
transfer a trace event is connected to.
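
The `[n-m]` format described above can be sketched as follows. `ids_prefix()` is a hypothetical helper written for this illustration, not a libcurl function, and the use of a negative value to mean "no identifier" is an assumption of the sketch.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: write the trace prefix for a transfer and
 * connection identifier into `buf`. A negative identifier stands
 * for "none" in this sketch and is printed as a literal `x`.
 * Returns what snprintf() returns. */
static int ids_prefix(char *buf, size_t len, long xfer_id, long conn_id)
{
  assert(buf && len);
  if(xfer_id >= 0 && conn_id >= 0)
    return snprintf(buf, len, "[%ld-%ld]", xfer_id, conn_id);
  if(xfer_id >= 0)
    return snprintf(buf, len, "[%ld-x]", xfer_id);
  if(conn_id >= 0)
    return snprintf(buf, len, "[x-%ld]", conn_id);
  return snprintf(buf, len, "[x-x]");
}
```

Calling `ids_prefix(buf, sizeof(buf), 5, -1)` writes `[5-x]`, matching the example of transfer 5 without a connection.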
## `doh`
Former name for DNS-over-HTTP operations. Now an alias for `dns`.
## `multi`
Traces multi operations managing transfers' state changes and sockets poll
states.
## `read`
Traces reading of upload data from the application in order to send it to the server.
Traces reading of upload data from the application in order to send it to the
server.
## `ssls`
2 changes: 2 additions & 0 deletions lib/Makefile.inc
@@ -198,6 +198,7 @@ LIB_CFILES = \
mprintf.c \
mqtt.c \
multi.c \
multi_ev.c \
netrc.c \
nonblock.c \
noproxy.c \
@@ -335,6 +336,7 @@ LIB_HFILES = \
mime.h \
mqtt.h \
multihandle.h \
multi_ev.h \
multiif.h \
netrc.h \
nonblock.h \
2 changes: 1 addition & 1 deletion lib/asyn-ares.c
@@ -245,7 +245,7 @@ static void sock_state_cb(void *data, ares_socket_t socket_fd,
struct Curl_easy *easy = data;
if(!readable && !writable) {
DEBUGASSERT(easy);
Curl_multi_closed(easy, socket_fd);
Curl_multi_will_close(easy, socket_fd);
}
}

2 changes: 1 addition & 1 deletion lib/asyn-thread.c
@@ -394,7 +394,7 @@ static void destroy_async_data(struct Curl_easy *data)
* ensure CURLMOPT_SOCKETFUNCTION fires CURL_POLL_REMOVE
* before the FD is invalidated to avoid EBADF on EPOLL_CTL_DEL
*/
Curl_multi_closed(data, sock_rd);
Curl_multi_will_close(data, sock_rd);
wakeup_close(sock_rd);
#endif

2 changes: 1 addition & 1 deletion lib/asyn.h
@@ -177,7 +177,7 @@ void Curl_resolver_kill(struct Curl_easy *data);

/* Curl_resolver_getsock()
*
* This function is called from the multi_getsock() function. 'sock' is a
* This function is called from the Curl_multi_getsock() function. 'sock' is a
* pointer to an array to hold the file descriptors, with 'numsock' being the
* size of that array (in number of entries). This function is supposed to
* return bitmask indicating what file descriptors (referring to array indexes
8 changes: 4 additions & 4 deletions lib/cf-socket.c
@@ -422,7 +422,7 @@ static int socket_close(struct Curl_easy *data, struct connectdata *conn,

if(use_callback && conn && conn->fclosesocket) {
int rc;
Curl_multi_closed(data, sock);
Curl_multi_will_close(data, sock);
Curl_set_in_callback(data, TRUE);
rc = conn->fclosesocket(conn->closesocket_client, sock);
Curl_set_in_callback(data, FALSE);
@@ -431,7 +431,7 @@ static int socket_close(struct Curl_easy *data, struct connectdata *conn,

if(conn)
/* tell the multi-socket code about this */
Curl_multi_closed(data, sock);
Curl_multi_will_close(data, sock);

sclose(sock);

@@ -997,7 +997,7 @@ static void cf_socket_close(struct Curl_cfilter *cf, struct Curl_easy *data)
struct cf_socket_ctx *ctx = cf->ctx;

if(ctx && CURL_SOCKET_BAD != ctx->sock) {
CURL_TRC_CF(data, cf, "cf_socket_close(%" FMT_SOCKET_T ")", ctx->sock);
CURL_TRC_CF(data, cf, "cf_socket_close, fd=%" FMT_SOCKET_T, ctx->sock);
if(ctx->sock == cf->conn->sock[cf->sockindex])
cf->conn->sock[cf->sockindex] = CURL_SOCKET_BAD;
socket_close(data, cf->conn, !ctx->accepted, ctx->sock);
@@ -1019,7 +1019,7 @@ static CURLcode cf_socket_shutdown(struct Curl_cfilter *cf,
if(cf->connected) {
struct cf_socket_ctx *ctx = cf->ctx;

CURL_TRC_CF(data, cf, "cf_socket_shutdown(%" FMT_SOCKET_T ")", ctx->sock);
CURL_TRC_CF(data, cf, "cf_socket_shutdown, fd=%" FMT_SOCKET_T, ctx->sock);
/* On TCP, and when the socket looks well and non-blocking mode
* can be enabled, receive dangling bytes before close to avoid
* entering RST states unnecessarily. */
13 changes: 7 additions & 6 deletions lib/cfilters.c
@@ -200,8 +200,8 @@ CURLcode Curl_conn_shutdown(struct Curl_easy *data, int sockindex, bool *done)
*done = FALSE;
now = Curl_now();
if(!Curl_shutdown_started(data, sockindex)) {
DEBUGF(infof(data, "shutdown start on%s connection",
sockindex ? " secondary" : ""));
CURL_TRC_M(data, "shutdown start on%s connection",
sockindex ? " secondary" : "");
Curl_shutdown_start(data, sockindex, &now);
}
else {
@@ -476,7 +476,7 @@ CURLcode Curl_conn_connect(struct Curl_easy *data,
/* In general, we want to send after connect, wait on that. */
if(sockfd != CURL_SOCKET_BAD)
Curl_pollset_set_out_only(data, &ps, sockfd);
Curl_conn_adjust_pollset(data, &ps);
Curl_conn_adjust_pollset(data, data->conn, &ps);
result = Curl_pollfds_add_ps(&cpfds, &ps);
if(result)
goto out;
@@ -626,14 +626,15 @@ void Curl_conn_cf_adjust_pollset(struct Curl_cfilter *cf,
}

void Curl_conn_adjust_pollset(struct Curl_easy *data,
struct easy_pollset *ps)
struct connectdata *conn,
struct easy_pollset *ps)
{
int i;

DEBUGASSERT(data);
DEBUGASSERT(data->conn);
DEBUGASSERT(conn);
for(i = 0; i < 2; ++i) {
Curl_conn_cf_adjust_pollset(data->conn->cfilter[i], data, ps);
Curl_conn_cf_adjust_pollset(conn->cfilter[i], data, ps);
}
}

3 changes: 2 additions & 1 deletion lib/cfilters.h
@@ -455,7 +455,8 @@ void Curl_conn_cf_adjust_pollset(struct Curl_cfilter *cf,
* Adjust pollset from filters installed at transfer's connection.
*/
void Curl_conn_adjust_pollset(struct Curl_easy *data,
struct easy_pollset *ps);
struct connectdata *conn,
struct easy_pollset *ps);

/**
* Curl_poll() the filter chain at `cf` with timeout `timeout_ms`.