Support server-sent events #664

Open
neet opened this issue Nov 3, 2022 · 8 comments
Labels
enhancement New feature or request

Comments

@neet
Owner

neet commented Nov 3, 2022

We need an option to choose which streaming implementation is used: server-sent events (SSE) or WebSocket (WS).

https://docs.joinmastodon.org/methods/timelines/streaming/#server-sent-events-http

@neet neet added the enhancement New feature or request label Nov 5, 2022
@assaf
Contributor

assaf commented Dec 20, 2022

Maybe this is a good opportunity to rework the streaming API to be an async iterator?

I can imagine something like:

for await (const status of masto.streaming.public) {
   ...
}

Question: what happens when the connection drops?

  • Iterator ends, no error
  • Throw an exception

(If you need code that consumes SSE with fetch I have some ready)
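
To make the two options concrete, here is a minimal sketch of a push-to-pull adapter that supports both endings. It is not Masto.js code: createEventQueue and its push, close and fail methods are hypothetical names. close() ends the iterator quietly, while fail() makes the for await loop throw once buffered events are drained.

```javascript
// Hypothetical helper: turns push-style callbacks into an async iterator.
function createEventQueue() {
  const buffered = []; // events that arrived before anyone asked for them
  const waiting = [];  // pending next() calls, each { resolve, reject }
  let done = false;
  let failure = null;

  return {
    // Deliver an event to a waiting consumer, or buffer it.
    push(value) {
      if (done) return;
      const waiter = waiting.shift();
      if (waiter) waiter.resolve({ value, done: false });
      else buffered.push(value);
    },
    // Intentional close: the iterator ends without an error.
    close() {
      done = true;
      for (const waiter of waiting.splice(0)) {
        waiter.resolve({ value: undefined, done: true });
      }
    },
    // Connection problem: the iterator throws after buffered events are read.
    fail(error) {
      done = true;
      failure = error;
      for (const waiter of waiting.splice(0)) waiter.reject(error);
    },
    [Symbol.asyncIterator]() {
      return {
        next() {
          if (buffered.length > 0) {
            return Promise.resolve({ value: buffered.shift(), done: false });
          }
          if (failure) return Promise.reject(failure);
          if (done) return Promise.resolve({ value: undefined, done: true });
          return new Promise((resolve, reject) => waiting.push({ resolve, reject }));
        },
        // Called when the consumer leaves the loop early (break or throw).
        return() {
          done = true;
          buffered.length = 0;
          return Promise.resolve({ value: undefined, done: true });
        },
      };
    },
  };
}
```

A transport (SSE or WebSocket) would call push for each event, close when the user asks to stop, and fail when the connection breaks.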

@neet
Owner Author

neet commented Dec 20, 2022

@assaf I'd also like to move the event interfaces to Async Iterators. We already use Async Iterators for pagination in our code base, so it'd be nice to integrate those APIs. We could also drop eventemitter3 from our dependencies.

(Just note that if we move to Async Iterators, I'd like to change all the event interfaces to them: not only SSE but also WebSocket.)

Question: what happens when the connection drops?

I don't know what you mean by drop, but the Web API EventSource emits an error event, which I think should correspond to AsyncIterator#throw, while calling close should correspond to AsyncIterator#return because it's intentional.

If you need code that consumes SSE with fetch I have some ready

I'd like to see what it looks like. Is there an advantage to implementing this with fetch instead of using the EventSource API?
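
For reference, consuming SSE without EventSource mostly comes down to parsing the text/event-stream format. The following is a minimal sketch under stated assumptions, not the code from this thread: parseServerSentEvents and parseBlock are hypothetical names, only the event and data fields are handled, and lines are assumed to end in LF or CRLF (the spec also allows a bare CR).

```javascript
// Hypothetical helper: parses one event block (the lines between blank lines).
function parseBlock(block) {
  let event = 'message'; // default event type when no "event:" field is present
  const data = [];
  for (const line of block.split('\n')) {
    if (line === '' || line.startsWith(':')) continue; // comment, e.g. a heartbeat
    const colon = line.indexOf(':');
    const field = colon === -1 ? line : line.slice(0, colon);
    let value = colon === -1 ? '' : line.slice(colon + 1);
    if (value.startsWith(' ')) value = value.slice(1); // one leading space is stripped
    if (field === 'event') event = value;
    else if (field === 'data') data.push(value);
  }
  // An event is only dispatched when it carries data.
  return data.length > 0 ? { event, data: data.join('\n') } : null;
}

// Hypothetical helper: turns an (async) iterable of string chunks into events.
// Chunks may split an event anywhere, so text is buffered until a blank line.
async function* parseServerSentEvents(chunks) {
  let buffer = '';
  for await (const chunk of chunks) {
    buffer = (buffer + chunk).replace(/\r\n/g, '\n');
    let boundary;
    while ((boundary = buffer.indexOf('\n\n')) !== -1) {
      const parsed = parseBlock(buffer.slice(0, boundary));
      buffer = buffer.slice(boundary + 2);
      if (parsed) yield parsed;
    }
  }
}

// Possible wiring with fetch (assumes a runtime where response.body is
// async-iterable, such as Node 18+):
//
//   const response = await fetch(url, { headers: { Accept: 'text/event-stream' } });
//   const chunks = response.body.pipeThrough(new TextDecoderStream());
//   for await (const { event, data } of parseServerSentEvents(chunks)) { /* ... */ }
```

Because the request is an ordinary fetch, it can carry any HTTP headers, and a failed read surfaces as a normal exception in the for await loop.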

@assaf
Contributor

assaf commented Dec 20, 2022

I don’t think it’s necessary to do both SSE and WS. It’s the same mechanism on the server, SSE just seems simpler to me.

I generally avoid EventSource — it feels incomplete, eg you can’t send HTTP headers (although Mastodon doesn’t support Last-Event-ID), Node support requires a polyfill, and error handling is opaque.

If the connection breaks, EventSource/WebSocket are supposed to reconnect. If they did, you'd lose events while they reconnect. OTOH if the iterator stops with an exception, you can make API requests to back-fill lost events.

If the server drops the connection without telling the client it's closed, then the only way the client can detect this is with a timeout. Mastodon sends a heartbeat every 1 second, so it's possible to add code that detects a timeout within a few seconds.
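
A minimal sketch of such timeout detection, assuming the source iterable yields something for every heartbeat (for example the raw chunk stream before parsing). withIdleTimeout is a hypothetical name, and the timeout value is for the caller to choose based on the server's actual heartbeat interval.

```javascript
// Hypothetical helper: re-yields items from `source`, but throws if the gap
// between two items exceeds `timeoutMs`.
async function* withIdleTimeout(source, timeoutMs) {
  const iterator = source[Symbol.asyncIterator]();
  try {
    while (true) {
      let timer;
      const timeout = new Promise((_, reject) => {
        timer = setTimeout(
          () => reject(new Error(`No data received for ${timeoutMs} ms`)),
          timeoutMs,
        );
      });
      let result;
      try {
        // Whichever settles first wins: the next item or the timer.
        result = await Promise.race([iterator.next(), timeout]);
      } finally {
        clearTimeout(timer);
      }
      if (result.done) return;
      yield result.value;
    }
  } finally {
    // Best-effort cleanup of the source; not awaited because a stalled
    // source may never settle. A real implementation would also abort
    // the underlying request.
    Promise.resolve(iterator.return?.()).catch(() => {});
  }
}
```

The consumer then sees a silent connection the same way it sees any other failure: the for await loop throws.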

@neet
Owner Author

neet commented Dec 22, 2022

@assaf

I don’t think it’s necessary to do both SSE and WS. It’s the same mechanism on the server, SSE just seems simpler to me.

You're right, but it'd be better to let users choose which implementation to use. Even though the event payloads are identical, WS and SSE establish connections slightly differently. In WS we can create multiple subscriptions through a single connection:

https://docs.joinmastodon.org/methods/streaming/#websocket
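
A rough sketch of what multiple subscriptions over one connection might look like. The message shape ({ type: 'subscribe', stream, ... }) is based on the Mastodon streaming docs linked above and should be verified against them; subscribeMessage and subscribeAll are hypothetical names, and socket is anything with a send method, such as a WebSocket.

```javascript
// Hypothetical helper: builds the JSON text for one subscription request.
// Extra parameters (for example `tag` for hashtag streams) are merged in.
function subscribeMessage(stream, params = {}) {
  return JSON.stringify({ type: 'subscribe', stream, ...params });
}

// Hypothetical helper: subscribes to several streams over a single socket.
function subscribeAll(socket, streams) {
  for (const { stream, ...params } of streams) {
    socket.send(subscribeMessage(stream, params));
  }
}

// Possible usage, assuming `socket` is an already-open WebSocket:
//
//   subscribeAll(socket, [{ stream: 'public' }, { stream: 'hashtag', tag: 'foo' }]);
```

With SSE, each of these streams would instead be its own HTTP request.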

it feels incomplete, eg you can’t send HTTP headers

This sounds critical because the doc says streaming a home timeline and notifications requires the Authorization header.

If the connection breaks, EventSource/WebSocket are supposed to reconnect. If they did, you'd lose events while they reconnect. OTOH if the iterator stops with an exception, you can make API requests to back-fill lost events.

I still haven't completely figured out what it takes to implement EventSource from scratch with fetch, but is it possible to get lost events even after the HTTP connection has been closed?

I'd like to see your PoC anyways 👍

@assaf
Contributor

assaf commented Dec 22, 2022

This is the code I use for reading the public timeline:
https://gist.github.com/assaf/e76a85e1cf434b61e02079335b98b04f

This sounds critical because the doc says streaming a home timeline and notifications requires the Authorization header.

So say the docs, but the server does accept access_token as a query parameter, since WebSocket also doesn't support HTTP headers.
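
The two ways of passing the token can be sketched as below. This is an illustration only: streamingRequest and its tokenInQuery option are hypothetical names, and the /api/v1/streaming/ path prefix is taken from the endpoints mentioned in this thread.

```javascript
// Hypothetical helper: builds the URL and fetch options for a streaming
// endpoint. `baseUrl`, `stream` and `accessToken` are supplied by the caller.
function streamingRequest(baseUrl, stream, accessToken, { tokenInQuery = false } = {}) {
  const url = new URL(`/api/v1/streaming/${stream}`, baseUrl);
  const headers = { Accept: 'text/event-stream' };
  if (accessToken && tokenInQuery) {
    // For clients that cannot set request headers.
    url.searchParams.set('access_token', accessToken);
  } else if (accessToken) {
    // Possible with fetch, which accepts arbitrary headers.
    headers.Authorization = `Bearer ${accessToken}`;
  }
  return { url: url.toString(), init: { headers } };
}

// Possible usage:
//
//   const { url, init } = streamingRequest('https://example.social', 'public', token);
//   const response = await fetch(url, init);
```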

but is it possible to get lost events even if the HTTP connection has been closed?

While SSE has this feature (Last-Event-Id), that's not supported in Mastodon.

The workaround is when the stream fails, you catch up by reading from the server, eg. /api/v1/streaming/public + /api/v1/timelines/public. It's not perfect, but it's better than nothing.

It would look something like:

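// ms("2m") comes from the "ms" package; masto, stream and doSomething are defined elsewhere
import ms from "ms";

async function readTimeline() {
  try {
    // Catch up on statuses published while we were disconnected
    const catchup = await masto.timelines.fetchPublic();
    for (const status of catchup.value)
      doSomething(status);
    // Then follow the live stream until it ends or fails
    const { events } = await stream('public');
    for await (const { status } of events)
      doSomething(status);
  } catch (error) {
    // Stream failed; fall through and reconnect below
  }
  // Pass the function itself so it actually runs after the delay
  setTimeout(readTimeline, ms("2m"));
}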

@neet
Owner Author

neet commented Dec 26, 2022

@assaf Thank you for sharing! It'll help a lot with the implementation.

access_token as a query parameter

(This is totally off-topic but) Inherently HTTPS doesn’t encrypt request URLs, so attackers can steal access tokens by setting up a honeypot. This is one of the biggest issues with WebSocket, so Mastodon allows us to use Sec-WebSocket-Protocol as a workaround to send access tokens, though the header was originally meant for listing subprotocols.

Avoiding the EventSource class is a good idea anyway.

workaround for Last-Event-Id

Your snippet above is a good idea, but I'm concerned that your technique can’t be applied to events other than the publication of a new status. There are events that you’ll never be able to fetch later if you missed the first delivery, such as delete, filters_changed, and status.update.

I thought it’s worth considering simply throwing an error in the event of a connection drop and delegating the catch-ups to the users, since incomplete catch-ups could lead to unnecessary confusion. As your snippet above shows, it doesn't look that difficult for users to catch up on lost events themselves if they really want to, even if Masto.js doesn't support it.

@assaf
Contributor

assaf commented Dec 26, 2022

I thought it’s worth considering simply throwing an error in the event of a connection drop and delegating the catch-ups to the users,

That's what I'm proposing, and how the code snippet works.

As soon as it detects a problem with the stream, it bails by throwing an exception. The streaming API does not attempt to recover from errors.

The application then has to deal with the error as best as it can, which is indeed limited.
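
On the application side, that handling could look roughly like the sketch below. It is an illustration with hypothetical names: followTimeline, fetchRecent (returns an array of statuses) and openStream (returns an async iterable of statuses) are placeholders, and statuses are assumed to carry a unique id. Because the back-fill and the stream can overlap, statuses are de-duplicated by id. Events such as delete still cannot be recovered this way.

```javascript
// Hypothetical helper: back-fills recent statuses, then follows the stream.
// Rejects when the stream fails, so the caller decides when to retry.
// Pass the same `seen` set on each retry to skip statuses already handled.
async function followTimeline({ fetchRecent, openStream, handle, seen = new Set() }) {
  const deliver = (status) => {
    if (seen.has(status.id)) return; // already handled before the drop
    seen.add(status.id);
    handle(status);
  };
  for (const status of await fetchRecent()) deliver(status);
  for await (const status of openStream()) deliver(status);
}

// Possible usage: retry with a pause, reusing the same `seen` set.
//
//   const seen = new Set();
//   while (true) {
//     try {
//       await followTimeline({ fetchRecent, openStream, handle, seen });
//     } catch (error) {
//       console.error('stream failed, reconnecting', error);
//     }
//     await new Promise((resolve) => setTimeout(resolve, 120000));
//   }
```

Note that the seen set grows without bound in this sketch; a long-running application would want to cap it.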

Inherently HTTPS doesn’t encrypt request URLs

FYI HTTPS is essentially HTTP over TLS over TCP, so everything in the HTTP request — GET /path?token HTTP/1, Authorization: header, body document — is encrypted by the TLS stream.

The only piece of information that's not encrypted is the hostname. The client sends it in the TLS handshake as Server Name Indication (SNI) so load balancers can route requests to the right server.

There's a different reason to avoid access tokens in URLs: the request path is often logged including query string and without filtering.
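
If a token does end up in a URL, one mitigation is to redact it before the URL is written to a log. A minimal sketch, where redactAccessToken is a hypothetical helper and access_token is the query parameter name used earlier in this thread:

```javascript
// Hypothetical helper: masks the access_token query parameter so the URL
// can be logged without leaking the token.
function redactAccessToken(rawUrl) {
  const url = new URL(rawUrl);
  if (url.searchParams.has('access_token')) {
    url.searchParams.set('access_token', 'REDACTED');
  }
  return url.toString();
}

// Possible usage:
//
//   console.log('connecting to', redactAccessToken(streamUrl));
```

This only covers the client's own logs; proxies and servers along the way may still log the full URL.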

@neet
Owner Author

neet commented Dec 26, 2022

Okay, now I understand and agree with your proposal.

HTTPS

Ah... I'd heard there are some drawbacks to using query parameters for authorisation, and I'd previously seen that the HTTP debugger Charles can show the hostname of HTTPS requests even without modified certificates, so I mixed everything up.

@neet neet added the v6 label Dec 31, 2022
@neet neet removed the v6 label Jan 21, 2023
@neet neet added this to the v6.0.0 milestone Jan 21, 2023
@neet neet removed this from the v6.0.0 milestone Jul 27, 2023
2 participants