-
Notifications
You must be signed in to change notification settings - Fork 936
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
support socketio auth option in connect handler #1497
base: main
Are you sure you want to change the base?
Conversation
Here is a test html I used to test with a stand-alone browser based client. |
Thanks for the contrib @nethi. We're looking into adding HTTP only cookie support, and making that the default auth for the frontend. That would significantly improve security and potentially resolve other issues as well. Might it work for your use case, too? |
Can you share some details as to how that would look ? Assume that would work with websockets too - haven't tried. |
I don't know about socketio specifically, but in general on the FastAPI side it seems cookie-based websocket auth is a pretty standard thing: https://fastapi.tiangolo.com/advanced/websockets/#using-depends-and-others AFAIK WebSockets start out with a normal HTTP request. The advantage of HTTP only cookies is that JS never has access to the auth token, so XSS attacks which are normally a pain to deal with disappear entirely (though CSRF attacks require proper config server-side). |
Is it possible to use Server Send Event and remove socket.io? So we can have simpler architecture and more stable, scalable with stateless servers. |
No, we're not gonna do that. SSE's are, like you said, server-sent events. Websockets are bidirectional. The connection persistence are what allows a smooth UX, the socketio bus is basically the core of our functionality. |
@nethi I'm looking to migrate to httponly based cookies as the default frontend -> backend token bearer over the next 2 weeks or so. This removes XSS attack vectors and will allow us to properly secure files, which are currently served, effectively, unauthenticated. That having been said, your patch looks like a good solution and perhaps we would like to have it in as a fixup (also for the 1.3 branch, which might not get the cookie-based auth). In the description, you're suggesting patching So, could you please add your suggested changes to useChatSession to the PR? :) |
@dokterbob It turns out that for websocket only transport to work with chainlit, needs more than this fix. Initially, I thought Authorization header is the only problem so sending it as Auth option is a good workaround. But today I have been debugging this for few hours - it seems any header in extraHeaders will not come through websocket transport for browser based clients as browser websocket APi doesn't have any APIs to send these headers. In addition to Authorization header, Chainlit depends on few X-Chainlit-* headers to be sent - key among them is the session id. I am thinking of using query parameters instead and ofcourse that means, more custom changes to chainlit code :-( In summary, you mayn't want "Auth" option to be passed by default in useChatSession.ts as it mayn't work for all. Probably, leave it as a cookbook. Given all this, I think Cookie approach looks to be a good option. Please consider extending that to X-Chainlit-* headers as well. Otherwise, it will be an incomplete solution. Being able to deploy websocket only transport is key to run in lot of multi-replica environments (K8s, reverse proxy, load balancers etc). |
@dokterbob #1339? only seems to enable sending cookies and other CORS related headers to the backend. By itself, this won't address the issues I mentioned. Btw, I would be looking at these fixes for my copilot enablement in the next week or two. I have been able to get everything working end to end with these additional changes - Here, I am passing additional headers as query parameters. Cookies could be another way to pass these chainlit headers in websocket only mode. This is example working code and not PR Test Client accessToken = "Bearer eyJhbXXX-XXXX";
const socket = io('http://localhost:8080', {
transports: ['websocket'],
query: {
'X-Chainlit-Client-Type': "browser",
'X-Chainlit-Session-Id': sessionId,
'X-Chainlit-Thread-Id': '',
'X-Chainlit-Chat-Profile': "profile1",
},
path: '/api/generic-agent/agent1/ws/socket.io',
auth: {
token: accessToken
},
extraHeaders: {
Authorization: accessToken,
'X-Chainlit-Client-Type': "browser",
'X-Chainlit-Session-Id': sessionId,
'X-Chainlit-Thread-Id': '',
'X-Chainlit-Chat-Profile': "profile1",
}
}); To have headers retrieved either form Query or from headers, I made these changes locally in chainlit backend socket.py def get_ws_header_parameter(environ, header_name):
from urllib.parse import parse_qs
query_string = environ.get('QUERY_STRING', '')
query_params = parse_qs(query_string)
value = query_params.get(header_name, [None])[0]
if value is None:
value = environ.get(f"HTTP_{header_name.replace('-', '_').upper()}")
return value
@sio.on("connect")
async def connect(sid, environ, auth=None):
....
....
session_id = get_ws_header_parameter(environ, "X-Chainlit-Session-Id")
if restore_existing_session(sid, session_id, emit_fn, emit_call_fn):
return True
user_env_string = get_ws_header_parameter(environ, "user-env")
user_env = load_user_env(user_env_string)
client_type = get_ws_header_parameter(environ, "X-Chainlit-Client-Type")
http_referer = environ.get("HTTP_REFERER")
url_encoded_chat_profile = get_ws_header_parameter(environ, "X-Chainlit-Chat-Profile")
chat_profile = (
unquote(url_encoded_chat_profile) if url_encoded_chat_profile else None
)
WebsocketSession(
id=session_id,
socket_id=sid,
emit=emit_fn,
emit_call=emit_call_fn,
client_type=client_type,
user_env=user_env,
user=user,
token=token,
chat_profile=chat_profile,
thread_id= get_ws_header_parameter(environ, "X-Chainlit-Thread-Id"),
languages=environ.get("HTTP_ACCEPT_LANGUAGE"),
http_referer=http_referer,
http_path=environ.get("PATH_INFO"),
)
trace_event("connection_successful")
return True |
Thanks for the update!
There's a LOT going on in the code and the community -- so I'm not always able to go fully in-depth on every issue. But other community members involved in the same part can! I noticed that the issues which both of you seem to be experiencing are related to load balancing and websockets. So I figured they might be related and, as such, you talking directly to the other person might be conducive to solving these issues, regardless of whether they're the same..
At one point, hopefully, we'll thoroughly review the full API. There's currently some parts stateful (e.g. with context and/or session) and some parts stateless, then there's also the authentication layer. I will start going deep into this over the coming 2/3 weeks, in part to improve Chainlit's security.
Any idea why? |
socketio by default uses ["polling", "websocket"] as transports in that order. When we have multiple instances of chainlit server with a load balancer in the front, this creates a problem. The service that responds with sessionId is not necessarily the same instance when subsequent request is made. While there are other complex strategies such as an external cache (redis etc), given websocket is supported broadly, a simple strategy is to use just "websocket" transport alone.
However, passing Authorization header based token mechanism doesn't work with browser based socketio clients, as described here, and here
chainlit/libs/react-client/src/useChatSession.ts
Line 103 in 67de9c7
Recommended solution with socketio is to use "auth" option
However, doing this in a custom client, throws following error in chainlit backend.
Basically, if auth option is enabled in client, python_socketio calls "connect" callback with a third argument "auth". But chainlit callback handler doesn't define a third argument.
chainlit/backend/chainlit/socket.py
Line 102 in 67de9c7
A simple solution is to add the third argument, auth, with default value of None and use auth["token"] if HTTP_AUTHORIZTION is not present. This is exactly done by this PR and works great.