Automatically generate documentation for Server Process options #508

Draft
wants to merge 9 commits into
base: main

manics
Member

@manics manics commented Oct 27, 2024

Currently the documentation for Server Process options and the documentation in the code comments is redundant:

command
An optional list of strings that should be the full command to be executed.
The optional template arguments {{port}}, {{unix_socket}} and {{base_url}}
will be substituted with the port or Unix socket path the process should
listen on and the base-url of the notebook.
Could also be a callable. It should return a list.
If the command is not specified or is an empty list, the server
process is assumed to be started ahead of time and already available
to be proxied to.
environment
A dictionary of environment variable mappings. As with the command
traitlet, {{port}}, {{unix_socket}} and {{base_url}} will be substituted.
Could also be a callable. It should return a dictionary.
timeout
Timeout in seconds for the process to become ready, default 5s.
absolute_url
Proxy requests default to being rewritten to '/'. If this is True,
the absolute URL will be sent to the backend instead.
port
Set the port that the service will listen on. The default is to automatically select an unused port.
unix_socket
If set, the service will listen on a Unix socket instead of a TCP port.
Set to True to use a socket in a new temporary folder, or a string
path to a socket. This overrides port.
Proxying websockets over a Unix socket requires Tornado >= 6.3.
mappath
Map request paths to proxied paths.
Either a dictionary of request paths to proxied paths,
or a callable that takes parameter ``path`` and returns the proxied path.
launcher_entry
A dictionary of various options for entries in classic notebook / jupyterlab launchers.
Keys recognized are:
enabled
Set to True (default) to make an entry in the launchers. Set to False to have no
explicit entry.
icon_path
Full path to an svg icon that could be used with a launcher. Currently only used by the
JupyterLab launcher
title
Title to be used for the launcher entry. Defaults to the name of the server if missing.
path_info
The trailing path that is appended to the user's server URL to access the proxied server.
By default it is the name of the server followed by a trailing slash.
category
The category for the launcher item. Currently only used by the JupyterLab launcher.
By default it is "Notebook".
new_browser_tab
Set to True (default) to make the proxied server interface opened as a new browser tab. Set to False
to have it open a new JupyterLab tab. This has no effect in classic notebook.
request_headers_override
A dictionary of additional HTTP headers for the proxy request. As with
the command traitlet, {{port}}, {{unix_socket}} and {{base_url}} will be substituted.
rewrite_response
An optional function to rewrite the response for the given service.
Input is a RewritableResponse object which is an argument that MUST be named
``response``. The function should modify one or more of the attributes
``.body``, ``.headers``, ``.code``, or ``.reason`` of the ``response``
argument. For example:
    def dog_to_cat(response):
        response.headers["I-Like"] = "tacos"
        response.body = response.body.replace(b'dog', b'cat')
    c.ServerProxy.servers['my_server']['rewrite_response'] = dog_to_cat
The ``rewrite_response`` function can also accept several optional
positional arguments. Arguments named ``host``, ``port``, and ``path`` will
receive values corresponding to the URL ``/proxy/<host>:<port><path>``. In
addition, the original Tornado ``HTTPRequest`` and ``HTTPResponse`` objects
are available as arguments named ``request`` and ``orig_response``. (These
objects should not be modified.)
A list or tuple of functions can also be specified for chaining multiple
rewrites. For example:
    def cats_only(response, path):
        if path.startswith("/cat-club"):
            response.code = 403
            response.body = b"dogs not allowed"
    c.ServerProxy.servers['my_server']['rewrite_response'] = [dog_to_cat, cats_only]
Note that if the order is reversed to ``[cats_only, dog_to_cat]``, then accessing
``/cat-club`` will produce a "403 Forbidden" response with body "cats not allowed"
instead of "dogs not allowed".
Defaults to the empty tuple ``tuple()``.
update_last_activity
Will cause the proxy to report activity back to jupyter server.
raw_socket_proxy
Proxy websocket requests as a raw TCP (or unix socket) stream.
In this mode, only websockets are handled, and messages are sent to the backend,
similar to running a websockify layer (https://github.com/novnc/websockify).
All other HTTP requests return 405 (and thus this will also bypass rewrite_response).
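
The ordering effect described for chained `rewrite_response` functions can be reproduced with a small stand-in. This is a minimal sketch, not jupyter-server-proxy's implementation: `FakeResponse` and `apply_rewrites` are hypothetical names that only mimic a response object with `body`, `headers`, `code` and `reason` attributes, and that pass `path` only to functions that declare it.

```python
import inspect
from dataclasses import dataclass, field


@dataclass
class FakeResponse:
    """Hypothetical stand-in for the RewritableResponse described above."""
    body: bytes = b""
    code: int = 200
    reason: str = "OK"
    headers: dict = field(default_factory=dict)


def dog_to_cat(response):
    response.headers["I-Like"] = "tacos"
    response.body = response.body.replace(b"dog", b"cat")


def cats_only(response, path):
    if path.startswith("/cat-club"):
        response.code = 403
        response.body = b"dogs not allowed"


def apply_rewrites(rewrites, response, path):
    """Call each rewrite in order, passing only the arguments it declares."""
    available = {"response": response, "path": path}
    for rewrite in rewrites:
        wanted = inspect.signature(rewrite).parameters
        rewrite(**{name: available[name] for name in wanted})
    return response


in_order = apply_rewrites([dog_to_cat, cats_only], FakeResponse(b"a dog"), "/cat-club")
reversed_order = apply_rewrites([cats_only, dog_to_cat], FakeResponse(b"a dog"), "/cat-club")
print(in_order.body)        # b'dogs not allowed'
print(reversed_order.body)  # b'cats not allowed'
```

In the reversed order, `cats_only` writes "dogs not allowed" first and `dog_to_cat` then rewrites "dog" to "cat", which is why the body changes.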

### `command`
One of:
- A list of strings that is the command used to start the
  process. The following template strings will be replaced:
  - `{port}` the port that the process should listen on. This will be 0 if it
    should use a Unix socket instead.
  - `{unix_socket}` the path at which the process should listen on a Unix
    socket. This will be an empty string if it should use a TCP port.
  - `{base_url}` the base URL of the notebook. For example, if the application
    needs to know its full path it can be constructed from
    `{base_url}/proxy/{port}`
- A callable that takes any {ref}`callable arguments <server-process:callable-arguments>`,
  and returns a list of strings that are used and treated the same as above.

If the command is not specified or is an empty list, the server process is
assumed to be started ahead of time and already available to be proxied to.
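
To make the substitution concrete, here is a minimal sketch of how the template strings in a command list could be filled in. `render_command` is a hypothetical helper written for illustration, not a function provided by jupyter-server-proxy, and the port and base URL values are assumptions.

```python
def render_command(command, port, unix_socket, base_url):
    """Replace the documented template strings in each command argument."""
    values = {
        "{port}": str(port),
        "{unix_socket}": unix_socket,
        "{base_url}": base_url,
    }
    rendered = []
    for arg in command:
        for placeholder, value in values.items():
            arg = arg.replace(placeholder, value)
        rendered.append(arg)
    return rendered


# TCP case: unix_socket is an empty string, as documented above.
tcp_command = render_command(
    ["python3", "-m", "http.server", "{port}"],
    port=8080,
    unix_socket="",
    base_url="/user/alice/",
)
print(tcp_command)  # ['python3', '-m', 'http.server', '8080']
```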
### `timeout`
Timeout in seconds for the process to become ready, default `5`.
A process is considered 'ready' when it can return a valid HTTP response on the
port it is supposed to start at.
### `environment`
One of:
- A dictionary of strings that are passed in as the environment to
the started process, in addition to the environment of the notebook
process itself. The strings `{port}`, `{unix_socket}` and
`{base_url}` will be replaced as for **command**.
- A callable that takes any {ref}`callable arguments <server-process:callable-arguments>`,
and returns a dictionary of strings that are used and treated the same as above.
### `absolute_url`
_True_ if the proxied application should see the full URL
sent by the user. _False_ if the proxied application should
see the URL after the parts specific to jupyter-server-proxy have been stripped.
For example, with the following config:
```python
c.ServerProxy.servers = {
    "test-server": {
        "command": ["python3", "-m", "http.server", "{port}"],
        "absolute_url": False
    }
}
```
When a user requests `/test-server/some-url`, the proxied server will see it
as a request for `/some-url` - the `/test-server` part is stripped out.
If `absolute_url` is set to `True` instead, the proxied server will see it
as a request for `/test-server/some-url` instead - without any stripping.
This is very useful with applications that require a `base_url` to be set.
Defaults to _False_.
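
The stripping behaviour can be summarised in a few lines. This is only a sketch of the rule described above: `path_seen_by_backend` is a hypothetical helper, and it ignores the Jupyter base URL for simplicity.

```python
def path_seen_by_backend(request_path, server_name, absolute_url=False):
    """Return the path a proxied server would see, per the description above."""
    if absolute_url:
        # The full path is forwarded without any stripping.
        return request_path
    prefix = "/" + server_name
    if request_path.startswith(prefix):
        stripped = request_path[len(prefix):]
        return stripped or "/"
    return request_path


print(path_seen_by_backend("/test-server/some-url", "test-server"))        # /some-url
print(path_seen_by_backend("/test-server/some-url", "test-server", True))  # /test-server/some-url
```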
### `port`
Set the port that the service will listen on. The default is to
automatically select an unused port.
(server-process:unix-socket)=
### `unix_socket`
This option uses a Unix socket on a filesystem path, instead of a TCP
port. It can be passed as a string specifying the socket path, or _True_ for
Jupyter Server Proxy to create a temporary directory to hold the socket,
ensuring that only the user running Jupyter can connect to it.
If this is used, the `{unix_socket}` argument in the command template
(see {ref}`server-process:cmd`) will be a filesystem path. The server should
create a Unix socket bound to this path and listen for HTTP requests on it.
The `port` configuration key will be ignored.
```{note}
Proxying websockets over a Unix socket requires Tornado >= 6.3.
```
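
A server entry using a Unix socket might look like the sketch below. The `my-server` executable, its `--socket` flag and the explicit socket path are hypothetical and stand in for whatever the real application expects.

```python
server_config = {
    # "my-server" and its "--socket" flag are hypothetical.
    "command": ["my-server", "--socket", "{unix_socket}"],
    # True asks Jupyter Server Proxy to create a temporary directory for the socket.
    "unix_socket": True,
}

# Alternative: give an explicit socket path (the path here is a made-up example).
server_config_fixed_path = dict(server_config, unix_socket="/tmp/my-server.sock")
```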
### `mappath`
Map request paths to proxied paths.
Either a dictionary of request paths to proxied paths,
or a callable that takes parameter `path` and returns the proxied path.
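
Both forms of `mappath` can be sketched as follows. The `resolve` helper is hypothetical and only illustrates one way the mapping could be applied. In particular, passing unmapped paths through unchanged is an assumption of this sketch.

```python
# Dictionary form: request path -> proxied path.
mappath_dict = {"/": "/index.html"}


# Callable form: receives `path` and returns the proxied path.
def mappath_callable(path):
    if path == "/":
        return "/index.html"
    return path


def resolve(mappath, path):
    """Hypothetical helper showing one way either form could be applied."""
    if callable(mappath):
        return mappath(path)
    # Assumption: paths missing from the dictionary pass through unchanged.
    return mappath.get(path, path)


print(resolve(mappath_dict, "/"))           # /index.html
print(resolve(mappath_callable, "/other"))  # /other
```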
### `launcher_entry`
A dictionary of options controlling whether and how an entry is added to the
classic Jupyter Notebook 'New' dropdown or the JupyterLab launcher. It can contain
the following keys:
1. **enabled**
Set to True (default) to make an entry in the launchers. Set to False to have no
explicit entry.
2. **icon_path**
Full path to an svg icon that could be used with a launcher. Currently only used by the
JupyterLab launcher, when category is "Notebook" (default) or "Console".
3. **title**
Title to be used for the launcher entry. Defaults to the name of the server if missing.
4. **path_info**
The trailing path that is appended to the user's server URL to access the proxied server.
By default it is the name of the server followed by a trailing slash.
5. **category**
The category for the launcher item. Currently only used by the JupyterLab launcher.
By default it is "Notebook".
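
Put together, a `launcher_entry` using all five keys could look like this sketch. The server name, title and icon path are example values chosen for illustration.

```python
servers = {
    "test-server": {
        "command": ["python3", "-m", "http.server", "{port}"],
        "launcher_entry": {
            "enabled": True,
            # Made-up path; it should point to a real SVG file.
            "icon_path": "/opt/icons/test-server.svg",
            "title": "Test Server",
            # Same as the documented default: server name plus a trailing slash.
            "path_info": "test-server/",
            "category": "Notebook",
        },
    }
}
```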
### `new_browser_tab`
_JupyterLab only_ - _True_ (default) if the proxied server URL should be opened in a new browser tab.
_False_ if the proxied server URL should be opened in a new JupyterLab tab.
If _False_, the proxied server needs to allow its pages to be rendered in an iframe. This
is generally done by configuring the web server `X-Frame-Options` to `SAMEORIGIN`.
For more information, refer to
[MDN Web docs on X-Frame-Options](https://developer.mozilla.org/docs/Web/HTTP/Headers/X-Frame-Options).
Note that applications might use a different terminology to refer to frame options.
For example, RStudio uses the term _frame origin_ and requires the flag
`--www-frame-origin=same` to allow rendering of its pages in an iframe.
### `request_headers_override`
One of:
- A dictionary of strings that are passed in as HTTP headers to the proxy
request. The strings `{port}`, `{unix_socket}` and `{base_url}` will be
replaced as for **command**.
- A callable that takes any {ref}`callable arguments <server-process:callable-arguments>`,
and returns a dictionary of strings that are used and treated the same as above.
### `update_last_activity`
Whether to report activity from the proxy to Jupyter Server. If _True_, Jupyter Server
will be notified of new activity. This is primarily used by JupyterHub for idle detection and culling.
Useful if you want to have a separate way of determining activity through a
proxied application.
Defaults to _True_.
### `raw_socket_proxy`
_True_ to proxy only websocket connections into raw stream connections.
_False_ (default) if the proxied server speaks full HTTP.
If _True_, the proxied server is treated as a raw TCP (or Unix socket) server that
does not use HTTP.
In this mode, only websockets are handled, and messages are sent to the backend
server as raw stream data. This is similar to running a
[websockify](https://github.com/novnc/websockify) wrapper.
All other HTTP requests return 405.
(server-process:callable-arguments)=
### Callable arguments
Certain config options accept callables, as documented above. The callable should return
the same type of object that the option normally expects.
When you use a callable this way, it can ask for any arguments it needs
by simply declaring them - only arguments the callable asks for will be passed to it.
For example, with the following config:
```python
def _cmd_callback():
    return ["some-command"]

server_config = {
    "command": _cmd_callback
}
```
No arguments will be passed to `_cmd_callback`, since it doesn't ask for any. However,
with:
```python
def _cmd_callback(port):
    return ["some-command", "--port=" + str(port)]

server_config = {
    "command": _cmd_callback
}
```
The `port` argument will be passed to the callable. This is a simple form of dependency
injection that helps us add more parameters in the future without breaking backwards
compatibility.
#### Available arguments
Unless otherwise documented for specific options, the arguments available for
callables are:
1. **port**
The TCP port on which the server should listen, or is listening.
This is 0 if a Unix socket is used instead of TCP.
2. **unix_socket**
The path of a Unix socket on which the server should listen, or is listening.
This is an empty string if a TCP socket is used.
3. **base_url**
The base URL of the notebook.
If any of the returned strings, lists or dictionaries contain strings
of form `{<argument-name>}`, they will be replaced with the value
of the argument. For example, if your function is:
```python
def _openrefine_cmd():
    return ["openrefine", "-p", "{port}"]
```
The `{port}` will be replaced with the appropriate port before
the command is started.

This PR builds on top of #507 to autogenerate the documentation using a custom Sphinx Directive.

@jwindgassen

Instead of writing the custom extension, we could also use autodoc-traits, which does almost the same. The output looks almost identical.

@manics
Member Author

manics commented Nov 18, 2024

I tried autodoc-traits originally, but it's not suitable for generating https://jupyter-server-proxy.readthedocs.io/en/latest/server-process.html because, although the underlying class uses Traitlets, in this particular context we're not configuring it using Traitlets. Instead we're using a dict which is converted to the Traitlets object.

@jwindgassen

Is there any particular reason we do not want to use Union([Instance(ServerProcess), Dict()]) as the value for the dictionary? We could just add a small note to the docs, that it can be a dictionary as well with the same keys.
