Time-to-live support not working anymore? #315

Closed
asteppke opened this issue May 17, 2024 · 6 comments · Fixed by #322
Labels
bug Something isn't working

Comments

@asteppke
Contributor

For notebooks that create a lot of output, the .jupyter_ystore.db file can grow rather large and easily overflows our users' quotas.

In this example here

import time
for i in range(100_000_000):
    print(f"{i}, ", end="")
    time.sleep(0.05)

the size of the notebook file after a runtime of a few minutes grows to 1.5 MB. On the other hand the corresponding .jupyter_ystore.db grows to 580 MB.

I have read parts of the discussions around the database and have a rough understanding of the complications that make solving this quite challenging. For now, the time-to-live option seems like a suitable workaround to limit the growth to some extent. At the moment, though, it does not seem to work (anymore?).

When starting a new session with

jupyter lab --SQLiteYStore.document_ttl=600

I only receive the following error messages:

[I 2024-05-17 16:08:01.659 ServerApp] Creating new notebook in
[I 2024-05-17 16:08:01.722 ServerApp] Request for Y document 'Untitled10.ipynb' with room ID: 780564de-e0da-492a-9d14-af545441c896
[I 2024-05-17 16:08:01.913 YDocExtension] Creating FileLoader for: Untitled10.ipynb
[I 2024-05-17 16:08:01.914 YDocExtension] Watching file: Untitled10.ipynb
[I 2024-05-17 16:08:01.915 ServerApp] Initializing room json:notebook:780564de-e0da-492a-9d14-af545441c896
[I 2024-05-17 16:08:01.935 ServerApp] Content in room json:notebook:780564de-e0da-492a-9d14-af545441c896 loaded from file Untitled10.ipynb
[E 2024-05-17 16:08:01.937 ServerApp] Error initializing: Untitled10.ipynb
    TypeError("'>' not supported between instances of 'int' and 'DeferredConfigString'")
    Traceback (most recent call last):
      File "C:\tools\miniconda3\envs\data\Lib\site-packages\jupyter_collaboration\handlers.py", line 233, in open
        await self.room.initialize()
      File "C:\tools\miniconda3\envs\data\Lib\site-packages\jupyter_collaboration\rooms.py", line 151, in initialize
        await self.ystore.encode_state_as_update(self.ydoc)
      File "C:\tools\miniconda3\envs\data\Lib\site-packages\pycrdt_websocket\ystore.py", line 145, in encode_state_as_update
        await self.write(update)
      File "C:\tools\miniconda3\envs\data\Lib\site-packages\pycrdt_websocket\ystore.py", line 473, in write
        if self.document_ttl is not None and diff > self.document_ttl:
                                             ^^^^^^^^^^^^^^^^^^^^^^^^
    TypeError: '>' not supported between instances of 'int' and 'DeferredConfigString'
[I 2024-05-17 16:08:01.940 ServerApp] Deleting Y document from memory: json:notebook:780564de-e0da-492a-9d14-af545441c896
[I 2024-05-17 16:08:01.940 ServerApp] Room json:notebook:780564de-e0da-492a-9d14-af545441c896 deleted
[I 2024-05-17 16:08:01.941 ServerApp] Deleting file Untitled10.ipynb
[E 2024-05-17 16:08:01.943 ServerApp] Exception in callback functools.partial(<function WebSocketProtocol._run_callback.<locals>.<lambda> at 0x0000023308FE4A40>, <Task finished name='Task-734' coro=<YDocWebSocketHandler.on_message() done, defined at C:\tools\miniconda3\envs\data\Lib\site-packages\jupyter_collaboration\handlers.py:277> exception=AttributeError("'YDocWebSocketHandler' object has no attribute 'room'")>)
    Traceback (most recent call last):
      File "C:\tools\miniconda3\envs\data\Lib\site-packages\tornado\ioloop.py", line 750, in _run_callback
        ret = callback()
              ^^^^^^^^^^
      File "C:\tools\miniconda3\envs\data\Lib\site-packages\tornado\websocket.py", line 640, in <lambda>
        self.stream.io_loop.add_future(result, lambda f: f.result())
                                                         ^^^^^^^^^^
      File "C:\tools\miniconda3\envs\data\Lib\site-packages\jupyter_collaboration\handlers.py", line 286, in on_message
        changes = self.room.awareness.get_changes(message[1:])
                  ^^^^^^^^^
    AttributeError: 'YDocWebSocketHandler' object has no attribute 'room'
[E 2024-05-17 16:08:01.945 ServerApp] Uncaught exception GET /api/collaboration/room/json:notebook:780564de-e0da-492a-9d14-af545441c896?sessionId=19a409eb-52ee-46c9-9d32-d39d007e0a9a (::1)
    HTTPServerRequest(protocol='http', host='localhost:8888', method='GET', uri='/api/collaboration/room/json:notebook:780564de-e0da-492a-9d14-af545441c896?sessionId=19a409eb-52ee-46c9-9d32-d39d007e0a9a', version='HTTP/1.1', remote_ip='::1')
    Traceback (most recent call last):
      File "C:\tools\miniconda3\envs\data\Lib\site-packages\tornado\web.py", line 1790, in _execute
        result = await result
                 ^^^^^^^^^^^^
      File "C:\tools\miniconda3\envs\data\Lib\site-packages\jupyter_collaboration\handlers.py", line 209, in get
        return await super().get(*args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "C:\tools\miniconda3\envs\data\Lib\site-packages\tornado\websocket.py", line 273, in get
        await self.ws_connection.accept_connection(self)
      File "C:\tools\miniconda3\envs\data\Lib\site-packages\tornado\websocket.py", line 863, in accept_connection
        await self._accept_connection(handler)
      File "C:\tools\miniconda3\envs\data\Lib\site-packages\tornado\websocket.py", line 946, in _accept_connection
        await self._receive_frame_loop()
      File "C:\tools\miniconda3\envs\data\Lib\site-packages\tornado\websocket.py", line 1102, in _receive_frame_loop
        await self._receive_frame()
      File "C:\tools\miniconda3\envs\data\Lib\site-packages\tornado\websocket.py", line 1193, in _receive_frame
        await handled_future
    AttributeError: 'YDocWebSocketHandler' object has no attribute 'room'
Traceback (most recent call last):
  File "C:\tools\miniconda3\envs\data\Lib\collections\__init__.py", line 449, in _make
    result = tuple_new(cls, iterable)
             ^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: pycrdt::map::Map is unsendable, but is being dropped on another thread
Traceback (most recent call last):
  File "C:\tools\miniconda3\envs\data\Lib\collections\__init__.py", line 449, in _make
    result = tuple_new(cls, iterable)
             ^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: pycrdt::map::Map is unsendable, but is being dropped on another thread
Traceback (most recent call last):
  File "C:\tools\miniconda3\envs\data\Lib\collections\__init__.py", line 449, in _make
    result = tuple_new(cls, iterable)
             ^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: pycrdt::doc::Doc is unsendable, but is being dropped on another thread
Traceback (most recent call last):
  File "C:\tools\miniconda3\envs\data\Lib\collections\__init__.py", line 449, in _make
    result = tuple_new(cls, iterable)
             ^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: pycrdt::array::Array is unsendable, but is being dropped on another thread

Is the ttl option still supported, or is there another or better way to limit the size of the database?

@asteppke asteppke added the bug Something isn't working label May 17, 2024
@asteppke
Contributor Author

After looking around a bit it turns out that this is connected to the special treatment of traitlets within jupyter-collaboration. Traitlets processes arguments in two steps and the second step is not executed here, so instead of an Integer traitlet we only obtain a DeferredConfigString.

As a workaround, one can simply cast the value with

self.document_ttl = int(self.document_ttl) if self.document_ttl is not None else None

without negative consequences. I do not know what the plan is regarding the traitlets integration, but for now that restores the document time-to-live functionality.
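The failure and the cast can be reproduced without Jupyter at all. The following is a minimal sketch, not the code from the PR: `DeferredString` is a hypothetical stand-in for traitlets' `DeferredConfigString` (which, as far as I understand, is also a `str` subclass), and `normalize_ttl` is an illustrative helper name.

```python
class DeferredString(str):
    """Hypothetical stand-in for traitlets' DeferredConfigString."""


def normalize_ttl(value):
    """Return the TTL as an int, or None when no TTL is configured."""
    return int(value) if value is not None else None


# What the store ends up holding when the command-line value is never resolved.
raw_ttl = DeferredString("600")
diff = 900  # assumed: seconds elapsed since the last stored update

error = None
try:
    diff > raw_ttl  # same kind of comparison as in the traceback
except TypeError as exc:
    error = str(exc)

ttl = normalize_ttl(raw_ttl)
expired = diff > ttl  # works once the value is a real int

print(error)
print(ttl, expired)
```

The cast only papers over the unresolved value at the point of use; it does not change how the configuration reaches the store.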

@davidbrochart
Collaborator

Thanks @asteppke for following up on this. Would you like to send a PR?

@asteppke
Contributor Author

@davidbrochart: I put together a small PR that addresses this issue.

@krassowski
Member

this is connected to the special treatment of traitlets within jupyter-collaboration

Do you mean the lines below?

# Set configurable parameters to YStore class
for k, v in self.config.get(self.ystore_class.__name__, {}).items():
    setattr(self.ystore_class, k, v)

At first glance, this looks like an incorrect usage of traitlets. Passing a partial with config= instead could help.
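A rough sketch of what the partial-with-config idea could look like, assuming traitlets is installed. `DemoStore` and its `path` argument are hypothetical and are not the real `SQLiteYStore`; this is also not necessarily what #322 does. The point is that handing the config to the constructor lets traitlets resolve the deferred command-line string through the `Int` trait.

```python
from functools import partial

from traitlets import Int
from traitlets.config import Config, Configurable
from traitlets.config.loader import DeferredConfigString


class DemoStore(Configurable):
    """Hypothetical store with a configurable time-to-live."""

    document_ttl = Int(None, allow_none=True, help="TTL in seconds").tag(config=True)

    def __init__(self, path, **kwargs):
        super().__init__(**kwargs)
        self.path = path


# Simulate the value as it arrives from the command line: still a string.
config = Config()
config.DemoStore.document_ttl = DeferredConfigString("600")

# Bind the config once; callers keep constructing stores with just a path.
store_factory = partial(DemoStore, config=config)
store = store_factory("notebook.ipynb")

print(type(store.document_ttl).__name__, store.document_ttl)
```

With this shape, the class itself is never mutated, and each instance receives a validated integer rather than the raw string.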

@asteppke
Contributor Author

@krassowski Yes, these are the lines I mean. As far as I can see, traitlets is indeed not meant to be used like that. I did not want to change more than absolutely necessary in this pull request, though.

@krassowski
Member

I opened #322 with a clean fix.
