feat: add recovery mode setup #543
Can you also add a unit test for this functionality? It can be pretty simple: set the env variable, call the script, and check the current mamba env.
Added a P0 sanity test.
template/v2/Dockerfile
# Setup the Recovery Mode home directory and micromamba environment
mkdir -p $RECOVERY_MODE_HOME && \
chown $MAMBA_USER:$MAMBA_USER $RECOVERY_MODE_HOME && \
micromamba create -n recovery-mode && \
Have you verified the image size change with these lines? I'm concerned that setting up a separate Python venv will significantly increase the image size.
I think the image size will be reduced further in the latest commit, since I moved the install step before the micromamba clean statement so that some files are cleaned up afterwards.
Got it, can you verify the size increase once more with your latest Dockerfile?
Image size impact is < 5% compared to the vanilla 2.3.0 version, so it should be safe to merge.
I ran the same script and built an SMD 2.3.0 image with no changes; this is the size:
(7.66 - 7.33) / 7.33 ≈ 4.5%
See also the image size after compression on ECR.
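For anyone repeating the comparison, the relative increase can be recomputed with a short shell snippet. This is only a sketch: the two sizes are the figures quoted in this thread, and the variable names are made up for illustration.

```shell
# Sizes quoted in this thread: unmodified 2.3.0 build vs. build with recovery mode
base_size=7.33
new_size=7.66

# Relative increase in percent, rounded to one decimal place
increase=$(LC_ALL=C awk -v b="$base_size" -v n="$new_size" \
  'BEGIN { printf "%.1f", (n - b) / b * 100 }')

echo "Image size increase: ${increase}%"   # prints "Image size increase: 4.5%"
```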
Summary
In Dockerfile:
- Create the directory /tmp/recovery-mode, which will be used as the home directory in recovery mode
- Create the micromamba environment recovery-mode, which contains only the 2 basic JupyterLab extensions required for recovery mode
In Entrypoint scripts:
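As a rough illustration of the entrypoint side, the selection logic could look something like the sketch below. It is not taken from the PR: the flag name RECOVERY_MODE_FLAG and the default environment name base are assumptions made for the example; only RECOVERY_MODE_HOME, /tmp/recovery-mode and the recovery-mode environment name come from the changes described here.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: RECOVERY_MODE_FLAG and the "base" default environment
# are assumed names for illustration, not the PR's actual identifiers.

# Print the name of the micromamba environment the entrypoint should activate.
select_mamba_env() {
  if [ "${RECOVERY_MODE_FLAG:-false}" = "true" ]; then
    echo "recovery-mode"
  else
    echo "base"
  fi
}

# Print the home directory to use for the session.
select_home_dir() {
  if [ "${RECOVERY_MODE_FLAG:-false}" = "true" ]; then
    # Fall back to the path named in the summary if the variable is unset
    echo "${RECOVERY_MODE_HOME:-/tmp/recovery-mode}"
  else
    echo "${HOME:-}"
  fi
}
```

An entrypoint could then run something like `micromamba activate "$(select_mamba_env)"` before starting JupyterLab, which is also the behavior a simple unit test can check by setting the flag and comparing the selected environment.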
Motivation
From the customer’s perspective, Recovery Mode allows them to quickly recover from issues on their own, without waiting more than a day (sometimes even longer) for AWS support. This minimal environment provides immediate access to their data and lets them continue AI/ML tasks with minimal downtime, improving both productivity and overall experience.
Design Doc
https://quip-amazon.com/5M9NAedp2jHN/LLD-StudioV2-Recovery-Mode
Test
Added a new unit test for the JupyterLab entrypoint.