
Commit

fix new versions for installation
ChristophSchranz committed Mar 6, 2023
1 parent efded96 commit 5539500
Showing 5 changed files with 103 additions and 57 deletions.
42 changes: 24 additions & 18 deletions .build/Dockerfile
@@ -469,6 +469,11 @@ RUN git clone https://github.com/Syllo/nvtop.git /run/nvtop && \
(make install 2> /dev/null || echo "make install was not successful") && \
cd /tmp && rm -rf /tmp/nvtop

# reinstall nvcc with cuda-nvcc to install ptxas
RUN /opt/conda/bin/conda install -c nvidia cuda-nvcc -y
RUN ln -s /opt/conda/bin/ptxas /usr/bin/ptxas

# fix permissions of conda
RUN fix-permissions /home/$NB_USER

USER $NB_UID
@@ -481,35 +486,36 @@ LABEL authors="Christoph Schranz <[email protected]>, Mathem

USER root

RUN pip install --no-cache-dir ipyleaflet "plotly>=4.14.3" "ipywidgets>=7.5"
# fix permissions of conda
RUN fix-permissions $CONDA_DIR

# Install important packages and Graphviz
# Install useful packages and Graphviz
RUN set -ex \
&& buildDeps=' \
graphviz==0.19.1 \
' \
&& apt-get update \
&& apt-get -y install htop apt-utils iputils-ping graphviz libgraphviz-dev openssh-client \
&& apt-get -y install htop apt-utils iputils-ping graphviz libgraphviz-dev openssh-client pytest \
&& pip install --no-cache-dir $buildDeps

# Install various extensions
RUN fix-permissions $CONDA_DIR
# install extension manager
RUN pip install --no-cache-dir jupyter_contrib_nbextensions \
jupyter_nbextensions_configurator

# RUN jupyter labextension install @jupyterlab/github
# RUN pip install jupyterlab-git
# install git extension
RUN pip install jupyterlab-git

# install plotly extension
RUN pip install plotly

# install drawio and graphical extensions
RUN pip install jupyterlab-drawio
RUN jupyter nbextension enable --py --sys-prefix ipyleaflet
RUN jupyter labextension install jupyterlab-plotly
RUN jupyter labextension install @jupyter-widgets/jupyterlab-manager plotlywidget
# RUN pip install --no-cache-dir jupyter-tabnine --user && \
# jupyter nbextension install --py jupyter_tabnine --user && \
# jupyter nbextension enable --py jupyter_tabnine --user && \
# jupyter serverextension enable --py jupyter_tabnine --user
RUN pip install --no-cache-dir jupyter_contrib_nbextensions \
jupyter_nbextensions_configurator rise
# jupyter nbextension enable codefolding/main
RUN jupyter labextension install @ijmbarr/jupyterlab_spellchecker
RUN pip install --no-cache-dir ipyleaflet ipywidgets rise

# install spell checker
RUN pip install jupyterlab-spellchecker

# fix permissions of conda
RUN fix-permissions /home/$NB_USER

# Switch back to jovyan to avoid accidental container runs as root
Expand Down
3 changes: 2 additions & 1 deletion .gitignore
@@ -115,8 +115,9 @@ venv.bak/
# Added config to hide hash of changed password
src/jupyter_notebook_config.json
.idea
/scripts

/Deployment-notes.md
/build_push_full.sh
/build_push_python-only.sh
/build_push_slim.sh
extra/Getting_Started/tensorboard/runs
101 changes: 65 additions & 36 deletions README.md
@@ -7,13 +7,13 @@ This project composes a GPU-capable environment, based on NVIDIA's CUDA Docker i
and a toolstack on top of it to enable GPU calculations in the Jupyter notebooks.
The images of this repository are available on [Dockerhub](https://hub.docker.com/r/cschranz/gpu-jupyter).
A big thank you to [docker-stacks](https://github.com/jupyter/docker-stacks)
for creating and maintaining a robust Python, R, and Julia toolstack for Data Science applications.
for creating and maintaining a robust Python, R, and Julia toolstack for Data Science.


## Contents

1. [Quickstart](#quickstart)
2. [Build Your image](#build-your-own-image)
2. [Build Your image](#build-your-image)
3. [Tracing](#tracing)
4. [Configuration](#configuration)
5. [Deployment](#deployment-in-the-docker-swarm)
@@ -57,10 +57,10 @@ for creating and maintaining a robust Python, R, and Julia toolstack for Data Sc
| 1 N/A N/A 1918 G 6MiB |
+-----------------------------------------------------------------------------+
```
If you don't get an output similar, follow the installation steps in this
If you don't get a similar output, follow the installation steps in this
[Medium article](https://medium.com/@christoph.schranz/set-up-your-own-gpu-based-jupyterlab-e0d45fcacf43).
It is important to keep your installed CUDA version in mind when you pull images.
**You can't run images based on `nvidia/cuda:11.2` if you have only CUDA version 10.1 installed.**
**Note that you can't run images based on `nvidia/cuda:11.2` if you have only CUDA version 10.1 installed.**

4. Pull and run the GPU-Jupyter image. This may take some time as the whole environment for data science will be downloaded:
```bash
@@ -69,7 +69,9 @@ for creating and maintaining a robust Python, R, and Julia toolstack for Data Sc
docker run --gpus all -d -it -p 8848:8888 -v $(pwd)/data:/home/jovyan/work -e GRANT_SUDO=yes -e JUPYTER_ENABLE_LAB=yes --user root cschranz/gpu-jupyter:v1.5_cuda-11.6_ubuntu-20.04_python-only
```
This starts an instance of *GPU-Jupyter* with the tag `v1.5_cuda-11.6_ubuntu-20.04_python-only` at [http://localhost:8848](http://localhost:8848) (port `8848`).
The default password is `gpu-jupyter` (previously `asdf`) which should be changed as described [below](#set-password). For new installations, you have to access the token, which you can obtain using `docker exec -it [container-name/ID] jupyter server list`.
If you pulled the image, you have to specify the Jupyter token, which you can obtain by running `docker exec -it [container-name/ID] jupyter server list`.
After that, you can change the password or save the token in your browser.
The default password for a self-built image is `gpu-jupyter`, which should be changed as described [below](#change-the-password).
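As a minimal sketch (assuming the container was started with the name `gpu-jupyter_1`, as in the build commands below), the token can be read out like this:
```bash
# List the running Jupyter servers inside the container; the login token is part of the printed URL
docker exec -it gpu-jupyter_1 jupyter server list
# Open http://localhost:8848 and paste the token (or set a new password on the token page)
```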
Additionally, data within the host's `data` directory is shared with the container.
Note that the following images of GPU-Jupyter are available on [Dockerhub](https://hub.docker.com/r/cschranz/gpu-jupyter):
Expand Down Expand Up @@ -98,29 +100,26 @@ Additionally, data within the host's `data` directory is shared with the contain
In case you are using another version or the GPU libraries don't work on your hardware, please try to build the image on your own as described below.
Note that the images built for Ubuntu 20.04 LTS also work on Ubuntu 22.04 LTS; building on Ubuntu 22.04 LTS itself is currently not supported.

To ensure that you can access your GPU within the JupyterLab instance, open a new terminal tab in the Jupyterlab UI ([http://localhost:8848](http://localhost:8848)) and run the command `nvidia-smi`. This will display information about your GPU, similar as in step 2.
To help you get started with using the GPU, the repository includes some sample code located in extra/Getting_Started. Take a look at this code to see how you can use the GPU in your projects.

Within the Jupyterlab instance, you can check if you can access your GPU by opening a new terminal window and running
`nvidia-smi`. In terminal windows, you can also install new packages for your projects.
Some example code can be found in the repository under `extra/Getting_Started`.
Within the Jupyterlab UI, ensure you can access your GPU by opening a new Terminal window and running `nvidia-smi`.
Here, you can also install additional packages on top of the built image.
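For instance, a minimal sketch of installing an extra package from a JupyterLab Terminal (the package name is only a placeholder):
```bash
# Run inside a Terminal tab of the JupyterLab UI (or via `docker exec`)
pip install --no-cache-dir somepackage        # placeholder package name
# or, using conda:
# conda install -c conda-forge somepackage
```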
To help you get started with using the GPU, the repository includes some sample code located in `extra/Getting_Started`.
If you're new to JupyterLab or want to learn more about its features,
we recommend checking out this [tutorial](https://www.youtube.com/watch?v=7wfPqAyYADY).
## Build Your Image
Building a custom Docker image is the recommended option if you have a different GPU architecture or if you want to customize the pre-installed libraries. The Dockerfiles in `src/` can be modified to achieve this. To use a custom base image, modify `src/Dockerfile.header`. To install specific GPU-related libraries, modify `src/Dockerfile.gpulibs`, and to add specific libraries, append them to `src/Dockerfile.usefulpackages`.
Building a custom Docker image is the recommended option if you have a different GPU architecture or if you want to customize the pre-installed packages. The Dockerfiles in `src/` can be modified to achieve this. To use a custom base image, modify `src/Dockerfile.header`. To install specific GPU-related libraries, modify `src/Dockerfile.gpulibs`, and to add specific libraries, append them to `src/Dockerfile.usefulpackages`.
After making the necessary modifications, regenerate the `Dockerfile` in `/.build`. Once you have confirmed that your GPU is accessible within Docker containers by running `docker run --gpus all nvidia/cuda:11.2.2-cudnn8-runtime-ubuntu20.04 nvidia-smi` and seeing the GPU statistics, you can generate, build, and run the Docker image.
The following commands will start *GPU-Jupyter* on [localhost:8848](http://localhost:8848) with the default password `gpu-jupyter` (previously `asdf`).
The following commands will start *GPU-Jupyter* on [localhost:8848](http://localhost:8848) with the default password `gpu-jupyter`.
```bash
git clone https://github.com/iot-salzburg/gpu-jupyter.git
cd gpu-jupyter
git branch # Check for supported branches with different versions
git checkout v1.5_cuda-11.6_ubuntu-20.04 # select the desired version
# generate the Dockerfile with python and without Julia and R (see options: ./generate-Dockerfile.sh --help)
git branch # Check for existing branches
git checkout v1.5_cuda-11.6_ubuntu-20.04 # select or create a new version
# generate the Dockerfile with python and without Julia and R (see options: --help)
./generate-Dockerfile.sh --python-only
docker build -t gpu-jupyter .build/ # will take a while
docker run --gpus all -d -it -p 8848:8888 -v $(pwd)/data:/home/jovyan/work -e GRANT_SUDO=yes -e JUPYTER_ENABLE_LAB=yes -e NB_UID="$(id -u)" -e NB_GID="$(id -g)" --user root --restart always --name gpu-jupyter_1 gpu-jupyter
@@ -137,6 +136,7 @@ To start *GPU-Jupyter* using `docker-compose.yml`, run the following command:
```bash
docker-compose up --build -d # build and run in detached mode
docker-compose ps # check if it was successful
docker-compose logs -f # view the logs
docker-compose down # stop the container
```
@@ -152,16 +152,14 @@ With these commands we can investigate the container:
```bash
docker ps # use the flag '-a' to view all
docker stats
docker logs [service-name] # or
docker-compose logs -f
docker exec -it [container-name/ID] bash
docker logs [service-name | UID] -f # view the logs
docker exec -it [service-name | UID] bash # open bash in the container
```
To stop the local deployment, run:
```bash
docker rm -f [service-name | UID] # or
docker-compose down
```
@@ -196,7 +194,7 @@ on top of the `scipy-notebook`, but the packages within `src/Dockerfile.usefulpa
* `--no-useful-packages`: On top of the `docker-stack` `datascience-notebook` (Julia and R),
the essential `gpulibs` are installed, but not the packages within `src/Dockerfile.usefulpackages`.
Note that the parameters `--slim`, `--python-only`, and `--no-useful-packages` are mutually exclusive:
Note that only one of the parameters `--slim`, `--python-only`, and `--no-useful-packages` can be used at the same time:
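For example (a sketch; each call generates a different Dockerfile variant):
```bash
./generate-Dockerfile.sh --python-only           # valid
./generate-Dockerfile.sh --slim                  # valid
./generate-Dockerfile.sh --slim --python-only    # invalid: the options are mutually exclusive
```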
### Custom Installations
@@ -217,12 +215,12 @@ There are two ways to set a password for GPU-Jupyter:
```bash
bash generate-Dockerfile.sh --password [your_password]
```
This will update the salted hashed token in the `src/jupyter_notebook_config.json` file.
This will update the salted hashed token in the `src/jupyter_notebook_config.json` file. Note that the specified password may be visible in your account's bash history.
2. Manually update the token in the `src/jupyter_notebook_config.json` file.
Therefore, hash your password in the form (password)(salt) using a sha1 hash generator, e.g.,
the sha1 generator of [sha1-online.com](http://www.sha1-online.com/). The input with the
default password `gpu-jupyter` (previously `asdf`) is concatenated by an arbitrary salt
default password `gpu-jupyter` is concatenated with an arbitrary salt
`3b4b6378355` to `gpu-jupyter3b4b6378355` and is hashed to `642693b20f0a33bcad27b94293d0ed7db3408322`.
**Note: Never give away your own unhashed password!**
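Instead of an online generator, the hash can also be computed locally; a sketch using the example values from above (assuming `sha1sum` is available):
```bash
# Hash (password)(salt); -n suppresses the trailing newline so only the concatenation is hashed
echo -n "gpu-jupyter3b4b6378355" | sha1sum
# 642693b20f0a33bcad27b94293d0ed7db3408322  (per the example above)
```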
@@ -238,11 +236,31 @@ default password `gpu-jupyter` (previously `asdf`) is concatenated by an arbitra
```
### Adaptions for using Tensorboard
Both TensorFlow and PyTorch support [tensorboard](https://www.tensorflow.org/tensorboard/get_started).
This package is already installed with the GPU packages and can be used with these settings:
1. Forward the port in the docker command using `-p 6006:6006` (only needed for usage outside of the JupyterLab UI); see the combined sketch after this list.
2. Start tensorboard with port binding within the container or the JupyterLab UI. Make sure the parameter `--bind_all` is set.
```bash
docker exec -it [container-name/ID] bash
root@749eb1a06d60:~# tensorboard --logdir mylogdir --bind_all
```
```jupyter
%tensorboard --logdir logs/[logdir] --bind_all
```
3. Write the states and results into the tensorboard log-dir, as described in the tutorials for [TensorFlow](https://www.tensorflow.org/tensorboard/get_started) and [PyTorch](https://pytorch.org/tutorials/recipes/recipes/tensorboard_with_pytorch.html) or in the Getting Started section `data/Getting_Started`.
If the port is exposed, tensorboard can be accessed in the browser on [localhost:6006](http://localhost:6006).
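Putting steps 1 and 2 together, a minimal sketch (image tag, mount, and options as used in the quickstart above; `[container-name/ID]` and `mylogdir` are placeholders):
```bash
# Step 1: start the container with the additional TensorBoard port mapping
docker run --gpus all -d -it -p 8848:8888 -p 6006:6006 -v $(pwd)/data:/home/jovyan/work \
  -e GRANT_SUDO=yes -e JUPYTER_ENABLE_LAB=yes --user root \
  cschranz/gpu-jupyter:v1.5_cuda-11.6_ubuntu-20.04_python-only

# Step 2: start TensorBoard inside the container; --bind_all exposes it on the forwarded port
docker exec -it [container-name/ID] tensorboard --logdir mylogdir --bind_all
# TensorBoard is then reachable at http://localhost:6006
```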
### Updates
#### Update CUDA to another version
The GPU-libraries such as PyTorch and Tensorflow in `src/Docker.gpulibs` must support the CUDA version and NVIDIA drivers on the host machine. Check out these compatibility lists for [PyTorch](https://pytorch.org/get-started/locally/) and [Tensorflow](https://www.tensorflow.org/install/source#gpu) or search online for the explicit versions. In my setup, the NVIDIA Driver has version 510.47.03 and CUDA version 11.6.2 is used, which is compatible with Tensorflow 2.10 and PyTorch 1.12.
The GPU-libraries such as PyTorch and Tensorflow in `src/Docker.gpulibs` must support the CUDA version and NVIDIA drivers on the host machine. Check out the compatibility lists for [PyTorch](https://pytorch.org/get-started/locally/) and [Tensorflow](https://www.tensorflow.org/install/source#gpu) or search online for the explicit versions. In this setup, the NVIDIA Driver has version 530.30.02 and CUDA version 11.6.2 is used, which is compatible with Tensorflow 2.10 and PyTorch 1.12.
The host's CUDA version must be equal to or higher than that of the container itself (in `Dockerfile.header`).
Check the host's version with `nvcc --version` and the version compatibilities
@@ -264,10 +282,10 @@ docker run --gpus all -d -it -p 8848:8888 -v $(pwd)/data:/home/jovyan/work -e GR
```
#### Update the Docker-Stack
#### Upgrade / Downgrade the Docker-Stacks
The [docker-stacks](https://github.com/jupyter/docker-stacks) is used as a submodule within `.build/docker-stacks`. By default, the submodule's head is reset to a commit on which `gpu-jupyter` runs stably.
To update the generated Dockerfile to a specific commit, run:
To pin docker-stacks to a specific version, generate the Dockerfile with a specific [docker-stacks commit](https://github.com/jupyter/docker-stacks/commits/main):
```bash
./generate-Dockerfile.sh --commit c1c32938438151c7e2a22b5aa338caba2ec01da2
@@ -283,20 +301,31 @@ A new build can last some time and may consume a lot of data traffic. Note, that
a version conflict, as some files have to be adapted. Here are some examples of solutions:
- **Some file is not found:**
```Step 22/64 : COPY --chown="${NB_UID}:${NB_GID}" initial-condarc "${CONDA_DIR}/.condarc"
COPY failed: file not found in build context or excluded by .dockerignore: stat initial-condarc: file does not exist```
-> Adapt `nano generate-Dockerfile.sh` so that it copies `initial-condarc` into the working directory as it does with other files. Renamed files result in a similar issue and solution.
```
Step 22/64 : COPY --chown="${NB_UID}:${NB_GID}" initial-condarc "${CONDA_DIR}/.condarc"
COPY failed: file not found in build context or excluded by .dockerignore: stat initial-condarc: file does not exist
```
&rarr; Adapt `generate-Dockerfile.sh` so that it copies `initial-condarc` into the working directory as it does with other files. Renamed files result in a similar issue and solution.
- **The specified package version is not compatible with the drivers.**
```Step 56/64 : RUN pip install --upgrade pip && pip install --no-cache-dir "tensorflow==2.6.2" && pip install --no-cache-dir keras
```
Step 56/64 : RUN pip install --upgrade pip && pip install --no-cache-dir "tensorflow==2.6.2" && pip install --no-cache-dir keras
---> Running in 7c5701a3d780
Requirement already satisfied: pip in /opt/conda/lib/python3.10/site-packages (22.1.2)
ERROR: Could not find a version that satisfies the requirement tensorflow==2.6.2 (from versions: 2.8.0rc0, 2.8.0rc1, 2.8.0, 2.8.1, 2.8.2, 2.9.0rc0, 2.9.0rc1, 2.9.0rc2, 2.9.0, 2.9.1)
ERROR: No matching distribution found for tensorflow==2.6.2```
-> Just update the package to a version that is compatible, here tensorflow 2.8.2 was.
ERROR: No matching distribution found for tensorflow==2.6.2
```
More info on submodules can be found in
[this tutorial](https://www.vogella.com/tutorials/GitSubmodules/article.html).
&rarr; Just update the package to a compatible version; in this case, TensorFlow 2.8.2 was compatible.
- **Branch `master` is not known:**
```
error: pathspec 'master' did not match any file(s) known to git.
```
&rarr; The default branch of [docker-stacks](https://github.com/jupyter/docker-stacks) was
renamed from `master` to `main`. Delete the subdirectory `.build/docker-stacks` and regenerate the Dockerfile.
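A sketch of that fix (use whichever generate variant you normally build):
```bash
# Remove the stale submodule checkout and regenerate; the script clones docker-stacks again
rm -rf .build/docker-stacks
./generate-Dockerfile.sh --python-only   # or another variant, e.g. with --commit <hash>
```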
General information on submodules can be found in
[this tutorial](https://www.vogella.com/tutorials/GitSubmodules/article.html).
@@ -367,8 +396,8 @@ where:
e.g., here it is **elk_datastack**.
* **-r:** registry port is the port that is published by the registry service, the default is `5000`.
Now, *gpu-jupyter* will be accessible here on [localhost:8848](http://localhost:8848)
with the default password `gpu-jupyter` (previously `asdf`) and shares the network with the other data source, i.e.,
Now, *GPU-Jupyter* will be accessible on [localhost:8848](http://localhost:8848)
with the default password `gpu-jupyter` and shares the network with the other data source, i.e.,
all ports of the data source will be accessible within *GPU-Jupyter*,
even if they aren't routed in the source's `docker-compose` file.
13 changes: 11 additions & 2 deletions build_push_all.sh
@@ -1,8 +1,17 @@
#!/usr/bin/env bash
cd $(cd -P -- "$(dirname -- "$0")" && pwd -P)

export TAGNAME="v1.4_cuda-11.6_ubuntu-20.04"

# extract the branch-name that is built and pushed
export TAGNAME=$(git symbolic-ref -q HEAD)
export TAGNAME=${TAGNAME##refs/heads/}
export TAGNAME=${TAGNAME:-HEAD}
# manually set tag
# export TAGNAME="v1.5_cuda-11.6_ubuntu-20.04"
echo "Build and push images full, python-only & slim for branch '$TAGNAME'."
if [[ "$TAGNAME" != "v"*"_cuda-"*"_ubuntu-"* ]]; then
echo "ERROR, build_push_all.sh only possible within branches of shape 'v'*'_cuda-'*'_ubuntu-'*."
exit 1
fi

###################### build, run and push full image ##########################
echo
1 change: 1 addition & 0 deletions src/Dockerfile.gpulibs
@@ -40,6 +40,7 @@ RUN git clone https://github.com/Syllo/nvtop.git /run/nvtop && \
RUN /opt/conda/bin/conda install -c nvidia cuda-nvcc -y
RUN ln -s /opt/conda/bin/ptxas /usr/bin/ptxas

# fix permissions of conda
RUN fix-permissions /home/$NB_USER

USER $NB_UID
