docker: downloads model-*.safetensors on every run. How can I cache them? #807
Unanswered
spicedreams asked this question in Q&A
Replies: 0 comments
model-00001-of-00004.safetensors (and model-00002, etc.) is downloaded again every time I run the image built from my Dockerfile.
Typically they end up like this one:
7214446 4 lrwxrwxrwx 1 root root 76 Jun 10 17:45 /var/lib/docker/overlay2/4e9a81cd4a4e22416acd0b9cfbd64636c784e749fe126bd16ec3fca6a29f5e82/diff/models/models--meta-llama--Meta-Llama-3-8B-Instruct/snapshots/e1945c40cd546c78e41f1151f4db032b271faeaa/model-00001-of-00004.safetensors -> ../../blobs/d8cf9c4d0dd972e1a2131bfe656235ee98221679711a3beef6d46dadf0f20b5c
So there are now many copies of them, all using up disk space on my / volume, which is running out.
Please can someone help me get the right incantation so it runs these from cache, instead of downloading afresh each run?
Then I enter a query, get an error and it crashes out. The error is:
Truncation was not explicitly activated but max_length is provided a specific value, please use truncation=True to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to truncation.
Setting pad_token_id to eos_token_id:128001 for open-end generation.
So I am back to 'docker run etc' all over again, including the massive downloads.
Host is Linux Mint (upstream is Ubuntu 22.04; Jammy Jellyfish LTS).
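One thing that might help, as an untested sketch rather than a confirmed fix: the path above suggests the model cache lives at /models inside the container, so bind-mounting a host directory there should let the downloaded files survive between runs. The Python helper below only assembles such a `docker run` command. The image name `my-llama-image` and the host directory `~/hf-models` are hypothetical placeholders, and `--rm` is included on the assumption that stopped containers are what is piling up under overlay2.

```python
import shlex
from pathlib import Path


def build_docker_run_args(image, host_cache_dir, container_cache_dir="/models"):
    """Build a `docker run` argument list that bind-mounts a host cache directory.

    container_cache_dir defaults to /models because that is where the
    snapshots appear in the overlay2 path; adjust it if your image differs.
    """
    # Docker bind mounts need an absolute host path.
    host_path = Path(host_cache_dir).expanduser().resolve()
    return [
        "docker", "run",
        "--rm",   # remove the container on exit so its writable layer is not kept
        "-it",    # interactive terminal, since the program prompts for a query
        "-v", f"{host_path}:{container_cache_dir}",  # persist the cache on the host
        image,
    ]


if __name__ == "__main__":
    # "my-llama-image" and "~/hf-models" are placeholders, not names from this post.
    args = build_docker_run_args("my-llama-image", "~/hf-models")
    print(shlex.join(args))
```

If the cache directory inside the container turns out to be somewhere other than /models, `container_cache_dir` would need to change to match.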