docker: downloads model-*.safetensors on every run. How can I cache them? #807

spicedreams · 2024-06-10T17:07:56Z

model-00001-of-00004.safetensors (and model-00002, etc.) is downloaded again every time I run the image built from the Dockerfile.

Typically they end up like this one:
7214446 4 lrwxrwxrwx 1 root root 76 Jun 10 17:45 /var/lib/docker/overlay2/4e9a81cd4a4e22416acd0b9cfbd64636c784e749fe126bd16ec3fca6a29f5e82/diff/models/models--meta-llama--Meta-Llama-3-8B-Instruct/snapshots/e1945c40cd546c78e41f1151f4db032b271faeaa/model-00001-of-00004.safetensors -> ../../blobs/d8cf9c4d0dd972e1a2131bfe656235ee98221679711a3beef6d46dadf0f20b5c

So there are now many copies of them, all using up disk space on my / volume, which is running out.
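To put a number on the duplication, a minimal sketch along these lines could tally the repeated blobs. It assumes the cache layout visible in the listing above (a `blobs` directory holding the real files, with `snapshots` entries symlinked to them); the function names `find_blob_copies` and `wasted_bytes` are made up for illustration.

```python
import os
import sys
from collections import defaultdict


def find_blob_copies(root):
    """Map each blob file name to the (path, size) copies found under root."""
    copies = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        # Assumed layout: .../models--<org>--<name>/blobs/<hash>
        if os.path.basename(dirpath) != "blobs":
            continue
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # count real files only, not symlinks
            copies[name].append((path, os.path.getsize(path)))
    return copies


def wasted_bytes(copies):
    """Bytes used by every copy beyond the first of each blob."""
    return sum(size for entries in copies.values() for _path, size in entries[1:])


if __name__ == "__main__":
    found = find_blob_copies(sys.argv[1])
    print(f"{len(found)} distinct blobs, {wasted_bytes(found)} bytes in duplicates")
```

Pointing it at /var/lib/docker/overlay2 would need root, since that directory is normally not readable by ordinary users.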

Please can someone help me get the right incantation so it runs these from cache, instead of downloading afresh each run?
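One approach that might work is to bind-mount a host directory over the container's cache path, so the downloads land outside the container's writable layer and survive between runs. The sketch below only builds the `docker run` command line. It assumes the container writes its cache to `/models` (inferred from the path in the listing above); the image name `my-llama-image` and the host directory `~/hf-cache` are hypothetical placeholders.

```python
import shlex
from pathlib import Path


def build_run_command(image, host_cache, container_cache="/models"):
    """Build a docker run command that bind-mounts a persistent host cache."""
    host_path = Path(host_cache).expanduser().resolve()
    return [
        "docker", "run",
        "--rm",   # remove the container on exit so old layers do not pile up
        "-it",    # interactive terminal, for typing queries
        "-v", f"{host_path}:{container_cache}",  # host dir mounted at the cache path
        image,
    ]


if __name__ == "__main__":
    # Hypothetical image name and host directory, for illustration only.
    cmd = build_run_command("my-llama-image", "~/hf-cache")
    print(shlex.join(cmd))
```

With the mount in place, the first run would still download the files, but later runs should find them already present in the host directory. The `--rm` flag is optional; it is there because each plain `docker run` otherwise leaves a stopped container and its layer behind.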

Then I enter a query, get an error, and it crashes out. The error is:
Truncation was not explicitly activated but max_length is provided a specific value, please use truncation=True to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to truncation.
Setting pad_token_id to eos_token_id:128001 for open-end generation.
So I am back to 'docker run etc' all over again, including the massive downloads.

Host is Linux Mint (upstream is Ubuntu 22.04; Jammy Jellyfish LTS).
