Issues: huggingface/transformers
Swinv2Model reports an error when using the parameter use_obsolute_embeddings · #37161 · opened Apr 1, 2025 by SCP-KAKA
Loading the DeepSeek R1 model takes an extremely long time [bug] · #37160 · opened Apr 1, 2025 by Neo9061
Error when using trainer with default data parallelism enabled: RuntimeError: chunk expects at least a 1-dimensional tensor [bug] · #37151 · opened Mar 31, 2025 by Mekadrom
Warnings when loading Deepseek-V3 without custom code [bug] · #37134 · opened Mar 31, 2025 by Rocketknight1
Does the transformers Trainer support pipeline parallelism? · #37129 · opened Mar 31, 2025 by liuheng0111
Possible to move HybridCache from GPU to CPU? [Cache, Feature request] · #37125 · opened Mar 31, 2025 by tianhaoz95
FastAPI with LLM inference does not release accumulated VRAM [bug] · #37118 · opened Mar 31, 2025 by variable
Add Sdpa Support for Electra [Feature request] · #37105 · opened Mar 29, 2025 by nnilayy
Feature Request: Support Canary Models [Feature request] · #37098 · opened Mar 29, 2025 by fakerybakery
Release Tag Changed, Breaking Checksums, and AUR Package Building · #37090 · opened Mar 28, 2025 by daskol
LLaVa_mistral models are unrecognized [bug, New model] · #37087 · opened Mar 28, 2025 by darshpatel1052
Do not update cache when use_cache=False and past_key_values are provided? [Feature request] · #37078 · opened Mar 28, 2025 by PheelaV
A TypeError in the modeling_utils.caching_allocator_warmup function [bug] · #37074 · opened Mar 28, 2025 by ZeroMakesAll
A logic error in the _preprocess function of the Qwen2VLImageProcessor class [bug] · #37064 · opened Mar 28, 2025 by InsaneGe
Incorrect stride calculation leads to loss of param data when loading a sliced model with tensor parallelism [bug, Tensor Parallel] · #37051 · opened Mar 27, 2025 by kmehant
Persistent generation issues with MT5 models (base and fine-tuned) across environments · #37048 · opened Mar 27, 2025 by Elpharran
Optionality of attention_mask argument in Attention classes/functions · #37046 · opened Mar 27, 2025 by Godofnothing
run_mim.py script from image-pretraining example is not working [bug] · #37020 · opened Mar 26, 2025 by jafraustro
SwitchTransformer: Initialization of tensor to collect expert results is incorrect for dropped tokens (from ML POV) [bug] · #37017 · opened Mar 26, 2025 by mario-aws