Pull requests: huggingface/transformers
[chat] improvements for thinking models and reduce default verbosity
#38322
opened May 23, 2025 by gante
[FlexAttention] Reenable flex for encoder-decoder and make the test more robust
#38321
opened May 23, 2025 by vasqu
Fix some tests (especially compile with fullgraph=True on Python<3.11)
#38319
opened May 23, 2025 by Cyrilvallez
5 tasks
Use Gradient Checkpointing Layer in Jamba & Blip Related Models
#38310
opened May 22, 2025 by alex-jw-brooks
[custom_generate] don't forward custom_generate and trust_remote_code
#38304
opened May 22, 2025 by gante
fix total batch size calculation in trainer
#38286
opened May 22, 2025 by inkcherry
5 tasks
align xpu's autocast behavior w/ cuda by using device agnostic torch.autocast
#38284
opened May 22, 2025 by yao-matrix
Add zero dim tensor check when using flash_attention
#38280
opened May 22, 2025 by ranzhejiang
[performance_optim] reduce frequency of declaring attention_mask in Ascend NPU flash attention
#38278
opened May 22, 2025 by FightingZhen
1 of 5 tasks