Pull requests: InternLM/lmdeploy
Refactor turbomind (low-level abstractions)
improvement
#3423
opened Apr 11, 2025 by
lzhangzz
Launch multiple api servers for dp > 1
improvement
#3414
opened Apr 10, 2025 by
RunningLeon
add twomicrobatch support
enhancement
New feature or request
#3381
opened Apr 1, 2025 by
SHshenhao
Add Gloo communication to turbomind
enhancement
New feature or request
#3362
opened Mar 28, 2025 by
irexyc
Improve turbomind's prefix cache
BC-breaking
improvement
#3332
opened Mar 25, 2025 by
lvhan028
6 of 8 tasks
add deepseekv3 doc
documentation
Improvements or additions to documentation
WIP
#3265
opened Mar 17, 2025 by
CUHKSZzxy
support loading model with user input params (turbomind)
enhancement
New feature or request
#3204
opened Mar 3, 2025 by
irexyc
support setting devices for turbomind backend
improvement
#3203
opened Mar 3, 2025 by
irexyc
fix: replace inf with max or min finite value, then do softmax
#3059
opened Jan 21, 2025 by
KenForever1
support Turbomind ep
enhancement
New feature or request
#2883
opened Dec 12, 2024 by
irexyc
Support Medusa speculative decoding
enhancement
New feature or request
#2859
opened Dec 5, 2024 by
AllentDan
Refactor turbomind attention by precomputing rotary embed
improvement
#2801
opened Nov 25, 2024 by
irexyc
[Feature] Support llava onevision
enhancement
New feature or request
#2783
opened Nov 21, 2024 by
deepindeed2022