
Pull requests: NVIDIA/TensorRT-LLM

test: add kv cache event tests for disagg workers
#3602 opened Apr 16, 2025 by zhengd-nv
fix: fix cublas_scaled_mm
#3600 opened Apr 16, 2025 by dc3671 Draft
test: add multinode test case for deepseek-v3
#3599 opened Apr 16, 2025 by crazydemo
feat: add etcd dependency and interface
#3597 opened Apr 16, 2025 by Shunkangz
test: add quickstart test for nemotron-ultra
#3596 opened Apr 16, 2025 by crazydemo
test:restore fp8 kv cache testing for L0
#3595 opened Apr 16, 2025 by nv-guomingz
test: Get Eagle tests working
#3593 opened Apr 16, 2025 by brb-nv
feat: trtllm-serve multimodal support
#3590 opened Apr 16, 2025 by yechank-nvidia
feat: Disaggregated router class
#3584 opened Apr 15, 2025 by pcastonguay
fix: add SM90 guard for FP8 Blockscale GEMM
#3575 opened Apr 15, 2025 by lucifer1004
fix: Remove unnecessary max call
#3574 opened Apr 15, 2025 by kaiyux
feat: support kv cache reuse for MLA
#3571 opened Apr 15, 2025 by zhhuang-nv
fix: FP8 quantized lm_head (NvBug 5214229)
#3567 opened Apr 15, 2025 by syuoni
Test release/0.19 CI
#3556 opened Apr 15, 2025 by ZhanruiSunCh Draft
Clean up linear.py, mlp.py, gated_mlp.py
#3553 opened Apr 15, 2025 by hlu1
test: Unwaive test for nvbug_5150466
#3552 opened Apr 15, 2025 by hchings
infra: Install Triton in TRT-LLM container
#3549 opened Apr 14, 2025 by Tabrizian
feat: [AutoDeploy] Llama-4 support
#3547 opened Apr 14, 2025 by lucaslie Draft