
Pull requests: NVIDIA/TensorRT-LLM

test: add kv cache event tests for disagg workers
#3602 opened Apr 16, 2025 by zhengd-nv
fix: fix cublas_scaled_mm
#3600 opened Apr 16, 2025 by dc3671 Draft
test: add multinode test case for deepseek-v3
#3599 opened Apr 16, 2025 by crazydemo
feat: add etcd dependency and interface
#3597 opened Apr 16, 2025 by Shunkangz
test: add quickstart test for nemotron-ultra
#3596 opened Apr 16, 2025 by crazydemo
test:restore fp8 kv cache testing for L0
#3595 opened Apr 16, 2025 by nv-guomingz
test: Get Eagle tests working
#3593 opened Apr 16, 2025 by brb-nv
feat: trtllm-serve multimodal support
#3590 opened Apr 16, 2025 by yechank-nvidia
feat: Disaggregated router class
#3584 opened Apr 15, 2025 by pcastonguay
fix: add SM90 guard for FP8 Blockscale GEMM
#3575 opened Apr 15, 2025 by lucifer1004
fix: Remove unnecessary max call
#3574 opened Apr 15, 2025 by kaiyux
feat: support kv cache reuse for MLA
#3571 opened Apr 15, 2025 by zhhuang-nv
fix: FP8 quantized lm_head (NvBug 5214229)
#3567 opened Apr 15, 2025 by syuoni
Test release/0.19 CI
#3556 opened Apr 15, 2025 by ZhanruiSunCh Draft
Clean up linear.py, mlp.py, gated_mlp.py
#3553 opened Apr 15, 2025 by hlu1
test: Unwaive test for nvbug_5150466
#3552 opened Apr 15, 2025 by hchings
infra: Install Triton in TRT-LLM container
#3549 opened Apr 14, 2025 by Tabrizian
feat: [AutoDeploy] Llama-4 support
#3547 opened Apr 14, 2025 by lucaslie Draft