Update TensorRT-LLM backend #726

Merged: 2 commits, Mar 18, 2025
3 changes: 0 additions & 3 deletions ci/L0_backend_trtllm/generate_engines.sh
@@ -61,9 +61,6 @@ function build_tensorrt_engine_inflight_batcher {
 cd ${BASE_DIR}
 }
 
-# Downgrade to legacy version to accommodate Triton CI runners
-pip install pynvml==11.4.0
-
 # Generate the TRT_LLM model engines
 NUM_GPUS_TO_TEST=("1" "2" "4")
 for NUM_GPU in "${NUM_GPUS_TO_TEST[@]}"; do
2 changes: 1 addition & 1 deletion tensorrt_llm
2 changes: 1 addition & 1 deletion tools/version.txt
@@ -1 +1 @@
-cf950ea521ca66fa3e3383545e60e4212d2de547
+85bc9eaf686c2a77c364b30c51e7f844ad6b47eb