TensorRT-LLM backend v0.18.1 release #734

Merged
merged 2 commits into from
Apr 9, 2025
14 changes: 7 additions & 7 deletions dockerfile/Dockerfile.trt_llm_backend
@@ -1,9 +1,9 @@
ARG BASE_IMAGE=nvcr.io/nvidia/tritonserver
-ARG BASE_TAG=25.02-py3
+ARG BASE_TAG=25.03-py3
ARG PYTORCH_IMAGE=nvcr.io/nvidia/pytorch:25.03-py3

-FROM ${PYTORCH_IMAGE} as pytorch_image
-FROM ${BASE_IMAGE}:${BASE_TAG} as base
+FROM ${PYTORCH_IMAGE} AS pytorch_image
+FROM ${BASE_IMAGE}:${BASE_TAG} AS base

# Copy PyTorch package from PyTorch image
COPY --from=pytorch_image /usr/local/lib/lib* /usr/local/lib/
@@ -37,7 +37,7 @@ RUN pip3 install -r /tmp/requirements.txt
RUN apt-get remove --purge -y tensorrt*
RUN pip uninstall -y tensorrt

-FROM base as dev
+FROM base AS dev

# Download & install internal TRT release
COPY tensorrt_llm/docker/common/install_tensorrt.sh /tmp/
@@ -63,20 +63,20 @@ ARG TORCH_INSTALL_TYPE="skip"
COPY tensorrt_llm/docker/common/install_pytorch.sh install_pytorch.sh
RUN bash ./install_pytorch.sh $TORCH_INSTALL_TYPE && rm install_pytorch.sh

-FROM dev as trt_llm_builder
+FROM dev AS trt_llm_builder

WORKDIR /app
COPY scripts scripts
COPY tensorrt_llm tensorrt_llm
RUN cd tensorrt_llm && python3 scripts/build_wheel.py --trt_root="${TRT_ROOT}" -i -c && cd ..

-FROM trt_llm_builder as trt_llm_backend_builder
+FROM trt_llm_builder AS trt_llm_backend_builder

WORKDIR /app/
COPY inflight_batcher_llm inflight_batcher_llm
RUN cd inflight_batcher_llm && bash scripts/build.sh && cd ..

-FROM trt_llm_backend_builder as final
+FROM trt_llm_backend_builder AS final

# Install TensorRT-LLM
WORKDIR /app/
2 changes: 1 addition & 1 deletion tensorrt_llm
Submodule tensorrt_llm updated 56 files
+1 −1 README.md
+1 −1 cpp/tensorrt_llm/batch_manager/aarch64-linux-gnu/version.txt
+1 −1 cpp/tensorrt_llm/batch_manager/x86_64-linux-gnu/libtensorrt_llm_batch_manager_static.a
+1 −1 cpp/tensorrt_llm/batch_manager/x86_64-linux-gnu/libtensorrt_llm_batch_manager_static.pre_cxx11.a
+1 −1 cpp/tensorrt_llm/batch_manager/x86_64-linux-gnu/version.txt
+1 −1 cpp/tensorrt_llm/executor/aarch64-linux-gnu/libtensorrt_llm_executor_static.a
+1 −1 cpp/tensorrt_llm/executor/aarch64-linux-gnu/libtensorrt_llm_executor_static.pre_cxx11.a
+3 −3 cpp/tensorrt_llm/executor/aarch64-linux-gnu/version.txt
+1 −1 cpp/tensorrt_llm/executor/x86_64-linux-gnu/libtensorrt_llm_executor_static.a
+1 −1 cpp/tensorrt_llm/executor/x86_64-linux-gnu/libtensorrt_llm_executor_static.pre_cxx11.a
+3 −3 cpp/tensorrt_llm/executor/x86_64-linux-gnu/version.txt
+1 −1 ...rt_llm/kernels/decoderMaskedMultiheadAttention/decoderXQAImplJIT/nvrtcWrapper/aarch64-linux-gnu/version.txt
+1 −1 ...rrt_llm/kernels/decoderMaskedMultiheadAttention/decoderXQAImplJIT/nvrtcWrapper/x86_64-linux-gnu/version.txt
+1 −1 cpp/tensorrt_llm/kernels/internal_cutlass_kernels/aarch64-linux-gnu/version.txt
+1 −1 ...rrt_llm/kernels/internal_cutlass_kernels/x86_64-linux-gnu/libtensorrt_llm_internal_cutlass_kernels_static.a
+1 −1 ...rnels/internal_cutlass_kernels/x86_64-linux-gnu/libtensorrt_llm_internal_cutlass_kernels_static.pre_cxx11.a
+3 −3 cpp/tensorrt_llm/kernels/internal_cutlass_kernels/x86_64-linux-gnu/version.txt
+9 −0 docs/source/release-notes.md
+1 −1 examples/baichuan/requirements.txt
+1 −1 examples/bloom/requirements.txt
+1 −1 examples/chatglm/requirements.txt
+1 −1 examples/commandr/requirements.txt
+1 −1 examples/dbrx/requirements.txt
+1 −1 examples/deepseek_v1/requirements.txt
+1 −1 examples/draft_target_model/requirements.txt
+1 −1 examples/eagle/requirements.txt
+1 −1 examples/falcon/requirements.txt
+1 −1 examples/gemma/requirements.txt
+1 −1 examples/gpt/requirements.txt
+1 −1 examples/gptj/requirements.txt
+1 −1 examples/gptneox/requirements.txt
+1 −1 examples/grok/requirements.txt
+1 −1 examples/internlm/requirements.txt
+1 −1 examples/jais/requirements.txt
+1 −1 examples/llama/requirements.txt
+1 −1 examples/lookahead/requirements.txt
+1 −1 examples/mamba/requirements.txt
+1 −1 examples/medusa/requirements.txt
+1 −1 examples/mixtral/requirements.txt
+1 −1 examples/mpt/requirements.txt
+1 −1 examples/nemotron/requirements.txt
+1 −1 examples/opt/requirements.txt
+1 −1 examples/phi/requirements.txt
+1 −1 examples/prompt_lookup/requirements.txt
+1 −1 examples/quantization/requirements.txt
+1 −1 examples/qwen/requirements.txt
+1 −1 examples/qwenvl/requirements.txt
+1 −1 examples/recurrentgemma/requirements.txt
+1 −1 examples/redrafter/requirements.txt
+1 −1 examples/skywork/requirements.txt
+1 −1 examples/smaug/requirements.txt
+1 −1 examples/whisper/requirements.txt
+1 −2 requirements.txt
+23 −18 tensorrt_llm/models/gemma/smoothquant.py
+1 −1 tensorrt_llm/version.py
+1 −9 tests/llmapi/apps/_test_openai_multi_chat.py
2 changes: 1 addition & 1 deletion tools/version.txt
@@ -1 +1 @@
-2dd0858fe1937169135547faab56b438c28a7132
+307752db60d431d6885b4fcf66f9aa189cad5e64
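
Reviewer note: apart from the base tag bump, every Dockerfile change in this PR uppercases the `as` keyword on a multi-stage `FROM` line. The PR does not state the motivation; a likely one (an assumption here) is keeping the keyword casing consistent with `FROM`, which recent Docker build checks warn about. The sketch below shows one way the same normalization could be applied to other Dockerfiles. The function name `normalize_from_as` is hypothetical and not part of this repository.

```python
import re

# Matches a FROM instruction that names a build stage, with "as" in any casing.
# Group 1: everything up to the stage keyword; group 2: the stage name.
_FROM_AS = re.compile(r"^(\s*FROM\s+.+?)\s+as\s+(\S+)\s*$", re.IGNORECASE)


def normalize_from_as(dockerfile_text: str) -> str:
    """Return the Dockerfile text with the stage keyword written as 'AS'.

    Only lines that start with FROM are touched; all other lines are
    passed through unchanged.
    """
    out = []
    for line in dockerfile_text.splitlines():
        m = _FROM_AS.match(line)
        out.append(f"{m.group(1)} AS {m.group(2)}" if m else line)
    result = "\n".join(out)
    # Preserve a trailing newline if the input had one.
    if dockerfile_text.endswith("\n"):
        result += "\n"
    return result


if __name__ == "__main__":
    sample = "FROM ${PYTORCH_IMAGE} as pytorch_image\nFROM base as dev\n"
    print(normalize_from_as(sample), end="")
```

The pattern is anchored on `FROM` so that an `as` appearing inside a `RUN` command or a comment is left alone.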