File tree
12 files changed
+40 -20 lines changed
- all_models/gpt/tensorrt_llm/1
- dockerfile
- inflight_batcher_llm
- src
- scripts
- tools
Diff for: .gitmodules
+1 -1
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 1 | 1 | |
| 2 | 2 | |
| 3 | | - |
| | 3 | + |
Diff for: .pre-commit-config.yaml
+6
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 40 | 40 | |
| 41 | 41 | |
| 42 | 42 | |
| | 43 | + |
| | 44 | + |
| | 45 | + |
| | 46 | + |
| | 47 | + |
| | 48 | + |
Diff for: README.md
+1 -1
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 363 | 363 | |
| 364 | 364 | |
| 365 | 365 | |
| 366 | | - |
| | 366 | + |
| 367 | 367 | |
| 368 | 368 | |
| 369 | 369 | |
Diff for: all_models/gpt/tensorrt_llm/1/model.py
+1 -1
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 242 | 242 | |
| 243 | 243 | |
| 244 | 244 | |
| 245 | | - |
| | 245 | + |
| 246 | 246 | |
| 247 | 247 | |
| 248 | 248 | |
Diff for: dockerfile/Dockerfile.trt_llm_backend
+1 -1
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 8 | 8 | |
| 9 | 9 | |
| 10 | 10 | |
| 11 | | - |
| | 11 | + |
| 12 | 12 | |
| 13 | 13 | |
| 14 | 14 | |
Diff for: inflight_batcher_llm/CMakeLists.txt
+1 -1
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 30 | 30 | |
| 31 | 31 | |
| 32 | 32 | |
| 33 | | - |
| | 33 | + |
| 34 | 34 | |
| 35 | 35 | |
| 36 | 36 | |
+6 -4
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 858 | 858 | |
| 859 | 859 | |
| 860 | 860 | |
| 861 | | - |
| 862 | 861 | |
| 863 | 862 | |
| 864 | 863 | |
| ... | ... | |
| 1128 | 1127 | |
| 1129 | 1128 | |
| 1130 | 1129 | |
| 1131 | | - |
| | 1130 | + |
| 1132 | 1131 | |
| 1133 | 1132 | |
| 1134 | 1133 | |
| ... | ... | |
| 1139 | 1138 | |
| 1140 | 1139 | |
| 1141 | 1140 | |
| 1142 | | - |
| 1143 | | - |
| | 1141 | + |
| | 1142 | + |
| | 1143 | + |
| | 1144 | + |
| | 1145 | + |
| 1144 | 1146 | |
| 1145 | 1147 | |
| 1146 | 1148 | |
Diff for: scripts/launch_triton_server.py
+19 -7
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 1 | 1 | |
| 2 | 2 | |
| | 3 | + |
| 3 | 4 | |
| 4 | 5 | |
| 5 | 6 | |
| ... | ... | |
| 9 | 10 | |
| 10 | 11 | |
| 11 | 12 | |
| 12 | | - |
| 13 | | - |
| 14 | | - |
| | 13 | + |
| | 14 | + |
| | 15 | + |
| | 16 | + |
| | 17 | + |
| | 18 | + |
| | 19 | + |
| | 20 | + |
| | 21 | + |
| | 22 | + |
| | 23 | + |
| | 24 | + |
| 15 | 25 | |
| 16 | 26 | |
| 17 | 27 | |
| ... | ... | |
| 30 | 40 | |
| 31 | 41 | |
| 32 | 42 | |
| 33 | | - |
| | 43 | + |
| 34 | 44 | |
| 35 | 45 | |
| 36 | 46 | |
| 37 | 47 | |
| 38 | | - |
| 39 | | - |
| 40 | | - |
| | 48 | + |
| | 49 | + |
| | 50 | + |
| | 51 | + |
| | 52 | + |
| 41 | 53 | |
| 42 | 54 | |
Diff for: tensorrt_llm
Submodule tensorrt_llm updated 43 files
- benchmarks/cpp/gptManagerBenchmark.cpp +36 -38
- benchmarks/cpp/gptSessionBenchmark.cpp +46 -20
- cpp/include/tensorrt_llm/batch_manager/kvCacheConfig.h +45
- cpp/include/tensorrt_llm/batch_manager/kvCacheManager.h +9 -3
- cpp/include/tensorrt_llm/batch_manager/trtGptModelOptionalParams.h +11 -39
- cpp/include/tensorrt_llm/runtime/gptDecoderBatch.h +5 -5
- cpp/include/tensorrt_llm/runtime/gptSession.h +61 -47
- cpp/include/tensorrt_llm/runtime/iGptDecoderBatch.h +1 -1
- cpp/include/tensorrt_llm/runtime/iStatefulGptDecoder.h +7 -5
- cpp/include/tensorrt_llm/runtime/worldConfig.h +6
- cpp/tensorrt_llm/batch_manager/aarch64-linux-gnu/libtensorrt_llm_batch_manager_static.a +2 -2
- cpp/tensorrt_llm/batch_manager/aarch64-linux-gnu/libtensorrt_llm_batch_manager_static.pre_cxx11.a +2 -2
- cpp/tensorrt_llm/batch_manager/aarch64-linux-gnu/version.txt +3 -3
- cpp/tensorrt_llm/batch_manager/x86_64-linux-gnu/libtensorrt_llm_batch_manager_static.a +2 -2
- cpp/tensorrt_llm/batch_manager/x86_64-linux-gnu/libtensorrt_llm_batch_manager_static.pre_cxx11.a +2 -2
- cpp/tensorrt_llm/kernels/decoderMaskedMultiheadAttention/decoderMaskedMultiheadAttentionTemplate.h +16 -9
- cpp/tensorrt_llm/kernels/decoderMaskedMultiheadAttentionUtils.h +11 -1
- cpp/tensorrt_llm/runtime/gptDecoder.cpp +38 -43
- cpp/tensorrt_llm/runtime/gptDecoderBatch.cpp +2 -5
- cpp/tensorrt_llm/runtime/gptSession.cpp +246 -301
- cpp/tensorrt_llm/runtime/runtimeBuffers.cpp +18 -18
- cpp/tensorrt_llm/runtime/runtimeBuffers.h +10 -4
- cpp/tensorrt_llm/runtime/statefulGptDecoder.cpp +1 -8
- cpp/tensorrt_llm/runtime/statefulGptDecoder.h +1 -1
- cpp/tests/resources/scripts/test_cpp.py +183 -92
- cpp/tests/runtime/gptSessionTest.cpp +14 -11
- docker/Makefile +3
- docker/common/install_base.sh +9 -1
- docs/source/batch_manager.md +16 -4
- examples/chatglm6b/convert.py -186
- examples/chatglm6b/exportLM.py -25
- examples/chatglm6b/hf_chatglm6b_convert.py -212
- examples/chatglm6b/modeling_chatglm.py -1.6k
- examples/falcon/build.py +5 -1
- examples/gpt/build.py +6 -1
- examples/gpt/run.py +3
- examples/gpt/summarize.py +3
- examples/gptj/build.py +5 -1
- examples/llama/build.py +5 -1
- examples/llama/summarize.py +1 -1
- examples/mpt/build.py +5 -1
- tensorrt_llm/runtime/session.py +2 -4
- tests/attention/test_gpt_attention.py +69 -56
Diff for: tools/environment_setup.sh
+1 -1
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 33 | 33 | |
| 34 | 34 | |
| 35 | 35 | |
| 36 | | - |
| | 36 | + |
| 37 | 37 | |
| 38 | 38 | |
| 39 | 39 | |
Diff for: tools/fill_template.py
+1 -1
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 27 | 27 | |
| 28 | 28 | |
| 29 | 29 | |
| 30 | | - |
| | 30 | + |
| 31 | 31 | |
| 32 | 32 | |
| 33 | 33 | |
Diff for: tools/gen_trtllm_dockerfile.py
+1 -1
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 33 | 33 | |
| 34 | 34 | |
| 35 | 35 | |
| 36 | | - |
| | 36 | + |
| 37 | 37 | |
| 38 | 38 | |
| 39 | 39 | |