GPTAttentionPlugin missing declaration of fields #2685

idantene · 2025-01-13T10:17:00Z

Hey,

We've been experimenting with the GPTAttentionPlugin and noticed that its creator class does not declare some of the plugin's fields.
On the Python side this has little effect, because the fields are provided directly. When loading from an ONNX model, however, the field names are used to populate the fields from the node attributes.

More specifically, in GPTAttentionPluginCreatorCommon here and in GPTAttentionPluginCreator here, we noticed several fields are missing: layer_idx, use_logn_scaling, layer_idx_in_cache_pool, block_sparse_block_size, block_sparse_homo_head_pattern, block_sparse_num_local_blocks and block_sparse_vertical_stride (there might be more, these were the obvious ones).

As mentioned, when the ONNX parser code loads the fields, it uses the field names declared in the creator classes to look up the node attributes.

As a result, loading an ONNX model that contains a GPTAttentionPlugin node fails when the plugin is created: those fields are missing, so the FieldParser class returns std::nullopt (here).
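To make the failure mode concrete, here is a minimal Python model of the flow described above. It is a sketch under assumptions, not the TensorRT-LLM or ONNX parser code: `FieldParser`, `MissingField`, `import_node`, and `create_plugin` are hypothetical names, the three attributes are a tiny subset, and the read order is invented for illustration.

```python
from typing import Any, Dict, Iterable, List, Optional, Set, Tuple


class MissingField(Exception):
    """Stands in for dereferencing an empty std::optional (hypothetical)."""


class FieldParser:
    """Toy field parser: tracks which of the supplied fields were read."""

    def __init__(self, fields: Dict[str, Any]) -> None:
        self._fields = dict(fields)
        self._read: Set[str] = set()

    def get(self, name: str) -> Any:
        # A field that was never supplied is fatal, as with std::nullopt.
        if name not in self._fields:
            raise MissingField(f"missing field: {name}")
        self._read.add(name)
        return self._fields[name]

    def unused(self) -> List[str]:
        # Fields that were supplied but never read.
        return sorted(set(self._fields) - self._read)


def import_node(declared: Iterable[str], node_attrs: Dict[str, Any]) -> Dict[str, Any]:
    """Forward only the node attributes whose names the creator declares."""
    declared_names = set(declared)
    return {k: v for k, v in node_attrs.items() if k in declared_names}


def create_plugin(required: Iterable[str], fields: Dict[str, Any]) -> Tuple[Optional[str], List[str]]:
    """Read every required field; return (error message or None, unused fields)."""
    parser = FieldParser(fields)
    error = None
    try:
        for name in required:
            parser.get(name)
    except MissingField as exc:
        error = str(exc)
    return error, parser.unused()


node_attrs = {"layer_idx": 0, "num_heads": 32, "head_size": 128}
required = ["layer_idx", "num_heads", "head_size"]  # assumed read order

# Creator forgets to declare layer_idx: the attribute never reaches the plugin.
error, unused = create_plugin(required, import_node(["num_heads", "head_size"], node_attrs))
print(error, unused)

# Creator declares every field: creation succeeds and nothing is left unused.
ok_error, ok_unused = create_plugin(required, import_node(required, node_attrs))
print(ok_error, ok_unused)
```

In this model the first read fails, so every field that was forwarded is left unread, which resembles the single bad optional access followed by a long run of "unused plugin field" messages in the log.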

Sample log from trying to load an ONNX file with GPTAttentionPlugin

config.get_memory_pool_limit(trt.MemoryPoolType.WORKSPACE) = 42298834944 Byte (39.4 GiB)
[01/10/2025-05:45:04] [TRT] [I] ----------------------------------------------------------------
[01/10/2025-05:45:04] [TRT] [I] Input filename: model.onnx
[01/10/2025-05:45:04] [TRT] [I] ONNX IR version: 0.0.8
[01/10/2025-05:45:04] [TRT] [I] Opset version: 17
[01/10/2025-05:45:04] [TRT] [I] Producer name: pytorch
[01/10/2025-05:45:04] [TRT] [I] Producer version: 2.4.1
[01/10/2025-05:45:04] [TRT] [I] Domain:
[01/10/2025-05:45:04] [TRT] [I] Model version: 0
[01/10/2025-05:45:04] [TRT] [I] Doc string:
[01/10/2025-05:45:04] [TRT] [I] ----------------------------------------------------------------
[01/10/2025-05:45:04] [TRT] [W] ModelImporter.cpp:459: Make sure input input_ids has Int64 binding.
[01/10/2025-05:45:04] [TRT] [W] ModelImporter.cpp:459: Make sure input host_kv_cache_pool_pointers has Int64 binding.
[01/10/2025-05:45:04] [TRT] [W] ModelImporter.cpp:459: Make sure input host_runtime_perf_knobs has Int64 binding.
[01/10/2025-05:45:04] [TRT] [W] ModelImporter.cpp:459: Make sure input host_context_progress has Int64 binding.
[01/10/2025-05:45:12] [TRT] [I] No checker registered for op: GPTAttention. Attempting to check as plugin.
[01/10/2025-05:45:12] [TRT] [I] No importer registered for op: GPTAttention. Attempting to import as plugin.
[01/10/2025-05:45:12] [TRT] [I] Searching for plugin: GPTAttention, plugin_version: 1, plugin_namespace: tensorrt_llm
[TensorRT-LLM][ERROR] std::bad_optional_access: bad optional access
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: in_flight_batching
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: cp_group
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: skip_attn
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: qk_rope_head_dim
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: kv_lora_rank
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: q_lora_rank
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: is_mla_enabled
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: spec_decoding_max_generation_length
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: qk_nope_head_dim
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: spec_decoding_is_generation_length_variable
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: is_spec_decoding_enabled
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: use_fp8_context_fmha
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: max_distance
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: use_cache
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: do_cross_attention
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: use_paged_context_fmha
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: rotary_embedding_scale_type
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: rotary_embedding_max_positions
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: rotary_embedding_original_max_positions
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: rotary_embedding_base
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: position_embedding_type
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: attn_logit_softcapping_scale
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: cp_size
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: unidirectional
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: head_size
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: qkv_bias_enabled
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: dense_context_fmha
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: type_id
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: rotary_embedding_dim
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: has_full_attention_mask
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: num_heads
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: rotary_embedding_short_m_scale
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: rotary_embedding_long_m_scale
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: rotary_embedding_scale
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: mask_type
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: cp_rank
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: tp_rank
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: tp_size
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: kv_cache_quant_mode
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: v_head_dim
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: unfuse_qkv_gemm
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: num_kv_heads
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: context_fmha_type
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: pos_shift_enabled
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: paged_kv_cache
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: vision_length
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: max_context_length
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: vision_start
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: q_scaling
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: remove_input_padding
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: tokens_per_block
[01/10/2025-05:45:12] [TRT] [E] ModelImporter.cpp:963: While parsing node number 66 [GPTAttention -> "/model/layers.0/self_attn/Reshape_7_output_0"]:
[01/10/2025-05:45:12] [TRT] [E] ModelImporter.cpp:964: --- Begin node ---
input: "/model/layers.0/self_attn/_plugin_input"
input: "sequence_length"
input: "host_past_key_value_lengths"
input: "host_max_attention_window_sizes"
input: "host_sink_token_length"
input: "context_lengths"
input: "cache_indirection"
input: "host_request_types"
input: "kv_cache_block_offsets"
input: "host_kv_cache_block_offsets"
input: "host_kv_cache_pool_pointers"
input: "host_kv_cache_pool_mapping"
input: "rotary_inv_freq"
input: "rotary_cos_sin"
input: "host_runtime_perf_knobs"
input: "host_context_progress"
output: "/model/layers.0/self_attn/Reshape_7_output_0"
name: "/model/layers.0/self_attn/_GPTAttention"
op_type: "GPTAttention"
...

On an unrelated note, we also noticed that GPTAttentionPluginCreatorCommon defines the plugin fields with unexpected lengths, such as -1, 0, and occasionally float values (0.0, 1.0), contrary to the documentation for the PluginField struct. We're aware the length attribute is not used directly for GPTAttentionPlugin, but we are left wondering whether there is any specific meaning or intent behind these lengths.

Thanks in advance!


nv-guomingz · 2025-01-14T14:32:15Z

Hi @MartinMarciniszyn, could you please take a look at this question?

MartinMarciniszyn · 2025-01-21T09:11:02Z

@yuxianq will look into this.

yuxianq · 2025-01-22T01:25:12Z

@idantene I can help fix it. Could you provide your test case so I can reproduce the problem and validate my fix locally?
As for the PluginField issue, these length attributes are unused because the data attribute is nullptr. Some developers may have mistakenly treated the length attribute as a default value. I will fix it.
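A small Python sketch of why the length carries no meaning when there is no data. `PluginFieldSketch` and `payload` are hypothetical names that only loosely mirror the name/data/length parts of PluginField (the type member is omitted); this is not the TensorRT API.

```python
from dataclasses import dataclass
from typing import Any, List, Optional, Sequence


@dataclass
class PluginFieldSketch:
    """Hypothetical stand-in for a plugin field declaration."""
    name: str
    data: Optional[Sequence[Any]] = None  # None plays the role of nullptr
    length: int = 0                       # number of entries in data


def payload(field: PluginFieldSketch) -> List[Any]:
    """Return the field's entries; length is only consulted when data exists."""
    if field.data is None:
        return []
    return list(field.data[:field.length])


# A declaration in the creator: no data, so the length value is never looked at.
declared = PluginFieldSketch("layer_idx", data=None, length=-1)
# A populated field: length says how many entries data holds.
populated = PluginFieldSketch("layer_idx", data=[7], length=1)

print(payload(declared), payload(populated))
```

Under this reading, a declaration with no data can use 0 for every length without changing behavior.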

jl749 · 2025-01-23T03:50:51Z

@yuxianq
I have added a PR with the fix. Could you please review it?

yuxianq · 2025-01-23T05:29:21Z

@jl749 Thanks for your contribution. LGTM, I will test it in our CI.

idantene · 2025-01-23T06:03:49Z

Thank you for the explanation regarding the length attributes @yuxianq.
My colleague @jl749 has attached a test case in the PR with the patch fixing this issue.

yuxianq · 2025-02-14T03:01:33Z

Thanks for your contribution to TRT-LLM. We've merged your changes into our internal branch for the upcoming weekly release. You will see the change in the main branch next week.

nv-guomingz assigned MartinMarciniszyn Jan 14, 2025

jl749 mentioned this issue Jan 23, 2025

fix: gptattentionplugin onnxparser compatability #2712

Closed

yuxianq closed this as completed Feb 14, 2025

kaiyux mentioned this issue Feb 18, 2025

Update TensorRT-LLM #2792

Merged
