GPTAttentionPlugin missing declaration of fields #2685

Closed
idantene opened this issue Jan 13, 2025 · 7 comments · Fixed by #2792

@idantene

Hey,

We've been experimenting with the GPTAttentionPlugin and noticed that the plugin is missing some field declarations in its creator class.
On the Python side this has little effect, since the field values are provided directly; when loading from an ONNX model, however, the declared field names are used to populate the fields from the node attributes.

More specifically, in GPTAttentionPluginCreatorCommon here and in GPTAttentionPluginCreator here, we noticed that several fields are missing: layer_idx, use_logn_scaling, layer_idx_in_cache_pool, block_sparse_block_size, block_sparse_homo_head_pattern, block_sparse_num_local_blocks, and block_sparse_vertical_stride (there might be more; these were the obvious ones).
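
For reference, here is a minimal sketch (not the actual TensorRT-LLM source; the PluginFieldType choices are our guesses) of what declaring the missing fields in the creator could look like:

```cpp
// Hedged sketch, not the actual TensorRT-LLM source: a creator typically
// advertises its fields like this. data is nullptr because the creator only
// declares names/types; concrete values arrive later via createPlugin().
// The PluginFieldType choices below are guesses for illustration only.
#include <NvInferRuntime.h>
#include <vector>

static std::vector<nvinfer1::PluginField> makeGptAttentionAttributes()
{
    using nvinfer1::PluginField;
    using nvinfer1::PluginFieldType;
    std::vector<PluginField> attrs;
    // ... existing field declarations ...
    attrs.emplace_back("layer_idx", nullptr, PluginFieldType::kINT32, 1);
    attrs.emplace_back("use_logn_scaling", nullptr, PluginFieldType::kINT8, 1);
    attrs.emplace_back("layer_idx_in_cache_pool", nullptr, PluginFieldType::kINT32, 1);
    attrs.emplace_back("block_sparse_block_size", nullptr, PluginFieldType::kINT32, 1);
    attrs.emplace_back("block_sparse_homo_head_pattern", nullptr, PluginFieldType::kINT8, 1);
    attrs.emplace_back("block_sparse_num_local_blocks", nullptr, PluginFieldType::kINT32, 1);
    attrs.emplace_back("block_sparse_vertical_stride", nullptr, PluginFieldType::kINT32, 1);
    return attrs;
}
```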

As mentioned, when the ONNX parser loads the fields, it uses the field names declared in the creator classes to look up the node attributes.

As a result, when we try to load an ONNX model that has GPTAttentionPlugin as a node, plugin creation fails: those fields were never declared, so the PluginFieldParser returns std::nullopt (here).
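
To make the failure mode concrete, here is a minimal, self-contained illustration (again ours, not the real PluginFieldParser) of why an undeclared field surfaces as std::bad_optional_access:

```cpp
// Minimal illustration, not the real PluginFieldParser: the parser only
// forwards attributes whose names the creator declared, so a lookup for an
// undeclared field yields std::nullopt, and calling .value() on it throws
// std::bad_optional_access, matching the first error in the log below.
#include <iostream>
#include <map>
#include <optional>
#include <string>

std::optional<int> getField(std::map<std::string, int> const& fields, std::string const& name)
{
    auto const it = fields.find(name);
    return it == fields.end() ? std::nullopt : std::optional<int>{it->second};
}

int main()
{
    std::map<std::string, int> populated{{"num_heads", 32}}; // "layer_idx" was never declared
    try
    {
        int const layerIdx = getField(populated, "layer_idx").value(); // throws
        std::cout << layerIdx << '\n';
    }
    catch (std::bad_optional_access const& e)
    {
        std::cerr << e.what() << '\n'; // e.g. "bad optional access"
    }
    return 0;
}
```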

Sample log when trying to load an ONNX file with GPTAttentionPlugin:

config.get_memory_pool_limit(trt.MemoryPoolType.WORKSPACE) = 42298834944 Byte (39.4 GiB)
[01/10/2025-05:45:04] [TRT] [I] ----------------------------------------------------------------
[01/10/2025-05:45:04] [TRT] [I] Input filename: model.onnx
[01/10/2025-05:45:04] [TRT] [I] ONNX IR version: 0.0.8
[01/10/2025-05:45:04] [TRT] [I] Opset version: 17
[01/10/2025-05:45:04] [TRT] [I] Producer name: pytorch
[01/10/2025-05:45:04] [TRT] [I] Producer version: 2.4.1
[01/10/2025-05:45:04] [TRT] [I] Domain:
[01/10/2025-05:45:04] [TRT] [I] Model version: 0
[01/10/2025-05:45:04] [TRT] [I] Doc string:
[01/10/2025-05:45:04] [TRT] [I] ----------------------------------------------------------------
[01/10/2025-05:45:04] [TRT] [W] ModelImporter.cpp:459: Make sure input input_ids has Int64 binding.
[01/10/2025-05:45:04] [TRT] [W] ModelImporter.cpp:459: Make sure input host_kv_cache_pool_pointers has Int64 binding.
[01/10/2025-05:45:04] [TRT] [W] ModelImporter.cpp:459: Make sure input host_runtime_perf_knobs has Int64 binding.
[01/10/2025-05:45:04] [TRT] [W] ModelImporter.cpp:459: Make sure input host_context_progress has Int64 binding.
[01/10/2025-05:45:12] [TRT] [I] No checker registered for op: GPTAttention. Attempting to check as plugin.
[01/10/2025-05:45:12] [TRT] [I] No importer registered for op: GPTAttention. Attempting to import as plugin.
[01/10/2025-05:45:12] [TRT] [I] Searching for plugin: GPTAttention, plugin_version: 1, plugin_namespace: tensorrt_llm
[TensorRT-LLM][ERROR] std::bad_optional_access: bad optional access
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: in_flight_batching
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: cp_group
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: skip_attn
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: qk_rope_head_dim
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: kv_lora_rank
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: q_lora_rank
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: is_mla_enabled
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: spec_decoding_max_generation_length
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: qk_nope_head_dim
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: spec_decoding_is_generation_length_variable
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: is_spec_decoding_enabled
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: use_fp8_context_fmha
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: max_distance
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: use_cache
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: do_cross_attention
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: use_paged_context_fmha
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: rotary_embedding_scale_type
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: rotary_embedding_max_positions
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: rotary_embedding_original_max_positions
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: rotary_embedding_base
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: position_embedding_type
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: attn_logit_softcapping_scale
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: cp_size
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: unidirectional
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: head_size
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: qkv_bias_enabled
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: dense_context_fmha
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: type_id
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: rotary_embedding_dim
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: has_full_attention_mask
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: num_heads
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: rotary_embedding_short_m_scale
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: rotary_embedding_long_m_scale
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: rotary_embedding_scale
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: mask_type
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: cp_rank
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: tp_rank
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: tp_size
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: kv_cache_quant_mode
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: v_head_dim
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: unfuse_qkv_gemm
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: num_kv_heads
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: context_fmha_type
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: pos_shift_enabled
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: paged_kv_cache
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: vision_length
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: max_context_length
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: vision_start
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: q_scaling
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: remove_input_padding
[TensorRT-LLM][ERROR] Parameter check failed at: /home/jenkins/agent/workspace/LLM/main/L0_Test-x86_64/tensorrt_llm/cpp/tensorrt_llm/plugins/common/plugin.cpp::~PluginFieldParser::349, condition: unused plugin field with name: tokens_per_block
[01/10/2025-05:45:12] [TRT] [E] ModelImporter.cpp:963: While parsing node number 66 [GPTAttention -> "/model/layers.0/self_attn/Reshape_7_output_0"]:
[01/10/2025-05:45:12] [TRT] [E] ModelImporter.cpp:964: --- Begin node ---
input: "/model/layers.0/self_attn/_plugin_input"
input: "sequence_length"
input: "host_past_key_value_lengths"
input: "host_max_attention_window_sizes"
input: "host_sink_token_length"
input: "context_lengths"
input: "cache_indirection"
input: "host_request_types"
input: "kv_cache_block_offsets"
input: "host_kv_cache_block_offsets"
input: "host_kv_cache_pool_pointers"
input: "host_kv_cache_pool_mapping"
input: "rotary_inv_freq"
input: "rotary_cos_sin"
input: "host_runtime_perf_knobs"
input: "host_context_progress"
output: "/model/layers.0/self_attn/Reshape_7_output_0"
name: "/model/layers.0/self_attn/_GPTAttention"
op_type: "GPTAttention"
...


On an unrelated note, we also noticed that GPTAttentionPluginCreatorCommon defines its plugin fields with unexpected lengths, such as -1, 0, and occasionally float lengths (0.0, 1.0), contrary to the documentation for the PluginField struct. We're aware that the length attribute is not used directly by GPTAttentionPlugin, but we're left wondering whether there is any specific meaning or intent behind these lengths?
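
For context, the documented semantics of PluginField::length are "the number of entries that data points to", which is why values like -1 or 0.0 carry no obvious meaning when data is nullptr. A small sketch (ours, for illustration):

```cpp
// Per the TensorRT documentation, PluginField::length is the number of
// entries that `data` points to; it is not a default value. A conventional
// declaration versus the confusing ones described above:
#include <NvInferRuntime.h>

using nvinfer1::PluginField;
using nvinfer1::PluginFieldType;

// Conventional: no data yet, so length is simply 1 (one expected value).
PluginField const conventional{"num_heads", nullptr, PluginFieldType::kINT32, 1};

// Compiles, but misleading: -1 has no defined meaning, and a float such as
// 0.0f only gets here by silent conversion to the int32_t length parameter.
PluginField const misleadingA{"tp_size", nullptr, PluginFieldType::kINT32, -1};
PluginField const misleadingB{"q_scaling", nullptr, PluginFieldType::kFLOAT32, static_cast<int32_t>(0.0f)};
```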

Thanks in advance!

@nv-guomingz
Collaborator

Hi @MartinMarciniszyn, could you please take a look at this question?

@MartinMarciniszyn
Collaborator

@yuxianq will look into this.

@yuxianq
Collaborator

yuxianq commented Jan 22, 2025

@idantene I can help fix it. Could you provide your test case so I can reproduce it and validate my fix locally?
As for the PluginField issue, those length attributes are unused since the data attribute is nullptr. Some developers may have mistakenly treated the length attribute as a default value. I will fix it.

@jl749
Contributor

jl749 commented Jan 23, 2025

@yuxianq
I have added a PR with the fix. Could you please review it?

@yuxianq
Collaborator

yuxianq commented Jan 23, 2025

@jl749 Thanks for your contribution. LGTM, I will test it in our CI.

@idantene
Author

Thank you for the explanation regarding the length attributes @yuxianq.
My colleague @jl749 has attached a test case in the PR with the patch fixing this issue.

@yuxianq
Collaborator

yuxianq commented Feb 14, 2025

Thanks for your contribution to TRT-LLM. We've merged your changes into our internal branch for the upcoming weekly release. You will see the change in the main branch next week.
