-
Notifications
You must be signed in to change notification settings - Fork 1.5k
GPTAttentionPlugin missing declaration of fields #2685
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Hi @MartinMarciniszyn could u please take a look this question? |
@yuxianq will look into this. |
@idantene I can help to fix it. Could you provide your test case so I can reproduce it and validate my fix locally. |
@yuxianq |
@jl749 Thanks for your contribution. LGTM, I will test it in our CI. |
Thanks for your contribution to TRT-LLM. We've merged your changes into our internal branch for the upcoming weekly release. You will see the change in main branch next week. |
Hey,
We've been experimenting with the
GPTAttentionPlugin
and noticed that the plugin is missing some fields in its creator class.In the python side, this has little effect, as the fields are provided directly, but when loading from an onnx model, the field names are used to populate the fields from the node attributes.
More specifically, in
GPTAttentionPluginCreatorCommon
here and inGPTAttentionPluginCreator
here, we noticed several fields are missing:layer_idx
,use_logn_scaling
,layer_idx_in_cache_pool
,block_sparse_block_size
,block_sparse_homo_head_pattern
,block_sparse_num_local_blocks
andblock_sparse_vertical_stride
(there might be more, these were the obvious ones).As mentioned, in the onnx parser code, when loading the fields, the field names specified in the creator classes are used to load the attributes.
As a result, when we try to load an onnx model that has the
GPTAttentionPlugin
as a node, it fails on creating the plugin class, since those fields are missing, and theFieldParser
class then returns astd::nullopt
(here).Sample log trying to load an onnx file with GPTAttentionPlugin
On an unrelated note, we also noticed that the
GPTAttentionPluginCreatorCommon
defines the plugin fields with unexpected lengths, such as-1
,0
, and occasionally float lengths (0.0
,1.0
), contrary to the documentation for thePluginField
struct. We're aware thelength
attribute is not used directly forGPTAttentionPlugin
, but are left wondering if there's any specific meaning/intent behind these lengths?Thanks in advance!
The text was updated successfully, but these errors were encountered: