[Feature]: Add support for attention score output #11365


Open
1 task done
WoutDeRijck opened this issue Dec 20, 2024 · 7 comments
Labels
feature request New feature or request unstale Received activity after being labelled stale

Comments

@WoutDeRijck

WoutDeRijck commented Dec 20, 2024

🚀 The feature, motivation and pitch

Problem

vLLM currently doesn't provide access to attention scores during inference, which are essential for model analysis and interpretability research. #11862

Feature Request

Add the ability to retrieve attention scores during model inference, similar to HuggingFace's output_attentions=True parameter.
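For reference, a minimal NumPy sketch of what "attention scores" denotes here: the softmax-normalised, scaled dot-product weights that each attention layer computes internally. This is purely illustrative (shapes, seed, and the `attention_scores` helper are made up for the example) and does not reflect vLLM internals:

```python
import numpy as np

def attention_scores(q, k):
    """Softmax-normalised scaled dot-product attention weights."""
    d = q.shape[-1]
    logits = q @ k.swapaxes(-2, -1) / np.sqrt(d)
    logits -= logits.max(axis=-1, keepdims=True)  # numerical stability
    w = np.exp(logits)
    return w / w.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
q = rng.standard_normal((4, 8))   # seq_len=4, head_dim=8
k = rng.standard_normal((4, 8))
scores = attention_scores(q, k)

# Row i tells you how much query token i attends to each key token.
print(scores.shape)                        # (4, 4)
print(np.allclose(scores.sum(-1), 1.0))    # True
```

These per-layer, per-head matrices are what HuggingFace returns in the `attentions` tuple when `output_attentions=True` is passed; the request is for vLLM to expose the equivalent.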

Motivation

  • Need to analyze token-level relationships in model outputs
  • Required for building visualization tools and debugging model behavior
  • Critical for research into attention mechanisms

Alternatives

No response

Additional context

No response

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
@WoutDeRijck WoutDeRijck added the feature request New feature or request label Dec 20, 2024
@Dineshkumar-Anandan-ZS0367

Are you asking about output_attentions=True or return_cross_attentions=True for getting coordinates?

Those are only provided by vision encoder-decoder models or cross-encoder models.

Which model are you using?

@WoutDeRijck
Author

WoutDeRijck commented Jan 9, 2025

I don't mean to get coordinates. I am using Llama-3.1-8b, let's say I want to extract data out of the input context, then I need the attention scores to be able to visualize where the model is looking. (Pure text-based, no vision)

These are of course also present in decoder-only models.
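Assuming per-layer attention matrices were exposed (as they are with HuggingFace's output_attentions=True), the kind of visualization described above could, for instance, average the last token's attention over heads to score each prompt token. A hypothetical sketch (the `input_attribution` helper and all shapes are illustrative, not part of any API):

```python
import numpy as np

def input_attribution(attn, n_input):
    """Given one layer's attention weights (heads, seq, seq), return the
    mean attention that the last token pays to each of the first
    n_input (prompt) tokens, renormalised over those tokens."""
    last_row = attn[:, -1, :n_input]      # (heads, n_input)
    w = last_row.mean(axis=0)             # average over heads
    return w / w.sum()

rng = np.random.default_rng(1)
raw = rng.random((8, 6, 6))               # 8 heads, seq_len 6
attn = raw / raw.sum(-1, keepdims=True)   # rows sum to 1, like softmax output
weights = input_attribution(attn, n_input=4)
print(weights.shape)                      # (4,)
```

The resulting per-prompt-token weights are what a heatmap overlay on the input text would plot.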

@Dineshkumar-Anandan-ZS0367

Apologies for my mistake. I have already integrated scores using the tensor logits. Thanks!

@WoutDeRijck
Author

I do not need the logits as well. I need the attention scores.

@HuiSiqi

HuiSiqi commented Jan 15, 2025

Any update of this? I also need to visualize the attention scores of decoder-based models.


This issue has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this issue should remain open. Thank you!

@github-actions github-actions bot added the stale Over 90 days of inactivity label Apr 16, 2025
@oneonlee

Any update on this?

@github-actions github-actions bot added unstale Received activity after being labelled stale and removed stale Over 90 days of inactivity labels Apr 22, 2025

4 participants