
Issue while loading MPT-7B-8K-instruct #832


Closed
ramkrithik opened this issue Aug 22, 2023 · 1 comment · Fixed by #834

@ramkrithik

Newbie question:

I thought support for all MPT models was enabled by default, but when I try to load the 8K variant of MPT-7B, I run into the following error during instantiation:

ValueError: Invalid shape for attention bias: torch.Size([32, 10, 10]) (expected (1, 32, 10, 10))
  query.shape: torch.Size([1, 10, 32, 128])
  key.shape  : torch.Size([1, 10, 32, 128])
  value.shape: torch.Size([1, 10, 32, 128])

Can someone help or guide me with this issue?
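
For reference, this is the kind of instantiation that triggers it. This is a minimal sketch rather than the exact script from the report; the model id and the trust_remote_code flag are assumptions based on how MPT checkpoints are usually loaded in vLLM:

```python
from vllm import LLM, SamplingParams

# Hypothetical reproduction: load the 8K MPT variant (model id assumed).
# MPT checkpoints need trust_remote_code for their custom modeling code.
llm = LLM(model="mosaicml/mpt-7b-8k-instruct", trust_remote_code=True)

# A short prompt is enough to reach the attention-bias shape check at prefill.
outputs = llm.generate(["Hello, how are you?"], SamplingParams(max_tokens=16))
print(outputs[0].outputs[0].text)
```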

@WoosukKwon WoosukKwon added the bug Something isn't working label Aug 22, 2023
@WoosukKwon WoosukKwon self-assigned this Aug 22, 2023
@WoosukKwon
Collaborator

Hi @ramkrithik, thanks for reporting the bug. It seems the bug is due to the new release of xformers, which includes some breaking changes. I'm fixing this issue.
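
Not the actual fix from #834, just a sketch of the mismatch the error describes: the ALiBi bias is built per head as (num_heads, seq_len, seq_len), while the attention backend now expects a leading batch dimension, i.e. (1, num_heads, seq_len, seq_len). The function and tensor names below are illustrative, not vLLM internals:

```python
import torch

def expand_alibi_bias(bias: torch.Tensor) -> torch.Tensor:
    """Illustrative helper: add the batch dimension the newer bias check expects.

    bias: (num_heads, seq_len, seq_len), e.g. torch.Size([32, 10, 10])
    returns: (1, num_heads, seq_len, seq_len), e.g. torch.Size([1, 32, 10, 10])
    """
    if bias.dim() == 3:
        bias = bias.unsqueeze(0)
    return bias

alibi_bias = torch.zeros(32, 10, 10)
print(expand_alibi_bias(alibi_bias).shape)  # torch.Size([1, 32, 10, 10])
```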

yma11 pushed a commit to yma11/vllm that referenced this issue Mar 4, 2025
https://docs.python.org/3/library/gc.html#gc.set_threshold

Every X calls we see a gap of 100 ms to 1 s, depending on the benchmark, when the garbage collector runs; increasing the default of this multiplier fixes the issue.
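
For reference, gc.set_threshold is the standard-library knob referenced above. A minimal sketch of raising the generation-0 threshold follows; the 10x factor is only illustrative, not the value used in that commit:

```python
import gc

# Current collection thresholds; CPython's defaults are (700, 10, 10).
gen0, gen1, gen2 = gc.get_threshold()

# Raise the generation-0 threshold so collections trigger less often
# during steady-state inference. The 10x multiplier here is an example.
gc.set_threshold(gen0 * 10, gen1, gen2)
print(gc.get_threshold())
```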