
Issue while loading MPT-7B-8K-instruct #832


Closed
ramkrithik opened this issue Aug 22, 2023 · 1 comment · Fixed by #834

@ramkrithik

Newbie question:

I thought support for all MPT models was enabled by default, but when I try to load the 8K variant of MPT-7B, I run into the following error during instantiation:

ValueError: Invalid shape for attention bias: torch.Size([32, 10, 10]) (expected (1, 32, 10, 10))
  query.shape: torch.Size([1, 10, 32, 128])
  key.shape  : torch.Size([1, 10, 32, 128])
  value.shape: torch.Size([1, 10, 32, 128])

Can someone help or guide me with this issue?
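
For reference, this is the kind of instantiation that triggers it. This is a minimal sketch rather than the exact script from the report; the model id and the trust_remote_code flag are assumptions based on how MPT checkpoints are usually loaded in vLLM:

```python
from vllm import LLM, SamplingParams

# Hypothetical reproduction: load the 8K MPT variant (model id assumed).
# MPT checkpoints need trust_remote_code for their custom modeling code.
llm = LLM(model="mosaicml/mpt-7b-8k-instruct", trust_remote_code=True)

# A short prompt is enough to reach the attention-bias shape check at prefill.
outputs = llm.generate(["Hello, how are you?"], SamplingParams(max_tokens=16))
print(outputs[0].outputs[0].text)
```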

@WoosukKwon WoosukKwon added the bug Something isn't working label Aug 22, 2023
@WoosukKwon WoosukKwon self-assigned this Aug 22, 2023
@WoosukKwon
Collaborator

Hi @ramkrithik, thanks for reporting the bug. It seems the bug is due to the new release of xformers, which includes some breaking changes. I'm fixing this issue.
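
Not the actual fix from #834, just a sketch of the mismatch the error describes: the ALiBi bias is built per head as (num_heads, seq_len, seq_len), while the attention backend now expects a leading batch dimension, i.e. (1, num_heads, seq_len, seq_len). The function and tensor names below are illustrative, not vLLM internals:

```python
import torch

def expand_alibi_bias(bias: torch.Tensor) -> torch.Tensor:
    """Illustrative helper: add the batch dimension the newer bias check expects.

    bias: (num_heads, seq_len, seq_len), e.g. torch.Size([32, 10, 10])
    returns: (1, num_heads, seq_len, seq_len), e.g. torch.Size([1, 32, 10, 10])
    """
    if bias.dim() == 3:
        bias = bias.unsqueeze(0)
    return bias

alibi_bias = torch.zeros(32, 10, 10)
print(expand_alibi_bias(alibi_bias).shape)  # torch.Size([1, 32, 10, 10])
```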

yma11 pushed a commit to yma11/vllm that referenced this issue Mar 4, 2025
https://docs.python.org/3/library/gc.html#gc.set_threshold

Every X calls we see a gap of 100 ms to 1 s, depending on the benchmark, when the garbage collector runs; increasing the default of this multiplier fixes the issue.
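
For reference, gc.set_threshold is the standard-library knob referenced above. A minimal sketch of raising the generation-0 threshold follows; the 10x factor is only illustrative, not the value used in that commit:

```python
import gc

# Current collection thresholds; CPython's defaults are (700, 10, 10).
gen0, gen1, gen2 = gc.get_threshold()

# Raise the generation-0 threshold so collections trigger less often
# during steady-state inference. The 10x multiplier here is an example.
gc.set_threshold(gen0 * 10, gen1, gen2)
print(gc.get_threshold())
```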