
attn_bias not aligned & some questions regarding float16 #468


Closed
MM-IR opened this issue Jul 15, 2023 · 1 comment · Fixed by #834
Labels
bug Something isn't working

Comments


MM-IR commented Jul 15, 2023

Hi,

  1. When playing with the MPT-7B models, I frequently run into the "attn_bias not aligned" error when running with tensor_parallel_size = 2. How do I alleviate this issue? (A minimal repro sketch follows below.)

  2. Also, I noticed that your default model-loading scripts load the float16 weights. For a fair evaluation, is it necessary to switch to float32?
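
For context, here is roughly how I'm loading the model when the error shows up (a minimal sketch; the prompt and sampling settings are placeholders, not from my actual run):

```python
from vllm import LLM, SamplingParams

# Load MPT-7B sharded across two GPUs; this is the configuration
# that triggers "attn_bias not aligned" for me.
llm = LLM(
    model="mosaicml/mpt-7b",
    tensor_parallel_size=2,
    trust_remote_code=True,  # required for MPT's custom model code
)

sampling_params = SamplingParams(temperature=0.8, max_tokens=128)
outputs = llm.generate(["Hello, my name is"], sampling_params)
print(outputs[0].outputs[0].text)
```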

Thanks very much in advance!

@WoosukKwon
Collaborator

Hi @MM-IR, sorry for the very late response. I believe the bug you reported was fixed by #834.

For the second question, I think it's pretty common to use FP16 for evaluating models, because its impact on model accuracy is negligible. For example, the HF Open LLM Leaderboard does not support FP32; it only supports FP16, BF16, and 4/8-bit quantized formats.
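
If you do want to compare against full precision anyway, you can override the default with the `dtype` argument when constructing the engine (a minimal sketch; the model name is just an example):

```python
from vllm import LLM

# vLLM defaults to the checkpoint's dtype (FP16 for most MPT-7B weights);
# passing dtype="float32" forces full precision for an accuracy check.
llm_fp32 = LLM(
    model="mosaicml/mpt-7b",
    dtype="float32",
    trust_remote_code=True,
)
```

Expect roughly 2x the GPU memory footprint and lower throughput compared to FP16.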
