
RoPE in float32 precision #870

Closed

Conversation

imoneoi (Contributor) commented Aug 25, 2023

Fixes #863
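
For readers skimming the thread, here is a minimal PyTorch sketch of what "RoPE in float32 precision" means in practice (illustrative only, not the actual vLLM kernel from this PR; the function names and the `rotate_half` layout are assumptions): upcast the query/key and the cos/sin values to float32, apply the rotation, then cast back to the model dtype.

```python
# Illustrative sketch only -- not the vLLM kernel code from this PR.
import torch

def rotate_half(x: torch.Tensor) -> torch.Tensor:
    # Split the last dimension in half and rotate: (x1, x2) -> (-x2, x1).
    x1, x2 = x.chunk(2, dim=-1)
    return torch.cat((-x2, x1), dim=-1)

def apply_rope_fp32(q, k, cos, sin):
    orig_dtype = q.dtype
    # Upcast to float32 before applying the rotation.
    q32, k32 = q.float(), k.float()
    cos32, sin32 = cos.float(), sin.float()
    q_rot = q32 * cos32 + rotate_half(q32) * sin32
    k_rot = k32 * cos32 + rotate_half(k32) * sin32
    # Cast back to the model dtype (e.g. float16) for the rest of attention.
    return q_rot.to(orig_dtype), k_rot.to(orig_dtype)
```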

esmeetu (Member) commented Sep 5, 2023

@imoneoi Many thanks for this PR.

@WoosukKwon I hope you can consider this PR; it would truly support CodeLlama with the float16 dtype, and the CodeLlama family in general. It would also resolve many issues related to CodeLlama producing gibberish output.

esmeetu (Member) commented Sep 5, 2023

These changes will indeed affect other models' performance. I think it's better to select the kernel dynamically based on the 'base' value: if the model's RoPE base is larger than 10000, use this PR's strategy; otherwise, keep the current behavior.
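
A rough sketch of that dispatch idea (hypothetical function names: `apply_rope_fp32` is the float32 sketch above, and `apply_rope_native_dtype` stands in for the existing half-precision path):

```python
# Hypothetical dispatch: use the float32 path only for models with a large
# RoPE base (CodeLlama uses base=1e6, while the Llama default is 1e4).
def apply_rotary_embedding(q, k, cos, sin, rope_base: float):
    if rope_base > 10000:
        # Large base: cos/sin values need the extra precision (this PR's path).
        return apply_rope_fp32(q, k, cos, sin)
    # Default base: keep the existing native-dtype path unchanged.
    return apply_rope_native_dtype(q, k, cos, sin)
```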

WoosukKwon (Collaborator) commented Sep 10, 2023

Hi @imoneoi, I modified and closed the PR by mistake. I'm really sorry. Because the PR is closed, I don't have permission to revert the changes to your repo. Your original PR can be found here.

I created a new PR, #1004, which uses FP32 for initializing RoPE but does not use FP32 in the kernel. I find this more aligned with HF's implementation. If the precision issue persists after that PR, we will revisit your PR and modify the kernel. Thanks again for your issue and PR. You will of course be credited as an author of #1004.
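
For reference, a minimal sketch of that approach (illustrative only, not the actual #1004 code): compute the cos/sin cache in float32 at initialization and only downcast the finished cache, so the kernel itself keeps running in the model dtype.

```python
# Sketch of building the RoPE cos/sin cache in float32, then casting the
# result to the model dtype so the attention kernel stays in fp16.
import torch

def build_rope_cache(head_dim: int, max_pos: int, base: float,
                     dtype: torch.dtype = torch.float16):
    # inv_freq and the position products are computed in float32; with a
    # large base (CodeLlama uses 1e6) half precision loses accuracy here.
    inv_freq = 1.0 / (base ** (torch.arange(0, head_dim, 2, dtype=torch.float32) / head_dim))
    positions = torch.arange(max_pos, dtype=torch.float32)
    freqs = torch.outer(positions, inv_freq)   # (max_pos, head_dim // 2)
    emb = torch.cat((freqs, freqs), dim=-1)
    # Only the finished cache is downcast; the kernel consumes it as-is.
    return emb.cos().to(dtype), emb.sin().to(dtype)
```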

imoneoi (Contributor, Author) commented Sep 16, 2023

@WoosukKwon No worries. I've re-opened the FP32 RoPE kernel PR as #1061. I'm also running an evaluation on CodeLlama; you might consider that PR if precision issues persist.

Successfully merging this pull request may close these issues.

RoPE should be applied with float32