
Set torch default dtype in a context manager #971


Merged: 1 commit, Sep 7, 2023

Conversation

@Yard1 (Collaborator) commented Sep 6, 2023

Modifying the global state is unexpected and can lead to issues later (e.g. with precision, or when testing).

@Yard1 (Collaborator, Author) commented Sep 6, 2023

cc @zhuohan123 @WoosukKwon, let me know if this makes sense!

@WoosukKwon WoosukKwon self-requested a review September 7, 2023 06:19
@WoosukKwon (Collaborator) left a comment


LGTM! I didn't notice this problem. Thanks for the fix.

@WoosukKwon WoosukKwon merged commit 005ba45 into vllm-project:main Sep 7, 2023
wanmok pushed a commit to wanmok/vllm that referenced this pull request Sep 7, 2023
@Yard1 Yard1 deleted the set_default_dtype_context branch September 7, 2023 23:00
@chu-tianxiang (Contributor) commented:

This commit breaks PagedAttentionWithALiBi: the bias created in set_attn_bias is initialized with the wrong dtype, causing "RuntimeError: invalid dtype for bias - should match query's dtype".
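The failure mode described above is the flip side of the same pattern: a tensor created outside the dtype context (or after it exits) picks up the restored default dtype and no longer matches tensors created inside it. The sketch below illustrates this with hypothetical stand-ins (a plain variable for the default dtype, a `make_tensor` helper that records it); it is not vLLM's actual set_attn_bias code.

```python
import contextlib

_default_dtype = "float32"  # stand-in for torch's global default dtype

@contextlib.contextmanager
def set_default_dtype(dtype):
    """Temporarily override the default dtype, restoring it on exit."""
    global _default_dtype
    old, _default_dtype = _default_dtype, dtype
    try:
        yield
    finally:
        _default_dtype = old

def make_tensor():
    """Hypothetical tensor factory: records the current default dtype."""
    return {"dtype": _default_dtype}

with set_default_dtype("float16"):
    query = make_tensor()   # created while the default is float16

bias = make_tensor()        # created after exit: default is float32 again

# Mismatch analogous to "invalid dtype for bias - should match query's dtype"
assert query["dtype"] != bias["dtype"]
```

The fix is to create the bias either inside the same dtype context as the query, or explicitly with the query's dtype rather than the ambient default.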

liuyanyi pushed a commit to liuyanyi/vllm that referenced this pull request Sep 12, 2023
hongxiayang pushed a commit to hongxiayang/vllm that referenced this pull request Feb 13, 2024
pi314ever pushed a commit to pi314ever/vllm that referenced this pull request Apr 3, 2025
vLLM Llama3.1-70B FP8 2K/2K throughput measurements show good improvement with block size 256, hence this is added as an option to the argument list.
3 participants