How to dequantize a specific layer from gguf into torch tensor? #12467
-
Hello, I am a beginner, and I appreciate your work. I would like to ask: if I want to dequantize a specific layer of a GGUF model and save it as a PyTorch tensor, how should I proceed? I tried step-by-step debugging of llama.cpp/examples/main/main.cpp, but it was difficult to achieve the desired result. Thank you!
-
Do you mean extract? Quantization is a lossy process because detail is lost, so it is not reversible.
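To see the lossiness for yourself, here is a minimal round-trip sketch, assuming gguf-py and NumPy are installed and that `gguf.quantize` / `gguf.dequantize` are available as in recent gguf-py releases (untested):

```python
import numpy as np
import gguf

# Random weights; the last dimension must be a multiple of the
# Q8_0 block size (32) for quantization to apply.
original = np.random.randn(4, 256).astype(np.float32)

# Round-trip through 8-bit quantization.
quantized = gguf.quantize(original, gguf.GGMLQuantizationType.Q8_0)
restored = gguf.dequantize(quantized, gguf.GGMLQuantizationType.Q8_0)

# The restored values are close but not identical: the lost detail
# cannot be recovered, only approximated.
print(np.max(np.abs(original - restored)))  # small but nonzero
```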
-
You can use the gguf-py library, which has dequantization support for most of the quantization types.

```sh
$ pip install gguf
```

(Assuming you also have PyTorch in your environment)

```python
import gguf
import torch

reader: gguf.GGUFReader = gguf.GGUFReader("path/to/model.gguf")

# Assuming you know which tensor it is
index: int = 3
tensor: gguf.ReaderTensor = reader.get_tensor(index)

torch_tensor = torch.from_numpy(gguf.dequantize(tensor.data, tensor.tensor_type))
```

I did not test this, but hopefully this is close enough to what is needed.

Hmm, I think this should be made more convenient eventually. Notably, searching a tensor by name could be made easier (ideally there should be a way to get a ReaderTensor by name).
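Until something like that lands, the name lookup can be done with a small scan over `reader.tensors`; a sketch building on the snippet above (untested; `get_tensor_by_name` is a hypothetical helper, not part of gguf-py, and the tensor name is an example):

```python
import gguf
import torch

def get_tensor_by_name(reader: gguf.GGUFReader, name: str) -> gguf.ReaderTensor:
    # Linear scan over all tensors in the file; fine for a one-off lookup.
    for tensor in reader.tensors:
        if tensor.name == name:
            return tensor
    raise KeyError(f"tensor {name!r} not found")

reader = gguf.GGUFReader("path/to/model.gguf")
tensor = get_tensor_by_name(reader, "blk.0.attn_q.weight")
torch_tensor = torch.from_numpy(gguf.dequantize(tensor.data, tensor.tensor_type))

# Save the dequantized layer for later use in PyTorch.
torch.save(torch_tensor, "blk.0.attn_q.weight.pt")
```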