How to dequantize a specific layer from gguf into torch tensor? #12467
-
Hello, I am a beginner, and I appreciate your work. I would like to ask: if I want to dequantize a specific layer of a GGUF model and save it as a PyTorch tensor, how should I proceed? I tried step-by-step debugging of llama.cpp/examples/main/main.cpp, but it was difficult to achieve the desired result. Thank you!
-
Do you mean extract? Quantization is a lossy process because detail is lost, so it is not reversible.
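To see the lossiness for yourself, here is a minimal round-trip sketch, assuming gguf-py and NumPy are installed and that `gguf.quantize` / `gguf.dequantize` are available as in recent gguf-py releases (untested):

```python
import numpy as np
import gguf

# Random weights; the last dimension must be a multiple of the
# Q8_0 block size (32) for quantization to apply.
original = np.random.randn(4, 256).astype(np.float32)

# Round-trip through 8-bit quantization.
quantized = gguf.quantize(original, gguf.GGMLQuantizationType.Q8_0)
restored = gguf.dequantize(quantized, gguf.GGMLQuantizationType.Q8_0)

# The restored values are close but not identical: the lost detail
# cannot be recovered, only approximated.
print(np.max(np.abs(original - restored)))  # small but nonzero
```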
-
You can use the gguf-py library, which has dequantization support for most of the quantization types.

```sh
$ pip install gguf
```

(Assuming you also have PyTorch in your environment)

```python
import gguf
import torch

reader: gguf.GGUFReader = gguf.GGUFReader("path/to/model.gguf")

# Assuming you know which tensor it is
index: int = 3
tensor: gguf.ReaderTensor = reader.get_tensor(index)

torch_tensor = torch.from_numpy(gguf.dequantize(tensor.data, tensor.tensor_type))
```

I did not test this, but hopefully this is close enough to what is needed.

Hmm, I think this should be made more convenient eventually. Notably, searching a tensor by name could be made easier (ideally there should be a way to get a ReaderTensor by name).
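Until something like that lands, the name lookup can be done with a small scan over `reader.tensors`; a sketch building on the snippet above (untested; `get_tensor_by_name` is a hypothetical helper, not part of gguf-py, and the tensor name is an example):

```python
import gguf
import torch

def get_tensor_by_name(reader: gguf.GGUFReader, name: str) -> gguf.ReaderTensor:
    # Linear scan over all tensors in the file; fine for a one-off lookup.
    for tensor in reader.tensors:
        if tensor.name == name:
            return tensor
    raise KeyError(f"tensor {name!r} not found")

reader = gguf.GGUFReader("path/to/model.gguf")
tensor = get_tensor_by_name(reader, "blk.0.attn_q.weight")
torch_tensor = torch.from_numpy(gguf.dequantize(tensor.data, tensor.tensor_type))

# Save the dequantized layer for later use in PyTorch.
torch.save(torch_tensor, "blk.0.attn_q.weight.pt")
```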