Issue search results · repo:Vahe1994/AQLM language:Python

107 results

Why does peak memory decrease? After quantization, during the forward computation, the values are still restored (to fp16) for matrix calculations. So why does peak memory decrease?
  • maoshanwen
  • 2
  • Opened 8 days ago
  • #170
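
The usual explanation is that the AQLM weights stay stored as compressed codes, and only a small slice (at most one layer's weight matrix, and far less inside the fused kernel) is dequantized to fp16 at any moment, so the full fp16 model never exists in memory at once. A rough back-of-the-envelope sketch, with purely illustrative sizes (the 7B parameter count, 2-bit rate, and layer shape below are assumptions, not measurements from this issue):

params = 7e9                       # assumed model size: ~7B parameters
fp16_model_gb = params * 2 / 1e9   # full fp16 weights would need ~14 GB

bits_per_weight = 2                # assumed AQLM rate (e.g. 16-bit codes over groups of 8)
aqlm_model_gb = params * bits_per_weight / 8 / 1e9   # stored codes: ~1.75 GB

# Only a transient fp16 buffer for the matrix currently being multiplied exists,
# not the whole model, so peak memory is roughly codes + one matrix in fp16.
largest_matrix = 4096 * 14336                         # assumed largest weight matrix
transient_fp16_gb = largest_matrix * 2 / 1e9          # ~0.12 GB

print(f"fp16 model: {fp16_model_gb:.1f} GB, "
      f"AQLM peak: ~{aqlm_model_gb + transient_fp16_gb:.2f} GB")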

Can you please provide the configurations for quantizing Gemma-2B? I only got a PPL of about 20 when using the default configurations.
stale
  • shuangyichen
  • 1
  • Opened on Feb 24
  • #168

https://github.com/werruww/run-Qwen2-72B-Instruct-on-16gb-vram/blob/main/succ_Qwen2_72B_AQLM.ipynb
stale
  • werruww
  • 4
  • Opened on Feb 16
  • #166

https://github.com/werruww/run-Qwen2-72B-Instruct-on-16gb-vram/blob/main/suc_Qwen2_72B_Instruct_AQLM_PV_1bit_1x16%20(2).ipynb It worked but the results were disastrous and the answers were very bad and ...
stale
  • werruww
  • 3
  • Opened on Feb 16
  • #165

Hello, I am not sure if you have a CPU function that is equivalent to the GPU kernel Code1x16MatVec. This would help understand the kernel. Thanks.
stale
  • jinz2014
  • 2
  • Opened on Jan 16
  • #164
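
A pure-NumPy reference that mirrors what a 1x16 mat-vec has to do (look each 16-bit code up in a codebook of weight groups, lay the groups out as a dense row, then take the dot product) may help when reading the CUDA kernel. The layout below (group size 8, per-output-channel scales) is an assumption about a typical 1x16 configuration, not a drop-in replica of Code1x16MatVec:

import numpy as np

def code1x16_matvec_reference(codes, codebook, scales, x):
    # codes:    (out_features, in_features // group) uint16 indices into the codebook
    # codebook: (2**16, group) weight groups
    # scales:   (out_features,) per-output-channel scales (assumed placement)
    # x:        (in_features,) input vector
    out_features, n_groups = codes.shape
    group = codebook.shape[1]
    weight = codebook[codes].reshape(out_features, n_groups * group)  # dequantize
    return (weight @ x) * scales

rng = np.random.default_rng(0)
codebook = rng.standard_normal((2**16, 8)).astype(np.float32)
codes = rng.integers(0, 2**16, size=(16, 4), dtype=np.uint16)
scales = rng.standard_normal(16).astype(np.float32)
x = rng.standard_normal(4 * 8).astype(np.float32)
print(code1x16_matvec_reference(codes, codebook, scales, x).shape)   # (16,)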

I am referring to checkpoint: https://huggingface.co/ISTA-DASLab/Meta-Llama-3-8B-AQLM-PV-2Bit-1x16, which gets 6.99 perplexity, referred to as 2-bit quantization in the PV tuning paper. Llama3-8B has ...
stale
  • usamec
  • 2
  • Opened on Dec 29, 2024
  • #163
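
As a side note, the "2-bit" in the checkpoint name is usually the nominal code rate: 16 code bits shared over a group of 8 weights is exactly 2 bits per weight, while the per-layer codebook and scales add overhead on top, which is why the effective bits-per-parameter of the whole model comes out higher. A small worked example (the group size, layer shape, and fp16 storage of codebook and scales are assumptions):

nbits_per_codebook = 16
in_group_size = 8                                   # assumed group size
code_bits = nbits_per_codebook / in_group_size      # 2.0 bits/weight from the codes

out_features, in_features = 4096, 4096              # assumed layer shape
codebook_bits = (2**16) * in_group_size * 16        # one fp16 codebook for the layer
scale_bits = out_features * 16                      # per-channel fp16 scales
overhead = (codebook_bits + scale_bits) / (out_features * in_features)

print(f"codes: {code_bits:.2f} b/w, overhead: {overhead:.3f} b/w, "
      f"total: {code_bits + overhead:.3f} b/w")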

@Vahe1994 @galqiwi @BlackSamorez @justheuristic Hello, thank you for the awesome work and for actively engaging in answering the issues. I have two major questions, which are as follows: 1. According to ...
stale
  • jusjinuk
  • 4
  • Opened on Dec 27, 2024
  • #162

Hello! When I run the benchmark file matmul_benchmark.py, it produces an error at line 105: matmul = CUDA_KERNEL.code1x16_matmat if args.nbits_per_codebook == 16 else CUDA_KERNEL.code2x8_matmat ...
  • KellyGong
  • 4
  • Opened on Dec 25, 2024
  • #161
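
For context, the quoted line only chooses which fused kernel the benchmark calls based on --nbits_per_codebook. A guarded variant of that dispatch (a sketch, not the code actually in matmul_benchmark.py) makes the supported layouts explicit:

if args.nbits_per_codebook == 16:
    matmul = CUDA_KERNEL.code1x16_matmat          # 1 codebook with 16-bit codes
elif args.nbits_per_codebook == 8:
    matmul = CUDA_KERNEL.code2x8_matmat           # 2 codebooks with 8-bit codes
else:
    raise ValueError(
        f"No fused kernel for nbits_per_codebook={args.nbits_per_codebook}; "
        "only the 1x16 and 2x8 layouts are compiled."
    )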

from transformers import pipeline import os # Set environment variable for PyTorch memory management os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True" messages = [ { "role": "user", "content ...
stale
  • kim90000
  • 5
  • Opened on Dec 21, 2024
  • #160

from transformers import pipeline messages = [ { "role": "user", "content": "Who are you?" }, ] pipe = pipeline( "text-generation", model="ISTA-DASLab/Qwen2-72B-AQLM-PV-1bit-1x16", trust_remote_code=True, device_map= ...
stale
  • werruww
  • 5
  • Opened on Dec 19, 2024
  • #159
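
Issues #159 and #160 contain essentially the same snippet with the quoting stripped by the search rendering; reassembled into runnable form (the device_map value and generation length are assumptions filling the truncated parts) it looks roughly like:

import os

# Ask PyTorch's CUDA allocator to use expandable segments to reduce fragmentation.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"

from transformers import pipeline

messages = [
    {"role": "user", "content": "Who are you?"},
]

pipe = pipeline(
    "text-generation",
    model="ISTA-DASLab/Qwen2-72B-AQLM-PV-1bit-1x16",
    trust_remote_code=True,
    device_map="auto",                 # assumed; the excerpt is cut off here
)
print(pipe(messages, max_new_tokens=64))   # assumed generation length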