IPAD

IPAD iteratively prunes and distills a model to shrink its size.
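
To make the general idea concrete, here is a minimal toy sketch of an iterative prune-then-distill loop in plain Python. It is not the IPAD implementation and does not use this package's API: the linear "model", the magnitude-based pruning rule, and every function name and parameter below are assumptions chosen for illustration only.

```python
import random

def predict(weights, x):
    """Output of a toy linear model: dot product of weights and input."""
    return sum(w * xi for w, xi in zip(weights, x))

def prune_smallest(weights, mask, n_prune):
    """Zero out the n_prune smallest-magnitude weights that are still active."""
    active = [i for i, keep in enumerate(mask) if keep]
    for i in sorted(active, key=lambda j: abs(weights[j]))[:n_prune]:
        mask[i] = False
        weights[i] = 0.0

def distill(student, mask, teacher, inputs, lr=0.05, steps=200):
    """Fit the surviving student weights to the teacher's outputs (MSE loss)."""
    for _ in range(steps):
        grads = [0.0] * len(student)
        for x in inputs:
            err = predict(student, x) - predict(teacher, x)
            for i, xi in enumerate(x):
                if mask[i]:  # pruned weights receive no gradient
                    grads[i] += 2 * err * xi / len(inputs)
        for i in range(len(student)):
            student[i] -= lr * grads[i]

def iterative_prune_and_distill(teacher, inputs, rounds=3, prune_per_round=1):
    """Alternate pruning and distillation for a fixed number of rounds."""
    student = list(teacher)  # the student starts as a copy of the teacher
    mask = [True] * len(student)
    for _ in range(rounds):
        prune_smallest(student, mask, prune_per_round)
        distill(student, mask, teacher, inputs)
    return student, mask

# Hypothetical toy data: a 6-weight teacher and 32 random inputs.
random.seed(0)
teacher = [2.0, -0.1, 0.05, 1.5, -0.3, 0.8]
inputs = [[random.uniform(-1, 1) for _ in teacher] for _ in range(32)]
student, mask = iterative_prune_and_distill(teacher, inputs)
```

Pruning a little at a time and distilling after each step gives the remaining weights a chance to compensate for what was removed before the next round of pruning; the real models listed below would need a framework-level implementation rather than this list-based toy.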

News or Update 🔥

[2024/05] We release our code for IPAD.

Models we support

LLAMA
GLM
OPT

Introduction

Installation

Clone this repository and navigate to the ipad directory:

git clone https://github.com/alipay/PainlessInferenceAcceleration.git
cd PainlessInferenceAcceleration/ipad

Install the package:

python setup.py install

Quick Start

Examples can be found in the examples directory.

Citations

@inproceedings{10.1145/3589335.3648321,
  author    = {Wang, Maolin and Zhao, Yao and Liu, Jiajia and Chen, Jingdong and Zhuang, Chenyi and Gu, Jinjie and Guo, Ruocheng and Zhao, Xiangyu},
  title     = {Large Multimodal Model Compression via Iterative Efficient Pruning and Distillation},
  year      = {2024},
  isbn      = {9798400701726},
  publisher = {Association for Computing Machinery},
  doi       = {10.1145/3589335.3648321},
  booktitle = {Companion Proceedings of the ACM Web Conference 2024},
  pages     = {235–244},
  series    = {WWW '24}
}
