Here you can find text generation examples for large language models (LLMs).

Note: New Llama models such as Llama3.2-1B, Llama3.2-3B, and Llama3.3-70B are supported starting from release v2.7.10+xpu.

These scripts:
- Include inference, fine-tuning (LoRA), and bitsandbytes (QLoRA fine-tuning) use cases.
- Include both single-instance and distributed (DeepSpeed) use cases for FP16 optimization.
- Support the Llama, GPT-J, Qwen, OPT, and Bloom model families, plus other models such as Baichuan2-13B and Phi3-mini.
- Cover text generation inference in low-precision modes (FP16 AMP and weight-only quantization) for the best performance and accuracy.
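After completing any of the environment setups below, a quick sanity check (an optional step, not part of the official instructions) confirms that PyTorch can see the XPU device:

# Optional: verify the stack is functional once an environment is active
python -c "import torch; import intel_extension_for_pytorch as ipex; print(ipex.__version__, torch.xpu.is_available())"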
# Get the Intel® Extension for PyTorch* source code
git clone https://github.com/intel/intel-extension-for-pytorch.git
cd intel-extension-for-pytorch
git checkout release/xpu/2.7.10
git submodule sync
git submodule update --init --recursive
# Build an image with the provided Dockerfile by installing Intel® Extension for PyTorch* with prebuilt wheels
docker build -f examples/gpu/llm/Dockerfile -t ipex-llm:xpu .
# Run the container with the command below
docker run -it --rm --privileged -v /dev/dri/by-path:/dev/dri/by-path ipex-llm:xpu bash
# When the command prompt shows inside the docker container, enter the llm examples directory
cd llm
# Activate environment variables
source ./tools/env_activate.sh [inference|fine-tuning|bitsandbytes]
# on Windows, use env_activate.bat instead
call .\tools\env_activate.bat [inference|fine-tuning|bitsandbytes]
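The bracketed argument selects exactly one mode per activation. For example, to prepare the environment for the inference scripts:

source ./tools/env_activate.sh inference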
For the conda-based setup below, make sure the GPU driver packages are installed first. Refer to the Installation Guide.
# Get the Intel® Extension for PyTorch* source code
git clone https://github.com/intel/intel-extension-for-pytorch.git
cd intel-extension-for-pytorch
git checkout release/xpu/2.7.10
git submodule sync
git submodule update --init --recursive
# Make sure GCC >= 11 is installed on your system.
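# Optional: confirm the compiler version before proceeding
gcc --version | head -n1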
# Create a conda environment
conda create -n llm python=3.10 -y
conda activate llm
# Setup the environment with the provided script
cd examples/gpu/llm
# If you want to install Intel® Extension for PyTorch* with prebuilt wheels, use the commands below:
python ./tools/env_setup.py --setup --deploy
conda deactivate
conda activate llm
source ./tools/env_activate.sh [inference|fine-tuning|bitsandbytes]
# on Windows, use env_activate.bat instead
call .\tools\env_activate.bat [inference|fine-tuning|bitsandbytes]
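To confirm the wheels landed in the conda environment, one quick option (optional, not an official step) is to list the installed packages; matching on 'torch' also catches intel-extension-for-pytorch:

pip list | grep -i torch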
# Get the Intel® Extension for PyTorch* source code
git clone https://github.com/intel/intel-extension-for-pytorch.git
cd intel-extension-for-pytorch
git checkout xpu-main
git submodule sync
git submodule update --init --recursive
# Build an image with the provided Dockerfile by compiling Intel® Extension for PyTorch* from source
docker build -f examples/gpu/llm/Dockerfile --build-arg COMPILE=ON -t ipex-llm:xpu .
# Run the container with the command below
docker run -it --rm --privileged -v /dev/dri/by-path:/dev/dri/by-path ipex-llm:xpu bash
# When the command prompt shows inside the docker container, enter the llm examples directory
cd llm
# Activate environment variables
source ./tools/env_activate.sh [inference|fine-tuning|bitsandbytes]
# on Windows, use env_activate.bat instead
call .\tools\env_activate.bat [inference|fine-tuning|bitsandbytes]
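Since the container maps the host GPU through /dev/dri/by-path, an optional check from inside the container (an addition, not part of the official steps) shows whether the device nodes came through:

ls /dev/dri/by-path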
For the conda-based setup below, make sure the GPU driver and the Intel® oneAPI Base Toolkit are installed. Refer to the Installation Guide and select the package labeled 'source' for the build-from-source instructions.
# Get the Intel® Extension for PyTorch* source code
git clone https://github.com/intel/intel-extension-for-pytorch.git
cd intel-extension-for-pytorch
git checkout xpu-main
git submodule sync
git submodule update --init --recursive
# Make sure GCC >= 11 is installed on your system.
# Create a conda environment
conda create -n llm python=3.10 -y
conda activate llm
# Setup the environment with the provided script
cd examples/gpu/llm
# If you want to install Intel® Extension for PyTorch* from source, use the commands below:
# e.g. python ./tools/env_setup.py --setup --install-pytorch compile --aot pvc --oneapi-root-dir /opt/intel/oneapi --deploy
python ./tools/env_setup.py --setup --install-pytorch compile --aot <AOT> --oneapi-root-dir <ONEAPI_ROOT_DIR> --deploy
conda deactivate
conda activate llm
source ./tools/env_activate.sh [inference|fine-tuning|bitsandbytes]
# on Windows, use env_activate.bat instead
call .\tools\env_activate.bat [inference|fine-tuning|bitsandbytes]
where <AOT> is a text string enabling Ahead-Of-Time compilation for specific GPU models, for example 'pvc,ats-m150' for the Intel® Data Center GPU Max Series, Intel® Data Center GPU Flex Series, and Intel® Arc™ A-Series Graphics (A770), and <ONEAPI_ROOT_DIR> is the oneAPI installation root (e.g. /opt/intel/oneapi, as in the example above). Check the tutorial for details.
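If you are unsure which GPU models are present, sycl-ls from the oneAPI Base Toolkit (run with the oneAPI environment sourced) lists the available devices, which can help pick an AOT target string:

sycl-ls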
Inference and fine-tuning are supported in individual directories; a sketch of a typical invocation follows the list below.
- For inference example scripts, visit the inference directory.
- For fine-tuning example scripts, visit the fine-tuning directory.
- For fine-tuning with a quantized model, visit the bitsandbytes directory.
- For training example scripts, visit the training directory.
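As a hedged sketch of what an inference run might look like once the environment is active (the script name and flags here are assumptions; consult the inference directory for the actual entry points and options):

# Hypothetical example only: verify the script name and flags in the inference directory
cd inference
python run_generation.py -m meta-llama/Llama-2-7b-hf --dtype float16 --max-new-tokens 32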