Commit ecb3aee ("olive example")
1 parent 8ba5054

6 files changed: +1660, −0 lines changed

`.gitignore` (+4 lines)

```diff
@@ -397,3 +397,7 @@ FodyWeavers.xsd
 
 # JetBrains Rider
 *.sln.iml
+
+# Olive folders
+cache
+models
```
New file (+61 lines):

# Fine-tune Phi3 using Olive

In this example you'll use Olive to:

1. Fine-tune a LoRA adapter to classify phrases into Sad, Joy, Fear, Surprise.
1. Merge the adapter weights into the base model.
1. Optimize and quantize the model into `int4`.

We'll also show you how to run inference on the fine-tuned model using the ONNX Runtime (ORT) Generate API.

> **⚠️ For fine-tuning, you'll need a suitable GPU available, for example an A10, V100, or A100.**

## 💾 Install

Create a new Python virtual environment (for example, using `conda`):

```bash
conda create -n olive-ai python=3.11
conda activate olive-ai
```

Next, install Olive and the dependencies for a fine-tuning workflow:

```bash
cd Phi-3CookBook/code/04.Finetuning/olive-ort-example
pip install olive-ai[gpu]
pip install -r requirements.txt
```

## 🧪 Fine-tune Phi3 using Olive

The [Olive configuration file](./phrase-classification.json) contains a *workflow* with the following *passes*:

Phi3 -> LoRA -> MergeAdapterWeights -> ModelBuilder

At a high level, this workflow will:

1. Fine-tune Phi3 (for 150 steps, which you can modify) using the [dataset/dataset-classification.json](./dataset/dataset-classification.json) data.
1. Merge the LoRA adapter weights into the base model. This will give you a single model artifact in the ONNX format.
1. Model Builder will optimize the model for the ONNX Runtime *and* quantize the model into `int4`.
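As an illustration, the three passes map to entries in the configuration file roughly like this. This is an abridged, hypothetical sketch: the pass names, fields, and values shown are illustrative, and the exact schema depends on your Olive version, so treat `phrase-classification.json` in this repo as the source of truth.

```json
{
  "input_model": {
    "type": "HfModel",
    "model_path": "microsoft/Phi-3-mini-4k-instruct"
  },
  "passes": {
    "lora": { "type": "LoRA", "training_args": { "max_steps": 150 } },
    "merge": { "type": "MergeAdapterWeights" },
    "mb": { "type": "ModelBuilder", "precision": "int4" }
  }
}
```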
To execute the workflow, run:

```bash
olive run --config phrase-classification.json
```

When Olive has completed, your optimized `int4` fine-tuned Phi3 model is available in: `code/04.Finetuning/olive-ort-example/models/lora-merge-mb/gpu-cuda_model`.

## 🧑‍💻 Integrate fine-tuned Phi3 into your application

To run the app:

```bash
python app/app.py --phrase "cricket is a wonderful sport!" --model-path models/lora-merge-mb/gpu-cuda_model
```

The response should be a single-word classification of the phrase (Sad/Joy/Fear/Surprise).
app/app.py (new file, +28 lines):

```python
import argparse

import onnxruntime_genai as og

# Parse the phrase to classify and the path to the optimized model
parser = argparse.ArgumentParser()
parser.add_argument("--phrase", type=str)
parser.add_argument("--model-path", type=str)
args = parser.parse_args()

# Wrap the phrase in the Phi-3 chat template
prompt = f"<|user|>\n{args.phrase}<|end|>\n<|assistant|>\n"

model = og.Model(f'{args.model_path}')
tokenizer = og.Tokenizer(model)
tokens = tokenizer.encode(prompt)

params = og.GeneratorParams(model)
params.set_search_options(max_length=100)
params.input_ids = tokens

generator = og.Generator(model, params)
tokenizer_stream = tokenizer.create_stream()

# Stream the generated tokens to stdout until generation completes
while not generator.is_done():
    generator.compute_logits()
    generator.generate_next_token()
    print(tokenizer_stream.decode(generator.get_next_tokens()[0]), end='', flush=True)
```
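The prompt string in `app.py` follows the Phi-3 chat template, and the fine-tuned model is expected to reply with a single emotion word. If you reuse this pattern elsewhere, the template construction and a tolerant parse of the reply can be factored into small helpers. This is a minimal sketch; the helper names are our own, not part of the app, and `parse_label` assumes a non-empty reply.

```python
def build_prompt(phrase: str) -> str:
    # Phi-3 chat template: user turn, end-of-turn marker, then the assistant turn
    return f"<|user|>\n{phrase}<|end|>\n<|assistant|>\n"

def parse_label(raw: str) -> str:
    # Normalize the model's reply to one capitalized word, e.g. " joy!\n" -> "Joy"
    word = raw.strip().split()[0].strip(".,!?")
    return word.capitalize()

print(build_prompt("cricket is a wonderful sport!"))
print(parse_label(" joy!\n"))  # Joy
```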
