add lora finetuning + example #113

Open · wants to merge 13 commits into base: main
84 changes: 84 additions & 0 deletions training/lora/README.md
@@ -0,0 +1,84 @@
## Fine-tuning with DeeperSpeed
### Install dependencies

`mamba install -c conda-forge cudatoolkit-dev`

`export CUDA_HOME=$CONDA_PREFIX`

`pip install evaluate datasets peft transformers git+https://github.com/EleutherAI/DeeperSpeed.git`

`pip install 'transformers[sklearn]'`

#### Install bitsandbytes if loading in 8-bit
`pip install bitsandbytes`

### Get started

`cd training/lora`

## Examples
#### From HuggingFace dataset:
```
deepspeed --num_gpus=1 finetune.py \
--deepspeed example/config.json \
--model_name_or_path togethercomputer/RedPajama-INCITE-Base-3B-v1 \
--dataset_name imdb \
--do_train \
--do_eval \
--fp16 \
--overwrite_cache \
--evaluation_strategy="steps" \
--output_dir finetuned \
--num_train_epochs 1 \
--eval_steps 15 \
--gradient_accumulation_steps 1 \
--per_device_train_batch_size 4 \
--use_fast_tokenizer True \
--learning_rate 1e-5 \
--warmup_steps 10
```
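The command above runs `finetune.py`, which attaches LoRA adapters to the base model via the `peft` library. As a conceptual illustration only (the real adapter wiring is handled inside `peft`, not by this code), LoRA freezes each base weight matrix `W` and learns a low-rank update `B @ A` scaled by `alpha / r`, so only `r * (d_in + d_out)` parameters are trained per layer instead of `d_in * d_out`:

```python
# Conceptual sketch of the LoRA update applied during fine-tuning.
# A frozen weight W (d_out x d_in) is augmented with a trainable low-rank
# update: W_eff = W + (alpha / r) * B @ A, where A is (r x d_in) and
# B is (d_out x r). Only A and B receive gradients.

def matmul(X, Y):
    """Multiply two matrices given as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def lora_effective_weight(W, A, B, alpha, r):
    """Return W + (alpha / r) * (B @ A) without modifying the frozen W."""
    scale = alpha / r
    BA = matmul(B, A)
    return [[w + scale * ba for w, ba in zip(w_row, ba_row)]
            for w_row, ba_row in zip(W, BA)]

# Tiny example: d_out = d_in = 2, rank r = 1.
W = [[1.0, 0.0], [0.0, 1.0]]   # frozen base weight
A = [[1.0, 2.0]]               # trainable, shape (1, 2)
B = [[0.5], [0.25]]            # trainable, shape (2, 1)
W_eff = lora_effective_weight(W, A, B, alpha=2, r=1)
print(W_eff)  # → [[2.0, 2.0], [0.5, 2.0]]
```

The savings grow with dimension: for a 2560-wide layer at rank 8, the adapter trains roughly 41k parameters instead of 6.5M.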
#### From train and validation files:
```
deepspeed --num_gpus=1 finetune.py \
--deepspeed example/config.json \
--model_name_or_path togethercomputer/RedPajama-INCITE-Base-3B-v1 \
--train_file train.csv \
--validation_file validation.csv \
--do_train \
--do_eval \
--fp16 \
--overwrite_cache \
--evaluation_strategy="steps" \
--output_dir finetuned \
--num_train_epochs 1 \
--eval_steps 15 \
--gradient_accumulation_steps 1 \
--per_device_train_batch_size 4 \
--use_fast_tokenizer True \
--learning_rate 1e-5 \
--warmup_steps 10
```

#### In 8-bit
**Change `fp16.enabled` to `false` in `example/config.json`.**
```
deepspeed --num_gpus=1 finetune.py \
--deepspeed example/config.json \
--model_name_or_path togethercomputer/RedPajama-INCITE-Base-3B-v1 \
--dataset_name imdb \
--do_train \
--do_eval \
--int8 \
--low_cpu_mem_usage \
--overwrite_cache \
--evaluation_strategy="steps" \
--output_dir finetuned \
--num_train_epochs 1 \
--eval_steps 15 \
--gradient_accumulation_steps 1 \
--per_device_train_batch_size 4 \
--use_fast_tokenizer True \
--learning_rate 1e-5 \
--warmup_steps 10 \
--no_cache
```
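The `--int8` flag loads the model weights through `bitsandbytes`. As a rough illustration of the core idea only (bitsandbytes actually uses a more sophisticated blockwise, mixed-precision LLM.int8 scheme), each weight tensor is scaled so its largest magnitude maps to 127, stored as one byte per weight, and dequantized on the fly:

```python
# Sketch of absmax int8 quantization, the basic idea behind 8-bit loading.
# Memory per weight drops from 4 bytes (fp32) or 2 bytes (fp16) to 1 byte,
# at the cost of a small rounding error.

def quantize_absmax(values):
    """Quantize a list of floats to int8 codes plus a scale factor."""
    absmax = max(abs(v) for v in values)
    scale = absmax / 127.0 if absmax else 1.0
    return [round(v / scale) for v in values], scale

def dequantize(codes, scale):
    """Recover approximate float values from int8 codes."""
    return [c * scale for c in codes]

weights = [0.4, -1.27, 0.0, 0.63]
codes, scale = quantize_absmax(weights)
approx = dequantize(codes, scale)
# codes fit in int8; each recovered value is within scale/2 of the original.
```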
39 changes: 39 additions & 0 deletions training/lora/example/config.json
@@ -0,0 +1,39 @@
{
  "train_batch_size": "auto",
  "fp16": {
    "enabled": true,
    "min_loss_scale": 1,
    "opt_level": "O2"
  },
  "zero_optimization": {
    "stage": 2,
    "offload_param": {
      "device": "cpu"
    },
    "offload_optimizer": {
      "device": "cpu"
    },
    "allgather_partitions": true,
    "allgather_bucket_size": 5e8,
    "contiguous_gradients": true
  },
  "optimizer": {
    "type": "AdamW",
    "params": {
      "lr": "auto",
      "betas": [
        0.9,
        0.999
      ],
      "eps": 1e-08
    }
  },
  "scheduler": {
    "type": "WarmupLR",
    "params": {
      "warmup_min_lr": 0,
      "warmup_max_lr": "auto",
      "warmup_num_steps": "auto"
    }
  }
}
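The `"auto"` entries in this config are resolved at launch time by the HuggingFace Trainer's DeepSpeed integration, which substitutes the corresponding command-line arguments. The real logic lives inside `transformers`; the snippet below is only a simplified illustration of that substitution, with hypothetical values taken from the example commands above:

```python
# Simplified illustration of how "auto" placeholders in a DeepSpeed config
# get filled in from the training arguments (not the actual transformers code).

def resolve_auto(node, values, path=""):
    """Recursively replace "auto" leaves using a dotted-path lookup table."""
    if isinstance(node, dict):
        return {k: resolve_auto(v, values, f"{path}.{k}" if path else k)
                for k, v in node.items()}
    if node == "auto":
        return values[path]
    return node

config = {
    "train_batch_size": "auto",
    "optimizer": {"params": {"lr": "auto", "eps": 1e-08}},
    "scheduler": {"params": {"warmup_num_steps": "auto"}},
}
resolved = resolve_auto(config, {
    "train_batch_size": 4,                    # per-device batch * world size
    "optimizer.params.lr": 1e-5,              # --learning_rate
    "scheduler.params.warmup_num_steps": 10,  # --warmup_steps
})
```

Keeping these values as `"auto"` avoids specifying the same hyperparameter twice and letting the two copies drift apart.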
250 changes: 250 additions & 0 deletions training/lora/example/finetuning.ipynb
@@ -0,0 +1,250 @@
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"provenance": [],
"gpuType": "T4",
"include_colab_link": true
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
},
"language_info": {
"name": "python"
},
"accelerator": "GPU",
"gpuClass": "standard"
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/github/orangetin/OpenChatKit/blob/peft/training/lora/example/finetuning.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"source": [
"# OpenChatKit - Fine-tuning"
],
"metadata": {
"id": "sLrKqm0BULlD"
}
},
{
"cell_type": "markdown",
"source": [
"### Check GPU availability"
],
"metadata": {
"id": "eZsgPnayURrc"
}
},
{
"cell_type": "code",
"source": [
"!nvidia-smi"
],
"metadata": {
"id": "qy_ENUlFgG4a"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"### Install conda"
],
"metadata": {
"id": "0gy7ssnoT_SI"
}
},
{
"cell_type": "code",
"source": [
"!wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh && chmod +x Miniconda3-latest-Linux-x86_64.sh && ./Miniconda3-latest-Linux-x86_64.sh -b -f -p /usr/local"
],
"metadata": {
"id": "11MMVFkAKtyg"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"### Set up the conda environment"
],
"metadata": {
"id": "CD7yF4rvT3Y8"
}
},
{
"cell_type": "code",
"source": [
"!conda install mamba -n base -c conda-forge -y"
],
"metadata": {
"id": "-W6PrOSILQoc"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"source": [
"!git clone https://github.com/orangetin/OpenChatKit.git --branch peft && cd OpenChatKit && mamba create -n OpenChatKit python=3.10.9 -y"
],
"metadata": {
"id": "hC8ob6kuLSn2"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"source": [
"!source activate OpenChatKit && mamba install pytorch torchvision torchaudio cudatoolkit-dev pytorch-cuda=11.6 -c pytorch -c nvidia -c conda-forge -y"
],
"metadata": {
"id": "waQdRff3Dee4"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"source": [
"!source activate OpenChatKit && export CUDA_HOME=$CONDA_PREFIX && pip install accelerate evaluate datasets peft chardet cchardet transformers git+https://github.com/EleutherAI/DeeperSpeed.git bitsandbytes && pip install 'transformers[sklearn]'"
],
"metadata": {
"id": "T_K3hXCVz7I1"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"### Download the dataset and convert JSONL to JSON"
],
"metadata": {
"id": "cVc_deb3O9q1"
}
},
{
"cell_type": "code",
"source": [
"!cd OpenChatKit/training/lora && mkdir data && mkdir data_jsonl"
],
"metadata": {
"id": "RoNQGlepO-Uj"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"source": [
"!cd OpenChatKit/training/lora/data_jsonl && wget https://huggingface.co/datasets/laion/OIG/resolve/main/unified_chip2.jsonl"
],
"metadata": {
"id": "2xZJ3uSdO_xT"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"source": [
"import json\n",
"\n",
"with open('OpenChatKit/training/lora/data_jsonl/unified_chip2.jsonl', 'r') as in_file:\n",
" lines = [json.loads(line) for line in in_file.readlines()]\n",
"\n",
"with open('OpenChatKit/training/lora/data/unified_chip2.json', 'w') as out_file:\n",
" json.dump(lines, out_file)"
],
"metadata": {
"id": "peZQbFRXPA4q"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"### Initialize training in 8-bit"
],
"metadata": {
"id": "jOKRM0VVUjwk"
}
},
{
"cell_type": "markdown",
"source": [
"Edit the config to disable fp16"
],
"metadata": {
"id": "RLk6ghH1PgZ8"
}
},
{
"cell_type": "code",
"source": [
"!cd OpenChatKit/training/lora && sed -i -e 's/\"enabled\": true,/\"enabled\": false,/g' example/config.json"
],
"metadata": {
"id": "AzkcI5ll-mDt"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"To change to fp16, replace `--int8 \\ --low_cpu_mem_usage \\` with `--fp16 \\`"
],
"metadata": {
"id": "0kmhEjGlPjzZ"
}
},
{
"cell_type": "code",
"source": [
"!source activate OpenChatKit && export CUDA_HOME=$CONDA_PREFIX && cd OpenChatKit/training/lora && deepspeed --num_gpus=1 finetune.py \\\n",
"--deepspeed example/config.json \\\n",
"--model_name_or_path togethercomputer/RedPajama-INCITE-Chat-3B-v1 \\\n",
"--train_file data/unified_chip2.json \\\n",
"--validation_split_percentage 10 \\\n",
"--do_train \\\n",
"--do_eval \\\n",
"--overwrite_cache \\\n",
"--evaluation_strategy=\"steps\" \\\n",
"--output_dir finetuned \\\n",
"--num_train_epochs 1 \\\n",
"--eval_steps 15 \\\n",
"--gradient_accumulation_steps 2 \\\n",
"--per_device_train_batch_size 4 \\\n",
"--use_fast_tokenizer True \\\n",
"--learning_rate 1e-5 \\\n",
"--warmup_steps 10 \\\n",
"--int8 \\\n",
"--low_cpu_mem_usage \\\n",
"--no_cache"
],
"metadata": {
"id": "82cyWiyi8y9f"
},
"execution_count": null,
"outputs": []
}
]
}