Commit 8b4a89a

added results for 16-bit fine-tuning in readme
1 parent 00ccfd0

File tree


README.md

Lines changed: 28 additions & 0 deletions
@@ -236,3 +236,31 @@ python ./run_squad.py \
  --gradient_accumulation_steps 2 \
  --optimize_on_cpu
```

If you have a recent GPU (starting from the NVIDIA Volta series), you should try **16-bit fine-tuning** (FP16).

Here is an example of hyper-parameters for an FP16 run we tried:

```bash
python ./run_squad.py \
  --vocab_file $BERT_LARGE_DIR/vocab.txt \
  --bert_config_file $BERT_LARGE_DIR/bert_config.json \
  --init_checkpoint $BERT_LARGE_DIR/pytorch_model.bin \
  --do_lower_case \
  --do_train \
  --do_predict \
  --train_file $SQUAD_TRAIN \
  --predict_file $SQUAD_EVAL \
  --learning_rate 3e-5 \
  --num_train_epochs 2 \
  --max_seq_length 384 \
  --doc_stride 128 \
  --output_dir $OUTPUT_DIR \
  --train_batch_size 24 \
  --fp16 \
  --loss_scale 128
```
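
As background on what `--fp16` and `--loss_scale 128` mean: with a static loss scale, the loss is multiplied by a constant before the backward pass so that small FP16 gradients do not underflow to zero, and the gradients are divided by the same constant (on an FP32 master copy of the weights) before the optimizer step. The sketch below is a generic PyTorch illustration of this technique using a placeholder model and random data; it is not the actual implementation inside `run_squad.py`.

```python
import torch

# Minimal illustration of FP16 training with a static loss scale.
# Placeholder model and data; NOT the actual run_squad.py code.
# Requires a CUDA GPU (FP16 compute targets Volta-class hardware).
model = torch.nn.Linear(10, 2).cuda().half()   # FP16 weights on the GPU

# FP32 "master" copy of the parameters; the optimizer updates these.
master_params = [p.detach().clone().float().requires_grad_(True)
                 for p in model.parameters()]
optimizer = torch.optim.SGD(master_params, lr=3e-5)
loss_scale = 128.0                             # analogous to --loss_scale 128

inputs = torch.randn(4, 10, device="cuda").half()
targets = torch.randint(0, 2, (4,), device="cuda")

logits = model(inputs)
loss = torch.nn.functional.cross_entropy(logits.float(), targets)

# Scale the loss so small FP16 gradients do not underflow to zero.
(loss * loss_scale).backward()

# Move the FP16 gradients onto the FP32 master params, undoing the scale.
for master, p in zip(master_params, model.parameters()):
    master.grad = p.grad.float() / loss_scale

optimizer.step()
model.zero_grad()

# Copy the updated FP32 master weights back into the FP16 model.
with torch.no_grad():
    for master, p in zip(master_params, model.parameters()):
        p.copy_(master.half())
```

The scale is a trade-off: too low and small gradients still vanish in FP16, too high and the scaled loss itself overflows; 128 is simply the value that worked for this run.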

The results were similar to the above FP32 results (actually slightly higher):

```bash
{"exact_match": 84.65468306527909, "f1": 91.238669287002}
```
