Commit 1f3acef

Better clarity on SFT dataset attributes (#1970)
1 parent fb697bd commit 1f3acef

3 files changed, +9 -9 lines changed

tutorials/finetune_adapter.md

+3 -3
@@ -120,14 +120,14 @@ If your GPU supports `bfloat16`, the script will automatically use it.
 You can easily train on your own instruction dataset saved in JSON format.

 1. Create a JSON file in which each row holds one instruction-response pair.
-A row has an entry for 'instruction', 'input', and 'output', where 'input' is optional and can be
-the empty string if the instruction doesn't require a context. Below is an example json file:
+A row has an entry for 'instruction' and 'output', and optionally 'input'. Note that currently, the 'input' field is only used in the Alpaca chat template. If you are using the Alpaca template, 'input' can be the empty string if the instruction doesn't require a context.
+Below is an example json file:

 ```text
 [
 {
 "instruction": "Arrange the given numbers in ascending order.",
-"input": "2, 4, 0, 8, 3",
+"input": "2, 4, 0, 8, 3", // Optional: only used in Alpaca chat template
 "output": "0, 2, 3, 4, 8"
 },
 ...

tutorials/finetune_full.md

+3 -3
@@ -68,14 +68,14 @@ If your GPU supports `bfloat16`, the script will automatically use it.
 You can easily train on your own instruction dataset saved in JSON format.

 1. Create a JSON file in which each row holds one instruction-response pair.
-A row has an entry for 'instruction', 'input', and 'output', where 'input' is optional and can be
-the empty string if the instruction doesn't require a context. Below is an example json file:
+A row has an entry for 'instruction' and 'output', and optionally 'input'. Note that currently, the 'input' field is only used in the Alpaca chat template. If you are using the Alpaca template, 'input' can be the empty string if the instruction doesn't require a context.
+Below is an example json file:

 ```text
 [
 {
 "instruction": "Arrange the given numbers in ascending order.",
-"input": "2, 4, 0, 8, 3",
+"input": "2, 4, 0, 8, 3", // Optional: only used in Alpaca chat template
 "output": "0, 2, 3, 4, 8"
 },
 ...

tutorials/finetune_lora.md

+3 -3
@@ -95,14 +95,14 @@ If your GPU supports `bfloat16`, you can additionally pass `--precision "bf16-tr
 You can easily train on your own instruction dataset saved in JSON format.

 1. Create a JSON file in which each row holds one instruction-response pair.
-A row has an entry for 'instruction', 'input', and 'output', where 'input' is optional and can be
-the empty string if the instruction doesn't require a context. Below is an example json file:
+A row has an entry for 'instruction' and 'output', and optionally 'input'. Note that currently, the 'input' field is only used in the Alpaca chat template. If you are using the Alpaca template, 'input' can be the empty string if the instruction doesn't require a context.
+Below is an example json file:

 ```text
 [
 {
 "instruction": "Arrange the given numbers in ascending order.",
-"input": "2, 4, 0, 8, 3",
+"input": "2, 4, 0, 8, 3", // Optional: only used in Alpaca chat template
 "output": "0, 2, 3, 4, 8"
 },
 ...
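
The row format described in all three tutorials ('instruction' and 'output' required, 'input' optional) could be checked with a short script before training. The following is a minimal sketch, not the project's own loader: the function name `load_sft_rows` and the constant `REQUIRED_KEYS` are hypothetical, and the second example row is made up for illustration. It assumes the file is strict JSON, so the `// Optional: ...` annotation shown in the tutorial example would have to be removed first, since Python's `json` module rejects comments.

```python
import json

# Hypothetical constant for this sketch: keys every row must contain.
REQUIRED_KEYS = ("instruction", "output")


def load_sft_rows(text):
    """Parse a JSON array of instruction-response rows.

    Rows without an 'input' key get an empty string, mirroring the
    tutorial's statement that 'input' is optional.
    """
    rows = json.loads(text)  # raises json.JSONDecodeError on invalid JSON
    if not isinstance(rows, list):
        raise ValueError("expected a JSON array of rows")

    normalized = []
    for i, row in enumerate(rows):
        if not isinstance(row, dict):
            raise ValueError(f"row {i} is not a JSON object")
        missing = [key for key in REQUIRED_KEYS if key not in row]
        if missing:
            raise ValueError(f"row {i} is missing required keys: {missing}")
        normalized.append(
            {
                "instruction": row["instruction"],
                "input": row.get("input", ""),  # optional field
                "output": row["output"],
            }
        )
    return normalized


# First row is taken from the tutorial; the second row is invented
# to show a row without 'input'.
example = """[
  {
    "instruction": "Arrange the given numbers in ascending order.",
    "input": "2, 4, 0, 8, 3",
    "output": "0, 2, 3, 4, 8"
  },
  {
    "instruction": "Name a primary color.",
    "output": "Red"
  }
]"""

rows = load_sft_rows(example)
```

Normalizing a missing 'input' to the empty string keeps downstream code simple: a prompt template that ignores 'input' can skip the field, while an Alpaca-style template can treat the empty string as "no context".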
