@@ -137,8 +137,18 @@ def create(
 
           [See more information about frequency and presence penalties.](https://platform.openai.com/docs/guides/gpt/parameter-details)
 
-          response_format: An object specifying the format that the model must output. Used to enable JSON
-              mode.
+          response_format: An object specifying the format that the model must output.
+
+              Setting to `{ "type": "json_object" }` enables JSON mode, which guarantees the
+              message the model generates is valid JSON.
+
+              **Important:** when using JSON mode, you **must** also instruct the model to
+              produce JSON yourself via a system or user message. Without this, the model may
+              generate an unending stream of whitespace until the generation reaches the token
+              limit, resulting in increased latency and appearance of a "stuck" request. Also
+              note that the message content may be partially cut off if
+              `finish_reason="length"`, which indicates the generation exceeded `max_tokens`
+              or the conversation exceeded the max context length.
 
           seed: This feature is in Beta. If specified, our system will make a best effort to
               sample deterministically, such that repeated requests with the same `seed` and
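As context for the docstring change above, here is a minimal sketch of a JSON-mode request against this method. The model name and prompt text are placeholders, not taken from the diff; note the system message that explicitly asks for JSON, which the new docstring says is required alongside the `response_format` flag.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# JSON mode: setting response_format alone is not enough -- the prompt
# must also instruct the model to produce JSON, per the docstring above.
response = client.chat.completions.create(
    model="gpt-4-1106-preview",  # placeholder; any JSON-mode-capable model
    messages=[
        {"role": "system", "content": "You are a helpful assistant. Reply only with valid JSON."},
        {"role": "user", "content": "List three primary colors under a 'colors' key."},
    ],
    response_format={"type": "json_object"},
)
print(response.choices[0].message.content)
```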
@@ -304,8 +314,18 @@ def create(
 
           [See more information about frequency and presence penalties.](https://platform.openai.com/docs/guides/gpt/parameter-details)
 
-          response_format: An object specifying the format that the model must output. Used to enable JSON
-              mode.
+          response_format: An object specifying the format that the model must output.
+
+              Setting to `{ "type": "json_object" }` enables JSON mode, which guarantees the
+              message the model generates is valid JSON.
+
+              **Important:** when using JSON mode, you **must** also instruct the model to
+              produce JSON yourself via a system or user message. Without this, the model may
+              generate an unending stream of whitespace until the generation reaches the token
+              limit, resulting in increased latency and appearance of a "stuck" request. Also
+              note that the message content may be partially cut off if
+              `finish_reason="length"`, which indicates the generation exceeded `max_tokens`
+              or the conversation exceeded the max context length.
 
           seed: This feature is in Beta. If specified, our system will make a best effort to
               sample deterministically, such that repeated requests with the same `seed` and
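The truncation caveat in the new docstring is worth guarding against in code. A hedged sketch of checking `finish_reason` before parsing, continuing from a `response` like the one in the earlier example:

```python
import json

choice = response.choices[0]
if choice.finish_reason == "length":
    # Generation hit max_tokens or the context limit, so the JSON-mode
    # output may be cut off mid-document and fail to parse.
    raise RuntimeError("completion truncated; JSON may be incomplete")

data = json.loads(choice.message.content)
```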
@@ -464,8 +484,18 @@ def create(
 
           [See more information about frequency and presence penalties.](https://platform.openai.com/docs/guides/gpt/parameter-details)
 
-          response_format: An object specifying the format that the model must output. Used to enable JSON
-              mode.
+          response_format: An object specifying the format that the model must output.
+
+              Setting to `{ "type": "json_object" }` enables JSON mode, which guarantees the
+              message the model generates is valid JSON.
+
+              **Important:** when using JSON mode, you **must** also instruct the model to
+              produce JSON yourself via a system or user message. Without this, the model may
+              generate an unending stream of whitespace until the generation reaches the token
+              limit, resulting in increased latency and appearance of a "stuck" request. Also
+              note that the message content may be partially cut off if
+              `finish_reason="length"`, which indicates the generation exceeded `max_tokens`
+              or the conversation exceeded the max context length.
 
           seed: This feature is in Beta. If specified, our system will make a best effort to
               sample deterministically, such that repeated requests with the same `seed` and
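The later sync overloads cover the streaming variant of this method. Assuming JSON mode composes with `stream=True` in the usual way (deltas arrive incrementally and only their concatenation is guaranteed to be valid JSON), a sketch with a placeholder model and prompt:

```python
stream = client.chat.completions.create(
    model="gpt-4-1106-preview",  # placeholder model name
    messages=[
        {"role": "system", "content": "Reply only with valid JSON."},
        {"role": "user", "content": "Return an object with a single 'ok' key set to true."},
    ],
    response_format={"type": "json_object"},
    stream=True,
)

# Individual deltas are fragments; only the full concatenation is JSON.
parts = []
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta is not None:
        parts.append(delta)

full_text = "".join(parts)
```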
@@ -704,8 +734,18 @@ async def create(
 
           [See more information about frequency and presence penalties.](https://platform.openai.com/docs/guides/gpt/parameter-details)
 
-          response_format: An object specifying the format that the model must output. Used to enable JSON
-              mode.
+          response_format: An object specifying the format that the model must output.
+
+              Setting to `{ "type": "json_object" }` enables JSON mode, which guarantees the
+              message the model generates is valid JSON.
+
+              **Important:** when using JSON mode, you **must** also instruct the model to
+              produce JSON yourself via a system or user message. Without this, the model may
+              generate an unending stream of whitespace until the generation reaches the token
+              limit, resulting in increased latency and appearance of a "stuck" request. Also
+              note that the message content may be partially cut off if
+              `finish_reason="length"`, which indicates the generation exceeded `max_tokens`
+              or the conversation exceeded the max context length.
 
           seed: This feature is in Beta. If specified, our system will make a best effort to
               sample deterministically, such that repeated requests with the same `seed` and
@@ -871,8 +911,18 @@ async def create(
 
           [See more information about frequency and presence penalties.](https://platform.openai.com/docs/guides/gpt/parameter-details)
 
-          response_format: An object specifying the format that the model must output. Used to enable JSON
-              mode.
+          response_format: An object specifying the format that the model must output.
+
+              Setting to `{ "type": "json_object" }` enables JSON mode, which guarantees the
+              message the model generates is valid JSON.
+
+              **Important:** when using JSON mode, you **must** also instruct the model to
+              produce JSON yourself via a system or user message. Without this, the model may
+              generate an unending stream of whitespace until the generation reaches the token
+              limit, resulting in increased latency and appearance of a "stuck" request. Also
+              note that the message content may be partially cut off if
+              `finish_reason="length"`, which indicates the generation exceeded `max_tokens`
+              or the conversation exceeded the max context length.
 
           seed: This feature is in Beta. If specified, our system will make a best effort to
               sample deterministically, such that repeated requests with the same `seed` and
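These `async def create` hunks apply the same docstring change to the async client. The parameter works identically there; a sketch using `AsyncOpenAI`, with the model name again a placeholder:

```python
import asyncio

from openai import AsyncOpenAI

client = AsyncOpenAI()  # reads OPENAI_API_KEY from the environment


async def main() -> None:
    response = await client.chat.completions.create(
        model="gpt-4-1106-preview",  # placeholder
        messages=[
            {"role": "system", "content": "Reply only with valid JSON."},
            {"role": "user", "content": "Return a JSON object with a 'greeting' key."},
        ],
        response_format={"type": "json_object"},
    )
    print(response.choices[0].message.content)


asyncio.run(main())
```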
@@ -1031,8 +1081,18 @@ async def create(
 
           [See more information about frequency and presence penalties.](https://platform.openai.com/docs/guides/gpt/parameter-details)
 
-          response_format: An object specifying the format that the model must output. Used to enable JSON
-              mode.
+          response_format: An object specifying the format that the model must output.
+
+              Setting to `{ "type": "json_object" }` enables JSON mode, which guarantees the
+              message the model generates is valid JSON.
+
+              **Important:** when using JSON mode, you **must** also instruct the model to
+              produce JSON yourself via a system or user message. Without this, the model may
+              generate an unending stream of whitespace until the generation reaches the token
+              limit, resulting in increased latency and appearance of a "stuck" request. Also
+              note that the message content may be partially cut off if
+              `finish_reason="length"`, which indicates the generation exceeded `max_tokens`
+              or the conversation exceeded the max context length.
 
           seed: This feature is in Beta. If specified, our system will make a best effort to
               sample deterministically, such that repeated requests with the same `seed` and
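The `seed` parameter that appears as context in every hunk pairs naturally with JSON mode for reproducible structured output. A sketch of best-effort deterministic sampling; determinism is not guaranteed, and `system_fingerprint` is the field to compare across runs to detect backend changes:

```python
result = client.chat.completions.create(
    model="gpt-4-1106-preview",  # placeholder
    messages=[{"role": "user", "content": "Name a city as a JSON object under a 'city' key."}],
    response_format={"type": "json_object"},
    seed=12345,
    temperature=0,
)

# Repeated requests with the same seed and parameters should (best effort)
# return the same result; a changed system_fingerprint signals a backend
# change that can affect determinism.
print(result.system_fingerprint)
print(result.choices[0].message.content)
```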