You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
* sampling : refactor + optimize penalties sampler
ggml-ci
* common : apply ignore_eos as logit bias
ggml-ci
* batched : remove penalties sampler
* params : allow penalty_last_n == -1 to be equal to context size
ggml-ci
* common : by default, move the penalties at the end of the sampling chain
ggml-ci
* common : ignore all EOG tokens
Co-authored-by: Diego Devesa <[email protected]>
* common : move back the penalties at the front of the sampling chain
ggml-ci
* readme : restore hint about --ignore-eos flag [no ci]
* llama : minor
ggml-ci
* webui : update
---------
Co-authored-by: Diego Devesa <[email protected]>
Copy file name to clipboardExpand all lines: examples/main/README.md
-5
Original file line number
Diff line number
Diff line change
@@ -177,16 +177,11 @@ Example usage: `--temp 0`
177
177
178
178
-`--repeat-penalty N`: Control the repetition of token sequences in the generated text default: 1.0, 1.0 = disabled).
179
179
-`--repeat-last-n N`: Last n tokens to consider for penalizing repetition (default: 64, 0 = disabled, -1 = ctx-size).
180
-
-`--no-penalize-nl`: Disable penalization for newline tokens when applying the repeat penalty.
181
180
182
181
The `repeat-penalty` option helps prevent the model from generating repetitive or monotonous text. A higher value (e.g., 1.5) will penalize repetitions more strongly, while a lower value (e.g., 0.9) will be more lenient. The default value is 1.
183
182
184
183
The `repeat-last-n` option controls the number of tokens in the history to consider for penalizing repetition. A larger value will look further back in the generated text to prevent repetitions, while a smaller value will only consider recent tokens. A value of 0 disables the penalty, and a value of -1 sets the number of tokens considered equal to the context size (`ctx-size`).
185
184
186
-
Use the `--no-penalize-nl` option to disable newline penalization when applying the repeat penalty. This option is particularly useful for generating chat conversations, dialogues, code, poetry, or any text where newline tokens play a significant role in structure and formatting. Disabling newline penalization helps maintain the natural flow and intended formatting in these specific use cases.
187
-
188
-
Example usage: `--repeat-penalty 1.15 --repeat-last-n 128 --no-penalize-nl`
189
-
190
185
### DRY Repetition Penalty
191
186
192
187
DRY (Don't Repeat Yourself) sampling is an effective technique for reducing repetition in generated text even across long contexts by penalizing tokens based on their recent usage patterns (original [PR link](https://github.com/oobabooga/text-generation-webui/pull/5677)).
0 commit comments