
Commit fe2d20d

Fix doctest more (for docs/source/en) (#30247)
* fix

* fix

---------

Co-authored-by: ydshieh <[email protected]>
1 parent ec344b5 commit fe2d20d

File tree

8 files changed (+36, -29 lines changed)

docs/source/en/generation_strategies.md

Lines changed: 5 additions & 5 deletions
@@ -57,9 +57,10 @@ When you load a model explicitly, you can inspect the generation configuration t
 >>> model = AutoModelForCausalLM.from_pretrained("distilbert/distilgpt2")
 >>> model.generation_config
 GenerationConfig {
-"bos_token_id": 50256,
-"eos_token_id": 50256,
+"bos_token_id": 50256,
+"eos_token_id": 50256
 }
+<BLANKLINE>
 ```

 Printing out the `model.generation_config` reveals only the values that are different from the default generation
@@ -244,8 +245,7 @@ To enable multinomial sampling set `do_sample=True` and `num_beams=1`.

 >>> outputs = model.generate(**inputs, do_sample=True, num_beams=1, max_new_tokens=100)
 >>> tokenizer.batch_decode(outputs, skip_special_tokens=True)
-['Today was an amazing day because when you go to the World Cup and you don\'t, or when you don\'t get invited,
-that\'s a terrible feeling."']
+["Today was an amazing day because we received these wonderful items by the way of a gift shop. The box arrived on a Thursday and I opened it on Monday afternoon to receive the gifts. Both bags featured pieces from all the previous years!\n\nThe box had lots of surprises in it, including some sweet little mini chocolate chips! I don't think I'd eat all of these. This was definitely one of the most expensive presents I have ever got, I actually got most of them for free!\n\nThe first package came"]
 ```

 ### Beam-search decoding
@@ -393,7 +393,7 @@ just like in multinomial sampling. However, in assisted decoding, reducing the t
 >>> assistant_model = AutoModelForCausalLM.from_pretrained(assistant_checkpoint)
 >>> outputs = model.generate(**inputs, assistant_model=assistant_model, do_sample=True, temperature=0.5)
 >>> tokenizer.batch_decode(outputs, skip_special_tokens=True)
-['Alice and Bob are going to the same party. It is a small party, in a small']
+['Alice and Bob, a couple of friends of mine, who are both in the same office as']
 ```

 Alternatively, you can also set the `prompt_lookup_num_tokens` to trigger n-gram based assisted decoding, as opposed
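Note: the `<BLANKLINE>` marker added above is standard doctest syntax for an empty line in expected output; it is needed here because printing the config repr ends with a blank line. A minimal, self-contained sketch of the behaviour (the `demo` function and its output are illustrative, not taken from the docs):

```python
import doctest

def demo():
    r"""
    >>> print("GenerationConfig {\n}\n")
    GenerationConfig {
    }
    <BLANKLINE>
    """

# The trailing empty line of the printed output must be spelled <BLANKLINE>,
# otherwise doctest would treat it as the end of the expected block.
doctest.run_docstring_examples(demo, globals(), verbose=False)
```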

docs/source/en/model_doc/code_llama.md

Lines changed: 6 additions & 5 deletions
@@ -65,20 +65,20 @@ After conversion, the model and tokenizer can be loaded via:
 >>> tokenizer = CodeLlamaTokenizer.from_pretrained("codellama/CodeLlama-7b-hf")
 >>> model = LlamaForCausalLM.from_pretrained("codellama/CodeLlama-7b-hf")
 >>> PROMPT = '''def remove_non_ascii(s: str) -> str:
-""" <FILL_ME>
-return result
-'''
+... """ <FILL_ME>
+... return result
+... '''
 >>> input_ids = tokenizer(PROMPT, return_tensors="pt")["input_ids"]
 >>> generated_ids = model.generate(input_ids, max_new_tokens=128)

 >>> filling = tokenizer.batch_decode(generated_ids[:, input_ids.shape[1]:], skip_special_tokens = True)[0]
 >>> print(PROMPT.replace("<FILL_ME>", filling))
 def remove_non_ascii(s: str) -> str:
 """ Remove non-ASCII characters from a string.
-
+<BLANKLINE>
 Args:
 s: The string to remove non-ASCII characters from.
-
+<BLANKLINE>
 Returns:
 The string with non-ASCII characters removed.
 """
@@ -87,6 +87,7 @@ def remove_non_ascii(s: str) -> str:
 if ord(c) < 128:
 result += c
 return result
+<BLANKLINE>
 ```

 If you only want the infilled part:
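Note: the `PROMPT` fix above adds the `... ` continuation prompt to the later lines of the multi-line string literal; without it, doctest parses only the first line as source and treats the rest as expected output. A minimal sketch of the rule (the example content is illustrative, not the Code Llama prompt):

```python
import doctest

def demo():
    '''
    >>> PROMPT = """def remove_non_ascii(s: str) -> str:
    ...     return s
    ... """
    >>> print(PROMPT)
    def remove_non_ascii(s: str) -> str:
        return s
    <BLANKLINE>
    '''

# Every continuation line of the statement carries the "... " prompt, and the
# trailing empty line of the printed prompt is written as <BLANKLINE>.
doctest.run_docstring_examples(demo, globals(), verbose=False)
```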

docs/source/en/model_doc/phi.md

Lines changed: 6 additions & 4 deletions
@@ -92,7 +92,9 @@ Phi-2 has been integrated in the development version (4.37.0.dev) of `transforme
 >>> outputs = model.generate(**inputs, max_length=30)
 >>> text = tokenizer.batch_decode(outputs)[0]
 >>> print(text)
-'Can you help me write a formal email to a potential business partner proposing a joint venture?\nInput: Company A: ABC Inc.\nCompany B: XYZ Ltd.\nJoint Venture: A new online platform for e-commerce'
+Can you help me write a formal email to a potential business partner proposing a joint venture?
+Input: Company A: ABC Inc.
+Company B
 ```

 ### Example :
@@ -134,7 +136,7 @@ To load and run a model using Flash Attention 2, refer to the snippet below:
 >>> from transformers import PhiForCausalLM, AutoTokenizer

 >>> # define the model and tokenizer and push the model and tokens to the GPU.
->>> model = PhiForCausalLM.from_pretrained("microsoft/phi-1_5", torch_dtype=torch.float16, attn_implementation="flash_attention_2").to("cuda")
+>>> model = PhiForCausalLM.from_pretrained("microsoft/phi-1_5", torch_dtype=torch.float16, attn_implementation="flash_attention_2").to("cuda") # doctest: +SKIP
 >>> tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-1_5")

 >>> # feel free to change the prompt to your liking.
@@ -144,9 +146,9 @@ To load and run a model using Flash Attention 2, refer to the snippet below:
 >>> tokens = tokenizer(prompt, return_tensors="pt").to("cuda")

 >>> # use the model to generate new tokens.
->>> generated_output = model.generate(**tokens, use_cache=True, max_new_tokens=10)
+>>> generated_output = model.generate(**tokens, use_cache=True, max_new_tokens=10) # doctest: +SKIP

->>> tokenizer.batch_decode(generated_output)[0]
+>>> tokenizer.batch_decode(generated_output)[0] # doctest: +SKIP
 'If I were an AI that had just achieved a breakthrough in machine learning, I would be thrilled'
 ```

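Note: the `# doctest: +SKIP` directives added above use doctest's built-in SKIP option, so the flagged examples are not executed at all; this keeps the GPU-only Flash Attention 2 snippet out of the CPU doc-test run while leaving it readable. A minimal sketch; `load_model_on_gpu` is a hypothetical placeholder, not a transformers API:

```python
import doctest

def demo():
    """
    >>> load_model_on_gpu()  # doctest: +SKIP
    'never executed, never compared'
    >>> 1 + 1
    2
    """

# Only the `1 + 1` example actually runs; the skipped one would otherwise
# raise NameError because load_model_on_gpu is not defined anywhere.
doctest.run_docstring_examples(demo, globals(), verbose=False)
```
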
docs/source/en/model_doc/stablelm.md

Lines changed: 14 additions & 10 deletions
@@ -37,19 +37,21 @@ We also provide `StableLM Zephyr 3B`, an instruction fine-tuned version of the m
 The following code snippet demonstrates how to use `StableLM 3B 4E1T` for inference:

 ```python
->>> from transformers import AutoModelForCausalLM, AutoTokenizer
+>>> from transformers import AutoModelForCausalLM, AutoTokenizer, set_seed
 >>> device = "cuda" # the device to load the model onto

+>>> set_seed(0)
+
 >>> tokenizer = AutoTokenizer.from_pretrained("stabilityai/stablelm-3b-4e1t")
 >>> model = AutoModelForCausalLM.from_pretrained("stabilityai/stablelm-3b-4e1t")
->>> model.to(device)
+>>> model.to(device) # doctest: +IGNORE_RESULT

 >>> model_inputs = tokenizer("The weather is always wonderful in", return_tensors="pt").to(model.device)

 >>> generated_ids = model.generate(**model_inputs, max_length=32, do_sample=True)
 >>> responses = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)
 >>> responses
-['The weather is always wonderful in Santa Barbara and, for visitors hoping to make the move to our beautiful seaside city, this town offers plenty of great places to...']
+['The weather is always wonderful in Costa Rica, which makes it a prime destination for retirees. That’s where the Pensionado program comes in, offering']
 ```

 ## Combining StableLM and Flash Attention 2
@@ -66,19 +68,21 @@ Now, to run the model with Flash Attention 2, refer to the snippet below:

 ```python
 >>> import torch
->>> from transformers import AutoModelForCausalLM, AutoTokenizer
+>>> from transformers import AutoModelForCausalLM, AutoTokenizer, set_seed
 >>> device = "cuda" # the device to load the model onto

+>>> set_seed(0)
+
 >>> tokenizer = AutoTokenizer.from_pretrained("stabilityai/stablelm-3b-4e1t")
->>> model = AutoModelForCausalLM.from_pretrained("stabilityai/stablelm-3b-4e1t", torch_dtype=torch.bfloat16, attn_implementation="flash_attention_2")
->>> model.to(device)
+>>> model = AutoModelForCausalLM.from_pretrained("stabilityai/stablelm-3b-4e1t", torch_dtype=torch.bfloat16, attn_implementation="flash_attention_2") # doctest: +SKIP
+>>> model.to(device) # doctest: +SKIP

 >>> model_inputs = tokenizer("The weather is always wonderful in", return_tensors="pt").to(model.device)

->>> generated_ids = model.generate(**model_inputs, max_length=32, do_sample=True)
->>> responses = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)
->>> responses
-['The weather is always wonderful in Santa Barbara and, for visitors hoping to make the move to our beautiful seaside city, this town offers plenty of great places to...']
+>>> generated_ids = model.generate(**model_inputs, max_length=32, do_sample=True) # doctest: +SKIP
+>>> responses = tokenizer.batch_decode(generated_ids, skip_special_tokens=True) # doctest: +SKIP
+>>> responses # doctest: +SKIP
+['The weather is always wonderful in Costa Rica, which makes it a prime destination for retirees. That’s where the Pensionado program comes in, offering']
 ```

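Note: two things make this snippet testable. `set_seed(0)` pins the sampling RNG so the generated continuation is reproducible, and `# doctest: +IGNORE_RESULT` lets `model.to(device)` run without comparing its long module repr. `IGNORE_RESULT` is not a built-in doctest option; below is a sketch of how a test harness can register such a flag (the transformers repo wires this up in its own doctest configuration):

```python
import doctest

# Register a custom option flag and teach an output checker to accept any
# output when the flag is set on an example.
IGNORE_RESULT = doctest.register_optionflag("IGNORE_RESULT")

class IgnoreResultOutputChecker(doctest.OutputChecker):
    def check_output(self, want, got, optionflags):
        if optionflags & IGNORE_RESULT:
            return True  # pass regardless of what the example returned or printed
        return super().check_output(want, got, optionflags)

# A runner built with this checker honours `# doctest: +IGNORE_RESULT`.
runner = doctest.DocTestRunner(checker=IgnoreResultOutputChecker())
```
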
docs/source/en/model_doc/starcoder2.md

Lines changed: 1 addition & 2 deletions
@@ -42,11 +42,10 @@ These ready-to-use checkpoints can be downloaded and used via the HuggingFace Hu
 >>> prompt = "def print_hello_world():"

 >>> model_inputs = tokenizer([prompt], return_tensors="pt").to("cuda")
->>> model.to(device)

 >>> generated_ids = model.generate(**model_inputs, max_new_tokens=10, do_sample=False)
 >>> tokenizer.batch_decode(generated_ids)[0]
-"def print_hello_world():\n\treturn 'Hello World!'"
+'def print_hello_world():\n print("Hello World!")\n\ndef print'
 ```

 ## Starcoder2Config
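Note: this snippet stays deterministic without seeding because `do_sample=False` means greedy decoding, so a single expected string can be pinned. A hedged sketch of the full flow; the `bigcode/starcoder2-7b` checkpoint and the `device_map="auto"` loading are assumptions about the surrounding doc, not shown in this diff:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "bigcode/starcoder2-7b"  # assumed checkpoint, as used elsewhere in the doc
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint, device_map="auto")

# Greedy decoding (do_sample=False) is deterministic, so the doctest's
# expected completion does not depend on a random seed.
model_inputs = tokenizer(["def print_hello_world():"], return_tensors="pt").to(model.device)
generated_ids = model.generate(**model_inputs, max_new_tokens=10, do_sample=False)
print(tokenizer.batch_decode(generated_ids)[0])
```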

docs/source/en/model_doc/t5.md

Lines changed: 1 addition & 1 deletion
@@ -309,7 +309,7 @@ The predicted tokens will then be placed between the sentinel tokens.
 >>> sequence_ids = model.generate(input_ids)
 >>> sequences = tokenizer.batch_decode(sequence_ids)
 >>> sequences
-['<pad><extra_id_0> park offers<extra_id_1> the<extra_id_2> park.</s>']
+['<pad> <extra_id_0> park offers <extra_id_1> the <extra_id_2> park.</s>']
 ```

 ## Performance
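Note: the only change here is the whitespace around the sentinel tokens in the expected decode, which tracks the tokenizer's current behaviour for special tokens. A small sketch to inspect how a given tokenizer version spaces `<extra_id_n>` tokens when decoding; the `google-t5/t5-small` checkpoint is an assumption and no output is asserted:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google-t5/t5-small")
ids = tokenizer("The <extra_id_0> walks in <extra_id_1> park", return_tensors="pt").input_ids
# Whether spaces appear around <extra_id_0>/<extra_id_1> in the decoded string
# depends on the tokenizer version, which is why the expected string above moved.
print(tokenizer.batch_decode(ids))
```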

docs/source/en/tasks/prompting.md

Lines changed: 2 additions & 2 deletions
@@ -80,7 +80,7 @@ Run inference with decoder-only models with the `text-generation` pipeline:
 >>> prompt = "Hello, I'm a language model"

 >>> generator(prompt, max_length = 30)
-[{'generated_text': "Hello, I'm a language model expert, so I'm a big believer in the concept that I know very well and then I try to look into"}]
+[{'generated_text': "Hello, I'm a language model programmer so you can use some of my stuff. But you also need some sort of a C program to run."}]
 ```

 To run inference with an encoder-decoder, use the `text2text-generation` pipeline:
@@ -284,7 +284,7 @@ the leading word or phrase (`"Answer:"`) to nudge the model to start generating

 >>> for seq in sequences:
 ... print(f"Result: {seq['generated_text']}")
-Result: Modern tools are used, such as immersion blenders
+Result: Modern tools often used to make gazpacho include
 ```

 #### Reasoning
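Note: both expected strings in this file are sampled outputs, so they are only stable because the doc-test setup seeds the generator before running the pipeline. A minimal sketch of that pattern; the `openai-community/gpt2` checkpoint and the seed value are assumptions, not part of this diff:

```python
from transformers import pipeline, set_seed

# Fixing the seed before sampling makes the pipeline's generated_text
# reproducible, so a single expected string can be pinned in a doctest.
set_seed(0)
generator = pipeline("text-generation", model="openai-community/gpt2")
print(generator("Hello, I'm a language model", max_length=30))
```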

utils/slow_documentation_tests.txt

Lines changed: 1 addition & 0 deletions
@@ -1,4 +1,5 @@
 docs/source/en/generation_strategies.md
+docs/source/en/model_doc/code_llama.md
 docs/source/en/model_doc/ctrl.md
 docs/source/en/model_doc/kosmos-2.md
 docs/source/en/model_doc/seamless_m4t.md
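Note: this file lists documentation pages whose doctests are too slow for the regular CI run, so adding `code_llama.md` moves its examples to the slow job. For reference, the `>>>` examples embedded in such a markdown page can be executed directly with the standard library; the repo's own CI uses its pytest wiring, so this is only an illustration:

```python
import doctest

# Run the interactive examples found in a documentation page. ELLIPSIS lets
# expected outputs contain "..." placeholders, as several doc examples do.
results = doctest.testfile(
    "docs/source/en/generation_strategies.md",
    module_relative=False,
    optionflags=doctest.ELLIPSIS,
    verbose=False,
)
print(results)  # TestResults(failed=..., attempted=...)
```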

0 commit comments
