[CodeGen][Documentation] #956


Merged: 15 commits, Mar 23, 2023
Changes from 14 commits
69 changes: 63 additions & 6 deletions examples/codegen/README.md
@@ -14,17 +14,74 @@ See the License for the specific language governing permissions and
limitations under the License.
-->

Example of how to run the pipeline:
## ONNX Export
First, install the Hugging Face Optimum library:
```bash
pip install optimum
```

### Patch the original PyTorch Model
TODO

### Export the model to ONNX

```bash
optimum-cli export onnx --model Salesforce/codegen-350M-multi codegen-350M-multi
```
This saves the model to the directory `codegen-350M-multi`.

### Updating the Model's Input and Output Dimension Sizes
TODO

## Running in the DeepSparse Pipeline

First, we need to rename `decoder_with_past_model.onnx` to `model.onnx` inside
the `static-codegen-350-multi` directory, to abide by the pipeline's naming convention.
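The rename can also be scripted; a minimal sketch using `pathlib` (the directory name is taken from the step above and may differ on your machine):

```python
from pathlib import Path

# Hypothetical location of the exported model; adjust to your setup.
model_dir = Path("static-codegen-350-multi")

src = model_dir / "decoder_with_past_model.onnx"
dst = model_dir / "model.onnx"
if src.exists() and not dst.exists():
    src.rename(dst)  # DeepSparse pipelines look for model.onnx
```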

Finally, run the pipeline:

```python
from examples.codegen.text_generation import TextGenerationPipeline

codegen = TextGenerationPipeline(
    model_path="/network/damian/static-codegen-350M-multi",
    engine_type="onnxruntime",
    sequence_length=128)

out = codegen(sequences="def hello_world():")
print(out.sequences[0])
```

```bash
def hello_world():
return 'Hello World!'

def hello_world_2():
return 'Hello World!'

def hello_world_3():
return 'Hello World!'

def hello_world_4():
return 'Hello World!'

def hello_world_5():
return 'Hello World!'

def hello_world_6():
return 'Hello World!'

def hello_world_7():
return 'Hello World!'

def hello_world_8():
return 'Hello World!'

def hello
```

The pipeline also accepts a batch of sequences:

```python
out = codegen(sequences=["def hello_world():", "def fibonacci(x):"])
for seq in out.sequences:
    print(seq)
```
Modifying pipeline behaviour:
1. Setting `deterministic=False` makes the next token of the sequence be sampled from the probability distribution rather than chosen deterministically (using argmax).
2. Setting `sampling_temperature` (when `deterministic=False`) allows more or less randomness in the sampling method (https://towardsdatascience.com/how-to-sample-from-language-models-682bceb97277).
3. Setting `num_tokens_to_generate` specifies exactly how many tokens to generate per input.
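To illustrate points 1 and 2, here is a toy numpy sketch of greedy versus temperature-based token selection (illustrative only; the pipeline's internals may differ, and `next_token` is a hypothetical helper):

```python
import numpy as np


def next_token(logits, deterministic=True, sampling_temperature=1.0, seed=None):
    """Pick the next token id from raw logits (illustrative sketch)."""
    if deterministic:
        # Greedy decoding: always the single most likely token.
        return int(np.argmax(logits))
    rng = np.random.default_rng(seed)
    # Lower temperature sharpens the distribution, higher flattens it.
    scaled = logits / sampling_temperature
    probs = np.exp(scaled - scaled.max())  # stable softmax
    probs /= probs.sum()
    return int(rng.choice(len(logits), p=probs))
```

As the temperature approaches zero, the sampled choice collapses onto the argmax; larger temperatures spread probability mass over more tokens, producing more varied generations.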