diff --git a/README.md b/README.md
index e22fc9b0..6d8df947 100644
--- a/README.md
+++ b/README.md
@@ -42,18 +42,6 @@ Create a new Python file and add the following code, replacing the model identif
 ['https://replicate.com/api/models/stability-ai/stable-diffusion/files/50fcac81-865d-499e-81ac-49de0cb79264/out-0.png']
 ```
 
-Some models, particularly language models, may not require the version string. Refer to the API documentation for the model for more on the specifics:
-
-```python
-replicate.run(
-    "meta/llama-2-70b-chat",
-    input={
-        "prompt": "Can you write a poem about open source machine learning?",
-        "system_prompt": "You are a helpful, respectful and honest assistant.",
-    },
-)
-```
-
 Some models, like [andreasjansson/blip-2](https://replicate.com/andreasjansson/blip-2), have files as inputs. To run a model that takes a file input, pass a URL to a publicly accessible file.
@@ -69,14 +57,14 @@ Or, for smaller files (<10MB), you can pass a file handle directly.
 ```
 
 > [!NOTE]
-> You can also use the Replicate client asynchronously by prepending `async_` to the method name. 
-> 
+> You can also use the Replicate client asynchronously by prepending `async_` to the method name.
+>
 > Here's an example of how to run several predictions concurrently and wait for them all to complete:
 >
 > ```python
 > import asyncio
 > import replicate
-> 
+>
 > # https://replicate.com/stability-ai/sdxl
 > model_version = "stability-ai/sdxl:39ed52f2a78e934b3ba6e2a89f5b1c712de7dfea535525255b1aa35c5565e08b"
 > prompts = [
@@ -96,17 +84,14 @@ Or, for smaller files (<10MB), you can pass a file handle directly.
 
 ## Run a model and stream its output
 
-Replicate’s API supports server-sent event streams (SSEs) for language models. 
-Use the `stream` method to consume tokens as they're produced by the model.
+Replicate’s API supports server-sent event streams (SSEs) for language models.
+Use the `stream` method to consume tokens as they're produced.
 
 ```python
 import replicate
 
-# https://replicate.com/meta/llama-2-70b-chat
-model_version = "meta/llama-2-70b-chat:02e509c789964a7ea8736978a43525956ef40397be9033abf9fd2badfe68c9e3"
-
 for event in replicate.stream(
-    model_version,
+    "meta/meta-llama-3-70b-instruct",
     input={
         "prompt": "Please write a haiku about llamas.",
     },
@@ -114,13 +99,17 @@ for event in replicate.stream(
     print(str(event), end="")
 ```
 
+> [!TIP]
+> Some models, like [meta/meta-llama-3-70b-instruct](https://replicate.com/meta/meta-llama-3-70b-instruct),
+> don't require a version string.
+> You can always refer to the API documentation on the model page for specifics.
+
 You can also stream the output of a prediction you create.
 This is helpful when you want the ID of the prediction separate from its output.
 
 ```python
-version = "02e509c789964a7ea8736978a43525956ef40397be9033abf9fd2badfe68c9e3"
 prediction = replicate.predictions.create(
-    version=version,
+    model="meta/meta-llama-3-70b-instruct",
     input={"prompt": "Please write a haiku about llamas."},
     stream=True,
 )
@@ -132,7 +121,6 @@ for event in prediction.stream():
     print(str(event), end="")
 ```
 
 For more information, see ["Streaming output"](https://replicate.com/docs/streaming) in Replicate's docs.
-
 
 ## Run a model in the background
 
 You can start a model and run it in the background:
@@ -337,12 +325,12 @@ Here's how to list of all the available hardware for running models on Replicate
 
 ## Fine-tune a model
 
-Use the [training API](https://replicate.com/docs/fine-tuning) 
-to fine-tune models to make them better at a particular task. 
-To see what **language models** currently support fine-tuning, 
+Use the [training API](https://replicate.com/docs/fine-tuning)
+to fine-tune models to make them better at a particular task.
+To see what **language models** currently support fine-tuning,
 check out Replicate's [collection of trainable language models](https://replicate.com/collections/trainable-language-models).
 
-If you're looking to fine-tune **image models**, 
+If you're looking to fine-tune **image models**,
 check out Replicate's [guide to fine-tuning image models](https://replicate.com/docs/guides/fine-tune-an-image-model).
 
 Here's how to fine-tune a model on Replicate: