
Fix initialization of stream decoder #288

Merged
mattt merged 1 commit into main from mattt/fix-stream-decoder-initialization
Apr 19, 2024

Conversation

mattt
Contributor

@mattt mattt commented Apr 19, 2024

Fixes #287

Given the following code:

import replicate


def go():
    for event in replicate.stream(
        "meta/meta-llama-3-8b-instruct",
        input={
            "top_p": 0.9,
            "prompt": "Hi! Help me please:)",
            "max_tokens": 512,
            "min_tokens": 0,
            "temperature": 0.01,
            "stop_sequences": "<|end_of_text|>",
            "prompt_template": "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\nYou are a helpful assistant<|eot_id|><|start_header_id|>user<|end_header_id|>\n\n{prompt}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n",
            "presence_penalty": 0,
            "frequency_penalty": 1,
        },
    ):
        print(str(event), end="")
    print()


go()
print("---------------------------------")
go()
print("---------------------------------")
go()
print("---------------------------------")

With the latest release, each call to go() prints the first token of the response ("Hi") one extra time for every previous invocation:

Hi there! I'd be happy to help you with whatever you need. What's on your mind?assistant

I'm glad you asked! I'm here to help you with any questions or problems you might have. Whether it's something big or something small, I'm here to help.

So, what's on your mind? Do you have a specific question or problem you'd like to talk about?assistant

I'm all ears! Take your time, and feel free to share as much or as little as you'd like.

Remember, everything we discuss is confidential and just between us. So,
---------------------------------
Hi
Hi there! I'd be happy to help you with whatever you need. What's on your mind?assistant

I'm glad you asked! I'm here to help you with any questions or problems you might have. Whether it's something big or something small, I'm here to help.

So, what's on your mind? Do you have a specific question or problem you'd like to talk about?assistant

I'm all ears! Take your time, and feel free to share as much or as little as you'd like.

Remember, everything we discuss is confidential and just between us. So,
---------------------------------
Hi
Hi
Hi there! I'd be happy to help you with whatever you need. What's on your mind?assistant

I'm glad you asked! I'm here to help you with any questions or problems you might have. Whether it's something big or something small, I'm here to help.

So, what's on your mind? Do you have a specific question or problem you'd like to talk about?assistant

I'm all ears! Take your time, and feel free to share as much or as little as you'd like.

Remember, everything we discuss is confidential and just between us. So,
---------------------------------

This is caused by incorrect initialization of the stream Decoder object. With this change applied, the same code produces the expected output, with no repeated leading token:

Hi there! I'd be happy to help you with whatever you need. What's on your mind?assistant

I'm glad you asked! I'm here to help you with any questions or problems you might have. Whether it's something big or something small, I'm here to help.

So, what's on your mind? Do you have a specific question or problem you'd like to talk about?assistant

I'm all ears! Take your time, and feel free to share as much or as little as you'd like.

Remember, everything we discuss is confidential and just between us. So,
---------------------------------
Hi there! I'd be happy to help you with whatever you need. What's on your mind?assistant

I'm glad you asked! I'm here to help you with any questions or problems you might have. Whether it's something big or something small, I'm here to help.

So, what's on your mind? Do you have a specific question or problem you'd like to talk about?assistant

I'm all ears! Take your time, and feel free to share as much or as little as you'd like.

Remember, everything we discuss is confidential and just between us. So,
---------------------------------
Hi there! I'd be happy to help you with whatever you need. What's on your mind?assistant

I'm glad you asked! I'm here to help you with any questions or problems you might have. Whether it's something big or something small, I'm here to help.

So, what's on your mind? Do you have a specific question or problem you'd like to talk about?assistant

I'm all ears! Take your time, and feel free to share as much or as little as you'd like.

Remember, everything we discuss is confidential and just between us. So,
---------------------------------
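
For readers unfamiliar with this class of bug, here is a minimal sketch of one common way that incorrect initialization can leak state between calls in Python: a mutable class attribute shared by every instance. The class names `LeakyDecoder` and `FixedDecoder`, the `feed` method, and the `run` helper are hypothetical and only illustrate the pattern; this is not the library's actual Decoder code, and the real change may differ.

```python
from typing import List


class LeakyDecoder:
    # Hypothetical illustration: a mutable class attribute is created once
    # and shared by every instance that never rebinds it.
    data: List[str] = []

    def feed(self, token: str) -> None:
        self.data.append(token)  # mutates the shared class-level list


class FixedDecoder:
    def __init__(self) -> None:
        # Each instance gets its own fresh buffer.
        self.data: List[str] = []

    def feed(self, token: str) -> None:
        self.data.append(token)


def run(decoder_cls, calls: int = 3) -> List[List[str]]:
    """Create a new decoder per call, feed one token, snapshot its buffer."""
    snapshots = []
    for _ in range(calls):
        decoder = decoder_cls()
        decoder.feed("Hi")
        snapshots.append(list(decoder.data))
    return snapshots


leaky = run(LeakyDecoder)
fixed = run(FixedDecoder)
print(leaky)  # [['Hi'], ['Hi', 'Hi'], ['Hi', 'Hi', 'Hi']]
print(fixed)  # [['Hi'], ['Hi'], ['Hi']]
```

In the leaky variant, each new decoder starts with the tokens left over from earlier decoders, which matches the symptom above of one extra "Hi" per previous invocation. Moving the state into `__init__` gives every stream an empty buffer.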

@mattt mattt merged commit a745d15 into main Apr 19, 2024
8 checks passed
@mattt mattt deleted the mattt/fix-stream-decoder-initialization branch April 19, 2024 21:48
Development

Successfully merging this pull request may close these issues.

LLama3 streaming repeats the previous request's first token.