LLama3 streaming repeats the previous request's first token. #287
Comments
Hi @mikutsky. Thanks for reporting this. Can you share any predictions for these? (Go to your replicate.com Dashboard, look under Predictions). Seeing that would help us tell if the problem is in the model or the client library. |
Hi, I have the same issue. |
@Gusakovskyi @mikutsky We've confirmed that there's an issue with stop sequences for |
It looks like a client library problem. I'm sharing the details of the second query, because later queries accumulate the errors in their prompts. Everything looks correct on the dashboard. Here is the prompt for the second query, which is still correct:
However, the console output contains an extra 'Hi' token:
|
@mikutsky We just pushed a new build of the model, which should address the stop sequence problem. Please give your client code another try and let me know if that's working for you now. If not, could you please try calling |
Actually, I'm able to reproduce this in isolation, so it does appear to be an issue with the client. Working on a fix now. |
Fixes #287

Given the following code:

```python
import replicate


def go():
    for event in replicate.stream(
        "meta/meta-llama-3-8b-instruct",
        input={
            "top_p": 0.9,
            "prompt": "Hi! Help me please:)",
            "max_tokens": 512,
            "min_tokens": 0,
            "temperature": 0.01,
            "stop_sequences": "<|end_of_text|>",
            "prompt_template": "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\nYou are a helpful assistant<|eot_id|><|start_header_id|>user<|end_header_id|>\n\n{prompt}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n",
            "presence_penalty": 0,
            "frequency_penalty": 1,
        },
    ):
        print(str(event), end="")
    print()


go()
print("---------------------------------")
go()
print("---------------------------------")
go()
print("---------------------------------")
```

The latest release repeats the same token once for each previous invocation.

<details>

```
Hi there! I'd be happy to help you with whatever you need. What's on your mind?assistant I'm glad you asked! I'm here to help you with any questions or problems you might have. Whether it's something big or something small, I'm here to help. So, what's on your mind? Do you have a specific question or problem you'd like to talk about?assistant I'm all ears! Take your time, and feel free to share as much or as little as you'd like. Remember, everything we discuss is confidential and just between us. So,
---------------------------------
Hi Hi there! I'd be happy to help you with whatever you need. What's on your mind?assistant I'm glad you asked! I'm here to help you with any questions or problems you might have. Whether it's something big or something small, I'm here to help. So, what's on your mind? Do you have a specific question or problem you'd like to talk about?assistant I'm all ears! Take your time, and feel free to share as much or as little as you'd like. Remember, everything we discuss is confidential and just between us. So,
---------------------------------
Hi Hi Hi there! I'd be happy to help you with whatever you need. What's on your mind?assistant I'm glad you asked! I'm here to help you with any questions or problems you might have. Whether it's something big or something small, I'm here to help. So, what's on your mind? Do you have a specific question or problem you'd like to talk about?assistant I'm all ears! Take your time, and feel free to share as much or as little as you'd like. Remember, everything we discuss is confidential and just between us. So,
---------------------------------
```

</details>

This is caused by incorrect initialization of the stream `Decoder` object. After applying this change, the code produces the correct behavior:

<details>

```
Hi there! I'd be happy to help you with whatever you need. What's on your mind?assistant I'm glad you asked! I'm here to help you with any questions or problems you might have. Whether it's something big or something small, I'm here to help. So, what's on your mind? Do you have a specific question or problem you'd like to talk about?assistant I'm all ears! Take your time, and feel free to share as much or as little as you'd like. Remember, everything we discuss is confidential and just between us. So,
---------------------------------
Hi there! I'd be happy to help you with whatever you need. What's on your mind?assistant I'm glad you asked! I'm here to help you with any questions or problems you might have. Whether it's something big or something small, I'm here to help. So, what's on your mind? Do you have a specific question or problem you'd like to talk about?assistant I'm all ears! Take your time, and feel free to share as much or as little as you'd like. Remember, everything we discuss is confidential and just between us. So,
---------------------------------
Hi there! I'd be happy to help you with whatever you need. What's on your mind?assistant I'm glad you asked! I'm here to help you with any questions or problems you might have. Whether it's something big or something small, I'm here to help. So, what's on your mind? Do you have a specific question or problem you'd like to talk about?assistant I'm all ears! Take your time, and feel free to share as much or as little as you'd like. Remember, everything we discuss is confidential and just between us. So,
---------------------------------
```

</details>

Signed-off-by: Mattt Zmuda <[email protected]>
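For readers wondering how "incorrect initialization" of a decoder could repeat exactly one token per earlier call, here is a minimal sketch of one plausible mechanism: a mutable class attribute shared across instances. This is an assumption for illustration, not the library's actual code. `BuggyDecoder`, `FixedDecoder`, `stream`, and `LINES` are hypothetical names, and the line format is a simplified stand-in for server-sent events.

```python
from typing import List, Optional, Type


class BuggyDecoder:
    # Hypothetical bug: this list is created once, at class definition,
    # and is shared by every instance until an instance rebinds `data`.
    data: List[str] = []

    def decode(self, line: str) -> Optional[str]:
        if line == "":
            # A blank line ends an event: emit the buffered data, then reset.
            text = "\n".join(self.data)
            # This rebinds `data` on the instance only. The shared class
            # list keeps whatever the first event appended to it.
            self.data = []
            return text
        if line.startswith("data: "):
            self.data.append(line[len("data: "):])
        return None


class FixedDecoder(BuggyDecoder):
    def __init__(self) -> None:
        # Each instance starts with its own empty buffer.
        self.data = []


def stream(decoder_cls: Type[BuggyDecoder], lines: List[str]) -> str:
    """Decode one simulated stream with a freshly created decoder."""
    decoder = decoder_cls()
    out = []
    for line in lines:
        text = decoder.decode(line)
        if text is not None:
            out.append(text)
    return "".join(out)


# Two simulated events: the token "Hi", then the token " there!".
LINES = ["data: Hi", "", "data:  there!", ""]
```

In a fresh interpreter, three successive calls to `stream(BuggyDecoder, LINES)` return `"Hi there!"`, `"Hi\nHi there!"`, and `"Hi\nHi\nHi there!"`: only the first token of each stream lands in the shared list, so one extra copy appears per earlier call. `stream(FixedDecoder, LINES)` returns `"Hi there!"` every time.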
@mikutsky @Gusakovskyi Thanks again for reporting. This should be fixed by 0.25.2. Please let me know if you continue to see this behavior. |
Thanks a lot! It works! |
Hi! I'm running into a problem where the first token is repeated in subsequent requests when using a stream. The prompt structure follows the Meta Llama 3 documentation. Could you explain why this is happening?
The output of a simple chat example looks like this:
Example code:
Thanks for your help!