Pydantic Issue when running Ollama + FastAPI backend #244

Closed
BastianSpatz opened this issue Aug 19, 2024 · 9 comments · Fixed by #275

@BastianSpatz

When using Ollama as a model source, I get the error:

ERROR: Error when generating next question: 1 validation error for LLMStructuredPredictEndEvent output value is not a valid dict (type=type_error.dict)

when it tries to generate the NextQuestions.

output: NextQuestions = await Settings.llm.astructured_predict(
    NextQuestions,
    prompt=NEXT_QUESTIONS_SUGGESTION_PROMPT,
    conversation=conversation,
    number_of_questions=number_of_questions,
)

I think this is a llama-index/pydantic problem triggered when astructured_predict dispatches dispatcher.event(LLMStructuredPredictEndEvent(output=result)).

Has anybody seen or fixed this error?
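
For reference, here is roughly the setup that hits the error. This is a minimal sketch, not the exact create-llama code: the NextQuestions model, the prompt, and the Ollama settings are my simplified assumptions.

import asyncio
from typing import List

from pydantic import BaseModel
from llama_index.core import Settings
from llama_index.core.prompts import PromptTemplate
from llama_index.llms.ollama import Ollama


class NextQuestions(BaseModel):
    """Structured output the prediction is expected to return."""

    questions: List[str]


NEXT_QUESTIONS_SUGGESTION_PROMPT = PromptTemplate(
    "Conversation:\n{conversation}\n"
    "Suggest {number_of_questions} follow-up questions the user might ask."
)

# Assumed Ollama configuration (llama 3.1 8B served locally).
Settings.llm = Ollama(model="llama3.1", request_timeout=120.0)


async def main() -> None:
    # The validation error surfaces inside llama-index's instrumentation, which
    # wraps the structured output in an LLMStructuredPredictEndEvent.
    output: NextQuestions = await Settings.llm.astructured_predict(
        NextQuestions,
        prompt=NEXT_QUESTIONS_SUGGESTION_PROMPT,
        conversation="User: What is RAG?\nAssistant: ...",
        number_of_questions=3,
    )
    print(output.questions)


asyncio.run(main())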

@marcusschiesser
Collaborator

  1. The astructured_predict call requires good function calling. What model are you using?
  2. The code generated by create-llama only tries this call and shouldn't show the next questions if it fails. Is that the behavior you're seeing?

@BastianSpatz
Author

Thanks for the reply.

  1. I'm using llama 3.1 8B
  2. And yes, it just throws the error and doesn't generate questions, but the app works fine nonetheless.

@marcusschiesser
Collaborator

@BastianSpatz
I guess that the model is not capable enough to use structured_predict.

As TypeScript doesn't have structured_predict, it's using a simple LLM call whose output is parsed; see:

export async function generateNextQuestions(
  conversation: ChatMessage[],
  numberOfQuestions: number = N_QUESTIONS_TO_GENERATE,
) {
  const llm = Settings.llm;
  // Format conversation
  const conversationText = conversation
    .map((message) => `${message.role}: ${message.content}`)
    .join("\n");
  const message = NEXT_QUESTION_PROMPT_TEMPLATE.replace(
    "$conversation",
    conversationText,
  ).replace("$number_of_questions", numberOfQuestions.toString());
  try {
    const response = await llm.complete({ prompt: message });
    const questions = extractQuestions(response.text);
    return questions;
  } catch (error) {
    console.error("Error when generating the next questions: ", error);
    return [];
  }
}

Can you try using the Next.js template first with your Ollama model? If that works, you could modify suggestion.py accordingly.
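
If it helps, a quick standalone check of the completion-plus-parsing approach against an Ollama model could look like the sketch below. The model name, prompt wording, and extract_questions helper are placeholders for illustration, not create-llama code.

import re
from typing import List

from llama_index.llms.ollama import Ollama

PROMPT = (
    "Here is a conversation:\n{conversation}\n"
    "Suggest {n} follow-up questions, wrapped in triple backticks, one per line."
)


def extract_questions(text: str) -> List[str]:
    # Keep only the lines between the triple backticks that look like questions.
    match = re.search(r"```(.*?)```", text, re.DOTALL)
    if not match:
        return []
    return [line.strip() for line in match.group(1).splitlines() if "?" in line]


llm = Ollama(model="llama3.1", request_timeout=120.0)
response = llm.complete(
    PROMPT.format(conversation="User: What is RAG?\nAssistant: ...", n=3)
)
print(extract_questions(response.text))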

@BastianSpatz
Author

Thank you for the help, I'll check it out :)

@marcusschiesser
Collaborator

Great. Can you let me know the result? We can keep the ticket open till then.

@BastianSpatz
Author

Using the same approach as in the TypeScript version, it works.

@marcusschiesser
Collaborator

Cool. Can you send a PR or post your changes here?

@BastianSpatz
Author

Sorry, here is what I changed in suggestion.py:

# Imports needed by this snippet; Message, N_QUESTION_TO_GENERATE and logger
# come from the rest of the existing suggestion.py.
import re
from typing import List

from llama_index.core import Settings
from llama_index.core.prompts import PromptTemplate

NEXT_QUESTIONS_SUGGESTION_PROMPT = PromptTemplate(
    "You're a helpful assistant! Your task is to suggest the next question that the user might ask. "
    "\nHere is the conversation history:"
    "\n---------------------\n{conversation}\n---------------------\n"
    "Given the conversation history, please give me {number_of_questions} questions that you might ask next! "
    "Keep the answers relevant to the conversation history and its context. "
    "Your answer should be wrapped in triple backticks and follow this format:\n"
    "```\n"
    "<question 1>\n"
    "<question 2>\n```"
)

class NextQuestionSuggestion:
    @staticmethod
    def suggest_next_questions(
        messages: List[Message],
        number_of_questions: int = N_QUESTION_TO_GENERATE,
    ) -> List[str]:
        """
        Suggest the next questions that the user might ask based on the conversation history.
        Return an empty list if there is an error.
        """
        try:
            # Reduce the cost by only using the last two messages
            last_user_message = None
            last_assistant_message = None
            for message in reversed(messages):
                if message.role == "user":
                    last_user_message = f"User: {message.content}"
                elif message.role == "assistant":
                    last_assistant_message = f"Assistant: {message.content}"
                if last_user_message and last_assistant_message:
                    break
            conversation: str = f"{last_user_message}\n{last_assistant_message}"

            # output: NextQuestions = await Settings.llm.astructured_predict(
            #     NextQuestions,
            #     prompt=NEXT_QUESTIONS_SUGGESTION_PROMPT,
            #     conversation=conversation,
            #     number_of_questions=number_of_questions,
            # )
            prompt = (
                NEXT_QUESTIONS_SUGGESTION_PROMPT.get_template()
                .replace("{conversation}", conversation)
                .replace("{number_of_questions}", str(number_of_questions))
            )
            output = Settings.llm.complete(prompt)
            questions = extract_questions_from_text(output.text)

            return questions
        except Exception as e:
            logger.error(f"Error when generating next question: {e}")
            return []


def extract_questions_from_text(text: str) -> List[str]:
    # Regular expression to match content within triple backticks
    pattern = r"```(.*?)```"

    # Find the content inside the backticks
    match = re.search(pattern, text, re.DOTALL)

    if match:
        # Split the content by newlines and strip any leading/trailing whitespace
        questions = [
            line.strip() for line in match.group(1).splitlines() if line.strip()
        ]
        questions = [question for question in questions if "?" in question]
        return questions

    return []

I have noticed that after a few questions, the format of the questions output by the LLM seems to deteriorate.

@marcusschiesser
Collaborator

Thanks @BastianSpatz

@marcusschiesser marcusschiesser moved this from Todo to In Progress in Framework Sep 6, 2024
@github-project-automation github-project-automation bot moved this from In Progress to Done in Framework Sep 9, 2024