Document summarizer #952
Is it possible to use the current solution/indexer to summarize documents?
You could potentially use this for summarizing documents if you amend the prompt in the approaches files. How are you envisioning people interacting with it? Would they say "summarize [document name]", or would they select a document from a list and then ask for a summary? I'd wonder whether it'd be more efficient to generate the summaries ahead of time, for every document, instead of waiting for a user to ask.
We would like to copy/paste our internal company text into the prompt and have the model summarize it, instead of using the "public" OpenAI service. There is a 1000-character limit on the input prompt. Could we change this (and to what limit), and how would that impact performance?
Hm, what do you mean by a limit of 1000 characters for the input prompt? I believe the limit should be for the entire request, and that usually varies between 4K and 128K tokens, depending on what model you're using. Generally, performance does depend on both input tokens and output tokens. If you know you always want to summarize, then you could run an offline script to generate the summaries and save them.
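A minimal sketch of such an offline script, assuming the openai npm package (v4+) with OPENAI_API_KEY set in the environment; the file paths, prompt, and model name below are illustrative placeholders, not part of this repo:

```typescript
// summarize_offline.ts: pre-generate a summary for each document.
// Assumes the `openai` npm package and OPENAI_API_KEY in the environment;
// paths and model name are placeholders.
import OpenAI from "openai";
import { promises as fs } from "fs";

const client = new OpenAI();

async function summarize(text: string): Promise<string> {
  const response = await client.chat.completions.create({
    model: "gpt-4o-mini", // any chat model with enough context for your documents
    messages: [
      { role: "system", content: "Summarize the following document in a short paragraph." },
      { role: "user", content: text },
    ],
  });
  return response.choices[0].message.content ?? "";
}

async function main() {
  const docs = ["docs/report1.txt", "docs/report2.txt"]; // hypothetical input files
  for (const path of docs) {
    const text = await fs.readFile(path, "utf-8");
    const summary = await summarize(text);
    await fs.writeFile(path.replace(/\.txt$/, ".summary.txt"), summary);
  }
}

main().catch(console.error);
```

If you are going through Azure OpenAI rather than the public service, you would configure the client with your Azure endpoint and deployment instead, but the loop stays the same.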
Well, if I try to paste more than 1000 characters into the question input (prompt), I can't. I think this is connected to the limitation in QuestionInput.tsx (frontend/src/components/QuestionInput/QuestionInput.tsx), starting on line 44: const onQuestionChange = (_ev: React.FormEvent<HTMLInputElement | HTMLTextAreaElement>, newValue?: string) => {
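For reference, the handler around that line presumably looks something like this (a hedged reconstruction based on the comment above; the exact body in your checkout may differ):

```tsx
// frontend/src/components/QuestionInput/QuestionInput.tsx (abridged sketch)
// The handler silently ignores any input beyond 1000 characters, which is
// why pasting longer text appears to do nothing.
const onQuestionChange = (_ev: React.FormEvent<HTMLInputElement | HTMLTextAreaElement>, newValue?: string) => {
    if (!newValue) {
        setQuestion("");
    } else if (newValue.length <= 1000) {
        setQuestion(newValue);
    }
};
```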
Ah, good find! I didn't write that original code, so I didn't realize we had a limitation imposed in the frontend. I'm not sure we need it there, given that developers may be using GPT models with different context length limits. @mattgotteiner What do you think about removing that constraint? We could also ask @pablocastro about its original intent, since he wrote that line. You can remove it for now, for your own needs.
We should remove this constraint; it is probably due to an oversight in the original implementation. Thanks for finding this problem.
+1, feel free to change. I suspect this was originally in place to make sure the input + instructions didn't exceed the context window limit, but those limits have gone up, and there are probably better ways to restrict this if needed. As a general good practice, you still want a limit (either a higher one, or a more elaborate one that depends on the target LLM's max prompt length), since you should not allow inputs that haven't been tested. That means you need a test that tries whatever static or dynamic max-length input you decide to allow.
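One way such a model-dependent limit could look; the token budgets and the chars-per-token heuristic below are illustrative assumptions, not values from this repo:

```typescript
// Sketch: derive the input cap from the target model's context window instead
// of a hard-coded 1000 characters. The numbers below are assumptions.
const MODEL_CONTEXT_TOKENS: Record<string, number> = {
  "gpt-35-turbo": 4096,
  "gpt-4": 8192,
  "gpt-4-32k": 32768,
};

const RESERVED_TOKENS = 1024; // leave room for system instructions + the answer
const CHARS_PER_TOKEN = 4;    // rough heuristic for English text

function maxQuestionLength(model: string): number {
  const contextTokens = MODEL_CONTEXT_TOKENS[model] ?? 4096;
  return (contextTokens - RESERVED_TOKENS) * CHARS_PER_TOKEN;
}

// In the change handler, the hard-coded 1000 would then become:
// if (newValue.length <= maxQuestionLength(targetModel)) { setQuestion(newValue); }
```

And, per the point above, whichever cap you pick should be covered by a test that submits an input of exactly that length.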