LLMs and RAG Pipeline #47


Closed
Tracked by #43
Solobrad opened this issue Sep 30, 2024 · 34 comments
Comments

@Solobrad
Contributor

Solobrad commented Sep 30, 2024

Modern LLMs like Llama seem to outperform traditional RAG methods on long-context tasks, demonstrating improved context handling and understanding, which may lead to reconsidering the need for RAG in many scenarios.

However, I recently came across something called SRF-RAG, which offers several key benefits.

Retrieval: Retrieves relevant context from external sources.
Generation: Produces coherent responses based on retrieved context.
Instruction Tuning: Improves understanding of complex queries.
Hallucination Reduction: Minimizes incorrect or misleading information.
Multi-Hop Reasoning: Handles complex questions by synthesising information from multiple sources.

I think we could use LangChain to set up a pipeline that figures out whether an input needs a RAG-based approach or can be handled directly by the LLM.

Call for Contributions #43
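As a rough illustration of the routing idea above, here is a minimal plain-Python sketch. The keyword and length cues are purely hypothetical stand-ins; a real implementation might use a LangChain router chain or a dedicated LLM call to make the decision.

```python
# Hypothetical routing sketch (plain Python). A real implementation might use
# a LangChain router chain or an LLM call to make this decision; the keyword
# and length cues below are illustrative stand-ins only.

KNOWLEDGE_CUES = {"latest", "source", "document", "according", "cite", "when", "who"}

def needs_retrieval(query: str) -> bool:
    """Guess whether the query needs external context (i.e. the RAG path)."""
    words = [w.strip("?.,!").lower() for w in query.split()]
    # Long, specific questions or ones asking for sourced facts go to RAG.
    return len(words) > 12 or any(w in KNOWLEDGE_CUES for w in words)

def route(query: str) -> str:
    return "rag" if needs_retrieval(query) else "llm"

print(route("Hi, how are you?"))                                # → llm
print(route("According to the docs, who maintains the repo?"))  # → rag
```

Either branch then receives the query: the "llm" branch calls the model directly, while the "rag" branch retrieves context first.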

@ariG23498
Collaborator

Hey @Solobrad

Thanks for taking this bit up!

For the time being would you be interested in building a quick RAG pipeline with the Llama family of models? Once that is done, we could look into SRF-RAG as an enhancement.

The suggestion is based on the fact that this repository (huggingface-llama-recipes) is built with the idea of helping anyone get started quickly.

Please let me know how you feel about my suggestion. Also, feel free to ask any questions you have.

@Solobrad
Contributor Author

Hi @ariG23498

I'm in to collaborate with other teams on this repo. Thanks for the opportunity!

@ariG23498
Collaborator

That would be great!

But would you be open to implementing a very simple (here simple is the keyword) RAG pipeline in the first place?

If you are fine with that, I can redirect other contributors to this issue so that you can collaborate with them on this.

@Solobrad
Contributor Author

Solobrad commented Sep 30, 2024

Yup count me in

This was referenced Sep 30, 2024
@Purity-E

Hi @ariG23498 ,
Thanks for redirecting me to this issue. @Solobrad I'm looking forward to collaborating with you.

@Solobrad
Contributor Author

Same here, nice meeting you @Purity-E !

@Solobrad
Contributor Author

@Purity-E, we are required to start simple, so I've added a very basic pipeline code. If the PR is accepted, we can work on enhancements later.
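For readers following along, the retrieve-then-generate flow of a very basic pipeline can be sketched without any libraries. The word-overlap retriever and prompt assembly below are illustrative stand-ins only; the actual notebook uses real embeddings and a Llama model.

```python
# Library-free sketch of the retrieve-then-generate flow. The actual notebook
# uses a real embedding model and a Llama generator; here retrieval is simple
# word-overlap scoring and "generation" is just prompt assembly.

DOCS = [
    "Hugging Face hosts models, datasets, and demos for the ML community.",
    "RAG retrieves relevant context before the language model answers.",
    "Llama is a family of open large language models released by Meta.",
]

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Rank documents by word overlap with the query and return the top k."""
    q = set(query.lower().split())
    ranked = sorted(docs, key=lambda d: len(q & set(d.lower().split())), reverse=True)
    return ranked[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    """Stuff the retrieved context into a prompt for the LLM."""
    context = "\n".join(retrieve(query, docs))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

print(build_prompt("What does RAG retrieve?", DOCS))
```

The assembled prompt would then be passed to the generator; swapping the toy retriever for an embedding-based vector store is the natural enhancement.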

@Solobrad Solobrad changed the title Proposal for Call for Contributions #43: LLMs and RAG LLMs and RAG Pipeline Sep 30, 2024
@Solobrad
Contributor Author

@ariG23498 I see a few issues similar to mine; should we bring them over here to brainstorm enhancements?

@Purity-E

@Purity-E, we are required to start simple so I've added a very basic pipeline code. If the PR is accepted, we can work on enhancements later

Cool. Thanks for the update.

@ariG23498
Collaborator

@ariG23498 I see a few issues same as mine, should we bring them over here? Brainstorming on the enhancements.

Feel free to. Having said that, we are not really looking for a very complicated project with RAG. It should be enough to get anyone started with RAG using Llama.

This was referenced Oct 1, 2024
@atharv-jiwane
Contributor

Hey @ariG23498, thanks for redirecting me here.
Hi @Solobrad, looking forward to working with you!

@Solobrad
Contributor Author

Solobrad commented Oct 1, 2024

Hi @atharv-jiwane, welcome to the team! I like your idea; image retrieval can be more effective, especially with PDFs.

@atharv-jiwane
Contributor

Hey! This is my first time contributing to an open source project, so I am really excited! I saw the PR created for this issue and wanted to discuss how we are going to build on the initial commit. I also saw @ariG23498's comments on the PR and wanted to take that up. Let me know how you want to divide/distribute the work.

@Solobrad
Contributor Author

Solobrad commented Oct 1, 2024

Sure man, which part would you like to work on? I'm thinking of adding you both to my forked repo as collaborators so we can divide up the work and get started. Let me know if that works for you. @Purity-E @atharv-jiwane

@Purity-E

Purity-E commented Oct 1, 2024

@Solobrad sure that's okay

@atharv-jiwane
Contributor

@Solobrad Yup sounds good! I could take up the embedding part

@Solobrad
Contributor Author

Solobrad commented Oct 1, 2024

Cool, we'll be working on the "llama-rag" branch then.

@Solobrad
Contributor Author

Solobrad commented Oct 1, 2024

I'll check on the dataset.

@Solobrad
Contributor Author

Solobrad commented Oct 1, 2024

I've added a transcript dataset @atharv-jiwane, it's clean and pretty straightforward. You can try embedding it. Thanks

@atharv-jiwane
Contributor

I have tried embedding the dataset. I am not sure I committed the changes properly; @Solobrad, could you please guide me?

@Solobrad
Contributor Author

Solobrad commented Oct 1, 2024

Hey @atharv-jiwane, I saw an error about LLaMa access; try filling in the access form at https://huggingface.co/meta-llama/Llama-3.1-8B. Even though LLaMa is an openly released LLM, we normally have to submit an access application before we can use it from Hugging Face or Kaggle. Does this answer your question?

I changed the code a little because the variable name you used for the SentenceTransformer was overwriting the previous LLaMa model. Go ahead and check it out. Hope this helps.

@atharv-jiwane
Contributor

Hey @Solobrad, I've reviewed your changes, and I have filled in the access form for using LLaMa. Thank you for the information on that. I think it takes some time for the request to be reviewed.

Meanwhile, I think the only changes I have made are in the embeddings section right after the dataset is imported, so could you please commit an error-free version of the code to the "llama-rag" branch?

@atharv-jiwane
Contributor

atharv-jiwane commented Oct 1, 2024

Hey @Solobrad , I have added an LLM pipeline in the latest commit and fixed the earlier auth issues with LLaMa models. I tried to run the query but it took too long to generate a response. Could you please guide me as to where I am going wrong?

Also, the earlier version of the embeddings that I wrote could instead just be done when we create the vector store, right?

@Solobrad
Contributor Author

Solobrad commented Oct 2, 2024

Yup @atharv-jiwane , just create the vector store and let it handle embedding the documents. You don't need to separately encode them. If this was what you were asking.

I'll try checking on the prolonged response time.
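That point can be sketched with a minimal in-memory store: the store owns the embedding step, so callers add and query with raw text and never encode separately. The toy bag-of-words `embed()` below is a hypothetical stand-in for a real SentenceTransformer model.

```python
import math
from collections import Counter

# Sketch of the point above: the vector store owns the embedding step, so
# callers add raw text and search with raw text -- no separate encoding pass.
# embed() is a toy bag-of-words vectorizer standing in for a real model.

def embed(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class VectorStore:
    def __init__(self) -> None:
        self.items: list[tuple[str, Counter]] = []

    def add(self, text: str) -> None:
        self.items.append((text, embed(text)))  # embedding happens here

    def search(self, query: str, k: int = 1) -> list[str]:
        qv = embed(query)
        ranked = sorted(self.items, key=lambda it: cosine(qv, it[1]), reverse=True)
        return [text for text, _ in ranked[:k]]

store = VectorStore()
store.add("Llama models require an access request on Hugging Face.")
store.add("The dataset is a clean transcript corpus.")
print(store.search("How do I get Llama access?"))
# → ['Llama models require an access request on Hugging Face.']
```

Real vector stores (e.g. the one used with LangChain in the notebook) follow the same contract: documents go in as text, the embedding model is plugged in once at construction time.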

@atharv-jiwane
Contributor

Cool, so @Solobrad, let's do away with the separate encodings? Also, can we add GPU support? I am running this locally on a MacBook Air M2, so I think GPU support would be nice.

Also, pertaining to the response time: when I first passed the query ("What is Hugging Face.") to the LLM, an error was generated saying the generation exceeded the max_new_tokens limit of 20. This might also be causing an issue.

@Solobrad
Contributor Author

Solobrad commented Oct 3, 2024

Hi, I solved the max token problem, and I attribute the "long" response time to running the model locally. I've tried using the API directly and also a smaller model.

@Purity-E @atharv-jiwane, I've pushed the latest runnable code, go on and have a try.

@atharv-jiwane
Contributor

@Solobrad Thanks for the update! I'll have a go soon; I'm a bit busy at the moment.

@Solobrad
Contributor Author

Solobrad commented Oct 4, 2024

@sinatayebati will be joining us.

@Solobrad
Contributor Author

Solobrad commented Oct 5, 2024

Hey everyone, I suggest adding some compelling markdown so users can easily read what's going on (as mentioned before), making it read like a simple demo or tutorial. What's your take on this? @sinatayebati @atharv-jiwane @Purity-E

PS: I added the LLaMa back

@sinatayebati
Contributor

@Solobrad Hey Nicholas. Thanks for the latest commits. In my opinion, this latest notebook should be very close to what the HF team has in mind. I also just pushed two minor updates:

  • added a line at the beginning of the notebook to pip install the required libraries
  • updated the README with a section pointing to this notebook.

@Solobrad
Contributor Author

Solobrad commented Oct 5, 2024

Awesome, thanks!

@atharv-jiwane
Contributor

Hey @Solobrad! I think the latest commit looks good. We should consult the maintainers and ask for their opinion on this.

@Solobrad
Contributor Author

Solobrad commented Oct 7, 2024

I've updated the code according to the latest requirements, guys @Purity-E @atharv-jiwane. Feel free to add any markdown or so. You should use Google Colab if you want to run the code.

ariG23498 pushed a commit that referenced this issue Oct 7, 2024
* Adding a simple LLaMa-RAG pipeline

* Adding datasets from Hugging Face

* Update llama_rag_pipeline.ipynb

* changing model names and removing unused components

* Added llm pipeline and fixed prev LLaMa issue

* Access LLaMa by API

* left out files

* removing fles

* working version of simple RAG LLaMa

* Cleaning up comments and unused components

* Modifying the prompt

* Commenting out chat template code

* adding chat template

* removing files

* simple RAG

* adding LLaMa

* changing system prompt

* pip install reuqired libraries + readme update

* Adding markdowns

* removing outputs

* updated pip install + resolving conflict in readme

* bringing back the readme note after conflict reolsved

---------

Co-authored-by: atharv-jiwane <[email protected]>
Co-authored-by: sinatayebati <[email protected]>
@ariG23498
Collaborator

Closing this issue as the PR has been merged! Thanks for the great contribution.
