
Commit 6af3619

Merge branch 'main' into fsdp_lmm

2 parents 8715e04 + 82d4049

4 files changed (+4 lines, -3 lines)

.github/scripts/spellcheck_conf/wordlist.txt (+1)

@@ -1483,3 +1483,4 @@ ttft
 uv
 8xL40S
 xL
+EDA

docs/multi_gpu.md (+1, -1)

@@ -4,7 +4,7 @@ To run fine-tuning on multi-GPUs, we will make use of two packages:

 1. [PEFT](https://huggingface.co/blog/peft) methods, in particular using the Hugging Face [PEFT](https://github.com/huggingface/peft) library.

-2. [FSDP](https://pytorch.org/tutorials/intermediate/FSDP_adavnced_tutorial.html) which helps us parallelize the training over multiple GPUs. [More details](LLM_finetuning.md/#2-full-partial-parameter-finetuning).
+2. [FSDP](https://pytorch.org/tutorials/intermediate/FSDP_adavnced_tutorial.html) which helps us parallelize the training over multiple GPUs. [More details](./LLM_finetuning.md).

 Given the combination of PEFT and FSDP, we would be able to fine-tune a Meta Llama 8B model on multiple GPUs in one node.
 For big models like 405B we will need to fine-tune in a multi-node setup even if 4-bit quantization is enabled.
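For orientation, here is a minimal sketch of how the two packages compose: PEFT freezes the base weights and adds a small set of trainable LoRA adapters, while FSDP shards the model across the GPUs in one node. This is an illustrative sketch, not the recipe's training script; the checkpoint name, LoRA hyperparameters, and launch command are assumptions.

```python
# Illustrative sketch only, not the recipe's finetuning script.
# Assumes launch via `torchrun --nproc_per_node <num_gpus> sketch.py`;
# the checkpoint name and LoRA settings are placeholders.
import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

dist.init_process_group("nccl")
torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())

model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B")
# PEFT: freeze base weights, attach small trainable LoRA adapters.
model = get_peft_model(
    model, LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"])
)
# FSDP: shard parameters, gradients, and optimizer state across ranks;
# use_orig_params lets FSDP handle the mix of frozen and trainable weights.
model = FSDP(model, device_id=torch.cuda.current_device(), use_orig_params=True)
optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4
)
```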

recipes/experimental/long_context/H2O/README.md (+1, -1)

@@ -36,7 +36,7 @@ Expected results on XSUM (Rouge-2 score, the higher the better) from the above s

 ### One Demo on Streaming to "Infinite" Context Length

-The following example demonstrates the generation process of "infinite" sequence length. We use MT-Bench data and generate the context sample-by-sample. The KV Cache will keep the KV pairs from the previous samples while maintain a fixed size. Results can be found on [Demo](https://allenz.work/?p=11) (Video 1).
+The following example demonstrates the generation process at "infinite" sequence length. We use MT-Bench data and generate the context sample-by-sample. The KV cache keeps the KV pairs from previous samples while maintaining a fixed size.

 ```
 # run with full cache
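To make the fixed-size behavior concrete, here is a minimal sketch of the eviction idea described above: keep a window of the most recent tokens plus the "heavy hitter" tokens with the largest accumulated attention scores, and drop everything else. The function name, budgets, and scoring are illustrative assumptions, not the H2O recipe's actual code.

```python
# Illustrative sketch only, not the H2O implementation: fixed-budget KV cache
# eviction that keeps recent tokens plus high-attention "heavy hitters".
import torch

def evict_kv(keys, values, acc_scores, heavy_budget=64, recent_budget=64):
    """keys/values: [seq_len, ...]; acc_scores: [seq_len] accumulated attention."""
    seq_len = keys.shape[0]
    if seq_len <= heavy_budget + recent_budget:
        return keys, values, acc_scores  # under budget, nothing to evict
    # Always keep the most recent tokens.
    recent = torch.arange(seq_len - recent_budget, seq_len)
    # Among the older tokens, keep those with the largest accumulated scores.
    heavy = torch.topk(acc_scores[: seq_len - recent_budget], heavy_budget).indices
    keep = torch.cat([heavy.sort().values, recent])
    return keys[keep], values[keep], acc_scores[keep]
```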

recipes/quickstart/agents/DeepLearningai_Course_Notebooks/Building_Agentic_RAG_with_Llamaindex_L1_Router_Engine.ipynb (+1, -1)

@@ -4,7 +4,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-   "<a href=\"https://colab.research.google.com/github/meta-llama/llama-recipes/blob/main/recipes/quickstart/agents/dlai/Building_Agentic_RAG_with_Llamaindex_L1_Router_Engine.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
+   "<a href=\"https://colab.research.google.com/github/meta-llama/llama-recipes/blob/main/recipes/quickstart/agents/DeepLearningai_Course_Notebooks/Building_Agentic_RAG_with_Llamaindex_L1_Router_Engine.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
   ]
 },
 {
