Feature request
Nous Research and EleutherAI have recently released the YaRN models, which come in two variants with context sizes of 64k and 128k tokens. These models use RoFormer-style rotary position embeddings (RoPE), applied differently than in GPT-NeoX and GPT-J. They are built on the Llama 2 architecture, so they are largely compatible with the existing Llama support and should need only minor adjustments to work fully.
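For reference, the core change is YaRN's "NTK-by-parts" interpolation of the RoPE frequencies: high-frequency dimensions keep their original rotation speed, low-frequency dimensions are slowed down by the context-extension factor, and a linear ramp blends the two in between. A minimal NumPy sketch of that interpolation, using the paper's default `beta_fast`/`beta_slow` values (function and parameter names here are illustrative, not an existing transformers API):

```python
import numpy as np

def yarn_inv_freq(dim=128, base=10000.0, scale=16.0,
                  orig_ctx=4096, beta_fast=32.0, beta_slow=1.0):
    """Sketch of YaRN's NTK-by-parts RoPE frequency interpolation.

    dim: per-head dimension; scale: extension factor (e.g. 16 for 4k -> 64k);
    beta_fast/beta_slow: ramp boundaries, paper defaults 32 and 1.
    """
    # Standard RoPE inverse frequencies for even dimensions 0, 2, ..., dim-2.
    pos_freqs = base ** (np.arange(0, dim, 2) / dim)
    inv_freq_extrapolation = 1.0 / pos_freqs             # unchanged RoPE
    inv_freq_interpolation = 1.0 / (scale * pos_freqs)   # position interpolation

    # Dimension at which a frequency completes `num_rotations` full rotations
    # within the original context window.
    def correction_dim(num_rotations):
        return (dim * np.log(orig_ctx / (num_rotations * 2 * np.pi))
                / (2 * np.log(base)))

    low = max(int(np.floor(correction_dim(beta_fast))), 0)
    high = min(int(np.ceil(correction_dim(beta_slow))), dim // 2 - 1)

    # Linear ramp over dimensions: 0 -> interpolate fully, 1 -> keep original.
    ramp = np.clip((np.arange(dim // 2) - low) / max(high - low, 1e-3), 0.0, 1.0)
    keep_mask = 1.0 - ramp

    return (inv_freq_interpolation * (1.0 - keep_mask)
            + inv_freq_extrapolation * keep_mask)
```

The paper additionally rescales the attention logits by roughly `0.1 * ln(scale) + 1` (its "mscale" term), which any integration would also need to account for.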
Motivation
The YaRN models' longer context length (up to 128k tokens) is highly valuable for tasks involving extensive context, compared with the 4096-token limit of the Llama 2 base model.
Other
YaRN paper: YaRN: Efficient Context Window Extension of Large Language Models (arXiv:2309.00071)
YaRN code: https://github.com/jquesnelle/yarn
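In the meantime, the released checkpoints can apparently be run through the custom modeling code bundled with the Hugging Face repos, along these lines (the hub ID below is the 128k variant published by Nous Research; `trust_remote_code` pulls in the repo's own YaRN rotary-embedding implementation rather than a built-in transformers path):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Interim usage until native support lands; relies on the custom code
# shipped in the model repository, not on transformers itself.
model_id = "NousResearch/Yarn-Llama-2-7b-128k"  # assumed hub ID; a 64k variant also exists
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
```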