llama: implement YaRN RoPE scaling #2268
Merged
Commits (36)
8dec38c (cebtenzzre) llama: implement NTK-By-Parts (NTKv2) RoPE scaling
6aeb46b (cebtenzzre) CUDA implementation
9348aa4 (cebtenzzre) Metal implementation
a30ae20 (cebtenzzre) implement new YaRN algorithm
b5ced4f (cebtenzzre) Merge branch 'master' of https://github.com/ggerganov/llama.cpp into …
826269a (cebtenzzre) ggml : increase GGML_MAX_OP_PARAMS
cf731d5 (cebtenzzre) YaRN : avoid NaN if unused betas are zero
dcb058c (cebtenzzre) YaRN : fix missing parameter in CUDA impl
281b26e (cebtenzzre) convert : reduce unnecessary variables in Params
a06c729 (cebtenzzre) Merge branch 'master' of https://github.com/ggerganov/llama.cpp into …
dc26a0d (cebtenzzre) llama : simplify use of context params
904d4ed (cebtenzzre) llama : store YaRN parameters in GGUF
56abb9a (cebtenzzre) fix convert scripts
43eaf06 (cebtenzzre) llama : fix C compatibility
fe788c4 (cebtenzzre) don't hardcode max_pos_emb
e0b120c (cebtenzzre) address review comments
19bb74e (cebtenzzre) restore backwards compatiblity with *.rope.scale_linear
4d5fe73 (cebtenzzre) better option descriptions in help
7466415 (cebtenzzre) gguf : store scaling type as a string instead of an int
4f4e948 (cebtenzzre) improve printing of YaRN parameters
5d7a3a5 (cebtenzzre) allow forcing ext_factor to zero if scaling type is YaRN
9bd050f (cebtenzzre) Merge branch 'master' of https://github.com/ggerganov/llama.cpp into …
babf0e0 (cebtenzzre) fix rope_cuda parameter order
0050e1e (cebtenzzre) default n_yarn_orig_ctx to n_ctx_train
09c3102 (cebtenzzre) fix uninitialized cparams
57c3442 (cebtenzzre) make printed param formatting more consistent
a20b3e6 (cebtenzzre) fix missing import
9ef91b1 (cebtenzzre) Merge branch 'master' of https://github.com/ggerganov/llama.cpp into …
9ae10b3 (jquesnelle) Fix YaRN inverted scaling and add "rope.scaling.type" to GGUF (#1)
14cf93b (jquesnelle) fix YaRN ramp, make mscale conditional, add --yarn-orig-ctx (#2)
237f1e7 (cebtenzzre) Merge branch 'master' of https://github.com/ggerganov/llama.cpp into …
bc8395d (cebtenzzre) Merge branch 'master' of https://github.com/ggerganov/llama.cpp into …
4d5ed83 (cebtenzzre) Merge branch 'master' of https://github.com/ggerganov/llama.cpp into …
9fc8238 (jquesnelle) fix loading rope.scaling.original_context_length from GGUF (#3)
15f26ef (cebtenzzre) implement YaRN for GPT-NeoX RoPE
081f738 (cebtenzzre) Merge branch 'master' of https://github.com/ggerganov/llama.cpp into …
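The commit log above repeatedly references the moving parts of YaRN: the NTK-by-parts ramp, the beta_fast/beta_slow correction range, and the mscale attention correction. As a rough sketch of the algorithm as described in the YaRN paper — not the PR's actual ggml/CUDA code, and with illustrative function names and default parameters — per-dimension RoPE frequencies are blended between "leave alone" (high-frequency dims) and "divide by the scale factor" (low-frequency dims):

```python
import math

def yarn_correction_dim(n_rot, dim, base=10000.0, orig_ctx=2048):
    # Pair index at which a dimension completes n_rot rotations over the
    # original context (inverting r(p) = orig_ctx * base**(-2p/dim) / (2*pi)).
    return dim * math.log(orig_ctx / (n_rot * 2 * math.pi)) / (2 * math.log(base))

def yarn_ramp(lo, hi, p):
    # Linear ramp over pair index p: 0 below lo, 1 above hi.
    if lo == hi:
        hi += 0.001  # guard against division by zero when the range collapses
    return min(1.0, max(0.0, (p - lo) / (hi - lo)))

def yarn_scaled_freqs(dim, scale, base=10000.0, orig_ctx=2048,
                      beta_fast=32.0, beta_slow=1.0):
    # NTK-by-parts: dims rotating more than beta_fast times over the original
    # context keep their frequency; dims rotating fewer than beta_slow times
    # are fully interpolated (divided by scale); a linear blend in between.
    lo = max(0, math.floor(yarn_correction_dim(beta_fast, dim, base, orig_ctx)))
    hi = min(dim - 1, math.ceil(yarn_correction_dim(beta_slow, dim, base, orig_ctx)))
    freqs = []
    for d in range(0, dim, 2):
        inv_freq = base ** (-d / dim)        # standard RoPE inverse frequency
        mix = yarn_ramp(lo, hi, d / 2)       # 0 = keep, 1 = interpolate
        freqs.append(inv_freq * (1 - mix) + inv_freq / scale * mix)
    return freqs

def yarn_mscale(scale):
    # Attention magnitude correction from the YaRN paper: 0.1 * ln(s) + 1.
    return 0.1 * math.log(scale) + 1.0 if scale > 1.0 else 1.0
```

This also illustrates why several of the fixes above exist: the ramp's divide-by-zero guard corresponds to "avoid NaN if unused betas are zero", and mscale being applied only when scale > 1 corresponds to "make mscale conditional".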