llama : initial Mamba-2 support #9126
Open
compilade wants to merge 33 commits into master from compilade/mamba2
+862 −324
Commits
1f0fea7 llama : initial Mamba-2 support (compilade)
dceff23 ggml : SIMD ggml_ssm_scan for Mamba-2 (compilade)
2bfe9de llama : support running Mamba-Codestral-7B-v0.1 (compilade)
aff9692 llama : fix Mamba-2 conv state saving (compilade)
e04910d llama : remove unused variable (compilade)
fa358e7 llama : add missing break (compilade)
38913dc convert_hf : prefer SentencePiece tokenizer for Mamba-2 when present (compilade)
0e601ca Merge branch 'master' into compilade/mamba2 (compilade)
273e7a4 llama : avoid redundant state copy for Mamba 1 and 2 (compilade)
7d6cb36 Merge branch 'master' into compilade/mamba2 (compilade)
2c77d79 metal : attempt to adapt SSM_SCAN for Mamba-2 (compilade)
87b97d0 metal : fix SSM_SCAN pipeline scope (compilade)
03d0e6e metal : use log and exp instead of log1pf and expf in SSM_SCAN (compilade)
7a351ab metal : remove unused arguments for SSM_SCAN (compilade)
8b15bc6 metal : add back n_seqs to SSM_SCAN args (compilade)
5b8ec2b metal : fix SSM_SCAN state head offset (compilade)
62b09b3 metal : fix wrong number of tokens per sequence in SSM_SCAN (compilade)
038d958 Merge branch 'master' into compilade/mamba2 (compilade)
805512a ggml : remove unused fast broadcast path in GGML_MUL (compilade)
7d16e1b Merge branch 'master' into compilade/mamba2 (compilade)
3bc7103 ggml : avoid multiply by D in GGML_OP_SSM_SCAN (compilade)
8d8f065 Merge branch 'master' into compilade/mamba2 (compilade)
b4e9c59 convert : fix flake8 lint (compilade)
1ee6c48 Merge branch 'master' into compilade/mamba2 (compilade)
c9ecf62 Merge branch 'master' into compilade/mamba2 (compilade)
35d06fa Merge branch 'master' into compilade/mamba2 (compilade)
cf4f0a4 metal : fix confusion between ; and , (compilade)
6def5cd metal : add missing args for nb references in ssm_scan_f32_group (compilade)
791998b metal : single-user mamba2 inference works (compilade)
94c3d53 kv-cache : remove const_cast when setting inputs for s_copy (compilade)
929fe85 Merge branch 'master' into compilade/mamba2 (compilade)
d55b0d0 convert : avoid AutoConfig for Mamba and Mamba2 hparams (compilade)
e94f393 kv-cache : allow context shift for recurrent models (compilade)