Replies: 5 comments · 10 replies
- What does the "☁️" mean?
- So "Parallel decoding" is done by
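For context, parallel decoding in llama.cpp is done by packing tokens from several independent sequences into a single llama_batch, tagging each token with a sequence ID so the KV cache keeps the streams separate, and evaluating everything with one llama_decode call. Below is a minimal sketch, assuming the C API of the time (llama_batch_init, llama_decode, llama_batch_free); batch_add and decode_two_prompts are illustrative helpers, not library functions:

```cpp
#include <vector>
#include "llama.h"

// Illustrative helper (not part of the library API): append one token
// belonging to sequence `seq` at position `pos` to the batch.
static void batch_add(llama_batch & batch, llama_token tok, llama_pos pos,
                      llama_seq_id seq, bool want_logits) {
    const int i = batch.n_tokens;
    batch.token   [i]    = tok;
    batch.pos     [i]    = pos;
    batch.n_seq_id[i]    = 1;
    batch.seq_id  [i][0] = seq;
    batch.logits  [i]    = want_logits ? 1 : 0;
    batch.n_tokens++;
}

// Evaluate the prompts of two sequences with a single llama_decode call.
// The KV cache keeps the streams apart via their seq_id; logits are only
// requested for the last token of each prompt.
static bool decode_two_prompts(llama_context * ctx,
                               const std::vector<llama_token> & a,
                               const std::vector<llama_token> & b) {
    llama_batch batch = llama_batch_init((int32_t)(a.size() + b.size()), 0, 2);
    for (size_t i = 0; i < a.size(); ++i) {
        batch_add(batch, a[i], (llama_pos) i, 0, i + 1 == a.size());
    }
    for (size_t i = 0; i < b.size(); ++i) {
        batch_add(batch, b[i], (llama_pos) i, 1, i + 1 == b.size());
    }
    const bool ok = llama_decode(ctx, batch) == 0;
    llama_batch_free(batch);
    return ok;
}
```

Because all tokens go through one forward pass, the sequences share the cost of loading the model weights, which is where the throughput gain over decoding them one at a time comes from.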
- Should beam search be added here? I think it is broken at the moment, at least with CUDA.
- What would be the criteria for considering the OpenCL back-end to be working correctly? I've fixed all known bugs in ggml-opencl.cpp and am now working on refactoring along the lines of #3669.
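One reasonable criterion is per-operation parity with the CPU back-end: run each ggml op on both back-ends with identical inputs and require the normalized mean squared error (NMSE) of the outputs to stay below a small tolerance. Here is a minimal sketch of that check; the function names and the 1e-6 threshold are illustrative assumptions, not project-defined constants:

```cpp
#include <cmath>
#include <cstddef>

// Normalized mean squared error between a reference buffer (CPU back-end)
// and a test buffer (OpenCL back-end).
static double nmse(const float * ref, const float * out, size_t n) {
    double err = 0.0, ref_sq = 0.0;
    for (size_t i = 0; i < n; ++i) {
        const double d = (double) out[i] - (double) ref[i];
        err    += d * d;
        ref_sq += (double) ref[i] * (double) ref[i];
    }
    return ref_sq > 0.0 ? err / ref_sq : err;
}

// Accept the OpenCL result for one op if it matches the CPU reference
// within a small tolerance (the 1e-6 value is an illustrative choice).
static bool opencl_matches_cpu(const float * cpu, const float * ocl, size_t n) {
    return nmse(cpu, ocl, n) < 1e-6;
}
```

Sweeping a range of ops and tensor shapes through a check like this, plus an end-to-end perplexity comparison against the CPU back-end, would give a concrete bar for "working correctly".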
- Is there any further progress on fine-tuning with the Metal GPU backend?
- [NO LONGER UPDATED]
Below is a summary of the functionality provided by the llama.cpp project.
Legend (feel free to update):
✅ - Working correctly
☁️ - Partially working
❌ - Failing
❓ - Status unknown (needs testing)
🔬 - Under investigation
🚧 - Currently in development
[The status table itself did not survive extraction; only its demo column is recoverable. It covered the examples main, simple, batched, parallel, speculative, lookahead, infill, server, embedding, beam-search, test-tokenizer-0-llama, test-tokenizer-0-falcon, llava, and finetune, and the back-ends ggml, ggml-cuda, ggml-metal, ggml-opencl, and ggml-vulkan.]