README.md (+2, -1)
```diff
@@ -47,6 +47,7 @@ SWIFT has rich documentations for users, please check [here](https://github.com/
 SWIFT web-ui is available both on [Huggingface space](https://huggingface.co/spaces/tastelikefeet/swift) and [ModelScope studio](https://www.modelscope.cn/studios/iic/Scalable-lightWeight-Infrastructure-for-Fine-Tuning/summary), please feel free to try!

 ## 🎉 News
+- 🔥2024.06.28: Support for Gemma2 series models: gemma2-9b, gemma2-9b-instruct, gemma2-27b, gemma2-27b-instruct.
 - 🔥2024.06.18: Supports the **DeepSeek-Coder-v2** series models! Use model_type `deepseek-coder-v2-instruct` and `deepseek-coder-v2-lite-instruct` to begin.
 - 🔥2024.06.16: Supports **KTO** and **CPO** training! See the [document](https://github.com/modelscope/swift/blob/main/docs/source_en/LLM/Human-Preference-Alignment-Training-Documentation.md) to start training!
 - 2024.06.11: Support for tool-calling agent deployment that conforms to the OpenAI interface. You can refer to [Agent deployment best practice](https://github.com/modelscope/swift/blob/main/docs/source_en/LLM/Agent-deployment-best-practice.md).
```
```diff
@@ -512,7 +513,7 @@ The complete list of supported models and datasets can be found at [Supported Mo
 | InternLM<br>InternLM2<br>InternLM2-Math |[Pujiang AI Lab InternLM series models](https://github.com/InternLM/InternLM)| Chinese<br>English | 1.8B-20B | base model<br>chat model<br>math model |
 | DeepSeek<br>DeepSeek-MoE<br>DeepSeek-Coder<br>DeepSeek-Math<br>DeepSeek-V2<br>DeepSeek-Coder-V2 |[DeepSeek series models](https://github.com/deepseek-ai)| Chinese<br>English | 1.3B-236B | base model<br>chat model<br>MoE model<br>code model<br>math model |
 | MAMBA |[MAMBA temporal convolution model](https://github.com/state-spaces/mamba)| English | 130M-2.8B | base model |
-| Gemma |[Google Gemma series models](https://github.com/google/gemma_pytorch)| English | 2B-7B | base model<br>instruct model |
+| Gemma<br>Gemma2 |[Google Gemma series models](https://github.com/google/gemma_pytorch)| English | 2B-27B | base model<br>instruct model |
 | MiniCPM |[OpenBmB MiniCPM series models](https://github.com/OpenBMB/MiniCPM)| Chinese<br>English | 2B-3B | chat model<br>MoE model |
 | OpenBuddy |[OpenBuddy series models](https://github.com/OpenBuddy/OpenBuddy)| Chinese<br>English | 7B-70B | base model<br>chat model |
 | Orion |[OrionStar AI series models](https://github.com/OrionStarAI)| Chinese<br>English | 14B | base model<br>chat model |
```
|🔥swift-mix|[swift/swift-sft-mixture](https://modelscope.cn/datasets/swift/swift-sft-mixture/summary)|sharegpt<br>firefly<br>codefuse<br>metamathqa|-|The dataset is very large; please see the original link for dataset statistics.|chat, sft, general|-|