Skip to content

Commit 9fb095c

Browse files
committed
Merge commit 'bd420f2e9ef6d04a8b4c62c1ab27d3b8b7fb9b67' into feat/qwen2-llava
* commit 'bd420f2e9ef6d04a8b4c62c1ab27d3b8b7fb9b67': Support gemma2 (modelscope#1247) Add more datasets (modelscope#1246) refactor inference (modelscope#1245) fix bugs (modelscope#1242)
2 parents 5abd7f4 + bd420f2 commit 9fb095c

18 files changed

+381
-261
lines changed

README.md

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -47,6 +47,7 @@ SWIFT has rich documentations for users, please check [here](https://github.com/
4747
SWIFT web-ui is available both on [Huggingface space](https://huggingface.co/spaces/tastelikefeet/swift) and [ModelScope studio](https://www.modelscope.cn/studios/iic/Scalable-lightWeight-Infrastructure-for-Fine-Tuning/summary), please feel free to try!
4848

4949
## 🎉 News
50+
- 🔥2024.06.28: Support for Gemma2 series models: gemma2-9b, gemma2-9b-instruct, gemma2-27b, gemma2-27b-instruct.
5051
- 🔥2024.06.18: Supports **DeepSeek-Coder-v2** series model! Use model_type `deepseek-coder-v2-instruct` and `deepseek-coder-v2-lite-instruct` to begin.
5152
- 🔥2024.06.16: Supports **KTO** and **CPO** training! See [document](https://github.com/modelscope/swift/blob/main/docs/source_en/LLM/Human-Preference-Alignment-Training-Documentation.md) to start training!
5253
- 2024.06.11: Support for tool-calling agent deployment that conform to the OpenAI interface.You can refer to [Agent deployment best practice](https://github.com/modelscope/swift/blob/main/docs/source_en/LLM/Agent-deployment-best-practice.md)
@@ -512,7 +513,7 @@ The complete list of supported models and datasets can be found at [Supported Mo
512513
| InternLM<br>InternLM2<br>InternLM2-Math | [Pujiang AI Lab InternLM series models](https://github.com/InternLM/InternLM) | Chinese<br>English | 1.8B-20B | base model<br>chat model<br>math model |
513514
| DeepSeek<br>DeepSeek-MoE<br>DeepSeek-Coder<br>DeepSeek-Math<br>DeepSeek-V2<br>DeepSeek-Coder-V2 | [DeepSeek series models](https://github.com/deepseek-ai) | Chinese<br>English | 1.3B-236B | base model<br>chat model<br>MoE model<br>code model<br>math model |
514515
| MAMBA | [MAMBA temporal convolution model](https://github.com/state-spaces/mamba) | English | 130M-2.8B | base model |
515-
| Gemma | [Google Gemma series models](https://github.com/google/gemma_pytorch) | English | 2B-7B | base model<br>instruct model |
516+
| Gemma<br>Gemma2 | [Google Gemma series models](https://github.com/google/gemma_pytorch) | English | 2B-27B | base model<br>instruct model |
516517
| MiniCPM | [OpenBmB MiniCPM series models](https://github.com/OpenBMB/MiniCPM) | Chinese<br>English | 2B-3B | chat model<br>MoE model |
517518
| OpenBuddy | [OpenBuddy series models](https://github.com/OpenBuddy/OpenBuddy) | Chinese<br>English | 7B-70B | base model<br>chat model |
518519
| Orion | [OrionStar AI series models](https://github.com/OrionStarAI) | Chinese<br>English | 14B | base model<br>chat model |

README_CN.md

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -48,6 +48,7 @@ SWIFT具有丰富的文档体系,如有使用问题请请查看[这里](https:
4848
可以在[Huggingface space](https://huggingface.co/spaces/tastelikefeet/swift)[ModelScope创空间](https://www.modelscope.cn/studios/iic/Scalable-lightWeight-Infrastructure-for-Fine-Tuning/summary) 中体验SWIFT web-ui功能了。
4949

5050
## 🎉 新闻
51+
- 🔥2024.06.28: 支持**Gemma2**系列模型: gemma2-9b, gemma2-9b-instruct, gemma2-27b, gemma2-27b-instruct.
5152
- 🔥2024.06.18: 支持**DeepSeek-Coder-v2**系列模型! 使用model_type`deepseek-coder-v2-instruct``deepseek-coder-v2-lite-instruct`来开启训练和推理.
5253
- 🔥2024.06.16: 支持**KTO****CPO**训练,使用`swift rlhf --rlhf_type kto``swift rlhf --rlhf_type cpo`来开始训练,可以参考[文档](./docs/source/LLM/人类偏好对齐训练文档.md).
5354
- 2024.06.11: 支持符合OpenAI接口的工具调用Agent部署, 可以查看[Agent部署最佳实践](docs/source/LLM/Agent部署最佳实践.md).
@@ -508,7 +509,7 @@ CUDA_VISIBLE_DEVICES=0 swift deploy \
508509
| InternLM<br>InternLM2<br>InternLM2-Math | [浦江实验室书生浦语系列模型](https://github.com/InternLM/InternLM) | 中文<br>英文 | 1.8B-20B | base模型<br>chat模型<br>数学模型 |
509510
| DeepSeek<br>DeepSeek-MoE<br>DeepSeek-Coder<br>DeepSeek-Math<br>DeepSeek-V2<br>DeepSeek-Coder-V2 | [幻方系列模型](https://github.com/deepseek-ai) | 中文<br>英文 | 1.3B-236B | base模型<br>chat模型<br>MoE模型<br>代码模型<br>数学模型 |
510511
| MAMBA | [MAMBA时序卷积模型](https://github.com/state-spaces/mamba) | 英文 | 130M-2.8B | base模型 |
511-
| Gemma | [Google Gemma系列模型](https://github.com/google/gemma_pytorch) | 英文 | 2B-7B | base模型<br>instruct模型 |
512+
| Gemma<br>Gemma2 | [Google Gemma系列模型](https://github.com/google/gemma_pytorch) | 英文 | 2B-27B | base模型<br>instruct模型 |
512513
| MiniCPM | [OpenBmB MiniCPM系列模型](https://github.com/OpenBMB/MiniCPM) | 中文<br>英文 | 2B-3B | chat模型<br>MoE模型 |
513514
| OpenBuddy | [OpenBuddy系列模型](https://github.com/OpenBuddy/OpenBuddy) | 中文<br>英文 | 7B-70B | base模型<br>chat模型 |
514515
| Orion | [猎户星空系列模型](https://github.com/OrionStarAI) | 中文<br>英文 | 14B | base模型<br>chat模型 |

0 commit comments

Comments
 (0)