[Doc] Update the modelslim website from gitee to gitcode. (#3615)
### What this PR does / why we need it?
The ModelSlim code repository has migrated from Gitee to GitCode, so all relevant links in this repository have been updated. See the [migration notice](https://gitee.com/ascend/msit/tree/master/.%E6%9C%AC%E9%A1%B9%E7%9B%AE%E5%B7%B2%E7%BB%8F%E6%AD%A3%E5%BC%8F%E8%BF%81%E7%A7%BB%E8%87%B3%20Gitcode%20%E5%B9%B3%E5%8F%B0).

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?
- vLLM version: v0.11.0rc3
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.0

---------

Signed-off-by: Crazyang <im.crazyang@gmail.com>
Signed-off-by: Pr0Wh1teGivee <calvin_zhu0210@outlook.com>
Co-authored-by: weichen <calvin_zhu0210@outlook.com>
@@ -50,23 +50,23 @@ msgstr "安装 modelslim"
 #: ../../user_guide/feature_guide/quantization.md:9
 msgid ""
 "To quantize a model, users should install "
-"[ModelSlim](https://gitee.com/ascend/msit/blob/master/msmodelslim/README.md)"
+"[ModelSlim](https://gitcode.com/Ascend/msit/blob/master/msmodelslim/README.md)"
 " which is the Ascend compression and acceleration tool. It is an affinity-"
 "based compression tool designed for acceleration, using compression as its "
 "core technology and built upon the Ascend platform."
 msgstr ""
-"要对模型进行量化,用户应安装[ModelSlim](https://gitee.com/ascend/msit/blob/master/msmodelslim/README.md),这是昇腾的压缩与加速工具。它是一种基于亲和性的压缩工具,专为加速设计,以压缩为核心技术,并基于昇腾平台构建。"
+"要对模型进行量化,用户应安装[ModelSlim](https://gitcode.com/Ascend/msit/blob/master/msmodelslim/README.md),这是昇腾的压缩与加速工具。它是一种基于亲和性的压缩工具,专为加速设计,以压缩为核心技术,并基于昇腾平台构建。"

 #: ../../user_guide/feature_guide/quantization.md:11
 msgid ""
 "Currently, only the specific tag [modelslim-"
-"VLLM-8.1.RC1.b020_001](https://gitee.com/ascend/msit/blob/modelslim-"
+"VLLM-8.1.RC1.b020_001](https://gitcode.com/Ascend/msit/blob/modelslim-"
 "VLLM-8.1.RC1.b020_001/msmodelslim/README.md) of modelslim works with vLLM "
 "Ascend. Please do not install other version until modelslim master version "
 "is available for vLLM Ascend in the future."
 msgstr ""
 "目前,只有 modelslim 的特定标签 [modelslim-"
-"VLLM-8.1.RC1.b020_001](https://gitee.com/ascend/msit/blob/modelslim-"
+"VLLM-8.1.RC1.b020_001](https://gitcode.com/Ascend/msit/blob/modelslim-"
 "VLLM-8.1.RC1.b020_001/msmodelslim/README.md) 支持 vLLM Ascend。在未来 modelslim "
 "的主版本支持 vLLM Ascend 之前,请不要安装其他版本。"

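Put together, the install flow these strings describe is short. The sketch below is not part of the diff: the clone line mirrors the hunks further down, while `bash install.sh` as the entry point is an assumption taken from the msmodelslim README and worth verifying against the migrated GitCode repo.

```bash
# Clone the verified tag from the new GitCode home (formerly gitee.com/ascend/msit)
git clone https://gitcode.com/Ascend/msit -b modelslim-VLLM-8.1.RC1.b020_001
cd msit/msmodelslim
# install.sh as the install entry point is an assumption from the msmodelslim README
bash install.sh
```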
@@ -85,12 +85,12 @@ msgid ""
 "ai/DeepSeek-V2-Lite) as an example, you just need to download the model, and"
 " then execute the convert command. The command is shown below. More info can"
 " be found in modelslim doc [deepseek w8a8 dynamic quantization "
-"docs](https://gitee.com/ascend/msit/blob/modelslim-"
+"docs](https://gitcode.com/Ascend/msit/blob/modelslim-"
 "VLLM-8.1.RC1.b020_001/msmodelslim/example/DeepSeek/README.md#deepseek-v2-w8a8-dynamic%E9%87%8F%E5%8C%96)."
 msgstr ""
 "以 [DeepSeek-V2-Lite](https://modelscope.cn/models/deepseek-"
 "ai/DeepSeek-V2-Lite) 为例,你只需要下载模型,然后执行转换命令。命令如下所示。更多信息可参考 modelslim 文档 "
-"[deepseek w8a8 动态量化文档](https://gitee.com/ascend/msit/blob/modelslim-"
+"[deepseek w8a8 动态量化文档](https://gitcode.com/Ascend/msit/blob/modelslim-"
 "VLLM-8.1.RC1.b020_001/msmodelslim/example/DeepSeek/README.md#deepseek-v2-w8a8-dynamic%E9%87%8F%E5%8C%96)。"

 #: ../../user_guide/feature_guide/quantization.md:32
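The convert command referenced here ("The command is shown below") sits outside this hunk's context lines. As a rough sketch only: the download step uses ModelScope's git endpoint, and the `quant_deepseek.py` script name and every flag after it are assumptions; the linked DeepSeek README is the authoritative source.

```bash
# Download the example model from ModelScope
git clone https://www.modelscope.cn/deepseek-ai/DeepSeek-V2-Lite.git
# Run the w8a8 dynamic convert script from the msmodelslim DeepSeek example;
# the script name and flags below are assumptions -- verify against the linked README
cd msit/msmodelslim/example/DeepSeek
python3 quant_deepseek.py --model_path /path/to/DeepSeek-V2-Lite \
  --save_directory /path/to/DeepSeek-V2-Lite-w8a8 \
  --w_bit 8 --a_bit 8 --is_dynamic True
```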
@@ -37,7 +37,7 @@ see https://www.modelscope.cn/models/vllm-ascend/QwQ-32B-W8A8

 ```bash
 # (Optional)This tag is recommended and has been verified
-git clone https://gitee.com/ascend/msit -b modelslim-VLLM-8.1.RC1.b020_001
+git clone https://gitcode.com/Ascend/msit -b modelslim-VLLM-8.1.RC1.b020_001

 cd msit/msmodelslim
 # Install by run this script
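A quick way to confirm the recommended tag is what actually got checked out (plain git, nothing msit-specific; `clone -b <tag>` leaves HEAD detached at that tag):

```bash
# Should print modelslim-VLLM-8.1.RC1.b020_001
git -C msit describe --tags
```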
@@ -34,7 +34,7 @@ see https://www.modelscope.cn/models/vllm-ascend/Qwen3-8B-W4A8

 ```bash
 # The branch(br_release_MindStudio_8.1.RC2_TR5_20260624) has been verified
-git clone -b br_release_MindStudio_8.1.RC2_TR5_20260624 https://gitee.com/ascend/msit
+git clone -b br_release_MindStudio_8.1.RC2_TR5_20260624 https://gitcode.com/Ascend/msit

 cd msit/msmodelslim

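The same kind of sanity check applies to the branch-based clone used here and in the hunk below; `git branch --show-current` prints the checked-out branch.

```bash
# Should print br_release_MindStudio_8.1.RC2_TR5_20260624
git -C msit branch --show-current
```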
@@ -6,13 +6,13 @@ Since 0.9.0rc2 version, quantization feature is experimentally supported in vLLM

 ## Install modelslim

-To quantize a model, users should install [ModelSlim](https://gitee.com/ascend/msit/blob/master/msmodelslim/README.md) which is the Ascend compression and acceleration tool. It is an affinity-based compression tool designed for acceleration, using compression as its core technology and built upon the Ascend platform.
+To quantize a model, users should install [ModelSlim](https://gitcode.com/Ascend/msit/blob/master/msmodelslim/README.md) which is the Ascend compression and acceleration tool. It is an affinity-based compression tool designed for acceleration, using compression as its core technology and built upon the Ascend platform.

 Install modelslim:

 ```bash
 # The branch(br_release_MindStudio_8.1.RC2_TR5_20260624) has been verified
-git clone -b br_release_MindStudio_8.1.RC2_TR5_20260624 https://gitee.com/ascend/msit
+git clone -b br_release_MindStudio_8.1.RC2_TR5_20260624 https://gitcode.com/Ascend/msit

 cd msit/msmodelslim

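After any of the installs above, a quick check that the package landed (assuming the install exposes an importable `msmodelslim` Python module, which is what the examples import):

```bash
# Should print the installed package location rather than raise ImportError
python3 -c "import msmodelslim; print(msmodelslim.__file__)"
```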
@@ -29,8 +29,8 @@ This conversion process will require a larger CPU memory, please ensure that the
 :::

 ### Adapts and change
-1. Ascend does not support the `flash_attn` library. To run the model, you need to follow the [guide](https://gitee.com/ascend/msit/blob/master/msmodelslim/example/DeepSeek/README.md#deepseek-v3r1) and comment out certain parts of the code in `modeling_deepseek.py` located in the weights folder.
-2. The current version of transformers does not support loading weights in FP8 quantization format. you need to follow the [guide](https://gitee.com/ascend/msit/blob/master/msmodelslim/example/DeepSeek/README.md#deepseek-v3r1) and delete the quantization related fields from `config.json` in the weights folder
+1. Ascend does not support the `flash_attn` library. To run the model, you need to follow the [guide](https://gitcode.com/Ascend/msit/blob/master/msmodelslim/example/DeepSeek/README.md#deepseek-v3r1) and comment out certain parts of the code in `modeling_deepseek.py` located in the weights folder.
+2. The current version of transformers does not support loading weights in FP8 quantization format. you need to follow the [guide](https://gitcode.com/Ascend/msit/blob/master/msmodelslim/example/DeepSeek/README.md#deepseek-v3r1) and delete the quantization related fields from `config.json` in the weights folder

 ### Generate the w8a8 weights

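Step 2's `config.json` edit is mechanical enough to script. A sketch with jq, assuming the FP8 fields live under a `quantization_config` key (the usual Hugging Face layout); the linked guide is authoritative on exactly which fields to drop.

```bash
# Drop the quantization-related fields from the weights folder's config.json;
# "quantization_config" as the key name is an assumption (typical HF layout)
cd /path/to/weights
jq 'del(.quantization_config)' config.json > config.json.tmp && mv config.json.tmp config.json
```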