diff --git a/docs/source/locale/zh_CN/LC_MESSAGES/user_guide/feature_guide/quantization.po b/docs/source/locale/zh_CN/LC_MESSAGES/user_guide/feature_guide/quantization.po
index 54f524e3..d942cfbf 100644
--- a/docs/source/locale/zh_CN/LC_MESSAGES/user_guide/feature_guide/quantization.po
+++ b/docs/source/locale/zh_CN/LC_MESSAGES/user_guide/feature_guide/quantization.po
@@ -50,23 +50,23 @@ msgstr "安装 modelslim"
 
 #: ../../user_guide/feature_guide/quantization.md:9
 msgid ""
 "To quantize a model, users should install "
-"[ModelSlim](https://gitee.com/ascend/msit/blob/master/msmodelslim/README.md)"
+"[ModelSlim](https://gitcode.com/Ascend/msit/blob/master/msmodelslim/README.md)"
 " which is the Ascend compression and acceleration tool. It is an affinity-"
 "based compression tool designed for acceleration, using compression as its "
 "core technology and built upon the Ascend platform."
 msgstr ""
-"要对模型进行量化,用户应安装[ModelSlim](https://gitee.com/ascend/msit/blob/master/msmodelslim/README.md),这是昇腾的压缩与加速工具。它是一种基于亲和性的压缩工具,专为加速设计,以压缩为核心技术,并基于昇腾平台构建。"
+"要对模型进行量化,用户应安装[ModelSlim](https://gitcode.com/Ascend/msit/blob/master/msmodelslim/README.md),这是昇腾的压缩与加速工具。它是一种基于亲和性的压缩工具,专为加速设计,以压缩为核心技术,并基于昇腾平台构建。"
 
 #: ../../user_guide/feature_guide/quantization.md:11
 msgid ""
 "Currently, only the specific tag [modelslim-"
-"VLLM-8.1.RC1.b020_001](https://gitee.com/ascend/msit/blob/modelslim-"
+"VLLM-8.1.RC1.b020_001](https://gitcode.com/Ascend/msit/blob/modelslim-"
 "VLLM-8.1.RC1.b020_001/msmodelslim/README.md) of modelslim works with vLLM "
 "Ascend. Please do not install other version until modelslim master version "
 "is available for vLLM Ascend in the future."
 msgstr ""
 "目前,只有 modelslim 的特定标签 [modelslim-"
-"VLLM-8.1.RC1.b020_001](https://gitee.com/ascend/msit/blob/modelslim-"
+"VLLM-8.1.RC1.b020_001](https://gitcode.com/Ascend/msit/blob/modelslim-"
 "VLLM-8.1.RC1.b020_001/msmodelslim/README.md) 支持 vLLM Ascend。在未来 modelslim "
 "的主版本支持 vLLM Ascend 之前,请不要安装其他版本。"
 
@@ -85,12 +85,12 @@ msgid ""
 "ai/DeepSeek-V2-Lite) as an example, you just need to download the model, and"
 " then execute the convert command. The command is shown below. More info can"
 " be found in modelslim doc [deepseek w8a8 dynamic quantization "
-"docs](https://gitee.com/ascend/msit/blob/modelslim-"
+"docs](https://gitcode.com/Ascend/msit/blob/modelslim-"
 "VLLM-8.1.RC1.b020_001/msmodelslim/example/DeepSeek/README.md#deepseek-v2-w8a8-dynamic%E9%87%8F%E5%8C%96)."
msgstr "" "以 [DeepSeek-V2-Lite](https://modelscope.cn/models/deepseek-" "ai/DeepSeek-V2-Lite) 为例,你只需要下载模型,然后执行转换命令。命令如下所示。更多信息可参考 modelslim 文档 " -"[deepseek w8a8 动态量化文档](https://gitee.com/ascend/msit/blob/modelslim-" +"[deepseek w8a8 动态量化文档](https://gitcode.com/Ascend/msit/blob/modelslim-" "VLLM-8.1.RC1.b020_001/msmodelslim/example/DeepSeek/README.md#deepseek-v2-w8a8-dynamic%E9%87%8F%E5%8C%96)。" #: ../../user_guide/feature_guide/quantization.md:32 diff --git a/docs/source/tutorials/multi_npu_quantization.md b/docs/source/tutorials/multi_npu_quantization.md index a1f3a3be..8d41fdac 100644 --- a/docs/source/tutorials/multi_npu_quantization.md +++ b/docs/source/tutorials/multi_npu_quantization.md @@ -37,7 +37,7 @@ see https://www.modelscope.cn/models/vllm-ascend/QwQ-32B-W8A8 ```bash # (Optional)This tag is recommended and has been verified -git clone https://gitee.com/ascend/msit -b modelslim-VLLM-8.1.RC1.b020_001 +git clone https://gitcode.com/Ascend/msit -b modelslim-VLLM-8.1.RC1.b020_001 cd msit/msmodelslim # Install by run this script diff --git a/docs/source/tutorials/single_npu_qwen3_quantization.md b/docs/source/tutorials/single_npu_qwen3_quantization.md index 46b84322..716ba0b0 100644 --- a/docs/source/tutorials/single_npu_qwen3_quantization.md +++ b/docs/source/tutorials/single_npu_qwen3_quantization.md @@ -34,7 +34,7 @@ see https://www.modelscope.cn/models/vllm-ascend/Qwen3-8B-W4A8 ```bash # The branch(br_release_MindStudio_8.1.RC2_TR5_20260624) has been verified -git clone -b br_release_MindStudio_8.1.RC2_TR5_20260624 https://gitee.com/ascend/msit +git clone -b br_release_MindStudio_8.1.RC2_TR5_20260624 https://gitcode.com/Ascend/msit cd msit/msmodelslim diff --git a/docs/source/user_guide/feature_guide/quantization.md b/docs/source/user_guide/feature_guide/quantization.md index 5300ad55..f66fd46c 100644 --- a/docs/source/user_guide/feature_guide/quantization.md +++ b/docs/source/user_guide/feature_guide/quantization.md @@ -6,13 +6,13 @@ Since 0.9.0rc2 version, quantization feature is experimentally supported in vLLM ## Install modelslim -To quantize a model, users should install [ModelSlim](https://gitee.com/ascend/msit/blob/master/msmodelslim/README.md) which is the Ascend compression and acceleration tool. It is an affinity-based compression tool designed for acceleration, using compression as its core technology and built upon the Ascend platform. +To quantize a model, users should install [ModelSlim](https://gitcode.com/Ascend/msit/blob/master/msmodelslim/README.md) which is the Ascend compression and acceleration tool. It is an affinity-based compression tool designed for acceleration, using compression as its core technology and built upon the Ascend platform. Install modelslim: ```bash # The branch(br_release_MindStudio_8.1.RC2_TR5_20260624) has been verified -git clone -b br_release_MindStudio_8.1.RC2_TR5_20260624 https://gitee.com/ascend/msit +git clone -b br_release_MindStudio_8.1.RC2_TR5_20260624 https://gitcode.com/Ascend/msit cd msit/msmodelslim @@ -29,8 +29,8 @@ This conversion process will require a larger CPU memory, please ensure that the ::: ### Adapts and change -1. Ascend does not support the `flash_attn` library. To run the model, you need to follow the [guide](https://gitee.com/ascend/msit/blob/master/msmodelslim/example/DeepSeek/README.md#deepseek-v3r1) and comment out certain parts of the code in `modeling_deepseek.py` located in the weights folder. -2. The current version of transformers does not support loading weights in FP8 quantization format. 
-you need to follow the [guide](https://gitee.com/ascend/msit/blob/master/msmodelslim/example/DeepSeek/README.md#deepseek-v3r1) and delete the quantization related fields from `config.json` in the weights folder
+1. Ascend does not support the `flash_attn` library. To run the model, you need to follow the [guide](https://gitcode.com/Ascend/msit/blob/master/msmodelslim/example/DeepSeek/README.md#deepseek-v3r1) and comment out certain parts of the code in `modeling_deepseek.py` located in the weights folder.
+2. The current version of transformers does not support loading weights in FP8 quantization format. You need to follow the [guide](https://gitcode.com/Ascend/msit/blob/master/msmodelslim/example/DeepSeek/README.md#deepseek-v3r1) and delete the quantization-related fields from `config.json` in the weights folder.
 
 ### Generate the w8a8 weights
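
Since every hunk above is the same mechanical substitution (`gitee.com/ascend` to `gitcode.com/Ascend`, note the case change in the org name), the change can be reproduced or audited in one pass. A minimal sketch, assuming GNU grep/sed and that the hand-reviewed hunks above remain the source of truth:

```bash
# Reproduce the URL migration across the docs tree.
# Illustrative only: the pattern covers both the host change
# (gitee.com -> gitcode.com) and the org rename (ascend -> Ascend).
# `xargs -r` skips sed entirely when grep finds no matching files.
grep -rl 'gitee.com/ascend' docs/ | xargs -r sed -i 's|gitee.com/ascend|gitcode.com/Ascend|g'
```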
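After applying the patch, it is worth confirming that no stale references survive and that the migrated links resolve. A quick check, assuming network access for the `curl` spot-check (the URL is one taken verbatim from the hunks above):

```bash
# Confirm no stale gitee.com/ascend references remain in the docs tree
# (grep exits nonzero when nothing matches, so the echo fires on success)
grep -rn 'gitee.com/ascend' docs/ || echo "no stale links found"

# Spot-check that one migrated link resolves; prints the final HTTP
# status code after following redirects
curl -sSIL -o /dev/null -w '%{http_code}\n' \
  'https://gitcode.com/Ascend/msit/blob/master/msmodelslim/README.md'
```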