forked from EngineX-Ascend/enginex-ascend-910-vllm
This commit is contained in: v0.10.1rc1
1647
docs/source/locale/zh_CN/LC_MESSAGES/community/contributors.po
Normal file
File diff suppressed because it is too large
204
docs/source/locale/zh_CN/LC_MESSAGES/community/governance.po
Normal file
@@ -0,0 +1,204 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-ascend team
# This file is distributed under the same license as the vllm-ascend
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-ascend\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-07-18 09:01+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Language: zh_CN\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"Generated-By: Babel 2.17.0\n"

#: ../../community/governance.md:1
msgid "Governance"
msgstr "治理"

#: ../../community/governance.md:3
msgid "Mission"
msgstr "使命"

#: ../../community/governance.md:4
msgid ""
"As a vital component of vLLM, the vLLM Ascend project is dedicated to "
"providing an easy, fast, and cheap LLM Serving for Everyone on Ascend NPU, "
"and to actively contribute to the enrichment of vLLM."
msgstr ""
"作为 vLLM 的重要组成部分,vLLM Ascend 项目致力于为所有人在 Ascend NPU 上提供简单、快速且低成本的大语言模型服务,并积极促进"
" vLLM 的丰富发展。"

#: ../../community/governance.md:6
msgid "Principles"
msgstr "原则"

#: ../../community/governance.md:7
msgid ""
"vLLM Ascend follows the vLLM community's code of conduct:[vLLM - CODE OF "
"CONDUCT](https://github.com/vllm-project/vllm/blob/main/CODE_OF_CONDUCT.md)"
msgstr ""
"vLLM Ascend 遵循 vLLM 社区的行为准则:[vLLM - 行为准则](https://github.com/vllm-"
"project/vllm/blob/main/CODE_OF_CONDUCT.md)"

#: ../../community/governance.md:9
msgid "Governance - Mechanics"
msgstr "治理 - 机制"

#: ../../community/governance.md:10
msgid ""
"vLLM Ascend is an open-source project under the vLLM community, where the "
"authority to appoint roles is ultimately determined by the vLLM community. "
"It adopts a hierarchical technical governance structure."
msgstr "vLLM Ascend 是 vLLM 社区下的一个开源项目,其角色任命权最终由 vLLM 社区决定。它采用分层的技术治理结构。"

#: ../../community/governance.md:12
msgid "Contributor:"
msgstr "贡献者:"

#: ../../community/governance.md:14
msgid ""
"**Responsibility:** Help new contributors with onboarding, handle and "
"respond to community questions, review RFCs and code"
msgstr "**职责:** 帮助新贡献者入门,处理和回复社区问题,审查 RFC 和代码"

#: ../../community/governance.md:16
msgid ""
"**Requirements:** Complete at least 1 contribution. Contributor is someone "
"who consistently and actively participates in a project, including but not "
"limited to issue/review/commits/community involvement."
msgstr "**要求:** 完成至少 1 次贡献。贡献者是指持续且积极参与项目的人,参与形式包括但不限于问题、评审、提交和社区参与。"

#: ../../community/governance.md:18
msgid ""
"Contributors will be granted [vllm-project/vllm-"
"ascend](https://github.com/vllm-project/vllm-ascend) Github repo `Triage` "
"permissions (`Can read and clone this repository. Can also manage issues and"
" pull requests`) to help community developers collaborate more efficiently."
msgstr ""
"贡献者将被赋予 [vllm-project/vllm-ascend](https://github.com/vllm-project/vllm-"
"ascend) Github 仓库的 `Triage` 权限(`可读取和克隆此仓库。还可以管理问题和拉取请求`),以帮助社区开发者更加高效地协作。"

#: ../../community/governance.md:20
msgid "Maintainer:"
msgstr "维护者:"

#: ../../community/governance.md:22
msgid ""
"**Responsibility:** Develop the project's vision and mission. Maintainers "
"are responsible for driving the technical direction of the entire project "
"and ensuring its overall success, possessing code merge permissions. They "
"formulate the roadmap, review contributions from community members, "
"continuously contribute code, and actively engage in community activities "
"(such as regular meetings/events)."
msgstr ""
"**责任:** "
"制定项目的愿景和使命。维护者负责引领整个项目的技术方向并确保其整体成功,拥有代码合并权限。他们制定路线图,审核社区成员的贡献,持续贡献代码,并积极参与社区活动(如定期会议/活动)。"

#: ../../community/governance.md:24
msgid ""
"**Requirements:** Deep understanding of vLLM and vLLM Ascend codebases, "
"with a commitment to sustained code contributions. Competency in "
"design/development/PR review workflows."
msgstr ""
"**要求:** 深入理解 vLLM 和 vLLM Ascend 代码库,并承诺持续贡献代码。具备 设计/开发/PR 审核流程 的能力。"

#: ../../community/governance.md:25
msgid ""
"**Review Quality:** Actively participate in community code reviews, "
"ensuring high-quality code integration."
msgstr "**评审质量:** 积极参与社区代码评审,确保高质量的代码集成。"

#: ../../community/governance.md:26
msgid ""
"**Quality Contribution:** Successfully develop and deliver at least one "
"major feature while maintaining consistent high-quality contributions."
msgstr "**质量贡献:** 成功开发并交付至少一个主要功能,同时持续保持高质量的贡献。"

#: ../../community/governance.md:27
msgid ""
"**Community Involvement:** Actively address issues, respond to forum "
"inquiries, participate in discussions, and engage in community-driven tasks."
msgstr "**社区参与:** 积极解决问题,回复论坛询问,参与讨论,并参与社区驱动的任务。"

#: ../../community/governance.md:29
msgid ""
"Requires approval from existing Maintainers. The vLLM community has the "
"final decision-making authority."
msgstr "需要现有维护者的批准。vLLM社区拥有最终决策权。"

#: ../../community/governance.md:31
msgid ""
"Maintainers will be granted [vllm-project/vllm-"
"ascend](https://github.com/vllm-project/vllm-ascend) Github repo write "
"permissions (`Can read, clone, and push to this repository. Can also manage "
"issues and pull requests`)."
msgstr ""
"维护者将被授予 [vllm-project/vllm-ascend](https://github.com/vllm-project/vllm-"
"ascend) Github 仓库的写入权限(`可以读取、克隆和推送到此仓库。还可以管理问题和拉取请求`)。"

#: ../../community/governance.md:33
msgid "Nominating and Removing Maintainers"
msgstr "提名和移除维护者"

#: ../../community/governance.md:35
msgid "The Principles"
msgstr "原则"

#: ../../community/governance.md:37
msgid ""
"Membership in vLLM Ascend is given to individuals on a merit basis after "
"they have demonstrated strong expertise in vLLM / vLLM Ascend through "
"contributions, reviews and discussions."
msgstr ""
"vLLM Ascend 的成员资格是基于个人能力授予的,只有在通过贡献、评审和讨论展示出对 vLLM / vLLM Ascend "
"的深厚专业知识后,才可获得。"

#: ../../community/governance.md:39
msgid ""
"For membership in the maintainer group the individual has to demonstrate "
"strong and continued alignment with the overall vLLM / vLLM Ascend "
"principles."
msgstr "要成为维护者组成员,个人必须表现出与 vLLM / vLLM Ascend 总体原则的高度一致并持续支持。"

#: ../../community/governance.md:41
msgid ""
"Light criteria of moving module maintenance to ‘emeritus’ status if they "
"don’t actively participate over long periods of time."
msgstr "如果模块维护人员在长时间内没有积极参与,可根据较宽松的标准将其维护状态转为“荣誉”状态。"

#: ../../community/governance.md:43
msgid "The membership is for an individual, not a company."
msgstr "该会员资格属于个人,而非公司。"

#: ../../community/governance.md:45
msgid "Nomination and Removal"
msgstr "提名与罢免"

#: ../../community/governance.md:47
msgid ""
"Nomination: Anyone can nominate someone to become a maintainer (including "
"self-nomination). All existing maintainers are responsible for evaluating "
"the nomination. The nominator should provide the nominee's info around the "
"strength of the candidate to be a maintainer, including but not limited to "
"review quality, quality contribution, community involvement."
msgstr ""
"提名:任何人都可以提名他人成为维护者(包括自荐)。所有现有维护者都有责任评估提名。提名人应提供被提名人成为维护者的相关优势信息,包括但不限于评审质量、优质贡献、社区参与等。"

#: ../../community/governance.md:48
msgid ""
"Removal: Anyone can nominate a person to be removed from the maintainer "
"position (including self-nomination). All existing maintainers are "
"responsible for evaluating the nomination. The nominator should provide the "
"nominee's info, including but not limited to lack of activity, conflict "
"with the overall direction and other information that makes them unfit to "
"be a maintainer."
msgstr ""
"移除:任何人都可以提名某人被移出维护者职位(包括自荐)。所有现有维护者都有责任评估该提名。提名者应提供被提名人的相关信息,包括但不限于缺乏活动、与整体方向冲突以及使其不适合作为维护者的其他信息。"

103
docs/source/locale/zh_CN/LC_MESSAGES/community/user_stories/index.po
Normal file
@@ -0,0 +1,103 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-ascend team
# This file is distributed under the same license as the vllm-ascend
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-ascend\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-07-18 09:01+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Language: zh_CN\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"Generated-By: Babel 2.17.0\n"

#: ../../community/user_stories/index.md:15
msgid "More details"
msgstr "更多细节"

#: ../../community/user_stories/index.md:1
msgid "User Stories"
msgstr "用户故事"

#: ../../community/user_stories/index.md:3
msgid ""
"Read case studies on how users and developers solve real, everyday problems"
" with vLLM Ascend"
msgstr "阅读案例研究,了解用户和开发者如何使用 vLLM Ascend 解决实际日常问题。"

#: ../../community/user_stories/index.md:5
msgid ""
"[LLaMA-Factory](./llamafactory.md) is an easy-to-use and efficient platform "
"for training and fine-tuning large language models. It has supported vLLM "
"Ascend to speed up inference since [LLaMA-"
"Factory#7739](https://github.com/hiyouga/LLaMA-Factory/pull/7739), gaining "
"a 2x inference performance improvement."
msgstr ""
"[LLaMA-Factory](./llamafactory.md) 是一个易于使用且高效的大语言模型训练与微调平台,自 [LLaMA-"
"Factory#7739](https://github.com/hiyouga/LLaMA-Factory/pull/7739) 起支持 vLLM "
"Ascend 加速推理,推理性能提升 2 倍。"

#: ../../community/user_stories/index.md:7
msgid ""
"[Huggingface/trl](https://github.com/huggingface/trl) is a cutting-edge "
"library designed for post-training foundation models using advanced "
"techniques like SFT, PPO and DPO, it uses vLLM Ascend since "
"[v0.17.0](https://github.com/huggingface/trl/releases/tag/v0.17.0) to "
"support RLHF on Ascend NPU."
msgstr ""
"[Huggingface/trl](https://github.com/huggingface/trl) 是一个前沿的库,专为使用 SFT、PPO 和"
" DPO 等先进技术对基础模型进行后训练而设计。从 "
"[v0.17.0](https://github.com/huggingface/trl/releases/tag/v0.17.0) 版本开始,该库利用"
" vLLM Ascend 来支持在 Ascend NPU 上进行 RLHF。"

#: ../../community/user_stories/index.md:9
msgid ""
"[MindIE Turbo](https://pypi.org/project/mindie-turbo) is an LLM inference "
"engine acceleration plug-in library developed by Huawei on Ascend hardware, "
"which includes self-developed large language model optimization algorithms "
"and optimizations related to the inference engine framework. It supports "
"vLLM Ascend since "
"[2.0rc1](https://www.hiascend.com/document/detail/zh/mindie/20RC1/AcceleratePlugin/turbodev/mindie-"
"turbo-0001.html)."
msgstr ""
"[MindIE Turbo](https://pypi.org/project/mindie-turbo) "
"是华为在昇腾硬件上开发的一款用于加速LLM推理引擎的插件库,包含自主研发的大语言模型优化算法及与推理引擎框架相关的优化。从 "
"[2.0rc1](https://www.hiascend.com/document/detail/zh/mindie/20RC1/AcceleratePlugin/turbodev/mindie-"
"turbo-0001.html) 起,支持 vLLM Ascend。"

#: ../../community/user_stories/index.md:11
msgid ""
"[GPUStack](https://github.com/gpustack/gpustack) is an open-source GPU "
"cluster manager for running AI models. It supports vLLM Ascend since "
"[v0.6.2](https://github.com/gpustack/gpustack/releases/tag/v0.6.2), see more"
" GPUStack performance evaluation info on "
"[link](https://mp.weixin.qq.com/s/pkytJVjcH9_OnffnsFGaew)."
msgstr ""
"[GPUStack](https://github.com/gpustack/gpustack) 是一个开源的 GPU 集群管理器,用于运行 AI "
"模型。从 [v0.6.2](https://github.com/gpustack/gpustack/releases/tag/v0.6.2) "
"版本开始支持 vLLM Ascend,更多 GPUStack 性能评测信息见 "
"[链接](https://mp.weixin.qq.com/s/pkytJVjcH9_OnffnsFGaew)。"

#: ../../community/user_stories/index.md:13
msgid ""
"[verl](https://github.com/volcengine/verl) is a flexible, efficient and "
"production-ready RL training library for large language models (LLMs), uses "
"vLLM Ascend since "
"[v0.4.0](https://github.com/volcengine/verl/releases/tag/v0.4.0), see more "
"info on [verl x Ascend "
"Quickstart](https://verl.readthedocs.io/en/latest/ascend_tutorial/ascend_quick_start.html)."
msgstr ""
"[verl](https://github.com/volcengine/verl) "
"是一个灵活、高效且可用于生产环境的大型语言模型(LLM)强化学习训练库,自 "
"[v0.4.0](https://github.com/volcengine/verl/releases/tag/v0.4.0) 起支持 vLLM "
"Ascend,更多信息请参见 [verl x Ascend "
"快速上手](https://verl.readthedocs.io/en/latest/ascend_tutorial/ascend_quick_start.html)。"

87
docs/source/locale/zh_CN/LC_MESSAGES/community/user_stories/llamafactory.po
Normal file
@@ -0,0 +1,87 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-ascend team
# This file is distributed under the same license as the vllm-ascend
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-ascend\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-07-18 09:01+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Language: zh_CN\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"Generated-By: Babel 2.17.0\n"

#: ../../community/user_stories/llamafactory.md:1
msgid "LLaMA-Factory"
msgstr "LLaMA-Factory"

#: ../../community/user_stories/llamafactory.md:3
msgid "**About / Introduction**"
msgstr "**关于 / 介绍**"

#: ../../community/user_stories/llamafactory.md:5
msgid ""
"[LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory) is an easy-to-use "
"and efficient platform for training and fine-tuning large language models. "
"With LLaMA-Factory, you can fine-tune hundreds of pre-trained models locally"
" without writing any code."
msgstr ""
"[LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory) "
"是一个易于使用且高效的平台,用于训练和微调大型语言模型。有了 LLaMA-Factory,你可以在本地对数百个预训练模型进行微调,无需编写任何代码。"

#: ../../community/user_stories/llamafactory.md:7
msgid ""
"LLaMA-Factory users need to evaluate and run inference on the model after "
"fine-tuning it."
msgstr "LLaMA-Factory 用户需要在微调后对模型进行评估和推理。"

#: ../../community/user_stories/llamafactory.md:9
msgid "**The Business Challenge**"
msgstr "**业务挑战**"

#: ../../community/user_stories/llamafactory.md:11
msgid ""
"LLaMA-Factory used transformers to perform inference on Ascend NPU, but the "
"speed was slow."
msgstr "LLaMA-Factory 使用 transformers 在 Ascend NPU 上进行推理,但速度较慢。"

#: ../../community/user_stories/llamafactory.md:13
msgid "**Solving Challenges and Benefits with vLLM Ascend**"
msgstr "**通过 vLLM Ascend 解决挑战与收益**"

#: ../../community/user_stories/llamafactory.md:15
msgid ""
"With the joint efforts of LLaMA-Factory and vLLM Ascend ([LLaMA-"
"Factory#7739](https://github.com/hiyouga/LLaMA-Factory/pull/7739)), the "
"performance of LLaMA-Factory in the model inference stage has been "
"significantly improved. According to the test results, the inference speed "
"of LLaMA-Factory has been increased to 2x compared to the transformers "
"version."
msgstr ""
"在 LLaMA-Factory 和 vLLM Ascend 的共同努力下(参见 [LLaMA-"
"Factory#7739](https://github.com/hiyouga/LLaMA-Factory/pull/7739)),LLaMA-"
"Factory 在模型推理阶段的性能得到了显著提升。根据测试结果,LLaMA-Factory 的推理速度相比 transformers 版本提升到了 2"
" 倍。"

#: ../../community/user_stories/llamafactory.md:17
msgid "**Learn more**"
msgstr "**了解更多**"

#: ../../community/user_stories/llamafactory.md:19
msgid ""
"See more about LLaMA-Factory and how it uses vLLM Ascend for inference on "
"the Ascend NPU in the following documentation: [LLaMA-Factory Ascend NPU "
"Inference](https://llamafactory.readthedocs.io/en/latest/advanced/npu_inference.html)."
msgstr ""
"在以下文档中查看更多关于 LLaMA-Factory 以及其如何在 Ascend NPU 上使用 vLLM Ascend 进行推理的信息:[LLaMA-"
"Factory Ascend NPU "
"推理](https://llamafactory.readthedocs.io/en/latest/advanced/npu_inference.html)。"

624
docs/source/locale/zh_CN/LC_MESSAGES/community/versioning_policy.po
Normal file
@@ -0,0 +1,624 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-ascend team
# This file is distributed under the same license as the vllm-ascend
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-ascend\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-07-18 09:01+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Language: zh_CN\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"Generated-By: Babel 2.17.0\n"

#: ../../community/versioning_policy.md:1
msgid "Versioning policy"
msgstr "版本管理策略"

#: ../../community/versioning_policy.md:3
msgid ""
"Starting with vLLM 0.7.x, the vLLM Ascend Plugin ([vllm-project/vllm-"
"ascend](https://github.com/vllm-project/vllm-ascend)) project follows "
"[PEP 440](https://peps.python.org/pep-0440/) to publish releases matching "
"vLLM ([vllm-project/vllm](https://github.com/vllm-project/vllm))."
msgstr ""
"从 vLLM 0.7.x 开始,vLLM Ascend 插件([vllm-project/vllm-"
"ascend](https://github.com/vllm-project/vllm-ascend))项目遵循 [PEP "
"440](https://peps.python.org/pep-0440/) ,以与 vLLM([vllm-"
"project/vllm](https://github.com/vllm-project/vllm))版本匹配发布。"

#: ../../community/versioning_policy.md:5
msgid "vLLM Ascend Plugin versions"
msgstr "vLLM Ascend 插件版本"

#: ../../community/versioning_policy.md:7
msgid ""
"Each vLLM Ascend release will be versioned: "
"`v[major].[minor].[micro][rcN][.postN]` (such as `v0.7.3rc1`, `v0.7.3`, "
"`v0.7.3.post1`)"
msgstr ""
"每个 vLLM Ascend 版本将采用以下版本格式:`v[major].[minor].[micro][rcN][.postN]`(例如 "
"`v0.7.3rc1`、`v0.7.3`、`v0.7.3.post1`)"

#: ../../community/versioning_policy.md:10
msgid ""
"**Final releases**: will typically be released every **3 months**, taking "
"the vLLM upstream release plan and Ascend software product release plan "
"into comprehensive consideration."
msgstr "**正式版本**:通常每**3个月**发布一次,将综合考虑 vLLM 上游发行计划和昇腾软件产品发行计划。"

#: ../../community/versioning_policy.md:11
msgid ""
"**Pre releases**: will typically be released **on demand**, ending with "
"rcN, which represents the Nth release candidate version, to support early "
"testing by our users prior to a final release."
msgstr "**预发布版本**:通常会**按需发布**,以 rcN 结尾,表示第N个候选发布版本,旨在支持用户在正式发布前进行早期测试。"

#: ../../community/versioning_policy.md:12
msgid ""
"**Post releases**: will typically be released **on demand** to address "
"minor errors in a final release. Unlike the [PEP-440 post "
"release](https://peps.python.org/pep-0440/#post-releases) suggestion, "
"it will contain actual bug fixes considering that the final release version "
"should be matched strictly with the vLLM final release version "
"(`v[major].[minor].[micro]`). The post version has to be published as a "
"patch version of the final release."
msgstr ""
"**后续版本**:通常会根据需要发布,以支持解决正式发布中的小错误。这与 [PEP-440 "
"的后续版本说明](https://peps.python.org/pep-0440/#post-releases) 建议不同,它将包含实际的 bug "
"修复,因为最终发布版本应严格与 vLLM "
"的最终发布版本(`v[major].[minor].[micro]`)匹配。后续版本必须以正式发布的补丁版本形式发布。"

#: ../../community/versioning_policy.md:14
msgid "For example:"
msgstr "例如:"

#: ../../community/versioning_policy.md:15
msgid ""
"`v0.7.x`: it's the first final release to match the vLLM `v0.7.x` version."
msgstr "`v0.7.x`:这是第一个与 vLLM `v0.7.x` 版本相匹配的正式发布版本。"

#: ../../community/versioning_policy.md:16
msgid "`v0.7.3rc1`: will be the first pre version of vLLM Ascend."
msgstr "`v0.7.3rc1`:将会是 vLLM Ascend 的第一个预发布版本。"

#: ../../community/versioning_policy.md:17
msgid ""
"`v0.7.3.post1`: will be the post release if the `v0.7.3` release has some "
"minor errors."
msgstr "`v0.7.3.post1`:如果 `v0.7.3` 版本发布有一些小错误,将作为后续修正版发布。"

#: ../../community/versioning_policy.md:19
msgid "Release Compatibility Matrix"
msgstr "版本兼容性矩阵"

#: ../../community/versioning_policy.md:21
msgid "Following is the Release Compatibility Matrix for vLLM Ascend Plugin:"
msgstr "以下是 vLLM Ascend 插件的版本兼容性矩阵:"

#: ../../community/versioning_policy.md
msgid "vLLM Ascend"
msgstr "vLLM Ascend"

#: ../../community/versioning_policy.md
msgid "vLLM"
msgstr "vLLM"

#: ../../community/versioning_policy.md
msgid "Python"
msgstr "Python"

#: ../../community/versioning_policy.md
msgid "Stable CANN"
msgstr "Stable CANN"

#: ../../community/versioning_policy.md
msgid "PyTorch/torch_npu"
msgstr "PyTorch/torch_npu"

#: ../../community/versioning_policy.md
msgid "MindIE Turbo"
msgstr "MindIE Turbo"

#: ../../community/versioning_policy.md
msgid "v0.9.2rc1"
msgstr "v0.9.2rc1"

#: ../../community/versioning_policy.md
msgid "v0.9.2"
msgstr "v0.9.2"

#: ../../community/versioning_policy.md
msgid ">= 3.9, < 3.12"
msgstr ">= 3.9,< 3.12"

#: ../../community/versioning_policy.md
msgid "8.1.RC1"
msgstr "8.1.RC1"

#: ../../community/versioning_policy.md
msgid "2.5.1 / 2.5.1.post1.dev20250619"
msgstr "2.5.1 / 2.5.1.post1.dev20250619"

#: ../../community/versioning_policy.md
msgid "v0.9.1rc1"
msgstr "v0.9.1rc1"

#: ../../community/versioning_policy.md
msgid "v0.9.1"
msgstr "v0.9.1"

#: ../../community/versioning_policy.md
msgid "2.5.1 / 2.5.1.post1.dev20250528"
msgstr "2.5.1 / 2.5.1.post1.dev20250528"

#: ../../community/versioning_policy.md
msgid "v0.9.0rc2"
msgstr "v0.9.0rc2"

#: ../../community/versioning_policy.md
msgid "v0.9.0"
msgstr "v0.9.0"

#: ../../community/versioning_policy.md
msgid "2.5.1 / 2.5.1"
msgstr "2.5.1 / 2.5.1"

#: ../../community/versioning_policy.md
msgid "v0.9.0rc1"
msgstr "v0.9.0rc1"

#: ../../community/versioning_policy.md
msgid "v0.8.5rc1"
msgstr "v0.8.5rc1"

#: ../../community/versioning_policy.md
msgid "v0.8.5.post1"
msgstr "v0.8.5.post1"

#: ../../community/versioning_policy.md
msgid "v0.8.4rc2"
msgstr "v0.8.4rc2"

#: ../../community/versioning_policy.md
msgid "v0.8.4"
msgstr "v0.8.4"

#: ../../community/versioning_policy.md
msgid "8.0.0"
msgstr "8.0.0"

#: ../../community/versioning_policy.md
msgid "v0.7.3.post1"
msgstr "v0.7.3.post1"

#: ../../community/versioning_policy.md
msgid "v0.7.3"
msgstr "v0.7.3"

#: ../../community/versioning_policy.md
msgid "2.0rc1"
msgstr "2.0rc1"

#: ../../community/versioning_policy.md:34
msgid "Release cadence"
msgstr "发布节奏"

#: ../../community/versioning_policy.md:36
msgid "release window"
msgstr "发布窗口"

#: ../../community/versioning_policy.md
msgid "Date"
msgstr "日期"

#: ../../community/versioning_policy.md
msgid "Event"
msgstr "事件"

#: ../../community/versioning_policy.md
msgid "2025.07.11"
msgstr "2025.07.11"

#: ../../community/versioning_policy.md
msgid "Release candidates, v0.9.2rc1"
msgstr "候选发布版本,v0.9.2rc1"

#: ../../community/versioning_policy.md
msgid "2025.06.22"
msgstr "2025.06.22"

#: ../../community/versioning_policy.md
msgid "Release candidates, v0.9.1rc1"
msgstr "候选发布版本,v0.9.1rc1"

#: ../../community/versioning_policy.md
msgid "2025.06.10"
msgstr "2025.06.10"

#: ../../community/versioning_policy.md
msgid "Release candidates, v0.9.0rc2"
msgstr "候选发布版本,v0.9.0rc2"

#: ../../community/versioning_policy.md
msgid "2025.06.09"
msgstr "2025.06.09"

#: ../../community/versioning_policy.md
msgid "Release candidates, v0.9.0rc1"
msgstr "候选发布版本,v0.9.0rc1"

#: ../../community/versioning_policy.md
msgid "2025.05.29"
msgstr "2025.05.29"

#: ../../community/versioning_policy.md
msgid "v0.7.x post release, v0.7.3.post1"
msgstr "v0.7.x 补丁版,v0.7.3.post1"

#: ../../community/versioning_policy.md
msgid "2025.05.08"
msgstr "2025.05.08"

#: ../../community/versioning_policy.md
msgid "v0.7.x Final release, v0.7.3"
msgstr "v0.7.x 正式版,v0.7.3"

#: ../../community/versioning_policy.md
msgid "2025.05.06"
msgstr "2025.05.06"

#: ../../community/versioning_policy.md
msgid "Release candidates, v0.8.5rc1"
msgstr "候选发布版本,v0.8.5rc1"

#: ../../community/versioning_policy.md
msgid "2025.04.28"
msgstr "2025.04.28"

#: ../../community/versioning_policy.md
msgid "Release candidates, v0.8.4rc2"
msgstr "候选发布版本,v0.8.4rc2"

#: ../../community/versioning_policy.md
msgid "2025.04.18"
msgstr "2025.04.18"

#: ../../community/versioning_policy.md
msgid "Release candidates, v0.8.4rc1"
msgstr "候选发布版本,v0.8.4rc1"

#: ../../community/versioning_policy.md
msgid "2025.03.28"
msgstr "2025.03.28"

#: ../../community/versioning_policy.md
msgid "Release candidates, v0.7.3rc2"
msgstr "候选发布版本,v0.7.3rc2"

#: ../../community/versioning_policy.md
msgid "2025.03.14"
msgstr "2025.03.14"

#: ../../community/versioning_policy.md
msgid "Release candidates, v0.7.3rc1"
msgstr "候选发布版本,v0.7.3rc1"

#: ../../community/versioning_policy.md
msgid "2025.02.19"
msgstr "2025.02.19"

#: ../../community/versioning_policy.md
msgid "Release candidates, v0.7.1rc1"
msgstr "候选发布版本,v0.7.1rc1"

#: ../../community/versioning_policy.md:53
msgid "Branch policy"
msgstr "分支策略"

#: ../../community/versioning_policy.md:55
msgid "vLLM Ascend has a main branch and dev branches."
msgstr "vLLM Ascend 有主分支和开发分支。"

#: ../../community/versioning_policy.md:57
msgid ""
"**main**: main branch, corresponds to the vLLM main branch and the latest 1 "
"or 2 release versions. It is continuously monitored for quality through "
"Ascend CI."
msgstr "**main**:main 分支,对应 vLLM 的主分支和最新的 1 或 2 个发布版本。该分支通过 Ascend CI 持续监控质量。"

#: ../../community/versioning_policy.md:58
msgid ""
"**vX.Y.Z-dev**: development branch, created with part of new releases of "
"vLLM. For example, `v0.7.3-dev` is the dev branch for vLLM `v0.7.3` version."
msgstr ""
"**vX.Y.Z-dev**:开发分支,是随着 vLLM 新版本的一部分一起创建的。例如,`v0.7.3-dev` 是 vLLM `v0.7.3` "
"版本的开发分支。"

#: ../../community/versioning_policy.md:60
msgid ""
"Usually, a commit should be ONLY first merged in the main branch, and then "
"backported to the dev branch to reduce maintenance costs as much as "
"possible."
msgstr "通常,提交应该只先合并到主分支,然后再回溯合并到开发分支,以尽可能降低维护成本。"

#: ../../community/versioning_policy.md:62
msgid "Maintenance branch and EOL:"
msgstr "维护分支与生命周期结束(EOL):"

#: ../../community/versioning_policy.md:63
msgid "The branch status will be in one of the following states:"
msgstr "分支状态将处于以下几种状态之一:"

#: ../../community/versioning_policy.md
msgid "Branch"
msgstr "分支"

#: ../../community/versioning_policy.md
msgid "Time frame"
msgstr "时间范围"

#: ../../community/versioning_policy.md
msgid "Summary"
msgstr "摘要"

#: ../../community/versioning_policy.md
msgid "Maintained"
msgstr "维护中"

#: ../../community/versioning_policy.md
msgid "Approximately 2-3 minor versions"
msgstr "大约 2-3 个小版本"

#: ../../community/versioning_policy.md
msgid "All bugfixes are appropriate. Releases produced, CI commitment."
msgstr "所有的错误修复都是合适的。正常发布版本,持续集成承诺。"

#: ../../community/versioning_policy.md
msgid "Unmaintained"
msgstr "无人维护"

#: ../../community/versioning_policy.md
msgid "Community interest driven"
msgstr "社区兴趣驱动"

#: ../../community/versioning_policy.md
msgid "All bugfixes are appropriate. No Releases produced, No CI commitment"
msgstr "所有的 bug 修复都是合适的。没有发布版本,不承诺持续集成(CI)。"

#: ../../community/versioning_policy.md
msgid "End of Life (EOL)"
msgstr "生命周期结束(EOL)"

#: ../../community/versioning_policy.md
msgid "N/A"
msgstr "不适用"

#: ../../community/versioning_policy.md
msgid "Branch no longer accepting changes"
msgstr "该分支不再接受更改"

#: ../../community/versioning_policy.md:71
msgid "Branch state"
msgstr "分支状态"

#: ../../community/versioning_policy.md:73
msgid ""
"Note that vLLM Ascend will only be released for certain vLLM release "
"versions rather than all versions. Hence, you might see that only some "
"versions have dev branches (such as only `0.7.1-dev` / `0.7.3-dev` but no "
"`0.7.2-dev`), this is as expected."
msgstr ""
"请注意,vLLM Ascend 只会针对某些 vLLM 发布版本发布,而不是所有版本。因此,您可能会看到只有部分版本拥有开发分支(例如只有 "
"`0.7.1-dev` / `0.7.3-dev`,而没有 `0.7.2-dev`),这是正常现象。"

#: ../../community/versioning_policy.md:75
msgid ""
"Usually, each minor version of vLLM (such as 0.7) will correspond to a vLLM "
"Ascend version branch and support its latest version (for example, we plan "
"to support version 0.7.3) as following shown:"
msgstr ""
"通常,vLLM 的每一个小版本(例如 0.7)都会对应一个 vLLM Ascend 版本分支,并支持其最新版本(例如,我们计划支持 0.7.3 "
"版),如下所示:"

#: ../../community/versioning_policy.md
msgid "Status"
msgstr "状态"

#: ../../community/versioning_policy.md
msgid "Note"
msgstr "注释"

#: ../../community/versioning_policy.md
msgid "main"
msgstr "main"

#: ../../community/versioning_policy.md
msgid "CI commitment for vLLM main branch and vLLM 0.9.2 branch"
msgstr "vLLM 主分支和 vLLM 0.9.2 分支的 CI 承诺"

#: ../../community/versioning_policy.md
msgid "v0.9.1-dev"
msgstr "v0.9.1-dev"

#: ../../community/versioning_policy.md
msgid "CI commitment for vLLM 0.9.1 version"
msgstr "vLLM 0.9.1 版本的 CI 承诺"

#: ../../community/versioning_policy.md
msgid "v0.7.3-dev"
msgstr "v0.7.3-dev"

#: ../../community/versioning_policy.md
msgid "CI commitment for vLLM 0.7.3 version"
msgstr "vLLM 0.7.3 版本的 CI 承诺"

#: ../../community/versioning_policy.md
msgid "v0.7.1-dev"
msgstr "v0.7.1-dev"

#: ../../community/versioning_policy.md
msgid "Replaced by v0.7.3-dev"
msgstr "已被 v0.7.3-dev 替代"

#: ../../community/versioning_policy.md:84
msgid "Backward compatibility"
msgstr "向后兼容性"

#: ../../community/versioning_policy.md:86
msgid ""
"For the main branch, vLLM Ascend should work with the vLLM main branch and "
"the latest 1 or 2 release versions. So to ensure backward compatibility, "
"we will do the following:"
msgstr ""
"对于主分支,vLLM Ascend 应该与 vLLM 主分支以及最新的 1 或 2 个发布版本兼容。因此,为了确保向后兼容性,我们将执行以下操作:"

#: ../../community/versioning_policy.md:87
msgid ""
"Both the main branch and the target vLLM release are tested by Ascend E2E "
"CI. For example, currently, the vLLM main branch and vLLM 0.8.4 are tested."
msgstr "主分支和目标 vLLM 发行版都经过了 Ascend E2E CI 的测试。例如,目前正在测试 vLLM 主分支和 vLLM 0.8.4。"

#: ../../community/versioning_policy.md:88
msgid ""
"For code changes, we will make sure that the changes are compatible with "
"the latest 1 or 2 vLLM release versions as well. For this, vLLM Ascend "
"introduced a version check mechanism inside the code. It'll check the "
"version of the installed vLLM package first to decide which code logic to "
"use. If users hit the `InvalidVersion` error, it sometimes means that they "
"have installed a dev/editable version of the vLLM package. In this case, we "
"provide the env variable `VLLM_VERSION` to let users specify the version of "
"the vLLM package to use."
msgstr ""
"对于代码更改,我们也会确保这些更改与最新的 1 或 2 个 vLLM 发行版本兼容。为此,vLLM Ascend "
"在代码中引入了版本检查机制。它会先检查已安装的 vLLM 包的版本,然后决定使用哪段代码逻辑。如果用户遇到 `InvalidVersion` "
"错误,这有时意味着他们安装了 dev/可编辑版本的 vLLM 包。此时,我们提供了环境变量 `VLLM_VERSION`,让用户可以指定要使用的 "
"vLLM 包版本。"

#: ../../community/versioning_policy.md:89
msgid ""
"For documentation changes, we will make sure that the changes are compatible"
" with the latest 1 or 2 vLLM release version as well. Note should be added "
"if there are any breaking changes."
msgstr "对于文档更改,我们会确保这些更改也兼容于最新的1个或2个 vLLM 发布版本。如果有任何重大变更,应添加说明。"

#: ../../community/versioning_policy.md:91
msgid "Document Branch Policy"
msgstr "文档分支政策"

#: ../../community/versioning_policy.md:92
msgid ""
"To reduce maintenance costs, **all branch documentation content should "
"remain consistent, and version differences can be controlled via variables "
"in [docs/source/conf.py](https://github.com/vllm-project/vllm-"
"ascend/blob/main/docs/source/conf.py)**. While this is not a simple task, it"
" is a principle we should strive to follow."
msgstr ""
"为了减少维护成本,**所有分支的文档内容应保持一致,版本差异可以通过 "
"[docs/source/conf.py](https://github.com/vllm-project/vllm-"
"ascend/blob/main/docs/source/conf.py) 中的变量进行控制**。虽然这并非易事,但这是我们应当努力遵循的原则。"

#: ../../community/versioning_policy.md
msgid "Version"
msgstr "版本"

#: ../../community/versioning_policy.md
msgid "Purpose"
msgstr "用途"

#: ../../community/versioning_policy.md
msgid "Code Branch"
msgstr "代码分支"

#: ../../community/versioning_policy.md
msgid "latest"
msgstr "最新"

#: ../../community/versioning_policy.md
msgid "Doc for the latest dev branch"
msgstr "最新开发分支的文档"

#: ../../community/versioning_policy.md
msgid "vX.Y.Z-dev (Will be `main` after the first final release)"
msgstr "vX.Y.Z-dev(在第一个正式版本发布后将成为 `main`)"

#: ../../community/versioning_policy.md
msgid "version"
msgstr "版本"

#: ../../community/versioning_policy.md
msgid "Doc for historical released versions"
msgstr "历史版本文档"

#: ../../community/versioning_policy.md
msgid "Git tags, like vX.Y.Z[rcN]"
msgstr "Git 标签,如 vX.Y.Z[rcN]"

#: ../../community/versioning_policy.md
msgid "stable(not yet released)"
msgstr "稳定版(尚未发布)"

#: ../../community/versioning_policy.md
msgid "Doc for latest final release branch"
msgstr "最新正式发布分支的文档"

#: ../../community/versioning_policy.md
msgid "Will be `vX.Y.Z-dev` after the first official release"
msgstr "首个正式发布后将会是 `vX.Y.Z-dev`"

#: ../../community/versioning_policy.md:100
msgid "As shown above:"
msgstr "如上所示:"

#: ../../community/versioning_policy.md:102
msgid ""
"`latest` documentation: Matches the current maintenance branch `vX.Y.Z-dev` "
"(Will be `main` after the first final release). Continuously updated to "
"ensure usability for the latest release."
msgstr ""
"`latest` 文档:匹配当前维护分支 `vX.Y.Z-dev`(在首次正式发布后将为 `main`)。持续更新,以确保适用于最新发布版本。"

#: ../../community/versioning_policy.md:103
msgid ""
"`version` documentation: Corresponds to specific released versions (e.g., "
"`v0.7.3`, `v0.7.3rc1`). No further updates after release."
msgstr "`version` 文档:对应特定的已发布版本(例如,`v0.7.3`、`v0.7.3rc1`)。发布后不再进行更新。"

#: ../../community/versioning_policy.md:104
msgid ""
"`stable` documentation (**not yet released**): Official release "
"documentation. Updates are allowed in real-time after release, typically "
"based on vX.Y.Z-dev. Once stable documentation is available, non-stable "
"versions should display a header warning: `You are viewing the latest "
"developer preview docs. Click here to view docs for the latest stable "
"release.`."
msgstr ""
"`stable` 文档(**尚未发布**):官方发布版文档。发布后允许实时更新,通常基于 "
"vX.Y.Z-dev。一旦稳定版文档可用,非稳定版本应显示一个顶部警告:`您正在查看最新的开发预览文档。点击此处查看最新稳定版本文档。`"

#: ../../community/versioning_policy.md:106
msgid "Software Dependency Management"
msgstr "软件依赖管理"

#: ../../community/versioning_policy.md:107
msgid ""
"`torch-npu`: Ascend Extension for PyTorch (torch-npu) releases a stable "
"version to [PyPi](https://pypi.org/project/torch-npu) every 3 months, a "
"development version (aka the POC version) every month, and a nightly "
"version every day. The PyPi stable version **CAN** be used in the vLLM "
"Ascend final version, the monthly dev version can **ONLY** be used in the "
"vLLM Ascend RC version for rapid iteration, and the nightly version "
"**CANNOT** be used in any vLLM Ascend version or branch."
msgstr ""
"`torch-npu`:Ascend Extension for PyTorch(torch-npu)每 3 个月会在 "
"[PyPi](https://pypi.org/project/torch-npu) 上发布一个稳定版本,每个月发布一个开发版本(即 POC "
"版本),每天发布一个 nightly 版本。PyPi 上的稳定版本**可以**用于 vLLM Ascend 的正式版本,月度开发版本**只能**用于 "
"vLLM Ascend 的 RC(候选发布)版本以便快速迭代,nightly 版本**不能**用于 vLLM Ascend 的任何版本和分支。"

187
docs/source/locale/zh_CN/LC_MESSAGES/developer_guide/contribution/index.po
Normal file
@@ -0,0 +1,187 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-ascend team
# This file is distributed under the same license as the vllm-ascend
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-ascend\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-07-18 09:01+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Language: zh_CN\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"Generated-By: Babel 2.17.0\n"

#: ../../developer_guide/contribution/index.md:107
msgid "Index"
msgstr "索引"

#: ../../developer_guide/contribution/index.md:1
msgid "Contributing"
msgstr "贡献"

#: ../../developer_guide/contribution/index.md:3
msgid "Building and testing"
msgstr "构建与测试"

#: ../../developer_guide/contribution/index.md:4
msgid ""
"It's recommended to set up a local development environment to build and test"
" before you submit a PR."
msgstr "建议先搭建本地开发环境来进行构建和测试,再提交 PR。"

#: ../../developer_guide/contribution/index.md:7
msgid "Setup development environment"
msgstr "搭建开发环境"

#: ../../developer_guide/contribution/index.md:9
msgid ""
"Theoretically, the vllm-ascend build is only supported on Linux because "
"`vllm-ascend` dependency `torch_npu` only supports Linux."
msgstr ""
"理论上,vllm-ascend 构建仅支持 Linux,因为 `vllm-ascend` 的依赖项 `torch_npu` 只支持 Linux。"

#: ../../developer_guide/contribution/index.md:12
msgid ""
"But you can still set up dev env on Linux/Windows/macOS for linting and "
"basic test as following commands:"
msgstr "但你仍然可以在 Linux/Windows/macOS 上按照以下命令设置开发环境,用于代码规约检查和基本测试:"

#: ../../developer_guide/contribution/index.md:15
msgid "Run lint locally"
msgstr "在本地运行 lint"

#: ../../developer_guide/contribution/index.md:33
msgid "Run CI locally"
msgstr "本地运行 CI"

#: ../../developer_guide/contribution/index.md:35
msgid "After completing the \"Run lint\" setup, you can run CI locally:"
msgstr "在完成“运行 lint”设置后,你可以在本地运行 CI:"

#: ../../developer_guide/contribution/index.md:61
msgid "Submit the commit"
msgstr "提交更改"

#: ../../developer_guide/contribution/index.md:68
msgid ""
"🎉 Congratulations! You have completed the development environment setup."
msgstr "🎉 恭喜!你已经完成了开发环境的搭建。"

#: ../../developer_guide/contribution/index.md:70
msgid "Test locally"
msgstr "本地测试"

#: ../../developer_guide/contribution/index.md:72
msgid ""
"You can refer to [Testing](./testing.md) doc to help you setup testing "
"environment and running tests locally."
msgstr "你可以参考 [测试](./testing.md) 文档,帮助你搭建测试环境并在本地运行测试。"

#: ../../developer_guide/contribution/index.md:74
msgid "DCO and Signed-off-by"
msgstr "DCO 和签名确认"

#: ../../developer_guide/contribution/index.md:76
msgid ""
"When contributing changes to this project, you must agree to the DCO. "
"Commits must include a `Signed-off-by:` header which certifies agreement "
"with the terms of the DCO."
msgstr "当为本项目贡献更改时,您必须同意 DCO。提交必须包含 `Signed-off-by:` 头部,以证明您同意 DCO 的条款。"

#: ../../developer_guide/contribution/index.md:78
msgid "Using `-s` with `git commit` will automatically add this header."
msgstr "在使用 `git commit` 时加上 `-s` 参数会自动添加这个头部信息。"

#: ../../developer_guide/contribution/index.md:80
msgid "PR Title and Classification"
msgstr "PR 标题与分类"

#: ../../developer_guide/contribution/index.md:82
msgid ""
"Only specific types of PRs will be reviewed. The PR title is prefixed "
"appropriately to indicate the type of change. Please use one of the "
"following:"
msgstr "只有特定类型的 PR 会被审核。PR 标题应使用合适的前缀以指明更改类型。请使用以下之一:"

#: ../../developer_guide/contribution/index.md:84
msgid "`[Attention]` for new features or optimization in attention."
msgstr "`[Attention]` 用于注意力机制中的新特性或优化。"

#: ../../developer_guide/contribution/index.md:85
msgid "`[Communicator]` for new features or optimization in communicators."
msgstr "`[Communicator]` 适用于通信器中的新特性或优化。"

#: ../../developer_guide/contribution/index.md:86
msgid "`[ModelRunner]` for new features or optimization in model runner."
msgstr "`[ModelRunner]` 用于模型运行器中的新功能或优化。"

#: ../../developer_guide/contribution/index.md:87
msgid "`[Platform]` for new features or optimization in platform."
msgstr "`[Platform]` 用于平台中的新功能或优化。"

#: ../../developer_guide/contribution/index.md:88
msgid "`[Worker]` for new features or optimization in worker."
msgstr "`[Worker]` 用于 worker 的新功能或优化。"

#: ../../developer_guide/contribution/index.md:89
msgid ""
"`[Core]` for new features or optimization in the core vllm-ascend logic "
"(such as platform, attention, communicators, model runner)"
msgstr "`[Core]` 用于核心 vllm-ascend 逻辑中的新特性或优化(例如平台、注意力机制、通信器、模型运行器)。"

#: ../../developer_guide/contribution/index.md:90
msgid "`[Kernel]` changes affecting compute kernels and ops."
msgstr "`[Kernel]` 影响计算内核和操作的更改。"

#: ../../developer_guide/contribution/index.md:91
msgid "`[Bugfix]` for bug fixes."
msgstr "`[Bugfix]` 用于表示错误修复。"

#: ../../developer_guide/contribution/index.md:92
msgid "`[Doc]` for documentation fixes and improvements."
msgstr "`[Doc]` 用于文档修复和改进。"

#: ../../developer_guide/contribution/index.md:93
msgid "`[Test]` for tests (such as unit tests)."
msgstr "`[Test]` 用于测试(如单元测试)。"

#: ../../developer_guide/contribution/index.md:94
msgid "`[CI]` for build or continuous integration improvements."
msgstr "`[CI]` 用于构建或持续集成的改进。"

#: ../../developer_guide/contribution/index.md:95
msgid ""
"`[Misc]` for PRs that do not fit the above categories. Please use this "
"sparingly."
msgstr "对于不属于上述类别的 PR,请使用 `[Misc]`。请谨慎使用此标签。"

#: ../../developer_guide/contribution/index.md:98
msgid ""
"If the PR spans more than one category, please include all relevant "
"prefixes."
msgstr "如果拉取请求(PR)涵盖多个类别,请包含所有相关的前缀。"

#: ../../developer_guide/contribution/index.md:101
msgid "Others"
msgstr "其他"

#: ../../developer_guide/contribution/index.md:103
msgid ""
"You may find more information about contributing to vLLM Ascend backend "
"plugin on "
"[<u>docs.vllm.ai</u>](https://docs.vllm.ai/en/latest/contributing/overview.html)."
" If you find any problem when contributing, you can feel free to submit a PR"
" to improve the doc to help other developers."
msgstr ""
"你可以在 "
"[<u>docs.vllm.ai</u>](https://docs.vllm.ai/en/latest/contributing/overview.html)"
" 上找到有关为 vLLM Ascend 后端插件做贡献的更多信息。如果你在贡献过程中遇到任何问题,欢迎随时提交 PR 来改进文档,以帮助其他开发者。"

237
docs/source/locale/zh_CN/LC_MESSAGES/developer_guide/contribution/testing.po
Normal file
@@ -0,0 +1,237 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-ascend team
# This file is distributed under the same license as the vllm-ascend
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-ascend\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-07-18 09:01+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Language: zh_CN\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"Generated-By: Babel 2.17.0\n"

#: ../../developer_guide/contribution/testing.md:1
msgid "Testing"
msgstr "测试"

#: ../../developer_guide/contribution/testing.md:3
msgid ""
"This section explains how to write e2e tests and unit tests to verify the "
"implementation of your feature."
msgstr "本节介绍如何编写端到端测试和单元测试,以验证你的功能实现。"

#: ../../developer_guide/contribution/testing.md:5
msgid "Setup test environment"
msgstr "设置测试环境"

#: ../../developer_guide/contribution/testing.md:7
msgid ""
"The fastest way to setup test environment is to use the main branch "
"container image:"
msgstr "搭建测试环境最快的方法是使用 main 分支的容器镜像:"

#: ../../developer_guide/contribution/testing.md
msgid "Local (CPU)"
msgstr "本地(CPU)"

#: ../../developer_guide/contribution/testing.md:18
msgid "You can run the unit tests on CPU with the following steps:"
msgstr "你可以按照以下步骤在 CPU 上运行单元测试:"

#: ../../developer_guide/contribution/testing.md
msgid "Single card"
msgstr "单卡"

#: ../../developer_guide/contribution/testing.md:85
#: ../../developer_guide/contribution/testing.md:123
msgid ""
"After starting the container, you should install the required packages:"
msgstr "启动容器后,你应该安装所需的软件包:"

#: ../../developer_guide/contribution/testing.md
msgid "Multi cards"
msgstr "多卡"

#: ../../developer_guide/contribution/testing.md:137
msgid "Running tests"
msgstr "运行测试"

#: ../../developer_guide/contribution/testing.md:139
msgid "Unit test"
msgstr "单元测试"

#: ../../developer_guide/contribution/testing.md:141
msgid "There are several principles to follow when writing unit tests:"
msgstr "编写单元测试时需要遵循几个原则:"

#: ../../developer_guide/contribution/testing.md:143
msgid ""
"The test file path should be consistent with source file and start with "
"`test_` prefix, such as: `vllm_ascend/worker/worker_v1.py` --> "
"`tests/ut/worker/test_worker_v1.py`"
msgstr ""
"测试文件的路径应与源文件保持一致,并以 `test_` 前缀开头,例如:`vllm_ascend/worker/worker_v1.py` --> "
"`tests/ut/worker/test_worker_v1.py`"

#: ../../developer_guide/contribution/testing.md:144
msgid ""
"The vLLM Ascend tests use the unittest framework, see "
"[here](https://docs.python.org/3/library/unittest.html#module-unittest) to "
"understand how to write unit tests."
msgstr ""
"vLLM Ascend 测试使用 unittest "
"框架,参见[这里](https://docs.python.org/3/library/unittest.html#module-"
"unittest)了解如何编写单元测试。"

#: ../../developer_guide/contribution/testing.md:145
msgid ""
"All unit tests can be run on CPU, so you must mock the device-related "
"functions on the host."
msgstr "所有单元测试都可以在 CPU 上运行,因此你必须在 host 侧 mock 与设备相关的函数。"

#: ../../developer_guide/contribution/testing.md:146
msgid ""
"Example: [tests/ut/test_ascend_config.py](https://github.com/vllm-"
"project/vllm-ascend/blob/main/tests/ut/test_ascend_config.py)."
msgstr ""
"示例:[tests/ut/test_ascend_config.py](https://github.com/vllm-project/vllm-"
"ascend/blob/main/tests/ut/test_ascend_config.py)。"

#: ../../developer_guide/contribution/testing.md:147
msgid "You can run the unit tests using `pytest`:"
msgstr "你可以使用 `pytest` 运行单元测试:"

#: ../../developer_guide/contribution/testing.md
msgid "Multi cards test"
msgstr "多卡测试"

#: ../../developer_guide/contribution/testing.md:192
msgid "E2E test"
msgstr "端到端测试"

#: ../../developer_guide/contribution/testing.md:194
msgid ""
"Although vllm-ascend CI provides [e2e test](https://github.com/vllm-"
"project/vllm-ascend/blob/main/.github/workflows/vllm_ascend_test.yaml) on "
"Ascend CI, you can also run it locally."
msgstr ""
"虽然 vllm-ascend CI 在 Ascend CI 上提供了 [端到端测试](https://github.com/vllm-"
"project/vllm-"
"ascend/blob/main/.github/workflows/vllm_ascend_test.yaml),你也可以在本地运行它。"

#: ../../developer_guide/contribution/testing.md:204
msgid "You can't run e2e test on CPU."
msgstr "你无法在 CPU 上运行 e2e 测试。"

#: ../../developer_guide/contribution/testing.md:240
msgid ""
"This will reproduce e2e test: "
"[vllm_ascend_test.yaml](https://github.com/vllm-project/vllm-"
"ascend/blob/main/.github/workflows/vllm_ascend_test.yaml)."
msgstr ""
"这将复现端到端测试:[vllm_ascend_test.yaml](https://github.com/vllm-project/vllm-"
"ascend/blob/main/.github/workflows/vllm_ascend_test.yaml)。"

#: ../../developer_guide/contribution/testing.md:242
msgid "E2E test example:"
msgstr "E2E 测试示例:"

#: ../../developer_guide/contribution/testing.md:244
msgid ""
"Offline test example: "
"[`tests/e2e/singlecard/test_offline_inference.py`](https://github.com/vllm-"
"project/vllm-"
"ascend/blob/main/tests/e2e/singlecard/test_offline_inference.py)"
msgstr ""
"离线测试示例:[`tests/e2e/singlecard/test_offline_inference.py`](https://github.com/vllm-"
"project/vllm-"
"ascend/blob/main/tests/e2e/singlecard/test_offline_inference.py)"

#: ../../developer_guide/contribution/testing.md:245
msgid ""
"Online test examples: "
"[`tests/e2e/singlecard/test_prompt_embedding.py`](https://github.com/vllm-"
"project/vllm-ascend/blob/main/tests/e2e/singlecard/test_prompt_embedding.py)"
msgstr ""
"在线测试示例:[`tests/e2e/singlecard/test_prompt_embedding.py`](https://github.com/vllm-"
"project/vllm-ascend/blob/main/tests/e2e/singlecard/test_prompt_embedding.py)"

#: ../../developer_guide/contribution/testing.md:246
msgid ""
"Correctness test example: "
"[`tests/e2e/singlecard/test_aclgraph.py`](https://github.com/vllm-"
"project/vllm-ascend/blob/main/tests/e2e/singlecard/test_aclgraph.py)"
msgstr ""
"正确性测试示例:[`tests/e2e/singlecard/test_aclgraph.py`](https://github.com/vllm-"
"project/vllm-ascend/blob/main/tests/e2e/singlecard/test_aclgraph.py)"

#: ../../developer_guide/contribution/testing.md:247
msgid ""
"Reduced Layer model test example: [test_torchair_graph_mode.py - "
"DeepSeek-V3-Pruning](https://github.com/vllm-project/vllm-"
"ascend/blob/20767a043cccb3764214930d4695e53941de87ec/tests/e2e/multicard/test_torchair_graph_mode.py#L48)"
msgstr ""
"简化层模型测试示例:[test_torchair_graph_mode.py - "
"DeepSeek-V3-Pruning](https://github.com/vllm-project/vllm-"
"ascend/blob/20767a043cccb3764214930d4695e53941de87ec/tests/e2e/multicard/test_torchair_graph_mode.py#L48)"

#: ../../developer_guide/contribution/testing.md:249
msgid ""
"The CI resource is limited, so you might need to reduce the layer number "
"of the model. Below is an example of how to generate a reduced layer model:"
msgstr "CI 资源有限,您可能需要减少模型的层数,下面是一个生成减少层数模型的示例:"

#: ../../developer_guide/contribution/testing.md:250
|
||||
msgid ""
|
||||
"Fork the original model repo in modelscope, we need all the files in the "
|
||||
"repo except for weights."
|
||||
msgstr "在 modelscope 中 fork 原始模型仓库,我们需要仓库中的所有文件,除了权重文件。"
|
||||
|
||||
#: ../../developer_guide/contribution/testing.md:251
|
||||
#, python-brace-format
|
||||
msgid ""
|
||||
"Set `num_hidden_layers` to the expected number of layers, e.g., "
|
||||
"`{\"num_hidden_layers\": 2,}`"
|
||||
msgstr "将 `num_hidden_layers` 设置为期望的层数,例如 `{\"num_hidden_layers\": 2,}`"
|
||||
|
||||
#: ../../developer_guide/contribution/testing.md:252
|
||||
msgid ""
|
||||
"Copy the following python script as `generate_random_weight.py`. Set the "
|
||||
"relevant parameters `MODEL_LOCAL_PATH`, `DIST_DTYPE` and `DIST_MODEL_PATH` "
|
||||
"as needed:"
|
||||
msgstr ""
|
||||
"将以下 Python 脚本复制为 `generate_random_weight.py`。根据需要设置相关参数 "
|
||||
"`MODEL_LOCAL_PATH`、`DIST_DTYPE` 和 `DIST_MODEL_PATH`:"
|
||||
|
||||
#: ../../developer_guide/contribution/testing.md:270
|
||||
msgid "Run doctest"
|
||||
msgstr "运行 doctest"
|
||||
|
||||
#: ../../developer_guide/contribution/testing.md:272
|
||||
msgid ""
|
||||
"vllm-ascend provides a `vllm-ascend/tests/e2e/run_doctests.sh` command to "
|
||||
"run all doctests in the doc files. The doctest is a good way to make sure "
|
||||
"the docs are up to date and the examples are executable, you can run it "
|
||||
"locally as follows:"
|
||||
msgstr ""
|
||||
"vllm-ascend 提供了一个 `vllm-ascend/tests/e2e/run_doctests.sh` 命令,用于运行文档文件中的所有 "
|
||||
"doctest。doctest 是确保文档保持最新且示例可执行的好方法,你可以按照以下方式在本地运行它:"
|
||||
|
||||
#: ../../developer_guide/contribution/testing.md:280
|
||||
msgid ""
|
||||
"This will reproduce the same environment as the CI: "
|
||||
"[vllm_ascend_doctest.yaml](https://github.com/vllm-project/vllm-"
|
||||
"ascend/blob/main/.github/workflows/vllm_ascend_doctest.yaml)."
|
||||
msgstr ""
|
||||
"这将复现与 CI 相同的环境:[vllm_ascend_doctest.yaml](https://github.com/vllm-"
|
||||
"project/vllm-ascend/blob/main/.github/workflows/vllm_ascend_doctest.yaml)。"
|
||||
@@ -0,0 +1,26 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-ascend team
# This file is distributed under the same license as the vllm-ascend
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-ascend\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-07-18 09:01+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Language: zh_CN\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"Generated-By: Babel 2.17.0\n"

#: ../../developer_guide/evaluation/accuracy_report/index.md:1
#: ../../developer_guide/evaluation/accuracy_report/index.md:3
msgid "Accuracy Report"
msgstr "准确性报告"
@@ -0,0 +1,26 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-ascend team
# This file is distributed under the same license as the vllm-ascend
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-ascend\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-07-18 09:01+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Language: zh_CN\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"Generated-By: Babel 2.17.0\n"

#: ../../developer_guide/evaluation/index.md:1
#: ../../developer_guide/evaluation/index.md:3
msgid "Accuracy"
msgstr "准确性"
@@ -0,0 +1,112 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-ascend team
# This file is distributed under the same license as the vllm-ascend
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-ascend\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-07-18 09:01+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Language: zh_CN\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"Generated-By: Babel 2.17.0\n"

#: ../../developer_guide/evaluation/using_evalscope.md:1
msgid "Using EvalScope"
msgstr "使用 EvalScope"

#: ../../developer_guide/evaluation/using_evalscope.md:3
msgid ""
"This document will guide you have model inference stress testing and "
"accuracy testing using [EvalScope](https://github.com/modelscope/evalscope)."
msgstr ""
"本文档将指导您如何使用 [EvalScope](https://github.com/modelscope/evalscope) "
"进行模型推理压力测试和精度测试。"

#: ../../developer_guide/evaluation/using_evalscope.md:5
msgid "1. Online serving"
msgstr "1. 在线服务"

#: ../../developer_guide/evaluation/using_evalscope.md:7
msgid "You can run docker container to start the vLLM server on a single NPU:"
msgstr "你可以运行 docker 容器,在单个 NPU 上启动 vLLM 服务器:"

#: ../../developer_guide/evaluation/using_evalscope.md:34
msgid "If your service start successfully, you can see the info shown below:"
msgstr "如果你的服务启动成功,你会看到如下所示的信息:"

#: ../../developer_guide/evaluation/using_evalscope.md:42
msgid ""
"Once your server is started, you can query the model with input prompts in "
"new terminal:"
msgstr "一旦你的服务器启动后,你可以在新的终端中用输入提示词查询模型:"

#: ../../developer_guide/evaluation/using_evalscope.md:55
msgid "2. Install EvalScope using pip"
msgstr "2. 使用 pip 安装 EvalScope"

#: ../../developer_guide/evaluation/using_evalscope.md:57
msgid "You can install EvalScope by using:"
msgstr "你可以使用以下方式安装 EvalScope:"

#: ../../developer_guide/evaluation/using_evalscope.md:65
msgid "3. Run gsm8k accuracy test using EvalScope"
msgstr "3. 使用 EvalScope 运行 gsm8k 准确率测试"

#: ../../developer_guide/evaluation/using_evalscope.md:67
msgid "You can `evalscope eval` run gsm8k accuracy test:"
msgstr "你可以使用 `evalscope eval` 运行 gsm8k 准确率测试:"

#: ../../developer_guide/evaluation/using_evalscope.md:78
#: ../../developer_guide/evaluation/using_evalscope.md:114
msgid "After 1-2 mins, the output is as shown below:"
msgstr "1-2 分钟后,输出如下所示:"

#: ../../developer_guide/evaluation/using_evalscope.md:88
msgid ""
"See more detail in: [EvalScope doc - Model API Service "
"Evaluation](https://evalscope.readthedocs.io/en/latest/get_started/basic_usage.html#model-"
"api-service-evaluation)."
msgstr ""
"更多详情请见:[EvalScope 文档 - 模型 API "
"服务评测](https://evalscope.readthedocs.io/en/latest/get_started/basic_usage.html#model-"
"api-service-evaluation)。"

#: ../../developer_guide/evaluation/using_evalscope.md:90
msgid "4. Run model inference stress testing using EvalScope"
msgstr "4. 使用 EvalScope 运行模型推理压力测试"

#: ../../developer_guide/evaluation/using_evalscope.md:92
msgid "Install EvalScope[perf] using pip"
msgstr "使用 pip 安装 EvalScope[perf]"

#: ../../developer_guide/evaluation/using_evalscope.md:98
msgid "Basic usage"
msgstr "基本用法"

#: ../../developer_guide/evaluation/using_evalscope.md:100
msgid "You can use `evalscope perf` run perf test:"
msgstr "你可以使用 `evalscope perf` 运行性能测试:"

#: ../../developer_guide/evaluation/using_evalscope.md:112
msgid "Output results"
msgstr "输出结果"

#: ../../developer_guide/evaluation/using_evalscope.md:173
msgid ""
"See more detail in: [EvalScope doc - Model Inference Stress "
"Testing](https://evalscope.readthedocs.io/en/latest/user_guides/stress_test/quick_start.html#basic-"
"usage)."
msgstr ""
"更多详情见:[EvalScope 文档 - "
"模型推理压力测试](https://evalscope.readthedocs.io/en/latest/user_guides/stress_test/quick_start.html#basic-"
"usage)。"
@@ -0,0 +1,65 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-ascend team
# This file is distributed under the same license as the vllm-ascend
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-ascend\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-07-18 09:01+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Language: zh_CN\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"Generated-By: Babel 2.17.0\n"

#: ../../developer_guide/evaluation/using_lm_eval.md:1
msgid "Using lm-eval"
msgstr "使用 lm-eval"

#: ../../developer_guide/evaluation/using_lm_eval.md:2
msgid ""
"This document will guide you have a accuracy testing using [lm-"
"eval](https://github.com/EleutherAI/lm-evaluation-harness)."
msgstr ""
"本文将指导你如何使用 [lm-eval](https://github.com/EleutherAI/lm-evaluation-harness) "
"进行准确率测试。"

#: ../../developer_guide/evaluation/using_lm_eval.md:4
msgid "1. Run docker container"
msgstr "1. 运行 docker 容器"

#: ../../developer_guide/evaluation/using_lm_eval.md:6
msgid "You can run docker container on a single NPU:"
msgstr "你可以在单个 NPU 上运行 docker 容器:"

#: ../../developer_guide/evaluation/using_lm_eval.md:33
msgid "2. Run ceval accuracy test using lm-eval"
msgstr "2. 使用 lm-eval 运行 ceval 准确性测试"

#: ../../developer_guide/evaluation/using_lm_eval.md:34
msgid "Install lm-eval in the container."
msgstr "在容器中安装 lm-eval。"

#: ../../developer_guide/evaluation/using_lm_eval.md:39
msgid "Run the following command:"
msgstr "运行以下命令:"

#: ../../developer_guide/evaluation/using_lm_eval.md:50
msgid "After 1-2 mins, the output is as shown below:"
msgstr "1-2 分钟后,输出如下所示:"

#: ../../developer_guide/evaluation/using_lm_eval.md:62
msgid ""
"You can see more usage on [Lm-eval Docs](https://github.com/EleutherAI/lm-"
"evaluation-harness/blob/main/docs/README.md)."
msgstr ""
"你可以在 [Lm-eval 文档](https://github.com/EleutherAI/lm-evaluation-"
"harness/blob/main/docs/README.md) 上查看更多用法。"
@@ -0,0 +1,83 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-ascend team
# This file is distributed under the same license as the vllm-ascend
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-ascend\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-07-18 09:01+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Language: zh_CN\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"Generated-By: Babel 2.17.0\n"

#: ../../developer_guide/evaluation/using_opencompass.md:1
msgid "Using OpenCompass"
msgstr "使用 OpenCompass"

#: ../../developer_guide/evaluation/using_opencompass.md:2
msgid ""
"This document will guide you have a accuracy testing using "
"[OpenCompass](https://github.com/open-compass/opencompass)."
msgstr ""
"本文档将指导你如何使用 [OpenCompass](https://github.com/open-compass/opencompass) "
"进行准确率测试。"

#: ../../developer_guide/evaluation/using_opencompass.md:4
msgid "1. Online Serving"
msgstr "1. 在线服务"

#: ../../developer_guide/evaluation/using_opencompass.md:6
msgid "You can run docker container to start the vLLM server on a single NPU:"
msgstr "你可以运行 docker 容器,在单个 NPU 上启动 vLLM 服务器:"

#: ../../developer_guide/evaluation/using_opencompass.md:32
msgid "If your service start successfully, you can see the info shown below:"
msgstr "如果你的服务启动成功,你会看到如下所示的信息:"

#: ../../developer_guide/evaluation/using_opencompass.md:39
msgid ""
"Once your server is started, you can query the model with input prompts in "
"new terminal:"
msgstr "一旦你的服务器启动后,你可以在新的终端中用输入提示词查询模型:"

#: ../../developer_guide/evaluation/using_opencompass.md:51
msgid "2. Run ceval accuracy test using OpenCompass"
msgstr "2. 使用 OpenCompass 运行 ceval 准确率测试"

#: ../../developer_guide/evaluation/using_opencompass.md:52
msgid ""
"Install OpenCompass and configure the environment variables in the "
"container."
msgstr "在容器中安装 OpenCompass 并配置环境变量。"

#: ../../developer_guide/evaluation/using_opencompass.md:64
msgid ""
"Add `opencompass/configs/eval_vllm_ascend_demo.py` with the following "
"content:"
msgstr "添加 `opencompass/configs/eval_vllm_ascend_demo.py`,内容如下:"

#: ../../developer_guide/evaluation/using_opencompass.md:104
msgid "Run the following command:"
msgstr "运行以下命令:"

#: ../../developer_guide/evaluation/using_opencompass.md:110
msgid "After 1-2 mins, the output is as shown below:"
msgstr "1-2 分钟后,输出如下所示:"

#: ../../developer_guide/evaluation/using_opencompass.md:120
msgid ""
"You can see more usage on [OpenCompass "
"Docs](https://opencompass.readthedocs.io/en/latest/index.html)."
msgstr ""
"你可以在 [OpenCompass "
"文档](https://opencompass.readthedocs.io/en/latest/index.html) 查看更多用法。"
@@ -0,0 +1,33 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-ascend team
# This file is distributed under the same license as the vllm-ascend
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-ascend\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-07-18 09:01+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Language: zh_CN\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"Generated-By: Babel 2.17.0\n"

#: ../../developer_guide/feature_guide/index.md:1
#: ../../developer_guide/feature_guide/index.md:5
msgid "Feature Guide"
msgstr "功能指南"

#: ../../developer_guide/feature_guide/index.md:3
msgid ""
"This section provides an overview of the features implemented in vLLM "
"Ascend. Developers can refer to this guide to understand how vLLM Ascend "
"works."
msgstr "本节概述了 vLLM Ascend 中实现的功能。开发者可以参考本指南以了解 vLLM Ascend 的工作原理。"
@@ -0,0 +1,248 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-ascend team
# This file is distributed under the same license as the vllm-ascend
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-ascend\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-07-18 09:01+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Language: zh_CN\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"Generated-By: Babel 2.17.0\n"

#: ../../developer_guide/feature_guide/patch.md:1
msgid "Patch in vLLM Ascend"
msgstr "在 vLLM Ascend 中的补丁"

#: ../../developer_guide/feature_guide/patch.md:3
msgid ""
"vLLM Ascend is a platform plugin for vLLM. Due to the release cycle of vLLM "
"and vLLM Ascend is different, and the hardware limitation in some case, we "
"need to patch some code in vLLM to make it compatible with vLLM Ascend."
msgstr ""
"vLLM Ascend 是 vLLM 的一个平台插件。由于 vLLM 和 vLLM Ascend "
"的发布周期不同,并且在某些情况下存在硬件限制,我们需要对 vLLM 进行一些代码补丁,以使其能够兼容 vLLM Ascend。"

#: ../../developer_guide/feature_guide/patch.md:5
msgid ""
"In vLLM Ascend code, we provide a patch module `vllm_ascend/patch` to "
"address the change for vLLM."
msgstr "在 vLLM Ascend 代码中,我们提供了一个补丁模块 `vllm_ascend/patch` 用于应对 vLLM 的变更。"

#: ../../developer_guide/feature_guide/patch.md:7
msgid "Principle"
msgstr "原理"

#: ../../developer_guide/feature_guide/patch.md:9
msgid ""
"We should keep in mind that Patch is not the best way to make vLLM Ascend "
"compatible. It's just a temporary solution. The best way is to contribute "
"the change to vLLM to make it compatible with vLLM Ascend originally. In "
"vLLM Ascend, we have the basic principle for Patch strategy:"
msgstr ""
"我们需要记住,Patch 不是让 vLLM 兼容 Ascend 的最佳方式,这只是一个临时的解决方案。最好的方法是将修改贡献到 vLLM 项目中,从而让"
" vLLM 原生支持 Ascend。对于 vLLM Ascend,我们对 Patch 策略有一个基本原则:"

#: ../../developer_guide/feature_guide/patch.md:11
msgid "Less is more. Please do not patch unless it's the only way currently."
msgstr "少即是多。请不要打补丁,除非这是目前唯一的方法。"

#: ../../developer_guide/feature_guide/patch.md:12
msgid ""
"Once a patch is added, it's required to describe the future plan for "
"removing the patch."
msgstr "一旦补丁被添加,必须说明将来移除该补丁的计划。"

#: ../../developer_guide/feature_guide/patch.md:13
msgid "Anytime, clean the patch code is welcome."
msgstr "任何时候,欢迎清理补丁代码。"

#: ../../developer_guide/feature_guide/patch.md:15
msgid "How it works"
msgstr "工作原理"

#: ../../developer_guide/feature_guide/patch.md:17
msgid "In `vllm_ascend/patch`, you can see the code structure as follows:"
msgstr "在 `vllm_ascend/patch` 目录中,你可以看到如下代码结构:"

#: ../../developer_guide/feature_guide/patch.md:33
msgid ""
"**platform**: The patch code in this directory is for patching the code in "
"vLLM main process. It's called by "
"`vllm_ascend/platform::NPUPlatform::pre_register_and_update` very early when"
" vLLM is initialized."
msgstr ""
"**platform**:此目录下的补丁代码用于修补 vLLM 主进程中的代码。当 vLLM 初始化时,会在很早的阶段由 "
"`vllm_ascend/platform::NPUPlatform::pre_register_and_update` 调用。"

#: ../../developer_guide/feature_guide/patch.md:34
msgid ""
"For online mode, vLLM process calls the platform patch here "
"`vllm/vllm/engine/arg_utils.py::AsyncEngineArgs.add_cli_args` when parsing "
"the cli args."
msgstr ""
"对于在线模式,vLLM 进程在解析命令行参数时,会在 "
"`vllm/vllm/engine/arg_utils.py::AsyncEngineArgs.add_cli_args` 这里调用平台补丁。"

#: ../../developer_guide/feature_guide/patch.md:35
msgid ""
"For offline mode, vLLM process calls the platform patch here "
"`vllm/vllm/engine/arg_utils.py::EngineArgs.create_engine_config` when "
"parsing the input parameters."
msgstr ""
"对于离线模式,vLLM 进程在解析输入参数时,会在此处调用平台补丁 "
"`vllm/vllm/engine/arg_utils.py::EngineArgs.create_engine_config`。"

#: ../../developer_guide/feature_guide/patch.md:36
msgid ""
"**worker**: The patch code in this directory is for patching the code in "
"vLLM worker process. It's called by "
"`vllm_ascend/worker/worker_v1::NPUWorker::__init__` when the vLLM worker "
"process is initialized."
msgstr ""
"**worker**:此目录中的补丁代码用于修补 vLLM worker 进程中的代码。在初始化 vLLM worker 进程时,会被 "
"`vllm_ascend/worker/worker_v1::NPUWorker::__init__` 调用。"

#: ../../developer_guide/feature_guide/patch.md:37
msgid ""
"For both online and offline mode, vLLM engine core process calls the worker "
"patch here `vllm/vllm/worker/worker_base.py::WorkerWrapperBase.init_worker` "
"when initializing the worker process."
msgstr ""
"无论是在线还是离线模式,vLLM 引擎核心进程在初始化 worker 进程时,都会在这里调用 worker "
"补丁:`vllm/vllm/worker/worker_base.py::WorkerWrapperBase.init_worker`。"

#: ../../developer_guide/feature_guide/patch.md:39
msgid ""
"In both **platform** and **worker** folder, there are several patch modules."
" They are used for patching different version of vLLM."
msgstr "在 **platform** 和 **worker** 文件夹中都有一些补丁模块。它们用于修补不同版本的 vLLM。"

#: ../../developer_guide/feature_guide/patch.md:41
msgid ""
"`patch_0_9_2`: This module is used for patching vLLM 0.9.2. The version is "
"always the nearest version of vLLM. Once vLLM is released, we will drop this"
" patch module and bump to a new version. For example, `patch_0_9_2` is used "
"for patching vLLM 0.9.2."
msgstr ""
"`patch_0_9_2`:此模块用于修补 vLLM 0.9.2。该版本始终对应于 vLLM 的最近版本。一旦 vLLM "
"发布新版本,我们将移除此补丁模块并升级到新版本。例如,`patch_0_9_2` 就是用于修补 vLLM 0.9.2 的。"

#: ../../developer_guide/feature_guide/patch.md:42
msgid ""
"`patch_main`: This module is used for patching the code in vLLM main branch."
msgstr "`patch_main`:该模块用于修补 vLLM 主分支代码。"

#: ../../developer_guide/feature_guide/patch.md:43
msgid ""
"`patch_common`: This module is used for patching both vLLM 0.9.2 and vLLM "
"main branch."
msgstr "`patch_common`:此模块用于同时修补 vLLM 0.9.2 版本和 vLLM 主分支。"

#: ../../developer_guide/feature_guide/patch.md:45
msgid "How to write a patch"
msgstr "如何撰写补丁"

#: ../../developer_guide/feature_guide/patch.md:47
msgid ""
"Before writing a patch, following the principle above, we should patch the "
"least code. If it's necessary, we can patch the code in either **platform** "
"and **worker** folder. Here is an example to patch `distributed` module in "
"vLLM."
msgstr ""
"在编写补丁之前,遵循上述原则,我们应尽量修改最少的代码。如果有必要,我们可以修改 **platform** 和 **worker** "
"文件夹中的代码。下面是一个在 vLLM 中修改 `distributed` 模块的示例。"

#: ../../developer_guide/feature_guide/patch.md:49
msgid ""
"Decide which version of vLLM we should patch. For example, after analysis, "
"here we want to patch both 0.9.2 and main of vLLM."
msgstr "决定我们应该修补哪个版本的 vLLM。例如,经过分析后,这里我们想要同时修补 vLLM 的 0.9.2 版和主分支(main)。"

#: ../../developer_guide/feature_guide/patch.md:50
msgid ""
"Decide which process we should patch. For example, here `distributed` "
"belongs to the vLLM main process, so we should patch `platform`."
msgstr "决定我们应该修补哪个进程。例如,这里 `distributed` 属于 vLLM 主进程,所以我们应该修补 `platform`。"

#: ../../developer_guide/feature_guide/patch.md:51
#, python-brace-format
msgid ""
"Create the patch file in the right folder. The file should be named as "
"`patch_{module_name}.py`. The example here is "
"`vllm_ascend/patch/platform/patch_common/patch_distributed.py`."
msgstr ""
"在正确的文件夹中创建补丁文件。文件应命名为 `patch_{module_name}.py`。此处的示例是 "
"`vllm_ascend/patch/platform/patch_common/patch_distributed.py`。"

#: ../../developer_guide/feature_guide/patch.md:52
msgid "Write your patch code in the new file. Here is an example:"
msgstr "在新文件中编写你的补丁代码。以下是一个示例:"

#: ../../developer_guide/feature_guide/patch.md:62
msgid ""
"Import the patch file in `__init__.py`. In this example, add `import "
"vllm_ascend.patch.platform.patch_common.patch_distributed` into "
"`vllm_ascend/patch/platform/patch_common/__init__.py`."
msgstr ""
"在 `__init__.py` 中导入补丁文件。在这个示例中,将 `import "
"vllm_ascend.patch.platform.patch_common.patch_distributed` 添加到 "
"`vllm_ascend/patch/platform/patch_common/__init__.py` 中。"

#: ../../developer_guide/feature_guide/patch.md:63
msgid ""
"Add the description of the patch in `vllm_ascend/patch/__init__.py`. The "
"description format is as follows:"
msgstr "在 `vllm_ascend/patch/__init__.py` 中添加补丁的描述。描述格式如下:"

#: ../../developer_guide/feature_guide/patch.md:77
msgid ""
"Add the Unit Test and E2E Test. Any newly added code in vLLM Ascend should "
"contain the Unit Test and E2E Test as well. You can find more details in "
"[test guide](../contribution/testing.md)"
msgstr ""
"添加单元测试和端到端(E2E)测试。在 vLLM Ascend 中新增的任何代码也应包含单元测试和端到端测试。更多详情请参见 "
"[测试指南](../contribution/testing.md)。"

#: ../../developer_guide/feature_guide/patch.md:80
msgid "Limitation"
msgstr "限制"

#: ../../developer_guide/feature_guide/patch.md:81
msgid ""
"In V1 Engine, vLLM starts three kinds of process: Main process, EngineCore "
"process and Worker process. Now vLLM Ascend only support patch the code in "
"Main process and Worker process by default. If you want to patch the code "
"runs in EngineCore process, you should patch EngineCore process entirely "
"during setup, the entry code is here `vllm.v1.engine.core`. Please override "
"`EngineCoreProc` and `DPEngineCoreProc` entirely."
msgstr ""
"在 V1 引擎中,vLLM 会启动三种类型的进程:主进程、EngineCore 进程和 Worker 进程。目前 vLLM Ascend "
"默认仅支持对主进程和 Worker 进程中的代码打补丁。如果你想给运行在 EngineCore 进程中的代码打补丁,你需要在启动阶段对 EngineCore "
"进程整体打补丁,入口代码在 `vllm.v1.engine.core`。请完整地重写 `EngineCoreProc` 和 "
"`DPEngineCoreProc`。"

#: ../../developer_guide/feature_guide/patch.md:82
msgid ""
"If you are running an edited vLLM code, the version of the vLLM may be "
"changed automatically. For example, if you runs an edited vLLM based on "
"v0.9.n, the version of vLLM may be change to v0.9.nxxx, in this case, the "
"patch for v0.9.n in vLLM Ascend would not work as expect, because that vLLM "
"Ascend can't distinguish the version of vLLM you're using. In this case, you"
" can set the environment variable `VLLM_VERSION` to specify the version of "
"vLLM you're using, then the patch for v0.9.2 should work."
msgstr ""
"如果你运行的是经过编辑的 vLLM 代码,vLLM 的版本可能会被自动更改。例如,如果你基于 v0.9.n 运行了编辑后的 vLLM,vLLM "
"的版本可能会变为 v0.9.nxxx,在这种情况下,vLLM Ascend 的 v0.9.n 补丁将无法正常工作,因为 vLLM Ascend "
"无法区分你所使用的 vLLM 版本。这时,你可以设置环境变量 `VLLM_VERSION` 来指定你所使用的 vLLM 版本,这样对 v0.9.2 "
"的补丁就应该可以正常工作。"
@@ -0,0 +1,333 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-ascend team
# This file is distributed under the same license as the vllm-ascend
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-ascend\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-07-18 09:01+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Language: zh_CN\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"Generated-By: Babel 2.17.0\n"

#: ../../developer_guide/modeling/adding_a_new_model.md:1
msgid "Adding a New Model"
msgstr "添加新模型"

#: ../../developer_guide/modeling/adding_a_new_model.md:3
msgid ""
"This guide demonstrates how to integrate a novel or customized model into "
"vllm-ascend. For foundational concepts, it is highly recommended to refer to"
" [vllm official doc: Adding a New "
"Model](https://docs.vllm.ai/en/stable/contributing/model/) first."
msgstr ""
"本指南演示如何将新颖或自定义的模型集成到 vllm-ascend 中。对于基础概念,强烈建议先参考 [vllm "
"官方文档:添加新模型](https://docs.vllm.ai/en/stable/contributing/model/)。"

#: ../../developer_guide/modeling/adding_a_new_model.md:6
msgid "Step 1: Implementing Models with `torch` and `torch_npu`"
msgstr "步骤 1:使用 `torch` 和 `torch_npu` 实现模型"

#: ../../developer_guide/modeling/adding_a_new_model.md:8
msgid ""
"This section provides instructions for implementing new models compatible "
"with vllm and vllm-ascend."
msgstr "本节提供了实现与 vllm 和 vllm-ascend 兼容的新模型的相关说明。"

#: ../../developer_guide/modeling/adding_a_new_model.md:10
msgid "**Before starting:**"
msgstr "**开始之前:**"

#: ../../developer_guide/modeling/adding_a_new_model.md:12
msgid ""
"Verify whether your model already exists in vllm's "
"[models](https://github.com/vllm-"
"project/vllm/tree/main/vllm/model_executor/models) directory."
msgstr ""
"请确认你的模型是否已经存在于 vllm 的 [models](https://github.com/vllm-"
"project/vllm/tree/main/vllm/model_executor/models) 目录中。"

#: ../../developer_guide/modeling/adding_a_new_model.md:13
msgid ""
"Use existing models' implementation as templates to accelerate your "
"development."
msgstr "使用已有模型的实现作为模板以加速您的开发。"

#: ../../developer_guide/modeling/adding_a_new_model.md:15
msgid "Method 1: Implementing New Models from Scratch"
msgstr "方法一:从零开始实现新模型"

#: ../../developer_guide/modeling/adding_a_new_model.md:17
msgid ""
"Follow vllm's [OPT model "
"adaptation](https://docs.vllm.ai/en/stable/contributing/model/basic.html) "
"example for guidance."
msgstr ""
"请参考 vllm 的 [OPT "
"模型适配](https://docs.vllm.ai/en/stable/contributing/model/basic.html) 示例进行操作。"

#: ../../developer_guide/modeling/adding_a_new_model.md:19
msgid "**Key implementation requirements:**"
msgstr "**关键实现要求:**"

#: ../../developer_guide/modeling/adding_a_new_model.md:21
msgid "Place model files in `vllm_ascend/models/` directory."
msgstr "请将模型文件放在 `vllm_ascend/models/` 目录下。"

#: ../../developer_guide/modeling/adding_a_new_model.md:23
msgid ""
"Standard module structure for decoder-only LLMs (please checkout vllm's "
"implementations for other kinds of model):"
msgstr "仅解码器(decoder-only)LLM 的标准模块结构(其他类型的模型请参考 vllm 的相应实现):"

#: ../../developer_guide/modeling/adding_a_new_model.md:25
msgid "`*ModelForCausalLM` (top-level wrapper)"
msgstr "`*ModelForCausalLM`(顶层包装器)"

#: ../../developer_guide/modeling/adding_a_new_model.md:26
msgid "`*Model` (main architecture)"
msgstr "`*Model`(主架构)"

#: ../../developer_guide/modeling/adding_a_new_model.md:27
msgid "`*DecoderLayer` (transformer block)"
msgstr "`*DecoderLayer`(transformer 块)"

#: ../../developer_guide/modeling/adding_a_new_model.md:28
msgid "`*Attention` and `*MLP` (specific computation unit)"
msgstr "`*Attention` 和 `*MLP`(特定计算单元)"

#: ../../developer_guide/modeling/adding_a_new_model.md:31
msgid "`*` denotes your model's unique identifier."
msgstr "`*` 表示你的模型的唯一标识符。"

#: ../../developer_guide/modeling/adding_a_new_model.md:34
msgid "Critical Implementation Details:"
msgstr "关键实现细节:"

#: ../../developer_guide/modeling/adding_a_new_model.md:36
msgid "All modules must include a `prefix` argument in `__init__()`."
msgstr "所有模块在 `__init__()` 方法中都必须包含一个 `prefix` 参数。"

#: ../../developer_guide/modeling/adding_a_new_model.md:38
msgid "**Required interfaces:**"
msgstr "**必需的接口:**"

#: ../../developer_guide/modeling/adding_a_new_model.md:30
msgid "Module Type"
msgstr "模块类型"

#: ../../developer_guide/modeling/adding_a_new_model.md:30
msgid "Required Methods"
msgstr "必需的方法"

#: ../../developer_guide/modeling/adding_a_new_model.md:30
msgid "`*ModelForCausalLM`"
msgstr "`*ModelForCausalLM`"

#: ../../developer_guide/modeling/adding_a_new_model.md:30
msgid "`get_input_embeddings`, `compute_logits`, `load_weights`"
msgstr "`get_input_embeddings`,`compute_logits`,`load_weights`"

#: ../../developer_guide/modeling/adding_a_new_model.md:30
msgid "`*Model`"
msgstr "`*Model`"

#: ../../developer_guide/modeling/adding_a_new_model.md:30
msgid "`get_input_embeddings`, `load_weights`"
msgstr "`get_input_embeddings`,`load_weights`"

#: ../../developer_guide/modeling/adding_a_new_model.md:45
msgid "Attention Backend Integration:"
msgstr "注意力后端集成:"

#: ../../developer_guide/modeling/adding_a_new_model.md:47
msgid ""
"Importing attention via `from vllm.attention import Attention` can "
"automatically leverage the attention backend routing of vllm-ascend (see: "
"`get_attn_backend_cls()` in `vllm_ascend/platform.py`)."
msgstr ""
"通过 `from vllm.attention import Attention` 导入 attention 可以自动利用 vllm-ascend "
"的注意力后端路由(详见:`vllm_ascend/platform.py` 中的 `get_attn_backend_cls()`)。"

#: ../../developer_guide/modeling/adding_a_new_model.md:49
msgid "Tensor Parallelism:"
msgstr "张量并行:"

#: ../../developer_guide/modeling/adding_a_new_model.md:51
msgid ""
"Use vllm's parallel layers (`ColumnParallelLinear`, "
"`VocabParallelEmbedding`, etc.) to implement models supporting tensor "
"parallelism. Note that Ascend-specific customizations are implemented in "
"`vllm_ascend/ops/` directory (RMSNorm, VocabParallelEmbedding, etc.)."
msgstr ""
"使用 vllm 的并行层(如 `ColumnParallelLinear`、`VocabParallelEmbedding` "
"等)来实现支持张量并行的模型。需要注意的是,Ascend 特有的自定义实现(如 RMSNorm、VocabParallelEmbedding 等)位于 "
"`vllm_ascend/ops/` 目录下。"

#: ../../developer_guide/modeling/adding_a_new_model.md:53
msgid ""
"**Reference Implementation Template** (assumed path: "
"`vllm_ascend/models/custom_model.py`):"
msgstr "**参考实现模板**(假定路径:`vllm_ascend/models/custom_model.py`):"

#: ../../developer_guide/modeling/adding_a_new_model.md:135
msgid "Method 2: Customizing Existing vLLM Models"
msgstr "方法二:自定义已有的 vLLM 模型"

#: ../../developer_guide/modeling/adding_a_new_model.md:137
msgid ""
"For most use cases, extending existing implementations is preferable. We "
"demonstrate an example to inherit from base classes and implement a custom "
"deepseek model below (assumed path: `vllm_ascend/models/deepseek_v2.py`)."
msgstr ""
"对于大多数使用场景,建议扩展已有的实现。我们在下面演示了一个示例,通过继承基类并实现一个自定义的 deepseek "
"模型(假定路径:`vllm_ascend/models/deepseek_v2.py`)。"

#: ../../developer_guide/modeling/adding_a_new_model.md:175
msgid ""
"For a complete implementation reference, see: "
"`vllm_ascend/models/deepseek_v2.py`."
msgstr "完整的实现参考请见:`vllm_ascend/models/deepseek_v2.py`。"

#: ../../developer_guide/modeling/adding_a_new_model.md:178
msgid "Step 2: Registering Custom Models using ModelRegistry Plugins in vLLM"
msgstr "第2步:使用 vLLM 中的 ModelRegistry 插件注册自定义模型"

#: ../../developer_guide/modeling/adding_a_new_model.md:180
msgid ""
"vllm provides a plugin mechanism for registering externally implemented "
"models without modifying its codebase."
msgstr "vllm 提供了一种插件机制,可用于注册外部实现的模型,而无需修改其代码库。"

#: ../../developer_guide/modeling/adding_a_new_model.md:182
msgid ""
"To integrate your implemented model from `vllm_ascend/models/` directory:"
msgstr "要集成你在 `vllm_ascend/models/` 目录下实现的模型:"

#: ../../developer_guide/modeling/adding_a_new_model.md:184
msgid ""
"Import your model implementation in `vllm_ascend/models/__init__.py` using "
"relative imports."
msgstr "使用相对导入在 `vllm_ascend/models/__init__.py` 中导入你的模型实现。"

#: ../../developer_guide/modeling/adding_a_new_model.md:185
msgid ""
"Register the model wrapper class via `vllm.ModelRegistry.register_model()` "
"function."
msgstr "通过 `vllm.ModelRegistry.register_model()` 函数注册模型包装类。"

#: ../../developer_guide/modeling/adding_a_new_model.md:187
msgid ""
"**Reference Registration Template** (an example of registering new models in"
" `vllm_ascend/models/__init__.py`):"
msgstr "**参考注册模板**(在 `vllm_ascend/models/__init__.py` 注册新模型的示例):"

#: ../../developer_guide/modeling/adding_a_new_model.md:210
msgid ""
"The first argument of `vllm.ModelRegistry.register_model()` indicates the "
"unique architecture identifier which must match `architectures` in "
"`config.json` of the model."
msgstr ""
"`vllm.ModelRegistry.register_model()` 的第一个参数表示唯一的架构标识符,这个标识符必须与模型的 "
"`config.json` 文件中的 `architectures` 匹配。"

#: ../../developer_guide/modeling/adding_a_new_model.md:221
msgid "Step 3: Verification"
msgstr "第 3 步:验证"

#: ../../developer_guide/modeling/adding_a_new_model.md:223
msgid "Case 1: Overriding Existing vLLM Model Architecture"
msgstr "案例 1:重载已有的 vLLM 模型架构"

#: ../../developer_guide/modeling/adding_a_new_model.md:225
msgid ""
"If you're registering a customized model architecture based on vllm's "
"existing implementation (overriding vllm's original class), when executing "
"vllm offline/online inference (using any model), you'll observe warning logs"
" similar to the following output from "
"`vllm/models_executor/models/registry.py`."
msgstr ""
"如果你基于 vllm 的现有实现注册了一个自定义的模型架构(覆盖了 vllm 的原始类),在执行 vllm "
"的离线/在线推理(无论使用哪个模型)时,你会看到类似于 `vllm/models_executor/models/registry.py` "
"输出的警告日志。"

#: ../../developer_guide/modeling/adding_a_new_model.md:231
msgid "Case 2: Registering New Model Architecture"
msgstr "案例2:注册新模型架构"

#: ../../developer_guide/modeling/adding_a_new_model.md:233
msgid ""
"If you're registering a novel model architecture not present in vllm "
"(creating a completely new class), current logs won't provide explicit "
"confirmation by default. It's recommended to add the following logging "
"statement at the end of the `register_model` method in "
"`vllm/models_executor/models/registry.py`."
msgstr ""
"如果你注册了 vllm 中不存在的新模型架构(创建一个全新的类),当前日志默认不会提供明确的确认信息。建议在 "
"`vllm/models_executor/models/registry.py` 文件中的 `register_model` "
"方法末尾添加如下日志语句。"

#: ../../developer_guide/modeling/adding_a_new_model.md:239
msgid ""
"After adding this line, you will see confirmation logs shown below when "
"running vllm offline/online inference (using any model)."
msgstr "添加这一行之后,当你运行 vllm 离线/在线推理(使用任何模型)时,将会看到如下确认日志。"

#: ../../developer_guide/modeling/adding_a_new_model.md:245
msgid ""
"This log output confirms your novel model architecture has been successfully"
" registered in vllm."
msgstr "该日志输出确认了你的新模型架构已成功在 vllm 中注册。"

#: ../../developer_guide/modeling/adding_a_new_model.md:247
msgid "Step 4: Testing"
msgstr "第4步:测试"

#: ../../developer_guide/modeling/adding_a_new_model.md:249
msgid ""
"After adding a new model, we should do basic functional test (offline/online"
" inference), accuracy test and performance benchmark for the model."
msgstr "在添加新模型后,我们应对该模型进行基本功能测试(离线/在线推理)、准确率测试和性能基准测试。"

#: ../../developer_guide/modeling/adding_a_new_model.md:251
msgid "Find more details at:"
msgstr "更多详情请见:"

#: ../../developer_guide/modeling/adding_a_new_model.md:253
msgid ""
"[Accuracy test guide](https://vllm-"
"ascend.readthedocs.io/en/latest/developer_guide/evaluation/index.html)"
msgstr ""
"[精度测试指南](https://vllm-"
"ascend.readthedocs.io/en/latest/developer_guide/evaluation/index.html)"

#: ../../developer_guide/modeling/adding_a_new_model.md:254
msgid ""
"[Performance benchmark guide](https://vllm-"
"ascend.readthedocs.io/en/latest/developer_guide/performance/performance_benchmark.html)"
msgstr ""
"[性能基准指南](https://vllm-"
"ascend.readthedocs.io/en/latest/developer_guide/performance/performance_benchmark.html)"

#: ../../developer_guide/modeling/adding_a_new_model.md:256
msgid "Step 5: Updating Supported Models Doc"
msgstr "第5步:更新支持的模型文档"

#: ../../developer_guide/modeling/adding_a_new_model.md:258
msgid ""
"At last, if all the steps above are completed, you should add the new model "
"into our [Supported Models](https://vllm-"
"ascend.readthedocs.io/en/latest/user_guide/supported_models.html) doc."
msgstr ""
"最后,如果以上所有步骤都已完成,你应该将新模型添加到我们的[支持的模型](https://vllm-"
"ascend.readthedocs.io/en/latest/user_guide/supported_models.html)文档中。"
@@ -0,0 +1,29 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-ascend team
# This file is distributed under the same license as the vllm-ascend
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-ascend\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-07-18 09:01+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Language: zh_CN\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"Generated-By: Babel 2.17.0\n"

#: ../../developer_guide/modeling/adding_a_new_multimodal_model.md:1
msgid "Adding a New Multi-Modal Model"
msgstr "添加新的多模态模型"

#: ../../developer_guide/modeling/adding_a_new_multimodal_model.md:3
msgid "**_Comming soon ..._**"
msgstr "**_敬请期待 ..._**"
@@ -0,0 +1,32 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-ascend team
# This file is distributed under the same license as the vllm-ascend
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-ascend\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-07-18 09:01+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Language: zh_CN\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"Generated-By: Babel 2.17.0\n"

#: ../../developer_guide/modeling/index.md:1
#: ../../developer_guide/modeling/index.md:5
msgid "Modeling"
msgstr "新模型"

#: ../../developer_guide/modeling/index.md:3
msgid ""
"This section provides tutorials of how to implement and register a new model"
" into vllm-ascend."
msgstr "本节提供了如何在 vllm-ascend 中实现并注册新模型的教程。"
@@ -0,0 +1,26 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-ascend team
# This file is distributed under the same license as the vllm-ascend
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-ascend\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-07-18 09:01+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Language: zh_CN\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"Generated-By: Babel 2.17.0\n"

#: ../../developer_guide/performance/index.md:1
#: ../../developer_guide/performance/index.md:3
msgid "Performance"
msgstr "性能"
@@ -0,0 +1,88 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-ascend team
# This file is distributed under the same license as the vllm-ascend
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-ascend\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-07-18 09:01+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Language: zh_CN\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"Generated-By: Babel 2.17.0\n"

#: ../../developer_guide/performance/performance_benchmark.md:1
msgid "Performance Benchmark"
msgstr "性能基准"

#: ../../developer_guide/performance/performance_benchmark.md:2
msgid ""
"This document details the benchmark methodology for vllm-ascend, aimed at "
"evaluating the performance under a variety of workloads. To maintain "
"alignment with vLLM, we use the [benchmark](https://github.com/vllm-"
"project/vllm/tree/main/benchmarks) script provided by the vllm project."
msgstr ""
"本文档详细说明了 vllm-ascend 的基准测试方法,旨在评估其在多种工作负载下的性能。为了与 vLLM 保持一致,我们使用 vllm 项目提供的 "
"[benchmark](https://github.com/vllm-project/vllm/tree/main/benchmarks) 脚本。"

#: ../../developer_guide/performance/performance_benchmark.md:4
msgid ""
"**Benchmark Coverage**: We measure offline e2e latency and throughput, and "
"fixed-QPS online serving benchmarks, for more details see [vllm-ascend "
"benchmark scripts](https://github.com/vllm-project/vllm-"
"ascend/tree/main/benchmarks)."
msgstr ""
"**基准测试覆盖范围**:我们测量离线端到端延迟和吞吐量,以及固定 QPS 的在线服务基准测试。更多详情请参见 [vllm-ascend "
"基准测试脚本](https://github.com/vllm-project/vllm-ascend/tree/main/benchmarks)。"

#: ../../developer_guide/performance/performance_benchmark.md:6
msgid "1. Run docker container"
msgstr "1. 运行 docker 容器"

#: ../../developer_guide/performance/performance_benchmark.md:31
msgid "2. Install dependencies"
msgstr "2. 安装依赖项"

#: ../../developer_guide/performance/performance_benchmark.md:38
msgid "3. (Optional)Prepare model weights"
msgstr "3.(可选)准备模型权重"

#: ../../developer_guide/performance/performance_benchmark.md:39
msgid ""
"For faster running speed, we recommend downloading the model in advance:"
msgstr "为了更快的运行速度,建议提前下载模型:"

#: ../../developer_guide/performance/performance_benchmark.md:44
msgid ""
"You can also replace all model paths in the [json](https://github.com/vllm-"
"project/vllm-ascend/tree/main/benchmarks/tests) files with your local paths:"
msgstr ""
"你也可以将 [json](https://github.com/vllm-project/vllm-"
"ascend/tree/main/benchmarks/tests) 文件中的所有模型路径替换为你的本地路径:"

#: ../../developer_guide/performance/performance_benchmark.md:60
msgid "4. Run benchmark script"
msgstr "4. 运行基准测试脚本"

#: ../../developer_guide/performance/performance_benchmark.md:61
msgid "Run benchmark script:"
msgstr "运行基准测试脚本:"

#: ../../developer_guide/performance/performance_benchmark.md:66
msgid "After about 10 mins, the output is as shown below:"
msgstr "大约 10 分钟后,输出如下所示:"

#: ../../developer_guide/performance/performance_benchmark.md:176
msgid ""
"The result json files are generated into the path `benchmark/results` These "
"files contain detailed benchmarking results for further analysis."
msgstr "结果 json 文件会生成到路径 `benchmark/results`。这些文件包含了用于进一步分析的详细基准测试结果。"
@@ -0,0 +1,81 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-ascend team
# This file is distributed under the same license as the vllm-ascend
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-ascend\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-07-18 09:01+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Language: zh_CN\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"Generated-By: Babel 2.17.0\n"

#: ../../developer_guide/performance/profile_execute_duration.md:1
msgid "Profile Execute Duration"
msgstr "执行时长分析"

#: ../../developer_guide/performance/profile_execute_duration.md:3
msgid ""
"The execution duration of each stage (including pre/post-processing, model "
"forward, etc.) usually needs to be captured during a complete inference "
"process. Typically, this is done by using `torch.npu.synchronize()` and "
"obtaining CPU timestamps, which increases the performance overhead of "
"host/device synchronization."
msgstr ""
"在完整的推理过程中,通常需要记录每个阶段(包括前/后处理、模型前向等)的执行时长。一般通过使用 `torch.npu.synchronize()` "
"并获取 CPU 时间戳来实现,这会增加主机/设备同步的性能开销。"

#: ../../developer_guide/performance/profile_execute_duration.md:5
msgid ""
"**To reduce the performance overhead, we add this feature, using the NPU "
"event timestamp mechanism to observe the device execution time "
"asynchronously.**"
msgstr "**为了减少性能开销,我们添加了此功能,使用 NPU 事件时间戳机制异步观测设备的执行时间。**"

#: ../../developer_guide/performance/profile_execute_duration.md:7
msgid "Usage"
msgstr "用法"

#: ../../developer_guide/performance/profile_execute_duration.md:8
msgid ""
"Use the environment variable `VLLM_ASCEND_MODEL_EXECUTE_TIME_OBSERVE` to "
"enable this feature."
msgstr "使用环境变量 `VLLM_ASCEND_MODEL_EXECUTE_TIME_OBSERVE` 来启用此功能。"

#: ../../developer_guide/performance/profile_execute_duration.md:9
msgid ""
"Use the non-blocking API `ProfileExecuteDuration().capture_async` to set "
"observation points asynchronously when you need to observe the execution "
"duration."
msgstr ""
"当你需要观察执行时长时,可以使用非阻塞 API `ProfileExecuteDuration().capture_async` 异步设置观察点。"

#: ../../developer_guide/performance/profile_execute_duration.md:10
msgid ""
"Use the blocking API `ProfileExecuteDuration().pop_captured_sync` at an "
"appropriate time to get and print the execution durations of all observed "
"stages."
msgstr ""
"在适当的时机使用阻塞式 API `ProfileExecuteDuration().pop_captured_sync` "
"获取并打印所有已观察到阶段的执行时长。"

#: ../../developer_guide/performance/profile_execute_duration.md:12
msgid ""
"**We have instrumented the key inference stages (including pre-processing, "
"model forward pass, etc.) for execute duration profiling. Execute the script"
" as follows:**"
msgstr "**我们已经对关键的推理阶段(包括预处理、模型前向传递等)进行了执行时长分析的检测。请按如下方式执行脚本:**"

#: ../../developer_guide/performance/profile_execute_duration.md:17
msgid "Example Output"
msgstr "示例输出"
479
docs/source/locale/zh_CN/LC_MESSAGES/faqs.po
Normal file
479
docs/source/locale/zh_CN/LC_MESSAGES/faqs.po
Normal file
@@ -0,0 +1,479 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-ascend team
# This file is distributed under the same license as the vllm-ascend
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-ascend\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-07-18 09:01+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Language: zh_CN\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"Generated-By: Babel 2.17.0\n"

#: ../../faqs.md:1
msgid "FAQs"
msgstr "常见问题"

#: ../../faqs.md:3
msgid "Version Specific FAQs"
msgstr "特定版本常见问题"

#: ../../faqs.md:5
msgid ""
"[[v0.7.3.post1] FAQ & Feedback](https://github.com/vllm-project/vllm-"
"ascend/issues/1007)"
msgstr ""
"[[v0.7.3.post1] 常见问题与反馈](https://github.com/vllm-project/vllm-"
"ascend/issues/1007)"

#: ../../faqs.md:6
msgid ""
"[[v0.9.2rc1] FAQ & Feedback](https://github.com/vllm-project/vllm-"
"ascend/issues/1742)"
msgstr ""
"[[v0.9.2rc1] 常见问题与反馈](https://github.com/vllm-project/vllm-"
"ascend/issues/1742)"

#: ../../faqs.md:8
msgid "General FAQs"
msgstr "常见问题解答"

#: ../../faqs.md:10
msgid "1. What devices are currently supported?"
msgstr "1. 目前支持哪些设备?"

#: ../../faqs.md:12
msgid ""
"Currently, **ONLY** Atlas A2 series(Ascend-cann-kernels-910b) and Atlas "
"300I(Ascend-cann-kernels-310p) series are supported:"
msgstr ""
"目前,**仅**支持 Atlas A2 系列(Ascend-cann-kernels-910b)和 Atlas 300I(Ascend-cann-"
"kernels-310p)系列:"

#: ../../faqs.md:14
msgid ""
"Atlas A2 Training series (Atlas 800T A2, Atlas 900 A2 PoD, Atlas 200T A2 "
"Box16, Atlas 300T A2)"
msgstr ""
"Atlas A2 训练系列(Atlas 800T A2,Atlas 900 A2 PoD,Atlas 200T A2 Box16,Atlas 300T "
"A2)"

#: ../../faqs.md:15
msgid "Atlas 800I A2 Inference series (Atlas 800I A2)"
msgstr "Atlas 800I A2 推理系列(Atlas 800I A2)"

#: ../../faqs.md:16
msgid "Atlas 300I Inference series (Atlas 300I Duo)"
msgstr "Atlas 300I 推理系列(Atlas 300I Duo)"

#: ../../faqs.md:18
msgid "Below series are NOT supported yet:"
msgstr "以下系列目前尚不受支持:"

#: ../../faqs.md:19
msgid "Atlas 200I A2 (Ascend-cann-kernels-310b) unplanned yet"
msgstr "Atlas 200I A2(Ascend-cann-kernels-310b)尚未计划"

#: ../../faqs.md:20
msgid "Ascend 910, Ascend 910 Pro B (Ascend-cann-kernels-910) unplanned yet"
msgstr "Ascend 910,Ascend 910 Pro B(Ascend-cann-kernels-910)尚未计划"

#: ../../faqs.md:22
msgid ""
"From a technical view, vllm-ascend support would be possible if the torch-"
"npu is supported. Otherwise, we have to implement it by using custom ops. We"
" are also welcome to join us to improve together."
msgstr ""
"从技术角度来看,如果支持 torch-npu,则可以支持 vllm-ascend。否则,我们需要通过自定义算子来实现。我们也欢迎大家一起加入,共同改进。"

#: ../../faqs.md:24
msgid "2. How to get our docker containers?"
msgstr "2. 如何获取我们的 docker 容器?"

#: ../../faqs.md:26
msgid ""
"You can get our containers at `Quay.io`, e.g., [<u>vllm-"
"ascend</u>](https://quay.io/repository/ascend/vllm-ascend?tab=tags) and "
"[<u>cann</u>](https://quay.io/repository/ascend/cann?tab=tags)."
msgstr ""
"你可以在 `Quay.io` 获取我们的容器,例如,[<u>vllm-"
"ascend</u>](https://quay.io/repository/ascend/vllm-ascend?tab=tags) 和 "
"[<u>cann</u>](https://quay.io/repository/ascend/cann?tab=tags)。"

#: ../../faqs.md:28
msgid ""
"If you are in China, you can use `daocloud` to accelerate your downloading:"
msgstr "如果你在中国,可以使用 `daocloud` 来加速下载:"

#: ../../faqs.md:36
msgid "3. What models does vllm-ascend supports?"
msgstr "3. vllm-ascend 支持哪些模型?"

#: ../../faqs.md:38
msgid ""
"Find more details [<u>here</u>](https://vllm-"
"ascend.readthedocs.io/en/latest/user_guide/support_matrix/supported_models.html)."
msgstr ""
"在[<u>此处</u>](https://vllm-"
"ascend.readthedocs.io/en/latest/user_guide/support_matrix/supported_models.html)查看更多详细信息。"

#: ../../faqs.md:40
msgid "4. How to get in touch with our community?"
msgstr "4. 如何与我们的社区取得联系?"

#: ../../faqs.md:42
msgid ""
"There are many channels that you can communicate with our community "
"developers / users:"
msgstr "你可以通过多种渠道与我们的社区开发者/用户进行交流:"

#: ../../faqs.md:44
msgid ""
"Submit a GitHub [<u>issue</u>](https://github.com/vllm-project/vllm-"
"ascend/issues?page=1)."
msgstr ""
"提交一个 GitHub [<u>issue</u>](https://github.com/vllm-project/vllm-"
"ascend/issues?page=1)。"

#: ../../faqs.md:45
msgid ""
"Join our [<u>weekly "
"meeting</u>](https://docs.google.com/document/d/1hCSzRTMZhIB8vRq1_qOOjx4c9uYUxvdQvDsMV2JcSrw/edit?tab=t.0#heading=h.911qu8j8h35z)"
" and share your ideas."
msgstr ""
"加入我们的[<u>每周会议</u>](https://docs.google.com/document/d/1hCSzRTMZhIB8vRq1_qOOjx4c9uYUxvdQvDsMV2JcSrw/edit?tab=t.0#heading=h.911qu8j8h35z),并分享你的想法。"

#: ../../faqs.md:46
msgid ""
"Join our [<u>WeChat</u>](https://github.com/vllm-project/vllm-"
"ascend/issues/227) group and ask your quenstions."
msgstr ""
"加入我们的 [<u>微信群</u>](https://github.com/vllm-project/vllm-ascend/issues/227) "
"并提问你的问题。"

#: ../../faqs.md:47
msgid ""
"Join our ascend channel in [<u>vLLM "
"forums</u>](https://discuss.vllm.ai/c/hardware-support/vllm-ascend-"
"support/6) and publish your topics."
msgstr ""
"加入我们在 [<u>vLLM 论坛</u>](https://discuss.vllm.ai/c/hardware-support/vllm-"
"ascend-support/6) 的 ascend 频道并发布你的话题。"

#: ../../faqs.md:49
msgid "5. What features does vllm-ascend V1 supports?"
msgstr "5. vllm-ascend V1 支持哪些功能?"

#: ../../faqs.md:51
msgid ""
"Find more details [<u>here</u>](https://vllm-
|
||||
"ascend.readthedocs.io/en/latest/user_guide/support_matrix/supported_features.html)."
|
||||
msgstr ""
|
||||
"在[<u>这里</u>](https://vllm-"
|
||||
"ascend.readthedocs.io/en/latest/user_guide/support_matrix/supported_features.html)找到更多详细信息。"
|
||||
|
||||
#: ../../faqs.md:53
|
||||
msgid ""
|
||||
"6. How to solve the problem of \"Failed to infer device type\" or "
|
||||
"\"libatb.so: cannot open shared object file\"?"
|
||||
msgstr "6. 如何解决“无法推断设备类型”或“libatb.so:无法打开共享对象文件”问题?"
|
||||
|
||||
#: ../../faqs.md:55
|
||||
msgid ""
|
||||
"Basically, the reason is that the NPU environment is not configured "
|
||||
"correctly. You can:"
|
||||
msgstr "基本上,原因是 NPU 环境没有正确配置。你可以:"
|
||||
|
||||
#: ../../faqs.md:56
|
||||
msgid ""
|
||||
"try `source /usr/local/Ascend/nnal/atb/set_env.sh` to enable NNAL package."
|
||||
msgstr "尝试运行 `source /usr/local/Ascend/nnal/atb/set_env.sh` 以启用 NNAL 包。"
|
||||
|
||||
#: ../../faqs.md:57
|
||||
msgid ""
|
||||
"try `source /usr/local/Ascend/ascend-toolkit/set_env.sh` to enable CANN "
|
||||
"package."
|
||||
msgstr "尝试运行 `source /usr/local/Ascend/ascend-toolkit/set_env.sh` 以启用 CANN 包。"
|
||||
|
||||
#: ../../faqs.md:58
|
||||
msgid "try `npu-smi info` to check whether the NPU is working."
|
||||
msgstr "尝试运行 `npu-smi info` 来检查 NPU 是否正常工作。"
|
||||
|
||||
#: ../../faqs.md:60
|
||||
msgid ""
|
||||
"If all above steps are not working, you can try the following code with "
|
||||
"python to check whether there is any error:"
|
||||
msgstr "如果以上所有步骤都无效,你可以尝试使用以下 python 代码来检查是否有错误:"
|
||||
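A minimal sketch of such a check (the exact snippet in faqs.md may differ); it assumes `torch` and `torch_npu` are installed:

    import torch
    import torch_npu  # noqa: F401  (importing registers the NPU backend with torch)

    # If the NPU environment is healthy, this prints a 2x3 tensor without errors.
    x = torch.rand(2, 3).npu()
    print(x * 2)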
|
||||
#: ../../faqs.md:68
|
||||
msgid ""
|
||||
"If all above steps are not working, feel free to submit a GitHub issue."
|
||||
msgstr "如果以上所有步骤都无法解决问题,欢迎提交一个 GitHub issue。"
|
||||
|
||||
#: ../../faqs.md:70
|
||||
msgid "7. How does vllm-ascend perform?"
|
||||
msgstr "7. vllm-ascend 的性能如何?"
|
||||
|
||||
#: ../../faqs.md:72
|
||||
msgid ""
|
||||
"Currently, only some models are improved. Such as `Qwen2.5 VL`, `Qwen3`, "
|
||||
"`Deepseek V3`. Others are not good enough. From 0.9.0rc2, Qwen and Deepseek"
|
||||
" works with graph mode to play a good performance. What's more, you can "
|
||||
"install `mindie-turbo` with `vllm-ascend v0.7.3` to speed up the inference "
|
||||
"as well."
|
||||
msgstr ""
|
||||
"目前,只有部分模型得到了改进,比如 `Qwen2.5 VL`、`Qwen3` 和 `Deepseek V3`。其他模型的效果还不够理想。从 "
|
||||
"0.9.0rc2 开始,Qwen 和 Deepseek 已经支持图模式,以获得更好的性能。此外,你还可以在 `vllm-ascend v0.7.3` "
|
||||
"上安装 `mindie-turbo`,进一步加速推理。"
|
||||
|
||||
#: ../../faqs.md:74
|
||||
msgid "8. How vllm-ascend work with vllm?"
|
||||
msgstr "8. vllm-ascend 如何与 vllm 协同工作?"
|
||||
|
||||
#: ../../faqs.md:75
|
||||
msgid ""
|
||||
"vllm-ascend is a plugin for vllm. Basically, the version of vllm-ascend is "
|
||||
"the same as the version of vllm. For example, if you use vllm 0.7.3, you "
|
||||
"should use vllm-ascend 0.7.3 as well. For main branch, we will make sure "
|
||||
"`vllm-ascend` and `vllm` are compatible by each commit."
|
||||
msgstr ""
|
||||
"vllm-ascend 是 vllm 的一个插件。基本上,vllm-ascend 的版本与 vllm 的版本是相同的。例如,如果你使用 vllm "
|
||||
"0.7.3,你也应该使用 vllm-ascend 0.7.3。对于主分支,我们会确保每次提交都让 `vllm-ascend` 和 `vllm` "
|
||||
"保持兼容。"
|
||||
|
||||
#: ../../faqs.md:77
|
||||
msgid "9. Does vllm-ascend support Prefill Disaggregation feature?"
|
||||
msgstr "9. vllm-ascend 支持 Prefill Disaggregation 功能吗?"
|
||||
|
||||
#: ../../faqs.md:79
|
||||
msgid ""
|
||||
"Currently, only 1P1D is supported on V0 Engine. For V1 Engine or NPND "
|
||||
"support, We will make it stable and supported by vllm-ascend in the future."
|
||||
msgstr "目前,V0引擎只支持1P1D。对于V1引擎或NPND的支持,我们将在未来使其稳定并由vllm-ascend支持。"
|
||||
|
||||
#: ../../faqs.md:81
|
||||
msgid "10. Does vllm-ascend support quantization method?"
|
||||
msgstr "10. vllm-ascend 支持量化方法吗?"
|
||||
|
||||
#: ../../faqs.md:83
|
||||
msgid ""
|
||||
"Currently, w8a8 quantization is already supported by vllm-ascend originally "
|
||||
"on v0.8.4rc2 or higher, If you're using vllm 0.7.3 version, w8a8 "
|
||||
"quantization is supporeted with the integration of vllm-ascend and mindie-"
|
||||
"turbo, please use `pip install vllm-ascend[mindie-turbo]`."
|
||||
msgstr ""
|
||||
"目前,w8a8 量化已在 v0.8.4rc2 或更高版本的 vllm-ascend 中原生支持。如果你使用的是 vllm 0.7.3 版本,集成了 "
|
||||
"vllm-ascend 和 mindie-turbo 后也支持 w8a8 量化,请使用 `pip install vllm-ascend[mindie-"
|
||||
"turbo]`。"
|
||||
|
||||
#: ../../faqs.md:85
|
||||
msgid "11. How to run w8a8 DeepSeek model?"
|
||||
msgstr "11. 如何运行 w8a8 DeepSeek 模型?"
|
||||
|
||||
#: ../../faqs.md:87
|
||||
msgid ""
|
||||
"Please following the [inferencing tutorail](https://vllm-"
|
||||
"ascend.readthedocs.io/en/latest/tutorials/multi_node.html) and replace model"
|
||||
" to DeepSeek."
|
||||
msgstr ""
|
||||
"请按照[inferencing 教程](https://vllm-"
|
||||
"ascend.readthedocs.io/en/latest/tutorials/multi_node.html)进行操作,并将模型更换为 "
|
||||
"DeepSeek。"
|
||||
|
||||
#: ../../faqs.md:89
|
||||
msgid ""
|
||||
"12. There is no output in log when loading models using vllm-ascend, How to "
|
||||
"solve it?"
|
||||
msgstr "12. 使用 vllm-ascend 加载模型时日志没有输出,如何解决?"
|
||||
|
||||
#: ../../faqs.md:91
|
||||
msgid ""
|
||||
"If you're using vllm 0.7.3 version, this is a known progress bar display "
|
||||
"issue in VLLM, which has been resolved in [this PR](https://github.com/vllm-"
|
||||
"project/vllm/pull/12428), please cherry-pick it locally by yourself. "
|
||||
"Otherwise, please fill up an issue."
|
||||
msgstr ""
|
||||
"如果你正在使用 vllm 0.7.3 版本,这是 VLLM 已知的进度条显示问题,已在 [此 PR](https://github.com/vllm-"
|
||||
"project/vllm/pull/12428) 中解决,请自行在本地进行 cherry-pick。否则,请提交一个 issue。"
|
||||
|
||||
#: ../../faqs.md:93
|
||||
msgid "13. How vllm-ascend is tested"
|
||||
msgstr "13. 如何测试 vllm-ascend"
|
||||
|
||||
#: ../../faqs.md:95
|
||||
msgid ""
|
||||
"vllm-ascend is tested by functional test, performance test and accuracy "
|
||||
"test."
|
||||
msgstr "vllm-ascend 经过功能测试、性能测试和精度测试。"
|
||||
|
||||
#: ../../faqs.md:97
|
||||
msgid ""
|
||||
"**Functional test**: we added CI, includes portion of vllm's native unit "
|
||||
"tests and vllm-ascend's own unit tests,on vllm-ascend's test, we test basic "
|
||||
"functionality、popular models availability and [supported "
|
||||
"features](https://vllm-"
|
||||
"ascend.readthedocs.io/en/latest/user_guide/support_matrix/supported_features.html)"
|
||||
" via e2e test"
|
||||
msgstr ""
|
||||
"**功能测试**:我们添加了CI,包含了vllm原生单元测试的一部分以及vllm-ascend自己的单元测试。在vllm-"
|
||||
"ascend的测试中,我们通过e2e测试验证了基本功能、主流模型可用性和[支持的特性](https://vllm-"
|
||||
"ascend.readthedocs.io/en/latest/user_guide/support_matrix/supported_features.html)。"
|
||||
|
||||
#: ../../faqs.md:99
|
||||
msgid ""
|
||||
"**Performance test**: we provide [benchmark](https://github.com/vllm-"
|
||||
"project/vllm-ascend/tree/main/benchmarks) tools for end-to-end performance "
|
||||
"benchmark which can easily to re-route locally, we'll publish a perf website"
|
||||
" to show the performance test results for each pull request"
|
||||
msgstr ""
|
||||
"**性能测试**:我们提供了用于端到端性能基准测试的[基准测试](https://github.com/vllm-project/vllm-"
|
||||
"ascend/tree/main/benchmarks)工具,可以方便地在本地重新运行。我们将发布一个性能网站,用于展示每个拉取请求的性能测试结果。"
|
||||
|
||||
#: ../../faqs.md:101
|
||||
msgid ""
|
||||
"**Accuracy test**: we're working on adding accuracy test to CI as well."
|
||||
msgstr "**准确性测试**:我们也在努力将准确性测试添加到CI中。"
|
||||
|
||||
#: ../../faqs.md:103
|
||||
msgid ""
|
||||
"Finnall, for each release, we'll publish the performance test and accuracy "
|
||||
"test report in the future."
|
||||
msgstr "最后,未来每个版本发布时,我们都会公开性能测试和准确性测试报告。"
|
||||
|
||||
#: ../../faqs.md:105
|
||||
msgid "14. How to fix the error \"InvalidVersion\" when using vllm-ascend?"
|
||||
msgstr "14. 使用 vllm-ascend 时如何解决 “InvalidVersion” 错误?"
|
||||
|
||||
#: ../../faqs.md:106
|
||||
msgid ""
|
||||
"It's usually because you have installed an dev/editable version of vLLM "
|
||||
"package. In this case, we provide the env variable `VLLM_VERSION` to let "
|
||||
"users specify the version of vLLM package to use. Please set the env "
|
||||
"variable `VLLM_VERSION` to the version of vLLM package you have installed. "
|
||||
"The format of `VLLM_VERSION` should be `X.Y.Z`."
|
||||
msgstr ""
|
||||
"这通常是因为你安装了开发版或可编辑版本的 vLLM 包。在这种情况下,我们提供了环境变量 `VLLM_VERSION`,以便用户指定要使用的 vLLM "
|
||||
"包版本。请将环境变量 `VLLM_VERSION` 设置为你已安装的 vLLM 包的版本。`VLLM_VERSION` 的格式应为 `X.Y.Z`。"
|
||||
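For example, a hedged sketch of setting the override from Python before vllm-ascend is imported (the version value is only an illustration; this is normally exported in the shell instead):

    import os

    # Must be set before importing vllm_ascend so version detection picks it up.
    os.environ["VLLM_VERSION"] = "0.7.3"  # illustrative X.Y.Z value, match your install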
|
||||
#: ../../faqs.md:108
|
||||
msgid "15. How to handle Out Of Memory?"
|
||||
msgstr "15. 如何处理内存溢出?"
|
||||
|
||||
#: ../../faqs.md:109
|
||||
msgid ""
|
||||
"OOM errors typically occur when the model exceeds the memory capacity of a "
|
||||
"single NPU. For general guidance, you can refer to [vLLM's OOM "
|
||||
"troubleshooting "
|
||||
"documentation](https://docs.vllm.ai/en/latest/getting_started/troubleshooting.html#out-"
|
||||
"of-memory)."
|
||||
msgstr ""
|
||||
"当模型超出单个 NPU 的内存容量时,通常会发生 OOM(内存溢出)错误。一般性的指导可以参考 [vLLM 的 OOM "
|
||||
"故障排除文档](https://docs.vllm.ai/en/latest/getting_started/troubleshooting.html#out-"
|
||||
"of-memory)。"
|
||||
|
||||
#: ../../faqs.md:111
|
||||
msgid ""
|
||||
"In scenarios where NPUs have limited HBM (High Bandwidth Memory) capacity, "
|
||||
"dynamic memory allocation/deallocation during inference can exacerbate "
|
||||
"memory fragmentation, leading to OOM. To address this:"
|
||||
msgstr ""
|
||||
"在 NPU 的 HBM(高带宽内存)容量有限的场景下,推理过程中动态内存分配和释放会加剧内存碎片,从而导致 OOM(内存溢出)。为了解决这个问题:"
|
||||
|
||||
#: ../../faqs.md:113
|
||||
msgid ""
|
||||
"**Adjust `--gpu-memory-utilization`**: If unspecified, will use the default "
|
||||
"value of `0.9`. You can decrease this param to reserve more memory to reduce"
|
||||
" fragmentation risks. See more note in: [vLLM - Inference and Serving - "
|
||||
"Engine "
|
||||
"Arguments](https://docs.vllm.ai/en/latest/serving/engine_args.html#vllm.engine.arg_utils-"
|
||||
"_engine_args_parser-cacheconfig)."
|
||||
msgstr ""
|
||||
"**调整 `--gpu-memory-utilization`**:如果未指定,将使用默认值 "
|
||||
"`0.9`。你可以降低此参数来预留更多内存,从而降低内存碎片风险。参见更多说明:[vLLM - 推理与服务 - "
|
||||
"引擎参数](https://docs.vllm.ai/en/latest/serving/engine_args.html#vllm.engine.arg_utils-"
|
||||
"_engine_args_parser-cacheconfig)。"
|
||||
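A minimal sketch of lowering this parameter via the offline `LLM` API (the model name is only an example):

    from vllm import LLM

    # Reserve more headroom than the 0.9 default to reduce fragmentation risk.
    llm = LLM(model="Qwen/Qwen2.5-0.5B-Instruct", gpu_memory_utilization=0.7)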
|
||||
#: ../../faqs.md:115
|
||||
msgid ""
|
||||
"**Configure `PYTORCH_NPU_ALLOC_CONF`**: Set this environment variable to "
|
||||
"optimize NPU memory management. For example, you can `export "
|
||||
"PYTORCH_NPU_ALLOC_CONF=expandable_segments:True` to enable virtual memory "
|
||||
"feature to mitigate memory fragmentation caused by frequent dynamic memory "
|
||||
"size adjustments during runtime, see more note in: "
|
||||
"[PYTORCH_NPU_ALLOC_CONF](https://www.hiascend.com/document/detail/zh/Pytorch/700/comref/Envvariables/Envir_012.html)."
|
||||
msgstr ""
|
||||
"**配置 `PYTORCH_NPU_ALLOC_CONF`**:设置此环境变量以优化NPU内存管理。例如,你可以通过 `export "
|
||||
"PYTORCH_NPU_ALLOC_CONF=expandable_segments:True` "
|
||||
"来启用虚拟内存功能,以缓解运行时频繁动态调整内存大小导致的内存碎片问题,更多说明参见:[PYTORCH_NPU_ALLOC_CONF](https://www.hiascend.com/document/detail/zh/Pytorch/700/comref/Envvariables/Envir_012.html)。"
|
||||
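A short sketch of enabling this from Python; it must run before the first NPU allocation:

    import os

    # Enable expandable segments before torch_npu allocates any NPU memory.
    os.environ["PYTORCH_NPU_ALLOC_CONF"] = "expandable_segments:True"

    import torch
    import torch_npu  # noqa: F401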
|
||||
#: ../../faqs.md:117
|
||||
msgid "16. Failed to enable NPU graph mode when running DeepSeek?"
|
||||
msgstr "16. 运行 DeepSeek 时无法启用 NPU 图模式?"
|
||||
|
||||
#: ../../faqs.md:118
|
||||
#, python-brace-format
|
||||
msgid ""
|
||||
"You may encounter the following error if running DeepSeek with NPU graph "
|
||||
"mode enabled. The allowed number of queries per kv when enabling both MLA "
|
||||
"and Graph mode only support {32, 64, 128}, **Thus this is not supported for "
|
||||
"DeepSeek-V2-Lite**, as it only has 16 attention heads. The NPU graph mode "
|
||||
"support on DeepSeek-V2-Lite will be done in the future."
|
||||
msgstr ""
|
||||
"如果在启用NPU图模式(Graph "
|
||||
"mode)运行DeepSeek时,您可能会遇到以下错误。当同时启用MLA和图模式时,每个kv允许的查询数只支持{32, 64, "
|
||||
"128},**因此这不支持DeepSeek-V2-Lite**,因为它只有16个注意力头。未来会增加对DeepSeek-V2-Lite在NPU图模式下的支持。"
|
||||
|
||||
#: ../../faqs.md:120
|
||||
#, python-brace-format
|
||||
msgid ""
|
||||
"And if you're using DeepSeek-V3 or DeepSeek-R1, please make sure after the "
|
||||
"tensor parallel split, num_heads / num_kv_heads in {32, 64, 128}."
|
||||
msgstr ""
|
||||
"如果你正在使用 DeepSeek-V3 或 DeepSeek-R1,请确保在张量并行切分后,num_heads / num_kv_heads 的值为 "
|
||||
"{32, 64, 128} 中的一个。"
|
||||
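A hedged arithmetic sketch of this constraint (the head counts are assumed values for illustration only):

    # e.g. 128 query heads split across tensor-parallel size 4, 1 KV head per rank
    num_heads = 128 // 4
    num_kv_heads = 1
    assert num_heads // num_kv_heads in {32, 64, 128}, "ratio unsupported in graph mode"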
|
||||
#: ../../faqs.md:127
|
||||
msgid ""
|
||||
"17. Failed to reinstall vllm-ascend from source after uninstalling vllm-"
|
||||
"ascend?"
|
||||
msgstr "17. 卸载 vllm-ascend 后无法从源码重新安装 vllm-ascend?"
|
||||
|
||||
#: ../../faqs.md:128
|
||||
msgid ""
|
||||
"You may encounter the problem of C compilation failure when reinstalling "
|
||||
"vllm-ascend from source using pip. If the installation fails, it is "
|
||||
"recommended to use `python setup.py install` to install, or use `python "
|
||||
"setup.py clean` to clear the cache."
|
||||
msgstr ""
|
||||
"当你使用 pip 从源码重新安装 vllm-ascend 时,可能会遇到 C 编译失败的问题。如果安装失败,建议使用 `python setup.py "
|
||||
"install` 进行安装,或者使用 `python setup.py clean` 清除缓存。"
|
||||
|
||||
#: ../../faqs.md:130
|
||||
msgid "18. How to generate determinitic results when using vllm-ascend?"
|
||||
msgstr "18. 使用 vllm-ascend 时如何生成确定性结果?"
|
||||
|
||||
#: ../../faqs.md:131
|
||||
msgid "There are several factors that affect output certainty:"
|
||||
msgstr "有几个因素会影响输出的确定性:"
|
||||
|
||||
#: ../../faqs.md:133
|
||||
msgid ""
|
||||
"Sampler Method: using **Greedy sample** by setting `temperature=0` in "
|
||||
"`SamplingParams`, e.g.:"
|
||||
msgstr ""
|
||||
"采样方法:通过在 `SamplingParams` 中设置 `temperature=0` 来使用 **贪婪采样(Greedy "
|
||||
"sample)**,例如:"
|
||||
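A minimal sketch of greedy sampling via the offline API (the model name is assumed for illustration):

    from vllm import LLM, SamplingParams

    params = SamplingParams(temperature=0)  # greedy decoding: always pick the top token
    llm = LLM(model="Qwen/Qwen2.5-0.5B-Instruct")
    print(llm.generate(["Hello, my name is"], params)[0].outputs[0].text)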
|
||||
#: ../../faqs.md:158
|
||||
msgid "Set the following enveriments parameters:"
|
||||
msgstr "设置以下环境参数:"
|
||||
79
docs/source/locale/zh_CN/LC_MESSAGES/index.po
Normal file
@@ -0,0 +1,79 @@
|
||||
# SOME DESCRIPTIVE TITLE.
|
||||
# Copyright (C) 2025, vllm-ascend team
|
||||
# This file is distributed under the same license as the vllm-ascend
|
||||
# package.
|
||||
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
|
||||
#
|
||||
#, fuzzy
|
||||
msgid ""
|
||||
msgstr ""
|
||||
"Project-Id-Version: vllm-ascend\n"
|
||||
"Report-Msgid-Bugs-To: \n"
|
||||
"POT-Creation-Date: 2025-07-18 09:01+0800\n"
|
||||
"PO-Revision-Date: 2025-07-18 10:05+0800\n"
|
||||
"Last-Translator: \n"
|
||||
"Language-Team: zh_CN <LL@li.org>\n"
|
||||
"Language: zh_CN\n"
|
||||
"MIME-Version: 1.0\n"
|
||||
"Content-Type: text/plain; charset=utf-8\n"
|
||||
"Content-Transfer-Encoding: 8bit\n"
|
||||
"Plural-Forms: nplurals=1; plural=0;\n"
|
||||
"Generated-By: Babel 2.17.0\n"
|
||||
"X-Generator: Poedit 3.5\n"
|
||||
|
||||
#: ../../index.md:33
|
||||
msgid "Getting Started"
|
||||
msgstr "快速开始"
|
||||
|
||||
#: ../../index.md:43
|
||||
msgid "User Guide"
|
||||
msgstr "用户指南"
|
||||
|
||||
#: ../../index.md:53
|
||||
msgid "Developer Guide"
|
||||
msgstr "开发者指南"
|
||||
|
||||
#: ../../index.md:64
|
||||
msgid "Community"
|
||||
msgstr "社区"
|
||||
|
||||
#: ../../index.md:1
|
||||
msgid "Welcome to vLLM Ascend Plugin"
|
||||
msgstr "欢迎使用 vLLM Ascend 插件"
|
||||
|
||||
#: ../../index.md:3
|
||||
msgid "vLLM"
|
||||
msgstr "vLLM"
|
||||
|
||||
#: ../../index.md:24
|
||||
msgid ""
|
||||
"vLLM Ascend plugin (vllm-ascend) is a community maintained hardware plugin "
|
||||
"for running vLLM on the Ascend NPU."
|
||||
msgstr ""
|
||||
"vLLM Ascend 插件(vllm-ascend)是一个由社区维护的硬件插件,用于在 Ascend "
|
||||
"NPU 上运行 vLLM。"
|
||||
|
||||
#: ../../index.md:26
|
||||
msgid ""
|
||||
"This plugin is the recommended approach for supporting the Ascend backend "
|
||||
"within the vLLM community. It adheres to the principles outlined in the "
|
||||
"[[RFC]: Hardware pluggable](https://github.com/vllm-project/vllm/"
|
||||
"issues/11162), providing a hardware-pluggable interface that decouples the "
|
||||
"integration of the Ascend NPU with vLLM."
|
||||
msgstr ""
|
||||
"该插件是 vLLM 社区推荐用于支持 Ascend 后端的方法。它遵循 [[RFC]: Hardware "
|
||||
"pluggable](https://github.com/vllm-project/vllm/issues/11162) 中提出的原"
|
||||
"则,提供了一个硬件可插拔接口,实现了 Ascend NPU 与 vLLM 集成的解耦。"
|
||||
|
||||
#: ../../index.md:28
|
||||
msgid ""
|
||||
"By using vLLM Ascend plugin, popular open-source models, including "
|
||||
"Transformer-like, Mixture-of-Expert, Embedding, Multi-modal LLMs can run "
|
||||
"seamlessly on the Ascend NPU."
|
||||
msgstr ""
|
||||
"通过使用 vLLM Ascend 插件,流行的开源模型,包括 Transformer 类、混合专家、"
|
||||
"嵌入式、多模态大模型等,都可以在 Ascend NPU 上无缝运行。"
|
||||
|
||||
#: ../../index.md:30
|
||||
msgid "Documentation"
|
||||
msgstr "文档"
|
||||
293
docs/source/locale/zh_CN/LC_MESSAGES/installation.po
Normal file
@@ -0,0 +1,293 @@
|
||||
# SOME DESCRIPTIVE TITLE.
|
||||
# Copyright (C) 2025, vllm-ascend team
|
||||
# This file is distributed under the same license as the vllm-ascend
|
||||
# package.
|
||||
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
|
||||
#
|
||||
#, fuzzy
|
||||
msgid ""
|
||||
msgstr ""
|
||||
"Project-Id-Version: vllm-ascend\n"
|
||||
"Report-Msgid-Bugs-To: \n"
|
||||
"POT-Creation-Date: 2025-07-18 09:01+0800\n"
|
||||
"PO-Revision-Date: 2025-07-18 10:09+0800\n"
|
||||
"Last-Translator: \n"
|
||||
"Language-Team: zh_CN <LL@li.org>\n"
|
||||
"Language: zh_CN\n"
|
||||
"MIME-Version: 1.0\n"
|
||||
"Content-Type: text/plain; charset=utf-8\n"
|
||||
"Content-Transfer-Encoding: 8bit\n"
|
||||
"Plural-Forms: nplurals=1; plural=0;\n"
|
||||
"Generated-By: Babel 2.17.0\n"
|
||||
"X-Generator: Poedit 3.5\n"
|
||||
|
||||
#: ../../installation.md:1
|
||||
msgid "Installation"
|
||||
msgstr "安装"
|
||||
|
||||
#: ../../installation.md:3
|
||||
msgid "This document describes how to install vllm-ascend manually."
|
||||
msgstr "本文档介绍如何手动安装 vllm-ascend。"
|
||||
|
||||
#: ../../installation.md:5
|
||||
msgid "Requirements"
|
||||
msgstr "要求"
|
||||
|
||||
#: ../../installation.md:7
|
||||
msgid "OS: Linux"
|
||||
msgstr "操作系统:Linux"
|
||||
|
||||
#: ../../installation.md:8
|
||||
msgid "Python: >= 3.9, < 3.12"
|
||||
msgstr "Python:>= 3.9,< 3.12"
|
||||
|
||||
#: ../../installation.md:9
|
||||
msgid "A hardware with Ascend NPU. It's usually the Atlas 800 A2 series."
|
||||
msgstr "配备有昇腾NPU的硬件,通常是Atlas 800 A2系列。"
|
||||
|
||||
#: ../../installation.md:10
|
||||
msgid "Software:"
|
||||
msgstr "软件:"
|
||||
|
||||
#: ../../installation.md
|
||||
msgid "Software"
|
||||
msgstr "软件"
|
||||
|
||||
#: ../../installation.md
|
||||
msgid "Supported version"
|
||||
msgstr "支持的版本"
|
||||
|
||||
#: ../../installation.md
|
||||
msgid "Note"
|
||||
msgstr "注释"
|
||||
|
||||
#: ../../installation.md
|
||||
msgid "CANN"
|
||||
msgstr "CANN"
|
||||
|
||||
#: ../../installation.md
|
||||
msgid ">= 8.1.RC1"
|
||||
msgstr ">= 8.1.RC1"
|
||||
|
||||
#: ../../installation.md
|
||||
msgid "Required for vllm-ascend and torch-npu"
|
||||
msgstr "vllm-ascend 和 torch-npu 必需"
|
||||
|
||||
#: ../../installation.md
|
||||
msgid "torch-npu"
|
||||
msgstr "torch-npu"
|
||||
|
||||
#: ../../installation.md
|
||||
msgid ">= 2.5.1.post1.dev20250619"
|
||||
msgstr ">= 2.5.1.post1.dev20250619"
|
||||
|
||||
#: ../../installation.md
|
||||
msgid ""
|
||||
"Required for vllm-ascend, No need to install manually, it will be auto "
|
||||
"installed in below steps"
|
||||
msgstr "vllm-ascend 必需,无需手动安装,后续步骤会自动安装。"
|
||||
|
||||
#: ../../installation.md
|
||||
msgid "torch"
|
||||
msgstr "torch"
|
||||
|
||||
#: ../../installation.md
|
||||
msgid ">= 2.5.1"
|
||||
msgstr ">= 2.5.1"
|
||||
|
||||
#: ../../installation.md
|
||||
msgid "Required for torch-npu and vllm"
|
||||
msgstr "torch-npu 和 vllm 所需"
|
||||
|
||||
#: ../../installation.md:18
|
||||
msgid "You have 2 way to install:"
|
||||
msgstr "你有两种安装方式:"
|
||||
|
||||
#: ../../installation.md:19
|
||||
msgid ""
|
||||
"**Using pip**: first prepare env manually or via CANN image, then install "
|
||||
"`vllm-ascend` using pip."
|
||||
msgstr ""
|
||||
"**使用 pip**:首先手动准备环境或通过 CANN 镜像准备环境,然后使用 pip 安装 "
|
||||
"`vllm-ascend`。"
|
||||
|
||||
#: ../../installation.md:20
|
||||
msgid ""
|
||||
"**Using docker**: use the `vllm-ascend` pre-built docker image directly."
|
||||
msgstr "**使用 docker**:直接使用 `vllm-ascend` 预构建的 docker 镜像。"
|
||||
|
||||
#: ../../installation.md:22
|
||||
msgid "Configure a new environment"
|
||||
msgstr "配置一个新环境"
|
||||
|
||||
#: ../../installation.md:24
|
||||
msgid ""
|
||||
"Before installing, you need to make sure firmware/driver and CANN are "
|
||||
"installed correctly, refer to [link](https://ascend.github.io/docs/sources/"
|
||||
"ascend/quick_install.html) for more details."
|
||||
msgstr ""
|
||||
"在安装之前,您需要确保固件/驱动和 CANN 已正确安装,更多详情请参考 [链接]"
|
||||
"(https://ascend.github.io/docs/sources/ascend/quick_install.html)。"
|
||||
|
||||
#: ../../installation.md:26
|
||||
msgid "Configure hardware environment"
|
||||
msgstr "配置硬件环境"
|
||||
|
||||
#: ../../installation.md:28
|
||||
msgid ""
|
||||
"To verify that the Ascend NPU firmware and driver were correctly installed, "
|
||||
"run:"
|
||||
msgstr "要验证 Ascend NPU 固件和驱动程序是否正确安装,请运行:"
|
||||
|
||||
#: ../../installation.md:34
|
||||
msgid ""
|
||||
"Refer to [Ascend Environment Setup Guide](https://ascend.github.io/docs/"
|
||||
"sources/ascend/quick_install.html) for more details."
|
||||
msgstr ""
|
||||
"更多详情请参考[Ascend环境搭建指南](https://ascend.github.io/docs/sources/"
|
||||
"ascend/quick_install.html)。"
|
||||
|
||||
#: ../../installation.md:36
|
||||
msgid "Configure software environment"
|
||||
msgstr "配置软件环境"
|
||||
|
||||
#: ../../installation.md
|
||||
msgid "Before using pip"
|
||||
msgstr "在使用 pip 之前"
|
||||
|
||||
#: ../../installation.md:46
|
||||
msgid ""
|
||||
"The easiest way to prepare your software environment is using CANN image "
|
||||
"directly:"
|
||||
msgstr "最简单的方式是直接使用 CANN 镜像来准备您的软件环境:"
|
||||
|
||||
#: ../../installation.md
|
||||
msgid "Click here to see \"Install CANN manually\""
|
||||
msgstr "点击此处查看“手动安装 CANN”"
|
||||
|
||||
#: ../../installation.md:72
|
||||
msgid "You can also install CANN manually:"
|
||||
msgstr "你也可以手动安装 CANN:"
|
||||
|
||||
#: ../../installation.md
|
||||
msgid "Before using docker"
|
||||
msgstr "在使用 docker 之前"
|
||||
|
||||
#: ../../installation.md:104
|
||||
msgid ""
|
||||
"No more extra step if you are using `vllm-ascend` prebuilt docker image."
|
||||
msgstr "如果你使用 `vllm-ascend` 预构建的 docker 镜像,就无需额外的步骤。"
|
||||
|
||||
#: ../../installation.md:108
|
||||
msgid "Once it's done, you can start to set up `vllm` and `vllm-ascend`."
|
||||
msgstr "完成后,你可以开始配置 `vllm` 和 `vllm-ascend`。"
|
||||
|
||||
#: ../../installation.md:110
|
||||
msgid "Setup vllm and vllm-ascend"
|
||||
msgstr "安装 vllm 和 vllm-ascend"
|
||||
|
||||
#: ../../installation.md
|
||||
msgid "Using pip"
|
||||
msgstr "使用 pip"
|
||||
|
||||
#: ../../installation.md:121
|
||||
msgid "First install system dependencies and config pip mirror:"
|
||||
msgstr "首先安装系统依赖并配置 pip 镜像:"
|
||||
|
||||
#: ../../installation.md:133
|
||||
msgid ""
|
||||
"**[Optional]** Then config the extra-index of `pip` if you are working on a "
|
||||
"x86 machine or using torch-npu dev version:"
|
||||
msgstr ""
|
||||
"**[可选]** 如果你在 x86 机器上工作或使用 torch-npu 开发版,请配置 `pip` 的额"
|
||||
"外索引:"
|
||||
|
||||
#: ../../installation.md:140
|
||||
msgid ""
|
||||
"Then you can install `vllm` and `vllm-ascend` from **pre-built wheel**:"
|
||||
msgstr "然后你可以从**预编译的 wheel 包**安装 `vllm` 和 `vllm-ascend`:"
|
||||
|
||||
#: ../../installation.md
|
||||
msgid "Click here to see \"Build from source code\""
|
||||
msgstr "点击此处查看“从源代码构建”"
|
||||
|
||||
#: ../../installation.md:153
|
||||
msgid "or build from **source code**:"
|
||||
msgstr "或者从**源代码**构建:"
|
||||
|
||||
#: ../../installation.md:171
|
||||
msgid ""
|
||||
"vllm-ascend will build custom ops by default. If you don't want to build "
|
||||
"it, set `COMPILE_CUSTOM_KERNELS=0` environment to disable it."
|
||||
msgstr ""
|
||||
"vllm-ascend 默认会编译自定义算子。如果你不想编译它,可以设置环境变量 "
|
||||
"`COMPILE_CUSTOM_KERNELS=0` 来禁用。"
|
||||
|
||||
#: ../../installation.md:175
|
||||
msgid ""
|
||||
"If you are building from v0.7.3-dev and intend to use sleep mode feature, "
|
||||
"you should set `COMPILE_CUSTOM_KERNELS=1` manually. To build custom ops, "
|
||||
"gcc/g++ higher than 8 and c++ 17 or higher is required. If you're using "
|
||||
"`pip install -e .` and encourage a torch-npu version conflict, please "
|
||||
"install with `pip install --no-build-isolation -e .` to build on system "
|
||||
"env. If you encounter other problems during compiling, it is probably "
|
||||
"because unexpected compiler is being used, you may export `CXX_COMPILER` "
|
||||
"and `C_COMPILER` in env to specify your g++ and gcc locations before "
|
||||
"compiling."
|
||||
msgstr ""
|
||||
"如果你是从 v0.7.3-dev 版本开始构建,并且打算使用休眠模式功能,你需要手动设"
|
||||
"置 `COMPILE_CUSTOM_KERNELS=1`。构建自定义算子时,要求 gcc/g++ 版本高于 8 且"
|
||||
"支持 c++ 17 或更高标准。如果你正在使用 `pip install -e .` 并且出现了 torch-"
|
||||
"npu 版本冲突,请使用 `pip install --no-build-isolation -e .` 在系统环境下进"
|
||||
"行安装。如果在编译过程中遇到其它问题,可能是因为使用了非预期的编译器,你可以"
|
||||
"在编译前通过环境变量导出 `CXX_COMPILER` 和 `C_COMPILER`,以指定你的 g++ 和 "
|
||||
"gcc 路径。"
|
||||
|
||||
#: ../../installation.md
|
||||
msgid "Using docker"
|
||||
msgstr "使用 docker"
|
||||
|
||||
#: ../../installation.md:184
|
||||
msgid "You can just pull the **prebuilt image** and run it with bash."
|
||||
msgstr "你可以直接拉取**预构建镜像**并用 bash 运行它。"
|
||||
|
||||
#: ../../installation.md
|
||||
msgid "Click here to see \"Build from Dockerfile\""
|
||||
msgstr "点击这里查看“从 Dockerfile 构建”"
|
||||
|
||||
#: ../../installation.md:187
|
||||
msgid "or build IMAGE from **source code**:"
|
||||
msgstr "或从**源代码**构建 IMAGE:"
|
||||
|
||||
#: ../../installation.md:218
|
||||
msgid ""
|
||||
"The default workdir is `/workspace`, vLLM and vLLM Ascend code are placed "
|
||||
"in `/vllm-workspace` and installed in [development mode](https://setuptools."
|
||||
"pypa.io/en/latest/userguide/development_mode.html)(`pip install -e`) to "
|
||||
"help developer immediately take place changes without requiring a new "
|
||||
"installation."
|
||||
msgstr ""
|
||||
"默认的工作目录是 `/workspace`,vLLM 和 vLLM Ascend 代码被放置在 `/vllm-"
|
||||
"workspace`,并以[开发模式](https://setuptools.pypa.io/en/latest/userguide/"
|
||||
"development_mode.html)(`pip install -e`)安装,以便开发者能够即时生效更改,"
|
||||
"而无需重新安装。"
|
||||
|
||||
#: ../../installation.md:222
|
||||
msgid "Extra information"
|
||||
msgstr "额外信息"
|
||||
|
||||
#: ../../installation.md:224
|
||||
msgid "Verify installation"
|
||||
msgstr "验证安装"
|
||||
|
||||
#: ../../installation.md:226
|
||||
msgid "Create and run a simple inference test. The `example.py` can be like:"
|
||||
msgstr "创建并运行一个简单的推理测试。`example.py` 可以如下:"
|
||||
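A sketch of what such an `example.py` could look like (prompts and model are placeholders; the script shipped in the docs may differ):

    from vllm import LLM, SamplingParams

    prompts = ["Hello, my name is", "The future of AI is"]
    sampling_params = SamplingParams(temperature=0.8, top_p=0.95)
    llm = LLM(model="Qwen/Qwen2.5-0.5B-Instruct")

    for output in llm.generate(prompts, sampling_params):
        print(f"Prompt: {output.prompt!r}, Generated: {output.outputs[0].text!r}")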
|
||||
#: ../../installation.md:251
|
||||
msgid "Then run:"
|
||||
msgstr "然后运行:"
|
||||
|
||||
#: ../../installation.md:259
|
||||
msgid "The output will be like:"
|
||||
msgstr "输出将会像这样:"
|
||||
149
docs/source/locale/zh_CN/LC_MESSAGES/quick_start.po
Normal file
@@ -0,0 +1,149 @@
|
||||
# SOME DESCRIPTIVE TITLE.
|
||||
# Copyright (C) 2025, vllm-ascend team
|
||||
# This file is distributed under the same license as the vllm-ascend
|
||||
# package.
|
||||
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
|
||||
#
|
||||
#, fuzzy
|
||||
msgid ""
|
||||
msgstr ""
|
||||
"Project-Id-Version: vllm-ascend\n"
|
||||
"Report-Msgid-Bugs-To: \n"
|
||||
"POT-Creation-Date: 2025-07-18 09:01+0800\n"
|
||||
"PO-Revision-Date: 2025-07-18 10:09+0800\n"
|
||||
"Last-Translator: \n"
|
||||
"Language-Team: zh_CN <LL@li.org>\n"
|
||||
"Language: zh_CN\n"
|
||||
"MIME-Version: 1.0\n"
|
||||
"Content-Type: text/plain; charset=utf-8\n"
|
||||
"Content-Transfer-Encoding: 8bit\n"
|
||||
"Plural-Forms: nplurals=1; plural=0;\n"
|
||||
"Generated-By: Babel 2.17.0\n"
|
||||
"X-Generator: Poedit 3.5\n"
|
||||
|
||||
#: ../../quick_start.md:1
|
||||
msgid "Quickstart"
|
||||
msgstr "快速入门"
|
||||
|
||||
#: ../../quick_start.md:3
|
||||
msgid "Prerequisites"
|
||||
msgstr "先决条件"
|
||||
|
||||
#: ../../quick_start.md:5
|
||||
msgid "Supported Devices"
|
||||
msgstr "支持的设备"
|
||||
|
||||
#: ../../quick_start.md:6
|
||||
msgid ""
|
||||
"Atlas A2 Training series (Atlas 800T A2, Atlas 900 A2 PoD, Atlas 200T A2 "
|
||||
"Box16, Atlas 300T A2)"
|
||||
msgstr ""
|
||||
"Atlas A2 训练系列(Atlas 800T A2,Atlas 900 A2 PoD,Atlas 200T A2 Box16,"
|
||||
"Atlas 300T A2)"
|
||||
|
||||
#: ../../quick_start.md:7
|
||||
msgid "Atlas 800I A2 Inference series (Atlas 800I A2)"
|
||||
msgstr "Atlas 800I A2 推理系列(Atlas 800I A2)"
|
||||
|
||||
#: ../../quick_start.md:9
|
||||
msgid "Setup environment using container"
|
||||
msgstr "使用容器设置环境"
|
||||
|
||||
#: ../../quick_start.md
|
||||
msgid "Ubuntu"
|
||||
msgstr "Ubuntu"
|
||||
|
||||
#: ../../quick_start.md
|
||||
msgid "openEuler"
|
||||
msgstr "openEuler"
|
||||
|
||||
#: ../../quick_start.md:69
|
||||
msgid ""
|
||||
"The default workdir is `/workspace`, vLLM and vLLM Ascend code are placed "
|
||||
"in `/vllm-workspace` and installed in [development mode](https://setuptools."
|
||||
"pypa.io/en/latest/userguide/development_mode.html)(`pip install -e`) to "
|
||||
"help developer immediately take place changes without requiring a new "
|
||||
"installation."
|
||||
msgstr ""
|
||||
"默认的工作目录是 `/workspace`,vLLM 和 vLLM Ascend 代码被放置在 `/vllm-"
|
||||
"workspace`,并以[开发模式](https://setuptools.pypa.io/en/latest/userguide/"
|
||||
"development_mode.html)(`pip install -e`)安装,以便开发者能够即时生效更改,"
|
||||
"而无需重新安装。"
|
||||
|
||||
#: ../../quick_start.md:71
|
||||
msgid "Usage"
|
||||
msgstr "用法"
|
||||
|
||||
#: ../../quick_start.md:73
|
||||
msgid "You can use Modelscope mirror to speed up download:"
|
||||
msgstr "你可以使用 Modelscope 镜像来加速下载:"
|
||||
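One way to do this from Python, assuming the standard `VLLM_USE_MODELSCOPE` switch:

    import os

    # Pull model weights from the ModelScope mirror instead of Hugging Face Hub.
    os.environ["VLLM_USE_MODELSCOPE"] = "True"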
|
||||
#: ../../quick_start.md:80
|
||||
msgid "There are two ways to start vLLM on Ascend NPU:"
|
||||
msgstr "在昇腾 NPU 上启动 vLLM 有两种方式:"
|
||||
|
||||
#: ../../quick_start.md
|
||||
msgid "Offline Batched Inference"
|
||||
msgstr "离线批量推理"
|
||||
|
||||
#: ../../quick_start.md:86
|
||||
msgid ""
|
||||
"With vLLM installed, you can start generating texts for list of input "
|
||||
"prompts (i.e. offline batch inferencing)."
|
||||
msgstr ""
|
||||
"安装了 vLLM 后,您可以开始为一系列输入提示生成文本(即离线批量推理)。"
|
||||
|
||||
#: ../../quick_start.md:88
|
||||
msgid ""
|
||||
"Try to run below Python script directly or use `python3` shell to generate "
|
||||
"texts:"
|
||||
msgstr ""
|
||||
"尝试直接运行下面的 Python 脚本,或者使用 `python3` 交互式命令行来生成文本:"
|
||||
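The script itself is not reproduced in this catalog; a minimal stand-in sketch (model assumed from the quickstart):

    from vllm import LLM, SamplingParams

    llm = LLM(model="Qwen/Qwen2.5-0.5B-Instruct")
    outputs = llm.generate(["Beijing is a"], SamplingParams(max_tokens=32))
    print(outputs[0].outputs[0].text)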
|
||||
#: ../../quick_start.md
|
||||
msgid "OpenAI Completions API"
|
||||
msgstr "OpenAI Completions API"
|
||||
|
||||
#: ../../quick_start.md:114
|
||||
msgid ""
|
||||
"vLLM can also be deployed as a server that implements the OpenAI API "
|
||||
"protocol. Run the following command to start the vLLM server with the [Qwen/"
|
||||
"Qwen2.5-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct) "
|
||||
"model:"
|
||||
msgstr ""
|
||||
"vLLM 也可以作为实现 OpenAI API 协议的服务器进行部署。运行以下命令,使用 "
|
||||
"[Qwen/Qwen2.5-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-0.5B-"
|
||||
"Instruct) 模型启动 vLLM 服务器:"
|
||||
|
||||
#: ../../quick_start.md:124
|
||||
msgid "If you see log as below:"
|
||||
msgstr "如果你看到如下日志:"
|
||||
|
||||
#: ../../quick_start.md:132
|
||||
msgid "Congratulations, you have successfully started the vLLM server!"
|
||||
msgstr "恭喜,你已经成功启动了 vLLM 服务器!"
|
||||
|
||||
#: ../../quick_start.md:134
|
||||
msgid "You can query the list the models:"
|
||||
msgstr "你可以查询模型列表:"
|
||||
|
||||
#: ../../quick_start.md:141
|
||||
msgid "You can also query the model with input prompts:"
|
||||
msgstr "你也可以通过输入提示来查询模型:"
|
||||
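Besides `curl`, the server can also be queried with the OpenAI Python client; a hedged sketch (port and model assumed from the quickstart):

    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
    resp = client.completions.create(
        model="Qwen/Qwen2.5-0.5B-Instruct", prompt="Beijing is a", max_tokens=16
    )
    print(resp.choices[0].text)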
|
||||
#: ../../quick_start.md:155
|
||||
msgid ""
|
||||
"vLLM is serving as background process, you can use `kill -2 $VLLM_PID` to "
|
||||
"stop the background process gracefully, it's equal to `Ctrl-C` to stop "
|
||||
"foreground vLLM process:"
|
||||
msgstr ""
|
||||
"vLLM 正作为后台进程运行,你可以使用 `kill -2 $VLLM_PID` 来优雅地停止后台进"
|
||||
"程,这等同于使用 `Ctrl-C` 停止前台 vLLM 进程:"
|
||||
|
||||
#: ../../quick_start.md:164
|
||||
msgid "You will see output as below:"
|
||||
msgstr "你将会看到如下输出:"
|
||||
|
||||
#: ../../quick_start.md:172
|
||||
msgid "Finally, you can exit container by using `ctrl-D`."
|
||||
msgstr "最后,你可以通过按 `ctrl-D` 退出容器。"
|
||||
29
docs/source/locale/zh_CN/LC_MESSAGES/tutorials/index.po
Normal file
@@ -0,0 +1,29 @@
|
||||
# SOME DESCRIPTIVE TITLE.
|
||||
# Copyright (C) 2025, vllm-ascend team
|
||||
# This file is distributed under the same license as the vllm-ascend
|
||||
# package.
|
||||
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
|
||||
#
|
||||
#, fuzzy
|
||||
msgid ""
|
||||
msgstr ""
|
||||
"Project-Id-Version: vllm-ascend\n"
|
||||
"Report-Msgid-Bugs-To: \n"
|
||||
"POT-Creation-Date: 2025-07-18 09:01+0800\n"
|
||||
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
|
||||
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
|
||||
"Language-Team: zh_CN <LL@li.org>\n"
|
||||
"Language: zh_CN\n"
|
||||
"MIME-Version: 1.0\n"
|
||||
"Content-Type: text/plain; charset=utf-8\n"
|
||||
"Content-Transfer-Encoding: 8bit\n"
|
||||
"Plural-Forms: nplurals=1; plural=0;\n"
|
||||
"Generated-By: Babel 2.17.0\n"
|
||||
|
||||
#: ../../tutorials/index.md:3
|
||||
msgid "Deployment"
|
||||
msgstr "部署"
|
||||
|
||||
#: ../../tutorials/index.md:1
|
||||
msgid "Tutorials"
|
||||
msgstr "教程"
|
||||
192
docs/source/locale/zh_CN/LC_MESSAGES/tutorials/multi_node.po
Normal file
@@ -0,0 +1,192 @@
|
||||
# SOME DESCRIPTIVE TITLE.
|
||||
# Copyright (C) 2025, vllm-ascend team
|
||||
# This file is distributed under the same license as the vllm-ascend
|
||||
# package.
|
||||
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
|
||||
#
|
||||
#, fuzzy
|
||||
msgid ""
|
||||
msgstr ""
|
||||
"Project-Id-Version: vllm-ascend\n"
|
||||
"Report-Msgid-Bugs-To: \n"
|
||||
"POT-Creation-Date: 2025-07-18 09:01+0800\n"
|
||||
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
|
||||
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
|
||||
"Language-Team: zh_CN <LL@li.org>\n"
|
||||
"Language: zh_CN\n"
|
||||
"MIME-Version: 1.0\n"
|
||||
"Content-Type: text/plain; charset=utf-8\n"
|
||||
"Content-Transfer-Encoding: 8bit\n"
|
||||
"Plural-Forms: nplurals=1; plural=0;\n"
|
||||
"Generated-By: Babel 2.17.0\n"
|
||||
|
||||
#: ../../tutorials/multi_node.md:1
|
||||
msgid "Multi-Node-DP (DeepSeek)"
|
||||
msgstr "多节点分布式处理(DeepSeek)"
|
||||
|
||||
#: ../../tutorials/multi_node.md:3
|
||||
msgid "Getting Start"
|
||||
msgstr "快速开始"
|
||||
|
||||
#: ../../tutorials/multi_node.md:4
|
||||
msgid ""
|
||||
"vLLM-Ascend now supports Data Parallel (DP) deployment, enabling model "
|
||||
"weights to be replicated across multiple NPUs or instances, each processing "
|
||||
"independent batches of requests. This is particularly useful for scaling "
|
||||
"throughput across devices while maintaining high resource utilization."
|
||||
msgstr ""
|
||||
"vLLM-Ascend 现在支持数据并行(DP)部署,可以在多个 NPU "
|
||||
"或实例之间复制模型权重,每个实例处理独立的请求批次。这对于在保证高资源利用率的同时,实现跨设备的吞吐量扩展特别有用。"
|
||||
|
||||
#: ../../tutorials/multi_node.md:6
|
||||
msgid ""
|
||||
"Each DP rank is deployed as a separate “core engine” process which "
|
||||
"communicates with front-end process(es) via ZMQ sockets. Data Parallel can "
|
||||
"be combined with Tensor Parallel, in which case each DP engine owns a number"
|
||||
" of per-NPU worker processes equal to the TP size."
|
||||
msgstr ""
|
||||
"每个 DP 进程作为一个单独的“核心引擎”进程部署,并通过 ZMQ 套接字与前端进程通信。数据并行可以与张量并行结合使用,此时每个 DP "
|
||||
"引擎拥有数量等于 TP 大小的每 NPU 工作进程。"
|
||||
|
||||
#: ../../tutorials/multi_node.md:8
|
||||
msgid ""
|
||||
"For Mixture-of-Experts (MoE) models — especially advanced architectures like"
|
||||
" DeepSeek that utilize Multi-head Latent Attention (MLA) — a hybrid "
|
||||
"parallelism approach is recommended: - Use **Data Parallelism (DP)** for"
|
||||
" attention layers, which are replicated across devices and handle separate "
|
||||
"batches. - Use **Expert or Tensor Parallelism (EP/TP)** for expert "
|
||||
"layers, which are sharded across devices to distribute the computation."
|
||||
msgstr ""
|
||||
"对于混合专家(Mixture-of-Experts, MoE)模型——尤其是像 DeepSeek 这样采用多头潜在注意力(Multi-head Latent Attention, MLA)的高级架构——推荐使用混合并行策略:\n"
|
||||
" - 对于注意力层,使用 **数据并行(Data Parallelism, DP)**,这些层会在各设备间复刻,并处理不同的批次。\n"
|
||||
" - 对于专家层,使用 **专家并行或张量并行(Expert or Tensor Parallelism, EP/TP)**,这些层会在设备间分片,从而分担计算。"
|
||||
|
||||
#: ../../tutorials/multi_node.md:12
|
||||
msgid ""
|
||||
"This division enables attention layers to be replicated across Data Parallel"
|
||||
" (DP) ranks, enabling them to process different batches independently. "
|
||||
"Meanwhile, expert layers are partitioned (sharded) across devices using "
|
||||
"Expert or Tensor Parallelism(DP*TP), maximizing hardware utilization and "
|
||||
"efficiency."
|
||||
msgstr ""
|
||||
"这种划分使得注意力层能够在数据并行(DP)组内复制,从而能够独立处理不同的批次。同时,专家层通过专家或张量并行(DP*TP)在设备间进行分区(切片),最大化硬件利用率和效率。"
|
||||
|
||||
#: ../../tutorials/multi_node.md:14
|
||||
msgid ""
|
||||
"In these cases the data parallel ranks are not completely independent, "
|
||||
"forward passes must be aligned and expert layers across all ranks are "
|
||||
"required to synchronize during every forward pass, even if there are fewer "
|
||||
"requests to be processed than DP ranks."
|
||||
msgstr ""
|
||||
"在这些情况下,数据并行的各个 rank 不是完全独立的,前向传播必须对齐,并且所有 rank "
|
||||
"上的专家层在每次前向传播时都需要同步,即使待处理的请求数量少于 DP rank 的数量。"
|
||||
|
||||
#: ../../tutorials/multi_node.md:16
|
||||
msgid ""
|
||||
"For MoE models, when any requests are in progress in any rank, we must "
|
||||
"ensure that empty “dummy” forward passes are performed in all ranks which "
|
||||
"don’t currently have any requests scheduled. This is handled via a separate "
|
||||
"DP `Coordinator` process which communicates with all of the ranks, and a "
|
||||
"collective operation performed every N steps to determine when all ranks "
|
||||
"become idle and can be paused. When TP is used in conjunction with DP, "
|
||||
"expert layers form an EP or TP group of size (DP x TP)."
|
||||
msgstr ""
|
||||
"对于 MoE 模型,当任何一个 rank 有请求正在进行时,必须确保所有当前没有请求的 rank 都执行空的“虚拟”前向传播。这是通过一个单独的 DP "
|
||||
"`Coordinator` 协调器进程来实现的,该进程与所有 rank 通信,并且每隔 N 步执行一次集体操作,以判断所有 rank "
|
||||
"是否都处于空闲状态并可以暂停。当 TP 与 DP 结合使用时,专家层会组成一个规模为(DP x TP)的 EP 或 TP 组。"
|
||||
|
||||
#: ../../tutorials/multi_node.md:18
|
||||
msgid "Verify Multi-Node Communication Environment"
|
||||
msgstr "验证多节点通信环境"
|
||||
|
||||
#: ../../tutorials/multi_node.md:20
|
||||
msgid "Physical Layer Requirements:"
|
||||
msgstr "物理层要求:"
|
||||
|
||||
#: ../../tutorials/multi_node.md:22
|
||||
msgid ""
|
||||
"The physical machines must be located on the same WLAN, with network "
|
||||
"connectivity."
|
||||
msgstr "物理机器必须位于同一个 WLAN 中,并且具有网络连接。"
|
||||
|
||||
#: ../../tutorials/multi_node.md:23
|
||||
msgid ""
|
||||
"All NPUs are connected with optical modules, and the connection status must "
|
||||
"be normal."
|
||||
msgstr "所有 NPU 都通过光模块连接,且连接状态必须正常。"
|
||||
|
||||
#: ../../tutorials/multi_node.md:25
|
||||
msgid "Verification Process:"
|
||||
msgstr "验证流程:"
|
||||
|
||||
#: ../../tutorials/multi_node.md:27
|
||||
msgid ""
|
||||
"Execute the following commands on each node in sequence. The results must "
|
||||
"all be `success` and the status must be `UP`:"
|
||||
msgstr "在每个节点上依次执行以下命令。所有结果必须为 `success` 且状态必须为 `UP`:"
|
||||
|
||||
#: ../../tutorials/multi_node.md:44
|
||||
msgid "NPU Interconnect Verification:"
|
||||
msgstr "NPU 互连验证:"
|
||||
|
||||
#: ../../tutorials/multi_node.md:45
|
||||
msgid "1. Get NPU IP Addresses"
|
||||
msgstr "1. 获取 NPU IP 地址"
|
||||
|
||||
#: ../../tutorials/multi_node.md:50
|
||||
msgid "2. Cross-Node PING Test"
|
||||
msgstr "2. 跨节点PING测试"
|
||||
|
||||
#: ../../tutorials/multi_node.md:56
|
||||
msgid "Run with docker"
|
||||
msgstr "用 docker 运行"
|
||||
|
||||
#: ../../tutorials/multi_node.md:57
|
||||
msgid ""
|
||||
"Assume you have two Atlas 800 A2(64G*8) nodes, and want to deploy the "
|
||||
"`deepseek-v3-w8a8` quantitative model across multi-node."
|
||||
msgstr "假设你有两台 Atlas 800 A2(64G*8)节点,并且想要在多节点上部署 `deepseek-v3-w8a8` 量化模型。"
|
||||
|
||||
#: ../../tutorials/multi_node.md:92
|
||||
msgid ""
|
||||
"Before launch the inference server, ensure some environment variables are "
|
||||
"set for multi node communication"
|
||||
msgstr "在启动推理服务器之前,确保已经为多节点通信设置了一些环境变量。"
|
||||
|
||||
#: ../../tutorials/multi_node.md:95
|
||||
msgid "Run the following scripts on two nodes respectively"
|
||||
msgstr "分别在两台节点上运行以下脚本"
|
||||
|
||||
#: ../../tutorials/multi_node.md:97
|
||||
msgid "**node0**"
|
||||
msgstr "**节点0**"
|
||||
|
||||
#: ../../tutorials/multi_node.md:137
|
||||
msgid "**node1**"
|
||||
msgstr "**节点1**"
|
||||
|
||||
#: ../../tutorials/multi_node.md:176
|
||||
msgid ""
|
||||
"The Deployment view looks like: "
|
||||
msgstr "部署视图如下所示:"
|
||||
|
||||
#: ../../tutorials/multi_node.md:176
|
||||
msgid "alt text"
|
||||
msgstr "替代文本"
|
||||
|
||||
#: ../../tutorials/multi_node.md:179
|
||||
msgid ""
|
||||
"Once your server is started, you can query the model with input prompts:"
|
||||
msgstr "一旦你的服务器启动,你可以通过输入提示词来查询模型:"
|
||||
|
||||
#: ../../tutorials/multi_node.md:192
|
||||
msgid "Run benchmarks"
|
||||
msgstr "运行基准测试"
|
||||
|
||||
#: ../../tutorials/multi_node.md:193
|
||||
msgid ""
|
||||
"For details please refer to [benchmark](https://github.com/vllm-"
|
||||
"project/vllm-ascend/tree/main/benchmarks)"
|
||||
msgstr ""
|
||||
"详细信息请参阅 [benchmark](https://github.com/vllm-project/vllm-"
|
||||
"ascend/tree/main/benchmarks)"
|
||||
62
docs/source/locale/zh_CN/LC_MESSAGES/tutorials/multi_npu.po
Normal file
@@ -0,0 +1,62 @@
|
||||
# SOME DESCRIPTIVE TITLE.
|
||||
# Copyright (C) 2025, vllm-ascend team
|
||||
# This file is distributed under the same license as the vllm-ascend
|
||||
# package.
|
||||
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
|
||||
#
|
||||
#, fuzzy
|
||||
msgid ""
|
||||
msgstr ""
|
||||
"Project-Id-Version: vllm-ascend\n"
|
||||
"Report-Msgid-Bugs-To: \n"
|
||||
"POT-Creation-Date: 2025-07-18 09:01+0800\n"
|
||||
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
|
||||
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
|
||||
"Language-Team: zh_CN <LL@li.org>\n"
|
||||
"Language: zh_CN\n"
|
||||
"MIME-Version: 1.0\n"
|
||||
"Content-Type: text/plain; charset=utf-8\n"
|
||||
"Content-Transfer-Encoding: 8bit\n"
|
||||
"Plural-Forms: nplurals=1; plural=0;\n"
|
||||
"Generated-By: Babel 2.17.0\n"
|
||||
|
||||
#: ../../tutorials/multi_npu.md:1
|
||||
msgid "Multi-NPU (QwQ 32B)"
|
||||
msgstr "多-NPU(QwQ 32B)"
|
||||
|
||||
#: ../../tutorials/multi_npu.md:3
|
||||
msgid "Run vllm-ascend on Multi-NPU"
|
||||
msgstr "在多NPU上运行 vllm-ascend"
|
||||
|
||||
#: ../../tutorials/multi_npu.md:5
|
||||
msgid "Run docker container:"
|
||||
msgstr "运行 docker 容器:"
|
||||
|
||||
#: ../../tutorials/multi_npu.md:30
|
||||
msgid "Setup environment variables:"
|
||||
msgstr "设置环境变量:"
|
||||
|
||||
#: ../../tutorials/multi_npu.md:40
|
||||
msgid "Online Inference on Multi-NPU"
|
||||
msgstr "多NPU的在线推理"
|
||||
|
||||
#: ../../tutorials/multi_npu.md:42
|
||||
msgid "Run the following script to start the vLLM server on Multi-NPU:"
|
||||
msgstr "运行以下脚本,在多NPU上启动 vLLM 服务器:"
|
||||
|
||||
#: ../../tutorials/multi_npu.md:48
|
||||
msgid ""
|
||||
"Once your server is started, you can query the model with input prompts"
|
||||
msgstr "一旦服务器启动,就可以通过输入提示词来查询模型。"
|
||||
|
||||
#: ../../tutorials/multi_npu.md:63
|
||||
msgid "Offline Inference on Multi-NPU"
|
||||
msgstr "多NPU离线推理"
|
||||
|
||||
#: ../../tutorials/multi_npu.md:65
|
||||
msgid "Run the following script to execute offline inference on multi-NPU:"
|
||||
msgstr "运行以下脚本以在多NPU上执行离线推理:"
|
||||
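A minimal sketch of the tensor-parallel offline setup this tutorial describes (card count assumed):

    from vllm import LLM, SamplingParams

    # Shard QwQ-32B across 4 NPUs; adjust tensor_parallel_size to your card count.
    llm = LLM(model="Qwen/QwQ-32B", tensor_parallel_size=4)
    print(llm.generate(["The capital of France is"], SamplingParams(max_tokens=8)))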
|
||||
#: ../../tutorials/multi_npu.md:102
|
||||
msgid "If you run this script successfully, you can see the info shown below:"
|
||||
msgstr "如果你成功运行此脚本,你可以看到如下所示的信息:"
|
||||
86
docs/source/locale/zh_CN/LC_MESSAGES/tutorials/multi_npu_moge.po
Normal file
@@ -0,0 +1,86 @@
|
||||
# SOME DESCRIPTIVE TITLE.
|
||||
# Copyright (C) 2025, vllm-ascend team
|
||||
# This file is distributed under the same license as the vllm-ascend
|
||||
# package.
|
||||
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
|
||||
#
|
||||
#, fuzzy
|
||||
msgid ""
|
||||
msgstr ""
|
||||
"Project-Id-Version: vllm-ascend\n"
|
||||
"Report-Msgid-Bugs-To: \n"
|
||||
"POT-Creation-Date: 2025-07-18 09:01+0800\n"
|
||||
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
|
||||
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
|
||||
"Language-Team: zh_CN <LL@li.org>\n"
|
||||
"Language: zh_CN\n"
|
||||
"MIME-Version: 1.0\n"
|
||||
"Content-Type: text/plain; charset=utf-8\n"
|
||||
"Content-Transfer-Encoding: 8bit\n"
|
||||
"Plural-Forms: nplurals=1; plural=0;\n"
|
||||
"Generated-By: Babel 2.17.0\n"
|
||||
|
||||
#: ../../tutorials/multi_npu_moge.md:1
|
||||
msgid "Multi-NPU (Pangu Pro MoE)"
|
||||
msgstr "多NPU(Pangu Pro MoE)"
|
||||
|
||||
#: ../../tutorials/multi_npu_moge.md:3
|
||||
msgid "Run vllm-ascend on Multi-NPU"
|
||||
msgstr "在多NPU上运行 vllm-ascend"
|
||||
|
||||
#: ../../tutorials/multi_npu_moge.md:5
|
||||
msgid "Run container:"
|
||||
msgstr "运行容器:"
|
||||
|
||||
#: ../../tutorials/multi_npu_moge.md:30
|
||||
msgid "Setup environment variables:"
|
||||
msgstr "设置环境变量:"
|
||||
|
||||
#: ../../tutorials/multi_npu_moge.md:37
|
||||
msgid "Download the model:"
|
||||
msgstr "下载该模型:"
|
||||
|
||||
#: ../../tutorials/multi_npu_moge.md:44
|
||||
msgid "Online Inference on Multi-NPU"
|
||||
msgstr "多NPU上的在线推理"
|
||||
|
||||
#: ../../tutorials/multi_npu_moge.md:46
|
||||
msgid "Run the following script to start the vLLM server on Multi-NPU:"
|
||||
msgstr "运行以下脚本,在多NPU上启动 vLLM 服务器:"
|
||||
|
||||
#: ../../tutorials/multi_npu_moge.md:55
|
||||
msgid ""
|
||||
"Once your server is started, you can query the model with input prompts:"
|
||||
msgstr "一旦你的服务器启动,你可以通过输入提示词来查询模型:"
|
||||
|
||||
#: ../../tutorials/multi_npu_moge.md
|
||||
msgid "v1/completions"
|
||||
msgstr "v1/补全"
|
||||
|
||||
#: ../../tutorials/multi_npu_moge.md
|
||||
msgid "v1/chat/completions"
|
||||
msgstr "v1/chat/completions"
|
||||
|
||||
#: ../../tutorials/multi_npu_moge.md:96
|
||||
msgid "If you run this successfully, you can see the info shown below:"
|
||||
msgstr "如果你成功运行这个,你可以看到如下所示的信息:"
|
||||
|
||||
#: ../../tutorials/multi_npu_moge.md:102
|
||||
msgid "Offline Inference on Multi-NPU"
|
||||
msgstr "多NPU离线推理"
|
||||
|
||||
#: ../../tutorials/multi_npu_moge.md:104
|
||||
msgid "Run the following script to execute offline inference on multi-NPU:"
|
||||
msgstr "运行以下脚本以在多NPU上执行离线推理:"
|
||||
|
||||
#: ../../tutorials/multi_npu_moge.md
|
||||
msgid "Graph Mode"
|
||||
msgstr "图模式"
|
||||
|
||||
#: ../../tutorials/multi_npu_moge.md
|
||||
msgid "Eager Mode"
|
||||
msgstr "即时模式"
|
||||
|
||||
#: ../../tutorials/multi_npu_moge.md:230
|
||||
msgid "If you run this script successfully, you can see the info shown below:"
|
||||
msgstr "如果你成功运行此脚本,你可以看到如下所示的信息:"
|
||||
82
docs/source/locale/zh_CN/LC_MESSAGES/tutorials/multi_npu_quantization.po
Normal file
@@ -0,0 +1,82 @@
|
||||
# SOME DESCRIPTIVE TITLE.
|
||||
# Copyright (C) 2025, vllm-ascend team
|
||||
# This file is distributed under the same license as the vllm-ascend
|
||||
# package.
|
||||
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
|
||||
#
|
||||
#, fuzzy
|
||||
msgid ""
|
||||
msgstr ""
|
||||
"Project-Id-Version: vllm-ascend\n"
|
||||
"Report-Msgid-Bugs-To: \n"
|
||||
"POT-Creation-Date: 2025-07-18 09:01+0800\n"
|
||||
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
|
||||
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
|
||||
"Language-Team: zh_CN <LL@li.org>\n"
|
||||
"Language: zh_CN\n"
|
||||
"MIME-Version: 1.0\n"
|
||||
"Content-Type: text/plain; charset=utf-8\n"
|
||||
"Content-Transfer-Encoding: 8bit\n"
|
||||
"Plural-Forms: nplurals=1; plural=0;\n"
|
||||
"Generated-By: Babel 2.17.0\n"
|
||||
|
||||
#: ../../tutorials/multi_npu_quantization.md:1
|
||||
msgid "Multi-NPU (QwQ 32B W8A8)"
|
||||
msgstr "多NPU(QwQ 32B W8A8)"
|
||||
|
||||
#: ../../tutorials/multi_npu_quantization.md:3
|
||||
msgid "Run docker container"
|
||||
msgstr "运行 docker 容器"
|
||||
|
||||
#: ../../tutorials/multi_npu_quantization.md:5
|
||||
msgid "w8a8 quantization feature is supported by v0.8.4rc2 or higher"
|
||||
msgstr "w8a8 量化功能由 v0.8.4rc2 或更高版本支持"
|
||||
|
||||
#: ../../tutorials/multi_npu_quantization.md:31
|
||||
msgid "Install modelslim and convert model"
|
||||
msgstr "安装 modelslim 并转换模型"
|
||||
|
||||
#: ../../tutorials/multi_npu_quantization.md:33
|
||||
msgid ""
|
||||
"You can choose to convert the model yourself or use the quantized model we "
|
||||
"uploaded, see https://www.modelscope.cn/models/vllm-ascend/QwQ-32B-W8A8"
|
||||
msgstr ""
|
||||
"你可以选择自己转换模型,或者使用我们上传的量化模型,详见 https://www.modelscope.cn/models/vllm-"
|
||||
"ascend/QwQ-32B-W8A8"
|
||||
|
||||
#: ../../tutorials/multi_npu_quantization.md:56
|
||||
msgid "Verify the quantized model"
|
||||
msgstr "验证量化模型"
|
||||
|
||||
#: ../../tutorials/multi_npu_quantization.md:57
|
||||
msgid "The converted model files looks like:"
|
||||
msgstr "转换后的模型文件如下所示:"
|
||||
|
||||
#: ../../tutorials/multi_npu_quantization.md:70
|
||||
msgid ""
|
||||
"Run the following script to start the vLLM server with quantized model:"
|
||||
msgstr "运行以下脚本以启动带有量化模型的 vLLM 服务器:"
|
||||
|
||||
#: ../../tutorials/multi_npu_quantization.md:73
|
||||
msgid ""
|
||||
"The value \"ascend\" for \"--quantization\" argument will be supported after"
|
||||
" [a specific PR](https://github.com/vllm-project/vllm-ascend/pull/877) is "
|
||||
"merged and released, you can cherry-pick this commit for now."
|
||||
msgstr ""
|
||||
"在 [特定的PR](https://github.com/vllm-project/vllm-ascend/pull/877) 合并并发布后, \"--"
|
||||
"quantization\" 参数将支持值 \"ascend\",你也可以现在手动挑选该提交。"
|
||||
|
||||
#: ../../tutorials/multi_npu_quantization.md:79
|
||||
msgid ""
|
||||
"Once your server is started, you can query the model with input prompts"
|
||||
msgstr "一旦服务器启动,就可以通过输入提示词来查询模型。"
|
||||
|
||||
#: ../../tutorials/multi_npu_quantization.md:93
|
||||
msgid ""
|
||||
"Run the following script to execute offline inference on multi-NPU with "
|
||||
"quantized model:"
|
||||
msgstr "运行以下脚本,在多NPU上使用量化模型执行离线推理:"
|
||||
|
||||
#: ../../tutorials/multi_npu_quantization.md:96
|
||||
msgid "To enable quantization for ascend, quantization method must be \"ascend\""
|
||||
msgstr "要在ascend上启用量化,量化方法必须为“ascend”。"
|
||||
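A hedged sketch of loading the converted weights with the "ascend" quantization method (the local path is a placeholder):

    from vllm import LLM

    llm = LLM(
        model="/path/to/QwQ-32B-W8A8",  # converted model directory from the steps above
        quantization="ascend",          # required to enable Ascend W8A8 quantization
        tensor_parallel_size=4,
    )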
71
docs/source/locale/zh_CN/LC_MESSAGES/tutorials/multi_npu_qwen3_moe.po
Normal file
@@ -0,0 +1,71 @@
|
||||
# SOME DESCRIPTIVE TITLE.
|
||||
# Copyright (C) 2025, vllm-ascend team
|
||||
# This file is distributed under the same license as the vllm-ascend
|
||||
# package.
|
||||
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
|
||||
#
|
||||
#, fuzzy
|
||||
msgid ""
|
||||
msgstr ""
|
||||
"Project-Id-Version: vllm-ascend\n"
|
||||
"Report-Msgid-Bugs-To: \n"
|
||||
"POT-Creation-Date: 2025-07-18 09:01+0800\n"
|
||||
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
|
||||
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
|
||||
"Language-Team: zh_CN <LL@li.org>\n"
|
||||
"Language: zh_CN\n"
|
||||
"MIME-Version: 1.0\n"
|
||||
"Content-Type: text/plain; charset=utf-8\n"
|
||||
"Content-Transfer-Encoding: 8bit\n"
|
||||
"Plural-Forms: nplurals=1; plural=0;\n"
|
||||
"Generated-By: Babel 2.17.0\n"
|
||||
|
||||
#: ../../tutorials/multi_npu_qwen3_moe.md:1
|
||||
msgid "Multi-NPU (Qwen3-30B-A3B)"
|
||||
msgstr "多NPU(Qwen3-30B-A3B)"
|
||||
|
||||
#: ../../tutorials/multi_npu_qwen3_moe.md:3
|
||||
msgid "Run vllm-ascend on Multi-NPU with Qwen3 MoE"
|
||||
msgstr "在多NPU上运行带有Qwen3 MoE的vllm-ascend"
|
||||
|
||||
#: ../../tutorials/multi_npu_qwen3_moe.md:5
|
||||
msgid "Run docker container:"
|
||||
msgstr "运行 docker 容器:"
|
||||
|
||||
#: ../../tutorials/multi_npu_qwen3_moe.md:30
|
||||
msgid "Setup environment variables:"
|
||||
msgstr "设置环境变量:"
|
||||
|
||||
#: ../../tutorials/multi_npu_qwen3_moe.md:40
|
||||
msgid "Online Inference on Multi-NPU"
|
||||
msgstr "多NPU的在线推理"
|
||||
|
||||
#: ../../tutorials/multi_npu_qwen3_moe.md:42
|
||||
msgid "Run the following script to start the vLLM server on Multi-NPU:"
|
||||
msgstr "运行以下脚本以在多NPU上启动 vLLM 服务器:"
|
||||
|
||||
#: ../../tutorials/multi_npu_qwen3_moe.md:44
|
||||
msgid ""
|
||||
"For an Atlas A2 with 64GB of NPU card memory, tensor-parallel-size should be"
|
||||
" at least 2, and for 32GB of memory, tensor-parallel-size should be at least"
|
||||
" 4."
|
||||
msgstr ""
|
||||
"对于拥有64GB NPU卡内存的Atlas A2,tensor-parallel-size 至少应为2;对于32GB内存的NPU卡,tensor-"
|
||||
"parallel-size 至少应为4。"
|
||||
|
||||
#: ../../tutorials/multi_npu_qwen3_moe.md:50
|
||||
msgid ""
|
||||
"Once your server is started, you can query the model with input prompts"
|
||||
msgstr "一旦服务器启动,就可以通过输入提示词来查询模型。"
|
||||
|
||||
#: ../../tutorials/multi_npu_qwen3_moe.md:65
|
||||
msgid "Offline Inference on Multi-NPU"
|
||||
msgstr "多NPU离线推理"
|
||||
|
||||
#: ../../tutorials/multi_npu_qwen3_moe.md:67
|
||||
msgid "Run the following script to execute offline inference on multi-NPU:"
|
||||
msgstr "运行以下脚本以在多NPU上执行离线推理:"
|
||||
|
||||
#: ../../tutorials/multi_npu_qwen3_moe.md:104
|
||||
msgid "If you run this script successfully, you can see the info shown below:"
|
||||
msgstr "如果你成功运行此脚本,你可以看到如下所示的信息:"
|
||||
110
docs/source/locale/zh_CN/LC_MESSAGES/tutorials/single_node_300i.po
Normal file
@@ -0,0 +1,110 @@
|
||||
# SOME DESCRIPTIVE TITLE.
|
||||
# Copyright (C) 2025, vllm-ascend team
|
||||
# This file is distributed under the same license as the vllm-ascend
|
||||
# package.
|
||||
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
|
||||
#
|
||||
#, fuzzy
|
||||
msgid ""
|
||||
msgstr ""
|
||||
"Project-Id-Version: vllm-ascend\n"
|
||||
"Report-Msgid-Bugs-To: \n"
|
||||
"POT-Creation-Date: 2025-07-18 09:01+0800\n"
|
||||
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
|
||||
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
|
||||
"Language-Team: zh_CN <LL@li.org>\n"
|
||||
"Language: zh_CN\n"
|
||||
"MIME-Version: 1.0\n"
|
||||
"Content-Type: text/plain; charset=utf-8\n"
|
||||
"Content-Transfer-Encoding: 8bit\n"
|
||||
"Plural-Forms: nplurals=1; plural=0;\n"
|
||||
"Generated-By: Babel 2.17.0\n"
|
||||
|
||||
#: ../../tutorials/single_node_300i.md:1
|
||||
msgid "Single Node (Atlas 300I series)"
|
||||
msgstr "单节点(Atlas 300I 系列)"
|
||||
|
||||
#: ../../tutorials/single_node_300i.md:4
|
||||
msgid ""
|
||||
"This Atlas 300I series is currently experimental. In future versions, there "
|
||||
"may be behavioral changes around model coverage, performance improvement."
|
||||
msgstr "Atlas 300I 系列目前处于实验阶段。在未来的版本中,模型覆盖范围和性能提升方面可能会有行为上的变化。"
|
||||
|
||||
#: ../../tutorials/single_node_300i.md:7
|
||||
msgid "Run vLLM on Altlas 300I series"
|
||||
msgstr "在 Altlas 300I 系列上运行 vLLM"
|
||||
|
||||
#: ../../tutorials/single_node_300i.md:9
|
||||
msgid "Run docker container:"
|
||||
msgstr "运行 docker 容器:"
|
||||
|
||||
#: ../../tutorials/single_node_300i.md:38
|
||||
msgid "Setup environment variables:"
|
||||
msgstr "设置环境变量:"
|
||||
|
||||
#: ../../tutorials/single_node_300i.md:48
|
||||
msgid "Online Inference on NPU"
|
||||
msgstr "在NPU上进行在线推理"
|
||||
|
||||
#: ../../tutorials/single_node_300i.md:50
|
||||
msgid ""
|
||||
"Run the following script to start the vLLM server on NPU(Qwen3-0.6B:1 card, "
|
||||
"Qwen2.5-7B-Instruct:2 cards, Pangu-Pro-MoE-72B: 8 cards):"
|
||||
msgstr ""
|
||||
"运行以下脚本,在 NPU 上启动 vLLM 服务器(Qwen3-0.6B:1 张卡,Qwen2.5-7B-Instruct:2 张卡,Pangu-"
|
||||
"Pro-MoE-72B:8 张卡):"
|
||||
|
||||
#: ../../tutorials/single_node_300i.md
|
||||
msgid "Qwen3-0.6B"
|
||||
msgstr "Qwen3-0.6B"
|
||||
|
||||
#: ../../tutorials/single_node_300i.md:59
|
||||
#: ../../tutorials/single_node_300i.md:89
|
||||
#: ../../tutorials/single_node_300i.md:126
|
||||
msgid "Run the following command to start the vLLM server:"
|
||||
msgstr "运行以下命令以启动 vLLM 服务器:"
|
||||
|
||||
#: ../../tutorials/single_node_300i.md:70
|
||||
#: ../../tutorials/single_node_300i.md:100
|
||||
#: ../../tutorials/single_node_300i.md:140
|
||||
msgid ""
|
||||
"Once your server is started, you can query the model with input prompts"
|
||||
msgstr "一旦服务器启动,就可以通过输入提示词来查询模型。"
|
||||
|
||||
#: ../../tutorials/single_node_300i.md
|
||||
msgid "Qwen/Qwen2.5-7B-Instruct"
|
||||
msgstr "Qwen/Qwen2.5-7B-Instruct"
|
||||
|
||||
#: ../../tutorials/single_node_300i.md
|
||||
msgid "Pangu-Pro-MoE-72B"
|
||||
msgstr "Pangu-Pro-MoE-72B"
|
||||
|
||||
#: ../../tutorials/single_node_300i.md:119
|
||||
#: ../../tutorials/single_node_300i.md:257
|
||||
msgid "Download the model:"
|
||||
msgstr "下载该模型:"
|
||||
|
||||
#: ../../tutorials/single_node_300i.md:157
|
||||
msgid "If you run this script successfully, you can see the results."
|
||||
msgstr "如果你成功运行此脚本,你就可以看到结果。"
|
||||
|
||||
#: ../../tutorials/single_node_300i.md:159
|
||||
msgid "Offline Inference"
|
||||
msgstr "离线推理"
|
||||
|
||||
#: ../../tutorials/single_node_300i.md:161
|
||||
msgid ""
|
||||
"Run the following script (`example.py`) to execute offline inference on NPU:"
|
||||
msgstr "运行以下脚本(`example.py`)以在 NPU 上执行离线推理:"
|
||||
|
||||
#: ../../tutorials/single_node_300i.md
|
||||
msgid "Qwen2.5-7B-Instruct"
|
||||
msgstr "Qwen2.5-7B-指令版"
|
||||
|
||||
#: ../../tutorials/single_node_300i.md:320
|
||||
msgid "Run script:"
|
||||
msgstr "运行脚本:"
|
||||
|
||||
#: ../../tutorials/single_node_300i.md:325
|
||||
msgid "If you run this script successfully, you can see the info shown below:"
|
||||
msgstr "如果你成功运行此脚本,你可以看到如下所示的信息:"
|
||||
107
docs/source/locale/zh_CN/LC_MESSAGES/tutorials/single_npu.po
Normal file
@@ -0,0 +1,107 @@
|
||||
# SOME DESCRIPTIVE TITLE.
|
||||
# Copyright (C) 2025, vllm-ascend team
|
||||
# This file is distributed under the same license as the vllm-ascend
|
||||
# package.
|
||||
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
|
||||
#
|
||||
#, fuzzy
|
||||
msgid ""
|
||||
msgstr ""
|
||||
"Project-Id-Version: vllm-ascend\n"
|
||||
"Report-Msgid-Bugs-To: \n"
|
||||
"POT-Creation-Date: 2025-07-18 09:01+0800\n"
|
||||
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
|
||||
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
|
||||
"Language-Team: zh_CN <LL@li.org>\n"
|
||||
"Language: zh_CN\n"
|
||||
"MIME-Version: 1.0\n"
|
||||
"Content-Type: text/plain; charset=utf-8\n"
|
||||
"Content-Transfer-Encoding: 8bit\n"
|
||||
"Plural-Forms: nplurals=1; plural=0;\n"
|
||||
"Generated-By: Babel 2.17.0\n"
|
||||
|
||||
#: ../../tutorials/single_npu.md:1
|
||||
msgid "Single NPU (Qwen3 8B)"
|
||||
msgstr "单个NPU(Qwen3 8B)"
|
||||
|
||||
#: ../../tutorials/single_npu.md:3
|
||||
msgid "Run vllm-ascend on Single NPU"
|
||||
msgstr "在单个 NPU 上运行 vllm-ascend"
|
||||
|
||||
#: ../../tutorials/single_npu.md:5
|
||||
msgid "Offline Inference on Single NPU"
|
||||
msgstr "在单个NPU上进行离线推理"
|
||||
|
||||
#: ../../tutorials/single_npu.md:7
|
||||
msgid "Run docker container:"
|
||||
msgstr "运行 docker 容器:"
|
||||
|
||||
#: ../../tutorials/single_npu.md:29
|
||||
msgid "Setup environment variables:"
|
||||
msgstr "设置环境变量:"
|
||||
|
||||
#: ../../tutorials/single_npu.md:40
|
||||
msgid ""
|
||||
"`max_split_size_mb` prevents the native allocator from splitting blocks "
|
||||
"larger than this size (in MB). This can reduce fragmentation and may allow "
|
||||
"some borderline workloads to complete without running out of memory. You can"
|
||||
" find more details "
|
||||
"[<u>here</u>](https://www.hiascend.com/document/detail/zh/CANNCommunityEdition/800alpha003/apiref/envref/envref_07_0061.html)."
|
||||
msgstr ""
|
||||
"`max_split_size_mb` 防止本地分配器拆分超过此大小(以 MB "
|
||||
"为单位)的内存块。这可以减少内存碎片,并且可能让一些边缘情况下的工作负载顺利完成而不会耗尽内存。你可以在[<u>这里</u>](https://www.hiascend.com/document/detail/zh/CANNCommunityEdition/800alpha003/apiref/envref/envref_07_0061.html)找到更多详细信息。"
|
||||
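# Editor's note: a hedged sketch of setting the knob described above; the
# `PYTORCH_NPU_ALLOC_CONF` variable name is an assumption based on torch_npu's
# CUDA-style allocator config, see the linked CANN page for the exact form.
#
#     import os
#     # forbid the allocator from splitting cached blocks larger than 256 MB
#     os.environ["PYTORCH_NPU_ALLOC_CONF"] = "max_split_size_mb:256"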
|
||||
#: ../../tutorials/single_npu.md:43
|
||||
msgid "Run the following script to execute offline inference on a single NPU:"
|
||||
msgstr "运行以下脚本以在单个 NPU 上执行离线推理:"
|
||||
|
||||
#: ../../tutorials/single_npu.md
|
||||
msgid "Graph Mode"
|
||||
msgstr "图模式"
|
||||
|
||||
#: ../../tutorials/single_npu.md
|
||||
msgid "Eager Mode"
|
||||
msgstr "即时模式"
|
||||
|
||||
#: ../../tutorials/single_npu.md:98
|
||||
msgid "If you run this script successfully, you can see the info shown below:"
|
||||
msgstr "如果你成功运行此脚本,你可以看到如下所示的信息:"
|
||||
|
||||
#: ../../tutorials/single_npu.md:105
|
||||
msgid "Online Serving on Single NPU"
|
||||
msgstr "单个 NPU 上的在线服务"
|
||||
|
||||
#: ../../tutorials/single_npu.md:107
|
||||
msgid "Run docker container to start the vLLM server on a single NPU:"
|
||||
msgstr "运行 docker 容器,在单个 NPU 上启动 vLLM 服务器:"
|
||||
|
||||
#: ../../tutorials/single_npu.md:163
|
||||
msgid ""
|
||||
"Add `--max_model_len` option to avoid ValueError that the Qwen2.5-7B model's"
|
||||
" max seq len (32768) is larger than the maximum number of tokens that can be"
|
||||
" stored in KV cache (26240). This will differ with different NPU series base"
|
||||
" on the HBM size. Please modify the value according to a suitable value for "
|
||||
"your NPU series."
|
||||
msgstr ""
|
||||
"添加 `--max_model_len` 选项,以避免出现 Qwen2.5-7B 模型的最大序列长度(32768)大于 KV 缓存能存储的最大 "
|
||||
"token 数(26240)时的 ValueError。不同 NPU 系列由于 HBM 容量不同,该值也会有所不同。请根据您的 NPU "
|
||||
"系列,修改为合适的数值。"
|
||||
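# Editor's note: a hedged sketch of the workaround above via the offline API;
# 26240 mirrors the quoted error message and should be tuned to your HBM size.
#
#     from vllm import LLM
#     llm = LLM(model="Qwen/Qwen2.5-7B-Instruct", max_model_len=26240)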
|
||||
#: ../../tutorials/single_npu.md:166
|
||||
msgid "If your service start successfully, you can see the info shown below:"
|
||||
msgstr "如果你的服务启动成功,你会看到如下所示的信息:"
|
||||
|
||||
#: ../../tutorials/single_npu.md:174
|
||||
msgid ""
|
||||
"Once your server is started, you can query the model with input prompts:"
|
||||
msgstr "一旦你的服务器启动,你可以通过输入提示词来查询模型:"
|
||||
|
||||
#: ../../tutorials/single_npu.md:187
|
||||
msgid ""
|
||||
"If you query the server successfully, you can see the info shown below "
|
||||
"(client):"
|
||||
msgstr "如果你成功查询了服务器,你可以看到如下所示的信息(客户端):"
|
||||
|
||||
#: ../../tutorials/single_npu.md:193
|
||||
msgid "Logs of the vllm server:"
|
||||
msgstr "vllm 服务器的日志:"
|
||||
@@ -0,0 +1,77 @@
|
||||
# SOME DESCRIPTIVE TITLE.
|
||||
# Copyright (C) 2025, vllm-ascend team
|
||||
# This file is distributed under the same license as the vllm-ascend
|
||||
# package.
|
||||
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
|
||||
#
|
||||
#, fuzzy
|
||||
msgid ""
|
||||
msgstr ""
|
||||
"Project-Id-Version: vllm-ascend\n"
|
||||
"Report-Msgid-Bugs-To: \n"
|
||||
"POT-Creation-Date: 2025-07-18 09:01+0800\n"
|
||||
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
|
||||
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
|
||||
"Language-Team: zh_CN <LL@li.org>\n"
|
||||
"Language: zh_CN\n"
|
||||
"MIME-Version: 1.0\n"
|
||||
"Content-Type: text/plain; charset=utf-8\n"
|
||||
"Content-Transfer-Encoding: 8bit\n"
|
||||
"Plural-Forms: nplurals=1; plural=0;\n"
|
||||
"Generated-By: Babel 2.17.0\n"
|
||||
|
||||
#: ../../tutorials/single_npu_audio.md:1
|
||||
msgid "Single NPU (Qwen2-Audio 7B)"
|
||||
msgstr "单个 NPU(Qwen2-Audio 7B)"
|
||||
|
||||
#: ../../tutorials/single_npu_audio.md:3
|
||||
msgid "Run vllm-ascend on Single NPU"
|
||||
msgstr "在单个 NPU 上运行 vllm-ascend"
|
||||
|
||||
#: ../../tutorials/single_npu_audio.md:5
|
||||
msgid "Offline Inference on Single NPU"
|
||||
msgstr "在单个NPU上进行离线推理"
|
||||
|
||||
#: ../../tutorials/single_npu_audio.md:7
|
||||
msgid "Run docker container:"
|
||||
msgstr "运行 docker 容器:"
|
||||
|
||||
#: ../../tutorials/single_npu_audio.md:29
|
||||
msgid "Setup environment variables:"
|
||||
msgstr "设置环境变量:"
|
||||
|
||||
#: ../../tutorials/single_npu_audio.md:40
|
||||
msgid ""
|
||||
"`max_split_size_mb` prevents the native allocator from splitting blocks "
|
||||
"larger than this size (in MB). This can reduce fragmentation and may allow "
|
||||
"some borderline workloads to complete without running out of memory. You can"
|
||||
" find more details "
|
||||
"[<u>here</u>](https://www.hiascend.com/document/detail/zh/CANNCommunityEdition/800alpha003/apiref/envref/envref_07_0061.html)."
|
||||
msgstr ""
|
||||
"`max_split_size_mb` 防止本地分配器拆分超过此大小(以 MB "
|
||||
"为单位)的内存块。这可以减少内存碎片,并且可能让一些边缘情况下的工作负载顺利完成而不会耗尽内存。你可以在[<u>这里</u>](https://www.hiascend.com/document/detail/zh/CANNCommunityEdition/800alpha003/apiref/envref/envref_07_0061.html)找到更多详细信息。"
|
||||
|
||||
#: ../../tutorials/single_npu_audio.md:43
|
||||
msgid "Install packages required for audio processing:"
|
||||
msgstr "安装音频处理所需的软件包:"
|
||||
|
||||
#: ../../tutorials/single_npu_audio.md:50
|
||||
msgid "Run the following script to execute offline inference on a single NPU:"
|
||||
msgstr "运行以下脚本以在单个 NPU 上执行离线推理:"
|
||||
|
||||
#: ../../tutorials/single_npu_audio.md:114
|
||||
msgid "If you run this script successfully, you can see the info shown below:"
|
||||
msgstr "如果你成功运行此脚本,你可以看到如下所示的信息:"
|
||||
|
||||
#: ../../tutorials/single_npu_audio.md:120
|
||||
msgid "Online Serving on Single NPU"
|
||||
msgstr "单个 NPU 上的在线服务"
|
||||
|
||||
#: ../../tutorials/single_npu_audio.md:122
|
||||
msgid ""
|
||||
"Currently, vllm's OpenAI-compatible server doesn't support audio inputs, "
|
||||
"find more details [<u>here</u>](https://github.com/vllm-"
|
||||
"project/vllm/issues/19977)."
|
||||
msgstr ""
|
||||
"目前,vllm 的兼容 OpenAI 的服务器不支持音频输入,更多详情请查看[<u>这里</u>](https://github.com/vllm-"
|
||||
"project/vllm/issues/19977)。"
|
||||
@@ -0,0 +1,99 @@
|
||||
# SOME DESCRIPTIVE TITLE.
|
||||
# Copyright (C) 2025, vllm-ascend team
|
||||
# This file is distributed under the same license as the vllm-ascend
|
||||
# package.
|
||||
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
|
||||
#
|
||||
#, fuzzy
|
||||
msgid ""
|
||||
msgstr ""
|
||||
"Project-Id-Version: vllm-ascend\n"
|
||||
"Report-Msgid-Bugs-To: \n"
|
||||
"POT-Creation-Date: 2025-07-18 09:01+0800\n"
|
||||
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
|
||||
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
|
||||
"Language-Team: zh_CN <LL@li.org>\n"
|
||||
"Language: zh_CN\n"
|
||||
"MIME-Version: 1.0\n"
|
||||
"Content-Type: text/plain; charset=utf-8\n"
|
||||
"Content-Transfer-Encoding: 8bit\n"
|
||||
"Plural-Forms: nplurals=1; plural=0;\n"
|
||||
"Generated-By: Babel 2.17.0\n"
|
||||
|
||||
#: ../../tutorials/single_npu_multimodal.md:1
|
||||
msgid "Single NPU (Qwen2.5-VL 7B)"
|
||||
msgstr "单个NPU(Qwen2.5-VL 7B)"
|
||||
|
||||
#: ../../tutorials/single_npu_multimodal.md:3
|
||||
msgid "Run vllm-ascend on Single NPU"
|
||||
msgstr "在单个 NPU 上运行 vllm-ascend"
|
||||
|
||||
#: ../../tutorials/single_npu_multimodal.md:5
|
||||
msgid "Offline Inference on Single NPU"
|
||||
msgstr "在单个NPU上进行离线推理"
|
||||
|
||||
#: ../../tutorials/single_npu_multimodal.md:7
|
||||
msgid "Run docker container:"
|
||||
msgstr "运行 docker 容器:"
|
||||
|
||||
#: ../../tutorials/single_npu_multimodal.md:29
|
||||
msgid "Setup environment variables:"
|
||||
msgstr "设置环境变量:"
|
||||
|
||||
#: ../../tutorials/single_npu_multimodal.md:40
|
||||
msgid ""
|
||||
"`max_split_size_mb` prevents the native allocator from splitting blocks "
|
||||
"larger than this size (in MB). This can reduce fragmentation and may allow "
|
||||
"some borderline workloads to complete without running out of memory. You can"
|
||||
" find more details "
|
||||
"[<u>here</u>](https://www.hiascend.com/document/detail/zh/CANNCommunityEdition/800alpha003/apiref/envref/envref_07_0061.html)."
|
||||
msgstr ""
|
||||
"`max_split_size_mb` 防止本地分配器拆分超过此大小(以 MB "
|
||||
"为单位)的内存块。这可以减少内存碎片,并且可能让一些边缘情况下的工作负载顺利完成而不会耗尽内存。你可以在[<u>这里</u>](https://www.hiascend.com/document/detail/zh/CANNCommunityEdition/800alpha003/apiref/envref/envref_07_0061.html)找到更多详细信息。"
|
||||
|
||||
#: ../../tutorials/single_npu_multimodal.md:43
|
||||
msgid "Run the following script to execute offline inference on a single NPU:"
|
||||
msgstr "运行以下脚本以在单个 NPU 上执行离线推理:"
|
||||
|
||||
#: ../../tutorials/single_npu_multimodal.md:109
|
||||
msgid "If you run this script successfully, you can see the info shown below:"
|
||||
msgstr "如果你成功运行此脚本,你可以看到如下所示的信息:"
|
||||
|
||||
#: ../../tutorials/single_npu_multimodal.md:121
|
||||
msgid "Online Serving on Single NPU"
|
||||
msgstr "单个 NPU 上的在线服务"
|
||||
|
||||
#: ../../tutorials/single_npu_multimodal.md:123
|
||||
msgid "Run docker container to start the vLLM server on a single NPU:"
|
||||
msgstr "运行 docker 容器,在单个 NPU 上启动 vLLM 服务器:"
|
||||
|
||||
#: ../../tutorials/single_npu_multimodal.md:154
|
||||
msgid ""
|
||||
"Add `--max_model_len` option to avoid ValueError that the "
|
||||
"Qwen2.5-VL-7B-Instruct model's max seq len (128000) is larger than the "
|
||||
"maximum number of tokens that can be stored in KV cache. This will differ "
|
||||
"with different NPU series base on the HBM size. Please modify the value "
|
||||
"according to a suitable value for your NPU series."
|
||||
msgstr ""
|
||||
"新增 `--max_model_len` 选项,以避免出现 ValueError,即 Qwen2.5-VL-7B-Instruct "
|
||||
"模型的最大序列长度(128000)大于 KV 缓存可存储的最大 token 数。该数值会根据不同 NPU 系列的 HBM 大小而不同。请根据你的 NPU"
|
||||
" 系列,将该值设置为合适的数值。"
|
||||
|
||||
#: ../../tutorials/single_npu_multimodal.md:157
|
||||
msgid "If your service start successfully, you can see the info shown below:"
|
||||
msgstr "如果你的服务启动成功,你会看到如下所示的信息:"
|
||||
|
||||
#: ../../tutorials/single_npu_multimodal.md:165
|
||||
msgid ""
|
||||
"Once your server is started, you can query the model with input prompts:"
|
||||
msgstr "一旦你的服务器启动,你可以通过输入提示词来查询模型:"
|
||||
|
||||
#: ../../tutorials/single_npu_multimodal.md:182
|
||||
msgid ""
|
||||
"If you query the server successfully, you can see the info shown below "
|
||||
"(client):"
|
||||
msgstr "如果你成功查询了服务器,你可以看到如下所示的信息(客户端):"
|
||||
|
||||
#: ../../tutorials/single_npu_multimodal.md:188
|
||||
msgid "Logs of the vllm server:"
|
||||
msgstr "vllm 服务器的日志:"
|
||||
@@ -0,0 +1,70 @@
|
||||
# SOME DESCRIPTIVE TITLE.
|
||||
# Copyright (C) 2025, vllm-ascend team
|
||||
# This file is distributed under the same license as the vllm-ascend
|
||||
# package.
|
||||
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
|
||||
#
|
||||
#, fuzzy
|
||||
msgid ""
|
||||
msgstr ""
|
||||
"Project-Id-Version: vllm-ascend\n"
|
||||
"Report-Msgid-Bugs-To: \n"
|
||||
"POT-Creation-Date: 2025-07-18 09:01+0800\n"
|
||||
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
|
||||
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
|
||||
"Language-Team: zh_CN <LL@li.org>\n"
|
||||
"Language: zh_CN\n"
|
||||
"MIME-Version: 1.0\n"
|
||||
"Content-Type: text/plain; charset=utf-8\n"
|
||||
"Content-Transfer-Encoding: 8bit\n"
|
||||
"Plural-Forms: nplurals=1; plural=0;\n"
|
||||
"Generated-By: Babel 2.17.0\n"
|
||||
|
||||
#: ../../tutorials/single_npu_qwen3_embedding.md:1
|
||||
msgid "Single NPU (Qwen3-Embedding-8B)"
|
||||
msgstr "单个NPU(Qwen3-Embedding-8B)"
|
||||
|
||||
#: ../../tutorials/single_npu_qwen3_embedding.md:3
|
||||
msgid ""
|
||||
"The Qwen3 Embedding model series is the latest proprietary model of the Qwen"
|
||||
" family, specifically designed for text embedding and ranking tasks. "
|
||||
"Building upon the dense foundational models of the Qwen3 series, it provides"
|
||||
" a comprehensive range of text embeddings and reranking models in various "
|
||||
"sizes (0.6B, 4B, and 8B). This guide describes how to run the model with "
|
||||
"vLLM Ascend. Note that only 0.9.2rc1 and higher versions of vLLM Ascend "
|
||||
"support the model."
|
||||
msgstr ""
|
||||
"Qwen3 Embedding 模型系列是 Qwen 家族最新的专有模型,专为文本嵌入和排序任务设计。在 Qwen3 "
|
||||
"系列的密集基础模型之上,它提供了多种尺寸(0.6B、4B 和 8B)的文本嵌入与重排序模型。本指南介绍如何使用 vLLM Ascend "
|
||||
"运行该模型。请注意,只有 vLLM Ascend 0.9.2rc1 及更高版本才支持该模型。"
|
||||
|
||||
#: ../../tutorials/single_npu_qwen3_embedding.md:5
|
||||
msgid "Run docker container"
|
||||
msgstr "运行 docker 容器"
|
||||
|
||||
#: ../../tutorials/single_npu_qwen3_embedding.md:7
|
||||
msgid ""
|
||||
"Take Qwen3-Embedding-8B model as an example, first run the docker container "
|
||||
"with the following command:"
|
||||
msgstr "以 Qwen3-Embedding-8B 模型为例,首先使用以下命令运行 docker 容器:"
|
||||
|
||||
#: ../../tutorials/single_npu_qwen3_embedding.md:29
|
||||
msgid "Setup environment variables:"
|
||||
msgstr "设置环境变量:"
|
||||
|
||||
#: ../../tutorials/single_npu_qwen3_embedding.md:39
|
||||
msgid "Online Inference"
|
||||
msgstr "在线推理"
|
||||
|
||||
#: ../../tutorials/single_npu_qwen3_embedding.md:45
|
||||
msgid ""
|
||||
"Once your server is started, you can query the model with input prompts"
|
||||
msgstr "一旦服务器启动,就可以通过输入提示词来查询模型。"
|
||||
|
||||
#: ../../tutorials/single_npu_qwen3_embedding.md:56
|
||||
msgid "Offline Inference"
|
||||
msgstr "离线推理"
|
||||
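# Editor's note: a hedged sketch of offline embedding, assuming upstream vLLM's
# pooling API (`task="embed"`, `LLM.embed`); the prompt is illustrative.
#
#     from vllm import LLM
#     llm = LLM(model="Qwen/Qwen3-Embedding-8B", task="embed")
#     vecs = llm.embed(["What is the capital of France?"])
#     print(len(vecs[0].outputs.embedding))  # embedding dimension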
|
||||
#: ../../tutorials/single_npu_qwen3_embedding.md:92
|
||||
msgid "If you run this script successfully, you can see the info shown below:"
|
||||
msgstr "如果你成功运行此脚本,你可以看到如下所示的信息:"
|
||||
@@ -0,0 +1,290 @@
|
||||
# SOME DESCRIPTIVE TITLE.
|
||||
# Copyright (C) 2025, vllm-ascend team
|
||||
# This file is distributed under the same license as the vllm-ascend
|
||||
# package.
|
||||
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
|
||||
#
|
||||
#, fuzzy
|
||||
msgid ""
|
||||
msgstr ""
|
||||
"Project-Id-Version: vllm-ascend\n"
|
||||
"Report-Msgid-Bugs-To: \n"
|
||||
"POT-Creation-Date: 2025-07-18 09:01+0800\n"
|
||||
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
|
||||
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
|
||||
"Language-Team: zh_CN <LL@li.org>\n"
|
||||
"Language: zh_CN\n"
|
||||
"MIME-Version: 1.0\n"
|
||||
"Content-Type: text/plain; charset=utf-8\n"
|
||||
"Content-Transfer-Encoding: 8bit\n"
|
||||
"Plural-Forms: nplurals=1; plural=0;\n"
|
||||
"Generated-By: Babel 2.17.0\n"
|
||||
|
||||
#: ../../user_guide/configuration/additional_config.md:1
|
||||
msgid "Additional Configuration"
|
||||
msgstr "附加配置"
|
||||
|
||||
#: ../../user_guide/configuration/additional_config.md:3
|
||||
msgid ""
|
||||
"additional configuration is a mechanism provided by vLLM to allow plugins to"
|
||||
" control inner behavior by their own. vLLM Ascend uses this mechanism to "
|
||||
"make the project more flexible."
|
||||
msgstr "额外配置是 vLLM 提供的一种机制,允许插件自行控制内部行为。vLLM Ascend 利用这种机制使项目更加灵活。"
|
||||
|
||||
#: ../../user_guide/configuration/additional_config.md:5
|
||||
msgid "How to use"
|
||||
msgstr "如何使用"
|
||||
|
||||
#: ../../user_guide/configuration/additional_config.md:7
|
||||
msgid ""
|
||||
"With either online mode or offline mode, users can use additional "
|
||||
"configuration. Take Qwen3 as an example:"
|
||||
msgstr "无论是在线模式还是离线模式,用户都可以使用额外的配置。以 Qwen3 为例:"
|
||||
|
||||
#: ../../user_guide/configuration/additional_config.md:9
|
||||
msgid "**Online mode**:"
|
||||
msgstr "**在线模式**:"
|
||||
|
||||
#: ../../user_guide/configuration/additional_config.md:15
|
||||
msgid "**Offline mode**:"
|
||||
msgstr "**离线模式**:"
|
||||
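# Editor's note: a hedged sketch of both modes; the option payload is an
# illustrative assumption, not a value recommended by this page.
#
#     # Online:
#     #   vllm serve Qwen/Qwen3-8B \
#     #     --additional-config '{"ascend_scheduler_config": {"enabled": true}}'
#     # Offline:
#     from vllm import LLM
#     llm = LLM(model="Qwen/Qwen3-8B",
#               additional_config={"ascend_scheduler_config": {"enabled": True}})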
|
||||
#: ../../user_guide/configuration/additional_config.md:23
|
||||
msgid "Configuration options"
|
||||
msgstr "配置选项"
|
||||
|
||||
#: ../../user_guide/configuration/additional_config.md:25
|
||||
msgid ""
|
||||
"The following table lists the additional configuration options available in "
|
||||
"vLLM Ascend:"
|
||||
msgstr "下表列出了 vLLM Ascend 中可用的其他配置选项:"
|
||||
|
||||
#: ../../user_guide/configuration/additional_config.md
|
||||
msgid "Name"
|
||||
msgstr "名称"
|
||||
|
||||
#: ../../user_guide/configuration/additional_config.md
|
||||
msgid "Type"
|
||||
msgstr "类型"
|
||||
|
||||
#: ../../user_guide/configuration/additional_config.md
|
||||
msgid "Default"
|
||||
msgstr "默认"
|
||||
|
||||
#: ../../user_guide/configuration/additional_config.md
|
||||
msgid "Description"
|
||||
msgstr "描述"
|
||||
|
||||
#: ../../user_guide/configuration/additional_config.md
|
||||
msgid "`torchair_graph_config`"
|
||||
msgstr "`torchair_graph_config`"
|
||||
|
||||
#: ../../user_guide/configuration/additional_config.md
|
||||
msgid "dict"
|
||||
msgstr "dict"
|
||||
|
||||
#: ../../user_guide/configuration/additional_config.md
|
||||
#, python-brace-format
|
||||
msgid "`{}`"
|
||||
msgstr "`{}`"
|
||||
|
||||
#: ../../user_guide/configuration/additional_config.md
|
||||
msgid "The config options for torchair graph mode"
|
||||
msgstr "torchair 图模式的配置选项"
|
||||
|
||||
#: ../../user_guide/configuration/additional_config.md
|
||||
msgid "`ascend_scheduler_config`"
|
||||
msgstr "`ascend_scheduler_config`"
|
||||
|
||||
#: ../../user_guide/configuration/additional_config.md
|
||||
msgid "The config options for ascend scheduler"
|
||||
msgstr "ascend 调度器的配置选项"
|
||||
|
||||
#: ../../user_guide/configuration/additional_config.md
|
||||
msgid "`expert_tensor_parallel_size`"
|
||||
msgstr "`expert_tensor_parallel_size`"
|
||||
|
||||
#: ../../user_guide/configuration/additional_config.md
|
||||
msgid "str"
|
||||
msgstr "str"
|
||||
|
||||
#: ../../user_guide/configuration/additional_config.md
|
||||
msgid "`0`"
|
||||
msgstr "`0`"
|
||||
|
||||
#: ../../user_guide/configuration/additional_config.md
|
||||
msgid "Expert tensor parallel size the model to use."
|
||||
msgstr "专家张量并行的模型大小设置。"
|
||||
|
||||
#: ../../user_guide/configuration/additional_config.md
|
||||
msgid "`refresh`"
|
||||
msgstr "`刷新`"
|
||||
|
||||
#: ../../user_guide/configuration/additional_config.md
|
||||
msgid "bool"
|
||||
msgstr "bool"
|
||||
|
||||
#: ../../user_guide/configuration/additional_config.md
|
||||
msgid "`false`"
|
||||
msgstr "`false`"
|
||||
|
||||
#: ../../user_guide/configuration/additional_config.md
|
||||
msgid ""
|
||||
"Whether to refresh global ascend config content. This value is usually used "
|
||||
"by rlhf or ut/e2e test case."
|
||||
msgstr "是否刷新全局 ascend 配置信息。此值通常由 rlhf 或 ut/e2e 测试用例使用。"
|
||||
|
||||
#: ../../user_guide/configuration/additional_config.md
|
||||
msgid "`expert_map_path`"
|
||||
msgstr "`expert_map_path`"
|
||||
|
||||
#: ../../user_guide/configuration/additional_config.md
|
||||
msgid "`None`"
|
||||
msgstr "`None`"
|
||||
|
||||
#: ../../user_guide/configuration/additional_config.md
|
||||
msgid ""
|
||||
"When using expert load balancing for the MOE model, an expert map path needs"
|
||||
" to be passed in."
|
||||
msgstr "在为MOE模型使用专家负载均衡时,需要传入专家映射路径。"
|
||||
|
||||
#: ../../user_guide/configuration/additional_config.md
|
||||
msgid "`chunked_prefill_for_mla`"
|
||||
msgstr "`chunked_prefill_for_mla`"
|
||||
|
||||
#: ../../user_guide/configuration/additional_config.md
|
||||
msgid "`False`"
|
||||
msgstr "`False`"
|
||||
|
||||
#: ../../user_guide/configuration/additional_config.md
|
||||
msgid "Whether to enable the fused operator-like chunked_prefill."
|
||||
msgstr "是否启用类似算子融合的 chunked_prefill 功能。"
|
||||
|
||||
#: ../../user_guide/configuration/additional_config.md
|
||||
msgid "`kv_cache_dtype`"
|
||||
msgstr "`kv_cache_dtype`"
|
||||
|
||||
#: ../../user_guide/configuration/additional_config.md
|
||||
msgid ""
|
||||
"When using the kv cache quantization method, kv cache dtype needs to be set,"
|
||||
" currently only int8 is supported."
|
||||
msgstr "当使用kv缓存量化方法时,需要设置kv缓存的数据类型,目前仅支持int8。"
|
||||
|
||||
#: ../../user_guide/configuration/additional_config.md:37
|
||||
msgid "The details of each config option are as follows:"
|
||||
msgstr "每个配置选项的详细信息如下:"
|
||||
|
||||
#: ../../user_guide/configuration/additional_config.md:39
|
||||
msgid "**torchair_graph_config**"
|
||||
msgstr "**torchair_graph_config**"
|
||||
|
||||
#: ../../user_guide/configuration/additional_config.md
|
||||
msgid "`enabled`"
|
||||
msgstr "`启用`"
|
||||
|
||||
#: ../../user_guide/configuration/additional_config.md
|
||||
msgid ""
|
||||
"Whether to enable torchair graph mode. Currently only DeepSeek series models"
|
||||
" and PanguProMoE are supported to use torchair graph mode"
|
||||
msgstr "是否启用 torchair 图模式。目前仅支持 DeepSeek 系列模型和 PanguProMoE 使用 torchair 图模式。"
|
||||
|
||||
#: ../../user_guide/configuration/additional_config.md
|
||||
msgid "`enable_multistream_mla`"
|
||||
msgstr "`enable_multistream_mla`"
|
||||
|
||||
#: ../../user_guide/configuration/additional_config.md
|
||||
msgid ""
|
||||
"Whether to put vector ops of MLA to another stream. This option only takes "
|
||||
"effects on models using MLA (e.g., DeepSeek)."
|
||||
msgstr "是否将MLA的向量操作放到另一个流中。此选项仅对使用MLA的模型(例如,DeepSeek)有效。"
|
||||
|
||||
#: ../../user_guide/configuration/additional_config.md
|
||||
msgid "`enable_multistream_moe`"
|
||||
msgstr "`enable_multistream_moe`"
|
||||
|
||||
#: ../../user_guide/configuration/additional_config.md
|
||||
msgid ""
|
||||
"Whether to enable multistream shared expert. This option only takes effects "
|
||||
"on DeepSeek moe models."
|
||||
msgstr "是否启用多流共享专家功能。此选项仅对 DeepSeek MoE 模型生效。"
|
||||
|
||||
#: ../../user_guide/configuration/additional_config.md
|
||||
msgid "`enable_view_optimize`"
|
||||
msgstr "`enable_view_optimize` (启用视图优化)"
|
||||
|
||||
#: ../../user_guide/configuration/additional_config.md
|
||||
msgid "`True`"
|
||||
msgstr "`True`"
|
||||
|
||||
#: ../../user_guide/configuration/additional_config.md
|
||||
msgid "Whether to enable torchair view optimization"
|
||||
msgstr "是否启用torchair视图优化"
|
||||
|
||||
#: ../../user_guide/configuration/additional_config.md
|
||||
msgid "`use_cached_graph`"
|
||||
msgstr "`use_cached_graph`"
|
||||
|
||||
#: ../../user_guide/configuration/additional_config.md
|
||||
msgid "Whether to use cached graph"
|
||||
msgstr "是否使用缓存的图"
|
||||
|
||||
#: ../../user_guide/configuration/additional_config.md
|
||||
msgid "`graph_batch_sizes`"
|
||||
msgstr "`graph_batch_sizes`"
|
||||
|
||||
#: ../../user_guide/configuration/additional_config.md
|
||||
msgid "list[int]"
|
||||
msgstr "list[int]"
|
||||
|
||||
#: ../../user_guide/configuration/additional_config.md
|
||||
msgid "`[]`"
|
||||
msgstr "`[]`"
|
||||
|
||||
#: ../../user_guide/configuration/additional_config.md
|
||||
msgid "The batch size for torchair graph cache"
|
||||
msgstr "torchair 图缓存的批量大小"
|
||||
|
||||
#: ../../user_guide/configuration/additional_config.md
|
||||
msgid "`graph_batch_sizes_init`"
|
||||
msgstr "`graph_batch_sizes_init`"
|
||||
|
||||
#: ../../user_guide/configuration/additional_config.md
|
||||
msgid "Init graph batch size dynamically if `graph_batch_sizes` is empty"
|
||||
msgstr "如果 `graph_batch_sizes` 为空,则动态初始化图批大小"
|
||||
|
||||
#: ../../user_guide/configuration/additional_config.md
|
||||
msgid "`enable_kv_nz`"
|
||||
msgstr "`enable_kv_nz`"
|
||||
|
||||
#: ../../user_guide/configuration/additional_config.md
|
||||
msgid ""
|
||||
"Whether to enable kvcache NZ layout. This option only takes effects on "
|
||||
"models using MLA (e.g., DeepSeek)."
|
||||
msgstr "是否启用 kvcache NZ 布局。此选项仅对使用 MLA 的模型(例如 DeepSeek)生效。"
|
||||
|
||||
#: ../../user_guide/configuration/additional_config.md:52
|
||||
msgid "**ascend_scheduler_config**"
|
||||
msgstr "**ascend_scheduler_config**"
|
||||
|
||||
#: ../../user_guide/configuration/additional_config.md
|
||||
msgid "Whether to enable ascend scheduler for V1 engine"
|
||||
msgstr "是否为 V1 引擎启用 ascend 调度器"
|
||||
|
||||
#: ../../user_guide/configuration/additional_config.md:58
|
||||
msgid ""
|
||||
"ascend_scheduler_config also support the options from [vllm scheduler "
|
||||
"config](https://docs.vllm.ai/en/stable/api/vllm/config.html#vllm.config.SchedulerConfig)."
|
||||
" For example, you can add `enable_chunked_prefill: True` to "
|
||||
"ascend_scheduler_config as well."
|
||||
msgstr ""
|
||||
"ascend_scheduler_config 也支持来自 [vllm scheduler "
|
||||
"config](https://docs.vllm.ai/en/stable/api/vllm/config.html#vllm.config.SchedulerConfig)"
|
||||
" 的选项。例如,你也可以在 ascend_scheduler_config 中添加 `enable_chunked_prefill: True`。"
|
||||
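# Editor's note: a hedged sketch of nesting a vLLM SchedulerConfig option inside
# ascend_scheduler_config, as the entry above allows; values are illustrative.
#
#     from vllm import LLM
#     llm = LLM(model="Qwen/Qwen3-8B",
#               additional_config={
#                   "ascend_scheduler_config": {
#                       "enabled": True,
#                       "enable_chunked_prefill": True,
#                   },
#               })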
|
||||
#: ../../user_guide/configuration/additional_config.md:60
|
||||
msgid "Example"
|
||||
msgstr "示例"
|
||||
|
||||
#: ../../user_guide/configuration/additional_config.md:62
|
||||
msgid "An example of additional configuration is as follows:"
|
||||
msgstr "以下是额外配置的一个示例:"
|
||||
@@ -0,0 +1,28 @@
|
||||
# Translations template for PROJECT.
|
||||
# Copyright (C) 2025 ORGANIZATION
|
||||
# This file is distributed under the same license as the PROJECT project.
|
||||
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
|
||||
#
|
||||
#, fuzzy
|
||||
msgid ""
|
||||
msgstr ""
|
||||
"Project-Id-Version: PROJECT VERSION\n"
|
||||
"Report-Msgid-Bugs-To: EMAIL@ADDRESS\n"
|
||||
"POT-Creation-Date: 2025-07-18 09:01+0800\n"
|
||||
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
|
||||
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
|
||||
"Language-Team: LANGUAGE <LL@li.org>\n"
|
||||
"MIME-Version: 1.0\n"
|
||||
"Content-Type: text/plain; charset=utf-8\n"
|
||||
"Content-Transfer-Encoding: 8bit\n"
|
||||
"Generated-By: Babel 2.17.0\n"
|
||||
|
||||
#: ../../user_guide/configuration/env_vars.md:1
|
||||
msgid "Environment Variables"
|
||||
msgstr "环境变量"
|
||||
|
||||
#: ../../user_guide/configuration/env_vars.md:3
|
||||
msgid ""
|
||||
"vllm-ascend uses the following environment variables to configure the "
|
||||
"system:"
|
||||
msgstr "vllm-ascend 使用以下环境变量来配置系统:"
|
||||
@@ -0,0 +1,30 @@
|
||||
# SOME DESCRIPTIVE TITLE.
|
||||
# Copyright (C) 2025, vllm-ascend team
|
||||
# This file is distributed under the same license as the vllm-ascend
|
||||
# package.
|
||||
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
|
||||
#
|
||||
#, fuzzy
|
||||
msgid ""
|
||||
msgstr ""
|
||||
"Project-Id-Version: vllm-ascend\n"
|
||||
"Report-Msgid-Bugs-To: \n"
|
||||
"POT-Creation-Date: 2025-07-18 09:01+0800\n"
|
||||
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
|
||||
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
|
||||
"Language-Team: zh_CN <LL@li.org>\n"
|
||||
"Language: zh_CN\n"
|
||||
"MIME-Version: 1.0\n"
|
||||
"Content-Type: text/plain; charset=utf-8\n"
|
||||
"Content-Transfer-Encoding: 8bit\n"
|
||||
"Plural-Forms: nplurals=1; plural=0;\n"
|
||||
"Generated-By: Babel 2.17.0\n"
|
||||
|
||||
#: ../../user_guide/configuration/index.md:1
|
||||
#: ../../user_guide/configuration/index.md:5
|
||||
msgid "Configuration Guide"
|
||||
msgstr "配置指南"
|
||||
|
||||
#: ../../user_guide/configuration/index.md:3
|
||||
msgid "This section provides a detailed configuration guide of vLLM Ascend."
|
||||
msgstr "本节提供了 vLLM Ascend 的详细配置指南。"
|
||||
@@ -0,0 +1,121 @@
|
||||
# Translations template for PROJECT.
|
||||
# Copyright (C) 2025 ORGANIZATION
|
||||
# This file is distributed under the same license as the PROJECT project.
|
||||
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
|
||||
#
|
||||
#, fuzzy
|
||||
msgid ""
|
||||
msgstr ""
|
||||
"Project-Id-Version: PROJECT VERSION\n"
|
||||
"Report-Msgid-Bugs-To: EMAIL@ADDRESS\n"
|
||||
"POT-Creation-Date: 2025-07-18 09:01+0800\n"
|
||||
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
|
||||
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
|
||||
"Language-Team: LANGUAGE <LL@li.org>\n"
|
||||
"MIME-Version: 1.0\n"
|
||||
"Content-Type: text/plain; charset=utf-8\n"
|
||||
"Content-Transfer-Encoding: 8bit\n"
|
||||
"Generated-By: Babel 2.17.0\n"
|
||||
|
||||
#: ../../user_guide/feature_guide/graph_mode.md:1
|
||||
msgid "Graph Mode Guide"
|
||||
msgstr "图模式指南"
|
||||
|
||||
#: ../../user_guide/feature_guide/graph_mode.md:4
|
||||
msgid ""
|
||||
"This feature is currently experimental. In future versions, there may be "
|
||||
"behavioral changes around configuration, coverage, performance improvement."
|
||||
msgstr "此功能目前为实验性功能。在未来的版本中,配置、覆盖率和性能改进等方面的行为可能会有变化。"
|
||||
|
||||
#: ../../user_guide/feature_guide/graph_mode.md:7
|
||||
msgid ""
|
||||
"This guide provides instructions for using Ascend Graph Mode with vLLM "
|
||||
"Ascend. Please note that graph mode is only available on V1 Engine. And only"
|
||||
" Qwen, DeepSeek series models are well tested from 0.9.0rc1. We'll make it "
|
||||
"stable and generalize in the next release."
|
||||
msgstr ""
|
||||
"本指南提供了在 vLLM Ascend 上使用 Ascend 图模式的操作说明。请注意,图模式仅在 V1 引擎上可用,并且从 0.9.0rc1 起,仅对"
|
||||
" Qwen、DeepSeek 系列模型进行了充分测试。我们将在下一个版本中使其更加稳定和通用。"
|
||||
|
||||
#: ../../user_guide/feature_guide/graph_mode.md:9
|
||||
msgid "Getting Started"
|
||||
msgstr "快速入门"
|
||||
|
||||
#: ../../user_guide/feature_guide/graph_mode.md:11
|
||||
msgid ""
|
||||
"From v0.9.1rc1 with V1 Engine, vLLM Ascend will run models in graph mode by "
|
||||
"default to keep the same behavior with vLLM. If you hit any issues, please "
|
||||
"feel free to open an issue on GitHub and fallback to eager mode temporarily "
|
||||
"by set `enforce_eager=True` when initializing the model."
|
||||
msgstr ""
|
||||
"从 v0.9.1rc1 版本起,使用 V1 引擎时,vLLM Ascend 默认将在图模式下运行模型,以保持与 vLLM "
|
||||
"同样的行为。如果遇到任何问题,欢迎在 GitHub 上提交 issue,并在初始化模型时通过设置 `enforce_eager=True` 临时切换回 "
|
||||
"eager 模式。"
|
||||
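# Editor's note: a hedged sketch of the temporary fallback described above.
#
#     from vllm import LLM
#     llm = LLM(model="Qwen/Qwen3-8B", enforce_eager=True)  # skip graph capture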
|
||||
#: ../../user_guide/feature_guide/graph_mode.md:13
|
||||
msgid "There are two kinds for graph mode supported by vLLM Ascend:"
|
||||
msgstr "vLLM Ascend 支持两种图模式:"
|
||||
|
||||
#: ../../user_guide/feature_guide/graph_mode.md:14
|
||||
msgid ""
|
||||
"**ACLGraph**: This is the default graph mode supported by vLLM Ascend. In "
|
||||
"v0.9.1rc1, only Qwen series models are well tested."
|
||||
msgstr ""
|
||||
"**ACLGraph**:这是 vLLM Ascend 支持的默认图模式。在 v0.9.1rc1 版本中,只有 Qwen 系列模型得到了充分测试。"
|
||||
|
||||
#: ../../user_guide/feature_guide/graph_mode.md:15
|
||||
msgid ""
|
||||
"**TorchAirGraph**: This is the GE graph mode. In v0.9.1rc1, only DeepSeek "
|
||||
"series models are supported."
|
||||
msgstr "**TorchAirGraph**:这是GE图模式。在v0.9.1rc1版本中,仅支持DeepSeek系列模型。"
|
||||
|
||||
#: ../../user_guide/feature_guide/graph_mode.md:17
|
||||
msgid "Using ACLGraph"
|
||||
msgstr "使用 ACLGraph"
|
||||
|
||||
#: ../../user_guide/feature_guide/graph_mode.md:18
|
||||
msgid ""
|
||||
"ACLGraph is enabled by default. Take Qwen series models as an example, just "
|
||||
"set to use V1 Engine is enough."
|
||||
msgstr "ACLGraph 默认启用。以 Qwen 系列模型为例,只需设置为使用 V1 引擎即可。"
|
||||
|
||||
#: ../../user_guide/feature_guide/graph_mode.md:20
|
||||
#: ../../user_guide/feature_guide/graph_mode.md:41
|
||||
#: ../../user_guide/feature_guide/graph_mode.md:64
|
||||
msgid "offline example:"
|
||||
msgstr "离线示例:"
|
||||
|
||||
#: ../../user_guide/feature_guide/graph_mode.md:31
|
||||
#: ../../user_guide/feature_guide/graph_mode.md:52
|
||||
#: ../../user_guide/feature_guide/graph_mode.md:74
|
||||
msgid "online example:"
|
||||
msgstr "在线示例:"
|
||||
|
||||
#: ../../user_guide/feature_guide/graph_mode.md:37
|
||||
msgid "Using TorchAirGraph"
|
||||
msgstr "使用 TorchAirGraph"
|
||||
|
||||
#: ../../user_guide/feature_guide/graph_mode.md:39
|
||||
msgid ""
|
||||
"If you want to run DeepSeek series models with graph mode, you should use "
|
||||
"[TorchAirGraph](https://www.hiascend.com/document/detail/zh/Pytorch/700/modthirdparty/torchairuseguide/torchair_0002.html)."
|
||||
" In this case, additional config is required."
|
||||
msgstr ""
|
||||
"如果你想通过图模式运行 DeepSeek 系列模型,你应该使用 "
|
||||
"[TorchAirGraph](https://www.hiascend.com/document/detail/zh/Pytorch/700/modthirdparty/torchairuseguide/torchair_0002.html)。在这种情况下,需要额外的配置。"
|
||||
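# Editor's note: a hedged sketch of the additional config that TorchAirGraph
# needs per the entry above; the DeepSeek checkpoint path is an assumption.
#
#     from vllm import LLM
#     llm = LLM(model="deepseek-ai/DeepSeek-V2-Lite",
#               additional_config={"torchair_graph_config": {"enabled": True}})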
|
||||
#: ../../user_guide/feature_guide/graph_mode.md:58
|
||||
msgid ""
|
||||
"You can find more detail about additional config "
|
||||
"[here](../configuration/additional_config.md)."
|
||||
msgstr "你可以在[这里](../configuration/additional_config.md)找到关于附加配置的更多详细信息。"
|
||||
|
||||
#: ../../user_guide/feature_guide/graph_mode.md:60
|
||||
msgid "Fallback to Eager Mode"
|
||||
msgstr "回退到 Eager 模式"
|
||||
|
||||
#: ../../user_guide/feature_guide/graph_mode.md:62
|
||||
msgid ""
|
||||
"If both `ACLGraph` and `TorchAirGraph` fail to run, you should fallback to "
|
||||
"eager mode."
|
||||
msgstr "如果 `ACLGraph` 和 `TorchAirGraph` 都无法运行,你应该退回到 eager 模式。"
|
||||
@@ -0,0 +1,30 @@
|
||||
# SOME DESCRIPTIVE TITLE.
|
||||
# Copyright (C) 2025, vllm-ascend team
|
||||
# This file is distributed under the same license as the vllm-ascend
|
||||
# package.
|
||||
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
|
||||
#
|
||||
#, fuzzy
|
||||
msgid ""
|
||||
msgstr ""
|
||||
"Project-Id-Version: vllm-ascend\n"
|
||||
"Report-Msgid-Bugs-To: \n"
|
||||
"POT-Creation-Date: 2025-07-18 09:01+0800\n"
|
||||
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
|
||||
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
|
||||
"Language-Team: zh_CN <LL@li.org>\n"
|
||||
"Language: zh_CN\n"
|
||||
"MIME-Version: 1.0\n"
|
||||
"Content-Type: text/plain; charset=utf-8\n"
|
||||
"Content-Transfer-Encoding: 8bit\n"
|
||||
"Plural-Forms: nplurals=1; plural=0;\n"
|
||||
"Generated-By: Babel 2.17.0\n"
|
||||
|
||||
#: ../../user_guide/feature_guide/index.md:1
|
||||
#: ../../user_guide/feature_guide/index.md:5
|
||||
msgid "Feature Guide"
|
||||
msgstr "功能指南"
|
||||
|
||||
#: ../../user_guide/feature_guide/index.md:3
|
||||
msgid "This section provides a detailed usage guide of vLLM Ascend features."
|
||||
msgstr "本节提供了 vLLM Ascend 功能的详细使用指南。"
|
||||
@@ -0,0 +1,58 @@
|
||||
# SOME DESCRIPTIVE TITLE.
|
||||
# Copyright (C) 2025, vllm-ascend team
|
||||
# This file is distributed under the same license as the vllm-ascend
|
||||
# package.
|
||||
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
|
||||
#
|
||||
#, fuzzy
|
||||
msgid ""
|
||||
msgstr ""
|
||||
"Project-Id-Version: vllm-ascend\n"
|
||||
"Report-Msgid-Bugs-To: \n"
|
||||
"POT-Creation-Date: 2025-07-18 09:01+0800\n"
|
||||
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
|
||||
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
|
||||
"Language-Team: zh_CN <LL@li.org>\n"
|
||||
"Language: zh_CN\n"
|
||||
"MIME-Version: 1.0\n"
|
||||
"Content-Type: text/plain; charset=utf-8\n"
|
||||
"Content-Transfer-Encoding: 8bit\n"
|
||||
"Plural-Forms: nplurals=1; plural=0;\n"
|
||||
"Generated-By: Babel 2.17.0\n"
|
||||
|
||||
#: ../../user_guide/feature_guide/lora.md:1
|
||||
msgid "LoRA Adapters Guide"
|
||||
msgstr "LoRA 适配器指南"
|
||||
|
||||
#: ../../user_guide/feature_guide/lora.md:3
|
||||
msgid ""
|
||||
"Like vLLM, vllm-ascend supports LoRA as well. The usage and more details can"
|
||||
" be found in [vLLM official "
|
||||
"document](https://docs.vllm.ai/en/latest/features/lora.html)."
|
||||
msgstr ""
|
||||
"与 vLLM 类似,vllm-ascend 也支持 LoRA。用法及更多详情可参见 [vLLM "
|
||||
"官方文档](https://docs.vllm.ai/en/latest/features/lora.html)。"
|
||||
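# Editor's note: a hedged sketch of the standard vLLM LoRA flow referenced above;
# the base model and local adapter path are assumptions.
#
#     from vllm import LLM, SamplingParams
#     from vllm.lora.request import LoRARequest
#     llm = LLM(model="Qwen/Qwen2.5-7B-Instruct", enable_lora=True)
#     out = llm.generate(["Hello"],
#                        SamplingParams(max_tokens=16),
#                        lora_request=LoRARequest("my_adapter", 1,
#                                                 "/path/to/adapter"))  # assumed path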
|
||||
#: ../../user_guide/feature_guide/lora.md:5
|
||||
msgid ""
|
||||
"You can also refer to "
|
||||
"[this](https://docs.vllm.ai/en/latest/models/supported_models.html#list-of-"
|
||||
"text-only-language-models) to find which models support LoRA in vLLM."
|
||||
msgstr ""
|
||||
"你也可以参考[这个链接](https://docs.vllm.ai/en/latest/models/supported_models.html#list-"
|
||||
"of-text-only-language-models)来查找哪些模型在 vLLM 中支持 LoRA。"
|
||||
|
||||
#: ../../user_guide/feature_guide/lora.md:7
|
||||
msgid "Tips"
|
||||
msgstr "提示"
|
||||
|
||||
#: ../../user_guide/feature_guide/lora.md:8
|
||||
msgid ""
|
||||
"If you fail to run vllm-ascend with LoRA, you may follow [this "
|
||||
"instruction](https://vllm-"
|
||||
"ascend.readthedocs.io/en/latest/user_guide/feature_guide/graph_mode.html#fallback-"
|
||||
"to-eager-mode) to disable graph mode and try again."
|
||||
msgstr ""
|
||||
"如果你在使用 LoRA 运行 vllm-ascend 时失败,可以按照[此说明](https://vllm-"
|
||||
"ascend.readthedocs.io/en/latest/user_guide/feature_guide/graph_mode.html#fallback-"
|
||||
"to-eager-mode)禁用图模式后再重试。"
|
||||
@@ -0,0 +1,183 @@
|
||||
# SOME DESCRIPTIVE TITLE.
|
||||
# Copyright (C) 2025, vllm-ascend team
|
||||
# This file is distributed under the same license as the vllm-ascend
|
||||
# package.
|
||||
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
|
||||
#
|
||||
#, fuzzy
|
||||
msgid ""
|
||||
msgstr ""
|
||||
"Project-Id-Version: vllm-ascend\n"
|
||||
"Report-Msgid-Bugs-To: \n"
|
||||
"POT-Creation-Date: 2025-07-18 09:01+0800\n"
|
||||
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
|
||||
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
|
||||
"Language-Team: zh_CN <LL@li.org>\n"
|
||||
"Language: zh_CN\n"
|
||||
"MIME-Version: 1.0\n"
|
||||
"Content-Type: text/plain; charset=utf-8\n"
|
||||
"Content-Transfer-Encoding: 8bit\n"
|
||||
"Plural-Forms: nplurals=1; plural=0;\n"
|
||||
"Generated-By: Babel 2.17.0\n"
|
||||
|
||||
#: ../../user_guide/feature_guide/quantization.md:1
|
||||
msgid "Quantization Guide"
|
||||
msgstr "量化指南"
|
||||
|
||||
#: ../../user_guide/feature_guide/quantization.md:3
|
||||
msgid ""
|
||||
"Model quantization is a technique that reduces the size and computational "
|
||||
"requirements of a model by lowering the data precision of the weights and "
|
||||
"activation values in the model, thereby saving the memory and improving the "
|
||||
"inference speed."
|
||||
msgstr "模型量化是一种通过降低模型中权重和激活值的数据精度,从而减少模型大小和计算需求的技术,这样可以节省内存并提高推理速度。"
|
||||
|
||||
#: ../../user_guide/feature_guide/quantization.md:5
|
||||
msgid ""
|
||||
"Since 0.9.0rc2 version, quantization feature is experimentally supported in "
|
||||
"vLLM Ascend. Users can enable quantization feature by specifying "
|
||||
"`--quantization ascend`. Currently, only Qwen, DeepSeek series models are "
|
||||
"well tested. We’ll support more quantization algorithm and models in the "
|
||||
"future."
|
||||
msgstr ""
|
||||
"自 0.9.0rc2 版本起,vLLM Ascend 实验性地支持量化特性。用户可以通过指定 `--quantization ascend` "
|
||||
"启用量化功能。目前,只有 Qwen、DeepSeek 系列模型经过了充分测试。未来我们将支持更多的量化算法和模型。"
|
||||
|
||||
#: ../../user_guide/feature_guide/quantization.md:7
|
||||
msgid "Install modelslim"
|
||||
msgstr "安装 modelslim"
|
||||
|
||||
#: ../../user_guide/feature_guide/quantization.md:9
|
||||
msgid ""
|
||||
"To quantize a model, users should install "
|
||||
"[ModelSlim](https://gitee.com/ascend/msit/blob/master/msmodelslim/README.md)"
|
||||
" which is the Ascend compression and acceleration tool. It is an affinity-"
|
||||
"based compression tool designed for acceleration, using compression as its "
|
||||
"core technology and built upon the Ascend platform."
|
||||
msgstr ""
|
||||
"要对模型进行量化,用户应安装[ModelSlim](https://gitee.com/ascend/msit/blob/master/msmodelslim/README.md),这是昇腾的压缩与加速工具。它是一种基于亲和性的压缩工具,专为加速设计,以压缩为核心技术,并基于昇腾平台构建。"
|
||||
|
||||
#: ../../user_guide/feature_guide/quantization.md:11
|
||||
msgid ""
|
||||
"Currently, only the specific tag [modelslim-"
|
||||
"VLLM-8.1.RC1.b020_001](https://gitee.com/ascend/msit/blob/modelslim-"
|
||||
"VLLM-8.1.RC1.b020_001/msmodelslim/README.md) of modelslim works with vLLM "
|
||||
"Ascend. Please do not install other version until modelslim master version "
|
||||
"is available for vLLM Ascend in the future."
|
||||
msgstr ""
|
||||
"目前,只有 modelslim 的特定标签 [modelslim-"
|
||||
"VLLM-8.1.RC1.b020_001](https://gitee.com/ascend/msit/blob/modelslim-"
|
||||
"VLLM-8.1.RC1.b020_001/msmodelslim/README.md) 支持 vLLM Ascend。在未来 modelslim "
|
||||
"的主版本支持 vLLM Ascend 之前,请不要安装其他版本。"
|
||||
|
||||
#: ../../user_guide/feature_guide/quantization.md:13
|
||||
msgid "Install modelslim:"
|
||||
msgstr "安装 modelslim:"
|
||||
|
||||
#: ../../user_guide/feature_guide/quantization.md:21
|
||||
msgid "Quantize model"
|
||||
msgstr "量化模型"
|
||||
|
||||
#: ../../user_guide/feature_guide/quantization.md:23
|
||||
#, python-format
|
||||
msgid ""
|
||||
"Take [DeepSeek-V2-Lite](https://modelscope.cn/models/deepseek-"
|
||||
"ai/DeepSeek-V2-Lite) as an example, you just need to download the model, and"
|
||||
" then execute the convert command. The command is shown below. More info can"
|
||||
" be found in modelslim doc [deepseek w8a8 dynamic quantization "
|
||||
"docs](https://gitee.com/ascend/msit/blob/modelslim-"
|
||||
"VLLM-8.1.RC1.b020_001/msmodelslim/example/DeepSeek/README.md#deepseek-v2-w8a8-dynamic%E9%87%8F%E5%8C%96)."
|
||||
msgstr ""
|
||||
"以 [DeepSeek-V2-Lite](https://modelscope.cn/models/deepseek-"
|
||||
"ai/DeepSeek-V2-Lite) 为例,你只需要下载模型,然后执行转换命令。命令如下所示。更多信息可参考 modelslim 文档 "
|
||||
"[deepseek w8a8 动态量化文档](https://gitee.com/ascend/msit/blob/modelslim-"
|
||||
"VLLM-8.1.RC1.b020_001/msmodelslim/example/DeepSeek/README.md#deepseek-v2-w8a8-dynamic%E9%87%8F%E5%8C%96)。"
|
||||
|
||||
#: ../../user_guide/feature_guide/quantization.md:32
|
||||
msgid ""
|
||||
"You can also download the quantized model that we uploaded. Please note that"
|
||||
" these weights should be used for test only. For example, "
|
||||
"https://www.modelscope.cn/models/vllm-ascend/DeepSeek-V2-Lite-W8A8"
|
||||
msgstr ""
|
||||
"你也可以下载我们上传的量化模型。请注意,这些权重仅应用于测试。例如:https://www.modelscope.cn/models/vllm-"
|
||||
"ascend/DeepSeek-V2-Lite-W8A8"
|
||||
|
||||
#: ../../user_guide/feature_guide/quantization.md:35
|
||||
msgid "Once convert action is done, there are two important files generated."
|
||||
msgstr "转换操作完成后,会生成两个重要的文件。"
|
||||
|
||||
#: ../../user_guide/feature_guide/quantization.md:37
|
||||
msgid ""
|
||||
"[config.json](https://www.modelscope.cn/models/vllm-"
|
||||
"ascend/DeepSeek-V2-Lite-W8A8/file/view/master/config.json?status=1). Please "
|
||||
"make sure that there is no `quantization_config` field in it."
|
||||
msgstr ""
|
||||
"[config.json](https://www.modelscope.cn/models/vllm-"
|
||||
"ascend/DeepSeek-V2-Lite-W8A8/file/view/master/config.json?status=1)。请确保其中没有 "
|
||||
"`quantization_config` 字段。"
|
||||
|
||||
#: ../../user_guide/feature_guide/quantization.md:39
|
||||
msgid ""
|
||||
"[quant_model_description.json](https://www.modelscope.cn/models/vllm-"
|
||||
"ascend/DeepSeek-V2-Lite-W8A8/file/view/master/quant_model_description.json?status=1)."
|
||||
" All the converted weights info are recorded in this file."
|
||||
msgstr ""
|
||||
"[quant_model_description.json](https://www.modelscope.cn/models/vllm-"
|
||||
"ascend/DeepSeek-V2-Lite-W8A8/file/view/master/quant_model_description.json?status=1)。所有被转换的权重信息都记录在该文件中。"
|
||||
|
||||
#: ../../user_guide/feature_guide/quantization.md:41
|
||||
msgid "Here is the full converted model files:"
|
||||
msgstr "以下是完整转换后的模型文件:"
|
||||
|
||||
#: ../../user_guide/feature_guide/quantization.md:60
|
||||
msgid "Run the model"
|
||||
msgstr "运行模型"
|
||||
|
||||
#: ../../user_guide/feature_guide/quantization.md:62
|
||||
msgid ""
|
||||
"Now, you can run the quantized models with vLLM Ascend. Here is the example "
|
||||
"for online and offline inference."
|
||||
msgstr "现在,你可以使用 vLLM Ascend 运行量化模型。下面是在线和离线推理的示例。"
|
||||
|
||||
#: ../../user_guide/feature_guide/quantization.md:64
|
||||
msgid "Offline inference"
|
||||
msgstr "离线推理"
|
||||
|
||||
#: ../../user_guide/feature_guide/quantization.md:90
|
||||
msgid "Online inference"
|
||||
msgstr "在线推理"
|
||||
|
||||
#: ../../user_guide/feature_guide/quantization.md:97
|
||||
msgid "FAQs"
|
||||
msgstr "常见问题解答"
|
||||
|
||||
#: ../../user_guide/feature_guide/quantization.md:99
|
||||
msgid ""
|
||||
"1. How to solve the KeyError: 'xxx.layers.0.self_attn.q_proj.weight' "
|
||||
"problem?"
|
||||
msgstr "1. 如何解决 KeyError: 'xxx.layers.0.self_attn.q_proj.weight' 问题?"
|
||||
|
||||
#: ../../user_guide/feature_guide/quantization.md:101
|
||||
msgid ""
|
||||
"First, make sure you specify `ascend` quantization method. Second, check if "
|
||||
"your model is converted by this `modelslim-VLLM-8.1.RC1.b020_001` modelslim "
|
||||
"version. Finally, if it still doesn't work, please submit a issue, maybe "
|
||||
"some new models need to be adapted."
|
||||
msgstr ""
|
||||
"首先,请确保你指定了 `ascend` 量化方法。其次,检查你的模型是否由 `modelslim-VLLM-8.1.RC1.b020_001` 这个 "
|
||||
"modelslim 版本转换。如果仍然无法使用,请提交一个 issue,可能有一些新模型需要适配。"
|
||||
|
||||
#: ../../user_guide/feature_guide/quantization.md:104
|
||||
msgid ""
|
||||
"2. How to solve the error \"Could not locate the "
|
||||
"configuration_deepseek.py\"?"
|
||||
msgstr "2. 如何解决“无法找到 configuration_deepseek.py”错误?"
|
||||
|
||||
#: ../../user_guide/feature_guide/quantization.md:106
|
||||
msgid ""
|
||||
"Please convert DeepSeek series models using `modelslim-"
|
||||
"VLLM-8.1.RC1.b020_001` modelslim, this version has fixed the missing "
|
||||
"configuration_deepseek.py error."
|
||||
msgstr ""
|
||||
"请使用 `modelslim-VLLM-8.1.RC1.b020_001` 的 modelslim 转换 DeepSeek 系列模型,该版本已修复缺少 "
|
||||
"configuration_deepseek.py 的错误。"
|
||||
@@ -0,0 +1,156 @@
|
||||
# SOME DESCRIPTIVE TITLE.
|
||||
# Copyright (C) 2025, vllm-ascend team
|
||||
# This file is distributed under the same license as the vllm-ascend
|
||||
# package.
|
||||
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
|
||||
#
|
||||
#, fuzzy
|
||||
msgid ""
|
||||
msgstr ""
|
||||
"Project-Id-Version: vllm-ascend\n"
|
||||
"Report-Msgid-Bugs-To: \n"
|
||||
"POT-Creation-Date: 2025-07-18 09:01+0800\n"
|
||||
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
|
||||
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
|
||||
"Language-Team: zh_CN <LL@li.org>\n"
|
||||
"Language: zh_CN\n"
|
||||
"MIME-Version: 1.0\n"
|
||||
"Content-Type: text/plain; charset=utf-8\n"
|
||||
"Content-Transfer-Encoding: 8bit\n"
|
||||
"Plural-Forms: nplurals=1; plural=0;\n"
|
||||
"Generated-By: Babel 2.17.0\n"
|
||||
|
||||
#: ../../user_guide/feature_guide/sleep_mode.md:1
|
||||
msgid "Sleep Mode Guide"
|
||||
msgstr "睡眠模式指南"
|
||||
|
||||
#: ../../user_guide/feature_guide/sleep_mode.md:3
|
||||
msgid "Overview"
|
||||
msgstr "概述"
|
||||
|
||||
#: ../../user_guide/feature_guide/sleep_mode.md:5
|
||||
msgid ""
|
||||
"Sleep Mode is an API designed to offload model weights and discard KV cache "
|
||||
"from NPU memory. This functionality is essential for reinforcement learning "
|
||||
"(RL) post-training workloads, particularly in online algorithms such as PPO,"
|
||||
" GRPO, or DPO. During training, the policy model typically performs auto-"
|
||||
"regressive generation using inference engines like vLLM, followed by forward"
|
||||
" and backward passes for optimization."
|
||||
msgstr ""
|
||||
"Sleep Mode 是一个用于卸载模型权重并清除 NPU 内存中 KV 缓存的 API。此功能对于强化学习(RL)后训练任务尤其重要,特别是在 "
|
||||
"PPO、GRPO 或 DPO 等在线算法中。在训练过程中,策略模型通常会使用像 vLLM "
|
||||
"这样的推理引擎进行自回归生成,然后进行前向和反向传播以进行优化。"
|
||||
|
||||
#: ../../user_guide/feature_guide/sleep_mode.md:7
|
||||
msgid ""
|
||||
"Since the generation and training phases may employ different model "
|
||||
"parallelism strategies, it becomes crucial to free KV cache and even offload"
|
||||
" model parameters stored within vLLM during training. This ensures efficient"
|
||||
" memory utilization and avoids resource contention on the NPU."
|
||||
msgstr ""
|
||||
"由于生成和训练阶段可能采用不同的模型并行策略,因此在训练过程中及时释放 KV 缓存,甚至卸载存储在 vLLM "
|
||||
"内的模型参数变得至关重要。这可以确保内存的高效利用,并避免 NPU 上的资源争用。"
|
||||
|
||||
#: ../../user_guide/feature_guide/sleep_mode.md:10
|
||||
msgid "Getting started"
|
||||
msgstr "快速上手"
|
||||
|
||||
#: ../../user_guide/feature_guide/sleep_mode.md:12
|
||||
#, python-brace-format
|
||||
msgid ""
|
||||
"With `enable_sleep_mode=True`, the way we manage memory(malloc, free) in "
|
||||
"vllm will under a specific memory pool, during loading model and initialize "
|
||||
"kv_caches, we tag the memory as a map: `{\"weight\": data, \"kv_cache\": "
|
||||
"data}`."
|
||||
msgstr ""
|
||||
"当 `enable_sleep_mode=True` 时,我们在 vllm 中管理内存(malloc, "
|
||||
"free)的方式会在一个特定的内存池下进行,在加载模型和初始化 kv_caches "
|
||||
"期间,我们会将内存打上标签,组织成一个映射:`{\"weight\": data, \"kv_cache\": data}`。"
|
||||
|
||||
#: ../../user_guide/feature_guide/sleep_mode.md:14
|
||||
msgid ""
|
||||
"The engine(v0/v1) supports two sleep levels to manage memory during idle "
|
||||
"periods:"
|
||||
msgstr "该引擎(v0/v1)支持两种睡眠等级,以在空闲期间管理内存:"
|
||||
|
||||
#: ../../user_guide/feature_guide/sleep_mode.md:16
|
||||
msgid "Level 1 Sleep"
|
||||
msgstr "一级睡眠"
|
||||
|
||||
#: ../../user_guide/feature_guide/sleep_mode.md:17
|
||||
msgid "Action: Offloads model weights and discards the KV cache."
|
||||
msgstr "操作:卸载模型权重并清除KV缓存。"
|
||||
|
||||
#: ../../user_guide/feature_guide/sleep_mode.md:18
|
||||
msgid "Memory: Model weights are moved to CPU memory; KV cache is forgotten."
|
||||
msgstr "内存:模型权重被移动到CPU内存;KV缓存被清除。"
|
||||
|
||||
#: ../../user_guide/feature_guide/sleep_mode.md:19
|
||||
msgid "Use Case: Suitable when reusing the same model later."
|
||||
msgstr "用例:适用于之后需要重复使用同一个模型的情况。"
|
||||
|
||||
#: ../../user_guide/feature_guide/sleep_mode.md:20
|
||||
msgid ""
|
||||
"Note: Ensure sufficient CPU memory is available to hold the model weights."
|
||||
msgstr "注意:请确保有足够的CPU内存来存储模型权重。"
|
||||
|
||||
#: ../../user_guide/feature_guide/sleep_mode.md:22
|
||||
msgid "Level 2 Sleep"
|
||||
msgstr "二级睡眠"
|
||||
|
||||
#: ../../user_guide/feature_guide/sleep_mode.md:23
|
||||
msgid "Action: Discards both model weights and KV cache."
|
||||
msgstr "操作:同时丢弃模型权重和KV缓存。"
|
||||
|
||||
#: ../../user_guide/feature_guide/sleep_mode.md:24
|
||||
msgid ""
|
||||
"Memory: The content of both the model weights and kv cache is forgotten."
|
||||
msgstr "内存:模型权重和kv缓存的内容都会被遗忘。"
|
||||
|
||||
#: ../../user_guide/feature_guide/sleep_mode.md:25
|
||||
msgid ""
|
||||
"Use Case: Ideal when switching to a different model or updating the current "
|
||||
"one."
|
||||
msgstr "用例:当切换到不同的模型或更新当前模型时非常理想。"
|
||||
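# Editor's note: a hedged sketch of both levels via the offline API; the model
# name is an assumption and `enable_sleep_mode=True` must be set up front.
#
#     from vllm import LLM
#     llm = LLM(model="Qwen/Qwen3-0.6B", enable_sleep_mode=True)
#     llm.sleep(level=1)   # offload weights to CPU, discard KV cache
#     llm.wake_up()        # reload weights, re-create KV cache
#     llm.sleep(level=2)   # discard weights and KV cache entirely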
|
||||
#: ../../user_guide/feature_guide/sleep_mode.md:27
|
||||
msgid ""
|
||||
"Since this feature uses the low-level API "
|
||||
"[AscendCL](https://www.hiascend.com/document/detail/zh/CANNCommunityEdition/82RC1alpha002/API/appdevgapi/appdevgapi_07_0000.html),"
|
||||
" in order to use sleep mode, you should follow the [installation "
|
||||
"guide](https://vllm-ascend.readthedocs.io/en/latest/installation.html) and "
|
||||
"building from source, if you are using v0.7.3, remember to set `export "
|
||||
"COMPILE_CUSTOM_KERNELS=1`, for the latest version(v0.9.x+), the environment "
|
||||
"variable `COMPILE_CUSTOM_KERNELS` will be set 1 by default while building "
|
||||
"from source."
|
||||
msgstr ""
|
||||
"由于此功能使用了底层 API "
|
||||
"[AscendCL](https://www.hiascend.com/document/detail/zh/CANNCommunityEdition/82RC1alpha002/API/appdevgapi/appdevgapi_07_0000.html),为了使用休眠模式,你应按照[安装指南](https://vllm-"
|
||||
"ascend.readthedocs.io/en/latest/installation.html)进行操作,并从源码编译。如果你使用的是 "
|
||||
"v0.7.3,请记得设置 `export COMPILE_CUSTOM_KERNELS=1` ;对于最新版本(v0.9.x+),在从源码编译时环境变量 "
|
||||
"`COMPILE_CUSTOM_KERNELS` 默认会被设置为 1。"
|
||||
|
||||
#: ../../user_guide/feature_guide/sleep_mode.md:29
|
||||
msgid "Usage"
|
||||
msgstr "用法"
|
||||
|
||||
#: ../../user_guide/feature_guide/sleep_mode.md:31
|
||||
msgid "The following is a simple example of how to use sleep mode."
|
||||
msgstr "以下是如何使用睡眠模式的一个简单示例。"
|
||||
|
||||
#: ../../user_guide/feature_guide/sleep_mode.md:33
|
||||
msgid "offline inference:"
|
||||
msgstr "离线推理:"
|
||||
|
||||
#: ../../user_guide/feature_guide/sleep_mode.md:72
|
||||
msgid "online serving:"
|
||||
msgstr "在线服务:"
|
||||
|
||||
#: ../../user_guide/feature_guide/sleep_mode.md:74
|
||||
msgid ""
|
||||
"Considering there may be a risk of malicious access, please make sure you "
|
||||
"are under a dev-mode, and explicit specify the develop env: "
|
||||
"`VLLM_SERVER_DEV_MODE` to expose these endpoints(sleep/wake up)."
|
||||
msgstr ""
|
||||
"鉴于可能存在恶意访问的风险,请确保您处于开发模式,并明确指定开发环境:`VLLM_SERVER_DEV_MODE`,以便开放这些端点(sleep/wake"
|
||||
" up)。"
|
||||
@@ -0,0 +1,220 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-ascend team
# This file is distributed under the same license as the vllm-ascend
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-ascend\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-07-18 09:01+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Language: zh_CN\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"Generated-By: Babel 2.17.0\n"

#: ../../user_guide/feature_guide/structured_output.md:1
msgid "Structured Output Guide"
msgstr "结构化输出指南"

#: ../../user_guide/feature_guide/structured_output.md:3
msgid "Overview"
msgstr "概述"

#: ../../user_guide/feature_guide/structured_output.md:5
msgid "What is Structured Output?"
msgstr "什么是结构化输出?"

#: ../../user_guide/feature_guide/structured_output.md:7
msgid ""
"LLMs can be unpredictable when you need output in specific formats. Think of"
" asking a model to generate JSON - without guidance, it might produce valid "
"text that breaks JSON specification. **Structured Output (also called Guided"
" Decoding)** enables LLMs to generate outputs that follow a desired "
"structure while preserving the non-deterministic nature of the system."
msgstr ""
"当你需要特定格式的输出时,大语言模型(LLM)可能表现得不可预测。比如让模型生成 "
"JSON:如果没有引导,模型可能会生成看似有效、却不符合 JSON 规范的文本。**结构化输出(也称为引导解码)** "
"能让大语言模型生成符合预期结构的输出,同时保留系统的非确定性特性。"

#: ../../user_guide/feature_guide/structured_output.md:9
msgid ""
"In simple terms, structured decoding gives LLMs a “template” to follow. "
"Users provide a schema that “influences” the model’s output, ensuring "
"compliance with the desired structure."
msgstr "简单来说,结构化解码为LLM提供了一个“模板”来遵循。用户提供一个模式来“影响”模型的输出,从而确保输出符合期望的结构。"

#: ../../user_guide/feature_guide/structured_output.md:11
msgid ""
msgstr ""

#: ../../user_guide/feature_guide/structured_output.md:11
msgid "structured decoding"
msgstr "结构化解码"

#: ../../user_guide/feature_guide/structured_output.md:13
msgid "Structured Output in vllm-ascend"
msgstr "vllm-ascend 中的结构化输出"

#: ../../user_guide/feature_guide/structured_output.md:15
msgid ""
"Currently, vllm-ascend supports **xgrammar** and **guidance** backend for "
"structured output with vllm v1 engine."
msgstr "目前,vllm-ascend 支持 vllm v1 引擎的结构化输出,后端包括 **xgrammar** 和 **guidance**。"

#: ../../user_guide/feature_guide/structured_output.md:17
msgid ""
"XGrammar introduces a new technique that batch constrained decoding via "
"pushdown automaton (PDA). You can think of a PDA as a “collection of FSMs, "
"and each FSM represents a context-free grammar (CFG).” One significant "
"advantage of PDA is its recursive nature, allowing us to execute multiple "
"state transitions. They also include additional optimisation (for those who "
"are interested) to reduce grammar compilation overhead. Besides, you can "
"also find more details about guidance by yourself."
msgstr ""
"XGrammar 引入了一种通过下推自动机(PDA)进行批量约束解码的新技术。你可以把 PDA 理解为“有限状态机(FSM)的集合,每个 FSM "
"代表一个上下文无关文法(CFG)”。PDA 的一个重要优点是其递归特性,使我们能够执行多次状态转移。此外,XGrammar "
"还包含了额外的优化(供感兴趣的读者参考),以减少文法编译的开销。除此之外,你也可以自行了解更多关于 guidance 后端的细节。"

#: ../../user_guide/feature_guide/structured_output.md:19
msgid "How to Use Structured Output?"
msgstr "如何使用结构化输出?"

#: ../../user_guide/feature_guide/structured_output.md:21
msgid "Online Inference"
msgstr "在线推理"

#: ../../user_guide/feature_guide/structured_output.md:23
msgid ""
"You can also generate structured outputs using the OpenAI's Completions and "
"Chat API. The following parameters are supported, which must be added as "
"extra parameters:"
msgstr "你也可以使用 OpenAI 的 Completions 和 Chat API 生成结构化输出。支持以下参数,这些参数必须作为额外参数添加:"

#: ../../user_guide/feature_guide/structured_output.md:25
msgid "`guided_choice`: the output will be exactly one of the choices."
msgstr "`guided_choice`:输出将恰好是给定选项之一。"

#: ../../user_guide/feature_guide/structured_output.md:26
msgid "`guided_regex`: the output will follow the regex pattern."
msgstr "`guided_regex`:输出将遵循正则表达式模式。"

#: ../../user_guide/feature_guide/structured_output.md:27
msgid "`guided_json`: the output will follow the JSON schema."
msgstr "`guided_json`:输出将遵循 JSON Schema。"

#: ../../user_guide/feature_guide/structured_output.md:28
msgid "`guided_grammar`: the output will follow the context free grammar."
msgstr "`guided_grammar`:输出将遵循上下文无关文法。"

#: ../../user_guide/feature_guide/structured_output.md:30
msgid ""
"Structured outputs are supported by default in the OpenAI-Compatible Server."
" You can choose to specify the backend to use by setting the `--guided-"
"decoding-backend` flag to vllm serve. The default backend is `auto`, which "
"will try to choose an appropriate backend based on the details of the "
"request. You may also choose a specific backend, along with some options."
msgstr ""
"OpenAI 兼容服务器默认支持结构化输出。你可以通过为 vllm serve 设置 `--guided-decoding-backend` "
"标志来指定要使用的后端。默认后端为 `auto`,它会根据请求的详细信息尝试选择合适的后端。你也可以选择特定的后端,并附带一些选项。"

#: ../../user_guide/feature_guide/structured_output.md:32
msgid ""
"Now let´s see an example for each of the cases, starting with the "
"guided_choice, as it´s the easiest one:"
msgstr "现在让我们来看每种情况的示例,首先是 guided_choice,因为它是最简单的:"

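The guided_choice code block referenced above lives in the source document and is not extracted into this PO file; a minimal sketch of it, assuming an OpenAI-compatible server on localhost and a placeholder model name:

```python
# Hedged sketch: guided_choice via the OpenAI-compatible API.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

completion = client.chat.completions.create(
    model="Qwen/Qwen2.5-7B-Instruct",  # placeholder model name
    messages=[{"role": "user",
               "content": "Classify this sentiment: vLLM is wonderful!"}],
    extra_body={"guided_choice": ["positive", "negative"]},
)
print(completion.choices[0].message.content)  # "positive" or "negative"
```
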
#: ../../user_guide/feature_guide/structured_output.md:51
msgid ""
"The next example shows how to use the guided_regex. The idea is to generate "
"an email address, given a simple regex template:"
msgstr "下一个例子展示了如何使用 guided_regex。其思路是基于一个简单的正则表达式模板生成一个电子邮件地址:"

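Again the snippet itself is not extracted; a sketch of the guided_regex case, reusing the `client` from the previous sketch (the regex is illustrative):

```python
# Hedged sketch: constrain the output to an email-like string.
completion = client.chat.completions.create(
    model="Qwen/Qwen2.5-7B-Instruct",  # placeholder model name
    messages=[{"role": "user",
               "content": "Generate an example email address for Alan "
                          "Turing, who works at Enigma. End in .com"}],
    extra_body={"guided_regex": r"\w+@\w+\.com\n", "stop": ["\n"]},
)
print(completion.choices[0].message.content)
```
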
#: ../../user_guide/feature_guide/structured_output.md:67
msgid ""
"One of the most relevant features in structured text generation is the "
"option to generate a valid JSON with pre-defined fields and formats. For "
"this we can use the guided_json parameter in two different ways:"
msgstr ""
"在结构化文本生成中,最实用的特性之一是能够生成具有预定义字段和格式的有效 JSON。为此,我们可以通过两种不同的方式使用 guided_json 参数:"

#: ../../user_guide/feature_guide/structured_output.md:69
msgid "Using a JSON Schema."
msgstr "使用 JSON Schema。"

#: ../../user_guide/feature_guide/structured_output.md:70
msgid "Defining a Pydantic model and then extracting the JSON Schema from it."
msgstr "定义一个 Pydantic 模型,然后从中提取 JSON Schema。"

#: ../../user_guide/feature_guide/structured_output.md:72
msgid ""
"The next example shows how to use the guided_json parameter with a Pydantic "
"model:"
msgstr "下一个示例展示了如何将 guided_json 参数与 Pydantic 模型一起使用:"

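A sketch of the guided_json-with-Pydantic case (field names are illustrative; `model_json_schema()` is the Pydantic v2 accessor):

```python
# Hedged sketch: derive a JSON schema from a Pydantic model for guided_json.
from pydantic import BaseModel

class CarDescription(BaseModel):
    brand: str
    model: str
    car_type: str

completion = client.chat.completions.create(
    model="Qwen/Qwen2.5-7B-Instruct",  # placeholder model name
    messages=[{"role": "user",
               "content": "Generate a JSON with the brand, model and "
                          "car_type of the most iconic car from the 90's."}],
    extra_body={"guided_json": CarDescription.model_json_schema()},
)
print(completion.choices[0].message.content)  # valid JSON per the schema
```
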
#: ../../user_guide/feature_guide/structured_output.md:104
msgid ""
"Finally we have the guided_grammar option, which is probably the most "
"difficult to use, but it´s really powerful. It allows us to define complete "
"languages like SQL queries. It works by using a context free EBNF grammar. "
"As an example, we can use to define a specific format of simplified SQL "
"queries:"
msgstr ""
"最后是 guided_grammar 选项,它可能是最难使用的,但功能非常强大。它允许我们定义完整的语言,比如 SQL "
"查询。它通过使用上下文无关的 EBNF 文法来实现。例如,我们可以用它来定义一种简化 SQL 查询的特定格式:"

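A sketch of the guided_grammar case (the EBNF body is illustrative, not the exact grammar from the source document):

```python
# Hedged sketch: constrain output to a tiny SQL-like EBNF grammar.
simplified_sql_grammar = """
    root ::= select_statement
    select_statement ::= "SELECT " column " from " table " where " condition
    column ::= "col_1" | "col_2"
    table ::= "table_1" | "table_2"
    condition ::= column " = " number
    number ::= "1" | "2"
"""

completion = client.chat.completions.create(
    model="Qwen/Qwen2.5-7B-Instruct",  # placeholder model name
    messages=[{"role": "user",
               "content": "Generate an SQL query to select col_1 "
                          "from table_1 where it equals 1."}],
    extra_body={"guided_grammar": simplified_sql_grammar},
)
print(completion.choices[0].message.content)
```
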
#: ../../user_guide/feature_guide/structured_output.md:134
msgid ""
"Find more examples [here](https://github.com/vllm-"
"project/vllm/blob/main/examples/offline_inference/structured_outputs.py)."
msgstr ""
"在[这里](https://github.com/vllm-"
"project/vllm/blob/main/examples/offline_inference/structured_outputs.py)可以找到更多示例。"

#: ../../user_guide/feature_guide/structured_output.md:136
msgid "Offline Inference"
msgstr "离线推理"

#: ../../user_guide/feature_guide/structured_output.md:138
msgid ""
"To use Structured Output, we'll need to configure the guided decoding using "
"the class `GuidedDecodingParams` inside `SamplingParams`. The main available"
" options inside `GuidedDecodingParams` are:"
msgstr ""
"要使用结构化输出,我们需要在 `SamplingParams` 内通过 `GuidedDecodingParams` "
"类配置引导解码。`GuidedDecodingParams` 中主要可用的选项有:"

#: ../../user_guide/feature_guide/structured_output.md:140
msgid "json"
msgstr "json"

#: ../../user_guide/feature_guide/structured_output.md:141
msgid "regex"
msgstr "regex"

#: ../../user_guide/feature_guide/structured_output.md:142
msgid "choice"
msgstr "choice"

#: ../../user_guide/feature_guide/structured_output.md:143
msgid "grammar"
msgstr "grammar"

#: ../../user_guide/feature_guide/structured_output.md:145
msgid "One example for the usage of the choice parameter is shown below:"
msgstr "下面是 choice 参数用法的一个示例:"

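The choice example referenced in the entry above is a code block in the source document; a minimal offline sketch, assuming `GuidedDecodingParams` from `vllm.sampling_params` and a placeholder model:

```python
# Hedged sketch: offline guided decoding with the choice option.
from vllm import LLM, SamplingParams
from vllm.sampling_params import GuidedDecodingParams

guided = GuidedDecodingParams(choice=["Positive", "Negative"])
params = SamplingParams(guided_decoding=guided)

llm = LLM(model="Qwen/Qwen2.5-7B-Instruct")  # placeholder model name
outputs = llm.generate("Classify this sentiment: vLLM is wonderful!",
                       sampling_params=params)
print(outputs[0].outputs[0].text)  # one of "Positive" / "Negative"
```
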
#: ../../user_guide/feature_guide/structured_output.md:163
msgid ""
"Find more examples of other usages [here](https://github.com/vllm-"
"project/vllm/blob/main/examples/offline_inference/structured_outputs.py)."
msgstr ""
"其他用法的更多示例见[这里](https://github.com/vllm-"
"project/vllm/blob/main/examples/offline_inference/structured_outputs.py)。"

1660
docs/source/locale/zh_CN/LC_MESSAGES/user_guide/release_notes.po
Normal file
File diff suppressed because it is too large
@@ -0,0 +1,30 @@
# Translations template for PROJECT.
# Copyright (C) 2025 ORGANIZATION
# This file is distributed under the same license as the PROJECT project.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: PROJECT VERSION\n"
"Report-Msgid-Bugs-To: EMAIL@ADDRESS\n"
"POT-Creation-Date: 2025-07-18 09:01+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: LANGUAGE <LL@li.org>\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"

#: ../../user_guide/support_matrix/index.md:5
msgid "Support Matrix"
msgstr "支持矩阵"

#: ../../user_guide/support_matrix/index.md:1
msgid "Features and models"
msgstr "特性与模型"

#: ../../user_guide/support_matrix/index.md:3
msgid "This section provides a detailed supported matrix by vLLM Ascend."
msgstr "本节提供了 vLLM Ascend 的详细支持矩阵。"

@@ -0,0 +1,264 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-ascend team
# This file is distributed under the same license as the vllm-ascend
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-ascend\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-07-18 09:01+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Language: zh_CN\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"Generated-By: Babel 2.17.0\n"

#: ../../user_guide/support_matrix/supported_features.md:1
msgid "Feature Support"
msgstr "功能支持"

#: ../../user_guide/support_matrix/supported_features.md:3
msgid ""
"The feature support principle of vLLM Ascend is: **aligned with the vLLM**. "
"We are also actively collaborating with the community to accelerate support."
msgstr "vLLM Ascend 的特性支持原则是:**与 vLLM 保持一致**。我们也在积极与社区合作,加快支持进度。"

#: ../../user_guide/support_matrix/supported_features.md:5
msgid ""
"You can check the [support status of vLLM V1 Engine][v1_user_guide]. Below "
"is the feature support status of vLLM Ascend:"
msgstr "你可以查看 [vLLM V1 引擎的支持状态][v1_user_guide]。下面是 vLLM Ascend 的特性支持情况:"

#: ../../user_guide/support_matrix/supported_features.md
msgid "Feature"
msgstr "特性"

#: ../../user_guide/support_matrix/supported_features.md
msgid "vLLM V0 Engine"
msgstr "vLLM V0 引擎"

#: ../../user_guide/support_matrix/supported_features.md
msgid "vLLM V1 Engine"
msgstr "vLLM V1 引擎"

#: ../../user_guide/support_matrix/supported_features.md
msgid "Next Step"
msgstr "下一步"

#: ../../user_guide/support_matrix/supported_features.md
msgid "Chunked Prefill"
msgstr "分块预填充"

#: ../../user_guide/support_matrix/supported_features.md
msgid "🟢 Functional"
msgstr "🟢 功能性"

#: ../../user_guide/support_matrix/supported_features.md
msgid "Functional, see detail note: [Chunked Prefill][cp]"
msgstr "可用,详见说明:[分块预填充][cp]"

#: ../../user_guide/support_matrix/supported_features.md
msgid "Automatic Prefix Caching"
msgstr "自动前缀缓存"

#: ../../user_guide/support_matrix/supported_features.md
msgid "Functional, see detail note: [vllm-ascend#732][apc]"
msgstr "可用,详见说明:[vllm-ascend#732][apc]"

#: ../../user_guide/support_matrix/supported_features.md
msgid "LoRA"
msgstr "LoRA"

#: ../../user_guide/support_matrix/supported_features.md
msgid "[vllm-ascend#396][multilora], [vllm-ascend#893][v1 multilora]"
msgstr "[vllm-ascend#396][multilora],[vllm-ascend#893][v1 multilora]"

#: ../../user_guide/support_matrix/supported_features.md
msgid "Prompt adapter"
msgstr "提示适配器"

#: ../../user_guide/support_matrix/supported_features.md
msgid "🔴 No plan"
msgstr "🔴 无计划"

#: ../../user_guide/support_matrix/supported_features.md
msgid "This feature has been deprecated by vllm."
msgstr "此功能已被 vllm 弃用。"

#: ../../user_guide/support_matrix/supported_features.md
msgid "Speculative decoding"
msgstr "投机解码"

#: ../../user_guide/support_matrix/supported_features.md
msgid "Basic support"
msgstr "基础支持"

#: ../../user_guide/support_matrix/supported_features.md
msgid "Pooling"
msgstr "池化"

#: ../../user_guide/support_matrix/supported_features.md
msgid "🟡 Planned"
msgstr "🟡 计划中"

#: ../../user_guide/support_matrix/supported_features.md
msgid "CI needed and adapting more models; V1 support rely on vLLM support."
msgstr "需要 CI 并适配更多模型;V1 的支持依赖于 vLLM 的支持。"

#: ../../user_guide/support_matrix/supported_features.md
msgid "Enc-dec"
msgstr "Enc-dec(编码器-解码器)"

#: ../../user_guide/support_matrix/supported_features.md
msgid "🔴 NO plan"
msgstr "🔴 无计划"

#: ../../user_guide/support_matrix/supported_features.md
msgid "Plan in 2025.06.30"
msgstr "计划于 2025.06.30"

#: ../../user_guide/support_matrix/supported_features.md
msgid "Multi Modality"
msgstr "多模态"

#: ../../user_guide/support_matrix/supported_features.md
msgid "[Tutorial][multimodal], optimizing and adapting more models"
msgstr "[教程][multimodal],正在优化并适配更多模型"

#: ../../user_guide/support_matrix/supported_features.md
msgid "LogProbs"
msgstr "LogProbs"

#: ../../user_guide/support_matrix/supported_features.md
msgid "CI needed"
msgstr "需要 CI"

#: ../../user_guide/support_matrix/supported_features.md
msgid "Prompt logProbs"
msgstr "Prompt logProbs"

#: ../../user_guide/support_matrix/supported_features.md
msgid "Async output"
msgstr "异步输出"

#: ../../user_guide/support_matrix/supported_features.md
msgid "Multi step scheduler"
msgstr "多步调度器"

#: ../../user_guide/support_matrix/supported_features.md
msgid "🔴 Deprecated"
msgstr "🔴 已弃用"

#: ../../user_guide/support_matrix/supported_features.md
msgid "[vllm#8779][v1_rfc], replaced by [vLLM V1 Scheduler][v1_scheduler]"
msgstr "[vllm#8779][v1_rfc],已被 [vLLM V1 调度器][v1_scheduler] 替代"

#: ../../user_guide/support_matrix/supported_features.md
msgid "Best of"
msgstr "Best of"

#: ../../user_guide/support_matrix/supported_features.md
msgid "[vllm#13361][best_of], CI needed"
msgstr "[vllm#13361][best_of],需要 CI"

#: ../../user_guide/support_matrix/supported_features.md
msgid "Beam search"
msgstr "束搜索"

#: ../../user_guide/support_matrix/supported_features.md
msgid "Guided Decoding"
msgstr "引导解码"

#: ../../user_guide/support_matrix/supported_features.md
msgid "[vllm-ascend#177][guided_decoding]"
msgstr "[vllm-ascend#177][guided_decoding]"

#: ../../user_guide/support_matrix/supported_features.md
msgid "Tensor Parallel"
msgstr "张量并行"

#: ../../user_guide/support_matrix/supported_features.md
msgid "Pipeline Parallel"
msgstr "流水线并行"

#: ../../user_guide/support_matrix/supported_features.md
msgid "Expert Parallel"
msgstr "专家并行"

#: ../../user_guide/support_matrix/supported_features.md
msgid "CI needed; No plan on V0 support"
msgstr "需要 CI;暂无 V0 支持计划"

#: ../../user_guide/support_matrix/supported_features.md
msgid "Data Parallel"
msgstr "数据并行"

#: ../../user_guide/support_matrix/supported_features.md
msgid "CI needed; No plan on V0 support"
msgstr "需要 CI;暂无 V0 支持计划"

#: ../../user_guide/support_matrix/supported_features.md
msgid "Prefill Decode Disaggregation"
msgstr "预填充与解码分离"

#: ../../user_guide/support_matrix/supported_features.md
msgid "1P1D available, working on xPyD and V1 support."
msgstr "1P1D 已可用,正在开发 xPyD 和 V1 支持。"

#: ../../user_guide/support_matrix/supported_features.md
msgid "Quantization"
msgstr "量化"

#: ../../user_guide/support_matrix/supported_features.md
msgid "W8A8 available, CI needed; working on more quantization method support"
msgstr "W8A8 已可用,需要 CI;正在开发对更多量化方法的支持"

#: ../../user_guide/support_matrix/supported_features.md
msgid "Graph Mode"
msgstr "图模式"

#: ../../user_guide/support_matrix/supported_features.md
msgid "🔵 Experimental"
msgstr "🔵 实验性"

#: ../../user_guide/support_matrix/supported_features.md
msgid "Experimental, see detail note: [vllm-ascend#767][graph_mode]"
msgstr "实验性,详见说明:[vllm-ascend#767][graph_mode]"

#: ../../user_guide/support_matrix/supported_features.md
msgid "Sleep Mode"
msgstr "睡眠模式"

#: ../../user_guide/support_matrix/supported_features.md
msgid "level=1 available, CI needed, working on V1 support"
msgstr "level=1 可用,需要 CI;正在开发 V1 支持"

#: ../../user_guide/support_matrix/supported_features.md:33
msgid "🟢 Functional: Fully operational, with ongoing optimizations."
msgstr "🟢 功能性:完全可用,正在持续优化中。"

#: ../../user_guide/support_matrix/supported_features.md:34
msgid ""
"🔵 Experimental: Experimental support, interfaces and functions may change."
msgstr "🔵 实验性:实验性支持,接口和功能可能会发生变化。"

#: ../../user_guide/support_matrix/supported_features.md:35
msgid "🚧 WIP: Under active development, will be supported soon."
msgstr "🚧 WIP:正在积极开发中,很快将会支持。"

#: ../../user_guide/support_matrix/supported_features.md:36
msgid ""
"🟡 Planned: Scheduled for future implementation (some may have open "
"PRs/RFCs)."
msgstr "🟡 计划中:已安排在将来实现(其中一些可能已有开放的 PR/RFC)。"

#: ../../user_guide/support_matrix/supported_features.md:37
msgid "🔴 NO plan / Deprecated: No plan for V0 or deprecated by vLLM v1."
msgstr "🔴 无计划 / 已弃用:V0 暂无计划,或已被 vLLM v1 弃用。"

@@ -0,0 +1,214 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-ascend team
# This file is distributed under the same license as the vllm-ascend
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-ascend\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-07-18 09:01+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Language: zh_CN\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"Generated-By: Babel 2.17.0\n"

#: ../../user_guide/support_matrix/supported_models.md:1
msgid "Model Support"
msgstr "模型支持"

#: ../../user_guide/support_matrix/supported_models.md:3
msgid "Text-only Language Models"
msgstr "纯文本语言模型"

#: ../../user_guide/support_matrix/supported_models.md:5
#: ../../user_guide/support_matrix/supported_models.md:38
msgid "Generative Models"
msgstr "生成模型"

#: ../../user_guide/support_matrix/supported_models.md
msgid "Model"
msgstr "模型"

#: ../../user_guide/support_matrix/supported_models.md
msgid "Supported"
msgstr "支持"

#: ../../user_guide/support_matrix/supported_models.md
msgid "Note"
msgstr "备注"

#: ../../user_guide/support_matrix/supported_models.md
msgid "DeepSeek v3"
msgstr "DeepSeek v3"

#: ../../user_guide/support_matrix/supported_models.md
msgid "✅"
msgstr "✅"

#: ../../user_guide/support_matrix/supported_models.md
msgid "DeepSeek R1"
msgstr "DeepSeek R1"

#: ../../user_guide/support_matrix/supported_models.md
msgid "DeepSeek Distill (Qwen/LLama)"
msgstr "DeepSeek 蒸馏版(Qwen/LLama)"

#: ../../user_guide/support_matrix/supported_models.md
msgid "Qwen3"
msgstr "Qwen3"

#: ../../user_guide/support_matrix/supported_models.md
msgid "Qwen3-Moe"
msgstr "Qwen3-Moe"

#: ../../user_guide/support_matrix/supported_models.md
msgid "Qwen2.5"
msgstr "Qwen2.5"

#: ../../user_guide/support_matrix/supported_models.md
msgid "QwQ-32B"
msgstr "QwQ-32B"

#: ../../user_guide/support_matrix/supported_models.md
msgid "LLama3.1/3.2"
msgstr "LLama3.1/3.2"

#: ../../user_guide/support_matrix/supported_models.md
msgid "Internlm"
msgstr "Internlm"

#: ../../user_guide/support_matrix/supported_models.md
msgid "Baichuan"
msgstr "百川"

#: ../../user_guide/support_matrix/supported_models.md
msgid "Phi-4-mini"
msgstr "Phi-4-mini"

#: ../../user_guide/support_matrix/supported_models.md
msgid "MiniCPM"
msgstr "MiniCPM"

#: ../../user_guide/support_matrix/supported_models.md
msgid "MiniCPM3"
msgstr "MiniCPM3"

#: ../../user_guide/support_matrix/supported_models.md
msgid "LLama4"
msgstr "LLama4"

#: ../../user_guide/support_matrix/supported_models.md
msgid "Mistral"
msgstr "Mistral"

#: ../../user_guide/support_matrix/supported_models.md
msgid "Need test"
msgstr "需要测试"

#: ../../user_guide/support_matrix/supported_models.md
msgid "DeepSeek v2.5"
msgstr "DeepSeek v2.5"

#: ../../user_guide/support_matrix/supported_models.md
msgid "Gemma-2"
msgstr "Gemma-2"

#: ../../user_guide/support_matrix/supported_models.md
msgid "Mllama"
msgstr "Mllama"

#: ../../user_guide/support_matrix/supported_models.md
msgid "Gemma-3"
msgstr "Gemma-3"

#: ../../user_guide/support_matrix/supported_models.md
msgid "❌"
msgstr "❌"

#: ../../user_guide/support_matrix/supported_models.md
msgid "[#496](https://github.com/vllm-project/vllm-ascend/issues/496)"
msgstr "[#496](https://github.com/vllm-project/vllm-ascend/issues/496)"

#: ../../user_guide/support_matrix/supported_models.md
msgid "ChatGLM"
msgstr "ChatGLM"

#: ../../user_guide/support_matrix/supported_models.md
msgid "[#554](https://github.com/vllm-project/vllm-ascend/issues/554)"
msgstr "[#554](https://github.com/vllm-project/vllm-ascend/issues/554)"

#: ../../user_guide/support_matrix/supported_models.md:29
msgid "Pooling Models"
msgstr "池化模型"

#: ../../user_guide/support_matrix/supported_models.md
msgid "XLM-RoBERTa-based"
msgstr "基于 XLM-RoBERTa 的模型"

#: ../../user_guide/support_matrix/supported_models.md
msgid "Molmo"
msgstr "Molmo"

#: ../../user_guide/support_matrix/supported_models.md:36
msgid "Multimodal Language Models"
msgstr "多模态语言模型"

#: ../../user_guide/support_matrix/supported_models.md
msgid "Qwen2-VL"
msgstr "Qwen2-VL"

#: ../../user_guide/support_matrix/supported_models.md
msgid "Qwen2.5-VL"
msgstr "Qwen2.5-VL"

#: ../../user_guide/support_matrix/supported_models.md
msgid "LLaVA 1.5"
msgstr "LLaVA 1.5"

#: ../../user_guide/support_matrix/supported_models.md
msgid "LLaVA 1.6"
msgstr "LLaVA 1.6"

#: ../../user_guide/support_matrix/supported_models.md
msgid "[#553](https://github.com/vllm-project/vllm-ascend/issues/553)"
msgstr "[#553](https://github.com/vllm-project/vllm-ascend/issues/553)"

#: ../../user_guide/support_matrix/supported_models.md
msgid "InternVL2"
msgstr "InternVL2"

#: ../../user_guide/support_matrix/supported_models.md
msgid "InternVL2.5"
msgstr "InternVL2.5"

#: ../../user_guide/support_matrix/supported_models.md
msgid "Qwen2-Audio"
msgstr "Qwen2-Audio"

#: ../../user_guide/support_matrix/supported_models.md
msgid "LLaVA-Next"
msgstr "LLaVA-Next"

#: ../../user_guide/support_matrix/supported_models.md
msgid "LLaVA-Next-Video"
msgstr "LLaVA-Next-Video"

#: ../../user_guide/support_matrix/supported_models.md
msgid "Phi-3-Vison/Phi-3.5-Vison"
msgstr "Phi-3-Vision/Phi-3.5-Vision"

#: ../../user_guide/support_matrix/supported_models.md
msgid "GLM-4v"
msgstr "GLM-4v"

#: ../../user_guide/support_matrix/supported_models.md
msgid "Ultravox"
msgstr "Ultravox"