Commit the vllm 0.11.0 development branch

This commit is contained in:
chenyili
2025-12-10 17:51:24 +08:00
parent deab7dd0b6
commit 7c22d621fb
175 changed files with 31856 additions and 8683 deletions

File diff suppressed because it is too large


@@ -0,0 +1,228 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-10 16:59+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/community/governance.md:1
msgid "Governance"
msgstr "治理"
#: ../../source/community/governance.md:3
msgid "Mission"
msgstr "使命"
#~ msgid ""
#~ "As a vital component of vLLM, the"
#~ " vLLM Kunlun project is dedicated to"
#~ " providing an easy, fast, and cheap"
#~ " LLM Serving for Everyone on Kunlun"
#~ " XPU, and to actively contribute to"
#~ " the enrichment of vLLM."
#~ msgstr ""
#~ "作为 vLLM 的重要组成部分,vLLM Kunlun 项目致力于为所有人在 "
#~ "Kunlun XPU 上提供简单、快速且低成本的大语言模型服务,并积极促进 vLLM "
#~ "的丰富发展。"
#~ msgid "Principles"
#~ msgstr "原则"
#~ msgid ""
#~ "vLLM Kunlun follows the vLLM community's"
#~ " code of conduct[vLLM - CODE OF "
#~ "CONDUCT](https://github.com/vllm-"
#~ "project/vllm/blob/main/CODE_OF_CONDUCT.md)"
#~ msgstr ""
#~ "vLLM Kunlun 遵循 vLLM 社区的行为准则:[vLLM - "
#~ "行为准则](https://github.com/vllm-"
#~ "project/vllm/blob/main/CODE_OF_CONDUCT.md)"
#~ msgid "Governance - Mechanics"
#~ msgstr "治理 - 机制"
#~ msgid ""
#~ "vLLM Kunlun is an open-source "
#~ "project under the vLLM community, where"
#~ " the authority to appoint roles is"
#~ " ultimately determined by the vLLM "
#~ "community. It adopts a hierarchical "
#~ "technical governance structure."
#~ msgstr "vLLM Kunlun 是 vLLM 社区下的一个开源项目,其角色任命权最终由 vLLM 社区决定。它采用分层的技术治理结构。"
#~ msgid "Contributor:"
#~ msgstr "贡献者:"
#~ msgid ""
#~ "**Responsibility:** Help new contributors on"
#~ " boarding, handle and respond to "
#~ "community questions, review RFCs, code"
#~ msgstr "**职责:** 帮助新贡献者入门,处理和回复社区问题,审查 RFC 和代码"
#~ msgid ""
#~ "**Requirements:** Complete at least 1 "
#~ "contribution. Contributor is someone who "
#~ "consistently and actively participates in "
#~ "a project, included but not limited "
#~ "to issue/review/commits/community involvement."
#~ msgstr "**要求:** 完成至少 1 次贡献。贡献者是指持续且积极参与项目的人,包括但不限于问题、评审、提交和社区参与。"
#~ msgid ""
#~ "Contributors will be empowered [vllm-"
#~ "project/vllm-kunlun](https://github.com/vllm-project"
#~ "/vllm-kunlun) Github repo `Triage` "
#~ "permissions (`Can read and clone this"
#~ " repository. Can also manage issues "
#~ "and pull requests`) to help community"
#~ " developers collaborate more efficiently."
#~ msgstr ""
#~ "贡献者将被赋予 [vllm-project/vllm-"
#~ "kunlun](https://github.com/vllm-project/vllm-kunlun) "
#~ "Github 仓库的 `Triage` "
#~ "权限(`可读取和克隆此仓库。还可以管理问题和拉取请求`),以帮助社区开发者更加高效地协作。"
#~ msgid "Maintainer:"
#~ msgstr "维护者:"
#~ msgid ""
#~ "**Responsibility:** Develop the project's "
#~ "vision and mission. Maintainers are "
#~ "responsible for driving the technical "
#~ "direction of the entire project and "
#~ "ensuring its overall success, possessing "
#~ "code merge permissions. They formulate "
#~ "the roadmap, review contributions from "
#~ "community members, continuously contribute "
#~ "code, and actively engage in community"
#~ " activities (such as regular "
#~ "meetings/events)."
#~ msgstr ""
#~ "**责任:** "
#~ "制定项目的愿景和使命。维护者负责引领整个项目的技术方向并确保其整体成功,拥有代码合并权限。他们制定路线图,审核社区成员的贡献,持续贡献代码,并积极参与社区活动(如定期会议/活动)。"
#~ msgid ""
#~ "**Requirements:** Deep understanding of vLLM"
#~ " and vLLM Kunlun codebases, with a"
#~ " commitment to sustained code "
#~ "contributions. Competency in design/development/PR"
#~ " review workflows."
#~ msgstr ""
#~ "**要求:** 深入理解 vLLM 和 vLLM Kunlun "
#~ "代码库,并承诺持续贡献代码。具备设计/开发/PR 审核流程的能力。"
#~ msgid ""
#~ "**Review Quality:** Actively participate in"
#~ " community code reviews, ensuring high-"
#~ "quality code integration."
#~ msgstr "**评审质量:** 积极参与社区代码评审,确保高质量的代码集成。"
#~ msgid ""
#~ "**Quality Contribution:** Successfully develop "
#~ "and deliver at least one major "
#~ "feature while maintaining consistent high-"
#~ "quality contributions."
#~ msgstr "**质量贡献:** 成功开发并交付至少一个主要功能,同时持续保持高质量的贡献。"
#~ msgid ""
#~ "**Community Involvement:** Actively address "
#~ "issues, respond to forum inquiries, "
#~ "participate in discussions, and engage "
#~ "in community-driven tasks."
#~ msgstr "**社区参与:** 积极解决问题,回复论坛询问,参与讨论,并参与社区驱动的任务。"
#~ msgid ""
#~ "Requires approval from existing Maintainers."
#~ " The vLLM community has the final "
#~ "decision-making authority."
#~ msgstr "需要现有维护者的批准。vLLM 社区拥有最终决策权。"
#~ msgid ""
#~ "Maintainer will be empowered [vllm-"
#~ "project/vllm-kunlun](https://github.com/vllm-project"
#~ "/vllm-kunlun) Github repo write permissions"
#~ " (`Can read, clone, and push to "
#~ "this repository. Can also manage issues"
#~ " and pull requests`)."
#~ msgstr ""
#~ "维护者将被授予 [vllm-project/vllm-"
#~ "kunlun](https://github.com/vllm-project/vllm-kunlun) "
#~ "Github 仓库的写入权限(`可以读取、克隆和推送到此仓库。还可以管理问题和拉取请求`)。"
#~ msgid "Nominating and Removing Maintainers"
#~ msgstr "提名和移除维护者"
#~ msgid "The Principles"
#~ msgstr "原则"
#~ msgid ""
#~ "Membership in vLLM Kunlun is given "
#~ "to individuals on merit basis after "
#~ "they demonstrated strong expertise of "
#~ "the vLLM / vLLM Kunlun through "
#~ "contributions, reviews and discussions."
#~ msgstr ""
#~ "vLLM Kunlun 的成员资格是基于个人能力授予的,只有在通过贡献、评审和讨论展示出对 vLLM"
#~ " / vLLM Kunlun 的深厚专业知识后,才可获得。"
#~ msgid ""
#~ "For membership in the maintainer group"
#~ " the individual has to demonstrate "
#~ "strong and continued alignment with the"
#~ " overall vLLM / vLLM Kunlun "
#~ "principles."
#~ msgstr "要成为维护者组成员,个人必须表现出与 vLLM / vLLM Kunlun 总体原则的高度一致并持续支持。"
#~ msgid ""
#~ "Light criteria of moving module "
#~ "maintenance to emeritus status if they"
#~ " don't actively participate over long "
#~ "periods of time."
#~ msgstr "如果模块维护人员在长时间内没有积极参与,可根据较宽松的标准将其维护状态转为“荣誉”状态。"
#~ msgid "The membership is for an individual, not a company."
#~ msgstr "该会员资格属于个人,而非公司。"
#~ msgid "Nomination and Removal"
#~ msgstr "提名与罢免"
#~ msgid ""
#~ "Nomination: Anyone can nominate someone "
#~ "to become a maintainer (include self-"
#~ "nominate). All existing maintainers are "
#~ "responsible for evaluating the nomination. "
#~ "The nominator should provide nominee's "
#~ "info around the strength of the "
#~ "candidate to be a maintainer, include"
#~ " but not limited to review quality,"
#~ " quality contribution, community involvement."
#~ msgstr "提名:任何人都可以提名他人成为维护者(包括自荐)。所有现有维护者都有责任评估提名。提名人应提供被提名人成为维护者的相关优势信息,包括但不限于评审质量、优质贡献、社区参与等。"
#~ msgid ""
#~ "Removal: Anyone can nominate a person"
#~ " to be removed from maintainer "
#~ "position (include self-nominate). All "
#~ "existing maintainers are responsible for "
#~ "evaluating the nomination. The nominator "
#~ "should provide nominee's info, include "
#~ "but not limited to lack of "
#~ "activity, conflict with the overall "
#~ "direction and other information that "
#~ "makes them unfit to be a "
#~ "maintainer."
#~ msgstr "移除:任何人都可以提名某人被移出维护者职位(包括自荐)。所有现有维护者都有责任评估该提名。提名者应提供被提名人的相关信息,包括但不限于缺乏活动、与整体方向冲突以及使其不适合作为维护者的其他信息。"


@@ -0,0 +1,120 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-10 16:59+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/community/user_stories/index.md:1
#, fuzzy
msgid "User stories"
msgstr "用户故事"
#~ msgid "More details"
#~ msgstr "更多细节"
#~ msgid ""
#~ "Read case studies on how users and"
#~ " developers solve real, everyday problems"
#~ " with vLLM Kunlun"
#~ msgstr "阅读案例研究,了解用户和开发者如何使用 vLLM Kunlun 解决实际日常问题。"
#~ msgid ""
#~ "[LLaMA-Factory](./llamafactory.md) is an "
#~ "easy-to-use and efficient platform "
#~ "for training and fine-tuning large "
#~ "language models, it supports vLLM Kunlun"
#~ " to speed up inference since "
#~ "[LLaMA-Factory#7739](https://github.com/hiyouga/LLaMA-"
#~ "Factory/pull/7739), gain 2x performance "
#~ "enhancement of inference."
#~ msgstr ""
#~ "[LLaMA-Factory](./llamafactory.md) "
#~ "是一个易于使用且高效的大语言模型训练与微调平台,自 [LLaMA-"
#~ "Factory#7739](https://github.com/hiyouga/LLaMA-"
#~ "Factory/pull/7739) 起支持 vLLM Kunlun 加速推理,推理性能提升"
#~ " 2 倍。"
#~ msgid ""
#~ "[Huggingface/trl](https://github.com/huggingface/trl) is a"
#~ " cutting-edge library designed for "
#~ "post-training foundation models using "
#~ "advanced techniques like SFT, PPO and"
#~ " DPO, it uses vLLM Kunlun since "
#~ "[v0.17.0](https://github.com/huggingface/trl/releases/tag/v0.17.0) "
#~ "to support RLHF on Kunlun XPU."
#~ msgstr ""
#~ "[Huggingface/trl](https://github.com/huggingface/trl) "
#~ "是一个前沿的库,专为使用 SFT、PPO 和 DPO "
#~ "等先进技术对基础模型进行后训练而设计。从 "
#~ "[v0.17.0](https://github.com/huggingface/trl/releases/tag/v0.17.0) "
#~ "版本开始,该库利用 vLLM Kunlun 来支持在 Kunlun XPU"
#~ " 上进行 RLHF。"
#~ msgid ""
#~ "[MindIE Turbo](https://pypi.org/project/mindie-turbo) "
#~ "is an LLM inference engine acceleration"
#~ " plug-in library developed by Baidu"
#~ " on Kunlun hardware, which includes "
#~ "self-developed large language model "
#~ "optimization algorithms and optimizations "
#~ "related to the inference engine "
#~ "framework. It supports vLLM Kunlun since"
#~ " "
#~ "[2.0rc1](https://www.hikunlun.com/document/detail/zh/mindie/20RC1/AcceleratePlugin/turbodev"
#~ "/mindie-turbo-0001.html)."
#~ msgstr ""
#~ "[MindIE Turbo](https://pypi.org/project/mindie-turbo) "
#~ "是百度在昆仑硬件上开发的一款用于加速 LLM 推理引擎的插件库,包含自主研发的大语言模型优化算法及与推理引擎框架相关的优化。从 "
#~ "[2.0rc1](https://www.hikunlun.com/document/detail/zh/mindie/20RC1/AcceleratePlugin/turbodev"
#~ "/mindie-turbo-0001.html) 起,支持 vLLM Kunlun。"
#~ msgid ""
#~ "[GPUStack](https://github.com/gpustack/gpustack) is an "
#~ "open-source GPU cluster manager for "
#~ "running AI models. It supports vLLM "
#~ "Kunlun since "
#~ "[v0.6.2](https://github.com/gpustack/gpustack/releases/tag/v0.6.2),"
#~ " see more GPUStack performance evaluation"
#~ " info on "
#~ "[link](https://mp.weixin.qq.com/s/pkytJVjcH9_OnffnsFGaew)."
#~ msgstr ""
#~ "[GPUStack](https://github.com/gpustack/gpustack) 是一个开源的 "
#~ "GPU 集群管理器,用于运行 AI 模型。从 "
#~ "[v0.6.2](https://github.com/gpustack/gpustack/releases/tag/v0.6.2) "
#~ "版本开始支持 vLLM Kunlun,更多 GPUStack 性能评测信息见 "
#~ "[链接](https://mp.weixin.qq.com/s/pkytJVjcH9_OnffnsFGaew)。"
#~ msgid ""
#~ "[verl](https://github.com/volcengine/verl) is a "
#~ "flexible, efficient and production-ready "
#~ "RL training library for large language"
#~ " models (LLMs), uses vLLM Kunlun "
#~ "since "
#~ "[v0.4.0](https://github.com/volcengine/verl/releases/tag/v0.4.0), "
#~ "see more info on [verl x Kunlun"
#~ " "
#~ "Quickstart](https://verl.readthedocs.io/en/latest/kunlun_tutorial/kunlun_quick_start.html)."
#~ msgstr ""
#~ "[verl](https://github.com/volcengine/verl) "
#~ "是一个灵活、高效且可用于生产环境的大型语言模型(LLM)强化学习训练库,自 "
#~ "[v0.4.0](https://github.com/volcengine/verl/releases/tag/v0.4.0) "
#~ "起支持 vLLM Kunlun,更多信息请参见 [verl x Kunlun"
#~ " "
#~ "快速上手](https://verl.readthedocs.io/en/latest/kunlun_tutorial/kunlun_quick_start.html)。"


@@ -0,0 +1,108 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-10 16:59+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/community/user_stories/llamafactory.md:1
msgid "LLaMA-Factory"
msgstr "LLaMA-Factory"
#: ../../source/community/user_stories/llamafactory.md:3
#, fuzzy
msgid "**Introduction**"
msgstr "**介绍**"
#: ../../source/community/user_stories/llamafactory.md:5
msgid ""
"[LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory) is an easy-to-"
"use and efficient platform for training and fine-tuning large language "
"models. With LLaMA-Factory, you can fine-tune hundreds of pre-trained "
"models locally without writing any code."
msgstr ""
"[LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory) "
"是一个易于使用且高效的平台,用于训练和微调大型语言模型。有了 LLaMA-"
"Factory,你可以在本地对数百个预训练模型进行微调,无需编写任何代码。"
#: ../../source/community/user_stories/llamafactory.md:7
#, fuzzy
msgid ""
"LLaMA-Factory users need to evaluate and inference the model after fine-"
"tuning."
msgstr "LLaMA-Factory 用户需要在微调后对模型进行评估和推理。"
#: ../../source/community/user_stories/llamafactory.md:9
#, fuzzy
msgid "**Business challenge**"
msgstr "**业务挑战**"
#: ../../source/community/user_stories/llamafactory.md:11
#, fuzzy
msgid ""
"LLaMA-Factory uses Transformers to perform inference on Kunlun XPUs, but "
"the speed is slow."
msgstr "LLaMA-Factory 使用 transformers 在 Kunlun XPU 上进行推理,但速度较慢。"
#: ../../source/community/user_stories/llamafactory.md:13
#, fuzzy
msgid "**Benefits with vLLM Kunlun**"
msgstr "**通过 vLLM Kunlun 解决挑战与收益**"
#: ../../source/community/user_stories/llamafactory.md:15
msgid ""
"With the joint efforts of LLaMA-Factory and vLLM Kunlun ([LLaMA-"
"Factory#7739](https://github.com/hiyouga/LLaMA-Factory/pull/7739)), "
"LLaMA-Factory has achieved significant performance gains during model "
"inference. Benchmark results show that its inference speed is now up to "
"2× faster compared to the Transformers implementation."
msgstr ""
"在 LLaMA-Factory 和 vLLM Kunlun([LLaMA-"
"Factory#7739](https://github.com/hiyouga/LLaMA-Factory/pull/7739))的共同努力下,"
"LLaMA-Factory 在模型推理阶段取得了显著的性能提升。基准测试结果显示,其推理速度相比 "
"Transformers 实现最高可提升 2 倍。"
#: ../../source/community/user_stories/llamafactory.md:17
msgid "**Learn more**"
msgstr "**了解更多**"
#: ../../source/community/user_stories/llamafactory.md:19
#, fuzzy
msgid ""
"See more details about LLaMA-Factory and how it uses vLLM Kunlun for "
"inference on Kunlun XPUs in [LLaMA-Factory Kunlun XPU "
"Inference](https://llamafactory.readthedocs.io/en/latest/advanced/npu_inference.html)."
msgstr ""
"在以下文档中查看更多关于 LLaMA-Factory 及其如何在 Kunlun XPU 上使用 vLLM Kunlun 进行推理的信息:"
"[LLaMA-Factory Kunlun XPU "
"推理](https://llamafactory.readthedocs.io/en/latest/advanced/npu_inference.html)。"
#~ msgid ""
#~ "With the joint efforts of LLaMA-"
#~ "Factory and vLLM Kunlun ([LLaMA-"
#~ "Factory#7739](https://github.com/hiyouga/LLaMA-"
#~ "Factory/pull/7739)), the performance of "
#~ "LLaMA-Factory in the model inference "
#~ "stage has been significantly improved. "
#~ "According to the test results, the "
#~ "inference speed of LLaMA-Factory has "
#~ "been increased to 2x compared to "
#~ "the transformers version."
#~ msgstr ""
#~ "在 LLaMA-Factory 和 vLLM Kunlun "
#~ "的共同努力下(参见 [LLaMA-Factory#7739](https://github.com/hiyouga"
#~ "/LLaMA-Factory/pull/7739)LLaMA-Factory "
#~ "在模型推理阶段的性能得到了显著提升。根据测试结果LLaMA-Factory 的推理速度相比 "
#~ "transformers 版本提升到了 2 倍。"


@@ -0,0 +1,575 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-10 16:59+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/community/versioning_policy.md:1
msgid "Versioning policy"
msgstr "版本管理策略"
#~ msgid ""
#~ "Starting with vLLM 0.7.x, the vLLM "
#~ "Kunlun Plugin ([vllm-project/vllm-"
#~ "kunlun](https://github.com/vllm-project/vllm-kunlun)) "
#~ "project follows the [PEP "
#~ "440](https://peps.python.org/pep-0440/) to publish "
#~ "matching with vLLM ([vllm-"
#~ "project/vllm](https://github.com/vllm-project/vllm))."
#~ msgstr ""
#~ "从 vLLM 0.7.x 开始vLLM Kunlun 插件([vllm-"
#~ "project/vllm-kunlun](https://github.com/vllm-project"
#~ "/vllm-kunlun))项目遵循 [PEP "
#~ "440](https://peps.python.org/pep-0440/),以与 vLLM([vllm-"
#~ "project/vllm](https://github.com/vllm-project/vllm))版本匹配发布。"
#~ msgid "vLLM Kunlun Plugin versions"
#~ msgstr "vLLM Kunlun 插件版本"
#~ msgid ""
#~ "Each vLLM Kunlun release will be "
#~ "versioned: `v[major].[minor].[micro][rcN][.postN]` (such"
#~ " as `v0.7.3rc1`, `v0.7.3`, `v0.7.3.post1`)"
#~ msgstr ""
#~ "每个 vLLM Kunlun "
#~ "版本将采用以下版本格式:`v[major].[minor].[micro][rcN][.postN]`(例如 "
#~ "`v0.7.3rc1`、`v0.7.3`、`v0.7.3.post1`)"
#~ msgid ""
#~ "**Final releases**: will typically be "
#~ "released every **3 months**, will take"
#~ " the vLLM upstream release plan and"
#~ " Kunlun software product release plan "
#~ "into comprehensive consideration."
#~ msgstr "**正式版本**:通常每 **3 个月**发布一次,将综合考虑 vLLM 上游发行计划和昆仑软件产品发行计划。"
#~ msgid ""
#~ "**Pre releases**: will typically be "
#~ "released **on demand**, ending with rcN,"
#~ " represents the Nth release candidate "
#~ "version, to support early testing by "
#~ "our users prior to a final "
#~ "release."
#~ msgstr "**预发布版本**:通常会**按需发布**,以 rcN 结尾,表示第 N 个候选发布版本,旨在支持用户在正式发布前进行早期测试。"
#~ msgid ""
#~ "**Post releases**: will typically be "
#~ "released **on demand** to support to "
#~ "address minor errors in a final "
#~ "release. It's different from [PEP-440 "
#~ "post release note](https://peps.python.org/pep-0440"
#~ "/#post-releases) suggestion, it will "
#~ "contain actual bug fixes considering "
#~ "that the final release version should"
#~ " be matched strictly with the vLLM"
#~ " final release version "
#~ "(`v[major].[minor].[micro]`). The post version "
#~ "has to be published as a patch "
#~ "version of the final release."
#~ msgstr ""
#~ "**后续版本**:通常会根据需要发布,以支持解决正式发布中的小错误。这与 [PEP-440 "
#~ "的后续版本说明](https://peps.python.org/pep-0440/#post-releases) "
#~ "建议不同,它将包含实际的 bug 修复,因为最终发布版本应严格与 vLLM "
#~ "的最终发布版本(`v[major].[minor].[micro]`)匹配。后续版本必须以正式发布的补丁版本形式发布。"
#~ msgid "For example:"
#~ msgstr "例如:"
#~ msgid ""
#~ "`v0.7.x`: it's the first final release"
#~ " to match the vLLM `v0.7.x` version."
#~ msgstr "`v0.7.x`:这是第一个与 vLLM `v0.7.x` 版本相匹配的正式发布版本。"
#~ msgid "`v0.7.3rc1`: will be the first pre version of vLLM Kunlun."
#~ msgstr "`v0.7.3rc1`:将会是 vLLM Kunlun 的第一个预发布版本。"
#~ msgid ""
#~ "`v0.7.3.post1`: will be the post release"
#~ " if the `v0.7.3` release has some "
#~ "minor errors."
#~ msgstr "`v0.7.3.post1`:如果 `v0.7.3` 版本发布有一些小错误,将作为后续修正版发布。"
#~ msgid "Release Compatibility Matrix"
#~ msgstr "版本兼容性矩阵"
#~ msgid "Following is the Release Compatibility Matrix for vLLM Kunlun Plugin:"
#~ msgstr "以下是 vLLM Kunlun 插件的版本兼容性矩阵:"
#~ msgid "vLLM Kunlun"
#~ msgstr "vLLM Kunlun"
#~ msgid "vLLM"
#~ msgstr "vLLM"
#~ msgid "Python"
#~ msgstr "Python"
#~ msgid "Stable CANN"
#~ msgstr "Stable CANN"
#~ msgid "PyTorch/torch_npu"
#~ msgstr "PyTorch/torch_npu"
#~ msgid "MindIE Turbo"
#~ msgstr "MindIE Turbo"
#~ msgid "v0.9.2rc1"
#~ msgstr "v0.9.2rc1"
#~ msgid "v0.9.2"
#~ msgstr "v0.9.2"
#~ msgid ">= 3.9, < 3.12"
#~ msgstr ">= 3.9,< 3.12"
#~ msgid "8.1.RC1"
#~ msgstr "8.1.RC1"
#~ msgid "2.5.1 / 2.5.1.post1.dev20250619"
#~ msgstr "2.5.1 / 2.5.1.post1.dev20250619"
#~ msgid "v0.9.1rc1"
#~ msgstr "v0.9.1rc1"
#~ msgid "v0.9.1"
#~ msgstr "v0.9.1"
#~ msgid "2.5.1 / 2.5.1.post1.dev20250528"
#~ msgstr "2.5.1 / 2.5.1.post1.dev20250528"
#~ msgid "v0.9.0rc2"
#~ msgstr "v0.9.0rc2"
#~ msgid "v0.9.0"
#~ msgstr "v0.9.0"
#~ msgid "2.5.1 / 2.5.1"
#~ msgstr "2.5.1 / 2.5.1"
#~ msgid "v0.9.0rc1"
#~ msgstr "v0.9.0rc1"
#~ msgid "v0.8.5rc1"
#~ msgstr "v0.8.5rc1"
#~ msgid "v0.8.5.post1"
#~ msgstr "v0.8.5.post1"
#~ msgid "v0.8.4rc2"
#~ msgstr "v0.8.4rc2"
#~ msgid "v0.8.4"
#~ msgstr "v0.8.4"
#~ msgid "8.0.0"
#~ msgstr "8.0.0"
#~ msgid "v0.7.3.post1"
#~ msgstr "v0.7.3.post1"
#~ msgid "v0.7.3"
#~ msgstr "v0.7.3"
#~ msgid "2.0rc1"
#~ msgstr "2.0rc1"
#~ msgid "Release cadence"
#~ msgstr "发布节奏"
#~ msgid "release window"
#~ msgstr "发布窗口"
#~ msgid "Date"
#~ msgstr "日期"
#~ msgid "Event"
#~ msgstr "事件"
#~ msgid "2025.07.11"
#~ msgstr "2025.07.11"
#~ msgid "Release candidates, v0.9.2rc1"
#~ msgstr "候选发布版本,v0.9.2rc1"
#~ msgid "2025.06.22"
#~ msgstr "2025.06.22"
#~ msgid "Release candidates, v0.9.1rc1"
#~ msgstr "候选发布版本,v0.9.1rc1"
#~ msgid "2025.06.10"
#~ msgstr "2025.06.10"
#~ msgid "Release candidates, v0.9.0rc2"
#~ msgstr "候选发布版本,v0.9.0rc2"
#~ msgid "2025.06.09"
#~ msgstr "2025.06.09"
#~ msgid "Release candidates, v0.9.0rc1"
#~ msgstr "候选发布版本,v0.9.0rc1"
#~ msgid "2025.05.29"
#~ msgstr "2025.05.29"
#~ msgid "v0.7.x post release, v0.7.3.post1"
#~ msgstr "v0.7.x 补丁版,v0.7.3.post1"
#~ msgid "2025.05.08"
#~ msgstr "2025.05.08"
#~ msgid "v0.7.x Final release, v0.7.3"
#~ msgstr "v0.7.x 正式版,v0.7.3"
#~ msgid "2025.05.06"
#~ msgstr "2025.05.06"
#~ msgid "Release candidates, v0.8.5rc1"
#~ msgstr "候选发布版本,v0.8.5rc1"
#~ msgid "2025.04.28"
#~ msgstr "2025.04.28"
#~ msgid "Release candidates, v0.8.4rc2"
#~ msgstr "候选发布版本,v0.8.4rc2"
#~ msgid "2025.04.18"
#~ msgstr "2025.04.18"
#~ msgid "Release candidates, v0.8.4rc1"
#~ msgstr "候选发布版本,v0.8.4rc1"
#~ msgid "2025.03.28"
#~ msgstr "2025.03.28"
#~ msgid "Release candidates, v0.7.3rc2"
#~ msgstr "候选发布版本,v0.7.3rc2"
#~ msgid "2025.03.14"
#~ msgstr "2025.03.14"
#~ msgid "Release candidates, v0.7.3rc1"
#~ msgstr "候选发布版本,v0.7.3rc1"
#~ msgid "2025.02.19"
#~ msgstr "2025.02.19"
#~ msgid "Release candidates, v0.7.1rc1"
#~ msgstr "候选发布版本,v0.7.1rc1"
#~ msgid "Branch policy"
#~ msgstr "分支策略"
#~ msgid "vLLM Kunlun has main branch and dev branch."
#~ msgstr "vLLM Kunlun 有主分支和开发分支。"
#~ msgid ""
#~ "**main**: main branch, corresponds to the "
#~ "vLLM main branch and latest 1 or"
#~ " 2 release version. It is "
#~ "continuously monitored for quality through "
#~ "Kunlun CI."
#~ msgstr "**main**:main 分支,对应 vLLM 的主分支和最新的 1 或 2 个发布版本。该分支通过 Kunlun CI 持续监控质量。"
#~ msgid ""
#~ "**vX.Y.Z-dev**: development branch, created "
#~ "with part of new releases of vLLM."
#~ " For example, `v0.7.3-dev` is the dev"
#~ " branch for vLLM `v0.7.3` version."
#~ msgstr ""
#~ "**vX.Y.Z-dev**:开发分支,是随着 vLLM 新版本的一部分一起创建的。例如,`v0.7.3-dev`"
#~ " 是 vLLM `v0.7.3` 版本的开发分支。"
#~ msgid ""
#~ "Usually, a commit should be ONLY "
#~ "first merged in the main branch, "
#~ "and then backported to the dev "
#~ "branch to reduce maintenance costs as"
#~ " much as possible."
#~ msgstr "通常,提交应该只先合并到主分支,然后再回溯合并到开发分支,以尽可能降低维护成本。"
#~ msgid "Maintenance branch and EOL:"
#~ msgstr "维护分支与生命周期结束(EOL):"
#~ msgid "The branch status will be in one of the following states:"
#~ msgstr "分支状态将处于以下几种状态之一:"
#~ msgid "Branch"
#~ msgstr "分支"
#~ msgid "Time frame"
#~ msgstr "时间范围"
#~ msgid "Summary"
#~ msgstr "摘要"
#~ msgid "Maintained"
#~ msgstr "维护中"
#~ msgid "Approximately 2-3 minor versions"
#~ msgstr "大约 2-3 个小版本"
#~ msgid "All bugfixes are appropriate. Releases produced, CI commitment."
#~ msgstr "接受所有适当的 bug 修复。会产出发布版本,并有 CI 保障。"
#~ msgid "Unmaintained"
#~ msgstr "无人维护"
#~ msgid "Community interest driven"
#~ msgstr "社区兴趣驱动"
#~ msgid "All bugfixes are appropriate. No Releases produced, No CI commitment"
#~ msgstr "接受所有适当的 bug 修复。不产出发布版本,无 CI 保障。"
#~ msgid "End of Life (EOL)"
#~ msgstr "生命周期结束(EOL)"
#~ msgid "N/A"
#~ msgstr "不适用"
#~ msgid "Branch no longer accepting changes"
#~ msgstr "该分支不再接受更改"
#~ msgid "Branch state"
#~ msgstr "分支状态"
#~ msgid ""
#~ "Note that vLLM Kunlun will only be"
#~ " released for a certain vLLM release"
#~ " version rather than all versions. "
#~ "Hence, You might see only part of"
#~ " versions have dev branches (such as"
#~ " only `0.7.1-dev` / `0.7.3-dev` but "
#~ "no `0.7.2-dev`), this is as expected."
#~ msgstr ""
#~ "请注意vLLM Kunlun 只会针对某些 vLLM "
#~ "发布版本发布,而不是所有版本。因此,您可能会看到只有部分版本拥有开发分支(例如只有 `0.7.1-dev` /"
#~ " `0.7.3-dev`,而没有 `0.7.2-dev`),这是正常现象。"
#~ msgid ""
#~ "Usually, each minor version of vLLM "
#~ "(such as 0.7) will correspond to a"
#~ " vLLM Kunlun version branch and "
#~ "support its latest version (for example,"
#~ " we plan to support version 0.7.3)"
#~ " as following shown:"
#~ msgstr ""
#~ "通常,vLLM 的每一个小版本(例如 0.7)都会对应一个 vLLM Kunlun "
#~ "版本分支,并支持其最新版本(例如,我们计划支持 0.7.3 版),如下所示:"
#~ msgid "Status"
#~ msgstr "状态"
#~ msgid "Note"
#~ msgstr "注释"
#~ msgid "main"
#~ msgstr "main"
#~ msgid "CI commitment for vLLM main branch and vLLM 0.9.2 branch"
#~ msgstr "vLLM 主分支和 vLLM 0.9.2 分支的 CI 承诺"
#~ msgid "v0.9.1-dev"
#~ msgstr "v0.9.1-dev"
#~ msgid "CI commitment for vLLM 0.9.1 version"
#~ msgstr "vLLM 0.9.1 版本的 CI 承诺"
#~ msgid "v0.7.3-dev"
#~ msgstr "v0.7.3-dev"
#~ msgid "CI commitment for vLLM 0.7.3 version"
#~ msgstr "vLLM 0.7.3 版本的 CI 承诺"
#~ msgid "v0.7.1-dev"
#~ msgstr "v0.7.1-dev"
#~ msgid "Replaced by v0.7.3-dev"
#~ msgstr "已被 v0.7.3-dev 替代"
#~ msgid "Backward compatibility"
#~ msgstr "向后兼容性"
#~ msgid ""
#~ "For main branch, vLLM Kunlun should "
#~ "works with vLLM main branch and "
#~ "latest 1 or 2 release version. So"
#~ " to ensure the backward compatibility, "
#~ "we will do the following:"
#~ msgstr ""
#~ "对于主分支,vLLM Kunlun 应该与 vLLM 主分支以及最新的 1"
#~ " 或 2 个发布版本兼容。因此,为了确保向后兼容性,我们将执行以下操作:"
#~ msgid ""
#~ "Both main branch and target vLLM "
#~ "release is tested by Kunlun E2E "
#~ "CI. For example, currently, vLLM main"
#~ " branch and vLLM 0.8.4 are tested "
#~ "now."
#~ msgstr "主分支和目标 vLLM 发行版都经过了 Kunlun E2E CI 的测试。例如,目前正在测试 vLLM 主分支和 vLLM 0.8.4。"
#~ msgid ""
#~ "For code changes, we will make "
#~ "sure that the changes are compatible "
#~ "with the latest 1 or 2 vLLM "
#~ "release version as well. In this "
#~ "case, vLLM Kunlun introduced a version"
#~ " check mechanism within the code. "
#~ "It'll check the version of installed "
#~ "vLLM package first to decide which "
#~ "code logic to use. If users hit"
#~ " the `InvalidVersion` error, it sometimes"
#~ " means that they have installed a"
#~ " dev/editable version of vLLM package. "
#~ "In this case, we provide the env"
#~ " variable `VLLM_VERSION` to let users "
#~ "specify the version of vLLM package "
#~ "to use."
#~ msgstr ""
#~ "对于代码更改,我们也会确保这些更改与最新的 1 或 2 个 vLLM "
#~ "发行版本兼容。在这种情况下,vLLM Kunlun 在代码中引入了版本检查机制。它会先检查已安装的 "
#~ "vLLM 包的版本,然后决定使用哪段代码逻辑。如果用户遇到 `InvalidVersion` "
#~ "错误,这有时意味着他们安装了 dev/可编辑版本的 vLLM 包。此时,我们提供了环境变量 "
#~ "`VLLM_VERSION`,让用户可以指定要使用的 vLLM 包版本。"
#~ msgid ""
#~ "For documentation changes, we will make"
#~ " sure that the changes are compatible"
#~ " with the latest 1 or 2 vLLM"
#~ " release version as well. Note should"
#~ " be added if there are any "
#~ "breaking changes."
#~ msgstr "对于文档更改,我们会确保这些更改也兼容最新的 1 个或 2 个 vLLM 发布版本。如果有任何重大变更,应添加说明。"
#~ msgid "Document Branch Policy"
#~ msgstr "文档分支政策"
#~ msgid ""
#~ "To reduce maintenance costs, **all "
#~ "branch documentation content should remain "
#~ "consistent, and version differences can "
#~ "be controlled via variables in "
#~ "[docs/source/conf.py](https://github.com/vllm-project/vllm-"
#~ "kunlun/blob/main/docs/source/conf.py)**. While this "
#~ "is not a simple task, it is "
#~ "a principle we should strive to "
#~ "follow."
#~ msgstr ""
#~ "为了减少维护成本,**所有分支的文档内容应保持一致,版本差异可以通过 "
#~ "[docs/source/conf.py](https://github.com/vllm-project/vllm-"
#~ "kunlun/blob/main/docs/source/conf.py) "
#~ "中的变量进行控制**。虽然这并非易事,但这是我们应当努力遵循的原则。"
#~ msgid "Version"
#~ msgstr "版本"
#~ msgid "Purpose"
#~ msgstr "用途"
#~ msgid "Code Branch"
#~ msgstr "代码分支"
#~ msgid "latest"
#~ msgstr "最新"
#~ msgid "Doc for the latest dev branch"
#~ msgstr "最新开发分支的文档"
#~ msgid "vX.Y.Z-dev (Will be `main` after the first final release)"
#~ msgstr "vX.Y.Z-dev(在第一个正式版本发布后将成为 `main`)"
#~ msgid "version"
#~ msgstr "版本"
#~ msgid "Doc for historical released versions"
#~ msgstr "历史版本文档"
#~ msgid "Git tags, like vX.Y.Z[rcN]"
#~ msgstr "Git 标签,如 vX.Y.Z[rcN]"
#~ msgid "stable (not yet released)"
#~ msgstr "稳定版(尚未发布)"
#~ msgid "Doc for latest final release branch"
#~ msgstr "最新正式发布分支的文档"
#~ msgid "Will be `vX.Y.Z-dev` after the first official release"
#~ msgstr "首个正式发布后将会是 `vX.Y.Z-dev`"
#~ msgid "As shown above:"
#~ msgstr "如上所示:"
#~ msgid ""
#~ "`latest` documentation: Matches the current"
#~ " maintenance branch `vX.Y.Z-dev` (Will be"
#~ " `main` after the first final "
#~ "release). Continuously updated to ensure "
#~ "usability for the latest release."
#~ msgstr "`latest` 文档:匹配当前维护分支 `vX.Y.Z-dev`(在首次正式发布后将为 `main`)。持续更新,以确保适用于最新发布版本。"
#~ msgid ""
#~ "`version` documentation: Corresponds to "
#~ "specific released versions (e.g., `v0.7.3`,"
#~ " `v0.7.3rc1`). No further updates after "
#~ "release."
#~ msgstr "`version` 文档:对应特定的已发布版本(例如,`v0.7.3`、`v0.7.3rc1`)。发布后不再进行更新。"
#~ msgid ""
#~ "`stable` documentation (**not yet released**):"
#~ " Official release documentation. Updates "
#~ "are allowed in real-time after "
#~ "release, typically based on vX.Y.Z-dev. "
#~ "Once stable documentation is available, "
#~ "non-stable versions should display a "
#~ "header warning: `You are viewing the "
#~ "latest developer preview docs. Click "
#~ "here to view docs for the latest"
#~ " stable release.`."
#~ msgstr ""
#~ "`stable` 文档(**尚未发布**):官方发布版文档。发布后允许实时更新,通常基于 "
#~ "vX.Y.Z-dev。一旦稳定版文档可用,非稳定版本应显示顶部警告:`您正在查看最新的开发预览文档。点击此处查看最新稳定版本文档。`"
#~ msgid "Software Dependency Management"
#~ msgstr "软件依赖管理"
#~ msgid ""
#~ "`torch-xpu`: Kunlun Extension for "
#~ "PyTorch (torch-xpu) releases a stable"
#~ " version to [PyPi](https://pypi.org/project/torch-"
#~ "xpu) every 3 months, a development "
#~ "version (aka the POC version) every "
#~ "month, and a nightly version every "
#~ "day. The PyPi stable version **CAN** "
#~ "be used in vLLM Kunlun final "
#~ "version, the monthly dev version **ONLY"
#~ " CAN** be used in vLLM Kunlun "
#~ "RC version for rapid iteration, the "
#~ "nightly version **CANNOT** be used in"
#~ " vLLM Kunlun any version and "
#~ "branches."
#~ msgstr ""
#~ "`torch-xpu`Kunlun Extension for PyTorch"
#~ "torch-xpu每 3 个月会在 "
#~ "[PyPi](https://pypi.org/project/torch-xpu) "
#~ "上发布一个稳定版本,每个月发布一个开发版本(即 POC 版本),每天发布一个 nightly "
#~ "版本。PyPi 上的稳定版本**可以**用于 vLLM Kunlun "
#~ "的正式版本,月度开发版本**只能**用于 vLLM Kunlun 的 "
#~ "RC候选发布版本以便快速迭代nightly 版本**不能**用于 vLLM Kunlun "
#~ "的任何版本和分支。"
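
The release scheme and `VLLM_VERSION` override described in the versioning-policy strings above can be sketched in Python. This is a minimal illustration of the documented `v[major].[minor].[micro][rcN][.postN]` format and the environment-variable override for dev/editable installs; `parse_release` and `effective_vllm_version` are hypothetical helper names, not part of vllm-kunlun:

```python
import os
import re

# Pattern for the documented vLLM Kunlun release format:
# v[major].[minor].[micro][rcN][.postN], e.g. v0.7.3rc1, v0.7.3, v0.7.3.post1
_RELEASE_RE = re.compile(
    r"^v(?P<major>\d+)\.(?P<minor>\d+)\.(?P<micro>\d+)"
    r"(?:rc(?P<rc>\d+))?(?:\.post(?P<post>\d+))?$"
)

def parse_release(tag: str):
    """Return the version components of a release tag, or None if malformed."""
    m = _RELEASE_RE.match(tag)
    return m.groupdict() if m else None

def effective_vllm_version(installed: str) -> str:
    """Sketch of the documented override: VLLM_VERSION takes precedence over
    the installed package version (useful when a dev/editable vLLM install
    reports a version that fails parsing with InvalidVersion)."""
    return os.environ.get("VLLM_VERSION", installed)

print(parse_release("v0.7.3rc1"))
# → {'major': '0', 'minor': '7', 'micro': '3', 'rc': '1', 'post': None}
```

A post release such as `v0.7.3.post1` parses the same way, with `post` set, which matches the policy that post versions are patch releases of a final release.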


@@ -0,0 +1,177 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-10 16:59+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/developer_guide/contribution/index.md:1
msgid "Contributing"
msgstr "贡献"
#: ../../source/developer_guide/contribution/index.md:3
#, fuzzy
msgid "Building and Testing"
msgstr "构建与测试"
#~ msgid "Index"
#~ msgstr "索引"
#~ msgid ""
#~ "It's recommended to set up a local"
#~ " development environment to build and "
#~ "test before you submit a PR."
#~ msgstr "建议先搭建本地开发环境来进行构建和测试,再提交 PR。"
#~ msgid "Setup development environment"
#~ msgstr "搭建开发环境"
#~ msgid ""
#~ "Theoretically, the vllm-kunlun build is"
#~ " only supported on Linux because "
#~ "`vllm-kunlun` dependency `torch_npu` only "
#~ "supports Linux."
#~ msgstr ""
#~ "理论上,vllm-kunlun 构建仅支持 Linux,因为 `vllm-"
#~ "kunlun` 的依赖项 `torch_npu` 只支持 Linux。"
#~ msgid ""
#~ "But you can still set up dev "
#~ "env on Linux/Windows/macOS for linting "
#~ "and basic test as following commands:"
#~ msgstr "但你仍然可以在 Linux/Windows/macOS 上按照以下命令设置开发环境,用于代码规约检查和基本测试:"
#~ msgid "Run lint locally"
#~ msgstr "在本地运行 lint"
#~ msgid "Run CI locally"
#~ msgstr "本地运行CI"
#~ msgid "After complete \"Run lint\" setup, you can run CI locally:"
#~ msgstr "在完成“运行 lint”设置后你可以在本地运行 CI"
#~ msgid "Submit the commit"
#~ msgstr "提交该提交"
#~ msgid ""
#~ "🎉 Congratulations! You have completed "
#~ "the development environment setup."
#~ msgstr "🎉 恭喜!你已经完成了开发环境的搭建。"
#~ msgid "Test locally"
#~ msgstr "本地测试"
#~ msgid ""
#~ "You can refer to [Testing](./testing.md) "
#~ "doc to help you setup testing "
#~ "environment and running tests locally."
#~ msgstr "你可以参考 [测试](./testing.md) 文档,帮助你搭建测试环境并在本地运行测试。"
#~ msgid "DCO and Signed-off-by"
#~ msgstr "DCO 和签名确认"
#~ msgid ""
#~ "When contributing changes to this "
#~ "project, you must agree to the "
#~ "DCO. Commits must include a `Signed-"
#~ "off-by:` header which certifies "
#~ "agreement with the terms of the "
#~ "DCO."
#~ msgstr "当为本项目贡献更改时,您必须同意 DCO。提交必须包含 `Signed-off-by:` 头部,以证明您同意 DCO 的条款。"
#~ msgid "Using `-s` with `git commit` will automatically add this header."
#~ msgstr "在使用 `git commit` 时加上 `-s` 参数会自动添加这个头部信息。"
#~ msgid "PR Title and Classification"
#~ msgstr "PR 标题与分类"
#~ msgid ""
#~ "Only specific types of PRs will be"
#~ " reviewed. The PR title is prefixed"
#~ " appropriately to indicate the type "
#~ "of change. Please use one of the"
#~ " following:"
#~ msgstr "只有特定类型的 PR 会被审核。PR 标题应使用合适的前缀以指明更改类型。请使用以下之一:"
#~ msgid "`[Attention]` for new features or optimization in attention."
#~ msgstr "`[Attention]` 用于注意力机制中新特性或优化。"
#~ msgid "`[Communicator]` for new features or optimization in communicators."
#~ msgstr "`[Communicator]` 适用于通信器中的新特性或优化。"
#~ msgid "`[ModelRunner]` for new features or optimization in model runner."
#~ msgstr "`[ModelRunner]` 用于模型运行器中的新功能或优化。"
#~ msgid "`[Platform]` for new features or optimization in platform."
#~ msgstr "`[Platform]` 用于平台中新功能或优化。"
#~ msgid "`[Worker]` for new features or optimization in worker."
#~ msgstr "`[Worker]` 用于 worker 的新功能或优化。"
#~ msgid ""
#~ "`[Core]` for new features or "
#~ "optimization in the core vllm-kunlun"
#~ " logic (such as platform, attention, "
#~ "communicators, model runner)"
#~ msgstr "`[Core]` 用于核心 vllm-kunlun 逻辑中的新特性或优化(例如平台、注意力机制、通信器、模型运行器)。"
#~ msgid "`[Kernel]` changes affecting compute kernels and ops."
#~ msgstr "`[Kernel]` 影响计算内核和操作的更改。"
#~ msgid "`[Bugfix]` for bug fixes."
#~ msgstr "`[Bugfix]` 用于表示错误修复。"
#~ msgid "`[Doc]` for documentation fixes and improvements."
#~ msgstr "`[Doc]` 用于文档修复和改进。"
#~ msgid "`[Test]` for tests (such as unit tests)."
#~ msgstr "`[Test]` 用于测试(如单元测试)。"
#~ msgid "`[CI]` for build or continuous integration improvements."
#~ msgstr "`[CI]` 用于构建或持续集成的改进。"
#~ msgid ""
#~ "`[Misc]` for PRs that do not fit"
#~ " the above categories. Please use "
#~ "this sparingly."
#~ msgstr "对于不属于上述类别的 PR请使用 `[Misc]`。请谨慎使用此标签。"
#~ msgid ""
#~ "If the PR spans more than one "
#~ "category, please include all relevant "
#~ "prefixes."
#~ msgstr "如果拉取请求PR涵盖多个类别请包含所有相关的前缀。"
#~ msgid "Others"
#~ msgstr "其他"
#~ msgid ""
#~ "You may find more information about "
#~ "contributing to vLLM Kunlun backend "
#~ "plugin on "
#~ "[<u>docs.vllm.ai</u>](https://docs.vllm.ai/en/latest/contributing/overview.html)."
#~ " If you find any problem when "
#~ "contributing, you can feel free to "
#~ "submit a PR to improve the doc "
#~ "to help other developers."
#~ msgstr ""
#~ "你可以在 "
#~ "[<u>docs.vllm.ai</u>](https://docs.vllm.ai/en/latest/contributing/overview.html)"
#~ " 上找到有关为 vLLM Kunlun "
#~ "后端插件做贡献的更多信息。如果你在贡献过程中遇到任何问题,欢迎随时提交 PR 来改进文档,以帮助其他开发者。"


@@ -0,0 +1,133 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun \n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-10 16:59+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/developer_guide/contribution/multi_node_test.md:1
msgid "Multi Node Test"
msgstr ""
#: ../../source/developer_guide/contribution/multi_node_test.md:3
msgid ""
"Multi-Node CI is designed to test distributed scenarios of very large "
"models, eg: disaggregated_prefill multi DP across multi nodes and so on."
msgstr ""
#: ../../source/developer_guide/contribution/multi_node_test.md:5
msgid "How is works"
msgstr ""
#: ../../source/developer_guide/contribution/multi_node_test.md:7
msgid ""
"The following picture shows the basic deployment view of the multi-node "
"CI mechanism, It shows how the github action interact with "
"[lws](https://lws.sigs.k8s.io/docs/overview/) (a kind of kubernetes crd "
"resource)"
msgstr ""
#: ../../source/developer_guide/contribution/multi_node_test.md:9
msgid "![alt text](../../assets/deployment.png)"
msgstr ""
#: ../../source/developer_guide/contribution/multi_node_test.md:9
#: ../../source/developer_guide/contribution/multi_node_test.md:13
msgid "alt text"
msgstr ""
#: ../../source/developer_guide/contribution/multi_node_test.md:11
msgid ""
"From the workflow perspective, we can see how the final test script is "
"executed, The key point is that these two [lws.yaml and "
"run.sh](https://github.com/vllm-project/vllm-"
"kunlun/tree/main/tests/e2e/nightly/multi_node/scripts), The former "
"defines how our k8s cluster is pulled up, and the latter defines the "
"entry script when the pod is started, Each node executes different logic "
"according to the "
"[LWS_WORKER_INDEX](https://lws.sigs.k8s.io/docs/reference/labels-"
"annotations-and-environment-variables/) environment variable, so that "
"multiple nodes can form a distributed cluster to perform tasks."
msgstr ""
#: ../../source/developer_guide/contribution/multi_node_test.md:13
msgid "![alt text](../../assets/workflow.png)"
msgstr ""
#: ../../source/developer_guide/contribution/multi_node_test.md:15
msgid "How to contribute"
msgstr ""
#: ../../source/developer_guide/contribution/multi_node_test.md:17
msgid "Upload custom weights"
msgstr ""
#: ../../source/developer_guide/contribution/multi_node_test.md:19
msgid ""
"If you need customized weights, for example, you quantized a w8a8 weight "
"for DeepSeek-V3 and you want your weight to run on CI, Uploading weights "
"to ModelScope's [vllm-kunlun](https://www.modelscope.cn/organization"
"/vllm-kunlun) organization is welcome, If you do not have permission to "
"upload, please contact @Potabk"
msgstr ""
#: ../../source/developer_guide/contribution/multi_node_test.md:21
msgid "Add config yaml"
msgstr ""
#: ../../source/developer_guide/contribution/multi_node_test.md:23
msgid ""
"As the entrypoint script [run.sh](https://github.com/vllm-project/vllm-"
"kunlun/blob/0bf3f21a987aede366ec4629ad0ffec8e32fe90d/tests/e2e/nightly/multi_node/scripts/run.sh#L106)"
" shows, A k8s pod startup means traversing all *.yaml files in the "
"[directory](https://github.com/vllm-project/vllm-"
"kunlun/tree/main/tests/e2e/nightly/multi_node/config/models), reading and"
" executing according to different configurations, so what we need to do "
"is just add \"yamls\" like [DeepSeek-V3.yaml](https://github.com/vllm-"
"project/vllm-"
"kunlun/blob/main/tests/e2e/nightly/multi_node/config/models/DeepSeek-V3.yaml)."
msgstr ""
#: ../../source/developer_guide/contribution/multi_node_test.md:25
msgid ""
"Suppose you have **2 nodes** running a 1P1D setup (1 Prefillers + 1 "
"Decoder):"
msgstr ""
#: ../../source/developer_guide/contribution/multi_node_test.md:27
msgid "you may add a config file looks like:"
msgstr ""
#: ../../source/developer_guide/contribution/multi_node_test.md:69
msgid ""
"Add the case to nightly workflow currently, the multi-node test workflow "
"defined in the [vllm_kunlun_test_nightly_a2/a3.yaml](https://github.com"
"/vllm-project/vllm-"
"kunlun/blob/main/.github/workflows/vllm_kunlun_test_nightly_a3.yaml)"
msgstr ""
#: ../../source/developer_guide/contribution/multi_node_test.md:99
msgid ""
"The matrix above defines all the parameters required to add a multi-"
"machine use case, The parameters worth paying attention to (I mean if you"
" are adding a new use case) are size and the path to the yaml "
"configuration file. The former defines the number of nodes required for "
"your use case, and the latter defines the path to the configuration file "
"you have completed in step 2."
msgstr ""


@@ -0,0 +1,265 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-10 16:59+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/developer_guide/contribution/testing.md:1
msgid "Testing"
msgstr "测试"
#: ../../source/developer_guide/contribution/testing.md:3
#, fuzzy
msgid ""
"This document explains how to write E2E tests and unit tests to verify "
"the implementation of your feature."
msgstr "本节介绍如何编写端到端测试和单元测试,以验证你的功能实现。"
#: ../../source/developer_guide/contribution/testing.md:5
#, fuzzy
msgid "Setup a test environment"
msgstr "设置测试环境"
#: ../../source/developer_guide/contribution/testing.md:7
#, fuzzy
msgid ""
"The fastest way to setup a test environment is to use the main branch's "
"container image:"
msgstr "搭建测试环境最快的方法是使用 main 分支的容器镜像:"
#: ../../source/developer_guide/contribution/testing.md
msgid "Local (CPU)"
msgstr "本地CPU"
#: ../../source/developer_guide/contribution/testing.md:18
#, fuzzy
msgid "You can run the unit tests on CPUs with the following steps:"
msgstr "你可以按照以下步骤在 CPU 上运行单元测试:"
#: ../../source/developer_guide/contribution/testing.md
msgid "Single card"
msgstr "单张卡片"
#: ../../source/developer_guide/contribution/testing.md:86
#: ../../source/developer_guide/contribution/testing.md:125
msgid "After starting the container, you should install the required packages:"
msgstr "启动容器后,你应该安装所需的软件包:"
#: ../../source/developer_guide/contribution/testing.md
msgid "Multi cards"
msgstr "多卡"
#: ../../source/developer_guide/contribution/testing.md:139
msgid "Running tests"
msgstr "运行测试"
#: ../../source/developer_guide/contribution/testing.md:141
#, fuzzy
msgid "Unit tests"
msgstr "单元测试"
#: ../../source/developer_guide/contribution/testing.md:143
msgid "There are several principles to follow when writing unit tests:"
msgstr "编写单元测试时需要遵循几个原则:"
#: ../../source/developer_guide/contribution/testing.md:145
#, fuzzy
msgid ""
"The test file path should be consistent with the source file and start "
"with the `test_` prefix, such as: `vllm_kunlun/worker/worker_v1.py` --> "
"`tests/ut/worker/test_worker_v1.py`"
msgstr ""
"测试文件的路径应与源文件保持一致,并以 `test_` 前缀开头,例如:`vllm_kunlun/worker/worker_v1.py` -->"
" `tests/ut/worker/test_worker_v1.py`"
#: ../../source/developer_guide/contribution/testing.md:146
#, fuzzy
msgid ""
"The vLLM Kunlun test uses unittest framework. See "
"[here](https://docs.python.org/3/library/unittest.html#module-unittest) "
"to understand how to write unit tests."
msgstr ""
"vLLM Kunlun 测试使用 unittest "
"框架,参见[这里](https://docs.python.org/3/library/unittest.html#module-"
"unittest)了解如何编写单元测试。"
#: ../../source/developer_guide/contribution/testing.md:147
#, fuzzy
msgid ""
"All unit tests can be run on CPUs, so you must mock the device-related "
"function to host."
msgstr "所有单元测试都可以在 CPU 上运行,因此你必须将与设备相关的函数模拟为 host。"
#: ../../source/developer_guide/contribution/testing.md:148
msgid ""
"Example: [tests/ut/test_kunlun_config.py](https://github.com/vllm-project"
"/vllm-kunlun/blob/main/tests/ut/test_kunlun_config.py)."
msgstr ""
"示例:[tests/ut/test_kunlun_config.py](https://github.com/vllm-project/vllm-"
"kunlun/blob/main/tests/ut/test_kunlun_config.py)。"
#: ../../source/developer_guide/contribution/testing.md:149
msgid "You can run the unit tests using `pytest`:"
msgstr "你可以使用 `pytest` 运行单元测试:"
#: ../../source/developer_guide/contribution/testing.md
#, fuzzy
msgid "Single-card"
msgstr "单张卡片"
#: ../../source/developer_guide/contribution/testing.md
#, fuzzy
msgid "Multi-card"
msgstr "多卡"
#: ../../source/developer_guide/contribution/testing.md:196
msgid "E2E test"
msgstr "端到端测试"
#: ../../source/developer_guide/contribution/testing.md:198
#, fuzzy
msgid ""
"Although vllm-kunlun CI provides the [E2E test](https://github.com/vllm-"
"project/vllm-kunlun/blob/main/.github/workflows/vllm_kunlun_test.yaml) on"
" Kunlun CI, you can run it locally."
msgstr ""
"虽然 vllm-kunlun CI 在 Kunlun CI 上提供了 [端到端测试](https://github.com/vllm-"
"project/vllm-"
"kunlun/blob/main/.github/workflows/vllm_kunlun_test.yaml),你也可以在本地运行它。"
#: ../../source/developer_guide/contribution/testing.md:208
#, fuzzy
msgid "You can't run the E2E test on CPUs."
msgstr "你无法在 CPU 上运行 e2e 测试。"
#: ../../source/developer_guide/contribution/testing.md:247
#, fuzzy
msgid ""
"This will reproduce the E2E test. See "
"[vllm_kunlun_test.yaml](https://github.com/vllm-project/vllm-"
"kunlun/blob/main/.github/workflows/vllm_kunlun_test.yaml)."
msgstr ""
"这将复现端到端测试:[vllm_kunlun_test.yaml](https://github.com/vllm-project/vllm-"
"kunlun/blob/main/.github/workflows/vllm_kunlun_test.yaml)。"
#: ../../source/developer_guide/contribution/testing.md:249
msgid "E2E test example:"
msgstr "E2E 测试示例:"
#: ../../source/developer_guide/contribution/testing.md:251
msgid ""
"Offline test example: "
"[`tests/e2e/singlecard/test_offline_inference.py`](https://github.com"
"/vllm-project/vllm-"
"kunlun/blob/main/tests/e2e/singlecard/test_offline_inference.py)"
msgstr ""
"离线测试示例:[`tests/e2e/singlecard/test_offline_inference.py`](https://github.com"
"/vllm-project/vllm-"
"kunlun/blob/main/tests/e2e/singlecard/test_offline_inference.py)"
#: ../../source/developer_guide/contribution/testing.md:252
msgid ""
"Online test examples: "
"[`tests/e2e/singlecard/test_prompt_embedding.py`](https://github.com"
"/vllm-project/vllm-"
"kunlun/blob/main/tests/e2e/singlecard/test_prompt_embedding.py)"
msgstr ""
"在线测试示例:[`tests/e2e/singlecard/test_prompt_embedding.py`](https://github.com"
"/vllm-project/vllm-"
"kunlun/blob/main/tests/e2e/singlecard/test_prompt_embedding.py)"
#: ../../source/developer_guide/contribution/testing.md:253
msgid ""
"Correctness test example: "
"[`tests/e2e/singlecard/test_aclgraph.py`](https://github.com/vllm-project"
"/vllm-kunlun/blob/main/tests/e2e/singlecard/test_aclgraph.py)"
msgstr ""
"正确性测试示例:[`tests/e2e/singlecard/test_aclgraph.py`](https://github.com"
"/vllm-project/vllm-"
"kunlun/blob/main/tests/e2e/singlecard/test_aclgraph.py)"
#: ../../source/developer_guide/contribution/testing.md:254
msgid ""
"Reduced Layer model test example: [test_torchair_graph_mode.py - "
"DeepSeek-V3-Pruning](https://github.com/vllm-project/vllm-"
"kunlun/blob/20767a043cccb3764214930d4695e53941de87ec/tests/e2e/multicard/test_torchair_graph_mode.py#L48)"
msgstr ""
"简化层模型测试示例:[test_torchair_graph_mode.py - "
"DeepSeek-V3-Pruning](https://github.com/vllm-project/vllm-"
"kunlun/blob/20767a043cccb3764214930d4695e53941de87ec/tests/e2e/multicard/test_torchair_graph_mode.py#L48)"
#: ../../source/developer_guide/contribution/testing.md:256
#, fuzzy
msgid ""
"The CI resource is limited, and you might need to reduce the number of "
"layers of a model. Below is an example of how to generate a reduced layer"
" model:"
msgstr "CI 资源有限,您可能需要减少模型的层数,下面是一个生成减少层数模型的示例:"
#: ../../source/developer_guide/contribution/testing.md:257
#, fuzzy
msgid ""
"Fork the original model repo in modelscope. All the files in the repo "
"except for weights are required."
msgstr "在 modelscope 中 fork 原始模型仓库,我们需要仓库中的所有文件,除了权重文件。"
#: ../../source/developer_guide/contribution/testing.md:258
#, python-brace-format
msgid ""
"Set `num_hidden_layers` to the expected number of layers, e.g., "
"`{\"num_hidden_layers\": 2,}`"
msgstr "将 `num_hidden_layers` 设置为期望的层数,例如 `{\"num_hidden_layers\": 2,}`"
#: ../../source/developer_guide/contribution/testing.md:259
msgid ""
"Copy the following python script as `generate_random_weight.py`. Set the "
"relevant parameters `MODEL_LOCAL_PATH`, `DIST_DTYPE` and "
"`DIST_MODEL_PATH` as needed:"
msgstr ""
"将以下 Python 脚本复制为 `generate_random_weight.py`。根据需要设置相关参数 "
"`MODEL_LOCAL_PATH`、`DIST_DTYPE` 和 `DIST_MODEL_PATH`"
#: ../../source/developer_guide/contribution/testing.md:277
msgid "Run doctest"
msgstr "运行 doctest"
#: ../../source/developer_guide/contribution/testing.md:279
#, fuzzy
msgid ""
"vllm-kunlun provides a `vllm-kunlun/tests/e2e/run_doctests.sh` command to"
" run all doctests in the doc files. The doctest is a good way to make "
"sure docs stay current and examples remain executable, which can be run "
"locally as follows:"
msgstr ""
"vllm-kunlun 提供了一个 `vllm-kunlun/tests/e2e/run_doctests.sh` 命令,用于运行文档文件中的所有"
" doctest。doctest 是确保文档保持最新且示例可执行的好方法,你可以按照以下方式在本地运行它:"
#: ../../source/developer_guide/contribution/testing.md:287
#, fuzzy
msgid ""
"This will reproduce the same environment as the CI. See "
"[vllm_kunlun_doctest.yaml](https://github.com/vllm-project/vllm-"
"kunlun/blob/main/.github/workflows/vllm_kunlun_doctest.yaml)."
msgstr ""
"这将复现与 CI 相同的环境:[vllm_kunlun_doctest.yaml](https://github.com/vllm-project"
"/vllm-kunlun/blob/main/.github/workflows/vllm_kunlun_doctest.yaml)。"
#~ msgid "Multi cards test"
#~ msgstr "多卡测试"


@@ -0,0 +1,26 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun \n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-10 16:59+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/developer_guide/evaluation/accuracy_report/DeepSeek-V2-Lite.md:1
msgid "deepseek-ai/DeepSeek-V2-Lite"
msgstr ""


@@ -0,0 +1,26 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun \n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-10 16:59+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/developer_guide/evaluation/accuracy_report/Qwen2.5-VL-7B-Instruct.md:1
msgid "Qwen/Qwen2.5-VL-7B-Instruct"
msgstr ""


@@ -0,0 +1,26 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun \n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-10 16:59+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/developer_guide/evaluation/accuracy_report/Qwen3-30B-A3B.md:1
msgid "Qwen/Qwen3-30B-A3B"
msgstr ""


@@ -0,0 +1,26 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun \n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-10 16:59+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/developer_guide/evaluation/accuracy_report/Qwen3-8B-Base.md:1
msgid "Qwen/Qwen3-8B-Base"
msgstr ""


@@ -0,0 +1,26 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-07-18 09:01+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Language: zh_CN\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"Generated-By: Babel 2.17.0\n"
#: ../../developer_guide/evaluation/accuracy_report/index.md:1
#: ../../developer_guide/evaluation/accuracy_report/index.md:3
msgid "Accuracy Report"
msgstr "准确性报告"


@@ -0,0 +1,26 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-07-18 09:01+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Language: zh_CN\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"Generated-By: Babel 2.17.0\n"
#: ../../developer_guide/evaluation/index.md:1
#: ../../developer_guide/evaluation/index.md:3
msgid "Accuracy"
msgstr "准确性"


@@ -0,0 +1,26 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun \n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-10 16:59+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/developer_guide/evaluation/using_ais_bench.md:1
msgid "Using AISBench"
msgstr ""


@@ -0,0 +1,100 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-10 16:59+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/developer_guide/evaluation/using_evalscope.md:1
msgid "Using EvalScope"
msgstr "使用 EvalScope"
#~ msgid ""
#~ "This document will guide you have "
#~ "model inference stress testing and "
#~ "accuracy testing using "
#~ "[EvalScope](https://github.com/modelscope/evalscope)."
#~ msgstr ""
#~ "本文档将指导您如何使用 [EvalScope](https://github.com/modelscope/evalscope)"
#~ " 进行模型推理压力测试和精度测试。"
#~ msgid "1. Online serving"
#~ msgstr "1. 在线服务"
#~ msgid "You can run docker container to start the vLLM server on a single XPU:"
#~ msgstr "你可以运行 docker 容器,在单个 XPU 上启动 vLLM 服务器:"
#~ msgid "If your service start successfully, you can see the info shown below:"
#~ msgstr "如果你的服务启动成功,你会看到如下所示的信息:"
#~ msgid ""
#~ "Once your server is started, you "
#~ "can query the model with input "
#~ "prompts in new terminal:"
#~ msgstr "一旦你的服务器启动后,你可以在新的终端中用输入提示词查询模型:"
#~ msgid "2. Install EvalScope using pip"
#~ msgstr "2. 使用 pip 安装 EvalScope"
#~ msgid "You can install EvalScope by using:"
#~ msgstr "你可以使用以下方式安装 EvalScope"
#~ msgid "3. Run gsm8k accuracy test using EvalScope"
#~ msgstr "3. 使用 EvalScope 运行 gsm8k 准确率测试"
#~ msgid "You can `evalscope eval` run gsm8k accuracy test:"
#~ msgstr "你可以使用 `evalscope eval` 运行 gsm8k 准确率测试:"
#~ msgid "After 1-2 mins, the output is as shown below:"
#~ msgstr "1-2 分钟后,输出如下所示:"
#~ msgid ""
#~ "See more detail in: [EvalScope doc "
#~ "- Model API Service "
#~ "Evaluation](https://evalscope.readthedocs.io/en/latest/get_started/basic_usage.html"
#~ "#model-api-service-evaluation)."
#~ msgstr ""
#~ "更多详情请见:[EvalScope 文档 - 模型 API "
#~ "服务评测](https://evalscope.readthedocs.io/en/latest/get_started/basic_usage.html"
#~ "#model-api-service-evaluation)。"
#~ msgid "4. Run model inference stress testing using EvalScope"
#~ msgstr "4. 使用 EvalScope 运行模型推理压力测试"
#~ msgid "Install EvalScope[perf] using pip"
#~ msgstr "使用 pip 安装 EvalScope[perf]"
#~ msgid "Basic usage"
#~ msgstr "基本用法"
#~ msgid "You can use `evalscope perf` run perf test:"
#~ msgstr "你可以使用 `evalscope perf` 运行性能测试:"
#~ msgid "Output results"
#~ msgstr "输出结果"
#~ msgid ""
#~ "See more detail in: [EvalScope doc "
#~ "- Model Inference Stress "
#~ "Testing](https://evalscope.readthedocs.io/en/latest/user_guides/stress_test/quick_start.html"
#~ "#basic-usage)."
#~ msgstr ""
#~ "更多详情见:[EvalScope 文档 - "
#~ "模型推理压力测试](https://evalscope.readthedocs.io/en/latest/user_guides/stress_test/quick_start.html"
#~ "#basic-usage)。"


@@ -0,0 +1,62 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-10 16:59+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/developer_guide/evaluation/using_lm_eval.md:1
msgid "Using lm-eval"
msgstr "使用 lm-eval"
#~ msgid ""
#~ "This document will guide you have "
#~ "a accuracy testing using [lm-"
#~ "eval](https://github.com/EleutherAI/lm-evaluation-"
#~ "harness)."
#~ msgstr ""
#~ "本文将指导你如何使用 [lm-eval](https://github.com/EleutherAI/lm-"
#~ "evaluation-harness) 进行准确率测试。"
#~ msgid "1. Run docker container"
#~ msgstr "1. 运行 docker 容器"
#~ msgid "You can run docker container on a single XPU:"
#~ msgstr "你可以在单个XPU上运行docker容器"
#~ msgid "2. Run ceval accuracy test using lm-eval"
#~ msgstr "2. 使用 lm-eval 运行 ceval 准确性测试"
#~ msgid "Install lm-eval in the container."
#~ msgstr "在容器中安装 lm-eval。"
#~ msgid "Run the following command:"
#~ msgstr "运行以下命令:"
#~ msgid "After 1-2 mins, the output is as shown below:"
#~ msgstr "1-2 分钟后,输出如下所示:"
#~ msgid ""
#~ "You can see more usage on [Lm-"
#~ "eval Docs](https://github.com/EleutherAI/lm-evaluation-"
#~ "harness/blob/main/docs/README.md)."
#~ msgstr ""
#~ "你可以在 [Lm-eval 文档](https://github.com/EleutherAI"
#~ "/lm-evaluation-harness/blob/main/docs/README.md) "
#~ "上查看更多用法。"


@@ -0,0 +1,77 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-10 16:59+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/developer_guide/evaluation/using_opencompass.md:1
msgid "Using OpenCompass"
msgstr "使用 OpenCompass"
#~ msgid ""
#~ "This document will guide you have "
#~ "a accuracy testing using "
#~ "[OpenCompass](https://github.com/open-compass/opencompass)."
#~ msgstr ""
#~ "本文档将指导你如何使用 [OpenCompass](https://github.com/open-"
#~ "compass/opencompass) 进行准确率测试。"
#~ msgid "1. Online Serving"
#~ msgstr "1. 在线服务"
#~ msgid "You can run docker container to start the vLLM server on a single XPU:"
#~ msgstr "你可以运行 docker 容器,在单个 XPU 上启动 vLLM 服务器:"
#~ msgid "If your service start successfully, you can see the info shown below:"
#~ msgstr "如果你的服务启动成功,你会看到如下所示的信息:"
#~ msgid ""
#~ "Once your server is started, you "
#~ "can query the model with input "
#~ "prompts in new terminal:"
#~ msgstr "一旦你的服务器启动后,你可以在新的终端中用输入提示词查询模型:"
#~ msgid "2. Run ceval accuracy test using OpenCompass"
#~ msgstr "2. 使用 OpenCompass 运行 ceval 准确率测试"
#~ msgid ""
#~ "Install OpenCompass and configure the "
#~ "environment variables in the container."
#~ msgstr "在容器中安装 OpenCompass 并配置环境变量。"
#~ msgid ""
#~ "Add `opencompass/configs/eval_vllm_kunlun_demo.py` with"
#~ " the following content:"
#~ msgstr "添加 `opencompass/configs/eval_vllm_kunlun_demo.py`,内容如下:"
#~ msgid "Run the following command:"
#~ msgstr "运行以下命令:"
#~ msgid "After 1-2 mins, the output is as shown below:"
#~ msgstr "1-2 分钟后,输出如下所示:"
#~ msgid ""
#~ "You can see more usage on "
#~ "[OpenCompass "
#~ "Docs](https://opencompass.readthedocs.io/en/latest/index.html)."
#~ msgstr ""
#~ "你可以在 [OpenCompass "
#~ "文档](https://opencompass.readthedocs.io/en/latest/index.html) "
#~ "查看更多用法。"


@@ -0,0 +1,26 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun \n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-10 16:59+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/developer_guide/feature_guide/ACL_Graph.md:1
msgid "Graph"
msgstr ""


@@ -0,0 +1,30 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun \n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-10 16:59+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/developer_guide/feature_guide/KV_Cache_Pool_Guide.md:1
msgid "KV Cache Pool"
msgstr ""
#: ../../source/developer_guide/feature_guide/KV_Cache_Pool_Guide.md:3
msgid "Why KV Cache Pool?"
msgstr "为什么需要 KV Cache Pool"

@@ -0,0 +1,26 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun \n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-10 16:59+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/developer_guide/feature_guide/ModelRunner_prepare_inputs.md:1
msgid "Prepare inputs for model forwarding"
msgstr "为模型前向准备输入"

@@ -0,0 +1,26 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun \n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-10 16:59+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/developer_guide/feature_guide/Multi_Token_Prediction.md:1
msgid "Multi Token Prediction (MTP)"
msgstr "多 Token 预测MTP"

@@ -0,0 +1,26 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun \n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-10 16:59+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/developer_guide/feature_guide/eplb_swift_balancer.md:1
msgid "Expert Parallelism Load Balancer (EPLB)"
msgstr "专家并行负载均衡器EPLB"

@@ -0,0 +1,33 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-07-18 09:01+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Language: zh_CN\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"Generated-By: Babel 2.17.0\n"
#: ../../developer_guide/feature_guide/index.md:1
#: ../../developer_guide/feature_guide/index.md:5
msgid "Feature Guide"
msgstr "功能指南"
#: ../../developer_guide/feature_guide/index.md:3
msgid ""
"This section provides an overview of the features implemented in vLLM "
"Kunlun. Developers can refer to this guide to understand how vLLM Kunlun "
"works."
msgstr "本节概述了 vLLM Kunlun 中实现的功能。开发者可以参考本指南以了解 vLLM Kunlun 的工作原理。"

@@ -0,0 +1,288 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-10 16:59+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/developer_guide/feature_guide/patch.md:1
#, fuzzy
msgid "Patch in vLLM"
msgstr "vLLM 中的补丁"
#~ msgid ""
#~ "vLLM Kunlun is a platform plugin "
#~ "for vLLM. Due to the release cycle"
#~ " of vLLM and vLLM Kunlun is "
#~ "different, and the hardware limitation "
#~ "in some case, we need to patch "
#~ "some code in vLLM to make it "
#~ "compatible with vLLM Kunlun."
#~ msgstr ""
#~ "vLLM Kunlun 是 vLLM 的一个平台插件。由于 vLLM "
#~ "和 vLLM Kunlun 的发布周期不同,并且在某些情况下存在硬件限制,我们需要对 "
#~ "vLLM 进行一些代码补丁,以使其能够兼容 vLLM Kunlun。"
#~ msgid ""
#~ "In vLLM Kunlun code, we provide a"
#~ " patch module `vllm_kunlun/patch` to "
#~ "address the change for vLLM."
#~ msgstr "在 vLLM Kunlun 代码中,我们提供了一个补丁模块 `vllm_kunlun/patch` 用于应对 vLLM 的变更。"
#~ msgid "Principle"
#~ msgstr "原理"
#~ msgid ""
#~ "We should keep in mind that Patch"
#~ " is not the best way to make"
#~ " vLLM Kunlun compatible. It's just a"
#~ " temporary solution. The best way is"
#~ " to contribute the change to vLLM "
#~ "to make it compatible with vLLM "
#~ "Kunlun originally. In vLLM Kunlun, we"
#~ " have the basic principle for Patch"
#~ " strategy:"
#~ msgstr ""
#~ "我们需要记住Patch 不是让 vLLM 兼容 Kunlun "
#~ "的最佳方式,这只是一个临时的解决方案。最好的方法是将修改贡献到 vLLM 项目中,从而让 vLLM"
#~ " 原生支持 Kunlun。对于 vLLM Kunlun我们对 Patch "
#~ "策略有一个基本原则:"
#~ msgid "Less is more. Please do not patch unless it's the only way currently."
#~ msgstr "少即是多。请不要打补丁,除非这是目前唯一的方法。"
#~ msgid ""
#~ "Once a patch is added, it's "
#~ "required to describe the future plan "
#~ "for removing the patch."
#~ msgstr "一旦补丁被添加,必须说明将来移除该补丁的计划。"
#~ msgid "Anytime, clean the patch code is welcome."
#~ msgstr "任何时候,欢迎清理补丁代码。"
#~ msgid "How it works"
#~ msgstr "工作原理"
#~ msgid "In `vllm_kunlun/patch`, you can see the code structure as follows:"
#~ msgstr "在 `vllm_kunlun/patch` 目录中,你可以看到如下代码结构:"
#~ msgid ""
#~ "**platform**: The patch code in this "
#~ "directory is for patching the code "
#~ "in vLLM main process. It's called "
#~ "by `vllm_kunlun/platform::XPUPlatform::pre_register_and_update`"
#~ " very early when vLLM is initialized."
#~ msgstr ""
#~ "**platform**:此目录下的补丁代码用于修补 vLLM 主进程中的代码。当 vLLM "
#~ "初始化时,会在很早的阶段由 "
#~ "`vllm_kunlun/platform::XPUPlatform::pre_register_and_update` 调用。"
#~ msgid ""
#~ "For online mode, vLLM process calls "
#~ "the platform patch here "
#~ "`vllm/vllm/engine/arg_utils.py::AsyncEngineArgs.add_cli_args` "
#~ "when parsing the cli args."
#~ msgstr ""
#~ "对于在线模式vLLM 进程在解析命令行参数时,会在 "
#~ "`vllm/vllm/engine/arg_utils.py::AsyncEngineArgs.add_cli_args` "
#~ "这里调用平台补丁。"
#~ msgid ""
#~ "For offline mode, vLLM process calls "
#~ "the platform patch here "
#~ "`vllm/vllm/engine/arg_utils.py::EngineArgs.create_engine_config` "
#~ "when parsing the input parameters."
#~ msgstr ""
#~ "对于离线模式vLLM 进程在解析输入参数时,会在此处调用平台补丁 "
#~ "`vllm/vllm/engine/arg_utils.py::EngineArgs.create_engine_config`。"
#~ msgid ""
#~ "**worker**: The patch code in this "
#~ "directory is for patching the code "
#~ "in vLLM worker process. It's called "
#~ "by `vllm_kunlun/worker/worker_v1::XPUWorker::__init__` "
#~ "when the vLLM worker process is "
#~ "initialized."
#~ msgstr ""
#~ "**worker**:此目录中的补丁代码用于修补 vLLM worker 进程中的代码。在初始化 "
#~ "vLLM worker 进程时,会被 "
#~ "`vllm_kunlun/worker/worker_v1::XPUWorker::__init__` 调用。"
#~ msgid ""
#~ "For both online and offline mode, "
#~ "vLLM engine core process calls the "
#~ "worker patch here "
#~ "`vllm/vllm/worker/worker_base.py::WorkerWrapperBase.init_worker` "
#~ "when initializing the worker process."
#~ msgstr ""
#~ "无论是在线还是离线模式vLLM 引擎核心进程在初始化 worker 进程时,都会在这里调用 "
#~ "worker "
#~ "补丁:`vllm/vllm/worker/worker_base.py::WorkerWrapperBase.init_worker`。"
#~ msgid ""
#~ "In both **platform** and **worker** "
#~ "folder, there are several patch modules."
#~ " They are used for patching different"
#~ " version of vLLM."
#~ msgstr "在 **platform** 和 **worker** 文件夹中都有一些补丁模块。它们用于修补不同版本的 vLLM。"
#~ msgid ""
#~ "`patch_0_9_2`: This module is used for"
#~ " patching vLLM 0.9.2. The version is"
#~ " always the nearest version of vLLM."
#~ " Once vLLM is released, we will "
#~ "drop this patch module and bump to"
#~ " a new version. For example, "
#~ "`patch_0_9_2` is used for patching vLLM"
#~ " 0.9.2."
#~ msgstr ""
#~ "`patch_0_9_2`:此模块用于修补 vLLM 0.9.2。该版本始终对应于 vLLM "
#~ "的最近版本。一旦 vLLM 发布新版本,我们将移除此补丁模块并升级到新版本。例如,`patch_0_9_2` "
#~ "就是用于修补 vLLM 0.9.2 的。"
#~ msgid ""
#~ "`patch_main`: This module is used for"
#~ " patching the code in vLLM main "
#~ "branch."
#~ msgstr "`patch_main`:该模块用于修补 vLLM 主分支代码。"
#~ msgid ""
#~ "`patch_common`: This module is used for"
#~ " patching both vLLM 0.9.2 and vLLM"
#~ " main branch."
#~ msgstr "`patch_common`:此模块用于同时修补 vLLM 0.9.2 版本和 vLLM 主分支。"
#~ msgid "How to write a patch"
#~ msgstr "如何撰写补丁"
#~ msgid ""
#~ "Before writing a patch, following the"
#~ " principle above, we should patch the"
#~ " least code. If it's necessary, we"
#~ " can patch the code in either "
#~ "**platform** and **worker** folder. Here "
#~ "is an example to patch `distributed` "
#~ "module in vLLM."
#~ msgstr ""
#~ "在编写补丁之前,遵循上述原则,我们应尽量少打补丁。如有必要,可以在 **platform** 或"
#~ " **worker** 文件夹中打补丁。下面是一个修补 vLLM 中 "
#~ "`distributed` 模块的示例。"
#~ msgid ""
#~ "Decide which version of vLLM we "
#~ "should patch. For example, after "
#~ "analysis, here we want to patch "
#~ "both 0.9.2 and main of vLLM."
#~ msgstr "决定我们应该修补哪个版本的 vLLM。例如经过分析后这里我们想要同时修补 vLLM 的 0.9.2 版和主分支main。"
#~ msgid ""
#~ "Decide which process we should patch."
#~ " For example, here `distributed` belongs"
#~ " to the vLLM main process, so "
#~ "we should patch `platform`."
#~ msgstr "决定我们应该修补哪个进程。例如,这里 `distributed` 属于 vLLM 主进程,所以我们应该修补 `platform`。"
#~ msgid ""
#~ "Create the patch file in the right"
#~ " folder. The file should be named "
#~ "as `patch_{module_name}.py`. The example here"
#~ " is "
#~ "`vllm_kunlun/patch/platform/patch_common/patch_distributed.py`."
#~ msgstr ""
#~ "在正确的文件夹中创建补丁文件。文件应命名为 `patch_{module_name}.py`。此处的示例是 "
#~ "`vllm_kunlun/patch/platform/patch_common/patch_distributed.py`。"
#~ msgid "Write your patch code in the new file. Here is an example:"
#~ msgstr "在新文件中编写你的补丁代码。以下是一个示例:"
#~ msgid ""
#~ "Import the patch file in `__init__.py`."
#~ " In this example, add `import "
#~ "vllm_kunlun.patch.platform.patch_common.patch_distributed` into"
#~ " `vllm_kunlun/patch/platform/patch_common/__init__.py`."
#~ msgstr ""
#~ "在 `__init__.py` 中导入补丁文件。在这个示例中,将 `import "
#~ "vllm_kunlun.patch.platform.patch_common.patch_distributed` 添加到"
#~ " `vllm_kunlun/patch/platform/patch_common/__init__.py` 中。"
#~ msgid ""
#~ "Add the description of the patch "
#~ "in `vllm_kunlun/patch/__init__.py`. The description"
#~ " format is as follows:"
#~ msgstr "在 `vllm_kunlun/patch/__init__.py` 中添加补丁的描述。描述格式如下:"
#~ msgid ""
#~ "Add the Unit Test and E2E Test."
#~ " Any newly added code in vLLM "
#~ "Kunlun should contain the Unit Test "
#~ "and E2E Test as well. You can "
#~ "find more details in [test "
#~ "guide](../contribution/testing.md)"
#~ msgstr ""
#~ "添加单元测试和端到端E2E测试。在 vLLM Kunlun "
#~ "中新增的任何代码也应包含单元测试和端到端测试。更多详情请参见 "
#~ "[测试指南](../contribution/testing.md)。"
#~ msgid "Limitation"
#~ msgstr "限制"
#~ msgid ""
#~ "In V1 Engine, vLLM starts three "
#~ "kinds of process: Main process, "
#~ "EngineCore process and Worker process. "
#~ "Now vLLM Kunlun only support patch "
#~ "the code in Main process and "
#~ "Worker process by default. If you "
#~ "want to patch the code runs in "
#~ "EngineCore process, you should patch "
#~ "EngineCore process entirely during setup, "
#~ "the entry code is here "
#~ "`vllm.v1.engine.core`. Please override "
#~ "`EngineCoreProc` and `DPEngineCoreProc` entirely."
#~ msgstr ""
#~ "在 V1 引擎中vLLM 会启动三种类型的进程主进程、EngineCore 进程和"
#~ " Worker 进程。目前 vLLM Kunlun 默认只支持对主进程和 Worker "
#~ "进程中的代码打补丁。如果你想对运行在 EngineCore 进程中的代码打补丁,需要在设置阶段对"
#~ " EngineCore 进程整体打补丁,入口代码在 `vllm.v1.engine.core`。请完全重写"
#~ " `EngineCoreProc` 和 `DPEngineCoreProc`。"
#~ msgid ""
#~ "If you are running an edited vLLM"
#~ " code, the version of the vLLM "
#~ "may be changed automatically. For "
#~ "example, if you runs an edited "
#~ "vLLM based on v0.9.n, the version "
#~ "of vLLM may be change to "
#~ "v0.9.nxxx, in this case, the patch "
#~ "for v0.9.n in vLLM Kunlun would "
#~ "not work as expect, because that "
#~ "vLLM Kunlun can't distinguish the "
#~ "version of vLLM you're using. In "
#~ "this case, you can set the "
#~ "environment variable `VLLM_VERSION` to specify"
#~ " the version of vLLM you're using,"
#~ " then the patch for v0.9.2 should "
#~ "work."
#~ msgstr ""
#~ "如果你运行的是经过编辑的 vLLM 代码vLLM 的版本可能会被自动更改。例如,如果你基于 "
#~ "v0.9.n 运行了编辑后的 vLLMvLLM 的版本可能会变为 "
#~ "v0.9.nxxx在这种情况下vLLM Kunlun 的 v0.9.n "
#~ "补丁将无法正常工作,因为 vLLM Kunlun 无法区分你所使用的 vLLM "
#~ "版本。这时,你可以设置环境变量 `VLLM_VERSION` 来指定你所使用的 vLLM "
#~ "版本,这样对 v0.9.2 的补丁就应该可以正常工作。"

@@ -0,0 +1,333 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-07-18 09:01+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Language: zh_CN\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"Generated-By: Babel 2.17.0\n"
#: ../../developer_guide/modeling/adding_a_new_model.md:1
msgid "Adding a New Model"
msgstr "添加新模型"
#: ../../developer_guide/modeling/adding_a_new_model.md:3
msgid ""
"This guide demonstrates how to integrate a novel or customized model into "
"vllm-kunlun. For foundational concepts, it is highly recommended to refer to"
" [vllm official doc: Adding a New "
"Model](https://docs.vllm.ai/en/stable/contributing/model/) first."
msgstr ""
"本指南演示如何将新颖或自定义的模型集成到 vllm-kunlun 中。对于基础概念,强烈建议先参考 [vllm "
"官方文档:添加新模型](https://docs.vllm.ai/en/stable/contributing/model/)。"
#: ../../developer_guide/modeling/adding_a_new_model.md:6
msgid "Step 1: Implementing Models with `torch` and `torch_npu`"
msgstr "第 1 步:使用 `torch` 和 `torch_npu` 实现模型"
#: ../../developer_guide/modeling/adding_a_new_model.md:8
msgid ""
"This section provides instructions for implementing new models compatible "
"with vllm and vllm-kunlun."
msgstr "本节提供了实现与 vllm 和 vllm-kunlun 兼容的新模型的相关说明。"
#: ../../developer_guide/modeling/adding_a_new_model.md:10
msgid "**Before starting:**"
msgstr "**开始之前:**"
#: ../../developer_guide/modeling/adding_a_new_model.md:12
msgid ""
"Verify whether your model already exists in vllm's "
"[models](https://github.com/vllm-"
"project/vllm/tree/main/vllm/model_executor/models) directory."
msgstr ""
"请确认你的模型是否已经存在于 vllm 的 [models](https://github.com/vllm-"
"project/vllm/tree/main/vllm/model_executor/models) 目录中。"
#: ../../developer_guide/modeling/adding_a_new_model.md:13
msgid ""
"Use existing models' implementation as templates to accelerate your "
"development."
msgstr "使用已有模型的实现作为模板,以加快你的开发。"
#: ../../developer_guide/modeling/adding_a_new_model.md:15
msgid "Method 1: Implementing New Models from Scratch"
msgstr "方法一:从零开始实现新模型"
#: ../../developer_guide/modeling/adding_a_new_model.md:17
msgid ""
"Follow vllm's [OPT model "
"adaptation](https://docs.vllm.ai/en/stable/contributing/model/basic.html) "
"example for guidance."
msgstr ""
"请参考 vllm 的 [OPT "
"模型适配](https://docs.vllm.ai/en/stable/contributing/model/basic.html) 示例进行操作。"
#: ../../developer_guide/modeling/adding_a_new_model.md:19
msgid "**Key implementation requirements:**"
msgstr "**关键实现要求:**"
#: ../../developer_guide/modeling/adding_a_new_model.md:21
msgid "Place model files in `vllm_kunlun/models/` directory."
msgstr "请将模型文件放在 `vllm_kunlun/models/` 目录下。"
#: ../../developer_guide/modeling/adding_a_new_model.md:23
msgid ""
"Standard module structure for decoder-only LLMs (please checkout vllm's "
"implementations for other kinds of model):"
msgstr "仅解码器decoder-onlyLLM 的标准模块结构(其他类型的模型请参考 vllm 的实现):"
#: ../../developer_guide/modeling/adding_a_new_model.md:25
msgid "`*ModelForCausalLM` (top-level wrapper)"
msgstr "`*ModelForCausalLM`(顶层包装器)"
#: ../../developer_guide/modeling/adding_a_new_model.md:26
msgid "`*Model` (main architecture)"
msgstr "`*Model`(主架构)"
#: ../../developer_guide/modeling/adding_a_new_model.md:27
msgid "`*DecoderLayer` (transformer block)"
msgstr "`*DecoderLayer` transformer 块)"
#: ../../developer_guide/modeling/adding_a_new_model.md:28
msgid "`*Attention` and `*MLP` (specific computation unit)"
msgstr "`*Attention` 和 `*MLP`(特定计算单元)"
#: ../../developer_guide/modeling/adding_a_new_model.md:31
msgid "`*` denotes your model's unique identifier."
msgstr "`*` 表示你的模型的唯一标识符。"
#: ../../developer_guide/modeling/adding_a_new_model.md:34
msgid "Critical Implementation Details:"
msgstr "关键实现细节:"
#: ../../developer_guide/modeling/adding_a_new_model.md:36
msgid "All modules must include a `prefix` argument in `__init__()`."
msgstr "所有模块在 `__init__()` 方法中都必须包含一个 `prefix` 参数。"
#: ../../developer_guide/modeling/adding_a_new_model.md:38
msgid "**Required interfaces:**"
msgstr "**必需的接口:**"
#: ../../developer_guide/modeling/adding_a_new_model.md:30
msgid "Module Type"
msgstr "模块类型"
#: ../../developer_guide/modeling/adding_a_new_model.md:30
msgid "Required Methods"
msgstr "必需的方法"
#: ../../developer_guide/modeling/adding_a_new_model.md:30
msgid "`*ModelForCausalLM`"
msgstr "`*ModelForCausalLM`"
#: ../../developer_guide/modeling/adding_a_new_model.md:30
msgid "`get_input_embeddings`, `compute_logits`, `load_weights`"
msgstr "`get_input_embeddings``compute_logits``load_weights`"
#: ../../developer_guide/modeling/adding_a_new_model.md:30
msgid "`*Model`"
msgstr "`*Model`"
#: ../../developer_guide/modeling/adding_a_new_model.md:30
msgid "`get_input_embeddings`, `load_weights`"
msgstr "`get_input_embeddings``load_weights`"
#: ../../developer_guide/modeling/adding_a_new_model.md:45
msgid "Attention Backend Integration:"
msgstr "注意力后端集成:"
#: ../../developer_guide/modeling/adding_a_new_model.md:47
msgid ""
"Importing attention via `from vllm.attention import Attention` can "
"automatically leverage the attention backend routing of vllm-kunlun (see: "
"`get_attn_backend_cls()` in `vllm_kunlun/platform.py`)."
msgstr ""
"通过 `from vllm.attention import Attention` 导入 attention 可以自动利用 vllm-kunlun "
"的注意力后端路由(详见:`vllm_kunlun/platform.py` 中的 `get_attn_backend_cls()`)。"
#: ../../developer_guide/modeling/adding_a_new_model.md:49
msgid "Tensor Parallelism:"
msgstr "张量并行:"
#: ../../developer_guide/modeling/adding_a_new_model.md:51
msgid ""
"Use vllm's parallel layers (`ColumnParallelLinear`, "
"`VocabParallelEmbedding`, etc.) to implement models supporting tensor "
"parallelism. Note that Kunlun-specific customizations are implemented in "
"`vllm_kunlun/ops/` directory (RMSNorm, VocabParallelEmbedding, etc.)."
msgstr ""
"使用 vllm 的并行层(如 `ColumnParallelLinear`、`VocabParallelEmbedding` "
"等来实现支持张量并行的模型。需要注意的是Kunlun 特有的自定义实现(如 RMSNorm、VocabParallelEmbedding 等)位于 "
"`vllm_kunlun/ops/` 目录下。"
#: ../../developer_guide/modeling/adding_a_new_model.md:53
msgid ""
"**Reference Implementation Template** (assumed path: "
"`vllm_kunlun/models/custom_model.py`):"
msgstr "**参考实现模板**(假定路径:`vllm_kunlun/models/custom_model.py`"
#: ../../developer_guide/modeling/adding_a_new_model.md:135
msgid "Method 2: Customizing Existing vLLM Models"
msgstr "方法二:自定义已有的 vLLM 模型"
#: ../../developer_guide/modeling/adding_a_new_model.md:137
msgid ""
"For most use cases, extending existing implementations is preferable. We "
"demonstrate an example to inherit from base classes and implement a custom "
"deepseek model below (assumed path: `vllm_kunlun/models/deepseek_v2.py`)."
msgstr ""
"对于大多数使用场景,建议扩展已有的实现。下面我们演示一个通过继承基类来实现自定义 deepseek "
"模型的示例(假定路径:`vllm_kunlun/models/deepseek_v2.py`)。"
#: ../../developer_guide/modeling/adding_a_new_model.md:175
msgid ""
"For a complete implementation reference, see: "
"`vllm_kunlun/models/deepseek_v2.py`."
msgstr "完整的实现参考请见:`vllm_kunlun/models/deepseek_v2.py`。"
#: ../../developer_guide/modeling/adding_a_new_model.md:178
msgid "Step 2: Registering Custom Models using ModelRegistry Plugins in vLLM"
msgstr "第 2 步:使用 vLLM 中的 ModelRegistry 插件注册自定义模型"
#: ../../developer_guide/modeling/adding_a_new_model.md:180
msgid ""
"vllm provides a plugin mechanism for registering externally implemented "
"models without modifying its codebase."
msgstr "vllm 提供了一种插件机制,可用于注册外部实现的模型,而无需修改其代码库。"
#: ../../developer_guide/modeling/adding_a_new_model.md:182
msgid ""
"To integrate your implemented model from `vllm_kunlun/models/` directory:"
msgstr "要集成你在 `vllm_kunlun/models/` 目录下实现的模型:"
#: ../../developer_guide/modeling/adding_a_new_model.md:184
msgid ""
"Import your model implementation in `vllm_kunlun/models/__init__.py` using "
"relative imports."
msgstr "使用相对导入在 `vllm_kunlun/models/__init__.py` 中导入你的模型实现。"
#: ../../developer_guide/modeling/adding_a_new_model.md:185
msgid ""
"Register the model wrapper class via `vllm.ModelRegistry.register_model()` "
"function."
msgstr "通过 `vllm.ModelRegistry.register_model()` 函数注册模型包装类。"
#: ../../developer_guide/modeling/adding_a_new_model.md:187
msgid ""
"**Reference Registration Template** (an example of registering new models in"
" `vllm_kunlun/models/__init__.py`):"
msgstr "**参考注册模板**(在 `vllm_kunlun/models/__init__.py` 中注册新模型的示例):"
#: ../../developer_guide/modeling/adding_a_new_model.md:210
msgid ""
"The first argument of `vllm.ModelRegistry.register_model()` indicates the "
"unique architecture identifier which must match `architectures` in "
"`config.json` of the model."
msgstr ""
"`vllm.ModelRegistry.register_model()` 的第一个参数表示唯一的架构标识符,这个标识符必须与模型的 "
"`config.json` 文件中的 `architectures` 匹配。"
#: ../../developer_guide/modeling/adding_a_new_model.md:221
msgid "Step 3: Verification"
msgstr "第 3 步:验证"
#: ../../developer_guide/modeling/adding_a_new_model.md:223
msgid "Case 1: Overriding Existing vLLM Model Architecture"
msgstr "案例 1覆盖已有的 vLLM 模型架构"
#: ../../developer_guide/modeling/adding_a_new_model.md:225
msgid ""
"If you're registering a customized model architecture based on vllm's "
"existing implementation (overriding vllm's original class), when executing "
"vllm offline/online inference (using any model), you'll observe warning logs"
" similar to the following output from "
"`vllm/models_executor/models/registry.py`."
msgstr ""
"如果你基于 vllm 的现有实现注册了一个自定义的模型架构(覆盖了 vllm 的原始类),在执行 vllm "
"的离线/在线推理(无论使用哪个模型)时,你会看到来自 "
"`vllm/models_executor/models/registry.py` 的类似如下输出的警告日志。"
#: ../../developer_guide/modeling/adding_a_new_model.md:231
msgid "Case 2: Registering New Model Architecture"
msgstr "案例 2注册新模型架构"
#: ../../developer_guide/modeling/adding_a_new_model.md:233
msgid ""
"If you're registering a novel model architecture not present in vllm "
"(creating a completely new class), current logs won't provide explicit "
"confirmation by default. It's recommended to add the following logging "
"statement at the end of the `register_model` method in "
"`vllm/models_executor/models/registry.py`."
msgstr ""
"如果你注册了 vllm 中不存在的新模型架构(创建一个全新的类),当前日志默认不会提供明确的确认信息。建议在 "
"`vllm/models_executor/models/registry.py` 文件中的 `register_model` "
"方法末尾添加如下日志语句。"
#: ../../developer_guide/modeling/adding_a_new_model.md:239
msgid ""
"After adding this line, you will see confirmation logs shown below when "
"running vllm offline/online inference (using any model)."
msgstr "添加这一行之后,当你运行 vllm 离线/在线推理(使用任何模型)时,将会看到如下确认日志。"
#: ../../developer_guide/modeling/adding_a_new_model.md:245
msgid ""
"This log output confirms your novel model architecture has been successfully"
" registered in vllm."
msgstr "该日志输出确认了你的新模型架构已成功在 vllm 中注册。"
#: ../../developer_guide/modeling/adding_a_new_model.md:247
msgid "Step 4: Testing"
msgstr "第 4 步:测试"
#: ../../developer_guide/modeling/adding_a_new_model.md:249
msgid ""
"After adding a new model, we should do basic functional test (offline/online"
" inference), accuracy test and performance benchmark for the model."
msgstr "在添加新模型后,我们应对该模型进行基本功能测试(离线/在线推理)、准确率测试和性能基准测试。"
#: ../../developer_guide/modeling/adding_a_new_model.md:251
msgid "Find more details at:"
msgstr "更多详情请见:"
#: ../../developer_guide/modeling/adding_a_new_model.md:253
msgid ""
"[Accuracy test guide](https://vllm-"
"kunlun.readthedocs.io/en/latest/developer_guide/evaluation/index.html)"
msgstr ""
"[精度测试指南](https://vllm-"
"kunlun.readthedocs.io/en/latest/developer_guide/evaluation/index.html)"
#: ../../developer_guide/modeling/adding_a_new_model.md:254
msgid ""
"[Performance benchmark guide](https://vllm-"
"kunlun.readthedocs.io/en/latest/developer_guide/performance/performance_benchmark.html)"
msgstr ""
"[性能基准指南](https://vllm-"
"kunlun.readthedocs.io/en/latest/developer_guide/performance/performance_benchmark.html)"
#: ../../developer_guide/modeling/adding_a_new_model.md:256
msgid "Step 5: Updating Supported Models Doc"
msgstr "第 5 步:更新支持的模型文档"
#: ../../developer_guide/modeling/adding_a_new_model.md:258
msgid ""
"At last, if all the steps above are completed, you should add the new model "
"into our [Supported Models](https://vllm-"
"kunlun.readthedocs.io/en/latest/user_guide/supported_models.html) doc."
msgstr ""
"最后,如果以上所有步骤都已完成,你应该将新模型添加到我们的[支持的模型](https://vllm-"
"kunlun.readthedocs.io/en/latest/user_guide/supported_models.html)文档中。"

@@ -0,0 +1,29 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-07-18 09:01+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Language: zh_CN\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"Generated-By: Babel 2.17.0\n"
#: ../../developer_guide/modeling/adding_a_new_multimodal_model.md:1
msgid "Adding a New Multi-Modal Model"
msgstr "添加新的多模态模型"
#: ../../developer_guide/modeling/adding_a_new_multimodal_model.md:3
msgid "**_Comming soon ..._**"
msgstr "**_敬请期待 ..._**"

@@ -0,0 +1,32 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-07-18 09:01+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Language: zh_CN\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"Generated-By: Babel 2.17.0\n"
#: ../../developer_guide/modeling/index.md:1
#: ../../developer_guide/modeling/index.md:5
msgid "Modeling"
msgstr "新模型"
#: ../../developer_guide/modeling/index.md:3
msgid ""
"This section provides tutorials of how to implement and register a new model"
" into vllm-kunlun."
msgstr "本节提供了如何在 vllm-kunlun 中实现并注册新模型的教程。"

@@ -0,0 +1,26 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-07-18 09:01+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Language: zh_CN\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"Generated-By: Babel 2.17.0\n"
#: ../../developer_guide/performance/index.md:1
#: ../../developer_guide/performance/index.md:3
msgid "Performance"
msgstr "性能"

@@ -0,0 +1,26 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun \n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-10 16:59+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/developer_guide/performance/optimization_and_tuning.md:1
msgid "Optimization and Tuning"
msgstr "优化与调优"

@@ -0,0 +1,92 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-10 16:59+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/developer_guide/performance/performance_benchmark.md:1
msgid "Performance Benchmark"
msgstr "性能基准"
#~ msgid ""
#~ "This document details the benchmark "
#~ "methodology for vllm-kunlun, aimed at"
#~ " evaluating the performance under a "
#~ "variety of workloads. To maintain "
#~ "alignment with vLLM, we use the "
#~ "[benchmark](https://github.com/vllm-"
#~ "project/vllm/tree/main/benchmarks) script provided "
#~ "by the vllm project."
#~ msgstr ""
#~ "本文档详细说明了 vllm-kunlun 的基准测试方法,旨在评估其在多种工作负载下的性能。为了与"
#~ " vLLM 保持一致,我们使用 vllm 项目提供的 "
#~ "[benchmark](https://github.com/vllm-"
#~ "project/vllm/tree/main/benchmarks) 脚本。"
#~ msgid ""
#~ "**Benchmark Coverage**: We measure offline "
#~ "e2e latency and throughput, and "
#~ "fixed-QPS online serving benchmarks, for"
#~ " more details see [vllm-kunlun "
#~ "benchmark scripts](https://github.com/vllm-project"
#~ "/vllm-kunlun/tree/main/benchmarks)."
#~ msgstr ""
#~ "**基准测试覆盖范围**:我们测量离线端到端延迟和吞吐量,以及固定 QPS 的在线服务基准测试。更多详情请参见"
#~ " [vllm-kunlun 基准测试脚本](https://github.com/vllm-"
#~ "project/vllm-kunlun/tree/main/benchmarks)。"
#~ msgid "1. Run docker container"
#~ msgstr "1. 运行 docker 容器"
#~ msgid "2. Install dependencies"
#~ msgstr "2. 安装依赖项"
#~ msgid "3. (Optional)Prepare model weights"
#~ msgstr "3.(可选)准备模型权重"
#~ msgid ""
#~ "For faster running speed, we recommend"
#~ " downloading the model in advance"
#~ msgstr "为了更快的运行速度,建议提前下载模型:"
#~ msgid ""
#~ "You can also replace all model "
#~ "paths in the [json](https://github.com/vllm-"
#~ "project/vllm-kunlun/tree/main/benchmarks/tests) files "
#~ "with your local paths:"
#~ msgstr ""
#~ "你也可以将 [json](https://github.com/vllm-project/vllm-"
#~ "kunlun/tree/main/benchmarks/tests) 文件中的所有模型路径替换为你的本地路径:"
#~ msgid "4. Run benchmark script"
#~ msgstr "4. 运行基准测试脚本"
#~ msgid "Run benchmark script:"
#~ msgstr "运行基准测试脚本:"
#~ msgid "After about 10 mins, the output is as shown below:"
#~ msgstr "大约 10 分钟后,输出如下所示:"
#~ msgid ""
#~ "The result json files are generated "
#~ "into the path `benchmark/results` These "
#~ "files contain detailed benchmarking results"
#~ " for further analysis."
#~ msgstr "结果 json 文件会生成到路径 `benchmark/results`。这些文件包含了用于进一步分析的详细基准测试结果。"


@@ -0,0 +1,86 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-10 16:59+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/developer_guide/performance/profile_execute_duration.md:1
msgid "Profile Execute Duration"
msgstr "执行时长分析"
#~ msgid ""
#~ "The execution duration of each stage "
#~ "(including pre/post-processing, model forward,"
#~ " etc.) usually needs to be captured"
#~ " during a complete inference process. "
#~ "Typically, this is done by using "
#~ "`torch.xpu.synchronize()` and obtaining CPU "
#~ "timestamps, which increases the performance"
#~ " overhead of host/device synchronization."
#~ msgstr ""
#~ "在完整的推理过程中,通常需要记录每个阶段(包括前/后处理、模型前向等)的执行时长。一般通过使用 "
#~ "`torch.xpu.synchronize()` 并获取 CPU "
#~ "时间戳来实现,这会增加主机/设备同步的性能开销。"
#~ msgid ""
#~ "**To reduce the performance overhead, we"
#~ " add this feature, using the XPU "
#~ "event timestamp mechanism to observe the"
#~ " device execution time asynchronously.**"
#~ msgstr "**为了减少性能开销,我们添加了此功能,使用 XPU 事件时间戳机制异步观测设备的执行时间。**"
#~ msgid "Usage"
#~ msgstr "用法"
#~ msgid ""
#~ "Use the environment variable "
#~ "`VLLM_KUNLUN_MODEL_EXECUTE_TIME_OBSERVE` to enable "
#~ "this feature."
#~ msgstr "使用环境变量 `VLLM_KUNLUN_MODEL_EXECUTE_TIME_OBSERVE` 来启用此功能。"
#~ msgid ""
#~ "Use the non-blocking API "
#~ "`ProfileExecuteDuration().capture_async` to set "
#~ "observation points asynchronously when you "
#~ "need to observe the execution duration."
#~ msgstr ""
#~ "当你需要观察执行时长时,可以使用非阻塞 API "
#~ "`ProfileExecuteDuration().capture_async` 异步设置观察点。"
#~ msgid ""
#~ "Use the blocking API "
#~ "`ProfileExecuteDuration().pop_captured_sync` at an "
#~ "appropriate time to get and print "
#~ "the execution durations of all observed"
#~ " stages."
#~ msgstr ""
#~ "在适当的时机使用阻塞式 API "
#~ "`ProfileExecuteDuration().pop_captured_sync` 获取并打印所有已观察到阶段的执行时长。"
#~ msgid ""
#~ "**We have instrumented the key inference"
#~ " stages (including pre-processing, model"
#~ " forward pass, etc.) for execute "
#~ "duration profiling. Execute the script "
#~ "as follows:**"
#~ msgstr "**我们已经对关键的推理阶段(包括预处理、模型前向传递等)进行了执行时长分析的检测。请按如下方式执行脚本:**"
#~ msgid "Example Output"
#~ msgstr "示例输出"


@@ -0,0 +1,507 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-10 16:59+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/faqs.md:1
msgid "FAQs"
msgstr "常见问题"
#: ../../source/faqs.md:3
msgid "Version Specific FAQs"
msgstr "特定版本常见问题"
#~ msgid ""
#~ "[[v0.7.3.post1] FAQ & Feedback](https://github.com"
#~ "/vllm-project/vllm-kunlun/issues/1007)"
#~ msgstr ""
#~ "[[v0.7.3.post1] 常见问题与反馈](https://github.com/vllm-project"
#~ "/vllm-kunlun/issues/1007)"
#~ msgid ""
#~ "[[v0.9.2rc1] FAQ & Feedback](https://github.com"
#~ "/vllm-project/vllm-kunlun/issues/1742)"
#~ msgstr ""
#~ "[[v0.9.2rc1] 常见问题与反馈](https://github.com/vllm-project"
#~ "/vllm-kunlun/issues/1742)"
#~ msgid "General FAQs"
#~ msgstr "常见问题解答"
#~ msgid "1. What devices are currently supported?"
#~ msgstr "1. 目前支持哪些设备?"
#~ msgid ""
#~ "Currently, **ONLY** Atlas A2 series(Kunlun-"
#~ "cann-kernels-910b) and Atlas 300I"
#~ "(Kunlun-cann-kernels-310p) series are "
#~ "supported:"
#~ msgstr ""
#~ "目前,**仅**支持 Atlas A2 系列Kunlun-cann-"
#~ "kernels-910b和 Atlas 300IKunlun-cann-"
#~ "kernels-310p系列"
#~ msgid ""
#~ "Atlas A2 Training series (Atlas 800T "
#~ "A2, Atlas 900 A2 PoD, Atlas 200T"
#~ " A2 Box16, Atlas 300T A2)"
#~ msgstr ""
#~ "Atlas A2 训练系列Atlas 800T A2Atlas 900"
#~ " A2 PoDAtlas 200T A2 Box16Atlas "
#~ "300T A2"
#~ msgid "Atlas 800I A2 Inference series (Atlas 800I A2)"
#~ msgstr "Atlas 800I A2 推理系列Atlas 800I A2"
#~ msgid "Atlas 300I Inference series (Atlas 300I Duo)"
#~ msgstr "Atlas 300I 推理系列Atlas 300I Duo"
#~ msgid "Below series are NOT supported yet:"
#~ msgstr "以下系列目前尚不受支持:"
#~ msgid "Atlas 200I A2 (Kunlun-cann-kernels-310b) unplanned yet"
#~ msgstr "Atlas 200I A2Kunlun-cann-kernels-310b尚未计划"
#~ msgid "Kunlun 910, Kunlun 910 Pro B (Kunlun-cann-kernels-910) unplanned yet"
#~ msgstr "Kunlun 910Kunlun 910 Pro BKunlun-cann-kernels-910尚未计划"
#~ msgid ""
#~ "From a technical view, vllm-kunlun "
#~ "support would be possible if the "
#~ "torch-xpu is supported. Otherwise, we "
#~ "have to implement it by using "
#~ "custom ops. We are also welcome to"
#~ " join us to improve together."
#~ msgstr ""
#~ "从技术角度来看,如果支持 torch-xpu则可以支持 vllm-"
#~ "kunlun。否则我们需要通过自定义算子来实现。我们也欢迎大家一起加入共同改进。"
#~ msgid "2. How to get our docker containers?"
#~ msgstr "2. 如何获取我们的 docker 容器?"
#~ msgid ""
#~ "You can get our containers at "
#~ "`Quay.io`, e.g., [<u>vllm-"
#~ "kunlun</u>](https://quay.io/repository/kunlun/vllm-"
#~ "kunlun?tab=tags) and "
#~ "[<u>cann</u>](https://quay.io/repository/kunlun/cann?tab=tags)."
#~ msgstr ""
#~ "你可以在 `Quay.io` 获取我们的容器,例如,[<u>vllm-"
#~ "kunlun</u>](https://quay.io/repository/kunlun/vllm-"
#~ "kunlun?tab=tags) 和 "
#~ "[<u>cann</u>](https://quay.io/repository/kunlun/cann?tab=tags)。"
#~ msgid ""
#~ "If you are in China, you can "
#~ "use `daocloud` to accelerate your "
#~ "downloading:"
#~ msgstr "如果你在中国,可以使用 `daocloud` 来加速下载:"
#~ msgid "3. What models does vllm-kunlun supports?"
#~ msgstr "3. vllm-kunlun 支持哪些模型?"
#~ msgid ""
#~ "Find more details [<u>here</u>](https://vllm-"
#~ "kunlun.readthedocs.io/en/latest/user_guide/support_matrix/supported_models.html)."
#~ msgstr ""
#~ "在[<u>此处</u>](https://vllm-"
#~ "kunlun.readthedocs.io/en/latest/user_guide/support_matrix/supported_models.html)查看更多详细信息。"
#~ msgid "4. How to get in touch with our community?"
#~ msgstr "4. 如何与我们的社区取得联系?"
#~ msgid ""
#~ "There are many channels that you "
#~ "can communicate with our community "
#~ "developers / users:"
#~ msgstr "你可以通过多种渠道与我们的社区开发者/用户进行交流:"
#~ msgid ""
#~ "Submit a GitHub [<u>issue</u>](https://github.com"
#~ "/vllm-project/vllm-kunlun/issues?page=1)."
#~ msgstr ""
#~ "提交一个 GitHub [<u>issue</u>](https://github.com/vllm-"
#~ "project/vllm-kunlun/issues?page=1)。"
#~ msgid ""
#~ "Join our [<u>weekly "
#~ "meeting</u>](https://docs.google.com/document/d/1hCSzRTMZhIB8vRq1_qOOjx4c9uYUxvdQvDsMV2JcSrw/edit?tab=t.0#heading=h.911qu8j8h35z)"
#~ " and share your ideas."
#~ msgstr "加入我们的[<u>每周会议</u>](https://docs.google.com/document/d/1hCSzRTMZhIB8vRq1_qOOjx4c9uYUxvdQvDsMV2JcSrw/edit?tab=t.0#heading=h.911qu8j8h35z),并分享你的想法。"
#~ msgid ""
#~ "Join our [<u>WeChat</u>](https://github.com/vllm-"
#~ "project/vllm-kunlun/issues/227) group and ask"
#~ " your quenstions."
#~ msgstr ""
#~ "加入我们的 [<u>微信群</u>](https://github.com/vllm-project"
#~ "/vllm-kunlun/issues/227) 并提问你的问题。"
#~ msgid ""
#~ "Join our kunlun channel in [<u>vLLM "
#~ "forums</u>](https://discuss.vllm.ai/c/hardware-support/vllm-"
#~ "kunlun-support/6) and publish your "
#~ "topics."
#~ msgstr ""
#~ "加入我们在 [<u>vLLM 论坛</u>](https://discuss.vllm.ai/c"
#~ "/hardware-support/vllm-kunlun-support/6) 的 "
#~ "kunlun 频道并发布你的话题。"
#~ msgid "5. What features does vllm-kunlun V1 supports?"
#~ msgstr "5. vllm-kunlun V1 支持哪些功能?"
#~ msgid ""
#~ "Find more details [<u>here</u>](https://vllm-"
#~ "kunlun.readthedocs.io/en/latest/user_guide/support_matrix/supported_features.html)."
#~ msgstr ""
#~ "在[<u>这里</u>](https://vllm-"
#~ "kunlun.readthedocs.io/en/latest/user_guide/support_matrix/supported_features.html)找到更多详细信息。"
#~ msgid ""
#~ "6. How to solve the problem of "
#~ "\"Failed to infer device type\" or "
#~ "\"libatb.so: cannot open shared object "
#~ "file\"?"
#~ msgstr "6. 如何解决 “Failed to infer device type” 或 “libatb.so: cannot open shared object file” 问题?"
#~ msgid ""
#~ "Basically, the reason is that the "
#~ "XPU environment is not configured "
#~ "correctly. You can:"
#~ msgstr "基本上,原因是 XPU 环境没有正确配置。你可以:"
#~ msgid ""
#~ "try `source /usr/local/Kunlun/nnal/atb/set_env.sh` "
#~ "to enable NNAL package."
#~ msgstr "尝试运行 `source /usr/local/Kunlun/nnal/atb/set_env.sh` 以启用 NNAL 包。"
#~ msgid ""
#~ "try `source /usr/local/Kunlun/kunlun-"
#~ "toolkit/set_env.sh` to enable CANN package."
#~ msgstr "尝试运行 `source /usr/local/Kunlun/kunlun-toolkit/set_env.sh` 以启用 CANN 包。"
#~ msgid "try `xpu-smi info` to check whether the XPU is working."
#~ msgstr "尝试运行 `xpu-smi info` 来检查 XPU 是否正常工作。"
#~ msgid ""
#~ "If all above steps are not "
#~ "working, you can try the following "
#~ "code with python to check whether "
#~ "there is any error:"
#~ msgstr "如果以上所有步骤都无效,你可以尝试使用以下 python 代码来检查是否有错误:"
#~ msgid "If all above steps are not working, feel free to submit a GitHub issue."
#~ msgstr "如果以上所有步骤都无法解决问题,欢迎提交一个 GitHub issue。"
#~ msgid "7. How does vllm-kunlun perform?"
#~ msgstr "7. vllm-kunlun 的性能如何?"
#~ msgid ""
#~ "Currently, only some models are "
#~ "improved. Such as `Qwen2.5 VL`, `Qwen3`,"
#~ " `Deepseek V3`. Others are not good"
#~ " enough. From 0.9.0rc2, Qwen and "
#~ "Deepseek works with graph mode to "
#~ "play a good performance. What's more,"
#~ " you can install `mindie-turbo` with"
#~ " `vllm-kunlun v0.7.3` to speed up "
#~ "the inference as well."
#~ msgstr ""
#~ "目前,只有部分模型得到了改进,比如 `Qwen2.5 VL`、`Qwen3` 和 "
#~ "`Deepseek V3`。其他模型的效果还不够理想。从 0.9.0rc2 开始Qwen "
#~ "和 Deepseek 已经支持图模式,以获得更好的性能。此外,你还可以在 `vllm-"
#~ "kunlun v0.7.3` 上安装 `mindie-turbo`,进一步加速推理。"
#~ msgid "8. How vllm-kunlun work with vllm?"
#~ msgstr "8. vllm-kunlun 如何与 vllm 协同工作?"
#~ msgid ""
#~ "vllm-kunlun is a plugin for vllm."
#~ " Basically, the version of vllm-"
#~ "kunlun is the same as the version"
#~ " of vllm. For example, if you "
#~ "use vllm 0.7.3, you should use "
#~ "vllm-kunlun 0.7.3 as well. For main"
#~ " branch, we will make sure `vllm-"
#~ "kunlun` and `vllm` are compatible by "
#~ "each commit."
#~ msgstr ""
#~ "vllm-kunlun 是 vllm 的一个插件。基本上vllm-kunlun"
#~ " 的版本与 vllm 的版本是相同的。例如,如果你使用 vllm "
#~ "0.7.3,你也应该使用 vllm-kunlun 0.7.3。对于主分支,我们会确保每次提交都让 "
#~ "`vllm-kunlun` 和 `vllm` 保持兼容。"
#~ msgid "9. Does vllm-kunlun support Prefill Disaggregation feature?"
#~ msgstr "9. vllm-kunlun 支持 Prefill Disaggregation 功能吗?"
#~ msgid ""
#~ "Currently, only 1P1D is supported on "
#~ "V0 Engine. For V1 Engine or NPND"
#~ " support, We will make it stable "
#~ "and supported by vllm-kunlun in "
#~ "the future."
#~ msgstr "目前V0 引擎仅支持 1P1D。对于 V1 引擎或 NPND 的支持,我们将在未来使其稳定并由 vllm-kunlun 支持。"
#~ msgid "10. Does vllm-kunlun support quantization method?"
#~ msgstr "10. vllm-kunlun 支持量化方法吗?"
#~ msgid ""
#~ "Currently, w8a8 quantization is already "
#~ "supported by vllm-kunlun originally on"
#~ " v0.8.4rc2 or higher, If you're using"
#~ " vllm 0.7.3 version, w8a8 quantization "
#~ "is supporeted with the integration of"
#~ " vllm-kunlun and mindie-turbo, please"
#~ " use `pip install vllm-kunlun[mindie-"
#~ "turbo]`."
#~ msgstr ""
#~ "目前w8a8 量化已在 v0.8.4rc2 或更高版本的 vllm-"
#~ "kunlun 中原生支持。如果你使用的是 vllm 0.7.3 版本,集成了 "
#~ "vllm-kunlun 和 mindie-turbo 后也支持 w8a8"
#~ " 量化,请使用 `pip install vllm-kunlun[mindie-"
#~ "turbo]`。"
#~ msgid "11. How to run w8a8 DeepSeek model?"
#~ msgstr "11. 如何运行 w8a8 DeepSeek 模型?"
#~ msgid ""
#~ "Please following the [inferencing "
#~ "tutorail](https://vllm-"
#~ "kunlun.readthedocs.io/en/latest/tutorials/multi_node.html) and"
#~ " replace model to DeepSeek."
#~ msgstr ""
#~ "请按照[推理教程](https://vllm-"
#~ "kunlun.readthedocs.io/en/latest/tutorials/multi_node.html)进行操作,并将模型更换为"
#~ " DeepSeek。"
#~ msgid ""
#~ "12. There is no output in log "
#~ "when loading models using vllm-kunlun,"
#~ " How to solve it?"
#~ msgstr "12. 使用 vllm-kunlun 加载模型时日志没有输出,如何解决?"
#~ msgid ""
#~ "If you're using vllm 0.7.3 version, "
#~ "this is a known progress bar "
#~ "display issue in VLLM, which has "
#~ "been resolved in [this PR](https://github.com"
#~ "/vllm-project/vllm/pull/12428), please cherry-"
#~ "pick it locally by yourself. Otherwise,"
#~ " please fill up an issue."
#~ msgstr ""
#~ "如果你正在使用 vllm 0.7.3 版本,这是 VLLM "
#~ "已知的进度条显示问题,已在 [此 PR](https://github.com/vllm-"
#~ "project/vllm/pull/12428) 中解决,请自行在本地进行 cherry-"
#~ "pick。否则请提交一个 issue。"
#~ msgid "13. How vllm-kunlun is tested"
#~ msgstr "13. 如何测试 vllm-kunlun"
#~ msgid ""
#~ "vllm-kunlun is tested by functional "
#~ "test, performance test and accuracy "
#~ "test."
#~ msgstr "vllm-kunlun 经过功能测试、性能测试和精度测试。"
#~ msgid ""
#~ "**Functional test**: we added CI, "
#~ "includes portion of vllm's native unit"
#~ " tests and vllm-kunlun's own unit "
#~ "testson vllm-kunlun's test, we test "
#~ "basic functionality、popular models availability "
#~ "and [supported features](https://vllm-"
#~ "kunlun.readthedocs.io/en/latest/user_guide/support_matrix/supported_features.html)"
#~ " via e2e test"
#~ msgstr ""
#~ "**功能测试**:我们添加了 CI包含 vllm 原生单元测试的一部分以及 vllm-kunlun "
#~ "自己的单元测试。在 vllm-kunlun 的测试中,我们通过 e2e 测试验证了基本功能、主流模型可用性和[支持的特性](https://vllm-"
#~ "kunlun.readthedocs.io/en/latest/user_guide/support_matrix/supported_features.html)。"
#~ msgid ""
#~ "**Performance test**: we provide "
#~ "[benchmark](https://github.com/vllm-project/vllm-"
#~ "kunlun/tree/main/benchmarks) tools for end-"
#~ "to-end performance benchmark which can "
#~ "easily to re-route locally, we'll "
#~ "publish a perf website to show the"
#~ " performance test results for each "
#~ "pull request"
#~ msgstr ""
#~ "**性能测试**:我们提供了用于端到端性能基准测试的[基准测试](https://github.com/vllm-project"
#~ "/vllm-"
#~ "kunlun/tree/main/benchmarks)工具,可以方便地在本地重新运行。我们将发布一个性能网站,用于展示每个拉取请求的性能测试结果。"
#~ msgid "**Accuracy test**: we're working on adding accuracy test to CI as well."
#~ msgstr "**准确性测试**我们也在努力将准确性测试添加到CI中。"
#~ msgid ""
#~ "Finnall, for each release, we'll publish"
#~ " the performance test and accuracy "
#~ "test report in the future."
#~ msgstr "最后,未来每个版本发布时,我们都会公开性能测试和准确性测试报告。"
#~ msgid "14. How to fix the error \"InvalidVersion\" when using vllm-kunlun?"
#~ msgstr "14. 使用 vllm-kunlun 时如何解决 “InvalidVersion” 错误?"
#~ msgid ""
#~ "It's usually because you have installed"
#~ " an dev/editable version of vLLM "
#~ "package. In this case, we provide "
#~ "the env variable `VLLM_VERSION` to let"
#~ " users specify the version of vLLM"
#~ " package to use. Please set the "
#~ "env variable `VLLM_VERSION` to the "
#~ "version of vLLM package you have "
#~ "installed. The format of `VLLM_VERSION` "
#~ "should be `X.Y.Z`."
#~ msgstr ""
#~ "这通常是因为你安装了开发版或可编辑版本的 vLLM 包。在这种情况下,我们提供了环境变量 "
#~ "`VLLM_VERSION`,以便用户指定要使用的 vLLM 包版本。请将环境变量 "
#~ "`VLLM_VERSION` 设置为你已安装的 vLLM 包的版本。`VLLM_VERSION` "
#~ "的格式应为 `X.Y.Z`。"
#~ msgid "15. How to handle Out Of Memory?"
#~ msgstr "15. 如何处理内存溢出?"
#~ msgid ""
#~ "OOM errors typically occur when the "
#~ "model exceeds the memory capacity of "
#~ "a single XPU. For general guidance, "
#~ "you can refer to [vLLM's OOM "
#~ "troubleshooting "
#~ "documentation](https://docs.vllm.ai/en/latest/getting_started/troubleshooting.html"
#~ "#out-of-memory)."
#~ msgstr ""
#~ "当模型超出单个 XPU 的内存容量时,通常会发生 OOM内存溢出错误。一般性的指导可以参考 "
#~ "[vLLM 的 OOM "
#~ "故障排除文档](https://docs.vllm.ai/en/latest/getting_started/troubleshooting.html"
#~ "#out-of-memory)。"
#~ msgid ""
#~ "In scenarios where XPUs have limited "
#~ "HBM (High Bandwidth Memory) capacity, "
#~ "dynamic memory allocation/deallocation during "
#~ "inference can exacerbate memory fragmentation,"
#~ " leading to OOM. To address this:"
#~ msgstr ""
#~ "在 XPU 的 "
#~ "HBM高带宽内存容量有限的场景下推理过程中动态内存分配和释放会加剧内存碎片从而导致 "
#~ "OOM内存溢出。为了解决这个问题"
#~ msgid ""
#~ "**Adjust `--gpu-memory-utilization`**: If "
#~ "unspecified, will use the default value"
#~ " of `0.9`. You can decrease this "
#~ "param to reserve more memory to "
#~ "reduce fragmentation risks. See more "
#~ "note in: [vLLM - Inference and "
#~ "Serving - Engine "
#~ "Arguments](https://docs.vllm.ai/en/latest/serving/engine_args.html#vllm.engine"
#~ ".arg_utils-_engine_args_parser-cacheconfig)."
#~ msgstr ""
#~ "**调整 `--gpu-memory-utilization`**:如果未指定,将使用默认值 "
#~ "`0.9`。你可以降低此参数来预留更多内存,从而降低内存碎片风险。参见更多说明:[vLLM - 推理与服务 "
#~ "- "
#~ "引擎参数](https://docs.vllm.ai/en/latest/serving/engine_args.html#vllm.engine"
#~ ".arg_utils-_engine_args_parser-cacheconfig)。"
#~ msgid ""
#~ "**Configure `PYTORCH_XPU_ALLOC_CONF`**: Set this "
#~ "environment variable to optimize XPU "
#~ "memory management. For example, you can"
#~ " `export PYTORCH_XPU_ALLOC_CONF=expandable_segments:True` "
#~ "to enable virtual memory feature to "
#~ "mitigate memory fragmentation caused by "
#~ "frequent dynamic memory size adjustments "
#~ "during runtime, see more note in: "
#~ "[PYTORCH_XPU_ALLOC_CONF](https://www.hikunlun.com/document/detail/zh/Pytorch/700/comref/Envvariables/Envir_012.html)."
#~ msgstr ""
#~ "**配置 `PYTORCH_XPU_ALLOC_CONF`**设置此环境变量以优化XPU内存管理。例如你可以通过 "
#~ "`export PYTORCH_XPU_ALLOC_CONF=expandable_segments:True` "
#~ "来启用虚拟内存功能,以缓解运行时频繁动态调整内存大小导致的内存碎片问题,更多说明参见:[PYTORCH_XPU_ALLOC_CONF](https://www.hikunlun.com/document/detail/zh/Pytorch/700/comref/Envvariables/Envir_012.html)。"
#~ msgid "16. Failed to enable XPU graph mode when running DeepSeek?"
#~ msgstr "16. 运行 DeepSeek 时无法启用 XPU 图模式?"
#~ msgid ""
#~ "You may encounter the following error"
#~ " if running DeepSeek with XPU graph"
#~ " mode enabled. The allowed number of"
#~ " queries per kv when enabling both"
#~ " MLA and Graph mode only support "
#~ "{32, 64, 128}, **Thus this is not"
#~ " supported for DeepSeek-V2-Lite**, as it"
#~ " only has 16 attention heads. The "
#~ "XPU graph mode support on "
#~ "DeepSeek-V2-Lite will be done in the "
#~ "future."
#~ msgstr ""
#~ "如果在启用 XPU 图模式Graph mode运行 DeepSeek 时,你可能会遇到以下错误。"
#~ "当同时启用 MLA 和图模式时,每个 kv 允许的查询数仅支持 {32, 64, 128}"
#~ "**因此不支持 DeepSeek-V2-Lite**,因为它只有 16 个注意力头。未来会增加对 DeepSeek-V2-Lite 在 XPU 图模式下的支持。"
#~ msgid ""
#~ "And if you're using DeepSeek-V3 or "
#~ "DeepSeek-R1, please make sure after the"
#~ " tensor parallel split, num_heads / "
#~ "num_kv_heads in {32, 64, 128}."
#~ msgstr ""
#~ "如果你正在使用 DeepSeek-V3 或 "
#~ "DeepSeek-R1请确保在张量并行切分后num_heads / num_kv_heads 的值为"
#~ " {32, 64, 128} 中的一个。"
#~ msgid ""
#~ "17. Failed to reinstall vllm-kunlun "
#~ "from source after uninstalling vllm-"
#~ "kunlun?"
#~ msgstr "17. 卸载 vllm-kunlun 后无法从源码重新安装 vllm-kunlun"
#~ msgid ""
#~ "You may encounter the problem of C"
#~ " compilation failure when reinstalling "
#~ "vllm-kunlun from source using pip. If"
#~ " the installation fails, it is "
#~ "recommended to use `python setup.py "
#~ "install` to install, or use `python "
#~ "setup.py clean` to clear the cache."
#~ msgstr ""
#~ "当你使用 pip 从源码重新安装 vllm-kunlun 时,可能会遇到 "
#~ "C 编译失败的问题。如果安装失败,建议使用 `python setup.py "
#~ "install` 进行安装,或者使用 `python setup.py clean` "
#~ "清除缓存。"
#~ msgid "18. How to generate determinitic results when using vllm-kunlun?"
#~ msgstr "18. 使用 vllm-kunlun 时如何生成确定性结果?"
#~ msgid "There are several factors that affect output certainty:"
#~ msgstr "有几个因素会影响输出的确定性:"
#~ msgid ""
#~ "Sampler Method: using **Greedy sample** "
#~ "by setting `temperature=0` in "
#~ "`SamplingParams`, e.g.:"
#~ msgstr ""
#~ "采样方法:通过在 `SamplingParams` 中设置 `temperature=0` "
#~ "来使用 **贪婪采样Greedy sample**,例如:"
#~ msgid "Set the following enveriments parameters:"
#~ msgstr "设置以下环境参数:"


@@ -0,0 +1,78 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-10 17:48+0800\n"
"PO-Revision-Date: 2025-07-18 10:05+0800\n"
"Last-Translator: \n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/index.md:34
msgid "Getting Started"
msgstr "快速开始"
#: ../../source/index.md:44
msgid "User Guide"
msgstr "用户指南"
#: ../../source/index.md:54
msgid "Developer Guide"
msgstr "开发者指南"
#: ../../source/index.md:64
msgid "Community"
msgstr "社区"
#: ../../source/index.md:1
msgid "Welcome to vLLM Kunlun Plugin"
msgstr "欢迎使用 vLLM Kunlun 插件"
#: ../../source/index.md:3
msgid "vLLM"
msgstr "vLLM"
#: ../../source/index.md:25
msgid ""
"vLLM Kunlun (vllm-kunlun) is a community-maintained hardware plugin "
"designed to seamlessly run vLLM on the Kunlun XPU. It is the recommended "
"approach for integrating the Kunlun backend within the vLLM community, "
"adhering to the principles outlined in the [[RFC]: Hardware "
"pluggable](https://github.com/vllm-project/vllm/issues/11162). This "
"plugin provides a hardware-pluggable interface that decouples the "
"integration of the Kunlun XPU with vLLM."
msgstr "vLLM Kunlunvllm-kunlun是一个由社区维护的硬件插件旨在无缝地在昆仑 XPU 上运行 vLLM。它是将昆仑后端集成到 vLLM 社区的推荐方法,遵循 [[RFC]:硬件可插拔](https://github.com/vllm-project/vllm/issues/11162) 中提出的原则,提供了一个硬件可插拔接口,实现了昆仑 XPU 与 vLLM 集成的解耦。"
#: ../../source/index.md:27
msgid ""
"By utilizing the vLLM Kunlun plugin, popular open-source models, "
"including Transformer-like, Mixture-of-Expert, Embedding, and Multi-modal"
" LLMs, can run effortlessly on the Kunlun XPU."
msgstr ""
"通过使用 vLLM Kunlun 插件,流行的开源模型,包括 Transformer 类、混合专家MoE、嵌入Embedding"
"和多模态大语言模型,都可以在 Kunlun XPU 上无缝运行。"
#: ../../source/index.md:31
msgid "Documentation"
msgstr "文档"
#~ msgid ""
#~ "vLLM Kunlun plugin (vllm-kunlun) is "
#~ "a community maintained hardware plugin "
#~ "for running vLLM on the Kunlun "
#~ "XPU."
#~ msgstr "vLLM Kunlun 插件vllm-kunlun是一个由社区维护的硬件插件用于在 Kunlun XPU 上运行 vLLM。"


@@ -0,0 +1,260 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-10 16:59+0800\n"
"PO-Revision-Date: 2025-07-18 10:09+0800\n"
"Last-Translator: \n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/installation.md:1
msgid "Installation"
msgstr "安装"
#~ msgid "This document describes how to install vllm-kunlun manually."
#~ msgstr "本文档介绍如何手动安装 vllm-kunlun。"
#~ msgid "Requirements"
#~ msgstr "要求"
#~ msgid "OS: Linux"
#~ msgstr "操作系统Linux"
#~ msgid "Python: >= 3.9, < 3.12"
#~ msgstr "Python>= 3.9< 3.12"
#~ msgid "A hardware with Kunlun XPU. It's usually the Atlas 800 A2 series."
#~ msgstr "配备 Kunlun XPU 的硬件,通常是 Atlas 800 A2 系列。"
#~ msgid "Software:"
#~ msgstr "软件:"
#~ msgid "Software"
#~ msgstr "软件"
#~ msgid "Supported version"
#~ msgstr "支持的版本"
#~ msgid "Note"
#~ msgstr "说明"
#~ msgid "CANN"
#~ msgstr "CANN"
#~ msgid ">= 8.1.RC1"
#~ msgstr ">= 8.1.RC1"
#~ msgid "Required for vllm-kunlun and torch-xpu"
#~ msgstr "vllm-kunlun 和 torch-xpu 必需"
#~ msgid "torch-xpu"
#~ msgstr "torch-xpu"
#~ msgid ">= 2.5.1.post1.dev20250619"
#~ msgstr ">= 2.5.1.post1.dev20250619"
#~ msgid ""
#~ "Required for vllm-kunlun, No need "
#~ "to install manually, it will be "
#~ "auto installed in below steps"
#~ msgstr "vllm-kunlun 必需,无需手动安装,后续步骤会自动安装。"
#~ msgid "torch"
#~ msgstr "torch"
#~ msgid ">= 2.5.1"
#~ msgstr ">= 2.5.1"
#~ msgid "Required for torch-xpu and vllm"
#~ msgstr "torch-xpu 和 vllm 所需"
#~ msgid "You have 2 way to install:"
#~ msgstr "你有两种安装方式:"
#~ msgid ""
#~ "**Using pip**: first prepare env "
#~ "manually or via CANN image, then "
#~ "install `vllm-kunlun` using pip."
#~ msgstr "**使用 pip**:首先手动准备环境或通过 CANN 镜像准备环境,然后使用 pip 安装 `vllm-kunlun`。"
#~ msgid ""
#~ "**Using docker**: use the `vllm-kunlun`"
#~ " pre-built docker image directly."
#~ msgstr "**使用 docker**:直接使用 `vllm-kunlun` 预构建的 docker 镜像。"
#~ msgid "Configure a new environment"
#~ msgstr "配置一个新环境"
#~ msgid ""
#~ "Before installing, you need to make "
#~ "sure firmware/driver and CANN are "
#~ "installed correctly, refer to "
#~ "[link](https://kunlun.github.io/docs/sources/kunlun/quick_install.html)"
#~ " for more details."
#~ msgstr ""
#~ "在安装之前,你需要确保固件/驱动和 CANN 已正确安装,更多详情请参考 "
#~ "[链接](https://kunlun.github.io/docs/sources/kunlun/quick_install.html)。"
#~ msgid "Configure hardware environment"
#~ msgstr "配置硬件环境"
#~ msgid ""
#~ "To verify that the Kunlun XPU "
#~ "firmware and driver were correctly "
#~ "installed, run:"
#~ msgstr "要验证 Kunlun XPU 固件和驱动程序是否正确安装,请运行:"
#~ msgid ""
#~ "Refer to [Kunlun Environment Setup "
#~ "Guide](https://kunlun.github.io/docs/sources/kunlun/quick_install.html)"
#~ " for more details."
#~ msgstr "更多详情请参考[Kunlun环境搭建指南](https://kunlun.github.io/docs/sources/kunlun/quick_install.html)。"
#~ msgid "Configure software environment"
#~ msgstr "配置软件环境"
#~ msgid "Before using pip"
#~ msgstr "在使用 pip 之前"
#~ msgid ""
#~ "The easiest way to prepare your "
#~ "software environment is using CANN image"
#~ " directly:"
#~ msgstr "最简单的方式是直接使用 CANN 镜像来准备你的软件环境:"
#~ msgid "Click here to see \"Install CANN manually\""
#~ msgstr "点击此处查看“手动安装 CANN”"
#~ msgid "You can also install CANN manually:"
#~ msgstr "你也可以手动安装 CANN"
#~ msgid "Before using docker"
#~ msgstr "在使用 docker 之前"
#~ msgid ""
#~ "No more extra step if you are "
#~ "using `vllm-kunlun` prebuilt docker "
#~ "image."
#~ msgstr "如果你使用 `vllm-kunlun` 预构建的 docker 镜像,就无需额外的步骤。"
#~ msgid "Once it's done, you can start to set up `vllm` and `vllm-kunlun`."
#~ msgstr "完成后,你可以开始配置 `vllm` 和 `vllm-kunlun`。"
#~ msgid "Setup vllm and vllm-kunlun"
#~ msgstr "安装 vllm 和 vllm-kunlun"
#~ msgid "Using pip"
#~ msgstr "使用 pip"
#~ msgid "First install system dependencies and config pip mirror:"
#~ msgstr "首先安装系统依赖并配置 pip 镜像:"
#~ msgid ""
#~ "**[Optional]** Then config the extra-"
#~ "index of `pip` if you are working"
#~ " on a x86 machine or using "
#~ "torch-xpu dev version:"
#~ msgstr "**[可选]** 如果你在 x86 机器上工作或使用 torch-xpu 开发版,请配置 `pip` 的额外索引:"
#~ msgid "Then you can install `vllm` and `vllm-kunlun` from **pre-built wheel**:"
#~ msgstr "然后你可以从**预编译的 wheel 包**安装 `vllm` 和 `vllm-kunlun`"
#~ msgid "Click here to see \"Build from source code\""
#~ msgstr "点击此处查看“从源代码构建”"
#~ msgid "or build from **source code**:"
#~ msgstr "或者从**源代码**构建:"
#~ msgid ""
#~ "vllm-kunlun will build custom ops "
#~ "by default. If you don't want to"
#~ " build it, set `COMPILE_CUSTOM_KERNELS=0` "
#~ "environment to disable it."
#~ msgstr ""
#~ "vllm-kunlun 默认会编译自定义算子。如果你不想编译它,可以设置环境变量 "
#~ "`COMPILE_CUSTOM_KERNELS=0` 来禁用。"
#~ msgid ""
#~ "If you are building from v0.7.3-dev "
#~ "and intend to use sleep mode "
#~ "feature, you should set "
#~ "`COMPILE_CUSTOM_KERNELS=1` manually. To build "
#~ "custom ops, gcc/g++ higher than 8 "
#~ "and c++ 17 or higher is required."
#~ " If you're using `pip install -e "
#~ ".` and encourage a torch-xpu "
#~ "version conflict, please install with "
#~ "`pip install --no-build-isolation -e "
#~ ".` to build on system env. If "
#~ "you encounter other problems during "
#~ "compiling, it is probably because "
#~ "unexpected compiler is being used, you"
#~ " may export `CXX_COMPILER` and `C_COMPILER`"
#~ " in env to specify your g++ and"
#~ " gcc locations before compiling."
#~ msgstr ""
#~ "如果你是从 v0.7.3-dev 版本开始构建,并且打算使用休眠模式功能,你需要手动设置 "
#~ "`COMPILE_CUSTOM_KERNELS=1`。构建自定义算子时,要求 gcc/g++ 版本高于 "
#~ "8 且支持 c++ 17 或更高标准。如果你正在使用 `pip "
#~ "install -e .` 并且出现了 torch-xpu "
#~ "版本冲突,请使用 `pip install --no-build-"
#~ "isolation -e .` "
#~ "在系统环境下进行安装。如果在编译过程中遇到其它问题,可能是因为使用了非预期的编译器,你可以在编译前通过环境变量导出 "
#~ "`CXX_COMPILER` 和 `C_COMPILER`,以指定你的 g++ 和 "
#~ "gcc 路径。"
#~ msgid "Using docker"
#~ msgstr "使用 docker"
#~ msgid "You can just pull the **prebuilt image** and run it with bash."
#~ msgstr "你可以直接拉取**预构建镜像**并用 bash 运行它。"
#~ msgid "Click here to see \"Build from Dockerfile\""
#~ msgstr "点击这里查看“从 Dockerfile 构建”"
#~ msgid "or build IMAGE from **source code**:"
#~ msgstr "或从**源代码**构建 IMAGE"
#~ msgid ""
#~ "The default workdir is `/workspace`, "
#~ "vLLM and vLLM Kunlun code are "
#~ "placed in `/vllm-workspace` and "
#~ "installed in [development "
#~ "mode](https://setuptools.pypa.io/en/latest/userguide/development_mode.html)(`pip"
#~ " install -e`) to help developer "
#~ "immediately take place changes without "
#~ "requiring a new installation."
#~ msgstr ""
#~ "默认的工作目录是 `/workspace`vLLM 和 vLLM Kunlun "
#~ "代码被放置在 `/vllm-"
#~ "workspace`,并以[开发模式](https://setuptools.pypa.io/en/latest/userguide/development_mode.html)`pip"
#~ " install -e`)安装,以便开发者能够即时生效更改,而无需重新安装。"
#~ msgid "Extra information"
#~ msgstr "额外信息"
#~ msgid "Verify installation"
#~ msgstr "验证安装"
#~ msgid "Create and run a simple inference test. The `example.py` can be like:"
#~ msgstr "创建并运行一个简单的推理测试。`example.py` 可以如下:"
#~ msgid "Then run:"
#~ msgstr "然后运行:"
#~ msgid "The output will be like:"
#~ msgstr "输出将会像这样:"


@@ -0,0 +1,139 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-10 16:59+0800\n"
"PO-Revision-Date: 2025-07-18 10:09+0800\n"
"Last-Translator: \n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/quick_start.md:1
msgid "Quickstart"
msgstr "快速入门"
#: ../../source/quick_start.md:3
msgid "Prerequisites"
msgstr "先决条件"
#: ../../source/quick_start.md:5
msgid "Supported Devices"
msgstr "支持的设备"
#~ msgid ""
#~ "Atlas A2 Training series (Atlas 800T "
#~ "A2, Atlas 900 A2 PoD, Atlas 200T"
#~ " A2 Box16, Atlas 300T A2)"
#~ msgstr ""
#~ "Atlas A2 训练系列Atlas 800T A2Atlas 900"
#~ " A2 PoDAtlas 200T A2 Box16Atlas "
#~ "300T A2"
#~ msgid "Atlas 800I A2 Inference series (Atlas 800I A2)"
#~ msgstr "Atlas 800I A2 推理系列Atlas 800I A2"
#~ msgid "Setup environment using container"
#~ msgstr "使用容器设置环境"
#~ msgid "Ubuntu"
#~ msgstr "Ubuntu"
#~ msgid "openEuler"
#~ msgstr "openEuler"
#~ msgid ""
#~ "The default workdir is `/workspace`, "
#~ "vLLM and vLLM Kunlun code are "
#~ "placed in `/vllm-workspace` and "
#~ "installed in [development "
#~ "mode](https://setuptools.pypa.io/en/latest/userguide/development_mode.html)(`pip"
#~ " install -e`) to help developer "
#~ "immediately take place changes without "
#~ "requiring a new installation."
#~ msgstr ""
#~ "默认的工作目录是 `/workspace`vLLM 和 vLLM Kunlun "
#~ "代码被放置在 `/vllm-"
#~ "workspace`,并以[开发模式](https://setuptools.pypa.io/en/latest/userguide/development_mode.html)`pip"
#~ " install -e`)安装,以便开发者能够即时生效更改,而无需重新安装。"
#~ msgid "Usage"
#~ msgstr "用法"
#~ msgid "You can use Modelscope mirror to speed up download:"
#~ msgstr "你可以使用 Modelscope 镜像来加速下载:"
#~ msgid "There are two ways to start vLLM on Kunlun XPU:"
#~ msgstr "在 Kunlun XPU 上启动 vLLM 有两种方式:"
#~ msgid "Offline Batched Inference"
#~ msgstr "离线批量推理"
#~ msgid ""
#~ "With vLLM installed, you can start "
#~ "generating texts for list of input "
#~ "prompts (i.e. offline batch inferencing)."
#~ msgstr "安装好 vLLM 后,你可以开始为一组输入提示生成文本(即离线批量推理)。"
#~ msgid ""
#~ "Try to run below Python script "
#~ "directly or use `python3` shell to "
#~ "generate texts:"
#~ msgstr "尝试直接运行下面的 Python 脚本,或者使用 `python3` 交互式命令行来生成文本:"
#~ msgid "OpenAI Completions API"
#~ msgstr "OpenAI Completions API"
#~ msgid ""
#~ "vLLM can also be deployed as a "
#~ "server that implements the OpenAI API"
#~ " protocol. Run the following command "
#~ "to start the vLLM server with the"
#~ " [Qwen/Qwen2.5-0.5B-"
#~ "Instruct](https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct) "
#~ "model:"
#~ msgstr ""
#~ "vLLM 也可以作为实现 OpenAI API 协议的服务器进行部署。运行以下命令,使用"
#~ " [Qwen/Qwen2.5-0.5B-"
#~ "Instruct](https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct) "
#~ "模型启动 vLLM 服务器:"
#~ msgid "If you see log as below:"
#~ msgstr "如果你看到如下日志:"
#~ msgid "Congratulations, you have successfully started the vLLM server!"
#~ msgstr "恭喜,你已经成功启动了 vLLM 服务器!"
#~ msgid "You can query the list the models:"
#~ msgstr "你可以查询模型列表:"
#~ msgid "You can also query the model with input prompts:"
#~ msgstr "你也可以通过输入提示来查询模型:"
#~ msgid ""
#~ "vLLM is serving as background process,"
#~ " you can use `kill -2 $VLLM_PID` "
#~ "to stop the background process "
#~ "gracefully, it's equal to `Ctrl-C` to"
#~ " stop foreground vLLM process:"
#~ msgstr ""
#~ "vLLM 正作为后台进程运行,你可以使用 `kill -2 $VLLM_PID` "
#~ "来优雅地停止后台进程,这等同于使用 `Ctrl-C` 停止前台 vLLM 进程:"
#~ msgid "You will see output as below:"
#~ msgstr "你将会看到如下输出:"
#~ msgid "Finally, you can exit container by using `ctrl-D`."
#~ msgstr "最后,你可以通过按 `ctrl-D` 退出容器。"


@@ -0,0 +1,30 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun \n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-10 16:59+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/tutorials/DeepSeek-V3.2-Exp.md:1
msgid "DeepSeek-V3.2-Exp"
msgstr ""
#: ../../source/tutorials/DeepSeek-V3.2-Exp.md:3
msgid "Introduction"
msgstr ""


@@ -0,0 +1,29 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-07-18 09:01+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Language: zh_CN\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"Generated-By: Babel 2.17.0\n"
#: ../../tutorials/index.md:3
msgid "Deployment"
msgstr "部署"
#: ../../tutorials/index.md:1
msgid "Tutorials"
msgstr "教程"


@@ -0,0 +1,213 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-10 16:59+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/tutorials/multi_node.md:1
msgid "Multi-Node-DP (DeepSeek)"
msgstr "多节点数据并行DeepSeek"
#: ../../source/tutorials/multi_node.md:3
msgid "Getting Start"
msgstr "快速开始"
#~ msgid ""
#~ "vLLM-Kunlun now supports Data Parallel"
#~ " (DP) deployment, enabling model weights"
#~ " to be replicated across multiple "
#~ "XPUs or instances, each processing "
#~ "independent batches of requests. This is"
#~ " particularly useful for scaling throughput"
#~ " across devices while maintaining high "
#~ "resource utilization."
#~ msgstr ""
#~ "vLLM-Kunlun 现在支持数据并行DP部署可以在多个 XPU "
#~ "或实例之间复制模型权重,每个实例处理独立的请求批次。这对于在保证高资源利用率的同时,实现跨设备的吞吐量扩展特别有用。"
#~ msgid ""
#~ "Each DP rank is deployed as a "
#~ "separate “core engine” process which "
#~ "communicates with front-end process(es) "
#~ "via ZMQ sockets. Data Parallel can "
#~ "be combined with Tensor Parallel, in "
#~ "which case each DP engine owns a"
#~ " number of per-XPU worker processes"
#~ " equal to the TP size."
#~ msgstr ""
#~ "每个 DP rank 作为一个单独的“核心引擎”进程部署,并通过 ZMQ "
#~ "套接字与前端进程通信。数据并行可以与张量并行结合使用,此时每个 DP "
#~ "引擎拥有与 TP 大小相同数量的单 XPU 工作进程。"
#~ msgid ""
#~ "For Mixture-of-Experts (MoE) models "
#~ "— especially advanced architectures like "
#~ "DeepSeek that utilize Multi-head Latent"
#~ " Attention (MLA) — a hybrid "
#~ "parallelism approach is recommended: - "
#~ "Use **Data Parallelism (DP)** for "
#~ "attention layers, which are replicated "
#~ "across devices and handle separate "
#~ "batches. - Use **Expert or Tensor"
#~ " Parallelism (EP/TP)** for expert layers,"
#~ " which are sharded across devices to"
#~ " distribute the computation."
#~ msgstr ""
#~ "对于混合专家Mixture-of-Experts, MoE模型——尤其是像 "
#~ "DeepSeek 这样采用多头潜在注意力Multi-head Latent "
#~ "Attention, MLA的高级架构——推荐使用混合并行策略\n"
#~ " - 对于注意力层,使用 **数据并行Data Parallelism, DP**,这些层会在各设备间复刻,并处理不同的批次。\n"
#~ " - 对于专家层,使用 **专家并行或张量并行Expert or "
#~ "Tensor Parallelism, EP/TP**,这些层会在设备间分片,从而分担计算。"
#~ msgid ""
#~ "This division enables attention layers "
#~ "to be replicated across Data Parallel"
#~ " (DP) ranks, enabling them to process"
#~ " different batches independently. Meanwhile, "
#~ "expert layers are partitioned (sharded) "
#~ "across devices using Expert or Tensor"
#~ " Parallelism(DP*TP), maximizing hardware "
#~ "utilization and efficiency."
#~ msgstr "这种划分使得注意力层能够在数据并行DP组内复制从而能够独立处理不同的批次。同时专家层通过专家或张量并行DP*TP在设备间进行分区切片最大化硬件利用率和效率。"
#~ msgid ""
#~ "In these cases the data parallel "
#~ "ranks are not completely independent, "
#~ "forward passes must be aligned and "
#~ "expert layers across all ranks are "
#~ "required to synchronize during every "
#~ "forward pass, even if there are "
#~ "fewer requests to be processed than "
#~ "DP ranks."
#~ msgstr ""
#~ "在这些情况下,数据并行的各个 rank 不是完全独立的,前向传播必须对齐,并且所有 rank "
#~ "上的专家层在每次前向传播时都需要同步,即使待处理的请求数量少于 DP rank 的数量。"
#~ msgid ""
#~ "For MoE models, when any requests "
#~ "are in progress in any rank, we"
#~ " must ensure that empty “dummy” "
#~ "forward passes are performed in all "
#~ "ranks which dont currently have any "
#~ "requests scheduled. This is handled via"
#~ " a separate DP `Coordinator` process "
#~ "which communicates with all of the "
#~ "ranks, and a collective operation "
#~ "performed every N steps to determine "
#~ "when all ranks become idle and can"
#~ " be paused. When TP is used in"
#~ " conjunction with DP, expert layers "
#~ "form an EP or TP group of "
#~ "size (DP x TP)."
#~ msgstr ""
#~ "对于 MoE 模型,当任何一个 rank 有请求正在进行时,必须确保所有当前没有请求的"
#~ " rank 都执行空的“虚拟”前向传播。这是通过一个单独的 DP `Coordinator`"
#~ " 协调器进程来实现的,该进程与所有 rank 通信,并且每隔 N "
#~ "步执行一次集体操作,以判断所有 rank 是否都处于空闲状态并可以暂停。当 TP 与 "
#~ "DP 结合使用时专家层会组成一个规模为DP x TP的 EP 或 "
#~ "TP 组。"
#~ msgid "Verify Multi-Node Communication Environment"
#~ msgstr "验证多节点通信环境"
#~ msgid "Physical Layer Requirements:"
#~ msgstr "物理层要求:"
#~ msgid ""
#~ "The physical machines must be located"
#~ " on the same WLAN, with network "
#~ "connectivity."
#~ msgstr "物理机器必须位于同一个 WLAN 中,并且具有网络连接。"
#~ msgid ""
#~ "All XPUs are connected with optical "
#~ "modules, and the connection status must"
#~ " be normal."
#~ msgstr "所有 XPU 都通过光模块连接,且连接状态必须正常。"
#~ msgid "Verification Process:"
#~ msgstr "验证流程:"
#~ msgid ""
#~ "Execute the following commands on each"
#~ " node in sequence. The results must"
#~ " all be `success` and the status "
#~ "must be `UP`:"
#~ msgstr "在每个节点上依次执行以下命令。所有结果必须为 `success` 且状态必须为 `UP`"
#~ msgid "XPU Interconnect Verification:"
#~ msgstr "XPU 互连验证:"
#~ msgid "1. Get XPU IP Addresses"
#~ msgstr "1. 获取 XPU IP 地址"
#~ msgid "2. Cross-Node PING Test"
#~ msgstr "2. 跨节点 PING 测试"
#~ msgid "Run with docker"
#~ msgstr "用 docker 运行"
#~ msgid ""
#~ "Assume you have two Atlas 800 "
#~ "A2(64G*8) nodes, and want to deploy "
#~ "the `deepseek-v3-w8a8` quantitative model "
#~ "across multi-node."
#~ msgstr "假设你有两台 Atlas 800 A264G*8节点并且想要在多节点上部署 `deepseek-v3-w8a8` 量化模型。"
#~ msgid ""
#~ "Before launch the inference server, "
#~ "ensure some environment variables are "
#~ "set for multi node communication"
#~ msgstr "在启动推理服务器之前,确保已经为多节点通信设置了一些环境变量。"
#~ msgid "Run the following scripts on two nodes respectively"
#~ msgstr "分别在两台节点上运行以下脚本"
#~ msgid "**node0**"
#~ msgstr "**节点0**"
#~ msgid "**node1**"
#~ msgstr "**节点1**"
#~ msgid ""
#~ "The Deployment view looks like: ![alt"
#~ " text](../assets/multi_node_dp.png)"
#~ msgstr "部署视图如下所示:![替代文本](../assets/multi_node_dp.png)"
#~ msgid "alt text"
#~ msgstr "替代文本"
#~ msgid ""
#~ "Once your server is started, you "
#~ "can query the model with input "
#~ "prompts:"
#~ msgstr "一旦你的服务器启动,你可以通过输入提示词来查询模型:"
#~ msgid "Run benchmarks"
#~ msgstr "运行基准测试"
#~ msgid ""
#~ "For details please refer to "
#~ "[benchmark](https://github.com/vllm-project/vllm-"
#~ "kunlun/tree/main/benchmarks)"
#~ msgstr ""
#~ "详细信息请参阅 [benchmark](https://github.com/vllm-project"
#~ "/vllm-kunlun/tree/main/benchmarks)"


@@ -0,0 +1,30 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun \n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-10 16:59+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/tutorials/multi_node_kimi.md:1
msgid "Multi-Node-DP (Kimi-K2)"
msgstr ""
#: ../../source/tutorials/multi_node_kimi.md:3
msgid "Verify Multi-Node Communication Environment"
msgstr ""


@@ -0,0 +1,30 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun \n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-10 16:59+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/tutorials/multi_node_pd_disaggregation_llmdatadist.md:1
msgid "Prefill-Decode Disaggregation Llmdatadist Verification (Qwen)"
msgstr ""
#: ../../source/tutorials/multi_node_pd_disaggregation_llmdatadist.md:3
msgid "Getting Start"
msgstr ""


@@ -0,0 +1,30 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun \n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-10 16:59+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/tutorials/multi_node_pd_disaggregation_mooncake.md:1
msgid "Prefill-Decode Disaggregation Mooncake Verification (Qwen)"
msgstr ""
#: ../../source/tutorials/multi_node_pd_disaggregation_mooncake.md:3
msgid "Getting Start"
msgstr ""


@@ -0,0 +1,26 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun \n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-10 16:59+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/tutorials/multi_node_qwen3vl.md:1
msgid "Multi-Node-DP (Qwen3-VL-235B-A22B)"
msgstr ""


@@ -0,0 +1,26 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun \n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-10 16:59+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/tutorials/multi_node_ray.md:1
msgid "Multi-Node-Ray (Qwen/Qwen3-235B-A22B)"
msgstr ""


@@ -0,0 +1,53 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-10 16:59+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/tutorials/multi_npu.md:1
msgid "Multi-XPU (QwQ 32B)"
msgstr "多XPUQwQ 32B"
#~ msgid "Run vllm-kunlun on Multi-XPU"
#~ msgstr "在多XPU上运行 vllm-kunlun"
#~ msgid "Run docker container:"
#~ msgstr "运行 docker 容器:"
#~ msgid "Setup environment variables:"
#~ msgstr "设置环境变量:"
#~ msgid "Online Inference on Multi-XPU"
#~ msgstr "多XPU的在线推理"
#~ msgid "Run the following script to start the vLLM server on Multi-XPU:"
#~ msgstr "运行以下脚本在多XPU上启动 vLLM 服务器:"
#~ msgid "Once your server is started, you can query the model with input prompts"
#~ msgstr "一旦服务器启动,就可以通过输入提示词来查询模型。"
#~ msgid "Offline Inference on Multi-XPU"
#~ msgstr "多XPU离线推理"
#~ msgid "Run the following script to execute offline inference on multi-XPU:"
#~ msgstr "运行以下脚本以在多XPU上执行离线推理"
#~ msgid "If you run this script successfully, you can see the info shown below:"
#~ msgstr "如果你成功运行此脚本,你可以看到如下所示的信息:"


@@ -0,0 +1,74 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-10 16:59+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/tutorials/multi_npu_moge.md:1
msgid "Multi-XPU (Pangu Pro MoE)"
msgstr "多XPUPangu Pro MoE"
#~ msgid "Run vllm-kunlun on Multi-XPU"
#~ msgstr "在多XPU上运行 vllm-kunlun"
#~ msgid "Run container:"
#~ msgstr "运行容器:"
#~ msgid "Setup environment variables:"
#~ msgstr "设置环境变量:"
#~ msgid "Download the model:"
#~ msgstr "下载该模型:"
#~ msgid "Online Inference on Multi-XPU"
#~ msgstr "多XPU上的在线推理"
#~ msgid "Run the following script to start the vLLM server on Multi-XPU:"
#~ msgstr "运行以下脚本在多XPU上启动 vLLM 服务器:"
#~ msgid ""
#~ "Once your server is started, you "
#~ "can query the model with input "
#~ "prompts:"
#~ msgstr "一旦你的服务器启动,你可以通过输入提示词来查询模型:"
#~ msgid "v1/completions"
#~ msgstr "v1/completions"
#~ msgid "v1/chat/completions"
#~ msgstr "v1/chat/completions"
#~ msgid "If you run this successfully, you can see the info shown below:"
#~ msgstr "如果成功运行,你可以看到如下所示的信息:"
#~ msgid "Offline Inference on Multi-XPU"
#~ msgstr "多XPU离线推理"
#~ msgid "Run the following script to execute offline inference on multi-XPU:"
#~ msgstr "运行以下脚本以在多XPU上执行离线推理"
#~ msgid "Graph Mode"
#~ msgstr "图模式"
#~ msgid "Eager Mode"
#~ msgstr "即时模式"
#~ msgid "If you run this script successfully, you can see the info shown below:"
#~ msgstr "如果你成功运行此脚本,你可以看到如下所示的信息:"


@@ -0,0 +1,82 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-10 16:59+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/tutorials/multi_npu_quantization.md:1
msgid "Multi-XPU (QwQ 32B W8A8)"
msgstr "多XPUQwQ 32B W8A8"
#: ../../source/tutorials/multi_npu_quantization.md:3
#, fuzzy
msgid "Run Docker Container"
msgstr "运行 docker 容器"
#~ msgid "w8a8 quantization feature is supported by v0.8.4rc2 or higher"
#~ msgstr "w8a8 量化功能由 v0.8.4rc2 或更高版本支持"
#~ msgid "Install modelslim and convert model"
#~ msgstr "安装 modelslim 并转换模型"
#~ msgid ""
#~ "You can choose to convert the "
#~ "model yourself or use the quantized "
#~ "model we uploaded, see "
#~ "https://www.modelscope.cn/models/vllm-kunlun/QwQ-32B-"
#~ "W8A8"
#~ msgstr ""
#~ "你可以选择自己转换模型,或者使用我们上传的量化模型,详见 https://www.modelscope.cn/models"
#~ "/vllm-kunlun/QwQ-32B-W8A8"
#~ msgid "Verify the quantized model"
#~ msgstr "验证量化模型"
#~ msgid "The converted model files looks like:"
#~ msgstr "转换后的模型文件如下所示:"
#~ msgid "Run the following script to start the vLLM server with quantized model:"
#~ msgstr "运行以下脚本以启动带有量化模型的 vLLM 服务器:"
#~ msgid ""
#~ "The value \"kunlun\" for \"--"
#~ "quantization\" argument will be supported "
#~ "after [a specific PR](https://github.com/vllm-"
#~ "project/vllm-kunlun/pull/877) is merged and"
#~ " released, you can cherry-pick this"
#~ " commit for now."
#~ msgstr ""
#~ "在[特定的 PR](https://github.com/vllm-project/vllm-"
#~ "kunlun/pull/877) 合并并发布后,\"--quantization\" "
#~ "参数才会支持取值 \"kunlun\"目前你可以先 cherry-pick 该提交。"
#~ msgid "Once your server is started, you can query the model with input prompts"
#~ msgstr "一旦服务器启动,就可以通过输入提示词来查询模型。"
#~ msgid ""
#~ "Run the following script to execute "
#~ "offline inference on multi-XPU with "
#~ "quantized model:"
#~ msgstr "运行以下脚本在多XPU上使用量化模型执行离线推理"
#~ msgid ""
#~ "To enable quantization for kunlun, "
#~ "quantization method must be \"kunlun\""
#~ msgstr "要在kunlun上启用量化量化方法必须为“kunlun”。"


@@ -0,0 +1,63 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-10 16:59+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/tutorials/multi_npu_qwen3_moe.md:1
msgid "Multi-XPU (Qwen3-30B-A3B)"
msgstr "多XPUQwen3-30B-A3B"
#~ msgid "Run vllm-kunlun on Multi-XPU with Qwen3 MoE"
#~ msgstr "在多XPU上运行带有Qwen3 MoE的vllm-kunlun"
#~ msgid "Run docker container:"
#~ msgstr "运行 docker 容器:"
#~ msgid "Setup environment variables:"
#~ msgstr "设置环境变量:"
#~ msgid "Online Inference on Multi-XPU"
#~ msgstr "多XPU的在线推理"
#~ msgid "Run the following script to start the vLLM server on Multi-XPU:"
#~ msgstr "运行以下脚本以在多XPU上启动 vLLM 服务器:"
#~ msgid ""
#~ "For an Atlas A2 with 64GB of "
#~ "XPU card memory, tensor-parallel-size"
#~ " should be at least 2, and for"
#~ " 32GB of memory, tensor-parallel-size"
#~ " should be at least 4."
#~ msgstr ""
#~ "对于拥有64GB XPU卡内存的Atlas A2tensor-parallel-size"
#~ " 至少应为2对于32GB内存的XPU卡tensor-parallel-size 至少应为4。"
#~ msgid "Once your server is started, you can query the model with input prompts"
#~ msgstr "一旦服务器启动,就可以通过输入提示词来查询模型。"
#~ msgid "Offline Inference on Multi-XPU"
#~ msgstr "多XPU离线推理"
#~ msgid "Run the following script to execute offline inference on multi-XPU:"
#~ msgstr "运行以下脚本以在多XPU上执行离线推理"
#~ msgid "If you run this script successfully, you can see the info shown below:"
#~ msgstr "如果你成功运行此脚本,你可以看到如下所示的信息:"


@@ -0,0 +1,26 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun \n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-10 16:59+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/tutorials/multi_npu_qwen3_next.md:1
msgid "Multi-XPU (Qwen3-Next)"
msgstr ""


@@ -0,0 +1,94 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-10 16:59+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/tutorials/single_node_300i.md:1
#, fuzzy
msgid "Single Node (Atlas 300I Series)"
msgstr "单节点Atlas 300I 系列)"
#~ msgid ""
#~ "This Atlas 300I series is currently "
#~ "experimental. In future versions, there "
#~ "may be behavioral changes around model"
#~ " coverage, performance improvement."
#~ msgstr "Atlas 300I 系列目前处于实验阶段。在未来的版本中,模型覆盖范围和性能提升方面可能会有行为上的变化。"
#~ msgid "Run vLLM on Altlas 300I series"
#~ msgstr "在 Atlas 300I 系列上运行 vLLM"
#~ msgid "Run docker container:"
#~ msgstr "运行 docker 容器:"
#~ msgid "Setup environment variables:"
#~ msgstr "设置环境变量:"
#~ msgid "Online Inference on XPU"
#~ msgstr "在XPU上进行在线推理"
#~ msgid ""
#~ "Run the following script to start "
#~ "the vLLM server on XPU(Qwen3-0.6B:1 "
#~ "card, Qwen2.5-7B-Instruct:2 cards, Pangu-"
#~ "Pro-MoE-72B: 8 cards):"
#~ msgstr ""
#~ "运行以下脚本,在 XPU 上启动 vLLM 服务器Qwen3-0.6B1 "
#~ "张卡Qwen2.5-7B-Instruct2 张卡Pangu-Pro-MoE-"
#~ "72B8 张卡):"
#~ msgid "Qwen3-0.6B"
#~ msgstr "Qwen3-0.6B"
#~ msgid "Run the following command to start the vLLM server:"
#~ msgstr "运行以下命令以启动 vLLM 服务器:"
#~ msgid "Once your server is started, you can query the model with input prompts"
#~ msgstr "一旦服务器启动,就可以通过输入提示词来查询模型。"
#~ msgid "Qwen/Qwen2.5-7B-Instruct"
#~ msgstr "Qwen/Qwen2.5-7B-Instruct"
#~ msgid "Pangu-Pro-MoE-72B"
#~ msgstr "Pangu-Pro-MoE-72B"
#~ msgid "Download the model:"
#~ msgstr "下载该模型:"
#~ msgid "If you run this script successfully, you can see the results."
#~ msgstr "如果你成功运行此脚本,你就可以看到结果。"
#~ msgid "Offline Inference"
#~ msgstr "离线推理"
#~ msgid ""
#~ "Run the following script (`example.py`) "
#~ "to execute offline inference on XPU:"
#~ msgstr "运行以下脚本(`example.py`)以在 XPU 上执行离线推理:"
#~ msgid "Qwen2.5-7B-Instruct"
#~ msgstr "Qwen2.5-7B-Instruct"
#~ msgid "Run script:"
#~ msgstr "运行脚本:"
#~ msgid "If you run this script successfully, you can see the info shown below:"
#~ msgstr "如果你成功运行此脚本,你可以看到如下所示的信息:"


@@ -0,0 +1,106 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-10 16:59+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/tutorials/single_npu.md:1
msgid "Single XPU (Qwen3 8B)"
msgstr "单个XPUQwen3 8B"
#: ../../source/tutorials/single_npu.md:3
msgid "Run vllm-kunlun on Single XPU"
msgstr "在单个 XPU 上运行 vllm-kunlun"
#: ../../source/tutorials/single_npu.md:5
msgid "Offline Inference on Single XPU"
msgstr "在单个XPU上进行离线推理"
#~ msgid "Run docker container:"
#~ msgstr "运行 docker 容器:"
#~ msgid "Setup environment variables:"
#~ msgstr "设置环境变量:"
#~ msgid ""
#~ "`max_split_size_mb` prevents the native "
#~ "allocator from splitting blocks larger "
#~ "than this size (in MB). This can"
#~ " reduce fragmentation and may allow "
#~ "some borderline workloads to complete "
#~ "without running out of memory. You "
#~ "can find more details "
#~ "[<u>here</u>](https://www.hikunlun.com/document/detail/zh/CANNCommunityEdition/800alpha003/apiref/envref/envref_07_0061.html)."
#~ msgstr ""
#~ "`max_split_size_mb` 防止本地分配器拆分超过此大小(以 MB "
#~ "为单位)的内存块。这可以减少内存碎片,并且可能让一些边缘情况下的工作负载顺利完成而不会耗尽内存。你可以在[<u>这里</u>](https://www.hikunlun.com/document/detail/zh/CANNCommunityEdition/800alpha003/apiref/envref/envref_07_0061.html)找到更多详细信息。"
#~ msgid "Run the following script to execute offline inference on a single XPU:"
#~ msgstr "运行以下脚本以在单个 XPU 上执行离线推理:"
#~ msgid "Graph Mode"
#~ msgstr "图模式"
#~ msgid "Eager Mode"
#~ msgstr "即时模式"
#~ msgid "If you run this script successfully, you can see the info shown below:"
#~ msgstr "如果你成功运行此脚本,你可以看到如下所示的信息:"
#~ msgid "Online Serving on Single XPU"
#~ msgstr "单个 XPU 上的在线服务"
#~ msgid "Run docker container to start the vLLM server on a single XPU:"
#~ msgstr "运行 docker 容器,在单个 XPU 上启动 vLLM 服务器:"
#~ msgid ""
#~ "Add `--max_model_len` option to avoid "
#~ "ValueError that the Qwen2.5-7B model's "
#~ "max seq len (32768) is larger than"
#~ " the maximum number of tokens that"
#~ " can be stored in KV cache "
#~ "(26240). This will differ with different"
#~ " XPU series base on the HBM "
#~ "size. Please modify the value according"
#~ " to a suitable value for your "
#~ "XPU series."
#~ msgstr ""
#~ "添加 `--max_model_len` 选项,以避免出现 Qwen2.5-7B "
#~ "模型的最大序列长度32768大于 KV 缓存能存储的最大 token "
#~ "数26240时的 ValueError。不同 XPU 系列由于 HBM "
#~ "容量不同,该值也会有所不同。请根据您的 XPU 系列,修改为合适的数值。"
#~ msgid "If your service start successfully, you can see the info shown below:"
#~ msgstr "如果你的服务启动成功,你会看到如下所示的信息:"
#~ msgid ""
#~ "Once your server is started, you "
#~ "can query the model with input "
#~ "prompts:"
#~ msgstr "一旦你的服务器启动,你可以通过输入提示词来查询模型:"
#~ msgid ""
#~ "If you query the server successfully,"
#~ " you can see the info shown "
#~ "below (client):"
#~ msgstr "如果你成功查询了服务器,你可以看到如下所示的信息(客户端):"
#~ msgid "Logs of the vllm server:"
#~ msgstr "vllm 服务器的日志:"


@@ -0,0 +1,77 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-07-18 09:01+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Language: zh_CN\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"Generated-By: Babel 2.17.0\n"
#: ../../tutorials/single_npu_audio.md:1
msgid "Single XPU (Qwen2-Audio 7B)"
msgstr "单个 XPUQwen2-Audio 7B"
#: ../../tutorials/single_npu_audio.md:3
msgid "Run vllm-kunlun on Single XPU"
msgstr "在单个 XPU 上运行 vllm-kunlun"
#: ../../tutorials/single_npu_audio.md:5
msgid "Offline Inference on Single XPU"
msgstr "在单个XPU上进行离线推理"
#: ../../tutorials/single_npu_audio.md:7
msgid "Run docker container:"
msgstr "运行 docker 容器:"
#: ../../tutorials/single_npu_audio.md:29
msgid "Setup environment variables:"
msgstr "设置环境变量:"
#: ../../tutorials/single_npu_audio.md:40
msgid ""
"`max_split_size_mb` prevents the native allocator from splitting blocks "
"larger than this size (in MB). This can reduce fragmentation and may allow "
"some borderline workloads to complete without running out of memory. You can"
" find more details "
"[<u>here</u>](https://www.hikunlun.com/document/detail/zh/CANNCommunityEdition/800alpha003/apiref/envref/envref_07_0061.html)."
msgstr ""
"`max_split_size_mb` 防止本地分配器拆分超过此大小(以 MB "
"为单位)的内存块。这可以减少内存碎片,并且可能让一些边缘情况下的工作负载顺利完成而不会耗尽内存。你可以在[<u>这里</u>](https://www.hikunlun.com/document/detail/zh/CANNCommunityEdition/800alpha003/apiref/envref/envref_07_0061.html)找到更多详细信息。"
#: ../../tutorials/single_npu_audio.md:43
msgid "Install packages required for audio processing:"
msgstr "安装音频处理所需的软件包:"
#: ../../tutorials/single_npu_audio.md:50
msgid "Run the following script to execute offline inference on a single XPU:"
msgstr "运行以下脚本以在单个 XPU 上执行离线推理:"
#: ../../tutorials/single_npu_audio.md:114
msgid "If you run this script successfully, you can see the info shown below:"
msgstr "如果你成功运行此脚本,你可以看到如下所示的信息:"
#: ../../tutorials/single_npu_audio.md:120
msgid "Online Serving on Single XPU"
msgstr "单个 XPU 上的在线服务"
#: ../../tutorials/single_npu_audio.md:122
msgid ""
"Currently, vllm's OpenAI-compatible server doesn't support audio inputs, "
"find more details [<u>here</u>](https://github.com/vllm-"
"project/vllm/issues/19977)."
msgstr ""
"目前vllm 的兼容 OpenAI 的服务器不支持音频输入,更多详情请查看[<u>这里</u>](https://github.com/vllm-"
"project/vllm/issues/19977)。"


@@ -0,0 +1,99 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-07-18 09:01+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Language: zh_CN\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"Generated-By: Babel 2.17.0\n"
#: ../../tutorials/single_npu_multimodal.md:1
msgid "Single XPU (Qwen2.5-VL 7B)"
msgstr "单个XPUQwen2.5-VL 7B"
#: ../../tutorials/single_npu_multimodal.md:3
msgid "Run vllm-kunlun on Single XPU"
msgstr "在单个 XPU 上运行 vllm-kunlun"
#: ../../tutorials/single_npu_multimodal.md:5
msgid "Offline Inference on Single XPU"
msgstr "在单个XPU上进行离线推理"
#: ../../tutorials/single_npu_multimodal.md:7
msgid "Run docker container:"
msgstr "运行 docker 容器:"
#: ../../tutorials/single_npu_multimodal.md:29
msgid "Setup environment variables:"
msgstr "设置环境变量:"
#: ../../tutorials/single_npu_multimodal.md:40
msgid ""
"`max_split_size_mb` prevents the native allocator from splitting blocks "
"larger than this size (in MB). This can reduce fragmentation and may allow "
"some borderline workloads to complete without running out of memory. You can"
" find more details "
"[<u>here</u>](https://www.hikunlun.com/document/detail/zh/CANNCommunityEdition/800alpha003/apiref/envref/envref_07_0061.html)."
msgstr ""
"`max_split_size_mb` 防止本地分配器拆分超过此大小(以 MB "
"为单位)的内存块。这可以减少内存碎片,并且可能让一些边缘情况下的工作负载顺利完成而不会耗尽内存。你可以在[<u>这里</u>](https://www.hikunlun.com/document/detail/zh/CANNCommunityEdition/800alpha003/apiref/envref/envref_07_0061.html)找到更多详细信息。"
#: ../../tutorials/single_npu_multimodal.md:43
msgid "Run the following script to execute offline inference on a single XPU:"
msgstr "运行以下脚本以在单个 XPU 上执行离线推理:"
#: ../../tutorials/single_npu_multimodal.md:109
msgid "If you run this script successfully, you can see the info shown below:"
msgstr "如果你成功运行此脚本,你可以看到如下所示的信息:"
#: ../../tutorials/single_npu_multimodal.md:121
msgid "Online Serving on Single XPU"
msgstr "单个 XPU 上的在线服务"
#: ../../tutorials/single_npu_multimodal.md:123
msgid "Run docker container to start the vLLM server on a single XPU:"
msgstr "运行 docker 容器,在单个 XPU 上启动 vLLM 服务器:"
#: ../../tutorials/single_npu_multimodal.md:154
msgid ""
"Add `--max_model_len` option to avoid ValueError that the "
"Qwen2.5-VL-7B-Instruct model's max seq len (128000) is larger than the "
"maximum number of tokens that can be stored in KV cache. This will differ "
"with different XPU series base on the HBM size. Please modify the value "
"according to a suitable value for your XPU series."
msgstr ""
"新增 `--max_model_len` 选项,以避免出现 ValueError即 Qwen2.5-VL-7B-Instruct "
"模型的最大序列长度128000大于 KV 缓存可存储的最大 token 数。该数值会根据不同 XPU 系列的 HBM 大小而不同。请根据你的 XPU"
" 系列,将该值设置为合适的数值。"
#: ../../tutorials/single_npu_multimodal.md:157
msgid "If your service start successfully, you can see the info shown below:"
msgstr "如果你的服务启动成功,你会看到如下所示的信息:"
#: ../../tutorials/single_npu_multimodal.md:165
msgid ""
"Once your server is started, you can query the model with input prompts:"
msgstr "一旦你的服务器启动,你可以通过输入提示词来查询模型:"
#: ../../tutorials/single_npu_multimodal.md:182
msgid ""
"If you query the server successfully, you can see the info shown below "
"(client):"
msgstr "如果你成功查询了服务器,你可以看到如下所示的信息(客户端):"
#: ../../tutorials/single_npu_multimodal.md:188
msgid "Logs of the vllm server:"
msgstr "vllm 服务器的日志:"


@@ -0,0 +1,38 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun \n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-10 16:59+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/tutorials/single_npu_qwen2.5_vl.md:1
msgid "Single XPU (Qwen2.5-VL 7B)"
msgstr ""
#: ../../source/tutorials/single_npu_qwen2.5_vl.md:3
msgid "Run vllm-kunlun on Single XPU"
msgstr ""
#: ../../source/tutorials/single_npu_qwen2.5_vl.md:5
msgid "Offline Inference on Single XPU"
msgstr ""
#: ../../source/tutorials/single_npu_qwen2.5_vl.md:7
msgid "Run docker container:"
msgstr ""


@@ -0,0 +1,38 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun \n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-10 16:59+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/tutorials/single_npu_qwen2_audio.md:1
msgid "Single XPU (Qwen2-Audio 7B)"
msgstr ""
#: ../../source/tutorials/single_npu_qwen2_audio.md:3
msgid "Run vllm-kunlun on Single XPU"
msgstr ""
#: ../../source/tutorials/single_npu_qwen2_audio.md:5
msgid "Offline Inference on Single XPU"
msgstr ""
#: ../../source/tutorials/single_npu_qwen2_audio.md:7
msgid "Run docker container:"
msgstr ""


@@ -0,0 +1,77 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-10 16:59+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/tutorials/single_npu_qwen3_embedding.md:1
msgid "Single XPU (Qwen3-Embedding-8B)"
msgstr "单个 XPU(Qwen3-Embedding-8B)"
#: ../../source/tutorials/single_npu_qwen3_embedding.md:3
msgid ""
"The Qwen3 Embedding model series is the latest proprietary model of the "
"Qwen family,"
msgstr ""
#~ msgid ""
#~ "The Qwen3 Embedding model series is "
#~ "the latest proprietary model of the "
#~ "Qwen family, specifically designed for "
#~ "text embedding and ranking tasks. "
#~ "Building upon the dense foundational "
#~ "models of the Qwen3 series, it "
#~ "provides a comprehensive range of text"
#~ " embeddings and reranking models in "
#~ "various sizes (0.6B, 4B, and 8B). "
#~ "This guide describes how to run "
#~ "the model with vLLM Kunlun. Note "
#~ "that only 0.9.2rc1 and higher versions"
#~ " of vLLM Kunlun support the model."
#~ msgstr ""
#~ "Qwen3 Embedding 模型系列是 Qwen "
#~ "家族最新的专有模型,专为文本嵌入和排序任务设计。在 Qwen3 "
#~ "系列的密集基础模型之上它提供了多种尺寸0.6B、4B 和 8B的文本嵌入与重排序模型。本指南介绍如何使用"
#~ " vLLM Kunlun 运行该模型。请注意,只有 vLLM Kunlun "
#~ "0.9.2rc1 及更高版本才支持该模型。"
#~ msgid "Run docker container"
#~ msgstr "运行 docker 容器"
#~ msgid ""
#~ "Take Qwen3-Embedding-8B model as an "
#~ "example, first run the docker container"
#~ " with the following command:"
#~ msgstr "以 Qwen3-Embedding-8B 模型为例,首先使用以下命令运行 docker 容器:"
#~ msgid "Setup environment variables:"
#~ msgstr "设置环境变量:"
#~ msgid "Online Inference"
#~ msgstr "在线推理"
#~ msgid "Once your server is started, you can query the model with input prompts"
#~ msgstr "一旦服务器启动,就可以通过输入提示词来查询模型。"
#~ msgid "Offline Inference"
#~ msgstr "离线推理"
#~ msgid "If you run this script successfully, you can see the info shown below:"
#~ msgstr "如果你成功运行此脚本,你可以看到如下所示的信息:"


@@ -0,0 +1,30 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun \n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-10 16:59+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/tutorials/single_npu_qwen3_quantization.md:1
msgid "Single-XPU (Qwen3 8B W4A8)"
msgstr ""
#: ../../source/tutorials/single_npu_qwen3_quantization.md:3
msgid "Run Docker Container"
msgstr ""


@@ -0,0 +1,245 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-10 16:59+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/user_guide/configuration/additional_config.md:1
msgid "Additional Configuration"
msgstr "附加配置"
#~ msgid ""
#~ "additional configuration is a mechanism "
#~ "provided by vLLM to allow plugins "
#~ "to control inner behavior by their "
#~ "own. vLLM Kunlun uses this mechanism "
#~ "to make the project more flexible."
#~ msgstr "额外配置是 vLLM 提供的一种机制允许插件自行控制内部行为。vLLM Kunlun 利用这种机制使项目更加灵活。"
#~ msgid "How to use"
#~ msgstr "如何使用"
#~ msgid ""
#~ "With either online mode or offline "
#~ "mode, users can use additional "
#~ "configuration. Take Qwen3 as an example:"
#~ msgstr "无论是在线模式还是离线模式,用户都可以使用额外的配置。以 Qwen3 为例:"
#~ msgid "**Online mode**:"
#~ msgstr "**在线模式**"
#~ msgid "**Offline mode**:"
#~ msgstr "**离线模式**"
#~ msgid "Configuration options"
#~ msgstr "配置选项"
#~ msgid ""
#~ "The following table lists the additional"
#~ " configuration options available in vLLM"
#~ " Kunlun:"
#~ msgstr "下表列出了 vLLM Kunlun 中可用的其他配置选项:"
#~ msgid "Name"
#~ msgstr "名称"
#~ msgid "Type"
#~ msgstr "类型"
#~ msgid "Default"
#~ msgstr "默认"
#~ msgid "Description"
#~ msgstr "描述"
#~ msgid "`torchair_graph_config`"
#~ msgstr "`torchair_graph_config`"
#~ msgid "dict"
#~ msgstr "dict"
#~ msgid "`{}`"
#~ msgstr "`{}`"
#~ msgid "The config options for torchair graph mode"
#~ msgstr "torchair 图模式的配置选项"
#~ msgid "`kunlun_scheduler_config`"
#~ msgstr "`kunlun_scheduler_config`"
#~ msgid "The config options for kunlun scheduler"
#~ msgstr "kunlun 调度器的配置选项"
#~ msgid "`expert_tensor_parallel_size`"
#~ msgstr "`expert_tensor_parallel_size`"
#~ msgid "str"
#~ msgstr "str"
#~ msgid "`0`"
#~ msgstr "`0`"
#~ msgid "Expert tensor parallel size the model to use."
#~ msgstr "专家张量并行的模型大小设置。"
#~ msgid "`refresh`"
#~ msgstr "`refresh`"
#~ msgid "bool"
#~ msgstr "bool"
#~ msgid "`false`"
#~ msgstr "`false`"
#~ msgid ""
#~ "Whether to refresh global kunlun config"
#~ " content. This value is usually used"
#~ " by rlhf or ut/e2e test case."
#~ msgstr "是否刷新全局 kunlun 配置信息。此值通常由 rlhf 或 ut/e2e 测试用例使用。"
#~ msgid "`expert_map_path`"
#~ msgstr "`expert_map_path`"
#~ msgid "`None`"
#~ msgstr "`None`"
#~ msgid ""
#~ "When using expert load balancing for "
#~ "the MOE model, an expert map path"
#~ " needs to be passed in."
#~ msgstr "在为 MOE 模型使用专家负载均衡时,需要传入专家映射路径。"
#~ msgid "`False`"
#~ msgstr "`False`"
#~ msgid "Whether to enable the fused operator-like chunked_prefill."
#~ msgstr "是否启用类似算子融合的 chunked_prefill 功能。"
#~ msgid "`kv_cache_dtype`"
#~ msgstr "`kv_cache_dtype`"
#~ msgid ""
#~ "When using the kv cache quantization "
#~ "method, kv cache dtype needs to be"
#~ " set, currently only int8 is "
#~ "supported."
#~ msgstr "当使用 kv 缓存量化方法时,需要设置 kv 缓存的数据类型,目前仅支持 int8。"
#~ msgid "The details of each config option are as follows:"
#~ msgstr "每个配置选项的详细信息如下:"
#~ msgid "**torchair_graph_config**"
#~ msgstr "**torchair_graph_config**"
#~ msgid "`enabled`"
#~ msgstr "`enabled`"
#~ msgid ""
#~ "Whether to enable torchair graph mode."
#~ " Currently only DeepSeek series models "
#~ "and PanguProMoE are supported to use "
#~ "torchair graph mode"
#~ msgstr "是否启用 torchair 图模式。目前仅支持 DeepSeek 系列模型和 PanguProMoE 使用 torchair 图模式。"
#~ msgid "`enable_multistream_mla`"
#~ msgstr "`enable_multistream_mla`"
#~ msgid ""
#~ "Whether to put vector ops of MLA"
#~ " to another stream. This option only"
#~ " takes effects on models using MLA"
#~ " (e.g., DeepSeek)."
#~ msgstr "是否将 MLA 的向量操作放到另一个流中。此选项仅对使用 MLA 的模型(例如 DeepSeek)有效。"
#~ msgid "`multistream_overlap_shared_expert`"
#~ msgstr "`multistream_overlap_shared_expert`"
#~ msgid ""
#~ "Whether to enable multistream shared "
#~ "expert. This option only takes effects"
#~ " on DeepSeek moe models."
#~ msgstr "是否启用多流共享专家功能。此选项仅对 DeepSeek MoE 模型生效。"
#~ msgid "`enable_view_optimize`"
#~ msgstr "`enable_view_optimize`"
#~ msgid "`True`"
#~ msgstr "`True`"
#~ msgid "Whether to enable torchair view optimization"
#~ msgstr "是否启用 torchair 视图优化"
#~ msgid "`use_cached_graph`"
#~ msgstr "`use_cached_graph`"
#~ msgid "Whether to use cached graph"
#~ msgstr "是否使用缓存的图"
#~ msgid "`graph_batch_sizes`"
#~ msgstr "`graph_batch_sizes`"
#~ msgid "list[int]"
#~ msgstr "list[int]"
#~ msgid "`[]`"
#~ msgstr "`[]`"
#~ msgid "The batch size for torchair graph cache"
#~ msgstr "torchair 图缓存的批量大小"
#~ msgid "`graph_batch_sizes_init`"
#~ msgstr "`graph_batch_sizes_init`"
#~ msgid "Init graph batch size dynamically if `graph_batch_sizes` is empty"
#~ msgstr "如果 `graph_batch_sizes` 为空,则动态初始化图批大小"
#~ msgid "`enable_kv_nz`"
#~ msgstr "`enable_kv_nz`"
#~ msgid ""
#~ "Whether to enable kvcache NZ layout. "
#~ "This option only takes effects on "
#~ "models using MLA (e.g., DeepSeek)."
#~ msgstr "是否启用 kvcache NZ 布局。此选项仅对使用 MLA 的模型(例如 DeepSeek生效。"
#~ msgid "**kunlun_scheduler_config**"
#~ msgstr "**kunlun_scheduler_config**"
#~ msgid "Whether to enable kunlun scheduler for V1 engine"
#~ msgstr "是否为 V1 引擎启用 kunlun 调度器"
#~ msgid ""
#~ "kunlun_scheduler_config also support the "
#~ "options from [vllm scheduler "
#~ "config](https://docs.vllm.ai/en/stable/api/vllm/config.html#vllm.config.SchedulerConfig)."
#~ " For example, you can add "
#~ "`enable_chunked_prefill: True` to "
#~ "kunlun_scheduler_config as well."
#~ msgstr ""
#~ "kunlun_scheduler_config 也支持来自 [vllm scheduler "
#~ "config](https://docs.vllm.ai/en/stable/api/vllm/config.html#vllm.config.SchedulerConfig)"
#~ " 的选项。例如,你也可以在 kunlun_scheduler_config 中添加 "
#~ "`enable_chunked_prefill: True`。"
#~ msgid "Example"
#~ msgstr "示例"
#~ msgid "An example of additional configuration is as follows:"
#~ msgstr "以下是额外配置的一个示例:"


@@ -0,0 +1,29 @@
# Translations template for PROJECT.
# Copyright (C) 2025 ORGANIZATION
# This file is distributed under the same license as the PROJECT project.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: PROJECT VERSION\n"
"Report-Msgid-Bugs-To: EMAIL@ADDRESS\n"
"POT-Creation-Date: 2025-11-10 16:59+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: LANGUAGE <LL@li.org>\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/user_guide/configuration/env_vars.md:1
msgid "Environment Variables"
msgstr "环境变量"
#~ msgid ""
#~ "vllm-kunlun uses the following "
#~ "environment variables to configure the "
#~ "system:"
#~ msgstr "vllm-kunlun 使用以下环境变量来配置系统:"


@@ -0,0 +1,32 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-10 19:12+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/user_guide/configuration/index.md:1
#: ../../source/user_guide/configuration/index.md:5
msgid "Configuration Guide"
msgstr "配置指南"
#: ../../source/user_guide/configuration/index.md:3
#, fuzzy
msgid "This section provides a detailed configuration guide of vLLM Kunlun."
msgstr "本节提供了 vLLM Kunlun 的详细配置指南。"


@@ -0,0 +1,26 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun \n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-10 16:59+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/user_guide/feature_guide/dynamic_batch.md:1
msgid "Dynamic Batch"
msgstr ""


@@ -0,0 +1,30 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun \n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-10 16:59+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/user_guide/feature_guide/eplb_swift_balancer.md:1
msgid "Expert Load Balance (EPLB)"
msgstr ""
#: ../../source/user_guide/feature_guide/eplb_swift_balancer.md:3
msgid "Overview"
msgstr ""


@@ -0,0 +1,126 @@
# Translations template for PROJECT.
# Copyright (C) 2025 ORGANIZATION
# This file is distributed under the same license as the PROJECT project.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: PROJECT VERSION\n"
"Report-Msgid-Bugs-To: EMAIL@ADDRESS\n"
"POT-Creation-Date: 2025-11-10 16:59+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: LANGUAGE <LL@li.org>\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/user_guide/feature_guide/graph_mode.md:1
msgid "Graph Mode Guide"
msgstr "图模式指南"
#~ msgid ""
#~ "This feature is currently experimental. "
#~ "In future versions, there may be "
#~ "behavioral changes around configuration, "
#~ "coverage, performance improvement."
#~ msgstr "此功能目前为实验性功能。在未来的版本中,配置、覆盖率和性能改进等方面的行为可能会有变化。"
#~ msgid ""
#~ "This guide provides instructions for "
#~ "using Kunlun Graph Mode with vLLM "
#~ "Kunlun. Please note that graph mode "
#~ "is only available on V1 Engine. "
#~ "And only Qwen, DeepSeek series models"
#~ " are well tested from 0.9.0rc1. We'll"
#~ " make it stable and generalize in "
#~ "the next release."
#~ msgstr ""
#~ "本指南提供了在 vLLM Kunlun 上使用 Kunlun "
#~ "图模式的操作说明。请注意,图模式仅在 V1 引擎上可用,并且从 0.9.0rc1 起,仅对"
#~ " Qwen、DeepSeek 系列模型进行了充分测试。我们将在下一个版本中使其更加稳定和通用。"
#~ msgid "Getting Started"
#~ msgstr "快速入门"
#~ msgid ""
#~ "From v0.9.1rc1 with V1 Engine, vLLM "
#~ "Kunlun will run models in graph "
#~ "mode by default to keep the same"
#~ " behavior with vLLM. If you hit "
#~ "any issues, please feel free to "
#~ "open an issue on GitHub and "
#~ "fallback to eager mode temporarily by"
#~ " set `enforce_eager=True` when initializing "
#~ "the model."
#~ msgstr ""
#~ "从 v0.9.1rc1 版本起,使用 V1 引擎时vLLM Kunlun"
#~ " 默认将在图模式下运行模型,以保持与 vLLM 同样的行为。如果遇到任何问题,欢迎在 GitHub"
#~ " 上提交 issue并在初始化模型时通过设置 `enforce_eager=True` "
#~ "临时切换回 eager 模式。"
#~ msgid "There are two kinds for graph mode supported by vLLM Kunlun:"
#~ msgstr "vLLM Kunlun 支持两种图模式:"
#~ msgid ""
#~ "**ACLGraph**: This is the default graph"
#~ " mode supported by vLLM Kunlun. In"
#~ " v0.9.1rc1, only Qwen series models "
#~ "are well tested."
#~ msgstr ""
#~ "**ACLGraph**:这是 vLLM Kunlun 支持的默认图模式。在 "
#~ "v0.9.1rc1 版本中,只有 Qwen 系列模型得到了充分测试。"
#~ msgid ""
#~ "**TorchAirGraph**: This is the GE graph"
#~ " mode. In v0.9.1rc1, only DeepSeek "
#~ "series models are supported."
#~ msgstr "**TorchAirGraph**:这是 GE 图模式。在 v0.9.1rc1 版本中,仅支持 DeepSeek 系列模型。"
#~ msgid "Using ACLGraph"
#~ msgstr "使用 ACLGraph"
#~ msgid ""
#~ "ACLGraph is enabled by default. Take "
#~ "Qwen series models as an example, "
#~ "just set to use V1 Engine is "
#~ "enough."
#~ msgstr "ACLGraph 默认启用。以 Qwen 系列模型为例,只需设置为使用 V1 引擎即可。"
#~ msgid "offline example:"
#~ msgstr "离线示例:"
#~ msgid "online example:"
#~ msgstr "在线示例:"
#~ msgid "Using TorchAirGraph"
#~ msgstr "使用 TorchAirGraph"
#~ msgid ""
#~ "If you want to run DeepSeek series"
#~ " models with graph mode, you should"
#~ " use "
#~ "[TorchAirGraph](https://www.hikunlun.com/document/detail/zh/Pytorch/700/modthirdparty/torchairuseguide/torchair_0002.html)."
#~ " In this case, additional config is"
#~ " required."
#~ msgstr ""
#~ "如果你想通过图模式运行 DeepSeek 系列模型,你应该使用 "
#~ "[TorchAirGraph](https://www.hikunlun.com/document/detail/zh/Pytorch/700/modthirdparty/torchairuseguide/torchair_0002.html)。在这种情况下,需要额外的配置。"
#~ msgid ""
#~ "You can find more detail about "
#~ "additional config "
#~ "[here](../configuration/additional_config.md)."
#~ msgstr "你可以在[这里](../configuration/additional_config.md)找到关于附加配置的更多详细信息。"
#~ msgid "Fallback to Eager Mode"
#~ msgstr "回退到 Eager 模式"
#~ msgid ""
#~ "If both `ACLGraph` and `TorchAirGraph` "
#~ "fail to run, you should fallback "
#~ "to eager mode."
#~ msgstr "如果 `ACLGraph` 和 `TorchAirGraph` 都无法运行,你应该退回到 eager 模式。"


@@ -0,0 +1,32 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-10 19:12+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/user_guide/feature_guide/index.md:1
#: ../../source/user_guide/feature_guide/index.md:5
msgid "Feature Guide"
msgstr "功能指南"
#: ../../source/user_guide/feature_guide/index.md:3
#, fuzzy
msgid "This section provides a detailed usage guide of vLLM Kunlun features."
msgstr "本节提供了 vLLM Kunlun 功能的详细使用指南。"


@@ -0,0 +1,30 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun \n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-10 16:59+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/user_guide/feature_guide/kv_pool_mooncake.md:1
msgid "Mooncake Store Deployment Guide"
msgstr ""
#: ../../source/user_guide/feature_guide/kv_pool_mooncake.md:3
msgid "Environmental Dependencies"
msgstr ""


@@ -0,0 +1,68 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-10 16:59+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/user_guide/feature_guide/lora.md:1
msgid "LoRA Adapters Guide"
msgstr "LoRA 适配器指南"
#: ../../source/user_guide/feature_guide/lora.md:3
msgid "Overview"
msgstr ""
#~ msgid ""
#~ "Like vLLM, vllm-kunlun supports LoRA "
#~ "as well. The usage and more "
#~ "details can be found in [vLLM "
#~ "official "
#~ "document](https://docs.vllm.ai/en/latest/features/lora.html)."
#~ msgstr ""
#~ "与 vLLM 类似vllm-kunlun 也支持 "
#~ "LoRA。用法及更多详情可参见 [vLLM "
#~ "官方文档](https://docs.vllm.ai/en/latest/features/lora.html)。"
#~ msgid ""
#~ "You can also refer to "
#~ "[this](https://docs.vllm.ai/en/latest/models/supported_models.html"
#~ "#list-of-text-only-language-models) "
#~ "to find which models support LoRA "
#~ "in vLLM."
#~ msgstr ""
#~ "你也可以参考[这个链接](https://docs.vllm.ai/en/latest/models/supported_models.html"
#~ "#list-of-text-only-language-models)来查找哪些模型在"
#~ " vLLM 中支持 LoRA。"
#~ msgid "Tips"
#~ msgstr "提示"
#~ msgid ""
#~ "If you fail to run vllm-kunlun "
#~ "with LoRA, you may follow [this "
#~ "instruction](https://vllm-"
#~ "kunlun.readthedocs.io/en/latest/user_guide/feature_guide/graph_mode.html"
#~ "#fallback-to-eager-mode) to disable "
#~ "graph mode and try again."
#~ msgstr ""
#~ "如果你在使用 LoRA 运行 vllm-kunlun "
#~ "时失败,可以按照[此说明](https://vllm-"
#~ "kunlun.readthedocs.io/en/latest/user_guide/feature_guide/graph_mode.html"
#~ "#fallback-to-eager-mode)禁用图模式后再重试。"


@@ -0,0 +1,26 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun \n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-10 16:59+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/user_guide/feature_guide/netloader.md:1
msgid "Netloader Guide"
msgstr ""


@@ -0,0 +1,198 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-10 16:59+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/user_guide/feature_guide/quantization.md:1
msgid "Quantization Guide"
msgstr "量化指南"
#~ msgid ""
#~ "Model quantization is a technique that"
#~ " reduces the size and computational "
#~ "requirements of a model by lowering "
#~ "the data precision of the weights "
#~ "and activation values in the model, "
#~ "thereby saving the memory and improving"
#~ " the inference speed."
#~ msgstr "模型量化是一种通过降低模型中权重和激活值的数据精度,从而减少模型大小和计算需求的技术,这样可以节省内存并提高推理速度。"
#~ msgid ""
#~ "Since 0.9.0rc2 version, quantization feature"
#~ " is experimentally supported in vLLM "
#~ "Kunlun. Users can enable quantization "
#~ "feature by specifying `--quantization kunlun`."
#~ " Currently, only Qwen, DeepSeek series "
#~ "models are well tested. We'll support"
#~ "more quantization algorithms and models "
#~ "in the future."
#~ msgstr ""
#~ "自 0.9.0rc2 版本起vLLM Kunlun 实验性地支持量化特性。用户可以通过指定"
#~ " `--quantization kunlun` 启用量化功能。目前,只有 "
#~ "Qwen、DeepSeek 系列模型经过了充分测试。未来我们将支持更多的量化算法和模型。"
#~ msgid "Install modelslim"
#~ msgstr "安装 modelslim"
#~ msgid ""
#~ "To quantize a model, users should "
#~ "install "
#~ "[ModelSlim](https://gitcode.com/Kunlun/msit/blob/master/msmodelslim/README.md)"
#~ " which is the Kunlun compression and"
#~ " acceleration tool. It is an "
#~ "affinity-based compression tool designed "
#~ "for acceleration, using compression as "
#~ "its core technology and built upon "
#~ "the Kunlun platform."
#~ msgstr "要对模型进行量化,用户应安装[ModelSlim](https://gitcode.com/Kunlun/msit/blob/master/msmodelslim/README.md),这是昆仑的压缩与加速工具。它是一种基于亲和性的压缩工具,专为加速设计,以压缩为核心技术,并基于昆仑平台构建。"
#~ msgid ""
#~ "Currently, only the specific tag "
#~ "[modelslim-"
#~ "VLLM-8.1.RC1.b020_001](https://gitcode.com/Kunlun/msit/blob"
#~ "/modelslim-VLLM-8.1.RC1.b020_001/msmodelslim/README.md) of"
#~ " modelslim works with vLLM Kunlun. "
#~ "Please do not install other version "
#~ "until modelslim master version is "
#~ "available for vLLM Kunlun in the "
#~ "future."
#~ msgstr ""
#~ "目前,只有 modelslim 的特定标签 [modelslim-"
#~ "VLLM-8.1.RC1.b020_001](https://gitcode.com/Kunlun/msit/blob"
#~ "/modelslim-VLLM-8.1.RC1.b020_001/msmodelslim/README.md) 支持"
#~ " vLLM Kunlun。在未来 modelslim 的主版本支持 vLLM "
#~ "Kunlun 之前,请不要安装其他版本。"
#~ msgid "Install modelslim:"
#~ msgstr "安装 modelslim"
#~ msgid "Quantize model"
#~ msgstr "量化模型"
#~ msgid ""
#~ "Take [DeepSeek-V2-Lite](https://modelscope.cn/models"
#~ "/deepseek-ai/DeepSeek-V2-Lite) as an example, "
#~ "you just need to download the "
#~ "model, and then execute the convert "
#~ "command. The command is shown below. "
#~ "More info can be found in "
#~ "modelslim doc [deepseek w8a8 dynamic "
#~ "quantization docs](https://gitcode.com/Kunlun/msit/blob"
#~ "/modelslim-"
#~ "VLLM-8.1.RC1.b020_001/msmodelslim/example/DeepSeek/README.md#deepseek-v2-w8a8-dynamic%E9%87%8F%E5%8C%96)."
#~ msgstr ""
#~ "以 [DeepSeek-V2-Lite](https://modelscope.cn/models/deepseek-"
#~ "ai/DeepSeek-V2-Lite) 为例,你只需要下载模型,然后执行转换命令。命令如下所示。更多信息可参考 "
#~ "modelslim 文档 [deepseek w8a8 "
#~ "动态量化文档](https://gitcode.com/Kunlun/msit/blob/modelslim-"
#~ "VLLM-8.1.RC1.b020_001/msmodelslim/example/DeepSeek/README.md#deepseek-v2-w8a8-dynamic%E9%87%8F%E5%8C%96)。"
#~ msgid ""
#~ "You can also download the quantized "
#~ "model that we uploaded. Please note "
#~ "that these weights should be used "
#~ "for test only. For example, "
#~ "https://www.modelscope.cn/models/vllm-kunlun/DeepSeek-V2"
#~ "-Lite-W8A8"
#~ msgstr ""
#~ "你也可以下载我们上传的量化模型。请注意这些权重仅应用于测试。例如https://www.modelscope.cn/models"
#~ "/vllm-kunlun/DeepSeek-V2-Lite-W8A8"
#~ msgid "Once convert action is done, there are two important files generated."
#~ msgstr "转换操作完成后,会生成两个重要的文件。"
#~ msgid ""
#~ "[config.json](https://www.modelscope.cn/models/vllm-"
#~ "kunlun/DeepSeek-V2-Lite-"
#~ "W8A8/file/view/master/config.json?status=1). Please make"
#~ " sure that there is no "
#~ "`quantization_config` field in it."
#~ msgstr ""
#~ "[config.json](https://www.modelscope.cn/models/vllm-"
#~ "kunlun/DeepSeek-V2-Lite-"
#~ "W8A8/file/view/master/config.json?status=1)。请确保其中没有 "
#~ "`quantization_config` 字段。"
#~ msgid ""
#~ "[quant_model_description.json](https://www.modelscope.cn/models"
#~ "/vllm-kunlun/DeepSeek-V2-Lite-"
#~ "W8A8/file/view/master/quant_model_description.json?status=1). "
#~ "All the converted weights info are "
#~ "recorded in this file."
#~ msgstr ""
#~ "[quant_model_description.json](https://www.modelscope.cn/models"
#~ "/vllm-kunlun/DeepSeek-V2-Lite-"
#~ "W8A8/file/view/master/quant_model_description.json?status=1)。所有被转换的权重信息都记录在该文件中。"
#~ msgid "Here is the full converted model files:"
#~ msgstr "以下是完整转换后的模型文件:"
#~ msgid "Run the model"
#~ msgstr "运行模型"
#~ msgid ""
#~ "Now, you can run the quantized "
#~ "models with vLLM Kunlun. Here is "
#~ "the example for online and offline "
#~ "inference."
#~ msgstr "现在,你可以使用 vLLM Kunlun 运行量化模型。下面是在线和离线推理的示例。"
#~ msgid "Offline inference"
#~ msgstr "离线推理"
#~ msgid "Online inference"
#~ msgstr "在线推理"
#~ msgid "FAQs"
#~ msgstr "常见问题解答"
#~ msgid ""
#~ "1. How to solve the KeyError: "
#~ "'xxx.layers.0.self_attn.q_proj.weight' problem?"
#~ msgstr "1. 如何解决 KeyError: 'xxx.layers.0.self_attn.q_proj.weight' 问题?"
#~ msgid ""
#~ "First, make sure you specify the "
#~ "`kunlun` quantization method. Second, check "
#~ "that your model was converted by "
#~ "the `modelslim-VLLM-8.1.RC1.b020_001` modelslim "
#~ "version. Finally, if it still doesn't "
#~ "work, please submit an issue; some "
#~ "new models may need to be adapted."
#~ msgstr ""
#~ "首先,请确保你指定了 `kunlun` 量化方法。其次,检查你的模型是否由 `modelslim-"
#~ "VLLM-8.1.RC1.b020_001` 这个 modelslim "
#~ "版本转换。如果仍然无法使用,请提交一个 issue可能有一些新模型需要适配。"
#~ msgid ""
#~ "2. How to solve the error \"Could"
#~ " not locate the configuration_deepseek.py\"?"
#~ msgstr "2. 如何解决“无法找到 configuration_deepseek.py”错误"
#~ msgid ""
#~ "Please convert DeepSeek series models "
#~ "using the `modelslim-VLLM-8.1.RC1.b020_001` "
#~ "modelslim version; it fixes the missing"
#~ " configuration_deepseek.py error."
#~ msgstr ""
#~ "请使用 `modelslim-VLLM-8.1.RC1.b020_001` 的 "
#~ "modelslim 转换 DeepSeek 系列模型,该版本已修复缺少 "
#~ "configuration_deepseek.py 的错误。"


@@ -0,0 +1,165 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-10 16:59+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/user_guide/feature_guide/sleep_mode.md:1
msgid "Sleep Mode Guide"
msgstr "睡眠模式指南"
#: ../../source/user_guide/feature_guide/sleep_mode.md:3
msgid "Overview"
msgstr "概述"
#~ msgid ""
#~ "Sleep Mode is an API designed to"
#~ " offload model weights and discard KV"
#~ " cache from XPU memory. This "
#~ "functionality is essential for reinforcement"
#~ " learning (RL) post-training workloads, "
#~ "particularly in online algorithms such "
#~ "as PPO, GRPO, or DPO. During "
#~ "training, the policy model typically "
#~ "performs auto-regressive generation using "
#~ "inference engines like vLLM, followed by"
#~ " forward and backward passes for "
#~ "optimization."
#~ msgstr ""
#~ "Sleep Mode 是一个用于卸载模型权重并清除 XPU 内存中 KV "
#~ "缓存的 API。此功能对于强化学习RL后训练任务尤其重要特别是在 PPO、GRPO 或 "
#~ "DPO 等在线算法中。在训练过程中,策略模型通常会使用像 vLLM "
#~ "这样的推理引擎进行自回归生成,然后进行前向和反向传播以进行优化。"
#~ msgid ""
#~ "Since the generation and training phases"
#~ " may employ different model parallelism "
#~ "strategies, it becomes crucial to free"
#~ " KV cache and even offload model "
#~ "parameters stored within vLLM during "
#~ "training. This ensures efficient memory "
#~ "utilization and avoids resource contention "
#~ "on the XPU."
#~ msgstr ""
#~ "由于生成和训练阶段可能采用不同的模型并行策略,因此在训练过程中及时释放 KV 缓存,甚至卸载存储在 "
#~ "vLLM 内的模型参数变得至关重要。这可以确保内存的高效利用,并避免 XPU 上的资源争用。"
#~ msgid "Getting started"
#~ msgstr "快速上手"
#~ msgid ""
#~ "With `enable_sleep_mode=True`, memory management "
#~ "(malloc, free) in vllm happens under "
#~ "a dedicated memory pool; while loading "
#~ "the model and initializing kv_caches, we"
#~ " tag the memory as a map: "
#~ "`{\"weight\": data, \"kv_cache\": data}`."
#~ msgstr ""
#~ "当 `enable_sleep_mode=True` 时,我们在 vllm "
#~ "中管理内存malloc, free的方式会在一个特定的内存池下进行在加载模型和初始化 kv_caches"
#~ " 期间,我们会将内存打上标签,组织成一个映射:`{\"weight\": data, "
#~ "\"kv_cache\": data}`。"
#~ msgid ""
#~ "The engine (v0/v1) supports two sleep"
#~ " levels to manage memory during idle"
#~ " periods:"
#~ msgstr "该引擎v0/v1支持两种睡眠等级以在空闲期间管理内存"
#~ msgid "Level 1 Sleep"
#~ msgstr "一级睡眠"
#~ msgid "Action: Offloads model weights and discards the KV cache."
#~ msgstr "操作卸载模型权重并清除KV缓存。"
#~ msgid "Memory: Model weights are moved to CPU memory; KV cache is forgotten."
#~ msgstr "内存模型权重被移动到CPU内存KV缓存被清除。"
#~ msgid "Use Case: Suitable when reusing the same model later."
#~ msgstr "用例:适用于之后需要重复使用同一个模型的情况。"
#~ msgid ""
#~ "Note: Ensure sufficient CPU memory is"
#~ " available to hold the model weights."
#~ msgstr "注意请确保有足够的CPU内存来存储模型权重。"
#~ msgid "Level 2 Sleep"
#~ msgstr "二级睡眠"
#~ msgid "Action: Discards both model weights and KV cache."
#~ msgstr "操作同时丢弃模型权重和KV缓存。"
#~ msgid ""
#~ "Memory: The content of both the "
#~ "model weights and kv cache is "
#~ "forgotten."
#~ msgstr "内存模型权重和kv缓存的内容都会被遗忘。"
#~ msgid ""
#~ "Use Case: Ideal when switching to "
#~ "a different model or updating the "
#~ "current one."
#~ msgstr "用例:当切换到不同的模型或更新当前模型时非常理想。"
#~ msgid ""
#~ "Since this feature uses the low-"
#~ "level API "
#~ "[KunlunCL](https://www.hikunlun.com/document/detail/zh/CANNCommunityEdition/82RC1alpha002/API/appdevgapi/appdevgapi_07_0000.html),"
#~ " to use sleep mode you should "
#~ "follow the [installation "
#~ "guide](https://vllm-"
#~ "kunlun.readthedocs.io/en/latest/installation.html) and "
#~ "build from source. If you are "
#~ "using v0.7.3, remember to set `export"
#~ " COMPILE_CUSTOM_KERNELS=1`; for the latest "
#~ "versions (v0.9.x+), `COMPILE_CUSTOM_KERNELS` is "
#~ "set to 1 by default when building"
#~ " from source."
#~ msgstr ""
#~ "由于此功能使用了底层 API "
#~ "[KunlunCL](https://www.hikunlun.com/document/detail/zh/CANNCommunityEdition/82RC1alpha002/API/appdevgapi/appdevgapi_07_0000.html),为了使用睡眠模式,你应按照[安装指南](https"
#~ "://vllm-"
#~ "kunlun.readthedocs.io/en/latest/installation.html)进行操作,并从源码编译。如果你使用的是"
#~ " v0.7.3,请记得设置 `export COMPILE_CUSTOM_KERNELS=1`"
#~ "对于最新版本v0.9.x+),从源码编译时环境变量 `COMPILE_CUSTOM_KERNELS` "
#~ "默认会被设置为 1。"
#~ msgid "Usage"
#~ msgstr "用法"
#~ msgid "The following is a simple example of how to use sleep mode."
#~ msgstr "以下是如何使用睡眠模式的一个简单示例。"
#~ msgid "offline inference:"
#~ msgstr "离线推理:"
#~ msgid "online serving:"
#~ msgstr "在线服务:"
#~ msgid ""
#~ "Considering there may be a risk of"
#~ " malicious access, please make sure "
#~ "you are in dev mode, and "
#~ "explicitly set the development environment "
#~ "variable `VLLM_SERVER_DEV_MODE` to expose "
#~ "these endpoints (sleep/wake up)."
#~ msgstr ""
#~ "鉴于可能存在恶意访问的风险,请确保您处于开发模式,并显式设置开发环境变量 "
#~ "`VLLM_SERVER_DEV_MODE`以便开放这些端点sleep/wake up。"


@@ -0,0 +1,235 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-10 16:59+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/user_guide/feature_guide/structured_output.md:1
msgid "Structured Output Guide"
msgstr "结构化输出指南"
#: ../../source/user_guide/feature_guide/structured_output.md:3
msgid "Overview"
msgstr "概述"
#: ../../source/user_guide/feature_guide/structured_output.md:5
#, fuzzy
msgid "What is structured output?"
msgstr "什么是结构化输出?"
#~ msgid ""
#~ "LLMs can be unpredictable when you "
#~ "need output in specific formats. Think"
#~ " of asking a model to generate "
#~ "JSON - without guidance, it might "
#~ "produce valid text that breaks JSON "
#~ "specification. **Structured Output (also "
#~ "called Guided Decoding)** enables LLMs "
#~ "to generate outputs that follow a "
#~ "desired structure while preserving the "
#~ "non-deterministic nature of the system."
#~ msgstr ""
#~ "当你需要特定格式输出时大型语言模型LLMs可能表现出不可预测性。比如让模型生成 "
#~ "JSON如果没有指导模型可能会生成有效的文本但这些文本却不符合 JSON "
#~ "规范。**结构化输出(也称为引导解码)** 能让大型语言模型生成符合预期结构的输出,同时保留系统的非确定性特性。"
#~ msgid ""
#~ "In simple terms, structured decoding "
#~ "gives LLMs a “template” to follow. "
#~ "Users provide a schema that “influences”"
#~ " the models output, ensuring compliance"
#~ " with the desired structure."
#~ msgstr "简单来说结构化解码为LLM提供了一个“模板”来遵循。用户提供一个模式来“影响”模型的输出从而确保输出符合期望的结构。"
#~ msgid "![structured decoding](./images/structured_output_1.png)"
#~ msgstr "![结构化解码](./images/structured_output_1.png)"
#~ msgid "structured decoding"
#~ msgstr "结构化解码"
#~ msgid "Structured Output in vllm-kunlun"
#~ msgstr "vllm-kunlun 中的结构化输出"
#~ msgid ""
#~ "Currently, vllm-kunlun supports the "
#~ "**xgrammar** and **guidance** backends for "
#~ "structured output with the vllm v1 "
#~ "engine."
#~ msgstr "目前vllm-kunlun 支持 vllm v1 引擎的结构化输出,后端包括 **xgrammar** 和 **guidance**。"
#~ msgid ""
#~ "XGrammar introduces a new technique that"
#~ " batches constrained decoding via a "
#~ "pushdown automaton (PDA). You can think"
#~ " of a PDA as a “collection of"
#~ " FSMs, where each FSM represents a "
#~ "context-free grammar (CFG).” One "
#~ "significant advantage of the PDA is "
#~ "its recursive nature, allowing multiple "
#~ "state transitions. It also includes "
#~ "additional optimisations (for those who "
#~ "are interested) to reduce grammar "
#~ "compilation overhead. You can likewise "
#~ "look into the guidance backend on "
#~ "your own for more details."
#~ msgstr ""
#~ "XGrammar 引入了一种通过下推自动机PDA进行批量约束解码的新技术。你可以把 PDA "
#~ "理解为“有限状态机FSM的集合每个 FSM 代表一个上下文无关文法CFG。” PDA "
#~ "的一个重要优点是其递归特性使我们能够执行多次状态转移。此外PDA "
#~ "还包含了额外的优化(供感兴趣的用户参考),以减少语法编译的开销。除此之外,你还可以自行了解更多关于 guidance "
#~ "后端的信息。"
#~ msgid "How to Use Structured Output?"
#~ msgstr "如何使用结构化输出?"
#~ msgid "Online Inference"
#~ msgstr "在线推理"
#~ msgid ""
#~ "You can also generate structured outputs"
#~ " using the OpenAI's Completions and "
#~ "Chat API. The following parameters are"
#~ " supported, which must be added as"
#~ " extra parameters:"
#~ msgstr "你也可以使用 OpenAI 的 Completions 和 Chat API 生成结构化输出。支持以下参数,这些参数必须作为额外参数添加:"
#~ msgid "`guided_choice`: the output will be exactly one of the choices."
#~ msgstr "`guided_choice`:输出将会是其中一个选项。"
#~ msgid "`guided_regex`: the output will follow the regex pattern."
#~ msgstr "`guided_regex`:输出将遵循正则表达式模式。"
#~ msgid "`guided_json`: the output will follow the JSON schema."
#~ msgstr "`guided_json`:输出将遵循 JSON Schema。"
#~ msgid "`guided_grammar`: the output will follow the context free grammar."
#~ msgstr "`guided_grammar`:输出将遵循上下文无关文法。"
#~ msgid ""
#~ "Structured outputs are supported by "
#~ "default in the OpenAI-Compatible Server."
#~ " You can choose to specify the "
#~ "backend to use by setting the "
#~ "`--guided-decoding-backend` flag to vllm"
#~ " serve. The default backend is "
#~ "`auto`, which will try to choose "
#~ "an appropriate backend based on the "
#~ "details of the request. You may "
#~ "also choose a specific backend, along"
#~ " with some options."
#~ msgstr ""
#~ "OpenAI 兼容服务器默认支持结构化输出。你可以通过向 vllm serve 传入 "
#~ "`--guided-decoding-backend` 标志来指定要使用的后端。默认后端为 "
#~ "`auto`,它会根据请求的详细信息尝试选择合适的后端。你也可以选择特定的后端,并设置一些选项。"
#~ msgid ""
#~ "Now let's see an example for each"
#~ " of the cases, starting with the "
#~ "guided_choice, as it's the easiest one:"
#~ msgstr "现在让我们来看每种情况的示例,首先是 guided_choice因为它是最简单的"
#~ msgid ""
#~ "The next example shows how to use"
#~ " the guided_regex. The idea is to "
#~ "generate an email address, given a "
#~ "simple regex template:"
#~ msgstr "下一个例子展示了如何使用 guided_regex。其思路是基于一个简单的正则表达式模板生成一个电子邮件地址"
#~ msgid ""
#~ "One of the most relevant features "
#~ "in structured text generation is the "
#~ "option to generate a valid JSON "
#~ "with pre-defined fields and formats. "
#~ "For this we can use the "
#~ "guided_json parameter in two different "
#~ "ways:"
#~ msgstr ""
#~ "在结构化文本生成中,最相关的特性之一是能够生成具有预定义字段和格式的有效 JSON。为此我们可以通过两种不同的方式使用 "
#~ "guided_json 参数:"
#~ msgid "Using a JSON Schema."
#~ msgstr "使用 JSON Schema。"
#~ msgid "Defining a Pydantic model and then extracting the JSON Schema from it."
#~ msgstr "定义一个 Pydantic 模型,然后从中提取 JSON Schema。"
#~ msgid ""
#~ "The next example shows how to use"
#~ " the guided_json parameter with a "
#~ "Pydantic model:"
#~ msgstr "下一个示例展示了如何将 guided_json 参数与 Pydantic 模型一起使用:"
#~ msgid ""
#~ "Finally we have the guided_grammar "
#~ "option, which is probably the most "
#~ "difficult to use, but it's really "
#~ "powerful. It allows us to define "
#~ "complete languages like SQL queries. It"
#~ " works by using a context-free "
#~ "EBNF grammar. As an example, we "
#~ "can use it to define a specific "
#~ "format of simplified SQL queries:"
#~ msgstr ""
#~ "最后,我们有 guided_grammar 选项,这可能是最难使用的,但它非常强大。它允许我们定义完整的语言,比如"
#~ " SQL 查询。它通过使用上下文无关的 EBNF 语法来实现。例如,我们可以用它来定义一种简化"
#~ " SQL 查询的特定格式:"
#~ msgid ""
#~ "Find more examples [here](https://github.com"
#~ "/vllm-"
#~ "project/vllm/blob/main/examples/offline_inference/structured_outputs.py)."
#~ msgstr ""
#~ "在[这里](https://github.com/vllm-"
#~ "project/vllm/blob/main/examples/offline_inference/structured_outputs.py)可以找到更多示例。"
#~ msgid "Offline Inference"
#~ msgstr "离线推理"
#~ msgid ""
#~ "To use Structured Output, we'll need "
#~ "to configure the guided decoding using"
#~ " the class `GuidedDecodingParams` inside "
#~ "`SamplingParams`. The main available options"
#~ " inside `GuidedDecodingParams` are:"
#~ msgstr ""
#~ "要使用结构化输出,我们需要在 `SamplingParams` 内通过 "
#~ "`GuidedDecodingParams` 类配置引导解码。`GuidedDecodingParams` "
#~ "中主要可用的选项有:"
#~ msgid "json"
#~ msgstr "json"
#~ msgid "regex"
#~ msgstr "regex"
#~ msgid "choice"
#~ msgstr "choice"
#~ msgid "grammar"
#~ msgstr "grammar"
#~ msgid "One example for the usage of the choice parameter is shown below:"
#~ msgstr "choice 参数用法的一个示例如下:"
#~ msgid ""
#~ "Find more examples of other usages "
#~ "[here](https://github.com/vllm-"
#~ "project/vllm/blob/main/examples/offline_inference/structured_outputs.py)."
#~ msgstr ""
#~ "更多用法示例见[这里](https://github.com/vllm-"
#~ "project/vllm/blob/main/examples/offline_inference/structured_outputs.py)。"

File diff suppressed because it is too large


@@ -0,0 +1,33 @@
# Translations template for PROJECT.
# Copyright (C) 2025 ORGANIZATION
# This file is distributed under the same license as the PROJECT project.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: PROJECT VERSION\n"
"Report-Msgid-Bugs-To: EMAIL@ADDRESS\n"
"POT-Creation-Date: 2025-11-10 16:59+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: LANGUAGE <LL@li.org>\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/user_guide/support_matrix/index.md:5
msgid "Support Matrix"
msgstr "支持矩阵"
#: ../../source/user_guide/support_matrix/index.md:1
#, fuzzy
msgid "Features and Models"
msgstr "特性与模型"
#: ../../source/user_guide/support_matrix/index.md:3
#, fuzzy
msgid "This section provides a detailed support matrix for vLLM Kunlun."
msgstr "本节提供了 vLLM Kunlun 的详细支持矩阵。"


@@ -0,0 +1,221 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-10 16:59+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/user_guide/support_matrix/supported_features.md:1
msgid "Supported Features"
msgstr "支持的特性"
#: ../../source/user_guide/support_matrix/supported_features.md:3
msgid "The feature support principle of vLLM"
msgstr "vLLM 的特性支持原则"
#~ msgid "Feature Support"
#~ msgstr "功能支持"
#~ msgid ""
#~ "The feature support principle of vLLM"
#~ " Kunlun is: **aligned with the "
#~ "vLLM**. We are also actively "
#~ "collaborating with the community to "
#~ "accelerate support."
#~ msgstr "vLLM Kunlun 的特性支持原则是:**与 vLLM 保持一致**。我们也在积极与社区合作,加快支持进度。"
#~ msgid ""
#~ "You can check the [support status "
#~ "of vLLM V1 Engine][v1_user_guide]. Below "
#~ "is the feature support status of "
#~ "vLLM Kunlun:"
#~ msgstr "你可以查看 [vLLM V1 引擎的支持状态][v1_user_guide]。下面是 vLLM Kunlun 的功能支持情况:"
#~ msgid "Feature"
#~ msgstr "特性"
#~ msgid "vLLM V0 Engine"
#~ msgstr "vLLM V0 引擎"
#~ msgid "vLLM V1 Engine"
#~ msgstr "vLLM V1 引擎"
#~ msgid "Next Step"
#~ msgstr "下一步"
#~ msgid "Chunked Prefill"
#~ msgstr "分块预填充"
#~ msgid "🟢 Functional"
#~ msgstr "🟢 可用"
#~ msgid "Functional, see detail note: [Chunked Prefill][cp]"
#~ msgstr "可用,详见说明:[分块预填充][cp]"
#~ msgid "Automatic Prefix Caching"
#~ msgstr "自动前缀缓存"
#~ msgid "Functional, see detail note: [vllm-kunlun#732][apc]"
#~ msgstr "可用,请参见详细说明:[vllm-kunlun#732][apc]"
#~ msgid "LoRA"
#~ msgstr "LoRA"
#~ msgid "[vllm-kunlun#396][multilora], [vllm-kunlun#893][v1 multilora]"
#~ msgstr "[vllm-kunlun#396][multilora][vllm-kunlun#893][v1 multilora]"
#~ msgid "Prompt adapter"
#~ msgstr "提示适配器"
#~ msgid "🔴 No plan"
#~ msgstr "🔴 无计划"
#~ msgid "This feature has been deprecated by vllm."
#~ msgstr "此功能已被 vllm 弃用。"
#~ msgid "Speculative decoding"
#~ msgstr "投机解码"
#~ msgid "Basic support"
#~ msgstr "基础支持"
#~ msgid "Pooling"
#~ msgstr "池化"
#~ msgid "🟡 Planned"
#~ msgstr "🟡 计划中"
#~ msgid "CI needed and adapting more models; V1 support relies on vLLM support."
#~ msgstr "需要持续集成CI并适配更多模型V1 的支持依赖于 vLLM 的支持。"
#~ msgid "Enc-dec"
#~ msgstr "Enc-dec编码-解码)"
#~ msgid "🔴 NO plan"
#~ msgstr "🔴 无计划"
#~ msgid "Plan in 2025.06.30"
#~ msgstr "计划于 2025.06.30 支持"
#~ msgid "Multi Modality"
#~ msgstr "多模态"
#~ msgid "[Tutorial][multimodal], optimizing and adapting more models"
#~ msgstr "[教程][multimodal],优化和适配更多模型"
#~ msgid "LogProbs"
#~ msgstr "LogProbs"
#~ msgid "CI needed"
#~ msgstr "需要持续集成CI"
#~ msgid "Prompt logProbs"
#~ msgstr "提示 logProbs"
#~ msgid "Async output"
#~ msgstr "异步输出"
#~ msgid "Multi step scheduler"
#~ msgstr "多步调度器"
#~ msgid "🔴 Deprecated"
#~ msgstr "🔴 已弃用"
#~ msgid "[vllm#8779][v1_rfc], replaced by [vLLM V1 Scheduler][v1_scheduler]"
#~ msgstr "[vllm#8779][v1_rfc],已被 [vLLM V1 调度器][v1_scheduler] 替代"
#~ msgid "Best of"
#~ msgstr "Best of"
#~ msgid "[vllm#13361][best_of], CI needed"
#~ msgstr "[vllm#13361][best_of]需要持续集成CI"
#~ msgid "Beam search"
#~ msgstr "束搜索"
#~ msgid "Guided Decoding"
#~ msgstr "引导解码"
#~ msgid "[vllm-kunlun#177][guided_decoding]"
#~ msgstr "[vllm-kunlun#177][guided_decoding]"
#~ msgid "Tensor Parallel"
#~ msgstr "张量并行"
#~ msgid "Pipeline Parallel"
#~ msgstr "流水线并行"
#~ msgid "Expert Parallel"
#~ msgstr "专家并行"
#~ msgid "CI needed; No plan on V0 support"
#~ msgstr "需要持续集成CI暂无 V0 支持计划"
#~ msgid "Data Parallel"
#~ msgstr "数据并行"
#~ msgid "CI needed; No plan on V0 support"
#~ msgstr "需要持续集成CI暂无 V0 支持计划"
#~ msgid "Prefill Decode Disaggregation"
#~ msgstr "预填充-解码分离"
#~ msgid "1P1D available, working on xPyD and V1 support."
#~ msgstr "1P1D 已可用,正在开发 xPyD 和 V1 支持。"
#~ msgid "Quantization"
#~ msgstr "量化"
#~ msgid "W8A8 available, CI needed; working on more quantization method support"
#~ msgstr "W8A8 已可用需要持续集成CI正在开发对更多量化方法的支持。"
#~ msgid "Graph Mode"
#~ msgstr "图模式"
#~ msgid "🔵 Experimental"
#~ msgstr "🔵 实验性"
#~ msgid "Experimental, see detail note: [vllm-kunlun#767][graph_mode]"
#~ msgstr "实验性功能,详见说明:[vllm-kunlun#767][graph_mode]"
#~ msgid "Sleep Mode"
#~ msgstr "睡眠模式"
#~ msgid "level=1 available, CI needed, working on V1 support"
#~ msgstr "level=1 可用需要CI正在开发 V1 支持"
#~ msgid "🟢 Functional: Fully operational, with ongoing optimizations."
#~ msgstr "🟢 可用:功能完整可用,并在持续优化中。"
#~ msgid ""
#~ "🔵 Experimental: Experimental support, "
#~ "interfaces and functions may change."
#~ msgstr "🔵 实验性:实验性支持,接口和功能可能会发生变化。"
#~ msgid "🚧 WIP: Under active development, will be supported soon."
#~ msgstr "🚧 WIP正在积极开发中很快将会支持。"
#~ msgid ""
#~ "🟡 Planned: Scheduled for future "
#~ "implementation (some may have open "
#~ "PRs/RFCs)."
#~ msgstr "🟡 计划中已安排将来实现其中一些可能已有开放的PR/RFC。"
#~ msgid "🔴 NO plan / Deprecated: No plan for V0 or deprecated by vLLM v1."
#~ msgstr "🔴 没有计划 / 已弃用V0 没有计划或已被 vLLM v1 弃用。"


@@ -0,0 +1,168 @@
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2025, vllm-kunlun team
# This file is distributed under the same license as the vllm-kunlun
# package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2025.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: vllm-kunlun\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-10 16:59+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../source/user_guide/support_matrix/supported_models.md:1
#, fuzzy
msgid "Supported Models"
msgstr "支持的模型"
#~ msgid "Model Support"
#~ msgstr "模型支持"
#~ msgid "Text-only Language Models"
#~ msgstr "纯文本语言模型"
#~ msgid "Generative Models"
#~ msgstr "生成模型"
#~ msgid "Model"
#~ msgstr "模型"
#~ msgid "Note"
#~ msgstr "注释"
#~ msgid "DeepSeek v3"
#~ msgstr "DeepSeek v3"
#~ msgid "✅"
#~ msgstr "✅"
#~ msgid "DeepSeek R1"
#~ msgstr "DeepSeek R1"
#~ msgid "DeepSeek Distill (Qwen/LLama)"
#~ msgstr "DeepSeek DistillQwen/LLama"
#~ msgid "Qwen3"
#~ msgstr "Qwen3"
#~ msgid "Qwen3-Moe"
#~ msgstr "Qwen3-Moe"
#~ msgid "Qwen2.5"
#~ msgstr "Qwen2.5"
#~ msgid "QwQ-32B"
#~ msgstr "QwQ-32B"
#~ msgid "LLama3.1/3.2"
#~ msgstr "LLama3.1/3.2"
#~ msgid "Internlm"
#~ msgstr "Internlm"
#~ msgid "Baichuan"
#~ msgstr "Baichuan"
#~ msgid "Phi-4-mini"
#~ msgstr "Phi-4-mini"
#~ msgid "MiniCPM"
#~ msgstr "MiniCPM"
#~ msgid "MiniCPM3"
#~ msgstr "MiniCPM3"
#~ msgid "LLama4"
#~ msgstr "LLama4"
#~ msgid "Mistral"
#~ msgstr "Mistral"
#~ msgid "Need test"
#~ msgstr "需要测试"
#~ msgid "DeepSeek v2.5"
#~ msgstr "DeepSeek v2.5"
#~ msgid "Gemma-2"
#~ msgstr "Gemma-2"
#~ msgid "Mllama"
#~ msgstr "Mllama"
#~ msgid "Gemma-3"
#~ msgstr "Gemma-3"
#~ msgid "❌"
#~ msgstr "❌"
#~ msgid "[#496](https://github.com/vllm-project/vllm-kunlun/issues/496)"
#~ msgstr "[#496](https://github.com/vllm-project/vllm-kunlun/issues/496)"
#~ msgid "ChatGLM"
#~ msgstr "ChatGLM"
#~ msgid "[#554](https://github.com/vllm-project/vllm-kunlun/issues/554)"
#~ msgstr "[#554](https://github.com/vllm-project/vllm-kunlun/issues/554)"
#~ msgid "Pooling Models"
#~ msgstr "池化模型"
#~ msgid "XLM-RoBERTa-based"
#~ msgstr "基于XLM-RoBERTa"
#~ msgid "Molmo"
#~ msgstr "Molmo"
#~ msgid "Multimodal Language Models"
#~ msgstr "多模态语言模型"
#~ msgid "Qwen2-VL"
#~ msgstr "Qwen2-VL"
#~ msgid "Qwen2.5-VL"
#~ msgstr "Qwen2.5-VL"
#~ msgid "LLaVA 1.5"
#~ msgstr "LLaVA 1.5"
#~ msgid "LLaVA 1.6"
#~ msgstr "LLaVA 1.6"
#~ msgid "[#553](https://github.com/vllm-project/vllm-kunlun/issues/553)"
#~ msgstr "[#553](https://github.com/vllm-project/vllm-kunlun/issues/553)"
#~ msgid "InternVL2"
#~ msgstr "InternVL2"
#~ msgid "InternVL2.5"
#~ msgstr "InternVL2.5"
#~ msgid "Qwen2-Audio"
#~ msgstr "Qwen2-Audio"
#~ msgid "LLaVA-Next"
#~ msgstr "LLaVA-Next"
#~ msgid "LLaVA-Next-Video"
#~ msgstr "LLaVA-Next-Video"
#~ msgid "Phi-3-Vision/Phi-3.5-Vision"
#~ msgstr "Phi-3-Vision/Phi-3.5-Vision"
#~ msgid "GLM-4v"
#~ msgstr "GLM-4v"
#~ msgid "Ultravox"
#~ msgstr "Ultravox"