xc-llm-ascend

Author SHA1 Message Date

Author	SHA1	Message	Date
Yikun Jiang	ed175d6d92	[Doc][Release] Add release note skill (#6824 ) ### What this PR does / why we need it? This PR adds the releaseing note skills: - `SKILL.md`: vLLM Ascend Releasing Note Writer - `references/ref-past-release-notes-highlight.md`: And also add a `output/v0.13.0` examples which was used by `2da476d82f` Inspired: https://github.com/simon-mo/release-notes-writing/ ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? - vLLM version: v0.15.0 - vLLM main: `83b47f67b1` Co-authored-by: esmeetu <jasonailu87@gmail.com> --------- Signed-off-by: Yikun Jiang <yikunkero@gmail.com>	2026-02-26 21:01:21 +08:00
wangxiyuan	c9d05d10aa	[Doc][Misc] Refactor skill documentation and add Claude support instructions (#6817 ) ### What this PR does / why we need it? This PR refactors the documentation for vLLM Ascend skills. - It renames and moves the `vllm-ascend-model-adapter` skill's README to serve as a new top-level README for the `.agents` directory. - It adds instructions on how to use the Ascend skills with Claude, including a new README in the `.claude` directory. - It updates `.gitignore` to exclude skills copied for Claude's use. - Add main2main skill This improves the documentation structure, making it more organized and providing clear instructions for developers using these skills with different tools. ### Does this PR introduce _any_ user-facing change? No, this PR contains only documentation and repository configuration changes. It does not affect any user-facing code functionality. ### How was this patch tested? These changes are documentation-only and do not require specific testing. The correctness of the instructions is being verified through this review. - vLLM version: v0.15.0 - vLLM main: `83b47f67b1` --------- Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>	2026-02-26 14:42:59 +08:00
jack	29e3cdde20	[Doc][Skill] Introduce AI-assisted model-adaptation workflow for vllm-ascend (#6731 ) ### What this PR does / why we need it This PR introduces the first AI-assisted model-adaptation skill package for `vllm-ascend`. The goal is to make model adaptation work (especially for recurring feature-request issues) repeatable, auditable, and easier to hand off. ### Scope in this PR This PR adds only skill/workflow assets under: - `.agents/skills/vllm-ascend-model-adapter/SKILL.md` - `.agents/skills/vllm-ascend-model-adapter/references/workflow-checklist.md` - `.agents/skills/vllm-ascend-model-adapter/references/troubleshooting.md` - `.agents/skills/vllm-ascend-model-adapter/references/multimodal-ep-aclgraph-lessons.md` - `.agents/skills/vllm-ascend-model-adapter/references/fp8-on-npu-lessons.md` - `.agents/skills/vllm-ascend-model-adapter/references/deliverables.md` ### Workflow improvements The skill standardizes: 1. Environment assumptions used in our Docker setup - implementation roots: `/vllm-workspace/vllm` and `/vllm-workspace/vllm-ascend` - serving root: `/workspace` - model path convention: `/models/<model-name>` 2. Validation strategy - Stage A: fast `--load-format dummy` gate - Stage B: mandatory real-weight gate before sign-off - avoid false-ready by requiring request-level checks (not startup log only) 3. Feature-first verification checklist - ACLGraph / EP / flashcomm1 / MTP / multimodal - explicit `supported / unsupported / not-applicable / checkpoint-missing` outcomes 4. Delivery contract - minimal scoped code changes - required artifacts (Chinese report + runbook, e2e config YAML, tutorial doc) - one signed commit in delivery repo ### What this PR does NOT do - No runtime/kernel/model patch is included in this PR. - No direct model support claim is made by this PR alone. - Model-specific adaptation/fix work should be submitted in follow-up PRs using this skill as the workflow baseline. ### Why this matters for maintainers This gives the repo a shared, explicit AI-assistance protocol, so future model-adaptation PRs are easier to review, compare, and reproduce. --------- Signed-off-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com> Co-authored-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com>	2026-02-26 08:48:15 +08:00

Yikun Jiang

ed175d6d92

[Doc][Release] Add release note skill (#6824 )

### What this PR does / why we need it?
This PR adds the releaseing note skills:
- `SKILL.md`: vLLM Ascend Releasing Note Writer
- `references/ref-past-release-notes-highlight.md`:
And also add a `output/v0.13.0` examples which was used by
2da476d82f

Inspired: https://github.com/simon-mo/release-notes-writing/

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?

- vLLM version: v0.15.0
- vLLM main:
83b47f67b1


Co-authored-by: esmeetu <jasonailu87@gmail.com>

---------

Signed-off-by: Yikun Jiang <yikunkero@gmail.com>

2026-02-26 21:01:21 +08:00

wangxiyuan

c9d05d10aa

[Doc][Misc] Refactor skill documentation and add Claude support instructions (#6817 )

### What this PR does / why we need it?
This PR refactors the documentation for vLLM Ascend skills.
- It renames and moves the `vllm-ascend-model-adapter` skill's README to
serve as a new top-level README for the `.agents` directory.
- It adds instructions on how to use the Ascend skills with Claude,
including a new README in the `.claude` directory.
- It updates `.gitignore` to exclude skills copied for Claude's use.
- Add main2main skill

This improves the documentation structure, making it more organized and
providing clear instructions for developers using these skills with
different tools.

### Does this PR introduce _any_ user-facing change?
No, this PR contains only documentation and repository configuration
changes. It does not affect any user-facing code functionality.

### How was this patch tested?
These changes are documentation-only and do not require specific
testing. The correctness of the instructions is being verified through
this review.

- vLLM version: v0.15.0
- vLLM main:
83b47f67b1

---------

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>

2026-02-26 14:42:59 +08:00

jack

29e3cdde20

[Doc][Skill] Introduce AI-assisted model-adaptation workflow for vllm-ascend (#6731 )

### What this PR does / why we need it

This PR introduces the **first AI-assisted model-adaptation skill
package** for `vllm-ascend`.

The goal is to make model adaptation work (especially for recurring
feature-request issues) **repeatable, auditable, and easier to hand
off**.

### Scope in this PR

This PR adds only skill/workflow assets under:

- `.agents/skills/vllm-ascend-model-adapter/SKILL.md`
-
`.agents/skills/vllm-ascend-model-adapter/references/workflow-checklist.md`
-
`.agents/skills/vllm-ascend-model-adapter/references/troubleshooting.md`
-
`.agents/skills/vllm-ascend-model-adapter/references/multimodal-ep-aclgraph-lessons.md`
-
`.agents/skills/vllm-ascend-model-adapter/references/fp8-on-npu-lessons.md`
- `.agents/skills/vllm-ascend-model-adapter/references/deliverables.md`

### Workflow improvements

The skill standardizes:

1. **Environment assumptions** used in our Docker setup
- implementation roots: `/vllm-workspace/vllm` and
`/vllm-workspace/vllm-ascend`
- serving root: `/workspace`
- model path convention: `/models/<model-name>`

2. **Validation strategy**
- Stage A: fast `--load-format dummy` gate
- Stage B: mandatory real-weight gate before sign-off
- avoid false-ready by requiring request-level checks (not startup log
only)

3. **Feature-first verification checklist**
- ACLGraph / EP / flashcomm1 / MTP / multimodal
- explicit `supported / unsupported / not-applicable /
checkpoint-missing` outcomes

4. **Delivery contract**
- minimal scoped code changes
- required artifacts (Chinese report + runbook, e2e config YAML,
tutorial doc)
- one signed commit in delivery repo

### What this PR does NOT do

- No runtime/kernel/model patch is included in this PR.
- No direct model support claim is made by this PR alone.
- Model-specific adaptation/fix work should be submitted in follow-up
PRs using this skill as the workflow baseline.

### Why this matters for maintainers

This gives the repo a shared, explicit AI-assistance protocol, so future
model-adaptation PRs are easier to review, compare, and reproduce.

---------

Signed-off-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com>
Co-authored-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com>

2026-02-26 08:48:15 +08:00

3 Commits