Files

wangxiyuan c9d05d10aa [Doc][Misc] Refactor skill documentation and add Claude support instructions (#6817 )

### What this PR does / why we need it?
This PR refactors the documentation for vLLM Ascend skills.
- It renames and moves the `vllm-ascend-model-adapter` skill's README to
serve as a new top-level README for the `.agents` directory.
- It adds instructions on how to use the Ascend skills with Claude,
including a new README in the `.claude` directory.
- It updates `.gitignore` to exclude skills copied for Claude's use.
- Add main2main skill

This improves the documentation structure, making it more organized and
providing clear instructions for developers using these skills with
different tools.

### Does this PR introduce _any_ user-facing change?
No, this PR contains only documentation and repository configuration
changes. It does not affect any user-facing code functionality.

### How was this patch tested?
These changes are documentation-only and do not require specific
testing. The correctness of the instructions is being verified through
this review.

- vLLM version: v0.15.0
- vLLM main:
83b47f67b1

---------

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>

2026-02-26 14:42:59 +08:00

3.0 KiB

Raw Blame History

vLLM Ascend skills

This directory contains the skills for vLLM Ascend.

Note: Please copy the skills directory .agents/skills to .claude/skills if you want to use the skills in this repo with Claude code.

vLLM Ascend Model Adapter Skill
vLLM Ascend main2main Skill

vLLM Ascend Model Adapter Skill

Adapt and debug models for vLLM on Ascend NPU — covering both already-supported architectures and new models not yet registered in vLLM.

What it does

This skill guides an AI agent through a deterministic workflow to:

Triage a model checkpoint (architecture, quant type, multimodal capability).
Implement minimal code changes in /vllm-workspace/vllm and /vllm-workspace/vllm-ascend.
Validate via a two-stage gate (dummy fast gate + real-weight mandatory gate).
Deliver one signed commit with code, test config, and tutorial doc.

File layout

File	Purpose
`SKILL.md`	Skill definition, constraints, and execution playbook
`references/workflow-checklist.md`	Step-by-step commands and templates
`references/troubleshooting.md`	Symptom-action pairs for common failures
`references/fp8-on-npu-lessons.md`	FP8 checkpoint handling on Ascend
`references/multimodal-ep-aclgraph-lessons.md`	VL, EP, and ACLGraph patterns
`references/deliverables.md`	Required outputs and commit discipline

Quick start

Open a conversation with the AI agent inside the vllm-ascend dev container.
Invoke the skill (e.g. /vllm-ascend-model-adapter).
Provide the model path (default /models/<model-name>) and the originating issue number.
The agent follows the playbook in SKILL.md and produces a ready-to-merge commit.

Key constraints

Never upgrade transformers.
Start vllm serve from /workspace (direct command, port 8000).
Dummy-only evidence is not sufficient — real-weight validation is mandatory.
Final delivery is exactly one signed commit in the current repo.

Two-stage validation

Stage A (dummy): fast architecture / operator / API path check with --load-format dummy.
Stage B (real): real-weight loading, fp8/quant path, KV sharding, runtime stability.

Both stages require request-level verification (/v1/models + at least one chat request), not just startup success.

vLLM Ascend main2main Skill

Migrate changes from the main vLLM repository to the vLLM Ascend repository, ensuring compatibility and performance optimizations for Ascend NPUs.

What it does

This skill facilitates the process of:

Identifying changes in the main vLLM repository.
Applying necessary modifications for Ascend support.
Validating the changes in an Ascend environment.
Delivering a ready-to-merge commit with optimized code and configurations.

Quick start

Open a conversation with the AI agent inside the vllm-ascend dev container.
Invoke the skill (e.g. /main2main).
The agent follows the playbook and produces a ready-to-merge commit.

3.0 KiB Raw Blame History

vLLM Ascend skills

Table of Contents

vLLM Ascend Model Adapter Skill

What it does

File layout

Quick start

Key constraints

Two-stage validation

vLLM Ascend main2main Skill

What it does

Quick start

3.0 KiB

Raw Blame History