[Doc][Misc] Refactor skill documentation and add Claude support instructions (#6817)

### What this PR does / why we need it?

This PR refactors the documentation for vLLM Ascend skills.

- It renames and moves the `vllm-ascend-model-adapter` skill's README to serve as a new top-level README for the `.agents` directory.
- It adds instructions on how to use the Ascend skills with Claude, including a new README in the `.claude` directory.
- It updates `.gitignore` to exclude skills copied for Claude's use.
- Add main2main skill

This improves the documentation structure, making it more organized and providing clear instructions for developers using these skills with different tools.

### Does this PR introduce _any_ user-facing change?

No, this PR contains only documentation and repository configuration changes. It does not affect any user-facing code functionality.

### How was this patch tested?

These changes are documentation-only and do not require specific testing. The correctness of the instructions is being verified through this review.

- vLLM version: v0.15.0
- vLLM main: 83b47f67b1

---------

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2026-02-26 14:42:59 +08:00
parent e76b69b9ef
commit c9d05d10aa
4 changed files with 317 additions and 6 deletions
--- a/.agents/README.md
+++ b/.agents/README.md
@@ -0,0 +1,76 @@
+# vLLM Ascend skills
+
+This directory contains the skills for vLLM Ascend.
+
+Note: To use the skills in this repo with Claude Code, copy the skills directory `.agents/skills` to `.claude/skills`.
+
+## Table of Contents
+
+- [vLLM Ascend Model Adapter Skill](#vllm-ascend-model-adapter-skill)
+- [vLLM Ascend main2main Skill](#vllm-ascend-main2main-skill)
+
+## vLLM Ascend Model Adapter Skill
+
+Adapt and debug models for vLLM on Ascend NPU — covering both already-supported
+architectures and new models not yet registered in vLLM.
+
+### What it does
+
+This skill guides an AI agent through a deterministic workflow to:
+
+1. Triage a model checkpoint (architecture, quant type, multimodal capability).
+2. Implement minimal code changes in `/vllm-workspace/vllm` and `/vllm-workspace/vllm-ascend`.
+3. Validate via a two-stage gate (dummy fast gate + real-weight mandatory gate).
+4. Deliver one signed commit with code, test config, and tutorial doc.
+
+### File layout
+
+| File | Purpose |
+| ---- | ------- |
+| `SKILL.md` | Skill definition, constraints, and execution playbook |
+| `references/workflow-checklist.md` | Step-by-step commands and templates |
+| `references/troubleshooting.md` | Symptom-action pairs for common failures |
+| `references/fp8-on-npu-lessons.md` | FP8 checkpoint handling on Ascend |
+| `references/multimodal-ep-aclgraph-lessons.md` | VL, EP, and ACLGraph patterns |
+| `references/deliverables.md` | Required outputs and commit discipline |
+
+### Quick start
+
+1. Open a conversation with the AI agent inside the vllm-ascend dev container.
+2. Invoke the skill (e.g. `/vllm-ascend-model-adapter`).
+3. Provide the model path (default `/models/<model-name>`) and the originating issue number.
+4. The agent follows the playbook in `SKILL.md` and produces a ready-to-merge commit.
+
+### Key constraints
+
+- Never upgrade `transformers`.
+- Start `vllm serve` from `/workspace` (direct command, port 8000).
+- Dummy-only evidence is not sufficient — real-weight validation is mandatory.
+- Final delivery is exactly one signed commit in the current repo.
+
+### Two-stage validation
+
+- **Stage A (dummy)**: fast architecture / operator / API path check with `--load-format dummy`.
+- **Stage B (real)**: real-weight loading, fp8/quant path, KV sharding, runtime stability.
+
+Both stages require request-level verification (`/v1/models` + at least one chat request),
+not just startup success.
+
+## vLLM Ascend main2main Skill
+
+Migrate changes from the main vLLM repository to the vLLM Ascend repository while keeping them compatible with, and optimized for, Ascend NPUs.
+
+### What it does
+
+This skill facilitates the process of:
+
+1. Identifying changes in the main vLLM repository.
+2. Applying necessary modifications for Ascend support.
+3. Validating the changes in an Ascend environment.
+4. Delivering a ready-to-merge commit with optimized code and configurations.
+
+### Quick start
+
+1. Open a conversation with the AI agent inside the vllm-ascend dev container.
+2. Invoke the skill (e.g. `/main2main`).
+3. The agent follows the playbook and produces a ready-to-merge commit.