xc-llm-ascend/source at 95e640012822a81783d05fa4e793c16b9048daf6

EngineX/xc-llm-ascend


InSec a5cb8e40f5 [doc]Modify quantization tutorials (#5026 )

### What this PR does / why we need it?
Correct a few mistakes in two quantization tutorials, Qwen3-32B-W4A4.md and Qwen3-8B-W4A8.md:
- Qwen3-8B-W4A8: one idle NPU card needs to be set.
- Qwen3-32B-W4A4: two idle NPU cards need to be set for the FlatQuant training, and the calib_file path, which did not match the ModelSlim version, is corrected.
### Does this PR introduce _any_ user-facing change?
N/A
### How was this patch tested?

- vLLM version: v0.12.0
- vLLM main: ad32e3e19c

Signed-off-by: IncSec <1790766300@qq.com>

2025-12-15 20:12:06 +08:00


_templates/sections

[Doc] add v0.9.1 release note (#2646 )

2025-09-03 18:04:27 +08:00

[docs] [P/D] add feature guide for disaggregated-prefill (#3950 )

2025-11-10 09:31:30 +08:00

[Misc] Upgrade vllm hash to 12_14 (#5000 )

2025-12-15 19:54:23 +08:00

developer_guide

[CI] CI refactor (#4928 )

2025-12-14 11:09:56 +08:00

locale/zh_CN/LC_MESSAGES

[doc][main] Correct more doc mistakes (#4958 )

2025-12-13 18:36:58 +08:00

[Doc] Add sphinx build for vllm-ascend (#55 )

2025-02-13 18:44:17 +08:00

[doc]Modify quantization tutorials (#5026 )

2025-12-15 20:12:06 +08:00

update release note for suffix decoding (#5009 )

2025-12-15 17:22:19 +08:00

conf.py

add release note for 0.12.0 (#4995 )

2025-12-13 22:09:59 +08:00

faqs.md

add release note for 0.12.0 (#4995 )

2025-12-13 22:09:59 +08:00

index.md

[feature] vllm-ascend support msprobe (eager mode dump) (#4241 )

2025-11-24 21:58:31 +08:00

installation.md

add release note for 0.12.0 (#4995 )

2025-12-13 22:09:59 +08:00

quick_start.md

[Doc] Refactor the DeepSeek-V3.2-Exp tutorial. (#3871 )

2025-11-04 18:58:33 +08:00