xc-llm-ascend/vllm_ascend/quantization (tree at commit e8f7b2e3f19c8ec6a1ac0e323e250724ca2539d1)
Latest commit: 305820f1a9 by jiangmengyu18, 2026-03-18 20:30:03 +08:00
[Bugfix] fix bug about model type of qwen3_vl_8b_instruct_w8a8 (#7383)

### What this PR does / why we need it?
Adapt to the model type of Qwen3-VL-8B-Instruct-W8A8.

- vLLM version: v0.17.0
- vLLM main: 4034c3d32e

Signed-off-by: betta18 <jiangmengyu1@huawei.com>
Co-authored-by: betta18 <jiangmengyu1@huawei.com>
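For context on what a fix like this usually amounts to, here is a minimal, hypothetical sketch of a model-type lookup inside a quantization config. All names below (`SUPPORTED_MODEL_TYPES`, `is_supported_model_type`) are assumptions for illustration only, not the actual contents of modelslim_config.py or the patch in #7383.

```python
# Hypothetical sketch (names assumed, not the actual patch): a quantization
# config that decides per-model handling from the HF config's model_type
# string. PR #7383 adapts handling for the qwen3_vl model type; this only
# illustrates the general shape of such a model-type lookup.

SUPPORTED_MODEL_TYPES = {
    "qwen2_vl",
    "qwen3_vl",      # added so Qwen3-VL-8B-Instruct-W8A8 resolves correctly
    "qwen3_vl_moe",
}

def is_supported_model_type(model_type: str) -> bool:
    """Return True if this quant config knows how to handle the model type."""
    return model_type in SUPPORTED_MODEL_TYPES

if __name__ == "__main__":
    print(is_supported_model_type("qwen3_vl"))  # True once the type is registered
```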
| File | Last commit | Date |
| --- | --- | --- |
| `methods/` | [Feature] Supports DSv3.1 PD separation and C8 quantization (#7222) | 2026-03-16 22:49:05 +08:00 |
| `__init__.py` | [Lint] Style: convert vllm-ascend/ to ruff format (Batch #7) (#6023) | 2026-02-06 14:56:53 +08:00 |
| `compressed_tensors_config.py` | [Lint] Style: convert vllm-ascend/ to ruff format (Batch #7) (#6023) | 2026-02-06 14:56:53 +08:00 |
| `method_adapters.py` | [Bugfix] Fix w4a8 weight loading failure when EP is not enabled (#7090) | 2026-03-10 16:57:05 +08:00 |
| `modelslim_config.py` | [Bugfix] fix bug about model type of qwen3_vl_8b_instruct_w8a8 (#7383) | 2026-03-18 20:30:03 +08:00 |
| `quant_parser.py` | [misc] move mxfp_compat into device to decouple from quantization init chain (#6918) | 2026-03-02 18:17:01 +08:00 |
| `utils.py` | [Feature][Quant] Reapply auto-detect quantization format and support remote model ID (#7111) | 2026-03-13 22:53:25 +08:00 |
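The `utils.py` entry above mentions auto-detecting the quantization format (#7111). As a hedged sketch only, detection of this kind commonly inspects the checkpoint's config.json for a `quantization_config` block, which is where HF-style quantized models record their format; the function name and exact logic below are assumptions, not the module's real API.

```python
# Hypothetical sketch of quantization-format auto-detection. The actual
# logic in quant_parser.py / utils.py may differ; this only shows the
# common pattern of reading config.json from a local checkpoint directory.
import json
from pathlib import Path

def detect_quant_format(model_dir: str) -> str | None:
    """Best-effort quantization format name, or None if unquantized."""
    config_path = Path(model_dir) / "config.json"
    if not config_path.exists():
        return None
    config = json.loads(config_path.read_text())
    quant_cfg = config.get("quantization_config")
    if quant_cfg is None:
        return None  # no quantization metadata: treat as an fp checkpoint
    # HF-style checkpoints label their scheme via "quant_method",
    # e.g. "compressed-tensors".
    return quant_cfg.get("quant_method")
```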