EngineX/xc-llm-ascend
2cb9f76a0f0f68093a7f8a079c9fde877434ae69
xc-llm-ascend/vllm_ascend/_310p/sample/__init__.py

4 lines
94 B
Python

[BugFix][0.18.0][310p] fix post-sampling not working in graph mode on 310p (#8077)

### What this PR does / why we need it?

Enabling temperature in post-processing on 310P devices can cause the service to stall and eventually hang. We first traced the issue to a timeout where the temperature-related `div` operator was waiting for results from a sub-stream. After investigating the preceding operators, we finally identified the root cause as the `q.exponential_()` operator, which is not well supported on 310P and triggers an internal issue in the `add` kernel.

### Does this PR introduce _any_ user-facing change?

NA

### How was this patch tested?

This patch was thoroughly tested locally (accuracy-dataset test and stress test). It is not easy to design a proper unit test for this case, and I appreciate your understanding.

Signed-off-by: Tflowers-0129 <2906339855@qq.com>
2026-04-09 16:31:38 +08:00
from vllm_ascend._310p.sample.sampler import AscendSampler310
__all__ = ["AscendSampler310"]
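The commit message above names `q.exponential_()` as the root cause, but this page does not show the sampler change itself. As background only, the following is a minimal pure-Python sketch of the exponential-race sampling technique in which such exponential noise is commonly used: each probability is divided by independent Exp(1) noise and the argmax is taken. Here the noise is drawn by inverse transform from uniform samples rather than by a dedicated exponential kernel. All function names are hypothetical, and this is an illustration of the general idea under those assumptions, not the code from the PR or from `AscendSampler310`.

```python
import math
import random
from typing import List, Sequence


def exponential_noise(rng: random.Random) -> float:
    """Draw one Exp(1) sample by inverse transform: -log(1 - U), U in [0, 1)."""
    # 1 - U lies in (0, 1], so the argument of log is never zero.
    return -math.log(1.0 - rng.random())


def sample_index(probs: Sequence[float], rng: random.Random) -> int:
    """Pick index i with probability proportional to probs[i].

    Hypothetical helper: divides each p_i by independent Exp(1) noise q_i
    and returns the argmax of p_i / q_i (the exponential-race trick).
    """
    best_index, best_score = 0, -1.0
    for i, p in enumerate(probs):
        q = exponential_noise(rng)
        # Guard the (extremely unlikely) q == 0 case instead of dividing by zero.
        score = p / q if q > 0.0 else math.inf
        if score > best_score:
            best_index, best_score = i, score
    return best_index


def empirical_frequencies(probs: Sequence[float], draws: int, seed: int = 0) -> List[float]:
    """Repeat sample_index `draws` times and return the observed frequencies."""
    rng = random.Random(seed)
    counts = [0] * len(probs)
    for _ in range(draws):
        counts[sample_index(probs, rng)] += 1
    return [c / draws for c in counts]
```

Calling `empirical_frequencies([0.1, 0.2, 0.7], 20000)` should return frequencies close to the input probabilities, which is a quick way to check that swapping the noise source does not change the sampling distribution.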