Drop torchair (#4814)
aclgraph is now stable and fast, so let's drop torchair graph mode.
TODO: some logic that adapts to torchair still needs to be cleaned up.
We'll do that in a follow-up PR.
- vLLM version: v0.12.0
- vLLM main: ad32e3e19c
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Co-authored-by: Mengqing Cao <cmq0113@163.com>
@@ -27,5 +27,4 @@ vllm serve Qwen/Qwen1.5-MoE-A2.7B \
 --max-num-batched-tokens 4096 \
 --gpu-memory-utilization 0.9 \
 --trust-remote-code \
---enforce-eager \
---additional-config '{"torchair_graph_config":{"enabled":false, "use_cached_graph":false}}'
+--enforce-eager