[ModelRunner] Fix cuda hard code in model runner (#155)
### What this PR does / why we need it? 1. Fix cuda hard code in model runner. 2. Fix tutorials doc rendering error. ### Does this PR introduce _any_ user-facing change? no. ### How was this patch tested? no. Signed-off-by: Shanshan Shen <467638484@qq.com>
This commit is contained in:
@@ -237,7 +237,10 @@ docker run \
|
|||||||
```
|
```
|
||||||
|
|
||||||
Choose one machine as the head node; the others are worker nodes. Then start ray on each machine:
|
Choose one machine as the head node; the others are worker nodes. Then start ray on each machine:
|
||||||
:::{note} Check out your `nic_name` by running `ip addr`. :::
|
|
||||||
|
:::{note}
|
||||||
|
Check out your `nic_name` by running `ip addr`.
|
||||||
|
:::
|
||||||
|
|
||||||
```shell
|
```shell
|
||||||
# Head node
|
# Head node
|
||||||
|
|||||||
@@ -1113,8 +1113,8 @@ class NPUModelRunner(NPUModelRunnerBase[ModelInputForNPUWithSamplingMetadata]):
|
|||||||
|
|
||||||
if (self.observability_config is not None
|
if (self.observability_config is not None
|
||||||
and self.observability_config.collect_model_forward_time):
|
and self.observability_config.collect_model_forward_time):
|
||||||
model_forward_start = torch.cuda.Event(enable_timing=True)
|
model_forward_start = torch_npu.npu.Event(enable_timing=True)
|
||||||
model_forward_end = torch.cuda.Event(enable_timing=True)
|
model_forward_end = torch_npu.npu.Event(enable_timing=True)
|
||||||
model_forward_start.record()
|
model_forward_start.record()
|
||||||
|
|
||||||
if not bypass_model_exec:
|
if not bypass_model_exec:
|
||||||
|
|||||||
Reference in New Issue
Block a user