There is lot of torchair code in model runner leading the code hard for
maintenance. We'll create new torchair_model_runner to split torchair
related logic. Following the workflow #2203, this is the first PR.
What this PR does:
create the new torchair model runner, more function will be added later
- vLLM version: v0.10.0
- vLLM main:
586f286789
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
### What this PR does / why we need it?
A refactoring of forward_context and model_runner_v1, add some context
which is necessary in model inference into forward_context, and refactor
dummy_run logic, make it more reasonable.
Some details for this PR:
Add `ascend_forward_context`;
Update mc2_v2 op, and support `active_mask` param;
Update scripts in examples dir;
refactor `dummy_run` logic;
Add soc_version for A2 and A3;
### Does this PR introduce _any_ user-facing change?
No change at user-facing.
### How was this patch tested?
- vLLM version: v0.10.0
- vLLM main:
57c22e57f9
Signed-off-by: zzzzwwjj <1183291235@qq.com>
There is a lot torchair specified logic in common code. It results hard
code maintenance. We will create a new torchair module to launch
torchair related logic there. I plan to add 4 PR.
1. Refactor worker
2. Refactor utils (this PR)
- simple change that move all torchair related util function to torchair
module
3. Refactor model_runner
4. Refactor attention
- vLLM version: v0.9.2
- vLLM main:
8188196a1c
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
There is a lot torchair specified logic in common code. It results hard
code maintenance. We will create a new torchair module to launch
torchair related logic there. I plan to add 4 PR.
1. Refactor worker (this PR)
- create torchair module and move torchair related code in worker to the
new module
3. Refactor utils
4. Refactor model_runner
5. Refactor attention
- vLLM version: v0.9.2
- vLLM main:
8188196a1c
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>