[Bugfix][SHM] Fix weak memory ordering problem in shared memory (#3988)
### What this PR does / why we need it?

This PR fixes a weak memory ordering problem in shared memory by patching the message queue with an additional lock. The issue is described in detail at https://github.com/vllm-project/vllm/issues/27858. The key point is to use the writer lock to enforce a memory fence before the ready flag is set (`metadata_buffer[0] = 1`). This is a temporary solution and is disabled by default; enable it by setting the environment variable `SHM_BARRIER=true`.

### Does this PR introduce _any_ user-facing change?

`SHM_BARRIER=true` enables this change and `SHM_BARRIER=false` disables it. The latter is the default.

### How was this patch tested?

By CI.

---------

Signed-off-by: Zetong Li <slippersss@126.com>
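The ordering described above (write the payload, pass through the writer lock, then set the ready flag) can be sketched as follows. This is only an illustration under stated assumptions, not the code in `patch_message_queue`: the class `SketchQueue`, its `payload` buffer, and its method names are hypothetical, and in-process `bytearray` objects stand in for the real shared-memory buffers. It assumes, as the PR description does, that acquiring and releasing the lock acts as a memory fence on the target platform.

```python
import threading


class SketchQueue:
    """Hypothetical single-slot queue; names are illustrative only."""

    def __init__(self, payload_size: int = 16) -> None:
        # Stand-ins for the shared-memory regions (assumption: the real
        # queue keeps a ready flag at metadata_buffer[0]).
        self.metadata_buffer = bytearray(1)
        self.payload = bytearray(payload_size)
        self._writer_lock = threading.Lock()

    def write(self, data: bytes) -> None:
        # 1. Mark the slot as not ready and write the payload.
        self.metadata_buffer[0] = 0
        self.payload[: len(data)] = data
        # 2. Pass through the writer lock. The acquire/release pair is
        #    what this sketch relies on as the memory fence, so the
        #    payload stores are visible before the flag store.
        with self._writer_lock:
            pass
        # 3. Only now publish the ready flag.
        self.metadata_buffer[0] = 1

    def try_read(self, length: int):
        # A reader trusts the payload only after it sees the flag set.
        if self.metadata_buffer[0] != 1:
            return None
        return bytes(self.payload[:length])


queue = SketchQueue()
queue.write(b"hello")
message = queue.try_read(5)
```

A reader that polls the flag without any fence on the writer side could, on a weakly ordered CPU, observe the flag before the payload stores. Placing the lock between the two writes is the inexpensive, temporary mitigation the description refers to.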
@@ -24,3 +24,7 @@ import vllm_ascend.patch.platform.patch_sched_yield # noqa
 if os.getenv("DYNAMIC_EPLB", "false") == "true" or os.getenv(
         "EXPERT_MAP_RECORD", "false") == "true":
     import vllm_ascend.patch.platform.patch_multiproc_executor # noqa
+
+if os.getenv("SHM_BARRIER", "false") == "true":
+    import vllm_ascend.patch.platform.patch_core # noqa
+    import vllm_ascend.patch.platform.patch_message_queue # noqa
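The gate in the diff compares the environment variable against the exact string `"true"`, so values such as `True` or `1` leave the patch disabled. A minimal sketch of that check, using a hypothetical helper `shm_barrier_enabled` that is not part of the patch and takes an optional mapping so it can be exercised without touching the real environment:

```python
import os


def shm_barrier_enabled(environ=None) -> bool:
    """Hypothetical helper mirroring the check shown in the diff."""
    env = os.environ if environ is None else environ
    # Same semantics as os.getenv("SHM_BARRIER", "false") == "true":
    # the comparison is an exact, case-sensitive string match.
    return env.get("SHM_BARRIER", "false") == "true"


enabled_default = shm_barrier_enabled({})
enabled_exact = shm_barrier_enabled({"SHM_BARRIER": "true"})
enabled_capitalized = shm_barrier_enabled({"SHM_BARRIER": "True"})
```

Because the imports run when the patch package is loaded, the variable has to be set before the process starts, for example `SHM_BARRIER=true` on the command line that launches it.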