[EPLB] Avoiding eplb's dependency on a specified model (#6528)
### What this PR does / why we need it?

1. Currently, eplb registers different attributes for different models, but these attributes are not actually used. These attributes are now deleted directly, so eplb no longer depends on a specific model.
2. Add some logs about eplb.

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

#### Deepseek v3.1 chat

Of course! Here is a comprehensive explanation of deep learning, broken down for clarity.

### The Simple Analogy: A Child Learning to Recognize a Cat

Imagine teaching a child what a cat is. You don't give them a rulebook with instructions like "has pointy ears, whiskers, and a tail." Instead, you show them many pictures, saying "this is a cat" or "this is not a cat." The child's brain gradually learns to identify the complex patterns—the combination of shapes, colors, and textures—that define "cat-ness."

**Deep learning is essentially this, but for computers.** It's a method for teaching computers to learn from examples and recognize patterns directly from data (like images, sound, or text) without being explicitly programmed with rigid rules.

---

### The Technical Definition

**Deep Learning is a subfield of machine learning, which itself is a subfield of artificial intelligence (AI).** It uses artificial **neural networks** with many layers ("deep" networks) to model and understand complex patterns in data.

Here are the key concepts in that definition:

1. **Artificial Intelligence (AI):** The broad science of making machines smart and capable of performing tasks that typically require human intelligence.
2. **Machine Learning (ML):** A subset of AI that gives computers the ability to learn from data *without* being explicitly programmed for every single rule.
3. **Deep Learning (DL):** A specific, powerful

- vLLM version: v0.15.0
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.15.0

Signed-off-by: shenchuxiaofugui <1311027364@qq.com>
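The core of the change is replacing per-model registered attributes (`num_dense_layers`, `num_moe_layers`) with a generic lookup on the model config: `first_k_dense_replace` defaults to 0 when absent, so models without a dense prefix need no special case. A minimal sketch of that pattern — the `SimpleNamespace` configs below are illustrative stand-ins for real HF configs, not part of this PR:

```python
from types import SimpleNamespace

def moe_layer_range(config):
    """Compute the MoE layer index range generically from the config:
    dense layers (if any) come first, MoE layers fill the rest."""
    # Models whose layers are all MoE (e.g. Qwen3-MoE) simply lack
    # `first_k_dense_replace`, so the getattr default of 0 covers them.
    num_dense = getattr(config, "first_k_dense_replace", 0)
    return range(num_dense, config.num_hidden_layers)

# DeepSeek-style config: a few dense layers, then MoE layers.
deepseek_cfg = SimpleNamespace(num_hidden_layers=61, first_k_dense_replace=3)
# Qwen3-MoE-style config: every layer is MoE.
qwen_cfg = SimpleNamespace(num_hidden_layers=48)

print(list(moe_layer_range(deepseek_cfg))[:3])  # [3, 4, 5]
print(len(list(moe_layer_range(qwen_cfg))))     # 48
```

With this, the `raise NotImplementedError` branch for unrecognized model types becomes unnecessary.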
@@ -28,47 +28,25 @@ def get_log2phy_map(self, layer_id):
     return self.model.layers[layer_id].mlp.experts.get_log2phy_map()


 def get_all_expert_map(self, num_moe_layers):
     all_loads = []
     num_dense_layers = self.num_dense_layers if hasattr(self, "num_dense_layers") else 0
     for layer_id in range(num_moe_layers):
         load_tensor = self.get_expert_map(layer_id + num_dense_layers)  # (num_experts_per_layer,)
         all_loads.append(load_tensor)

     return torch.stack(all_loads, dim=0)


 def get_all_moe_loads(self):
-    num_dense_layers = self.num_dense_layers if hasattr(self, "num_dense_layers") else 0
+    num_dense_layers = getattr(self.model.config, "first_k_dense_replace", 0)
+    num_layers = self.model.config.num_hidden_layers
     all_moe_loads = torch.stack(
-        [
-            self.model.layers[layer_id + num_dense_layers].mlp.experts.moe_load
-            for layer_id in range(self.num_moe_layers)
-        ],
+        [self.model.layers[layer_id].mlp.experts.moe_load for layer_id in range(num_dense_layers, num_layers)],
         dim=0,
     )
     return all_moe_loads


 def clear_all_moe_loads(self):
-    num_dense_layers = self.num_dense_layers if hasattr(self, "num_dense_layers") else 0
-    for layer_id in range(self.num_moe_layers):
-        self.model.layers[layer_id + num_dense_layers].mlp.experts.clear_moe_load()
+    num_dense_layers = getattr(self.model.config, "first_k_dense_replace", 0)
+    num_layers = self.model.config.num_hidden_layers
+    for layer_id in range(num_dense_layers, num_layers):
+        self.model.layers[layer_id].mlp.experts.clear_moe_load()


-def model_register(model, model_config):
+def model_register(model):
     model.get_expert_map = types.MethodType(get_expert_map, model)
     model.get_log2phy_map = types.MethodType(get_log2phy_map, model)
     model.get_all_expert_map = types.MethodType(get_all_expert_map, model)
     model.get_all_moe_loads = types.MethodType(get_all_moe_loads, model)
     model.clear_all_moe_loads = types.MethodType(clear_all_moe_loads, model)
-
-    config = model_config.hf_text_config
-
-    if config.model_type == "qwen3_moe":
-        model.num_moe_layers = config.num_hidden_layers
-    elif config.model_type == "deepseek_v2" or config.model_type == "deepseek_v3":
-        model.num_dense_layers = config.first_k_dense_replace
-        model.num_moe_layers = config.num_hidden_layers - model.num_dense_layers
-    else:
-        raise NotImplementedError("EPLB is not supported.")
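`model_register` attaches plain functions to a model instance as bound methods via `types.MethodType`, which is what lets the same eplb helpers work on any model object. A minimal sketch of that binding mechanism — the `_Experts`/`_Layer`/`_Model` classes are hypothetical stand-ins for illustration, not vllm-ascend code:

```python
import types

class _Experts:
    """Hypothetical stand-in for an MoE experts module."""
    def __init__(self):
        self.moe_load = 1.0

    def clear_moe_load(self):
        self.moe_load = 0.0

class _Layer:
    def __init__(self):
        self.mlp = types.SimpleNamespace(experts=_Experts())

class _Model:
    def __init__(self, num_layers):
        self.layers = [_Layer() for _ in range(num_layers)]

def clear_all_moe_loads(self):
    # Once bound with types.MethodType, `self` is the model instance,
    # so the same free function works for any model object.
    for layer in self.layers:
        layer.mlp.experts.clear_moe_load()

model = _Model(4)
model.clear_all_moe_loads = types.MethodType(clear_all_moe_loads, model)
model.clear_all_moe_loads()
print([layer.mlp.experts.moe_load for layer in model.layers])  # [0.0, 0.0, 0.0, 0.0]
```

Binding at registration time (rather than subclassing each model) is what keeps eplb decoupled from any specific model class.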