Initial commit for vLLM-Kunlun Plugin
10
docs/source/user_guide/support_matrix/index.md
Normal file
@@ -0,0 +1,10 @@
# Features and Models

This section describes the features and models supported by vLLM-Kunlun.

:::{toctree}
:caption: Support Matrix
:maxdepth: 1
supported_models
supported_features
:::
14
docs/source/user_guide/support_matrix/supported_features.md
Normal file
@@ -0,0 +1,14 @@
# Supported Features

vLLM-Kunlun aims to keep its feature support **aligned with vLLM**. We are also actively collaborating with the community to accelerate support.

You can check the [support status of the vLLM V1 Engine][v1_user_guide]. Below is the feature support status of vLLM-Kunlun:

## Features Supported
|Feature|Status|Note|
|-|-|-|
|Tensor Parallel|🟢 Functional||
|Expert Parallel|🟢 Functional||
|Graph Mode|🟢 Functional||
|Quantization|🟢 Functional||
|LoRA|⚠️ Needs testing|LLMs only|
||||
33
docs/source/user_guide/support_matrix/supported_models.md
Normal file
@@ -0,0 +1,33 @@
# Supported Models

## Generative Models

| Model         | Support        | W8A8 | LoRA | Tensor Parallel | Expert Parallel | Data Parallel | Piecewise Kunlun Graph |
| :------------ | :------------- | :--- | :--- | :-------------- | :-------------- | :------------ | :--------------------- |
| Qwen2         | ✅             |      | ✅   | ✅              |                 | ✅            | ✅                     |
| Qwen2.5       | ✅             |      | ✅   | ✅              |                 | ✅            | ✅                     |
| Qwen3         | ✅             |      | ✅   | ✅              |                 | ✅            | ✅                     |
| Qwen3-MoE     | ✅             | ✅   | ✅   | ✅              | ✅              | ✅            | ✅                     |
| Qwen3-Coder   | ✅             | ✅   | ✅   | ✅              | ✅              | ✅            | ✅                     |
| QwQ-32B       | ✅             |      |      | ✅              |                 | ✅            | ✅                     |
| Llama2        | ✅             |      |      | ✅              |                 | ✅            | ✅                     |
| Llama3        | ✅             |      |      | ✅              |                 | ✅            | ✅                     |
| Llama3.1      | ✅             |      |      | ✅              |                 | ✅            | ✅                     |
| GLM-4.5       | ✅             | ✅   | ✅   | ✅              | ✅              | ✅            | ✅                     |
| GLM-4.5-Air   | ✅             | ✅   | ✅   | ✅              | ✅              | ✅            | ✅                     |
| Qwen3-Next    | 🔜 Coming soon |      |      |                 |                 |               |                        |
| gpt-oss       | 🔜 Coming soon |      |      |                 |                 |               |                        |
| DeepSeek-V3   | 🔜 Coming soon |      |      |                 |                 |               |                        |
| DeepSeek-V3.2 | 🔜 Coming soon |      |      |                 |                 |               |                        |

## Multimodal Language Models
| Model        | Support        | W8A8 | LoRA | Tensor Parallel | Expert Parallel | Data Parallel | Piecewise Kunlun Graph |
| :----------- | :------------- | :--- | :--- | :-------------- | :-------------- | :------------ | :--------------------- |
| Qianfan-VL   | ✅             |      |      | ✅              |                 | ✅            | ✅                     |
| Qwen2.5-VL   | ✅             |      |      | ✅              |                 | ✅            | ✅                     |
| InternVL2.5  | ✅             |      |      | ✅              |                 | ✅            | ✅                     |
| InternVL3    | ✅             |      |      | ✅              |                 | ✅            | ✅                     |
| InternVL3.5  | ✅             |      |      | ✅              |                 | ✅            | ✅                     |
| InternS1     | ✅             |      |      | ✅              |                 | ✅            | ✅                     |
| Qwen2.5-Omni | 🔜 Coming soon |      |      |                 |                 |               |                        |
| Qwen3-VL     | 🔜 Coming soon |      |      |                 |                 |               |                        |