Initialize project; model provided by the ModelHub XC community

Model: harryadav3/Qwen3-30B-A3B-REAP-50
Source: Original Platform
ModelHub XC
2026-04-26 14:48:05 +08:00
commit 64eda1a024
20 changed files with 161956 additions and 0 deletions

36
.gitattributes vendored Normal file

@@ -0,0 +1,36 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
tokenizer.json filter=lfs diff=lfs merge=lfs -text

333
README.md Normal file

@@ -0,0 +1,333 @@
---
license: apache-2.0
base_model: Qwen/Qwen3-30B-A3B
tags:
- reap
- moe
- pruning
- expert-pruning
- qwen3
- mixture-of-experts
- compression
- one-shot
library_name: transformers
pipeline_tag: text-generation
model-index:
- name: Qwen3-30B-A3B-REAP-50
  results:
  - task:
      type: text-generation
      name: MMLU
    dataset:
      type: cais/mmlu
      name: MMLU
    metrics:
    - type: acc
      value: 49.42
      name: Accuracy (0-shot)
  - task:
      type: text-generation
      name: ARC Challenge
    dataset:
      type: allenai/ai2_arc
      name: ARC-Challenge
    metrics:
    - type: acc_norm
      value: 38.65
      name: Accuracy Normalized (0-shot)
  - task:
      type: text-generation
      name: HellaSwag
    dataset:
      type: Rowan/hellaswag
      name: HellaSwag
    metrics:
    - type: acc_norm
      value: 47.64
      name: Accuracy Normalized (0-shot)
  - task:
      type: text-generation
      name: BoolQ
    dataset:
      type: google/boolq
      name: BoolQ
    metrics:
    - type: acc
      value: 74.22
      name: Accuracy (0-shot)
  - task:
      type: text-generation
      name: WinoGrande
    dataset:
      type: allenai/winogrande
      name: WinoGrande
    metrics:
    - type: acc
      value: 58.80
      name: Accuracy (0-shot)
---
# Qwen3-30B-A3B-REAP-50: 50% Expert-Pruned Qwen3 MoE
This model is a **50% expert-pruned** version of [Qwen/Qwen3-30B-A3B](https://huggingface.co/Qwen/Qwen3-30B-A3B), compressed using **REAP (Router-weighted Expert Activation Pruning)** from [Cerebras Research](https://github.com/CerebrasResearch/reap).
REAP is a one-shot compression technique for Mixture-of-Experts (MoE) models that physically removes low-importance experts based on a saliency criterion combining router gate-values and expert activation norms. The method was published at **ICLR 2026**.
## What Changed
| Property | Original | Pruned |
|----------|----------|--------|
| **Total Experts per Layer** | 128 | **64** |
| **Active Experts per Token** | 8 | 8 (unchanged) |
| **Model Size on Disk** | 57 GB | **30 GB** |
| **Safetensor Shards** | 16 | 7 |
| **Architecture** | Qwen3MoeForCausalLM | Qwen3MoeForCausalLM (unchanged) |
| **Hidden Size** | 2048 | 2048 (unchanged) |
| **Layers** | 48 | 48 (unchanged) |
| **Precision** | bfloat16 | bfloat16 (unchanged) |

The pruned model is a standard HuggingFace model and can be loaded directly with `transformers` -- no custom code required.

---
## How REAP Works
### The Problem
MoE models like Qwen3-30B-A3B use sparsely activated expert networks: each token is routed to only 8 of the 128 experts in each layer. Most experts therefore sit idle for any given token, and on a given workload some experts contribute far less than others. REAP exploits this by identifying and removing the least important experts.
### The REAP Saliency Criterion
REAP scores each expert using a dual criterion that captures both **how often an expert is selected** and **how much it contributes when active**:
```
REAP_score(expert_i) = mean over calibration tokens of:
    router_weight(expert_i) * activation_norm(expert_i)
```
Where:
- **Router weight** (`router_weight`): The softmax probability assigned by the gating network when selecting this expert. Higher means the router "prefers" this expert.
- **Expert Activation Norm** (`activation_norm`): The L2 norm of the expert's output vector. Higher means the expert produces larger (more impactful) modifications to the hidden state.

The product favors experts that are both strongly selected and produce meaningful outputs. An expert with a high router weight but a low activation norm barely changes the hidden state, and one with a high activation norm but a low router weight has its output scaled down by the router. REAP keeps the experts that matter on both dimensions.
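
As a rough illustration of the criterion above, the sketch below computes one score per expert from toy per-token records. The function name `reap_scores`, the record layout, and the choice to average only over the tokens routed to each expert are assumptions made for this example; the REAP repository remains the reference for the actual implementation.

```python
from collections import defaultdict


def reap_scores(records, num_experts):
    """Mean of router_weight * activation_norm per expert.

    `records` is a hypothetical list of (expert_id, router_weight,
    activation_norm) tuples, one per (token, selected expert) pair.
    Assumption: the mean runs over the tokens routed to each expert.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for expert_id, router_weight, activation_norm in records:
        totals[expert_id] += router_weight * activation_norm
        counts[expert_id] += 1
    # Experts never selected during calibration get a score of 0.0.
    return [
        totals[e] / counts[e] if counts[e] else 0.0
        for e in range(num_experts)
    ]


# Toy calibration data: 3 tokens, top-2 routing over 4 experts.
records = [
    (0, 0.5, 4.0), (1, 0.5, 2.0),    # token 0
    (0, 0.75, 4.0), (2, 0.25, 8.0),  # token 1
    (1, 0.5, 2.0), (2, 0.5, 4.0),    # token 2
]
scores = reap_scores(records, num_experts=4)
print(scores)
```

In this toy run expert 3 is never selected and expert 1 has the lowest score among the selected experts, so those two would be the first candidates for removal.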
### Why Pruning Beats Merging
The REAP paper (ICLR 2026) demonstrates a key finding: **expert pruning consistently outperforms expert merging** for MoE compression on generative tasks. Merging (combining similar experts into one) degrades all participating experts, while pruning (removing entire experts) preserves the full capacity of remaining experts and the router's ability to select among them.
### The Full Pipeline
```
1. Load Model
       |
2. Attach Observer Hooks to every MoE layer
       |
3. Forward Pass over calibration data (1024 samples)
       |-- Record router weights per expert per token
       |-- Record L2 norm of expert outputs per token
       |
4. Compute REAP saliency score for each expert
       |-- score = mean(router_weight * activation_norm)
       |
5. Rank experts by saliency score (lowest = least important)
       |
6. Prune bottom 50% of experts per layer
       |-- Remove expert modules from ModuleList
       |-- Slice router weight matrix to match
       |
7. Update config.json (num_experts: 128 -> 64)
       |
8. Save compressed model
```
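
Steps 5 to 7 of the pipeline can be sketched in plain Python for a single layer. The function `prune_layer`, the list-of-rows router representation, and the toy scores are illustrative assumptions, not code from the REAP repository, which operates on `torch` modules. Kept indices are re-sorted into their original order so that the remaining experts and router rows stay aligned.

```python
def prune_layer(scores, experts, router_rows, compression_ratio=0.5):
    """Drop the lowest-scoring experts and the matching router rows.

    scores:      one saliency score per expert
    experts:     per-expert objects (stand-ins for expert modules)
    router_rows: one router weight row per expert
    """
    num_experts = len(scores)
    num_keep = num_experts - int(num_experts * compression_ratio)
    # Rank by saliency (highest first), then restore the original order.
    ranked = sorted(range(num_experts), key=lambda e: scores[e], reverse=True)
    keep = sorted(ranked[:num_keep])
    kept_experts = [experts[e] for e in keep]
    kept_rows = [router_rows[e] for e in keep]
    return keep, kept_experts, kept_rows


scores = [2.5, 1.0, 2.0, 0.0]
experts = ["expert_0", "expert_1", "expert_2", "expert_3"]
router_rows = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]

keep, kept_experts, kept_rows = prune_layer(scores, experts, router_rows)
config = {"num_experts": len(kept_experts)}  # mirrors the config.json update
print(keep, kept_experts, config)
```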
---
## Detailed Parameters Used
### Model Configuration
| Parameter | Value | Description |
|-----------|-------|-------------|
| `model_name` | `Qwen/Qwen3-30B-A3B` | Base model: 30B total params, 3B active per token |
| `num_hidden_layers` | 48 | Number of transformer layers |
| `hidden_size` | 2048 | Hidden dimension |
| `num_attention_heads` | 32 | Multi-head attention heads |
| `num_key_value_heads` | 4 | GQA key-value heads |
| `head_dim` | 128 | Per-head dimension |
| `intermediate_size` | 6144 | Dense FFN intermediate size (not used by the MoE layers; `mlp_only_layers` is empty) |
| `moe_intermediate_size` | 768 | Per-expert FFN intermediate size |
| `num_experts` | 128 -> **64** | Experts per MoE layer (before -> after) |
| `num_experts_per_tok` | 8 | Top-K experts activated per token (unchanged) |
| `vocab_size` | 151,936 | Vocabulary size |
| `max_position_embeddings` | 40,960 | Maximum sequence length |
| `torch_dtype` | bfloat16 | Model precision |
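
The size reduction reported in the What Changed table can be sanity-checked from these values. The sketch below assumes that each expert is a gated FFN with three weight matrices (gate, up, and down projections) and no biases, and that each layer's router holds one `hidden_size`-wide row per expert. Both are assumptions about the architecture, not statements from this card.

```python
hidden_size = 2048
moe_intermediate_size = 768
num_hidden_layers = 48
experts_before, experts_after = 128, 64
bytes_per_param = 2  # bfloat16

# Assumed expert layout: gate, up and down projections, no biases.
params_per_expert = 3 * hidden_size * moe_intermediate_size
removed_experts = num_hidden_layers * (experts_before - experts_after)
removed_expert_params = removed_experts * params_per_expert
# Assumed router layout: one hidden_size-wide row per expert.
removed_router_params = removed_experts * hidden_size

removed_bytes = (removed_expert_params + removed_router_params) * bytes_per_param
print(f"params per expert:     {params_per_expert:,}")
print(f"removed expert params: {removed_expert_params:,}")
print(f"removed size:          {removed_bytes / 2**30:.1f} GiB")
```

Under these assumptions roughly 14.5 billion parameters, or about 27 GiB in bfloat16, are removed. That is in line with the reported drop from 57 GB to 30 GB if those figures are read as binary gigabytes.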
### Pruning Configuration
| Parameter | Value | Description |
|-----------|-------|-------------|
| `prune_method` | `reap` | REAP saliency criterion (router_weight * activation_norm) |
| `compression_ratio` | 0.50 | Remove 50% of experts (128 -> 64 per layer) |
| `seed` | 42 | Random seed for reproducibility |
| `singleton_super_experts` | `false` | Do not force high-activation outlier experts into singleton clusters |
| `singleton_outlier_experts` | `false` | Do not force outlier experts into singleton clusters |
### Observer Configuration (Activation Collection)
| Parameter | Value | Description |
|-----------|-------|-------------|
| `samples_per_category` | 1024 | Number of calibration samples processed |
| `batch_size` | 1 | Samples per forward pass |
| `model_max_length` | 2048 | Maximum sequence length for calibration |
| `distance_measure` | `cosine` | Distance metric for expert similarity |
| `renormalize_router_weights` | `true` | Renormalize the selected (top-k) router weights after softmax |
| `record_pruning_metrics_only` | `true` | Only collect metrics needed for pruning (skip merging metrics) |
| `overwrite_observations` | `false` | Do not overwrite existing observation files |
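
To make `renormalize_router_weights` concrete, the sketch below shows one common form of top-k renormalization: softmax over all experts, keep the top-k, then rescale the kept weights so they sum to 1. That this is exactly what the flag does is an assumption (it mirrors the `norm_topk_prob` setting in `config.json`), and `topk_router_weights` is a hypothetical helper.

```python
import math


def topk_router_weights(logits, k, renormalize=True):
    """Return {expert_id: weight} for the k highest-probability experts."""
    # Numerically stable softmax over all experts.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Keep the k most probable experts.
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    weights = {i: probs[i] for i in top}
    if renormalize:
        # Rescale so the selected weights sum to 1.
        kept = sum(weights.values())
        weights = {i: w / kept for i, w in weights.items()}
    return weights


logits = [2.0, 1.0, 0.5, -1.0]
raw = topk_router_weights(logits, k=2, renormalize=False)
norm = topk_router_weights(logits, k=2, renormalize=True)
print(sum(raw.values()), sum(norm.values()))
```

Without renormalization the selected weights sum to less than 1, because the probability mass of the unselected experts is discarded. Whether saliency scores are computed from raw or renormalized weights changes their scale.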
### Calibration Dataset
| Parameter | Value | Description |
|-----------|-------|-------------|
| `dataset_name` | `theblackcat102/evol-codealpaca-v1` | Code instruction-following dataset |
| `split` | `train` | Dataset split used |
| `shuffle` | `true` | Shuffle before sampling |
### Clustering Configuration
| Parameter | Value | Description |
|-----------|-------|-------------|
| `cluster_method` | `agglomerative` | Hierarchical agglomerative clustering |
| `expert_sim` | `ttm` | Token-to-token similarity matrix for expert similarity |
| `linkage_method` | `average` | Average linkage for hierarchical clustering |
| `frequency_penalty` | `true` | Penalize frequently-used experts during clustering |
---
### Timing
| Phase | Duration |
|-------|----------|
| Model loading | ~5 seconds |
| Observer pass (1024 samples) | ~6.5 hours |
| Expert pruning (all 48 layers) | < 1 second |
| Model saving | ~26 seconds |
| **Total** | **~6.5 hours** |
---
## Evaluation Results (0-shot, lm-eval-harness v0.4.11)
| Benchmark | Metric | Score |
|-----------|--------|-------|
| **MMLU** (57 subjects) | acc | **49.42%** |
| -- Humanities | acc | 39.17% |
| -- Social Sciences | acc | 60.38% |
| -- STEM | acc | 56.68% |
| -- Other | acc | 46.73% |
| **ARC Challenge** | acc | 33.62% |
| **ARC Challenge** | acc_norm | 38.65% |
| **ARC Easy** | acc | 53.16% |
| **ARC Easy** | acc_norm | 50.51% |
| **HellaSwag** | acc | 37.70% |
| **HellaSwag** | acc_norm | 47.64% |
| **BoolQ** | acc | **74.22%** |
| **WinoGrande** | acc | 58.80% |
| **OpenBookQA** | acc | 19.80% |
| **OpenBookQA** | acc_norm | 31.20% |
| **RTE** | acc | 58.48% |
### Evaluation Notes
- All benchmarks run at **0-shot** (no few-shot examples)
- Evaluation performed on the **base model** (not instruction-tuned)
- Evaluated using `lm-eval-harness` v0.4.11 with `model="hf"` backend
- Model loaded with `device_map="auto"` across 2 GPUs
---
## Usage
### Direct Loading with Transformers
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "harryadav3/Qwen3-30B-A3B-REAP-50"

# Standard Qwen3MoeForCausalLM checkpoint, so trust_remote_code is not needed.
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    torch_dtype="auto",
)

messages = [{"role": "user", "content": "Write a Python function to check if a number is prime."}]
# return_dict=True yields both input_ids and attention_mask for generate().
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
    return_dict=True,
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=512)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```
### Serving with vLLM
```bash
vllm serve harryadav3/Qwen3-30B-A3B-REAP-50 \
    --tensor-parallel-size 2 \
    --port 8000
```
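
Once the server is up, it can be queried through vLLM's OpenAI-compatible HTTP API. The sketch below uses only the Python standard library. The `localhost` host and the helper names `build_chat_request` and `ask` are assumptions for illustration, and nothing is sent over the network until `ask` is called.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/v1"  # assumed host; port matches --port above
MODEL = "harryadav3/Qwen3-30B-A3B-REAP-50"


def build_chat_request(prompt, max_tokens=512):
    """Build a POST request for the /chat/completions endpoint."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt):
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(prompt)) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


# Example (requires the server to be running):
# print(ask("Write a Python function to check if a number is prime."))
```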
---
## Reproducing This Model
```bash
# Clone REAP
git clone https://github.com/CerebrasResearch/reap.git
cd reap
git submodule init && git submodule update --recursive

# Install
uv venv .venv --seed --python 3.12
source .venv/bin/activate
uv pip install --editable . --native-tls --torch-backend auto

# Download base model
huggingface-cli download Qwen/Qwen3-30B-A3B

# Run REAP pruning
bash experiments/pruning-cli.sh \
    0,1 \
    "Qwen/Qwen3-30B-A3B" \
    "reap" \
    42 \
    0.50 \
    "theblackcat102/evol-codealpaca-v1" \
    false false false false false false false
```
---
## Citation
If you use this model, please cite the REAP paper:
```bibtex
@inproceedings{klasby2025reap,
title={{REAP} the Experts: Why Pruning Prevails for One-Shot {MoE} Compression},
author={Mike Lasby and Ivan Lazarevich and Nish Sinnadurai and Sean Lie and Yani Ioannou and Vithursan Thangarasa},
booktitle={The Fourteenth International Conference on Learning Representations},
year={2026},
url={https://arxiv.org/abs/2510.13999}
}
```
## Links
- **REAP Paper**: [arXiv:2510.13999](https://arxiv.org/abs/2510.13999)
- **REAP Repository**: [github.com/CerebrasResearch/reap](https://github.com/CerebrasResearch/reap)
- **Base Model**: [Qwen/Qwen3-30B-A3B](https://huggingface.co/Qwen/Qwen3-30B-A3B)
- **Cerebras Blog**: [cerebras.ai/blog/reap](https://www.cerebras.ai/blog/reap)

28
added_tokens.json Normal file

@@ -0,0 +1,28 @@
{
"</think>": 151668,
"</tool_call>": 151658,
"</tool_response>": 151666,
"<think>": 151667,
"<tool_call>": 151657,
"<tool_response>": 151665,
"<|box_end|>": 151649,
"<|box_start|>": 151648,
"<|endoftext|>": 151643,
"<|file_sep|>": 151664,
"<|fim_middle|>": 151660,
"<|fim_pad|>": 151662,
"<|fim_prefix|>": 151659,
"<|fim_suffix|>": 151661,
"<|im_end|>": 151645,
"<|im_start|>": 151644,
"<|image_pad|>": 151655,
"<|object_ref_end|>": 151647,
"<|object_ref_start|>": 151646,
"<|quad_end|>": 151651,
"<|quad_start|>": 151650,
"<|repo_name|>": 151663,
"<|video_pad|>": 151656,
"<|vision_end|>": 151653,
"<|vision_pad|>": 151654,
"<|vision_start|>": 151652
}

89
chat_template.jinja Normal file

@@ -0,0 +1,89 @@
{%- if tools %}
{{- '<|im_start|>system\n' }}
{%- if messages[0].role == 'system' %}
{{- messages[0].content + '\n\n' }}
{%- endif %}
{{- "# Tools\n\nYou may call one or more functions to assist with the user query.\n\nYou are provided with function signatures within <tools></tools> XML tags:\n<tools>" }}
{%- for tool in tools %}
{{- "\n" }}
{{- tool | tojson }}
{%- endfor %}
{{- "\n</tools>\n\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\n<tool_call>\n{\"name\": <function-name>, \"arguments\": <args-json-object>}\n</tool_call><|im_end|>\n" }}
{%- else %}
{%- if messages[0].role == 'system' %}
{{- '<|im_start|>system\n' + messages[0].content + '<|im_end|>\n' }}
{%- endif %}
{%- endif %}
{%- set ns = namespace(multi_step_tool=true, last_query_index=messages|length - 1) %}
{%- for message in messages[::-1] %}
{%- set index = (messages|length - 1) - loop.index0 %}
{%- if ns.multi_step_tool and message.role == "user" and message.content is string and not(message.content.startswith('<tool_response>') and message.content.endswith('</tool_response>')) %}
{%- set ns.multi_step_tool = false %}
{%- set ns.last_query_index = index %}
{%- endif %}
{%- endfor %}
{%- for message in messages %}
{%- if message.content is string %}
{%- set content = message.content %}
{%- else %}
{%- set content = '' %}
{%- endif %}
{%- if (message.role == "user") or (message.role == "system" and not loop.first) %}
{{- '<|im_start|>' + message.role + '\n' + content + '<|im_end|>' + '\n' }}
{%- elif message.role == "assistant" %}
{%- set reasoning_content = '' %}
{%- if message.reasoning_content is string %}
{%- set reasoning_content = message.reasoning_content %}
{%- else %}
{%- if '</think>' in content %}
{%- set reasoning_content = content.split('</think>')[0].rstrip('\n').split('<think>')[-1].lstrip('\n') %}
{%- set content = content.split('</think>')[-1].lstrip('\n') %}
{%- endif %}
{%- endif %}
{%- if loop.index0 > ns.last_query_index %}
{%- if loop.last or (not loop.last and reasoning_content) %}
{{- '<|im_start|>' + message.role + '\n<think>\n' + reasoning_content.strip('\n') + '\n</think>\n\n' + content.lstrip('\n') }}
{%- else %}
{{- '<|im_start|>' + message.role + '\n' + content }}
{%- endif %}
{%- else %}
{{- '<|im_start|>' + message.role + '\n' + content }}
{%- endif %}
{%- if message.tool_calls %}
{%- for tool_call in message.tool_calls %}
{%- if (loop.first and content) or (not loop.first) %}
{{- '\n' }}
{%- endif %}
{%- if tool_call.function %}
{%- set tool_call = tool_call.function %}
{%- endif %}
{{- '<tool_call>\n{"name": "' }}
{{- tool_call.name }}
{{- '", "arguments": ' }}
{%- if tool_call.arguments is string %}
{{- tool_call.arguments }}
{%- else %}
{{- tool_call.arguments | tojson }}
{%- endif %}
{{- '}\n</tool_call>' }}
{%- endfor %}
{%- endif %}
{{- '<|im_end|>\n' }}
{%- elif message.role == "tool" %}
{%- if loop.first or (messages[loop.index0 - 1].role != "tool") %}
{{- '<|im_start|>user' }}
{%- endif %}
{{- '\n<tool_response>\n' }}
{{- content }}
{{- '\n</tool_response>' }}
{%- if loop.last or (messages[loop.index0 + 1].role != "tool") %}
{{- '<|im_end|>\n' }}
{%- endif %}
{%- endif %}
{%- endfor %}
{%- if add_generation_prompt %}
{{- '<|im_start|>assistant\n' }}
{%- if enable_thinking is defined and enable_thinking is false %}
{{- '<think>\n\n</think>\n\n' }}
{%- endif %}
{%- endif %}

38
config.json Normal file

@@ -0,0 +1,38 @@
{
  "architectures": [
    "Qwen3MoeForCausalLM"
  ],
  "attention_bias": false,
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "decoder_sparse_step": 1,
  "eos_token_id": 151645,
  "head_dim": 128,
  "hidden_act": "silu",
  "hidden_size": 2048,
  "initializer_range": 0.02,
  "intermediate_size": 6144,
  "max_position_embeddings": 40960,
  "max_window_layers": 48,
  "mlp_only_layers": [],
  "model_type": "qwen3_moe",
  "moe_intermediate_size": 768,
  "norm_topk_prob": true,
  "num_attention_heads": 32,
  "num_experts": 64,
  "num_experts_per_tok": 8,
  "num_hidden_layers": 48,
  "num_key_value_heads": 4,
  "output_router_logits": false,
  "rms_norm_eps": 1e-06,
  "rope_scaling": null,
  "rope_theta": 1000000.0,
  "router_aux_loss_coef": 0.001,
  "sliding_window": null,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.55.0",
  "use_cache": true,
  "use_sliding_window": false,
  "vocab_size": 151936
}

13
generation_config.json Normal file

@@ -0,0 +1,13 @@
{
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "temperature": 0.6,
  "top_k": 20,
  "top_p": 0.95,
  "transformers_version": "4.55.0"
}

151388
merges.txt Normal file

File diff suppressed because it is too large Load Diff


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:343e05b7d1ef0da755b43857cc6fd7a7024e6746f8188421eecadf34c48382fa
size 5000093080


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e8059d687853d725090941451f2e34fb18e9056f5564cce02321f5a48f9c6f1c
size 4997775080


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a30af721a7d2cbc69f14c1fae5e189b07cb77ea77987ed08d593937f4a5e2ddc
size 4997775720


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:222a8658711ed92b19469fbea7462e235fd17bc32cb041f8bd181c8bd1a4ae14
size 4997775704


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:daedf7635da71ec1bef5bf48e4af12de389af8b77d7b133223333c2b8ae6f9ac
size 4997505344


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:eef1450e55d8ddecab0d6dabad8298bfed074bf51c2dd149e048775538e0e06f
size 4997775712


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8a9ffd412a93627b6aece8f25732b25fc90c4f227cdea03e5060ab0cefc0d30f
size 2073121000

9659
model.safetensors.index.json Normal file

File diff suppressed because it is too large Load Diff

77
reap_args.yaml Normal file

@@ -0,0 +1,77 @@
cluster_args:
  cluster_description: null
  cluster_method: agglomerative
  compression_ratio: 0.5
  expert_sim: ttm
  frequency_penalty: true
  linkage_method: average
  max_cluster_size: null
  multi_layer: null
  num_clusters: null
  singleton_outlier_experts: false
  singleton_super_experts: false
  softmax_temperature: null
ds_args:
  dataset_config_name: null
  dataset_name: theblackcat102/evol-codealpaca-v1
  dataset_test_split: test
  shuffle: true
  split: train
eval_args:
  evalplus_tasks:
  - mbpp
  - humaneval
  greedy: true
  lm_eval_tasks:
  - winogrande
  - arc_challenge
  - arc_easy
  - boolq
  - hellaswag
  - mmlu
  - openbookqa
  - rte
  min_p: 0.0
  parallel_tasks: 32
  results_dir: null
  run_evalplus: true
  run_livecodebench: true
  run_lm_eval: true
  run_math: false
  run_wildbench: false
  server_log_file_name: pruning-cli-0.log
  temperature: 0.7
  top_k: 20
  top_p: 0.8
  use_server: true
  vllm_port: 8000
model_args:
  model_name: Qwen/Qwen3-30B-A3B
  num_experts_per_tok_override: null
obs_args:
  batch_size: 1
  distance_measure: cosine
  model_max_length: 2048
  output_file_name: observations_1024_cosine-seed_42.pt
  overwrite_observations: false
  record_pruning_metrics_only: true
  renormalize_router_weights: true
  return_vllm_tokens_prompt: false
  samples_per_category: 1024
  select_only_categories: null
  split_by_category: false
  truncate: false
prune_args:
  n_experts_to_prune: null
  overwrite_pruned_model: false
  perserve_outliers: false
  perserve_super_experts: false
  prune_method: reap
reap_args:
  debug: false
  do_eval: false
  plot_clusters: true
  profile: false
  run_observer_only: false
  seed: 42
  smoke_test: true

31
special_tokens_map.json Normal file

@@ -0,0 +1,31 @@
{
"additional_special_tokens": [
"<|im_start|>",
"<|im_end|>",
"<|object_ref_start|>",
"<|object_ref_end|>",
"<|box_start|>",
"<|box_end|>",
"<|quad_start|>",
"<|quad_end|>",
"<|vision_start|>",
"<|vision_end|>",
"<|vision_pad|>",
"<|image_pad|>",
"<|video_pad|>"
],
"eos_token": {
"content": "<|im_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"pad_token": {
"content": "<|endoftext|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
}
}

BIN
tokenizer.json (Stored with Git LFS) Normal file

Binary file not shown.

239
tokenizer_config.json Normal file

@@ -0,0 +1,239 @@
{
"add_bos_token": false,
"add_prefix_space": false,
"added_tokens_decoder": {
"151643": {
"content": "<|endoftext|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151644": {
"content": "<|im_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151645": {
"content": "<|im_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151646": {
"content": "<|object_ref_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151647": {
"content": "<|object_ref_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151648": {
"content": "<|box_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151649": {
"content": "<|box_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151650": {
"content": "<|quad_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151651": {
"content": "<|quad_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151652": {
"content": "<|vision_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151653": {
"content": "<|vision_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151654": {
"content": "<|vision_pad|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151655": {
"content": "<|image_pad|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151656": {
"content": "<|video_pad|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151657": {
"content": "<tool_call>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151658": {
"content": "</tool_call>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151659": {
"content": "<|fim_prefix|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151660": {
"content": "<|fim_middle|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151661": {
"content": "<|fim_suffix|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151662": {
"content": "<|fim_pad|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151663": {
"content": "<|repo_name|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151664": {
"content": "<|file_sep|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151665": {
"content": "<tool_response>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151666": {
"content": "</tool_response>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151667": {
"content": "<think>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151668": {
"content": "</think>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
}
},
"additional_special_tokens": [
"<|im_start|>",
"<|im_end|>",
"<|object_ref_start|>",
"<|object_ref_end|>",
"<|box_start|>",
"<|box_end|>",
"<|quad_start|>",
"<|quad_end|>",
"<|vision_start|>",
"<|vision_end|>",
"<|vision_pad|>",
"<|image_pad|>",
"<|video_pad|>"
],
"bos_token": null,
"clean_up_tokenization_spaces": false,
"eos_token": "<|im_end|>",
"errors": "replace",
"extra_special_tokens": {},
"model_max_length": 131072,
"pad_token": "<|endoftext|>",
"split_special_tokens": false,
"tokenizer_class": "Qwen2Tokenizer",
"unk_token": null
}

1
vocab.json Normal file

File diff suppressed because one or more lines are too long