Initialize project; model provided by the ModelHub XC community

Model: voidful/Llama-3.1-TAIDE-R1-8B-Chat
Source: Original Platform
ModelHub XC
2026-06-01 08:09:12 +08:00
commit 86eb4e9a72
20 changed files with 2364 additions and 0 deletions

51
.gitattributes vendored Normal file

@@ -0,0 +1,51 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bin.* filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zstandard filter=lfs diff=lfs merge=lfs -text
*.tfevents* filter=lfs diff=lfs merge=lfs -text
*.db* filter=lfs diff=lfs merge=lfs -text
*.ark* filter=lfs diff=lfs merge=lfs -text
**/*ckpt*data* filter=lfs diff=lfs merge=lfs -text
**/*ckpt*.meta filter=lfs diff=lfs merge=lfs -text
**/*ckpt*.index filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.gguf* filter=lfs diff=lfs merge=lfs -text
*.ggml filter=lfs diff=lfs merge=lfs -text
*.llamafile* filter=lfs diff=lfs merge=lfs -text
*.pt2 filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
.ipynb_checkpoints/tokenizer-checkpoint.json filter=lfs diff=lfs merge=lfs -text
tokenizer.json filter=lfs diff=lfs merge=lfs -text
tokenizer_config.json filter=lfs diff=lfs merge=lfs -text


@@ -0,0 +1,16 @@
{
"bos_token": {
"content": "<|begin_of_text|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"eos_token": {
"content": "<|eot_id|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
}
}


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:28c3050d409c8efc2f75afc12eeec40d3f144a9dcbf29342d51ace4c96f8c283
size 28257705

File diff suppressed because it is too large

86
README.md Normal file

@@ -0,0 +1,86 @@
---
base_model:
- deepseek-ai/DeepSeek-R1-Distill-Llama-8B
- taide/Llama-3.1-TAIDE-LX-8B-Chat
- meta-llama/Llama-3.1-8B-Instruct
library_name: transformers
tags:
- mergekit
- merge
---
# voidful/Llama-3.1-TAIDE-R1-8B-Chat
This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
## Usage
```python
from transformers import AutoTokenizer
from vllm import LLM, SamplingParams

model_name = "voidful/Llama-3.1-TAIDE-R1-8B-Chat"
llm = LLM(model=model_name, max_model_len=4096)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Traditional Chinese prompt, roughly:
# "Why does drinking milk tea from a breakfast shop cause diarrhea?"
messages = [
    {"role": "user", "content": "早餐喝早餐店的奶茶會導致烙賽為什麼?"},
]

# Render the chat template to a plain prompt string for vLLM.
prompts = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=False,
)

sampling_params = SamplingParams(temperature=0.6, max_tokens=512, top_p=0.9)
outputs = llm.generate(prompts, sampling_params)
print(prompts)
print(f"{outputs[0].outputs[0].text}\n")
```
Output
```
<think> 關於「早餐喝早餐店的奶茶會導致烙賽」的問題,可能的原因有幾種可能的解釋。首先,「烙賽」這個詞在台灣的網路用語中,通常指的是「燒腸」或「拉肚子」的意思,指的是人體的腸胃或腸道發生不舒服的狀況,可能是消化不良、腹泻、或其他腸胃道的問題。所以,喝了不健康的飲料,可能會導致腸胃不舒服,引起「烙賽」的反應。
其次,一個可能的原因是,早餐店的奶茶可能使用了低品質的奶源、含糖或含奶精等添加物。奶精是一種人工添加劑,可能會對胃造成刺激或不舒服的感覺。再者,早餐的奶茶可能是用即溶的粉末或濃縮的奶來泡的,這些東西可能含有許多添加劑或不健康的成分。
最後,個人的體質也是一個因素。有人可能對奶或糖有過敏或不耐受的反應,喝了之後就會出現不舒服的症狀。
綜合上述的原因,早餐喝早餐店的奶茶可能會導致烙賽的原因有:使用低品質的奶源、含糖或奶精等添加劑、個人的體質對奶或糖有過敏或不耐受的反應等。
<answer> 早餐喝早餐店的奶茶可能導致烙賽的原因有低品質的奶源、含糖或奶精等添加劑、以及個人的體質對奶或糖有過敏或不耐受的反應等。這是因為不健康的飲料成分可能會對身體造成不舒服的影響。</answer>
```
## Merge Details
### Merge Method
This model was merged with the [SCE](https://arxiv.org/abs/2408.07990) merge method, with [meta-llama/Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct) as the base model.
### Models Merged
The following models were included in the merge:
* [deepseek-ai/DeepSeek-R1-Distill-Llama-8B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-8B)
* [taide/Llama-3.1-TAIDE-LX-8B-Chat](https://huggingface.co/taide/Llama-3.1-TAIDE-LX-8B-Chat)
### Configuration
The following YAML configuration was used to produce this model:
```yaml
merge_method: sce
base_model: meta-llama/Llama-3.1-8B-Instruct
tokenizer:
  source: taide/Llama-3.1-TAIDE-LX-8B-Chat
models:
  - model: taide/Llama-3.1-TAIDE-LX-8B-Chat
  - model: deepseek-ai/DeepSeek-R1-Distill-Llama-8B
```

40
config.json Normal file

@@ -0,0 +1,40 @@
{
"_name_or_path": "meta-llama/Llama-3.1-8B-Instruct",
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 128000,
"eos_token_id": [
128001,
128008,
128009
],
"head_dim": 128,
"hidden_act": "silu",
"hidden_size": 4096,
"initializer_range": 0.02,
"intermediate_size": 14336,
"max_position_embeddings": 131072,
"mlp_bias": false,
"model_type": "llama",
"num_attention_heads": 32,
"num_hidden_layers": 32,
"num_key_value_heads": 8,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": {
"factor": 8.0,
"high_freq_factor": 4.0,
"low_freq_factor": 1.0,
"original_max_position_embeddings": 8192,
"rope_type": "llama3"
},
"rope_theta": 500000.0,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.49.0",
"use_cache": true,
"vocab_size": 188256
}

1
configuration.json Normal file

@@ -0,0 +1 @@
{"framework": "pytorch", "task": "text-generation", "allow_remote": true}


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:de3ba823ab5294c7cc3da417c428b94841d1a05b2ad7c13d4f5d659b12a594f4
size 17053084576

7
mergekit_config.yml Normal file

@@ -0,0 +1,7 @@
merge_method: sce
base_model: meta-llama/Llama-3.1-8B-Instruct
tokenizer:
  source: taide/Llama-3.1-TAIDE-LX-8B-Chat
models:
  - model: taide/Llama-3.1-TAIDE-LX-8B-Chat
  - model: deepseek-ai/DeepSeek-R1-Distill-Llama-8B


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b0ad7dd1c1c957c27254a3bb48bf4544b914834645117f0ca9c3fd621e16164b
size 4946735656


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:0a1382cc17c1649bb63b4acac7feb3284a245bd990b5daf8c3e22f1f1b057d47
size 4915916176


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:0a5312676ffc048ec7c89be7483140d39b7b4ecd4f718e061ab8fb39708a3b80
size 4999819328


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:998a07afa0e032683e5c2b1bc8374cb1db721fda330a119f75e0fe3fa4a6452f
size 2181125184
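
As a rough consistency check, the parameter count implied by `config.json` can be compared with the combined size of the four pointers above. This is a sketch under two assumptions: that the model follows the standard Llama layout (untied embeddings, grouped-query attention, gated MLP, two RMSNorm vectors per layer, no biases), and that these four LFS pointers are the bfloat16 weight shards, which the diff view does not name. `llama_param_count` is an illustrative helper, not a library function.

```python
# Values copied from config.json in this commit.
cfg = {
    "hidden_size": 4096,
    "intermediate_size": 14336,
    "num_hidden_layers": 32,
    "num_attention_heads": 32,
    "num_key_value_heads": 8,
    "head_dim": 128,
    "vocab_size": 188256,
    "tie_word_embeddings": False,
}


def llama_param_count(c):
    """Count weights for a standard Llama decoder (assumed layout, no biases)."""
    h = c["hidden_size"]
    q_dim = c["num_attention_heads"] * c["head_dim"]
    kv_dim = c["num_key_value_heads"] * c["head_dim"]
    attn = h * q_dim + 2 * h * kv_dim + q_dim * h  # q, k, v, o projections
    mlp = 3 * h * c["intermediate_size"]           # gate, up, down projections
    norms = 2 * h                                  # two RMSNorm vectors per layer
    embed = c["vocab_size"] * h
    lm_head = 0 if c["tie_word_embeddings"] else embed
    # Trailing "+ h" accounts for the final RMSNorm.
    return c["num_hidden_layers"] * (attn + mlp + norms) + embed + lm_head + h


# Sizes of the four LFS pointers listed above (assumed to be the weight shards).
shard_bytes = [4946735656, 4915916176, 4999819328, 2181125184]

params = llama_param_count(cfg)
weight_bytes = params * 2  # bfloat16 stores 2 bytes per parameter
print(params)                           # 8521781248
print(sum(shard_bytes) - weight_bytes)  # 33848
```

The remainder of about 33 KB is small enough to be per-file metadata, which is consistent with the shards holding exactly these weights in bfloat16.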

File diff suppressed because one or more lines are too long

7
params Normal file

@@ -0,0 +1,7 @@
{
"stop": ["</answer>","<|start_header_id|>","<|end_header_id|>","<|eot_id|>"],
"temperature": 0.6,
"top_p": 0.9,
"repeat_penalty": 1.1,
"num_ctx": 8192
}
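
The option names in `params` look like Ollama-style settings. A minimal sketch of how they might be carried over to the vLLM call used in the README follows. The mapping itself is an assumption (`repeat_penalty` to `repetition_penalty`, `num_ctx` to `max_model_len`), the two penalties are not guaranteed to behave identically, and `to_vllm_kwargs` is a hypothetical helper.

```python
import json

# Contents of the params file in this commit.
PARAMS_JSON = """
{
  "stop": ["</answer>", "<|start_header_id|>", "<|end_header_id|>", "<|eot_id|>"],
  "temperature": 0.6,
  "top_p": 0.9,
  "repeat_penalty": 1.1,
  "num_ctx": 8192
}
"""


def to_vllm_kwargs(params):
    """Split the options into sampling kwargs and engine kwargs (assumed mapping)."""
    sampling = {
        "stop": params["stop"],
        "temperature": params["temperature"],
        "top_p": params["top_p"],
        "repetition_penalty": params["repeat_penalty"],
    }
    # The context window is an engine-level setting rather than a sampling one.
    engine = {"max_model_len": params["num_ctx"]}
    return sampling, engine


sampling_kwargs, engine_kwargs = to_vllm_kwargs(json.loads(PARAMS_JSON))
print(sampling_kwargs)
print(engine_kwargs)
```

The two dictionaries could then be passed as `SamplingParams(**sampling_kwargs)` and `LLM(model=model_name, **engine_kwargs)`.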

16
special_tokens_map.json Normal file

@@ -0,0 +1,16 @@
{
"bos_token": {
"content": "<|begin_of_text|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"eos_token": {
"content": "<|eot_id|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
}
}

4
system Normal file
View File

@@ -0,0 +1,4 @@
你是一個來自台灣的AI助理你的名字是 TAIDE樂於以台灣人的立場幫助使用者會用繁體中文回答問題。
You first think about the reasoning process in the mind and then provide the user with the answer while reasoning step by step, and putting the final answer within \\boxed{}.
The reasoning process and answer are enclosed within <think> </think> and <answer> </answer> tags, respectively, i.e.,
<think> reasoning process here </think><answer> answer here </answer>.
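
A caller will usually want to separate the reasoning from the final answer. The sketch below is one way to do that with regular expressions; `split_reasoning` is a hypothetical helper. It is deliberately tolerant of missing closing tags, since the sample output in the README has no `</think>` and the `params` file lists `</answer>` as a stop string.

```python
import re


def split_reasoning(text):
    """Return (reasoning, answer); missing closing tags are tolerated."""
    # Reasoning ends at </think>, at the start of <answer>, or at end of text.
    think = re.search(r"<think>(.*?)(?:</think>|<answer>|$)", text, re.DOTALL)
    # The answer ends at </answer> or at end of text.
    answer = re.search(r"<answer>(.*?)(?:</answer>|$)", text, re.DOTALL)
    reasoning = think.group(1).strip() if think else ""
    final = answer.group(1).strip() if answer else ""
    return reasoning, final


sample = "<think> step one, step two <answer> forty-two"
reasoning, final = split_reasoning(sample)
print(reasoning)  # step one, step two
print(final)      # forty-two
```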

48
template Normal file
View File

@@ -0,0 +1,48 @@
{{- if or .System .Tools }}<|start_header_id|>system<|end_header_id|>
{{- if .System }}
{{ .System }}
{{- end }}
{{- if .Tools }}
Cutting Knowledge Date: December 2023
When you receive a tool call response, use the output to format an answer to the orginal user question.
You are a helpful assistant with tool calling capabilities.
{{- end }}<|eot_id|>
{{- end }}
{{- range $i, $_ := .Messages }}
{{- $last := eq (len (slice $.Messages $i)) 1 }}
{{- if eq .Role "user" }}<|start_header_id|>user<|end_header_id|>
{{- if and $.Tools $last }}
Given the following functions, please respond with a JSON for a function call with its proper arguments that best answers the given prompt.
Respond in the format {"name": function name, "parameters": dictionary of argument name and its value}. Do not use variables.
{{ range $.Tools }}
{{- . }}
{{ end }}
Question: {{ .Content }}<|eot_id|>
{{- else }}
{{ .Content }}<|eot_id|>
{{- end }}{{ if $last }}<|start_header_id|>assistant<|end_header_id|>
{{ end }}
{{- else if eq .Role "assistant" }}<|start_header_id|>assistant<|end_header_id|>
{{- if .ToolCalls }}
{{ range .ToolCalls }}
{"name": "{{ .Function.Name }}", "parameters": {{ .Function.Arguments }}}{{ end }}
{{- else }}
{{ .Content }}
{{- end }}{{ if not $last }}<|eot_id|>{{ end }}
{{- else if eq .Role "tool" }}<|start_header_id|>ipython<|end_header_id|>
{{ .Content }}<|eot_id|>{{ if $last }}<|start_header_id|>assistant<|end_header_id|>
{{ end }}
{{- end }}
{{- end }}
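
For the common case without tools, the template above reduces to the Llama 3 header and `<|eot_id|>` layout. The Python sketch below mirrors that case only. It assumes two newlines after each header, as in the standard Llama 3 prompt format, because the diff view does not preserve the template's whitespace reliably. `render_prompt` is a hypothetical helper, and the tool-call branches are omitted.

```python
def render_prompt(messages, system=""):
    """Render system/user/assistant turns; tool calls are not handled."""
    parts = []
    if system:
        parts.append(
            f"<|start_header_id|>system<|end_header_id|>\n\n{system}<|eot_id|>"
        )
    for i, msg in enumerate(messages):
        last = i == len(messages) - 1
        role, content = msg["role"], msg["content"]
        if role == "user":
            parts.append(
                f"<|start_header_id|>user<|end_header_id|>\n\n{content}<|eot_id|>"
            )
            if last:
                # Open the assistant turn so the model continues from here.
                parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
        elif role == "assistant":
            parts.append(
                f"<|start_header_id|>assistant<|end_header_id|>\n\n{content}"
            )
            if not last:
                # A trailing assistant turn stays open, as in the template.
                parts.append("<|eot_id|>")
        else:
            raise ValueError(f"unsupported role: {role}")
    return "".join(parts)


prompt = render_prompt([{"role": "user", "content": "Hello"}], system="Be brief.")
print(prompt)
```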

3
tokenizer.json Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:28c3050d409c8efc2f75afc12eeec40d3f144a9dcbf29342d51ace4c96f8c283
size 28257705

3
tokenizer_config.json Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:1e16450e13bce13660a045c5554ad4a6050bc77a48aa64406483d690392c8c0e
size 10504183
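
Several files in this commit are stored as Git LFS pointers: three `key value` lines giving the spec version, the object hash, and the size in bytes. A minimal sketch of reading one is shown below; `parse_lfs_pointer` is a hypothetical helper, and it assumes the pointer contains only these three fields.

```python
def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer for tokenizer_config.json, copied from this commit.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:1e16450e13bce13660a045c5554ad4a6050bc77a48aa64406483d690392c8c0e
size 10504183
"""

info = parse_lfs_pointer(pointer)
print(info["hash_algo"], info["size"])  # sha256 10504183
```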