Qwen3Fangwusha14B/README.md

---
license: apache-2.0
language:
- zh
tags:
- qwen3
- fangwusha
- text-generation
- chinese-llm
- 15b
library_name: transformers
pipeline_tag: text-generation
base_model: Qwen/Qwen3-14B
---

# Model Card for Yougen/Qwen3Fangwusha14B

<!-- Provide a quick summary of what the model is/does. -->

Qwen3Fangwusha14B是基于Qwen3-14B进行微调的中文大语言模型，专注于提升中文对话能力、指令遵循和通用任务表现。该模型属于Fangwusha系列，旨在为中文用户提供高质量、安全可靠的AI助手服务。

## Model Details

### Model Description

<!-- Provide a longer summary of what this model is. -->

Qwen3Fangwusha14B是一个150亿参数的自回归语言模型，在Qwen3-14B基础上通过高质量中文数据集进行了进一步微调。模型采用BF16精度训练，优化了中文语义理解、逻辑推理和多轮对话能力，适用于各种中文自然语言处理任务。

- **Developed by:** Yougen Yuan
- **Funded by [optional]:** [More Information Needed]
- **Shared by [optional]:** Yougen Yuan
- **Model type:** Auto-regressive language model (Decoder-only)
- **Language(s) (NLP):** 中文 (zh), 英文 (en)
- **License:** Apache-2.0
- **Finetuned from model [optional]:** Qwen/Qwen3-14B

### Model Sources [optional]

<!-- Provide the basic links for the model. -->

- **Repository:** https://huggingface.co/Yougen/Qwen3Fangwusha14B
- **Paper [optional]:** [More Information Needed]
- **Demo [optional]:** [More Information Needed]

## Uses

<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->

### Direct Use

<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->

该模型可直接用于以下任务：
- 中文对话与问答
- 文本生成与续写
- 信息提取与总结
- 翻译与语言转换
- 代码辅助与解释
- 创意写作与内容创作

### Downstream Use [optional]

<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->

该模型可进一步微调用于：
- 特定领域知识库问答
- 客户服务机器人
- 教育辅导系统
- 企业内部智能助手
- 内容审核与分类

### Out-of-Scope Use

<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->

该模型不应用于：
- 生成违法、有害、暴力或歧视性内容
- 未经授权的医疗诊断、法律建议或金融投资建议
- 冒充他人或进行欺诈活动
- 生成可能侵犯知识产权的内容
- 高风险决策系统（如自动驾驶、医疗设备控制等）

## Bias, Risks, and Limitations

<!-- This section is meant to convey both technical and sociotechnical limitations. -->

- 模型可能会生成不准确、不完整或误导性的信息，特别是在处理专业领域知识时
- 模型可能会反映训练数据中存在的偏见和刻板印象
- 模型在处理长文本时可能会出现上下文理解能力下降的情况
- 模型可能会产生幻觉，编造不存在的事实或引用
- 模型的英文能力相对中文较弱

### Recommendations

<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->

用户在使用该模型时应：
- 对模型生成的内容进行事实核查和验证
- 意识到模型可能存在的偏见和局限性
- 在高风险场景中谨慎使用，必要时咨询专业人士
- 遵守相关法律法规和道德规范
- 报告任何有害或不当的模型输出

## How to Get Started with the Model

Use the code below to get started with the model.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_name = "Yougen/Qwen3Fangwusha14B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

prompt = "你好，请介绍一下你自己。"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.7,
    top_p=0.9,
    do_sample=True
)

response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```

## Training Details

### Training Data

<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

该模型使用了多种高质量中文数据集进行微调，包括：
- 通用对话数据集
- 指令遵循数据集
- 知识问答数据集
- 逻辑推理数据集

所有数据集均经过严格的质量过滤和去重处理，确保训练数据的质量和多样性。

### Training Procedure

<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->

#### Preprocessing [optional]

训练数据经过了以下预处理步骤：
- 文本清洗和标准化
- 格式统一和规范化
- 质量过滤和去重
- 数据增强和多样化

#### Training Hyperparameters

- **Training regime:** BF16 mixed precision
- **Optimizer:** AdamW
- **Learning rate:** [More Information Needed]
- **Batch size:** [More Information Needed]
- **Epochs:** [More Information Needed]
- **Warmup steps:** [More Information Needed]
- **Weight decay:** [More Information Needed]

#### Speeds, Sizes, Times [optional]

<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->

- **Model size:** 15B parameters
- **Checkpoint size:** ~30GB (BF16)
- **Training duration:** [More Information Needed]
- **Training hardware:** [More Information Needed]

## Evaluation

<!-- This section describes the evaluation protocols and provides the results. -->

### Testing Data, Factors & Metrics

#### Testing Data

<!-- This should link to a Dataset Card if possible. -->

模型在以下基准测试集上进行了评估：
- C-Eval (中文通用能力评估)
- MMLU (多任务语言理解)
- GSM8K (数学推理)
- HumanEval (代码生成)

#### Factors

<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->

评估涵盖了以下维度：
- 知识掌握程度
- 逻辑推理能力
- 指令遵循能力
- 中文理解与生成能力
- 代码生成能力

#### Metrics

<!-- These are the evaluation metrics being used, ideally with a description of why. -->

- **Accuracy:** 用于知识问答和选择题任务
- **Pass@k:** 用于代码生成任务
- **BLEU/ROUGE:** 用于文本生成和翻译任务
- **Human evaluation:** 用于对话质量和整体表现评估

### Results

[More Information Needed]

#### Summary

[More Information Needed]

## Model Examination [optional]

<!-- Relevant interpretability work for the model goes here -->

[More Information Needed]

## Environmental Impact

<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->

Carbon emissions can be estimated using the [Machine Learning Impact calculator](sslocal://flow/file_open?url=https%3A%2F%2Fmlco2.github.io%2Fimpact%23compute&flow_extra=eyJsaW5rX3R5cGUiOiJjb2RlX2ludGVycHJldGVyIn0=) presented in [Lacoste et al. (2019)](sslocal://flow/file_open?url=https%3A%2F%2Farxiv.org%2Fabs%2F1910.09700&flow_extra=eyJsaW5rX3R5cGUiOiJjb2RlX2ludGVycHJldGVyIn0=).

- **Hardware Type:** [More Information Needed]
- **Hours used:** [More Information Needed]
- **Cloud Provider:** [More Information Needed]
- **Compute Region:** [More Information Needed]
- **Carbon Emitted:** [More Information Needed]

## Technical Specifications [optional]

### Model Architecture and Objective

该模型基于Qwen3架构，采用解码器-only的Transformer结构：
- 上下文窗口大小：[More Information Needed]
- 注意力机制：Grouped-Query Attention (GQA)
- 激活函数：SwiGLU
- 词表大小：[More Information Needed]

### Compute Infrastructure

[More Information Needed]

#### Hardware

[More Information Needed]

#### Software

- **Framework:** PyTorch 2.x
- **Training library:** LLaMA-Factory
- **Inference library:** Transformers 4.x
- **Acceleration:** FlashAttention-2

## Citation [optional]

<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->

**BibTeX:**

```bibtex
@misc{qwen3fangwusha14b,
  author = {Yuan, Yougen},
  title = {Qwen3Fangwusha14B: A Fine-tuned Chinese Large Language Model},
  year = {2026},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/Yougen/Qwen3Fangwusha14B}}
}
```

**APA:**

Yuan, Y. (2026). Qwen3Fangwusha14B: A Fine-tuned Chinese Large Language Model. Hugging Face. https://huggingface.co/Yougen/Qwen3Fangwusha14B

## Glossary [optional]

<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->

[More Information Needed]

## More Information [optional]

该模型是Fangwusha系列的一部分，更多相关模型可在以下集合中找到：
- [Fangwusha Collection](sslocal://flow/file_open?url=https%3A%2F%2Fhuggingface.co%2Fcollections%2FYougen%2Ffangwusha-6615a7f8a7f8d9a7b8c6d5e4&flow_extra=eyJsaW5rX3R5cGUiOiJjb2RlX2ludGVycHJldGVyIn0=)

## Model Card Authors [optional]

Yougen Yuan

## Model Card Contact

[More Information Needed]
-												初始化项目，由ModelHub XC社区提供模型

Model: Yougen/Qwen3Fangwusha14B
Source: Original Platform

											
										
										
											2026-04-26 14:48:02 +08:00
+								---
 								license: apache-2.0
 								language:
 								- zh
 								tags:
 								- qwen3
 								- fangwusha
 								- text-generation
 								- chinese-llm
 								- 15b
 								library_name: transformers
 								pipeline_tag: text-generation
 								base_model: Qwen/Qwen3-14B
 								---
 								# Model Card for Yougen/Qwen3Fangwusha14B
 								<!-- Provide a quick summary of what the model is/does. -->
 								Qwen3Fangwusha14B是基于Qwen3-14B进行微调的中文大语言模型，专注于提升中文对话能力、指令遵循和通用任务表现。该模型属于Fangwusha系列，旨在为中文用户提供高质量、安全可靠的AI助手服务。
 								## Model Details
 								### Model Description
 								<!-- Provide a longer summary of what this model is. -->
 								Qwen3Fangwusha14B是一个150亿参数的自回归语言模型，在Qwen3-14B基础上通过高质量中文数据集进行了进一步微调。模型采用BF16精度训练，优化了中文语义理解、逻辑推理和多轮对话能力，适用于各种中文自然语言处理任务。
 								- **Developed by:** Yougen Yuan
 								- **Funded by [optional]:** [More Information Needed]
 								- **Shared by [optional]:** Yougen Yuan
 								- **Model type:** Auto-regressive language model (Decoder-only)
 								- **Language(s) (NLP):** 中文 (zh), 英文 (en)
 								- **License:** Apache-2.0
 								- **Finetuned from model [optional]:** Qwen/Qwen3-14B
 								### Model Sources [optional]
 								<!-- Provide the basic links for the model. -->
 								- **Repository:** https://huggingface.co/Yougen/Qwen3Fangwusha14B
 								- **Paper [optional]:** [More Information Needed]
 								- **Demo [optional]:** [More Information Needed]
 								## Uses
 								<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
 								### Direct Use
 								<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
 								该模型可直接用于以下任务：
 								- 中文对话与问答
 								- 文本生成与续写
 								- 信息提取与总结
 								- 翻译与语言转换
 								- 代码辅助与解释
 								- 创意写作与内容创作
 								### Downstream Use [optional]
 								<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
 								该模型可进一步微调用于：
 								- 特定领域知识库问答
 								- 客户服务机器人
 								- 教育辅导系统
 								- 企业内部智能助手
 								- 内容审核与分类
 								### Out-of-Scope Use
 								<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
 								该模型不应用于：
 								- 生成违法、有害、暴力或歧视性内容
 								- 未经授权的医疗诊断、法律建议或金融投资建议
 								- 冒充他人或进行欺诈活动
 								- 生成可能侵犯知识产权的内容
 								- 高风险决策系统（如自动驾驶、医疗设备控制等）
 								## Bias, Risks, and Limitations
 								<!-- This section is meant to convey both technical and sociotechnical limitations. -->
 								- 模型可能会生成不准确、不完整或误导性的信息，特别是在处理专业领域知识时
 								- 模型可能会反映训练数据中存在的偏见和刻板印象
 								- 模型在处理长文本时可能会出现上下文理解能力下降的情况
 								- 模型可能会产生幻觉，编造不存在的事实或引用
 								- 模型的英文能力相对中文较弱
 								### Recommendations
 								<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
 								用户在使用该模型时应：
 								- 对模型生成的内容进行事实核查和验证
 								- 意识到模型可能存在的偏见和局限性
 								- 在高风险场景中谨慎使用，必要时咨询专业人士
 								- 遵守相关法律法规和道德规范
 								- 报告任何有害或不当的模型输出
 								## How to Get Started with the Model
 								Use the code below to get started with the model.
 								```python
 								from transformers import AutoTokenizer, AutoModelForCausalLM
 								import torch
 								model_name = "Yougen/Qwen3Fangwusha14B"
 								tokenizer = AutoTokenizer.from_pretrained(model_name)
 								model = AutoModelForCausalLM.from_pretrained(
 								    model_name,
 								    torch_dtype=torch.bfloat16,
 								    device_map="auto"
 								)
 								prompt = "你好，请介绍一下你自己。"
 								inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
 								outputs = model.generate(
 								    **inputs,
 								    max_new_tokens=512,
 								    temperature=0.7,
 								    top_p=0.9,
 								    do_sample=True
 								)
 								response = tokenizer.decode(outputs[0], skip_special_tokens=True)
 								print(response)
 								```
 								## Training Details
 								### Training Data
 								<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
 								该模型使用了多种高质量中文数据集进行微调，包括：
 								- 通用对话数据集
 								- 指令遵循数据集
 								- 知识问答数据集
 								- 逻辑推理数据集
 								所有数据集均经过严格的质量过滤和去重处理，确保训练数据的质量和多样性。
 								### Training Procedure
 								<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
 								#### Preprocessing [optional]
 								训练数据经过了以下预处理步骤：
 								- 文本清洗和标准化
 								- 格式统一和规范化
 								- 质量过滤和去重
 								- 数据增强和多样化
 								#### Training Hyperparameters
 								- **Training regime:** BF16 mixed precision
 								- **Optimizer:** AdamW
 								- **Learning rate:** [More Information Needed]
 								- **Batch size:** [More Information Needed]
 								- **Epochs:** [More Information Needed]
 								- **Warmup steps:** [More Information Needed]
 								- **Weight decay:** [More Information Needed]
 								#### Speeds, Sizes, Times [optional]
 								<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
 								- **Model size:** 15B parameters
 								- **Checkpoint size:** ~30GB (BF16)
 								- **Training duration:** [More Information Needed]
 								- **Training hardware:** [More Information Needed]
 								## Evaluation
 								<!-- This section describes the evaluation protocols and provides the results. -->
 								### Testing Data, Factors & Metrics
 								#### Testing Data
 								<!-- This should link to a Dataset Card if possible. -->
 								模型在以下基准测试集上进行了评估：
 								- C-Eval (中文通用能力评估)
 								- MMLU (多任务语言理解)
 								- GSM8K (数学推理)
 								- HumanEval (代码生成)
 								#### Factors
 								<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
 								评估涵盖了以下维度：
 								- 知识掌握程度
 								- 逻辑推理能力
 								- 指令遵循能力
 								- 中文理解与生成能力
 								- 代码生成能力
 								#### Metrics
 								<!-- These are the evaluation metrics being used, ideally with a description of why. -->
 								- **Accuracy:** 用于知识问答和选择题任务
 								- **Pass@k:** 用于代码生成任务
 								- **BLEU/ROUGE:** 用于文本生成和翻译任务
 								- **Human evaluation:** 用于对话质量和整体表现评估
 								### Results
 								[More Information Needed]
 								#### Summary
 								[More Information Needed]
 								## Model Examination [optional]
 								<!-- Relevant interpretability work for the model goes here -->
 								[More Information Needed]
 								## Environmental Impact
 								<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
 								Carbon emissions can be estimated using the [Machine Learning Impact calculator](sslocal://flow/file_open?url=https%3A%2F%2Fmlco2.github.io%2Fimpact%23compute&flow_extra=eyJsaW5rX3R5cGUiOiJjb2RlX2ludGVycHJldGVyIn0=) presented in [Lacoste et al. (2019)](sslocal://flow/file_open?url=https%3A%2F%2Farxiv.org%2Fabs%2F1910.09700&flow_extra=eyJsaW5rX3R5cGUiOiJjb2RlX2ludGVycHJldGVyIn0=).
 								- **Hardware Type:** [More Information Needed]
 								- **Hours used:** [More Information Needed]
 								- **Cloud Provider:** [More Information Needed]
 								- **Compute Region:** [More Information Needed]
 								- **Carbon Emitted:** [More Information Needed]
 								## Technical Specifications [optional]
 								### Model Architecture and Objective
 								该模型基于Qwen3架构，采用解码器-only的Transformer结构：
 								- 上下文窗口大小：[More Information Needed]
 								- 注意力机制：Grouped-Query Attention (GQA)
 								- 激活函数：SwiGLU
 								- 词表大小：[More Information Needed]
 								### Compute Infrastructure
 								[More Information Needed]
 								#### Hardware
 								[More Information Needed]
 								#### Software
 								- **Framework:** PyTorch 2.x
 								- **Training library:** LLaMA-Factory
 								- **Inference library:** Transformers 4.x
 								- **Acceleration:** FlashAttention-2
 								## Citation [optional]
 								<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
 								**BibTeX:**
 								```bibtex
 								@misc{qwen3fangwusha14b,
 								  author = {Yuan, Yougen},
 								  title = {Qwen3Fangwusha14B: A Fine-tuned Chinese Large Language Model},
 								  year = {2026},
 								  publisher = {Hugging Face},
 								  howpublished = {\url{https://huggingface.co/Yougen/Qwen3Fangwusha14B}}
 								}
 								```
 								**APA:**
 								Yuan, Y. (2026). Qwen3Fangwusha14B: A Fine-tuned Chinese Large Language Model. Hugging Face. https://huggingface.co/Yougen/Qwen3Fangwusha14B
 								## Glossary [optional]
 								<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
 								[More Information Needed]
 								## More Information [optional]
 								该模型是Fangwusha系列的一部分，更多相关模型可在以下集合中找到：
 								- [Fangwusha Collection](sslocal://flow/file_open?url=https%3A%2F%2Fhuggingface.co%2Fcollections%2FYougen%2Ffangwusha-6615a7f8a7f8d9a7b8c6d5e4&flow_extra=eyJsaW5rX3R5cGUiOiJjb2RlX2ludGVycHJldGVyIn0=)
 								## Model Card Authors [optional]
 								Yougen Yuan
 								## Model Card Contact
 								[More Information Needed]