Files
Qwen3Fangwusha14B/README.md
ModelHub XC 7b99a9a807 初始化项目,由ModelHub XC社区提供模型
Model: Yougen/Qwen3Fangwusha14B
Source: Original Platform
2026-04-26 14:48:02 +08:00

306 lines
9.4 KiB
Markdown
Raw Permalink Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

---
license: apache-2.0
language:
- zh
tags:
- qwen3
- fangwusha
- text-generation
- chinese-llm
- 15b
library_name: transformers
pipeline_tag: text-generation
base_model: Qwen/Qwen3-14B
---
# Model Card for Yougen/Qwen3Fangwusha14B
<!-- Provide a quick summary of what the model is/does. -->
Qwen3Fangwusha14B是基于Qwen3-14B进行微调的中文大语言模型专注于提升中文对话能力、指令遵循和通用任务表现。该模型属于Fangwusha系列旨在为中文用户提供高质量、安全可靠的AI助手服务。
## Model Details
### Model Description
<!-- Provide a longer summary of what this model is. -->
Qwen3Fangwusha14B是一个150亿参数的自回归语言模型在Qwen3-14B基础上通过高质量中文数据集进行了进一步微调。模型采用BF16精度训练优化了中文语义理解、逻辑推理和多轮对话能力适用于各种中文自然语言处理任务。
- **Developed by:** Yougen Yuan
- **Funded by [optional]:** [More Information Needed]
- **Shared by [optional]:** Yougen Yuan
- **Model type:** Auto-regressive language model (Decoder-only)
- **Language(s) (NLP):** 中文 (zh), 英文 (en)
- **License:** Apache-2.0
- **Finetuned from model [optional]:** Qwen/Qwen3-14B
### Model Sources [optional]
<!-- Provide the basic links for the model. -->
- **Repository:** https://huggingface.co/Yougen/Qwen3Fangwusha14B
- **Paper [optional]:** [More Information Needed]
- **Demo [optional]:** [More Information Needed]
## Uses
<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
### Direct Use
<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
该模型可直接用于以下任务:
- 中文对话与问答
- 文本生成与续写
- 信息提取与总结
- 翻译与语言转换
- 代码辅助与解释
- 创意写作与内容创作
### Downstream Use [optional]
<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
该模型可进一步微调用于:
- 特定领域知识库问答
- 客户服务机器人
- 教育辅导系统
- 企业内部智能助手
- 内容审核与分类
### Out-of-Scope Use
<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
该模型不应用于:
- 生成违法、有害、暴力或歧视性内容
- 未经授权的医疗诊断、法律建议或金融投资建议
- 冒充他人或进行欺诈活动
- 生成可能侵犯知识产权的内容
- 高风险决策系统(如自动驾驶、医疗设备控制等)
## Bias, Risks, and Limitations
<!-- This section is meant to convey both technical and sociotechnical limitations. -->
- 模型可能会生成不准确、不完整或误导性的信息,特别是在处理专业领域知识时
- 模型可能会反映训练数据中存在的偏见和刻板印象
- 模型在处理长文本时可能会出现上下文理解能力下降的情况
- 模型可能会产生幻觉,编造不存在的事实或引用
- 模型的英文能力相对中文较弱
### Recommendations
<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
用户在使用该模型时应:
- 对模型生成的内容进行事实核查和验证
- 意识到模型可能存在的偏见和局限性
- 在高风险场景中谨慎使用,必要时咨询专业人士
- 遵守相关法律法规和道德规范
- 报告任何有害或不当的模型输出
## How to Get Started with the Model
Use the code below to get started with the model.
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch
model_name = "Yougen/Qwen3Fangwusha14B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
model_name,
torch_dtype=torch.bfloat16,
device_map="auto"
)
prompt = "你好,请介绍一下你自己。"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
**inputs,
max_new_tokens=512,
temperature=0.7,
top_p=0.9,
do_sample=True
)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
## Training Details
### Training Data
<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
该模型使用了多种高质量中文数据集进行微调,包括:
- 通用对话数据集
- 指令遵循数据集
- 知识问答数据集
- 逻辑推理数据集
所有数据集均经过严格的质量过滤和去重处理,确保训练数据的质量和多样性。
### Training Procedure
<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
#### Preprocessing [optional]
训练数据经过了以下预处理步骤:
- 文本清洗和标准化
- 格式统一和规范化
- 质量过滤和去重
- 数据增强和多样化
#### Training Hyperparameters
- **Training regime:** BF16 mixed precision
- **Optimizer:** AdamW
- **Learning rate:** [More Information Needed]
- **Batch size:** [More Information Needed]
- **Epochs:** [More Information Needed]
- **Warmup steps:** [More Information Needed]
- **Weight decay:** [More Information Needed]
#### Speeds, Sizes, Times [optional]
<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
- **Model size:** 15B parameters
- **Checkpoint size:** ~30GB (BF16)
- **Training duration:** [More Information Needed]
- **Training hardware:** [More Information Needed]
## Evaluation
<!-- This section describes the evaluation protocols and provides the results. -->
### Testing Data, Factors & Metrics
#### Testing Data
<!-- This should link to a Dataset Card if possible. -->
模型在以下基准测试集上进行了评估:
- C-Eval (中文通用能力评估)
- MMLU (多任务语言理解)
- GSM8K (数学推理)
- HumanEval (代码生成)
#### Factors
<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
评估涵盖了以下维度:
- 知识掌握程度
- 逻辑推理能力
- 指令遵循能力
- 中文理解与生成能力
- 代码生成能力
#### Metrics
<!-- These are the evaluation metrics being used, ideally with a description of why. -->
- **Accuracy:** 用于知识问答和选择题任务
- **Pass@k:** 用于代码生成任务
- **BLEU/ROUGE:** 用于文本生成和翻译任务
- **Human evaluation:** 用于对话质量和整体表现评估
### Results
[More Information Needed]
#### Summary
[More Information Needed]
## Model Examination [optional]
<!-- Relevant interpretability work for the model goes here -->
[More Information Needed]
## Environmental Impact
<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
Carbon emissions can be estimated using the [Machine Learning Impact calculator](sslocal://flow/file_open?url=https%3A%2F%2Fmlco2.github.io%2Fimpact%23compute&flow_extra=eyJsaW5rX3R5cGUiOiJjb2RlX2ludGVycHJldGVyIn0=) presented in [Lacoste et al. (2019)](sslocal://flow/file_open?url=https%3A%2F%2Farxiv.org%2Fabs%2F1910.09700&flow_extra=eyJsaW5rX3R5cGUiOiJjb2RlX2ludGVycHJldGVyIn0=).
- **Hardware Type:** [More Information Needed]
- **Hours used:** [More Information Needed]
- **Cloud Provider:** [More Information Needed]
- **Compute Region:** [More Information Needed]
- **Carbon Emitted:** [More Information Needed]
## Technical Specifications [optional]
### Model Architecture and Objective
该模型基于Qwen3架构采用解码器-only的Transformer结构
- 上下文窗口大小:[More Information Needed]
- 注意力机制Grouped-Query Attention (GQA)
- 激活函数SwiGLU
- 词表大小:[More Information Needed]
### Compute Infrastructure
[More Information Needed]
#### Hardware
[More Information Needed]
#### Software
- **Framework:** PyTorch 2.x
- **Training library:** LLaMA-Factory
- **Inference library:** Transformers 4.x
- **Acceleration:** FlashAttention-2
## Citation [optional]
<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
**BibTeX:**
```bibtex
@misc{qwen3fangwusha14b,
author = {Yuan, Yougen},
title = {Qwen3Fangwusha14B: A Fine-tuned Chinese Large Language Model},
year = {2026},
publisher = {Hugging Face},
howpublished = {\url{https://huggingface.co/Yougen/Qwen3Fangwusha14B}}
}
```
**APA:**
Yuan, Y. (2026). Qwen3Fangwusha14B: A Fine-tuned Chinese Large Language Model. Hugging Face. https://huggingface.co/Yougen/Qwen3Fangwusha14B
## Glossary [optional]
<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
[More Information Needed]
## More Information [optional]
该模型是Fangwusha系列的一部分更多相关模型可在以下集合中找到
- [Fangwusha Collection](sslocal://flow/file_open?url=https%3A%2F%2Fhuggingface.co%2Fcollections%2FYougen%2Ffangwusha-6615a7f8a7f8d9a7b8c6d5e4&flow_extra=eyJsaW5rX3R5cGUiOiJjb2RlX2ludGVycHJldGVyIn0=)
## Model Card Authors [optional]
Yougen Yuan
## Model Card Contact
[More Information Needed]