初始化项目,由ModelHub XC社区提供模型

Model: mncai/Qwen3-0.6B-v0.1
Source: Original Platform
This commit is contained in:
ModelHub XC
2026-06-10 06:18:12 +08:00
commit 2d11b20dae
8 changed files with 2439 additions and 0 deletions

51
.gitattributes vendored Normal file
View File

@@ -0,0 +1,51 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zstandard filter=lfs diff=lfs merge=lfs -text
*.tfevents* filter=lfs diff=lfs merge=lfs -text
*.db* filter=lfs diff=lfs merge=lfs -text
*.ark* filter=lfs diff=lfs merge=lfs -text
**/*ckpt*data* filter=lfs diff=lfs merge=lfs -text
**/*ckpt*.meta filter=lfs diff=lfs merge=lfs -text
**/*ckpt*.index filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.gguf* filter=lfs diff=lfs merge=lfs -text
*.ggml filter=lfs diff=lfs merge=lfs -text
*.llamafile* filter=lfs diff=lfs merge=lfs -text
*.pt2 filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
pytorch_model.bin filter=lfs diff=lfs merge=lfs -text
merges.txt filter=lfs diff=lfs merge=lfs -text
vocab.json filter=lfs diff=lfs merge=lfs -text

62
README.md Normal file
View File

@@ -0,0 +1,62 @@
---
license: apache-2.0
base_model: "Qwen/Qwen3-0.6B"
tags:
- text-generation
- deepspeed
- fine-tuned
language:
- en
library_name: transformers
pipeline_tag: text-generation
---
# Qwen3-0.6B-v0.1
DeepSpeed-Chat으로 파인튜닝된 언어 모델
## Model Details
이 모델은 DeepSpeed-Chat을 사용하여 파인튜닝된 모델입니다.
- **Base Model**: 기본 모델 정보를 여기에 추가하세요
- **Fine-tuning Method**: DeepSpeed-Chat
- **Training Data**: 학습 데이터 정보를 여기에 추가하세요
## Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("mncai/Qwen3-0.6B-v0.1")
model = AutoModelForCausalLM.from_pretrained("mncai/Qwen3-0.6B-v0.1")
# 텍스트 생성
input_text = "Your prompt here"
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs, max_length=100)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
```
## Training Details
- **Training Framework**: DeepSpeed
- **Training Script**: DeepSpeed-Chat Step 1 Supervised Fine-tuning
- **Upload Date**: N/A
## Limitations and Biases
이 모델의 한계점과 편향성에 대한 정보를 여기에 추가하세요.
## Citation
DeepSpeed-Chat을 사용했다면 다음을 인용해주세요:
```
@misc{deepspeed-chat,
title={DeepSpeed-Chat: Easy, Fast and Affordable RLHF Training of ChatGPT-like Models at All Scales},
author={Yuxiao Zhuang et al.},
year={2023},
url={https://github.com/microsoft/DeepSpeed}
}
```

62
config.json Normal file
View File

@@ -0,0 +1,62 @@
{
"architectures": [
"Qwen3ForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 151643,
"end_token_id": 151645,
"eos_token_id": 151645,
"head_dim": 128,
"hidden_act": "silu",
"hidden_size": 1024,
"initializer_range": 0.02,
"intermediate_size": 3072,
"layer_types": [
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention"
],
"max_position_embeddings": 40960,
"max_window_layers": 28,
"model_type": "qwen3",
"num_attention_heads": 16,
"num_hidden_layers": 28,
"num_key_value_heads": 8,
"pad_token_id": 151645,
"rms_norm_eps": 1e-06,
"rope_scaling": null,
"rope_theta": 1000000,
"sliding_window": null,
"tie_word_embeddings": true,
"torch_dtype": "float32",
"transformers_version": "4.53.0",
"use_cache": true,
"use_sliding_window": false,
"vocab_size": 151672
}

1
configuration.json Normal file
View File

@@ -0,0 +1 @@
{"framework": "pytorch", "task": "text-generation", "allow_remote": true}

BIN
merges.txt (Stored with Git LFS) Normal file

Binary file not shown.

3
pytorch_model.bin Normal file
View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:05568f391e8e2bb14dde542ba027cc2a37bbc80978ff922b7869b825e4d99358
size 1191623484

2254
training.log Normal file

File diff suppressed because it is too large Load Diff

BIN
vocab.json (Stored with Git LFS) Normal file

Binary file not shown.