Initialize project; model provided by the ModelHub XC community

Model: Raghav-Singhal/normal-smollm-1p7b-500B-30n-2048sl-960gbsz
Source: Original Platform
This commit is contained in:
ModelHub XC
2026-04-28 05:47:09 +08:00
commit 521510a44a
11 changed files with 294161 additions and 0 deletions

README.md Normal file

@@ -0,0 +1,39 @@
---
language: en
license: apache-2.0
tags:
- smollm
- llama
- causal-lm
- pretraining
- base-model
model_type: llama
pipeline_tag: text-generation
---
# normal-smollm-1p7b-500B-30n-2048sl-960gbsz
This is the base (pretraining) checkpoint for a SmolLM2-style 1.7B model, converted to Hugging Face
`LlamaForCausalLM` format from a Megatron-LM distributed checkpoint.
## Details
- Parameters: ~1.7B
- Context length: 2048
- Vocab size: 49152
- Architecture: Llama (RMSNorm, SwiGLU, RoPE)
- Training: 500B tokens (pretraining)
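
For orientation, the specs above map onto a Hugging Face `LlamaConfig` roughly as sketched below. Only the context length, vocab size, and architecture family come from this card; the hidden size, MLP width, layer count, and head counts are assumptions based on typical SmolLM2-1.7B settings, so defer to the checkpoint's shipped `config.json`.
```python
from transformers import LlamaConfig

# Sketch only: values marked "assumption" are NOT stated in this card
# and may differ from the checkpoint's actual config.json.
config = LlamaConfig(
    vocab_size=49152,              # stated above
    max_position_embeddings=2048,  # context length, stated above
    hidden_size=2048,              # assumption (typical SmolLM2-1.7B)
    intermediate_size=8192,        # SwiGLU MLP width, assumption
    num_hidden_layers=24,          # assumption
    num_attention_heads=32,        # assumption
    num_key_value_heads=32,        # assumption
    rms_norm_eps=1e-5,             # RMSNorm is stated; the eps value is an assumption
)
```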
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Replace the owner placeholder with the actual repository namespace.
model_id = "REPLACE_WITH_OWNER/normal-smollm-1p7b-500B-30n-2048sl-960gbsz"

# Load the tokenizer and model weights from the Hub.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
```
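
A short generation call then serves as a smoke test; the prompt here is arbitrary:
```python
# Greedy decoding with a small token budget, just to verify the load worked.
inputs = tokenizer("The capital of France is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```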
## Notes
This is a base model (not instruction-tuned). For chat use, apply SFT/DPO on top of this checkpoint.
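
As one possible shape for that step, the sketch below runs supervised fine-tuning with TRL's `SFTTrainer`. The dataset, output directory, and hyperparameters are placeholders rather than recommendations from this card, and a conversational dataset assumes the tokenizer carries a chat template, which a base-model tokenizer may not.
```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Placeholder chat-format dataset; substitute your own SFT data.
dataset = load_dataset("trl-lib/Capybara", split="train")

training_args = SFTConfig(
    output_dir="smollm-1p7b-sft",   # placeholder
    per_device_train_batch_size=1,  # placeholder; tune for your hardware
    num_train_epochs=1,             # placeholder
)

trainer = SFTTrainer(
    model="REPLACE_WITH_OWNER/normal-smollm-1p7b-500B-30n-2048sl-960gbsz",
    args=training_args,
    train_dataset=dataset,
)
trainer.train()
```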