Initialize project; model provided by the ModelHub XC community
Model: Raghav-Singhal/normal-smollm-1p7b-500B-30n-2048sl-960gbsz
Source: Original Platform
README.md (new file, 39 lines)
@@ -0,0 +1,39 @@
---
language: en
license: apache-2.0
tags:
- smollm
- llama
- causal-lm
- pretraining
- base-model
model_type: llama
pipeline_tag: text-generation
---

# normal-smollm-1p7b-500B-30n-2048sl-960gbsz

This is the base (pretraining) checkpoint for a SmolLM2-style 1.7B model, converted to the Hugging Face `LlamaForCausalLM` format from a Megatron-LM distributed checkpoint.

## Details

- Parameters: ~1.7B
- Context length: 2048 tokens
- Vocab size: 49152
- Architecture: Llama (RMSNorm, SwiGLU, RoPE)
- Training: 500B tokens (pretraining)

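If you want to sanity-check the converted checkpoint against these numbers, the config can be inspected without downloading the weights. A minimal sketch (the repo id is the same placeholder used in the usage snippet below; expected values follow this card):

```python
from transformers import AutoConfig

# Placeholder repo id -- substitute the actual owner/name.
config = AutoConfig.from_pretrained("REPLACE_WITH_OWNER/normal-smollm-1p7b-500B-30n-2048sl-960gbsz")

print(config.model_type)               # expected: "llama"
print(config.vocab_size)               # expected: 49152
print(config.max_position_embeddings)  # expected: 2048
```
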
## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "REPLACE_WITH_OWNER/normal-smollm-1p7b-500B-30n-2048sl-960gbsz"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
```

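Since this is a plain causal LM, generation simply continues the prompt. A minimal sketch building on the snippet above (the prompt and decoding settings are illustrative):

```python
prompt = "The capital of France is"
inputs = tokenizer(prompt, return_tensors="pt")

# Greedy decoding; tune max_new_tokens and sampling parameters as needed.
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
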
## Notes

This is a base model (not instruction-tuned). For chat use, apply SFT/DPO on top of this checkpoint.

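For example, the SFT step could look like the following sketch using TRL. The library choice, dataset, and output directory are illustrative assumptions, not part of this release:

```python
# Assumes: pip install trl datasets
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Illustrative chat dataset; substitute your own SFT data.
dataset = load_dataset("trl-lib/Capybara", split="train")

trainer = SFTTrainer(
    model="REPLACE_WITH_OWNER/normal-smollm-1p7b-500B-30n-2048sl-960gbsz",
    train_dataset=dataset,
    args=SFTConfig(output_dir="smollm-1p7b-sft"),
)
trainer.train()
```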