Initialize project; model provided by the ModelHub XC community

Model: yasserrmd/glm5.1-distill Source: Original Platform
2026-05-31 01:31:28 +08:00
commit 7f9f483011
8 changed files with 328293 additions and 0 deletions
--- a/.gitattributes
+++ b/.gitattributes
@@ -0,0 +1,35 @@
 *.7z filter=lfs diff=lfs merge=lfs -text
 *.arrow filter=lfs diff=lfs merge=lfs -text
 *.bin filter=lfs diff=lfs merge=lfs -text
 *.bz2 filter=lfs diff=lfs merge=lfs -text
 *.ckpt filter=lfs diff=lfs merge=lfs -text
 *.ftz filter=lfs diff=lfs merge=lfs -text
 *.gz filter=lfs diff=lfs merge=lfs -text
 *.h5 filter=lfs diff=lfs merge=lfs -text
 *.joblib filter=lfs diff=lfs merge=lfs -text
 *.lfs.* filter=lfs diff=lfs merge=lfs -text
 *.mlmodel filter=lfs diff=lfs merge=lfs -text
 *.model filter=lfs diff=lfs merge=lfs -text
 *.msgpack filter=lfs diff=lfs merge=lfs -text
 *.npy filter=lfs diff=lfs merge=lfs -text
 *.npz filter=lfs diff=lfs merge=lfs -text
 *.onnx filter=lfs diff=lfs merge=lfs -text
 *.ot filter=lfs diff=lfs merge=lfs -text
 *.parquet filter=lfs diff=lfs merge=lfs -text
 *.pb filter=lfs diff=lfs merge=lfs -text
 *.pickle filter=lfs diff=lfs merge=lfs -text
 *.pkl filter=lfs diff=lfs merge=lfs -text
 *.pt filter=lfs diff=lfs merge=lfs -text
 *.pth filter=lfs diff=lfs merge=lfs -text
 *.rar filter=lfs diff=lfs merge=lfs -text
 *.safetensors filter=lfs diff=lfs merge=lfs -text
 saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.tar.* filter=lfs diff=lfs merge=lfs -text
 *.tar filter=lfs diff=lfs merge=lfs -text
 *.tflite filter=lfs diff=lfs merge=lfs -text
 *.tgz filter=lfs diff=lfs merge=lfs -text
 *.wasm filter=lfs diff=lfs merge=lfs -text
 *.xz filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
--- a/README.md
+++ b/README.md
@@ -0,0 +1,275 @@
 ---
 license: apache-2.0
 language:
 - en
 library_name: transformers
 pipeline_tag: text-generation
 base_model: LiquidAI/LFM2.5-1.2B-Base
 tags:
 - lfm2
 - liquid-ai
 - distillation
 - reasoning
 - glm
 - unsloth
 - trl
 - sft
 - text-generation-inference
 - conversational
 datasets:
 - Jackrong/GLM-5.1-Reasoning-1M-Cleaned
 model-index:
 - name: glm5.1-distill
  results: []
 ---
 # glm5.1-distill
 `yasserrmd/glm5.1-distill` is a 1.2B-parameter instruction-tuned chat model
 built on top of [`LiquidAI/LFM2.5-1.2B-Base`](https://huggingface.co/LiquidAI/LFM2.5-1.2B-Base).
 It is supervised-fine-tuned (SFT) on a 50k subset of
 [`Jackrong/GLM-5.1-Reasoning-1M-Cleaned`](https://huggingface.co/datasets/Jackrong/GLM-5.1-Reasoning-1M-Cleaned),
 a cleaned reasoning-style chat corpus distilled from the GLM-5.1 family.
 The goal is to bring some of the conversational reasoning behavior of larger
 GLM-5.1 teacher models into the small, efficient LFM2.5 architecture so it
 can run comfortably on a single consumer GPU, on edge devices, or via
 quantized runtimes such as ONNX, GGUF, or MLX.
 > **Note:** This is an independent community fine-tune. It is not affiliated
 > with or endorsed by Liquid AI or Z.ai/THUDM (the GLM authors).
 ---
 ## Model summary
 | Property | Value |
 |---|---|
 | Architecture | LFM2 (hybrid conv + attention) |
 | Parameters | ~1.2B |
 | Tensor dtype | BF16 |
 | Context length | 4096 (trained at 2048 with packing) |
 | Base model | `LiquidAI/LFM2.5-1.2B-Base` |
 | Fine-tuning method | LoRA SFT (merged back to base) |
 | Trainer | [Unsloth](https://github.com/unslothai/unsloth) + [TRL](https://github.com/huggingface/trl) `SFTTrainer` |
 | Chat template | LFM2 / ChatML-style (`<|im_start|>` … `<|im_end|>`) |
 | License | Apache 2.0 |
 ---
 ## Intended use
 This model is designed for:
 - General assistant-style chat
 - Lightweight reasoning, step-by-step answers, and explanations
 - On-device and edge deployments where a 1B-class model is appropriate
 - A starting checkpoint for further domain-specific fine-tuning
 It is **not** a safety-aligned, production-ready assistant on its own. Treat
 its output as that of a small distilled student model: it can be confidently
 wrong, especially on long-horizon math, code correctness, current events,
 and anything safety-critical.
 ### Out of scope
 - Medical, legal, financial, or other high-stakes advice
 - Any setting that requires guaranteed factuality
 - Generating content that violates the Apache 2.0 license terms or the
  upstream LFM2.5 base model license
 ---
 ## Quickstart (Transformers)
 ```python
 import torch
 from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer
 model_id = "yasserrmd/glm5.1-distill"
 tokenizer = AutoTokenizer.from_pretrained(model_id)
 model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
 )
 messages = [
    {"role": "user", "content": "Explain why the sky is blue in two short paragraphs."},
 ]
 inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
    tokenize=True,
    return_dict=True,
 ).to(model.device)
 streamer = TextStreamer(tokenizer, skip_prompt=True)
 _ = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True,  # without this, generate() decodes greedily and ignores temperature/top_k/top_p
    temperature=0.1,
    top_k=50,
    top_p=0.1,
    repetition_penalty=1.05,
    streamer=streamer,
 )
 ```
 ### Recommended sampling
 The base LFM2.5 family is sensitive to sampling settings. The following
 defaults (inherited from Liquid AI's reference settings) work well:
 | Use case | temperature | top_k | top_p | repetition_penalty |
 |---|---|---|---|---|
 | Factual / short answers | 0.1 | 50 | 0.1 | 1.05 |
 | Creative / longer text | 0.7 | 50 | 0.9 | 1.10 |
 | Code / structured output | 0.2 | 40 | 0.9 | 1.05 |
 ---
 ## Chat template
 The tokenizer ships with a ChatML-style template. After the leading BOS
 token (`<|startoftext|>`), a two-turn example serializes to:
 ```
 <|im_start|>user
 Hello!<|im_end|>
 <|im_start|>assistant
 Hey there!<|im_end|>
 ```
 Always use `tokenizer.apply_chat_template(..., add_generation_prompt=True)`
 at inference time. Do not hand-roll the prompt.
 ---
 ## Training details
 ### Data
 - Source: `Jackrong/GLM-5.1-Reasoning-1M-Cleaned`, `main` config
 - Slice: first 50,000 rows of the `train` split
 - Format: ShareGPT-style multi-turn conversations, normalized via
  `unsloth.chat_templates.standardize_data_formats`
 - Loss masking: `train_on_responses_only` so only assistant tokens
  contribute to the loss
 ### LoRA configuration
 | Hyperparameter | Value |
 |---|---|
 | Rank `r` | 16 |
 | `lora_alpha` | 16 |
 | `lora_dropout` | 0 |
 | Bias | none |
 | Target modules | `q_proj`, `k_proj`, `v_proj`, `out_proj`, `in_proj`, `w1`, `w2`, `w3` |
 | Gradient checkpointing | `unsloth` |
 | Random seed | 3407 |
 ### SFT hyperparameters
 | Hyperparameter | Value |
 |---|---|
 | Epochs | 1 |
 | Per-device batch size | 32 |
 | Gradient accumulation | 1 |
 | Effective batch size | 32 |
 | Packing | True |
 | Max sequence length | 2048 |
 | Optimizer | `adamw_torch` |
 | Learning rate | 2e-5 |
 | LR scheduler | linear |
 | Warmup steps | 50 |
 | Weight decay | 0.01 |
 | Precision | BF16 |
 | Seed | 3407 |
 ### Merge & export
 After SFT, the LoRA adapters were merged into the base weights using
 Unsloth's `push_to_hub_merged(..., save_method="merged_16bit")`. The
 repository contains the resulting full BF16 model, not adapters.
 ### Hardware
 Trained on a single GPU using Unsloth's optimized kernels. End-to-end
 training memory and time are dominated by the 50k-row, packed-2048 setup
 described above.
 ---
 ## Evaluation
 No formal benchmark scores are reported for this checkpoint yet. It has
 been smoke-tested on:
 - General Q&A (e.g. "Why is the sky blue?")
 - Short creative writing prompts
 - Multi-turn instruction following
 Quantitative evaluations on benchmarks such as MMLU, GSM8K, IFEval, or
 MT-Bench are left as future work. Contributions via the HF community tab
 are welcome.
 ---
 ## Limitations and biases
 - Inherits all limitations and biases of the LFM2.5 base model and of the
  GLM-5.1-derived training data.
 - 1.2B parameters is small. Expect weaker performance than 7B+ chat
  models on hard reasoning, long context, and code generation.
 - The training corpus is predominantly English. Other languages will work
  to varying degrees but are not the target.
 - The model can hallucinate facts confidently. Verify anything important.
 ---
 ## ONNX version
 An ONNX export of this model is available at:
 **`yasserrmd/glm5.1-distill-onnx`**
 It can be used with `onnxruntime` and `optimum` for CPU and accelerated
 inference. See that repository's README for usage details.
 ---
 ## Citation
 If you use this checkpoint, please cite the upstream work as well:
 ```bibtex
@misc{yasserrmd_glm51_distill_2026,
  title  = {glm5.1-distill: a small LFM2.5 student fine-tuned on GLM-5.1 reasoning data},
  author = {Mohamed Yasser},
  year   = {2026},
  howpublished = {\url{https://huggingface.co/yasserrmd/glm5.1-distill}}
 }
 ```
 And the base model and dataset:
 - LiquidAI, *LFM2.5-1.2B-Base*, 2025.
 - Jackrong, *GLM-5.1-Reasoning-1M-Cleaned*, Hugging Face Datasets.
 ---
 ## Acknowledgements
 - [Liquid AI](https://huggingface.co/LiquidAI) for the LFM2.5 base model.
 - [Jackrong](https://huggingface.co/Jackrong) for the cleaned GLM-5.1
  reasoning dataset.
 - [Unsloth](https://github.com/unslothai/unsloth) for the 2x faster SFT
  pipeline and memory-efficient LoRA kernels.
 - [Hugging Face TRL](https://github.com/huggingface/trl) for `SFTTrainer`.
 [![Made with Unsloth](https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png)](https://github.com/unslothai/unsloth)
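
The "Recommended sampling" table in the README above can be kept as a small lookup so the presets are not retyped at each call site. This is a minimal sketch: the names `SAMPLING_PRESETS` and `generation_kwargs` are hypothetical and not part of the repository, and `do_sample=True` is added on the assumption that sampling (rather than greedy decoding) is intended, since Transformers otherwise ignores `temperature`, `top_k`, and `top_p`.

```python
# Values copied from the README's "Recommended sampling" table.
# The preset keys ("factual", "creative", "code") are hypothetical labels.
SAMPLING_PRESETS = {
    "factual": {"temperature": 0.1, "top_k": 50, "top_p": 0.1, "repetition_penalty": 1.05},
    "creative": {"temperature": 0.7, "top_k": 50, "top_p": 0.9, "repetition_penalty": 1.10},
    "code": {"temperature": 0.2, "top_k": 40, "top_p": 0.9, "repetition_penalty": 1.05},
}


def generation_kwargs(use_case: str, max_new_tokens: int = 512) -> dict:
    """Return keyword arguments for model.generate() for one preset."""
    if use_case not in SAMPLING_PRESETS:
        raise ValueError(f"unknown use case: {use_case!r}")
    kwargs = dict(SAMPLING_PRESETS[use_case])  # copy so the preset stays unchanged
    kwargs["do_sample"] = True  # assumption: sampling is wanted, not greedy decoding
    kwargs["max_new_tokens"] = max_new_tokens
    return kwargs
```

With the Quickstart objects in scope, a call would look like `model.generate(**inputs, **generation_kwargs("code"))`.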
--- a/chat_template.jinja
+++ b/chat_template.jinja
@@ -0,0 +1,7 @@
 {{- bos_token -}}{%- set system_prompt = "" -%}{%- set ns = namespace(system_prompt="") -%}{%- if messages[0]["role"] == "system" -%} {%- set ns.system_prompt = messages[0]["content"] -%} {%- set messages = messages[1:] -%}{%- endif -%}{%- if tools -%} {%- set ns.system_prompt = ns.system_prompt + ("
 " if ns.system_prompt else "") + "List of tools: <|tool_list_start|>[" -%} {%- for tool in tools -%} {%- if tool is not string -%} {%- set tool = tool | tojson -%} {%- endif -%} {%- set ns.system_prompt = ns.system_prompt + tool -%} {%- if not loop.last -%} {%- set ns.system_prompt = ns.system_prompt + ", " -%} {%- endif -%} {%- endfor -%} {%- set ns.system_prompt = ns.system_prompt + "]<|tool_list_end|>" -%}{%- endif -%}{%- if ns.system_prompt -%} {{- "<|im_start|>system
 " + ns.system_prompt + "<|im_end|>
 " -}}{%- endif -%}{%- for message in messages -%} {{- "<|im_start|>" + message["role"] + "
 " -}} {%- set content = message["content"] -%} {%- if content is not string -%} {%- set content = content | tojson -%} {%- endif -%} {%- if message["role"] == "tool" -%} {%- set content = "<|tool_response_start|>" + content + "<|tool_response_end|>" -%} {%- endif -%} {{- content + "<|im_end|>
 " -}}{%- endfor -%}{%- if add_generation_prompt -%} {{- "<|im_start|>assistant
 " -}}{%- endif -%}
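
To make the template above easier to read, here is a plain-Python mirror of its non-tool path. It is a sketch for inspection and unit tests only, not a replacement for `tokenizer.apply_chat_template`: the function name `render_chat` is hypothetical, message content is assumed to be a string, and the `tools` list and the `tool` role wrapping are deliberately left out.

```python
BOS = "<|startoftext|>"  # bos_token from special_tokens_map.json


def render_chat(messages, add_generation_prompt=False):
    """Mirror the non-tool path of chat_template.jinja for string content."""
    parts = [BOS]
    msgs = list(messages)
    # A leading system message is pulled out and emitted first (if non-empty).
    if msgs and msgs[0]["role"] == "system":
        system_prompt = msgs.pop(0)["content"]
        if system_prompt:
            parts.append(f"<|im_start|>system\n{system_prompt}<|im_end|>\n")
    # Every remaining message becomes one <|im_start|>role ... <|im_end|> block.
    for m in msgs:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    # The generation prompt opens an assistant turn for the model to complete.
    if add_generation_prompt:
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)
```

Comparing this output against `tokenizer.apply_chat_template(messages, tokenize=False)` is one way to confirm that a downstream runtime serializes prompts the same way.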
--- a/config.json
+++ b/config.json
@@ -0,0 +1,58 @@
 {
    "architectures": [
        "Lfm2ForCausalLM"
    ],
    "block_auto_adjust_ff_dim": true,
    "block_dim": 2048,
    "block_ff_dim": 12288,
    "block_ffn_dim_multiplier": 1.0,
    "block_mlp_init_scale": 1.0,
    "block_multiple_of": 256,
    "block_norm_eps": 1e-05,
    "block_out_init_scale": 1.0,
    "block_use_swiglu": true,
    "block_use_xavier_init": true,
    "bos_token_id": 1,
    "conv_L_cache": 3,
    "conv_bias": false,
    "conv_dim": 2048,
    "conv_use_xavier_init": true,
    "torch_dtype": "bfloat16",
    "eos_token_id": 7,
    "hidden_size": 2048,
    "initializer_range": 0.02,
    "intermediate_size": 12288,
    "layer_types": [
        "conv",
        "conv",
        "full_attention",
        "conv",
        "conv",
        "full_attention",
        "conv",
        "conv",
        "full_attention",
        "conv",
        "full_attention",
        "conv",
        "full_attention",
        "conv",
        "full_attention",
        "conv"
    ],
    "max_position_embeddings": 128000,
    "model_name": "LiquidAI/LFM2.5-1.2B-Base",
    "model_type": "lfm2",
    "norm_eps": 1e-05,
    "num_attention_heads": 32,
    "num_heads": 32,
    "num_hidden_layers": 16,
    "num_key_value_heads": 8,
    "pad_token_id": 0,
    "rope_theta": 1000000.0,
    "tie_embedding": true,
    "unsloth_version": "2026.4.8",
    "use_cache": true,
    "use_pos_enc": true,
    "vocab_size": 65536
 }
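
The `layer_types`, `num_key_value_heads`, and `hidden_size` fields above allow a rough estimate of KV-cache memory. The sketch below rests on two assumptions that the config does not state explicitly: the attention head dimension equals `hidden_size / num_attention_heads`, and only `full_attention` layers hold a cache that grows with sequence length (the small fixed-size state of the `conv` layers is ignored). The function name is hypothetical.

```python
# Subset of the config.json values shown above.
config = {
    "hidden_size": 2048,
    "num_attention_heads": 32,
    "num_key_value_heads": 8,
    "layer_types": [
        "conv", "conv", "full_attention", "conv", "conv", "full_attention",
        "conv", "conv", "full_attention", "conv", "full_attention", "conv",
        "full_attention", "conv", "full_attention", "conv",
    ],
}


def kv_cache_bytes_per_token(cfg, bytes_per_value=2):
    """Estimate KV-cache bytes per token (BF16 = 2 bytes per value)."""
    # Assumption: head_dim = hidden_size / num_attention_heads.
    head_dim = cfg["hidden_size"] // cfg["num_attention_heads"]
    n_attention = cfg["layer_types"].count("full_attention")
    # Factor 2 covers keys and values; only KV heads are cached under GQA.
    return 2 * n_attention * cfg["num_key_value_heads"] * head_dim * bytes_per_value


per_token = kv_cache_bytes_per_token(config)
print(config["layer_types"].count("conv"), "conv layers")
print(per_token, "bytes per token")
print(per_token * 4096 / 2**20, "MiB at 4096 tokens")
```

Under these assumptions the cache costs 12,288 bytes per token, or 48 MiB at the 4096-token context length listed in the README.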
--- a/model.safetensors
+++ b/model.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:66b718a04a412036b152218ca83f93a6517bd1939bad83fe36fd863b3b7c3e53
 size 2340697936
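
The `size` field of the LFS pointer above gives a quick sanity check on the README's "~1.2B" parameter figure. A minimal sketch, assuming every weight is stored as BF16 (2 bytes) and ignoring the safetensors header, so the result is a slight overestimate; `parse_lfs_pointer` is a hypothetical helper, not a Git LFS API.

```python
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:66b718a04a412036b152218ca83f93a6517bd1939bad83fe36fd863b3b7c3e53
size 2340697936
"""


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    return {
        "version": fields["version"],
        "oid": fields["oid"],
        "size": int(fields["size"]),
    }


info = parse_lfs_pointer(POINTER)
approx_params = info["size"] // 2  # assumption: 2 bytes per BF16 weight
print(f"{approx_params / 1e9:.2f}B parameters (approx.)")
```

The estimate comes out at roughly 1.17B parameters, which is consistent with the model summary table.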
--- a/special_tokens_map.json
+++ b/special_tokens_map.json
@@ -0,0 +1,23 @@
 {
  "bos_token": {
    "content": "<|startoftext|>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "eos_token": {
    "content": "<|im_end|>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": {
    "content": "<|pad|>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
 }
--- a/tokenizer.json
+++ b/tokenizer.json
--- a/tokenizer_config.json
+++ b/tokenizer_config.json