ModelHub XC 713ec7acaa Initialize project; model provided by the ModelHub XC community
Model: yasserrmd/GLM4.7-Distill-LFM2.5-1.2B
Source: Original Platform
2026-04-21 19:45:04 +08:00

---
base_model: LiquidAI/LFM2.5-1.2B-Instruct
model_type: causal-lm
architecture: LFM2
tags:
- text-generation
- text-generation-inference
- instruction-tuned
- distilled
- synthetic-data
- transformers
- unsloth
- lfm2
- glm
- agentic
- edge
- efficient
license: apache-2.0
language:
- en
datasets:
- Open-Orca/FLAN
- databricks/databricks-dolly-15k
- OpenAssistant/oasst1
- BAAI/Infinity-Instruct
- sahil2801/CodeAlpaca-20k
- TIGER-Lab/MathInstruct
pipeline_tag: text-generation
---
# GLM4.7-Distill-LFM2.5-1.2B
<img src="logo_model.png" width="100%" />

## Model Overview
**GLM4.7-Distill-LFM2.5-1.2B** is a 1.2B-parameter instruction-following language model obtained via **offline distillation** from **GLM-4.7** into the **Liquid AI LFM2** architecture.
The model is designed to be:
* concise and non-verbose
* strong at instruction following
* efficient for local and edge deployments
* suitable for assistant, agentic, and system-integration use cases
This model does **not** include chain-of-thought reasoning and is optimized for **final-answer quality** rather than verbose explanations.

---
## Key Characteristics
* **Base architecture**: Liquid AI LFM2
* **Model size**: 1.2B parameters
* **Training method**: Offline supervised distillation (SFT with LoRA)
* **Teacher model**: GLM-4.7 (used only for data generation)
* **Inference dependency on teacher**: None
* **Reasoning traces**: Not included
* **Target behavior**: Clear, grounded, instruction-aligned responses
---
## Training Details
### Distillation Approach
This model was trained using **offline distillation**, where instruction-response pairs generated by **GLM-4.7** were combined with high-quality public instruction datasets.
The teacher model was used only to generate training data ahead of time. It was **not queried during training or inference**, and no teacher weights or logits are included.
Training focused on:
* instruction adherence
* response clarity
* reduced verbosity
* stable decision boundaries
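
The training method is listed as SFT with LoRA. As a rough illustration of why LoRA keeps the number of trainable weights small, the sketch below counts the parameters a single adapter adds to one linear layer. The layer size and rank are hypothetical values chosen for the arithmetic; the actual LoRA settings used for this model are not documented here.

```python
def lora_added_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters added by one LoRA adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


def dense_params(d_in: int, d_out: int) -> int:
    """Parameters in the frozen dense weight matrix (bias ignored)."""
    return d_in * d_out


# Hypothetical layer size and rank, for illustration only.
d_in = d_out = 2048
rank = 16

added = lora_added_params(d_in, d_out, rank)
frozen = dense_params(d_in, d_out)
print(f"trainable: {added:,}  frozen: {frozen:,}  ratio: {added / frozen:.4%}")
```

Under these assumed values the adapter trains about 1.6% as many weights as the layer it wraps, which is why LoRA fine-tuning fits on modest hardware.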
### Datasets Used (Approx. 13K Samples)
The following datasets were sampled and combined:
* Open-Orca / FLAN
* Databricks Dolly 15K
* OpenAssistant OASST1
* BAAI Infinity-Instruct
* CodeAlpaca
* TIGER-Lab MathInstruct
These were augmented with **GLM-4.7-generated instruction responses**, with explicit avoidance of chain-of-thought reasoning.

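
The mixing step described above could look roughly like the following. This is a minimal sketch, not the author's pipeline: the function names, the `teacher_fraction` knob, and the chat-message layout are assumptions made for illustration, and the actual sampling ratios are not stated in this card.

```python
import random


def to_chat_example(instruction: str, response: str) -> dict:
    """Wrap one instruction-response pair as a two-turn chat example."""
    return {
        "messages": [
            {"role": "user", "content": instruction.strip()},
            {"role": "assistant", "content": response.strip()},
        ]
    }


def build_sft_mix(teacher_pairs, public_pairs, total, teacher_fraction=0.5, seed=0):
    """Sample a fixed-size, shuffled mix of teacher-generated and public pairs.

    teacher_fraction is a hypothetical knob; the real ratio is not documented.
    """
    rng = random.Random(seed)
    n_teacher = min(len(teacher_pairs), round(total * teacher_fraction))
    n_public = min(len(public_pairs), total - n_teacher)
    chosen = rng.sample(teacher_pairs, n_teacher) + rng.sample(public_pairs, n_public)
    rng.shuffle(chosen)
    return [to_chat_example(instruction, response) for instruction, response in chosen]
```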
---
## Intended Use
This model is well suited for:
* general-purpose assistants
* planning and task decomposition
* summarization and explanation
* lightweight coding assistance
* agentic workflows
* system integration and automation
* on-device or edge inference scenarios
---
## Limitations
Like other compact distilled models, this model may:
* hallucinate when given insufficient or false premises
* struggle with adversarial logical inference (NLI-style tasks)
* lack temporal awareness of recent events
* provide confident answers where explicit uncertainty is required
For tasks that depend on critical reasoning, pair the model with a verification layer or external tools.

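
One way to add such a verification layer is to accept an answer only after an independent check passes. The sketch below assumes two caller-supplied callables, `generate` and `verify`, which are hypothetical names and not part of this model or any library; the stub at the end stands in for a real model call.

```python
from typing import Callable, Optional


def answer_with_verification(
    prompt: str,
    generate: Callable[[str], str],
    verify: Callable[[str, str], bool],
    max_attempts: int = 2,
) -> Optional[str]:
    """Return the first generated answer that passes verify, or None if none does."""
    for _ in range(max_attempts):
        candidate = generate(prompt)
        if verify(prompt, candidate):
            return candidate
    return None  # the caller should fall back to a tool or a human reviewer


# Toy usage: a stub generator standing in for the model, and an exact-match check.
canned_replies = iter(["5", "4"])
result = answer_with_verification(
    "What is 2 + 2?",
    generate=lambda prompt: next(canned_replies),
    verify=lambda prompt, answer: answer.strip() == "4",
)
print(result)
```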
---
## Ethical & Responsible Use
* This model was trained on a mixture of public datasets and synthetic data.
* It does not contain personal data by design.
* Outputs should not be treated as authoritative in medical, legal, or safety-critical contexts.
---
## Citation & Acknowledgements
If you use this model in research or applications, please acknowledge:
* **GLM-4.7** for teacher-generated distillation data
* **Liquid AI** for the LFM2 architecture
* The creators of the public instruction datasets listed above
---
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

model_id = "yasserrmd/GLM4.7-Distill-LFM2.5-1.2B"

# Load the weights in bfloat16 and place them on the available device(s).
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    dtype="bfloat16",  # older transformers releases name this argument torch_dtype
    # attn_implementation="flash_attention_2"  # uncomment on compatible GPU
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Print tokens as they are generated, hiding the prompt and special tokens.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

prompt = "Implement QuickSort algorithm with complexity analysis"

# Apply the model's chat template and tokenize the single-turn conversation.
input_ids = tokenizer.apply_chat_template(
    [{"role": "user", "content": prompt}],
    add_generation_prompt=True,
    return_tensors="pt",
    tokenize=True,
).to(model.device)

# Low temperature and top_p keep sampling close to greedy decoding.
output = model.generate(
    input_ids,
    do_sample=True,
    temperature=0.1,
    top_k=50,
    top_p=0.1,
    repetition_penalty=1.05,
    max_new_tokens=512,
    streamer=streamer,
)
```
## With vLLM (Production)
```python
from vllm import LLM, SamplingParams

llm = LLM(model="yasserrmd/GLM4.7-Distill-LFM2.5-1.2B")

# Same sampling settings as the transformers example above.
sampling_params = SamplingParams(
    temperature=0.1,
    top_k=50,
    top_p=0.1,
    repetition_penalty=1.05,
    max_tokens=512,
)

prompts = [
    "Implement QuickSort algorithm",
    "Solve the Longest Common Subsequence problem",
    "Design a hash table with collision handling",
]

# generate() batches the prompts and sends each string to the model as-is.
outputs = llm.generate(prompts, sampling_params)
for output in outputs:
    print(output.outputs[0].text)
```
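
The vLLM example above passes raw strings to `generate`, whereas the transformers example applies the chat template first. A possible way to keep the two consistent is vLLM's `LLM.chat`, sketched below. This assumes a vLLM version whose `chat` method accepts a list of conversations; `build_conversations` and `chat_with_vllm` are hypothetical helper names, and whether a system prompt helps this model has not been verified.

```python
def build_conversations(prompts, system_prompt=None):
    """Turn raw prompt strings into one chat-style message list per prompt."""
    conversations = []
    for prompt in prompts:
        messages = []
        if system_prompt:
            messages.append({"role": "system", "content": system_prompt})
        messages.append({"role": "user", "content": prompt})
        conversations.append(messages)
    return conversations


def chat_with_vllm(prompts, system_prompt=None):
    """Run prompts through vLLM's chat interface (requires vLLM and the model weights)."""
    # Imported lazily so build_conversations stays usable without vLLM installed.
    from vllm import LLM, SamplingParams

    llm = LLM(model="yasserrmd/GLM4.7-Distill-LFM2.5-1.2B")
    params = SamplingParams(
        temperature=0.1,
        top_k=50,
        top_p=0.1,
        repetition_penalty=1.05,
        max_tokens=512,
    )
    outputs = llm.chat(build_conversations(prompts, system_prompt), sampling_params=params)
    return [output.outputs[0].text for output in outputs]
```

Calling `chat_with_vllm(["Implement QuickSort algorithm"])` would then return one completion string per prompt.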
## Recommended Use
- Technical interviews
- Algorithm learning
- Code generation
- Problem-solving
- Code refactoring
- Educational tutoring
## Not Recommended For
- Current events or recent information
- Factual knowledge queries
- Legal, medical, or safety-critical code
- Highly specialized domain problems
- Real-time critical systems without human review
## License
Please refer to the licenses of:
* the base LFM2 model
* the individual datasets used for training
This repository follows the same usage constraints as the upstream components.