---
base_model: LiquidAI/LFM2.5-1.2B-Instruct
model_type: causal-lm
architecture: LFM2
tags:
- text-generation
- text-generation-inference
- instruction-tuned
- distilled
- synthetic-data
- transformers
- unsloth
- lfm2
- glm
- agentic
- edge
- efficient
license: apache-2.0
language:
- en
datasets:
- Open-Orca/FLAN
- databricks/databricks-dolly-15k
- OpenAssistant/oasst1
- BAAI/Infinity-Instruct
- sahil2801/CodeAlpaca-20k
- TIGER-Lab/MathInstruct
pipeline_tag: text-generation
---

# GLM4.7-Distill-LFM2.5-1.2B

## Model Overview

**GLM4.7-Distill-LFM2.5-1.2B** is a 1.2B-parameter instruction-following language model obtained via **offline distillation** from **GLM-4.7** into the **Liquid AI LFM2** architecture.

The model is designed to be:

* concise and non-verbose
* strong at instruction following
* efficient for local and edge deployments
* suitable for assistant, agentic, and system-integration use cases

This model does **not** include chain-of-thought reasoning and is optimized for **final-answer quality** rather than verbose explanations.

---

## Key Characteristics

* **Base architecture**: Liquid AI LFM2
* **Model size**: 1.2B parameters
* **Training method**: Offline supervised distillation (SFT with LoRA)
* **Teacher model**: GLM-4.7 (used only for data generation)
* **Inference dependency on teacher**: None
* **Reasoning traces**: Not included
* **Target behavior**: Clear, grounded, instruction-aligned responses

---

## Training Details

### Distillation Approach

This model was trained using **offline distillation**, where instruction-response pairs generated by **GLM-4.7** were combined with high-quality public instruction datasets.

The teacher model was **not used during training or inference**, and no teacher weights or logits are included.

Training focused on:

* instruction adherence
* response clarity
* reduced verbosity
* stable decision boundaries

### Datasets Used (Approx. 13K Samples)

The following datasets were sampled and combined:

* Open-Orca / FLAN
* Databricks Dolly 15K
* OpenAssistant OASST1
* BAAI Infinity-Instruct
* CodeAlpaca
* TIGER-Lab MathInstruct

These were augmented with **GLM-4.7–generated instruction responses**, with explicit avoidance of chain-of-thought reasoning.

---

## Intended Use

This model is well suited for:

* general-purpose assistants
* planning and task decomposition
* summarization and explanation
* lightweight coding assistance
* agentic workflows
* system integration and automation
* on-device or edge inference scenarios

---

## Limitations

Like other compact distilled models, this model may:

* hallucinate when given insufficient or false premises
* struggle with adversarial logical inference (NLI-style tasks)
* lack temporal awareness of recent events
* provide confident answers where explicit uncertainty is required

For critical reasoning, verification layers or external tools are recommended.

---

## Ethical & Responsible Use

* This model was trained on a mixture of public datasets and synthetic data.
* It does not contain personal data by design.
* Outputs should not be treated as authoritative in medical, legal, or safety-critical contexts.

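
The Limitations section recommends a verification layer for critical reasoning. One possible shape for such a layer is sketched below. It assumes a caller-supplied `generate_fn` (any function that returns the model's text for a prompt) and a task-specific `verify_fn`. The names `CheckedAnswer`, `answer_with_verification`, `generate_fn`, and `verify_fn` are hypothetical and not part of this repository, and the demo uses a stub generator instead of the real model.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class CheckedAnswer:
    text: str        # last answer produced by the generator
    verified: bool   # True only if the verifier accepted that answer
    attempts: int    # number of generation attempts made


def answer_with_verification(
    prompt: str,
    generate_fn: Callable[[str], str],
    verify_fn: Callable[[str, str], bool],
    max_attempts: int = 3,
) -> CheckedAnswer:
    """Generate an answer and accept it only if an external check passes.

    generate_fn and verify_fn are hypothetical hooks: the first wraps
    whatever inference stack is in use, the second is a task-specific
    check such as a calculator, a unit test, or a lookup.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    text = ""
    for attempt in range(1, max_attempts + 1):
        text = generate_fn(prompt)
        if verify_fn(prompt, text):
            return CheckedAnswer(text, True, attempt)
    # No attempt passed: return the last answer, explicitly marked unverified.
    return CheckedAnswer(text, False, max_attempts)


# Demo with a stub generator (no model involved): the first canned answer
# fails the check, the second passes.
canned = iter(["5", "4"])
result = answer_with_verification(
    "What is 2 + 2?",
    generate_fn=lambda prompt: next(canned),
    verify_fn=lambda prompt, text: text.strip() == "4",
)
print(result)
```

With a low sampling temperature, retries are likely to repeat the same answer. In that case the `verified` flag is probably more useful as a signal to fall back to an external tool or human review than as a trigger for more attempts.
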
---

## Citation & Acknowledgements

If you use this model in research or applications, please acknowledge:

* **GLM-4.7** for teacher-generated distillation data
* **Liquid AI** for the LFM2 architecture
* The creators of the public instruction datasets listed above

---

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

model_id = "yasserrmd/GLM4.7-Distill-LFM2.5-1.2B"
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    dtype="bfloat16",
    # attn_implementation="flash_attention_2"  # uncomment on compatible GPU
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

prompt = "Implement QuickSort algorithm with complexity analysis"
input_ids = tokenizer.apply_chat_template(
    [{"role": "user", "content": prompt}],
    add_generation_prompt=True,
    return_tensors="pt",
    tokenize=True,
).to(model.device)

output = model.generate(
    input_ids,
    do_sample=True,
    temperature=0.1,
    top_k=50,
    top_p=0.1,
    repetition_penalty=1.05,
    max_new_tokens=512,
    streamer=streamer,
)
```

## With vLLM (Production)

```python
from vllm import LLM, SamplingParams

llm = LLM(model="yasserrmd/GLM4.7-Distill-LFM2.5-1.2B")

sampling_params = SamplingParams(
    temperature=0.1,
    top_k=50,
    top_p=0.1,
    repetition_penalty=1.05,
    max_tokens=512
)

prompts = [
    "Implement QuickSort algorithm",
    "Solve the Longest Common Subsequence problem",
    "Design a hash table with collision handling"
]

outputs = llm.generate(prompts, sampling_params)
for output in outputs:
    print(output.outputs[0].text)
```

## Recommended Use

- Technical interviews
- Algorithm learning
- Code generation
- Problem-solving
- Code refactoring
- Educational tutoring

## Not Recommended For

- Current events or recent information
- Factual knowledge queries
- Legal, medical, or safety-critical code
- Highly specialized domain problems
- Real-time critical systems without human review

## License

Please refer to the licenses of:

* the base LFM2 model
* the individual datasets used for training

This repository follows the same usage constraints as the upstream components.
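
A note on the vLLM example in the usage section: it passes raw strings to `llm.generate`, which does not apply a chat template, whereas the Transformers example formats the prompt with `apply_chat_template`. The sketch below wraps plain prompts in the same `role`/`content` message structure so both paths can send chat-formatted input. The helper name `to_conversations` and its optional `system_prompt` argument are hypothetical and not part of this repository.

```python
from typing import Dict, List, Optional

Message = Dict[str, str]


def to_conversations(
    prompts: List[str],
    system_prompt: Optional[str] = None,
) -> List[List[Message]]:
    """Wrap each plain prompt in a single-turn chat conversation.

    Hypothetical helper: it builds only the message lists and does not
    call any inference library.
    """
    conversations: List[List[Message]] = []
    for prompt in prompts:
        messages: List[Message] = []
        if system_prompt:
            messages.append({"role": "system", "content": system_prompt})
        messages.append({"role": "user", "content": prompt})
        conversations.append(messages)
    return conversations


conversations = to_conversations(
    [
        "Implement QuickSort algorithm",
        "Design a hash table with collision handling",
    ]
)
print(conversations[0])
```

Recent vLLM releases provide an `LLM.chat` method that accepts message lists like these and applies the tokenizer's chat template. Whether the installed version accepts a batch of conversations in a single call is an assumption to check against its documentation.
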