A lightweight, reasoning-enhanced multilingual model fine-tuned for step-by-step thinking in Spanish and French, built on top of Qwen2.5-7B-Instruct using LoRA.
🚀 Overview
This model enhances the reasoning capabilities of the base model by encouraging structured "thinking" before answering. It is optimized for:
- 🇪🇸 Spanish reasoning tasks
- 🇫🇷 French reasoning tasks
- 🧠 Step-by-step logical explanations
- 💬 Instruction-following with personality
The fine-tuning process leverages curated multilingual reasoning datasets to improve coherence, clarity, and depth in responses.
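As a rough sketch of the dataset-filtering step described above (the `language` field name is an assumption; the actual HuggingFaceH4/Multilingual-Thinking schema may differ):

```python
# Hedged sketch: keep only Spanish and French examples before fine-tuning.
# The "language" field name is an assumption about the dataset schema.

def filter_es_fr(rows):
    """Return only rows whose language tag is Spanish or French."""
    return [r for r in rows if r.get("language") in {"es", "fr"}]

sample = [
    {"language": "es", "prompt": "¿Por qué el cielo es azul?"},
    {"language": "de", "prompt": "Warum ist der Himmel blau?"},
    {"language": "fr", "prompt": "Pourquoi le ciel est-il bleu ?"},
]

filtered = filter_es_fr(sample)
print([r["language"] for r in filtered])  # -> ['es', 'fr']
```

With the `datasets` library, the same idea is usually expressed as `dataset.filter(lambda r: r["language"] in {"es", "fr"})`.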
🏗️ Model Details
| Component | Description |
|---|---|
| Base Model | Qwen2.5-7B-Instruct |
| Fine-tuning | LoRA (Low-Rank Adaptation) via Unsloth |
| Dataset | HuggingFaceH4/Multilingual-Thinking (Spanish & French filtered) |
| Quantization | 4-bit (bitsandbytes) |
| Max Sequence Length | 512 tokens |
| Framework | TRL + Unsloth |
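For inference, Qwen2.5-Instruct models expect ChatML-formatted prompts. A minimal sketch of building one by hand is below; in practice you would use the tokenizer's `apply_chat_template` from `transformers` instead:

```python
# Minimal sketch of ChatML prompt construction (the format used by
# Qwen2.5-Instruct). Prefer tokenizer.apply_chat_template in real code.

def build_chatml_prompt(messages):
    """Format a list of {role, content} dicts into a ChatML prompt string."""
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    parts.append("<|im_start|>assistant\n")  # cue the model to answer
    return "".join(parts)

prompt = build_chatml_prompt([
    {"role": "system", "content": "Piensa paso a paso antes de responder."},
    {"role": "user", "content": "¿Cuánto es 17 x 23?"},
])
print(prompt)
```

The trailing `<|im_start|>assistant\n` leaves the prompt open for the model to generate its (reasoning-first) reply.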
🎯 Capabilities
- Generates chain-of-thought reasoning
- Produces structured, step-by-step answers
- Handles multilingual prompts (ES/FR/EN)
- Maintains an engaging and expressive tone
- Efficient inference with low VRAM usage
⚠️ Limitations
- Context is limited to 512 tokens, so long reasoning chains may be truncated
- Performance may degrade for:
  - highly technical domains (e.g., legal or medical)
  - languages outside ES/FR/EN
- Chain-of-thought is a learned behavior and may not always be consistent
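Given the 512-token limit, it can help to check prompt length before generation. A rough whitespace-based sketch follows; a real check should count tokens with the model's tokenizer (e.g., `len(tokenizer(prompt)["input_ids"])`):

```python
# Rough sketch of a pre-generation length guard for the 512-token context.
# Whitespace splitting only approximates tokenization; use the actual
# tokenizer for a precise count before relying on this in production.

MAX_TOKENS = 512

def fits_in_context(prompt, reserved_for_output=128):
    """Approximate check that a prompt leaves room for the model's reply."""
    approx_tokens = len(prompt.split())
    return approx_tokens <= MAX_TOKENS - reserved_for_output

print(fits_in_context("Explique paso a paso por qué llueve."))  # -> True
```

Reserving part of the budget for the output matters here, since step-by-step answers tend to be longer than direct ones.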