Go to file

ModelHub XC 171910c623 初始化项目，由ModelHub XC社区提供模型

Model: puravky/chatOP
Source: Original Platform

2026-06-08 14:30:37 +08:00

.gitattributes

初始化项目，由ModelHub XC社区提供模型

2026-06-08 14:30:37 +08:00

chat_template.jinja

初始化项目，由ModelHub XC社区提供模型

2026-06-08 14:30:37 +08:00

config.json

初始化项目，由ModelHub XC社区提供模型

2026-06-08 14:30:37 +08:00

generation_config.json

初始化项目，由ModelHub XC社区提供模型

2026-06-08 14:30:37 +08:00

model.safetensors

初始化项目，由ModelHub XC社区提供模型

2026-06-08 14:30:37 +08:00

README.md

初始化项目，由ModelHub XC社区提供模型

2026-06-08 14:30:37 +08:00

tokenizer_config.json

初始化项目，由ModelHub XC社区提供模型

2026-06-08 14:30:37 +08:00

tokenizer.json

初始化项目，由ModelHub XC社区提供模型

2026-06-08 14:30:37 +08:00

README.md

language, license, base_model, tags, datasets, pipeline_tag

language

license

base_model

chatOP — SmolLM2-135M Fine-tuned

A fine-tuned version of SmolLM2-135M-Instruct trained to act as a friendly study assistant for CS and ML concepts.

Model Details


Base model	HuggingFaceTB/SmolLM2-135M-Instruct
Model type	Causal Language Model
Fine-tuning method	SFT + LoRA
Language	English
License	Apache 2.0

Training Details

Dataset

Name: HuggingFaceTB/smoltalk
Split used: all
Format: Conversational chat messages (system, user, assistant)

Hyperparameters

Parameter	Value
Learning rate	2e-4
Training steps	200
Batch size	2
Packing	True
Optimizer	AdamW
LR scheduler	Linear
Precision	fp16

LoRA Config

Parameter	Value
Rank (`r`)	16
Alpha (`lora_alpha`)	16
Dropout	0.05
Target modules	all-linear
Bias	none
Trainable parameters	~1.7M / 135M (1.26%)

Hardware

GPU: NVIDIA Tesla T4 (Google Colab)
Training time: ~15–20 minutes

Example Outputs

Prompt: What is machine learning?

Machine learning is a field of AI where systems learn from data to make predictions or decisions without being explicitly programmed for each task.

Prompt: Explain transformers in simple terms.

A Transformer is a neural network architecture that uses attention mechanisms to understand relationships between all words in a sentence simultaneously, rather than reading them one by one like older models did.

Prompt: What is the difference between LoRA and full fine-tuning?

Full fine-tuning updates all parameters of a model which requires a lot of memory. LoRA instead freezes the base model and adds small trainable matrices to each layer, training only ~1% of parameters while achieving similar results.

Limitations

Small model (135M parameters) — responses may be less detailed than larger models
Trained for only 200 steps — longer training would improve quality
Best for short explanations — may struggle with very complex multi-step reasoning
Not suitable for factual lookups — may hallucinate specific facts or numbers

Training Framework

Built as a learning project while studying the HuggingFace LLM Course — specifically Chapter 11 (Fine-tuning LLMs).

Libraries used:

🤗 Transformers
🤗 PEFT
TRL (SFTTrainer)
🤗 Datasets
Accelerate

Author

Made by puravky — undergrad student exploring ML and AI.

README.md Unescape Escape

chatOP — SmolLM2-135M Fine-tuned

Model Details

Training Details

Dataset

Hyperparameters

LoRA Config

Hardware

Example Outputs

Limitations

Training Framework

Author

README.md