---
license: mit
base_model: Qwen/Qwen2.5-Coder-1.5B-Instruct
tags:
- landing-page
- html
- css
- web-development
- distillation
- lora
- gguf
language:
- en
pipeline_tag: text-generation
---

# Landing Page Generator 1.5B

A fine-tuned Qwen2.5-Coder-1.5B model that generates complete, single-file HTML landing pages with embedded CSS.

Trained via **knowledge distillation** from DeepSeek V3 (685B) using LoRA on Apple Silicon.

## Model Variants

| File | Precision | Size | Description |
|------|-----------|------|-------------|
| `model-f16.gguf` | FP16 | 3.1 GB | Unquantized, best quality |
| `model-q8.gguf` | Q8_0 | 1.6 GB | 8-bit quantized, near-identical quality |
| `model-q4.gguf` | Q4_K_M | 986 MB | 4-bit quantized, good quality, smallest |

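As a rough aid for choosing between the files above, the sketch below picks the largest variant that fits a memory budget. The file names and sizes come from the table; the `pick_variant` helper and its `headroom_gb` allowance are hypothetical, and real memory use also depends on context length and the runtime.

```python
from typing import Optional

# File sizes in GB, copied from the Model Variants table (986 MB is about 0.986 GB).
VARIANTS = [
    ("model-f16.gguf", 3.1),
    ("model-q8.gguf", 1.6),
    ("model-q4.gguf", 0.986),
]


def pick_variant(budget_gb: float, headroom_gb: float = 0.5) -> Optional[str]:
    """Return the largest variant whose file size plus headroom fits the budget.

    `headroom_gb` is a hypothetical allowance for the KV cache and runtime
    overhead; adjust it for your context length and hardware.
    """
    for name, size_gb in VARIANTS:  # ordered largest (highest quality) first
        if size_gb + headroom_gb <= budget_gb:
            return name
    return None  # nothing fits in the given budget
```

With the default headroom, `pick_variant(2.5)` selects `model-q8.gguf`, and a budget below about 1.5 GB returns `None`.
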
## Usage with Ollama

1. Download a GGUF file
2. Create a `Modelfile`:

```
FROM ./model-q8.gguf
TEMPLATE """{{- if .System }}<|im_start|>system
{{ .System }}<|im_end|>
{{ end }}<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
"""
PARAMETER stop "<|im_end|>"
PARAMETER temperature 0.3
SYSTEM "You are a web developer. When asked to create a landing page, output a complete single-file HTML document with embedded CSS and modern design. Use clean gradients, card layouts, and responsive design. Output only the HTML code, nothing else."
```

3. Import and run:
```bash
ollama create landing-page-gen -f Modelfile
ollama run landing-page-gen "Create a landing page for a space tourism company called Orbit Adventures"
```

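Once the model has been created with the steps above, it can also be called programmatically. The following is a minimal sketch using only the Python standard library against Ollama's local REST endpoint. It assumes the server is running on the default port 11434 and that the model is named `landing-page-gen` as above; the helper names and the `index.html` output path are illustrative, not part of this repository.

```python
import json
import re
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # assumed default local endpoint


def build_payload(prompt: str, model: str = "landing-page-gen") -> bytes:
    """Encode a non-streaming generate request for the given model."""
    return json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")


def extract_html(text: str) -> str:
    """Strip an optional Markdown code fence wrapped around the model output."""
    match = re.search(r"```(?:html)?\s*\n(.*?)```", text, flags=re.DOTALL)
    return (match.group(1) if match else text).strip()


def generate_page(prompt: str, out_path: str = "index.html") -> str:
    """Send the prompt to a running Ollama server and save the returned HTML."""
    request = urllib.request.Request(
        OLLAMA_URL,
        data=build_payload(prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.loads(response.read().decode("utf-8"))
    html = extract_html(body["response"])  # generated text is in the "response" field
    with open(out_path, "w", encoding="utf-8") as handle:
        handle.write(html)
    return out_path
```

With the server running, `generate_page("Create a landing page for a space tourism company called Orbit Adventures")` writes the page to `index.html`. The fence stripping is a precaution in case the model wraps its output in a Markdown code block despite the system prompt.
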
## Training Details

- **Base model**: Qwen2.5-Coder-1.5B-Instruct (4-bit)
- **Teacher model**: DeepSeek V3 (685B parameters)
- **Method**: LoRA (rank 16, 0.3% of weights trainable)
- **Training data**: 500 diverse landing pages generated by DeepSeek V3
- **Training**: 600 iterations on Apple Silicon (M-series) using MLX
- **Best validation loss**: 0.218

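To illustrate why LoRA trains so few weights, the sketch below counts the parameters a rank-16 adapter adds to a single weight matrix. The 1536 x 1536 shape is an illustrative assumption, not the configuration used here. The 0.3% figure above is measured against the whole model and depends on which layers were adapted, so this example does not attempt to reproduce it.

```python
def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one d_in x d_out weight matrix.

    LoRA freezes W and learns a low-rank update B @ A, where A has shape
    (rank, d_in) and B has shape (d_out, rank).
    """
    return rank * (d_in + d_out)


def trainable_fraction(d_in: int, d_out: int, rank: int) -> float:
    """Adapter parameters relative to the frozen matrix they adapt."""
    return lora_params(d_in, d_out, rank) / (d_in * d_out)


# Illustrative only: a square 1536 x 1536 projection with rank 16.
adapter = lora_params(1536, 1536, 16)          # 16 * (1536 + 1536) = 49152
fraction = trainable_fraction(1536, 1536, 16)  # 49152 / 2359296, about 0.0208
```

Even for the one matrix it adapts, the adapter is about 2% of the frozen weights. The share of the full model is smaller again because most matrices typically receive no adapter.
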
## Training Data

The training dataset is available at [KalnRangelov/landing-page-training-data](https://huggingface.co/datasets/KalnRangelov/landing-page-training-data).

## Full Experiment

See the full experiment writeup, code, and example outputs on GitHub: [KalinRangelovRangelov/LLM-Landing-page-distillation](https://github.com/KalinRangelovRangelov/LLM-Landing-page-distillation)

## License

MIT