ModelHub XC 1b197e8f71 Initialize project; model provided by the ModelHub XC community

Model: Cyb3RQ/arabic-poetry-qwen3-8b-GGUF
Source: Original Platform

2026-06-06 09:54:15 +08:00


Arabic Poetry Qwen3-8B (LoRA, GGUF) — Experimental

Status: experimental / hobby project. This model produces coherent Arabic in a loose poetic register. It does not reliably produce correct classical meter (بحر) or rhyme (قافية), and output quality is uneven. Set expectations accordingly. Read the Limitations section before using.

A LoRA fine-tune of Qwen3-8B on a corpus of Arabic poetry that was OCR'd locally from 19 books. Trained as continued-pretraining on discrete poems.

What it actually does

Generates Arabic text with a poem-like shape (short lines, stops cleanly)
Stays in Arabic and on the prompt's theme more consistently than the base model does
Style leans modern/free-verse (Darwish-ish), not the classical ode

What it does NOT do well

No reliable meter or rhyme. It does not scan to a specific بحر.
Imagery is often weak or vague; some lines are semantically loose.
Quality varies a lot run-to-run.
Classical/Jahiliyya register is weak (the OCR corpus had artifacts).

This is a data-and-scale-limited result: 2,244 OCR'd poems (with residual OCR noise) on an 8B model are not enough to instill genuine Arabic prosody. It is shared as an experiment and a starting point, not a finished poetry engine.

Usage (LM Studio / llama.cpp)

Download arabic-poetry-qwen3-8b-f16.gguf. Recommended sampling:

Temperature 0.7, Top-p 0.92, Top-k 40
Repeat penalty 1.3 (lower values tend to make the output loop)
Reasoning/thinking: OFF
System prompt: empty
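
As a rough sketch, the settings above could be sent to a local llama.cpp server through its `/completion` endpoint. The helper name `build_completion_request` and the `n_predict` value are assumptions for illustration and do not come from this model card; LM Studio users would set the same values in the UI instead.

```python
import json

# Sampling settings recommended in this README.
SAMPLING = {
    "temperature": 0.7,
    "top_p": 0.92,
    "top_k": 40,
    "repeat_penalty": 1.3,
}


def build_completion_request(prompt: str, n_predict: int = 200) -> dict:
    """Build a request body for the llama.cpp server /completion endpoint.

    n_predict (maximum new tokens) is an assumed value, not a
    recommendation from the model card.
    """
    body = {"prompt": prompt, "n_predict": n_predict}
    body.update(SAMPLING)
    return body


if __name__ == "__main__":
    request = build_completion_request("قصيدة في وصف الصحراء:\n")
    # ensure_ascii=False keeps the Arabic readable when printed.
    print(json.dumps(request, ensure_ascii=False, indent=2))
```

The raw completion endpoint applies no chat template, so no system prompt is involved, which matches the "System prompt: empty" setting.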

Prompt with Arabic openers, not English instructions:

قصيدة في وصف الصحراء:
في حضرة الغياب،
أحبكِ يا وطني،

Note: send prompts via a client that handles UTF-8 correctly. Some terminal/curl setups on Windows mangle Arabic UTF-8, which makes the model emit garbage; that is a client encoding bug, not a model fault.
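
One way to rule out client-side encoding problems is to build the request bytes in a script instead of on a shell command line. The sketch below uses only the Python standard library: it encodes the prompt as UTF-8 JSON and refuses text that already looks mangled. The function names and the local server URL are hypothetical, and the `/completion` path assumes a llama.cpp server is running.

```python
import json
import urllib.request


def has_arabic(text: str) -> bool:
    """True if text has at least one code point in the Arabic block (U+0600..U+06FF)."""
    return any("\u0600" <= ch <= "\u06ff" for ch in text)


def encode_prompt(prompt: str) -> bytes:
    """Serialize the prompt as UTF-8 JSON, rejecting text that looks mangled."""
    if "\ufffd" in prompt or not has_arabic(prompt):
        raise ValueError("prompt does not look like intact Arabic text")
    return json.dumps({"prompt": prompt}, ensure_ascii=False).encode("utf-8")


def send(prompt: str, url: str = "http://127.0.0.1:8080/completion") -> str:
    """POST the prompt to a (hypothetical) local server and return the raw reply."""
    req = urllib.request.Request(
        url,
        data=encode_prompt(prompt),
        headers={"Content-Type": "application/json; charset=utf-8"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    body = encode_prompt("في حضرة الغياب،")
    # Round trip: decoding the bytes must give back the same Arabic prompt.
    assert json.loads(body.decode("utf-8"))["prompt"] == "في حضرة الغياب،"
    print(len(body), "bytes ready to send")
```

If `encode_prompt` raises on a prompt that looked fine when typed, the text was probably damaged before it reached the script, for example by the terminal's code page.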

Training


Base	`unsloth/Qwen3-8B` (full bf16, not quantized)
Method	LoRA r=32 α=32, attention-only, dropout 0.05
Data	2,244 discrete cleaned Arabic poems, EOS-terminated
Schedule	2 epochs, cosine LR 1.2e-4, manual training loop
Selection	best pre-overfit checkpoint by sample quality (not final)
Hardware	single RTX 4090, ~19 min
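
For a sense of scale, the adapter size implied by the table can be estimated with a little arithmetic. The sketch assumes "attention-only" means the `q_proj`, `k_proj`, `v_proj` and `o_proj` matrices, standard LoRA scaling (α / r), and the published Qwen3-8B shape (36 layers, hidden size 4096, 32 query heads, 8 key/value heads, head dimension 128). None of these are stated in this README, so treat the result as an estimate.

```python
# Assumed Qwen3-8B dimensions (not stated in this README).
HIDDEN = 4096
LAYERS = 36
Q_HEADS, KV_HEADS, HEAD_DIM = 32, 8, 128

# From the training table.
RANK, ALPHA = 32, 32


def lora_params(d_in: int, d_out: int, r: int) -> int:
    """LoRA adds A (r x d_in) and B (d_out x r) alongside one weight matrix."""
    return r * (d_in + d_out)


# Assumed target modules: (input dim, output dim) of each attention projection.
ATTENTION_PROJECTIONS = {
    "q_proj": (HIDDEN, Q_HEADS * HEAD_DIM),
    "k_proj": (HIDDEN, KV_HEADS * HEAD_DIM),
    "v_proj": (HIDDEN, KV_HEADS * HEAD_DIM),
    "o_proj": (Q_HEADS * HEAD_DIM, HIDDEN),
}

per_layer = sum(lora_params(i, o, RANK) for i, o in ATTENTION_PROJECTIONS.values())
total = per_layer * LAYERS
scaling = ALPHA / RANK  # factor applied to the low-rank update B @ A

if __name__ == "__main__":
    print(f"trainable LoRA parameters: {total:,}")
    print(f"LoRA scaling (alpha / r): {scaling}")
```

Under these assumptions the adapter has roughly 30.7M trainable parameters, well under 1% of the base model, and the low-rank update is applied with a scaling factor of 1.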

License

Apache-2.0 (inherited from Qwen3-8B).

Honest note

Built end-to-end on a single workstation (OCR → corpus cleaning → LoRA → GGUF). The most reusable artifact from the project is arguably the cleaned corpus and pipeline, not this particular adapter. Contributions are welcome; a cleaner, meter-labelled Arabic corpus in particular would meaningfully improve a v2.
