Initialize the project; model provided by the ModelHub XC community
Model: razy101/emojify-300m Source: Original Platform
This commit contains:
README.md (new file, 48 lines)
---
license: apache-2.0
base_model: unsloth/gemma-3-270m-it
library_name: transformers
language:
- en
tags:
- text-generation-inference
- transformers
- unsloth
- gemma3_text
- gemma-3
- emoji
- translation
- multilingual
- gguf
- edge-ai
datasets:
- custom-curated-emoji-distillation
---

# 🚀 Emojify-300M (Gemma-3-270M Fine-tuned)

**Emojify-300M** is a hyper-compact, specialized Large Language Model (LLM) designed for **semantic text-to-emoji distillation**. Based on the **Gemma-3-270M-IT** architecture, this model is optimized for edge computing and ultra-low-latency applications.
## 🛠 Technical Specifications

- **Architecture:** Gemma-3 (270M parameters)
- **Fine-tuning Method:** LoRA (Low-Rank Adaptation) via **Unsloth**
- **Context Window:** 2048 tokens
- **Language Support:** Native multilingual
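As a sketch of how an instruction-tuned Gemma fine-tune like this is typically prompted: the `<start_of_turn>`/`<end_of_turn>` markers below follow Gemma's standard chat template, but the instruction wording itself is an illustrative assumption, not this card's official prompt.

```python
# Illustrative sketch: building a Gemma-style chat prompt for a text-to-emoji
# request. The turn markers are Gemma's standard chat-template tokens; the
# instruction text is an assumption, not taken from this model card.

def build_emojify_prompt(text: str) -> str:
    """Wrap user text in Gemma's chat turn format for emoji generation."""
    instruction = f"Convert the following text to emojis: {text}"
    return (
        f"<start_of_turn>user\n{instruction}<end_of_turn>\n"
        f"<start_of_turn>model\n"
    )

print(build_emojify_prompt("I love pizza"))
```

In practice you would pass this string (or use the tokenizer's built-in `apply_chat_template`) to the model and decode until the EOS token.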
## 📈 Performance & Benchmarks (Local CPU)

Tested on consumer-grade hardware (e.g., a laptop CPU):

- **Prompt Evaluation:** ~210 tokens/s
- **Token Generation (Eval):** ~48 tokens/s
- **Total Latency:** < 500 ms (near-instant response)
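A quick back-of-the-envelope check shows how these throughput figures combine into the sub-500 ms latency claim; the 20-token prompt and 15-token reply below are assumed workload sizes, not numbers from this card.

```python
# Back-of-the-envelope latency estimate from the throughput figures above.
# The workload sizes (20 prompt tokens, 15 generated tokens) are assumptions.

PROMPT_EVAL_TPS = 210   # prompt evaluation throughput, tokens/s
GENERATION_TPS = 48     # autoregressive generation throughput, tokens/s

def estimated_latency_ms(prompt_tokens: int, output_tokens: int) -> float:
    """Total time = prompt processing time + token generation time."""
    seconds = prompt_tokens / PROMPT_EVAL_TPS + output_tokens / GENERATION_TPS
    return seconds * 1000.0

# A short sentence plus a brief emoji reply stays under the 500 ms budget.
print(round(estimated_latency_ms(20, 15)))  # ≈ 408 ms
```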
## 🎯 Key Features

- **Zero-Shot Cross-Lingual Transfer:** Leveraging Gemma 3's robust base weights, the model accurately processes languages not explicitly present in the fine-tuning set.
- **Noise Suppression:** Specifically trained to suppress conversational filler and "hallucinated" text, focusing strictly on relevant emoji output.
- **EOS Stability:** Fine-tuned to respect the End-of-Sequence (EOS) token, preventing the "looping" behavior common in smaller models.
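The noise suppression described above is trained into the model, but a defensive post-processing filter can be sketched as well; the unicode ranges and helper name here are illustrative assumptions covering common emoji blocks, not an exhaustive emoji definition and not part of the model itself.

```python
# Illustrative post-processing sketch: keep only emoji characters from a model
# response. The regex ranges cover common emoji blocks (symbols & pictographs,
# misc. symbols/dingbats, and the emoji variation selector) and are an
# assumption, not an exhaustive emoji specification.
import re

EMOJI_RE = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF\uFE0F]+")

def extract_emoji(text: str) -> str:
    """Drop everything except emoji characters from a model response."""
    return "".join(EMOJI_RE.findall(text))

print(extract_emoji("Sure! Here you go: 🍕❤️"))
```

Such a filter is a belt-and-braces step for downstream pipelines that must guarantee emoji-only output even if the model occasionally emits filler text.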