---
base_model: Qwen/Qwen2.5-7B-Instruct
language:
- en
- multilingual
license: apache-2.0
tags:
- qwen2
- 4-bit
- gptq
- quantized
- text-generation
- coding
- reasoning
- agentic
- 7b
---

# 🦊 Fox 1.5

## Benchmark Board

| Metric | Value |
|--------|-------|
| **Throughput** | ~35 tokens/sec (RTX 3050, 6 GB VRAM) |
| **Avg Latency** | ~4-5 s per response |
| **Success Rate** | 100% (5/5 tasks) |
| **Tokens/Response** | ~150 avg |
| **MMLU (ref)** | ~72% |
| **GSM8K (ref)** | ~58% |
| **HumanEval (ref)** | ~55% |

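The throughput, latency, and token-count rows are mutually consistent, which is a useful sanity check when reproducing these numbers on your own hardware:

```python
# Sanity check: at ~35 tokens/sec, an average ~150-token response
# should take roughly 4-5 seconds, matching the latency row above.
throughput_tps = 35   # tokens per second (RTX 3050 figure from the table)
avg_tokens = 150      # average tokens per response

expected_latency_s = avg_tokens / throughput_tps
print(f"expected latency: ~{expected_latency_s:.1f} s")  # ~4.3 s
```
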
### Task Results

| Task | Prompt | Check | Result |
|------|--------|-------|--------|
| Math | "A farmer has 17 sheep. All but 9 run away. How many sheep left?" | `9` | ✅ |
| Coding | "Write a Python function to check if a number is prime." | `def` | ✅ |
| Knowledge | "What is the capital of Greece?" | `athens` | ✅ |
| Logic | "If all cats are animals and some animals are pets, then some cats are pets. True or false?" | `true` | ✅ |
| Translation | "Translate to Greek: Hello, how are you?" | `γεια` | ✅ |

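Each ✅ above comes from matching the model's reply against the Check column. The exact harness isn't published; `passes_check` below is a hypothetical minimal version of that logic:

```python
def passes_check(response: str, expected: str) -> bool:
    """Case-insensitive substring check against the Check column."""
    return expected.lower() in response.lower()

# Example: the Knowledge task passes if the reply mentions "athens".
print(passes_check("The capital of Greece is Athens.", "athens"))  # True
```
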
---

## Quick Facts

| Property | Value |
|----------|-------|
| Base Model | Qwen2.5-7B-Instruct |
| Quantization | GPTQ 4-bit |
| Parameters | 7B |
| Context Length | 32K tokens |
| Size | 5.3 GB |
| VRAM Required | ~6 GB |
| License | Apache 2.0 |

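The 5.3 GB checkpoint size follows roughly from the quantization arithmetic: 7B parameters at 4 bits each is 3.5 GB, and quantization scales, zero-points, and higher-precision embedding/output layers account for the rest (that overhead split is an assumption, not a published breakdown):

```python
# Rough size estimate for a 4-bit GPTQ 7B model: weights alone at
# 4 bits/parameter, before quantization metadata and fp16 layers.
params = 7e9
bits_per_weight = 4

base_gb = params * bits_per_weight / 8 / 1e9
print(f"4-bit weights alone: {base_gb:.1f} GB")  # 3.5 GB; metadata and fp16 layers fill out the 5.3 GB checkpoint
```
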
## Capabilities

- **Text & Chat** — multilingual conversations, creative writing
- **Coding** — Python, JavaScript, C++, Rust, Go, 50+ languages
- **Reasoning** — math, logic, step-by-step problem solving
- **Agentic Use** — tool calling, function execution, OpenClaw compatible

## Run it

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "teolm30/Fox-1.5"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

messages = [{"role": "user", "content": "Explain quantum entanglement in simple terms"}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

For 4-bit GPTQ loading, install the GPTQ backends first: `pip install auto-gptq optimum`

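The script above leans on `tokenizer.apply_chat_template` to format the conversation. For Qwen-family models this emits ChatML-style markup; the hypothetical `chatml_prompt` helper below sketches that shape by hand (the tokenizer's own template is authoritative and may also inject a default system message):

```python
# Hand-rolled sketch of the ChatML format Qwen-style templates produce.
# Illustration only: use tokenizer.apply_chat_template in real code.
def chatml_prompt(messages):
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages]
    parts.append("<|im_start|>assistant\n")  # generation prompt for the model's turn
    return "".join(parts)

print(chatml_prompt([{"role": "user", "content": "Hello"}]))
```
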
## Limitations

- Text-only (no vision support in the base model)
- Image generation requires a separate model

---

*Built by T_craftClaw 🔥 | Owner: teolm30*