Initialize project; model provided by the ModelHub XC community
Model: richardyoung/OLMo-3-7B-RLZero-Math-GGUF Source: Original Platform
README.md
---
license: apache-2.0
base_model: allenai/OLMo-3-7B-RLZero-Math
language:
- en
library_name: transformers
tags:
- gguf
- mlx
- ollama
- math
- reasoning
- olmo
- quantized
pipeline_tag: text-generation
---

# OLMo-3-7B-RLZero-Math GGUF

> **GGUF & MLX quantizations** of the Allen Institute for AI's mathematical reasoning model, optimized for local inference with llama.cpp, Ollama, and Apple Silicon.

## Highlights

| | |
|---|---|
| **Math Specialist** | Fine-tuned with RL-Zero for step-by-step mathematical reasoning |
| **65K Context** | 65,536-token context window with YaRN scaling |
| **Apple Silicon Ready** | MLX-optimized 4-bit quantization included |
| **Runs Anywhere** | Quantizations range from a 4 GB RAM minimum (`IQ3_M`) up to full-precision `F16` |

## Model Specifications

| Property | Value |
|----------|-------|
| **Parameters** | 7 billion |
| **Architecture** | OLMo2 |
| **Context Length** | 65,536 tokens |
| **Training** | RL-Zero mathematical reasoning |
| **License** | Apache 2.0 |

## Available Versions

### GGUF Quantizations

| Quantization | Size | Quality | Use Case |
|--------------|------|---------|----------|
| `F16` | 14 GB | Near-perfect | Maximum quality, research |
| `Q8_0` | 7.2 GB | Excellent | Near-lossless, high-end hardware |
| `Q5_K_M` | 4.9 GB | Very Good | Excellent quality/size balance |
| `Q4_K_M` | 4.2 GB | Good | **Recommended** for most users |
| `IQ4_XS` | 3.8 GB | Good | Compact 4-bit |
| `IQ3_M` | 3.2 GB | Acceptable | Ultra-compact, constrained devices |

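As a rough sanity check on the sizes in the table above, a file size can be converted into an effective bits-per-weight figure. The sketch below assumes exactly 7 billion parameters and decimal gigabytes (10^9 bytes). Neither assumption comes from the model card, and GGUF files also carry metadata and mixed-precision tensors, so the results are indicative only.

```python
# Approximate bits per weight for each GGUF file listed in the table above.
# Assumptions (not stated in the model card): 7e9 parameters, 1 GB = 1e9 bytes.
PARAMS = 7e9

SIZES_GB = {
    "F16": 14.0,
    "Q8_0": 7.2,
    "Q5_K_M": 4.9,
    "Q4_K_M": 4.2,
    "IQ4_XS": 3.8,
    "IQ3_M": 3.2,
}


def bits_per_weight(size_gb: float, params: float = PARAMS) -> float:
    """Convert a file size in GB into average bits stored per parameter."""
    return size_gb * 1e9 * 8 / params


if __name__ == "__main__":
    for name, size in SIZES_GB.items():
        print(f"{name:8s} {bits_per_weight(size):5.2f} bits/weight")
```

Under these assumptions `F16` works out to 16 bits per weight and `Q4_K_M` to about 4.8.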
### MLX (Apple Silicon)

A 4-bit quantized version is in the `MLX-4bit/` folder, optimized for M1/M2/M3/M4 Macs.

## Quick Start

### Ollama (Easiest)

```bash
ollama run richardyoung/olmo-3-7b-rlzero-math
```

### llama.cpp

```bash
# Download Q4_K_M (recommended)
wget https://huggingface.co/richardyoung/OLMo-3-7B-RLZero-Math-GGUF/resolve/main/Olmo-3-7B-RLZero-Math-Q4_K_M.gguf

# Run inference
./llama-cli -m Olmo-3-7B-RLZero-Math-Q4_K_M.gguf \
  -p "Solve step by step: What is 15% of 240?" \
  -n 512
```

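To fetch a different quantization, the download URL can be built from the quantization name. This is a minimal sketch that assumes every GGUF file follows the same naming pattern as the `Q4_K_M` file in the `wget` command above; only that one file name appears in this README, so check the repository's file list before relying on the others. The helper name `gguf_url` is illustrative.

```python
# Build a download URL for a given quantization.
# Assumption: all files are named like the Q4_K_M file shown in the wget
# command; only that file name is confirmed by this README.
REPO = "richardyoung/OLMo-3-7B-RLZero-Math-GGUF"
QUANTS = ("F16", "Q8_0", "Q5_K_M", "Q4_K_M", "IQ4_XS", "IQ3_M")


def gguf_url(quant: str) -> str:
    """Return the (assumed) direct-download URL for one quantization."""
    if quant not in QUANTS:
        raise ValueError(f"unknown quantization: {quant!r}")
    filename = f"Olmo-3-7B-RLZero-Math-{quant}.gguf"
    return f"https://huggingface.co/{REPO}/resolve/main/{filename}"


if __name__ == "__main__":
    print(gguf_url("Q4_K_M"))
```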
### MLX (Apple Silicon)

```bash
pip install mlx-lm

# The MLX weights live in the MLX-4bit/ subfolder, so fetch that folder
# and point --model at the local copy
huggingface-cli download richardyoung/OLMo-3-7B-RLZero-Math-GGUF \
  --include "MLX-4bit/*" --local-dir .

mlx_lm.generate \
  --model ./MLX-4bit \
  --prompt "Solve: Find the derivative of x^3 + 2x" \
  --trust-remote-code
```

### Python

```python
from llama_cpp import Llama

# Load the quantized model; n_ctx sets the context window for this session
llm = Llama(
    model_path="Olmo-3-7B-RLZero-Math-Q4_K_M.gguf",
    n_ctx=4096
)

# Generate up to 512 tokens for a single prompt
output = llm(
    "Solve step by step: What is the sum of the first 10 prime numbers?",
    max_tokens=512
)
print(output["choices"][0]["text"])
```

## System Requirements

| Quantization | Min RAM | Recommended RAM | Apple Silicon |
|--------------|---------|-----------------|---------------|
| IQ3_M | 4 GB | 8 GB | M1 8 GB |
| IQ4_XS / Q4_K_M | 6 GB | 12 GB | M1 8 GB |
| Q5_K_M / Q8_0 | 8 GB | 16 GB | M1 16 GB |
| F16 | 16 GB | 24 GB | M2 Pro+ |

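The table above can be turned into a small lookup that suggests a quantization for a given amount of RAM. The sketch below is one possible reading of it: the RAM figures are copied from the table, while the quality ordering (taken from the order of the GGUF table) and the function name `pick_quantization` are assumptions made for illustration.

```python
from typing import Optional

# (quantization, minimum RAM in GB, recommended RAM in GB), best quality first.
# RAM figures come from the table above; the ordering is an assumption.
REQUIREMENTS = [
    ("F16", 16, 24),
    ("Q8_0", 8, 16),
    ("Q5_K_M", 8, 16),
    ("Q4_K_M", 6, 12),
    ("IQ4_XS", 6, 12),
    ("IQ3_M", 4, 8),
]


def pick_quantization(ram_gb: float, use_recommended: bool = True) -> Optional[str]:
    """Return the highest-quality quantization that fits, or None if none does."""
    for name, min_ram, rec_ram in REQUIREMENTS:
        needed = rec_ram if use_recommended else min_ram
        if ram_gb >= needed:
            return name
    return None


if __name__ == "__main__":
    for ram in (4, 8, 12, 16, 24):
        print(ram, "GB ->", pick_quantization(ram))
```

By default the lookup uses the recommended column, which leaves headroom for the context window and other processes; pass `use_recommended=False` to use the minimum column instead.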
## Prompt Format

```
Solve the following math problem step by step:
{your problem here}
```

**Example:**
```
Solve the following math problem step by step:
A train travels 120 miles in 2 hours. If it continues at the same speed,
how long will it take to travel 300 miles?
```

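The template above is simple enough to apply with a small helper. This is a minimal sketch; the name `build_prompt` and the whitespace handling are illustrative choices, not part of the model's tooling.

```python
# Wrap a problem statement in the prompt format shown above.
PREFIX = "Solve the following math problem step by step:"


def build_prompt(problem: str) -> str:
    """Return the instruction line followed by the problem on the next line."""
    problem = problem.strip()
    if not problem:
        raise ValueError("problem must not be empty")
    return f"{PREFIX}\n{problem}"


if __name__ == "__main__":
    print(build_prompt("What is 15% of 240?"))
```

The resulting string can be passed as the prompt to `llm(...)` in the Python example or to `llama-cli` via `-p`.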
## Links

| Resource | Link |
|----------|------|
| **Original Model** | [allenai/OLMo-3-7B-RLZero-Math](https://huggingface.co/allenai/OLMo-3-7B-RLZero-Math) |
| **Ollama** | [richardyoung/olmo-3-7b-rlzero-math](https://ollama.com/richardyoung/olmo-3-7b-rlzero-math) |
| **Allen AI** | [allenai.org](https://allenai.org/) |

---

**Quantization by** [Richard Young](https://huggingface.co/richardyoung) | **Original model by** [Allen Institute for AI](https://allenai.org/)