Initialize the project; model provided by the ModelHub XC community
Model: DJLougen/Nemotron-Research-GooseReason-4B-Instruct-MLX-16bit Source: Original Platform
---
license: cc-by-nc-4.0
language:
- en
base_model: nvidia/Nemotron-Research-GooseReason-4B-Instruct
pipeline_tag: text-generation
library_name: mlx
tags:
- mlx
- qwen3
- reasoning
- rlvr
- math
- code
- stem
- nvidia
---

# GooseReason-4B-Instruct — MLX 16-bit (Full Precision)

This is the **full-precision MLX** version of [nvidia/Nemotron-Research-GooseReason-4B-Instruct](https://huggingface.co/nvidia/Nemotron-Research-GooseReason-4B-Instruct), converted for inference using [MLX](https://github.com/ml-explore/mlx).

## Model Overview

| Attribute | Value |
|---|---|
| **Original Model** | [nvidia/Nemotron-Research-GooseReason-4B-Instruct](https://huggingface.co/nvidia/Nemotron-Research-GooseReason-4B-Instruct) |
| **Architecture** | Qwen3 (4.4B parameters) |
| **Precision** | 16-bit (BFloat16, no quantization) |
| **Base Model** | Qwen3-4B-Instruct-2507 |
| **Training Method** | RLVR (Reinforcement Learning with Verifiable Rewards) |
| **Max Sequence Length** | 32,768 tokens |
| **License** | CC-BY-NC-4.0 |

## About GooseReason-4B

Nemotron-Research-GooseReason-4B-Instruct is NVIDIA's reasoning model built on Qwen3-4B-Instruct-2507 using RLVR. It achieves strong performance on math, code, and STEM reasoning benchmarks while remaining compact at 4B parameters.

### Key Capabilities

- **Math Reasoning**: Strong performance on AIME 2025 and AMC benchmarks
- **Code Generation**: Competitive on LiveCodeBench and HumanEval
- **STEM**: Broad science and technical reasoning capabilities
- **Thinking Mode**: Uses extended thinking (`<think>` tags) for complex reasoning tasks

### Benchmark Highlights

| Benchmark | GooseReason-4B |
|---|---|
| AIME 2025 (avg@64) | 55.0 |
| AMC (avg@64) | 82.2 |
| LiveCodeBench v6 (pass@1) | 30.1 |
| GPQA Diamond (avg@8) | 47.5 |

## Usage with MLX

```bash
pip install mlx-lm
```

```python
from mlx_lm import load, generate

model, tokenizer = load("DJLougen/Nemotron-Research-GooseReason-4B-Instruct-MLX-16bit")

messages = [
    {"role": "user", "content": "Solve: What is the sum of all prime numbers less than 20?"}
]

prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
response = generate(model, tokenizer, prompt=prompt, max_tokens=2048)
print(response)
```
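
As a quick sanity check on the example prompt, the expected answer can be computed directly (2 + 3 + 5 + 7 + 11 + 13 + 17 + 19 = 77):

```python
# Verify the example prompt's answer: the sum of all primes below 20.
def is_prime(n: int) -> bool:
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n**0.5) + 1))

primes = [n for n in range(20) if is_prime(n)]
print(primes)       # [2, 3, 5, 7, 11, 13, 17, 19]
print(sum(primes))  # 77
```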

### Enabling Extended Thinking

For complex reasoning tasks, the model uses `<think>` tags automatically. You can also prompt it explicitly:

```python
messages = [
    {
        "role": "system",
        "content": "Think step by step before answering."
    },
    {
        "role": "user",
        "content": "Find all positive integers n such that n^2 + 2n + 2 is divisible by 7."
    }
]
```
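
If you only want the final answer, the reasoning trace can be stripped from the raw output. A minimal sketch, assuming the model wraps its reasoning in `<think>…</think>` as described above; the helper name is illustrative, not part of `mlx-lm`, and the sample string is a stand-in for real model output:

```python
import re

def strip_thinking(text: str) -> str:
    """Remove <think>...</think> reasoning blocks, keeping only the final answer."""
    # DOTALL lets the pattern span multi-line reasoning traces.
    return re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL).strip()

raw = "<think>(n+1)^2 ≡ -1 (mod 7), but -1 is not a quadratic residue mod 7...</think>\nNo such n exists."
print(strip_thinking(raw))  # No such n exists.
```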

## All Available Formats

| Variant | Link | Size |
|---|---|---|
| MLX 16-bit | **This repo** | ~8.8 GB |
| MLX 8-bit | [DJLougen/Nemotron-Research-GooseReason-4B-Instruct-MLX-8bit](https://huggingface.co/DJLougen/Nemotron-Research-GooseReason-4B-Instruct-MLX-8bit) | ~4.6 GB |
| MLX 6-bit | [DJLougen/Nemotron-Research-GooseReason-4B-Instruct-MLX-6bit](https://huggingface.co/DJLougen/Nemotron-Research-GooseReason-4B-Instruct-MLX-6bit) | ~3.5 GB |
| MLX 4-bit | [DJLougen/Nemotron-Research-GooseReason-4B-Instruct-MLX-4bit](https://huggingface.co/DJLougen/Nemotron-Research-GooseReason-4B-Instruct-MLX-4bit) | ~2.5 GB |
| Full Weights | [nvidia/Nemotron-Research-GooseReason-4B-Instruct](https://huggingface.co/nvidia/Nemotron-Research-GooseReason-4B-Instruct) | ~8.8 GB |
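
Quantized variants like those above can be produced with `mlx-lm`'s conversion tool. A sketch, assuming the current `mlx_lm.convert` CLI flags (the output path is illustrative; check `mlx_lm.convert --help` for your installed version):

```shell
# Convert the original HF weights to MLX format, quantizing to 4-bit.
mlx_lm.convert \
    --hf-path nvidia/Nemotron-Research-GooseReason-4B-Instruct \
    --mlx-path ./GooseReason-4B-Instruct-MLX-4bit \
    -q --q-bits 4
```

Omitting `-q` keeps the weights at full 16-bit precision, which is how this repository was built.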

## Acknowledgments

- [NVIDIA](https://huggingface.co/nvidia) for the GooseReason-4B model and RLVR research
- [Qwen Team](https://huggingface.co/Qwen) for the Qwen3-4B-Instruct-2507 base model
- [Apple MLX Team](https://github.com/ml-explore/mlx) for the MLX framework