Initialize project; model provided by the ModelHub XC community

Model: Shahansha/Manthan-1.5B Source: Original Platform
2026-04-22 13:19:58 +08:00
commit 23c99c3afa
14 changed files with 497 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,188 @@
+---
+language:
+- en
+license: apache-2.0
+base_model: Qwen/Qwen2.5-1.5B-Instruct
+base_model_relation: finetune
+library_name: transformers
+pipeline_tag: text-generation
+tags:
+- genesis-agi
+- manthan
+- qwen2
+- tool-calling
+- agent
+- reasoning
+- grpo
+- qlora
+- chatml
+- smolagents
+datasets:
+- Shahansha/manthan-tool-reasoning-v1
+- glaiveai/glaive-function-calling-v2
+- NousResearch/hermes-function-calling-v1
+metrics:
+- accuracy
+- pass@1
+model-index:
+-
+    name: Manthan-1.5B
+    results:
+    -
+        task:
+            type: text-generation
+            name: Tool-Augmented Generation
+        dataset:
+            name: GSM8K
+            type: gsm8k
+        metrics:
+        -
+            name: Tool-Augmented Accuracy
+            type: accuracy
+            value: 65.0
+    -
+        task:
+            type: text-generation
+            name: Code Generation
+        dataset:
+            name: MBPP
+            type: mbpp
+        metrics:
+        -
+            name: pass@1
+            type: pass@1
+            value: 50.0
+---
+
+# Genesis Manthan - 1.5B
+
+Genesis Manthan is a small language model fine-tuned to reason through tool interaction instead of verbal chain-of-thought. It is built on top of Qwen2.5-1.5B-Instruct and tuned for tool-first responses, agent workflows, and smolagents-style execution loops.
+
+## Model Summary
+
+- Base model: `Qwen/Qwen2.5-1.5B-Instruct`
+- Published model: `Shahansha/Manthan-1.5B`
+- Training recipe: QLoRA SFT -> GRPO with tool-execution rewards -> budget forcing at inference time
+- Primary behavior: emit structured tool calls before final answers
+- Intended ecosystem: Hugging Face Transformers, Gradio Spaces, smolagents, local agent runners
+
+## Why this model exists
+
+Most small open models still answer by generating verbose text, even when the task would be better solved through an external tool. Manthan is designed around a different behavior: call a tool, observe the result, and then answer. The target is not hidden verbal reasoning. The target is reliable action traces that small models can actually execute.
+
+Demo Space: `Shahansha/Manthan-Demo`
+
+## Benchmark Snapshot
+
+| Benchmark | Metric | Reported Result |
+|---|---:|---:|
+| GSM8K | Tool-augmented accuracy | 65.0 |
+| MBPP | pass@1 | 50.0 |
+
+Note: reported benchmark numbers are early project metrics and should be independently reproduced before strong claims are made.
+
+## Quickstart
+
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+import torch
+
+model_id = "Shahansha/Manthan-1.5B"
+tokenizer = AutoTokenizer.from_pretrained(model_id)
+model = AutoModelForCausalLM.from_pretrained(
+    model_id,
+    # Recent transformers releases accept `dtype`; older releases expect `torch_dtype` instead.
+    dtype=torch.float16,
+    device_map="auto",
+)
+# Unset the default max_length so it does not compete with max_new_tokens in generate().
+model.generation_config.max_length = None
+
+messages = [
+    {
+        "role": "system",
+        "content": (
+            "You are Genesis Manthan, an AI agent that solves problems by calling tools. "
+            "Never reason verbally - always reason through tool execution."
+        ),
+    },
+    {"role": "user", "content": "What is 144 + 256?"},
+]
+
+prompt = tokenizer.apply_chat_template(
+    messages,
+    tokenize=False,
+    add_generation_prompt=True,
+)
+inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+outputs = model.generate(
+    **inputs,
+    max_new_tokens=256,
+    do_sample=True,
+    temperature=0.2,
+)
+
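+# Decode only the newly generated tokens; special tokens are kept so the raw output format stays visible.
+print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=False))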
+```
+
+Expected behavior: the completion should include a `<tool_call>` block before the final answer.
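
To check that expectation programmatically, the completion can be scanned for `<tool_call>` blocks. The following is a minimal sketch, not part of the published model code. It assumes each block wraps a single JSON object with a `name` key, as in Qwen2.5/Hermes-style chat templates; the helper `extract_tool_calls`, the `calculator` tool, and the sample completion are hypothetical.

```python
import json
import re

# Assumption: each call is a JSON object wrapped in <tool_call>...</tool_call>.
# Adjust the pattern if the model's real output format differs.
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)


def extract_tool_calls(completion: str) -> list[dict]:
    """Return every well-formed tool call found in a completion."""
    calls = []
    for raw in TOOL_CALL_RE.findall(completion):
        try:
            payload = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip malformed JSON instead of raising
        if isinstance(payload, dict) and "name" in payload:
            calls.append(payload)
    return calls


# Hypothetical completion, written by hand for illustration only.
sample = (
    '<tool_call>\n{"name": "calculator", "arguments": {"expression": "144 + 256"}}\n'
    "</tool_call>"
)
calls = extract_tool_calls(sample)
```

An empty list from `extract_tool_calls` indicates that the model answered in plain text or emitted a malformed call, which is a cue to retry or to fall back to another path.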
+
+## Prompting Guidance
+
+This model performs best when the system prompt explicitly instructs it to solve problems by calling tools. If you omit that instruction, it may drift back toward plain-text assistant behavior.
+
+Recommended system message:
+
+```text
+You are Genesis Manthan, an AI agent that solves problems by calling tools. Never reason verbally - always reason through tool execution.
+```
+
+## Training Details
+
+- Base checkpoint: `Qwen/Qwen2.5-1.5B-Instruct`
+- Fine-tuning method: QLoRA SFT
+- Reinforcement learning: GRPO with composable rewards for tool execution, answer correctness, and format compliance
+- Data format: ChatML with custom tool roles and structured `<tool_call>` blocks
+- Primary training data: `Shahansha/manthan-tool-reasoning-v1` plus function-calling traces derived from Glaive and Hermes datasets
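
The idea of composable rewards can be illustrated in plain Python. This sketch is an assumption about how such rewards might be combined, not the project's actual training code: the function names, the weights, and the substring-based correctness check are all hypothetical, and the tool-execution reward is left out because it would need a sandboxed executor.

```python
import re
from typing import Callable

# A reward maps (completion, reference_answer) to a score.
RewardFn = Callable[[str, str], float]


def format_reward(completion: str, reference: str) -> float:
    """1.0 if the completion contains at least one closed <tool_call> block."""
    found = re.search(r"<tool_call>.*?</tool_call>", completion, re.DOTALL)
    return 1.0 if found else 0.0


def correctness_reward(completion: str, reference: str) -> float:
    """1.0 if the reference appears in the text after the last tool call."""
    final_text = completion.rsplit("</tool_call>", 1)[-1]
    return 1.0 if reference.strip() in final_text else 0.0


def combine(weighted: list[tuple[RewardFn, float]]) -> RewardFn:
    """Build one reward function as a weighted sum of the given ones."""
    def total(completion: str, reference: str) -> float:
        return sum(weight * fn(completion, reference) for fn, weight in weighted)
    return total


# Hypothetical weights, chosen only for illustration.
reward = combine([(format_reward, 0.5), (correctness_reward, 1.0)])
score = reward("<tool_call>{}</tool_call> The answer is 400.", "400")
```

Keeping each reward as a separate function makes it possible to reweight or drop one signal without touching the others.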
+
+## Intended Use
+
+- Agentic math and reasoning tasks where external execution is available
+- Tool-augmented code and debugging workflows
+- Research experiments around small-model tool use
+- Gradio demos and Hugging Face Spaces showcasing action-first reasoning
+
+## Limitations
+
+- This is a research model, not a general factual authority
+- Reported benchmark numbers are early project metrics and should be independently reproduced before strong claims are made
+- The model relies heavily on the surrounding prompt and tool scaffolding
+- Small models can still emit malformed tool calls or conclude too early without budget forcing or downstream validation
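
This card does not specify how budget forcing is implemented. One loose reading, sketched below under that assumption, is a wrapper that refuses to accept an output with too few tool calls and instead appends an opening `<tool_call>` tag so the next round has to continue inside a call. `generate_with_tool_budget`, `stub_generate`, and the nudge string are hypothetical; `generate_fn` stands in for any function that maps a prompt string to a completion string.

```python
from typing import Callable


def generate_with_tool_budget(
    generate_fn: Callable[[str], str],
    prompt: str,
    min_tool_calls: int = 1,
    max_rounds: int = 3,
    nudge: str = "\n<tool_call>\n",
) -> str:
    """Regenerate until the output has enough closed tool calls or rounds run out."""
    output = generate_fn(prompt)
    rounds = 1
    while output.count("</tool_call>") < min_tool_calls and rounds < max_rounds:
        # The model stopped early: open a tool call for it and let it continue.
        output += nudge
        output += generate_fn(prompt + output)
        rounds += 1
    return output


def stub_generate(text: str) -> str:
    """Stand-in for a real model: answers directly unless a tool call is already open."""
    if text.endswith("<tool_call>\n"):
        return '{"name": "calculator", "arguments": {"expression": "144 + 256"}}\n</tool_call>'
    return "The answer is 400."


forced = generate_with_tool_budget(stub_generate, "What is 144 + 256?")
```

The `max_rounds` cap keeps the loop from running forever when the model never produces a closed call, so the result still needs downstream validation.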
+
+## Safety and Responsible Use
+
+- Do not treat tool-call output as inherently safe to execute without sandboxing
+- Validate JSON arguments and restrict available tools in production
+- Review outputs carefully in coding, shell, or data-execution environments
+- This model was not trained for high-stakes legal, medical, or safety-critical decisions
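
The advice to validate JSON arguments and restrict available tools can be sketched as an allowlist check that runs before any tool is executed. This is an illustrative example under stated assumptions, not the project's own validator: the `ALLOWED_TOOLS` table, the `calculator` tool, and the `name`/`arguments` layout are hypothetical.

```python
import json

# Hypothetical allowlist: tool name -> required argument names and their types.
ALLOWED_TOOLS = {
    "calculator": {"expression": str},
}


def validate_tool_call(raw_json: str) -> dict:
    """Parse one tool call and raise ValueError for anything outside the allowlist."""
    call = json.loads(raw_json)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(call, dict):
        raise ValueError("tool call must be a JSON object")
    name = call.get("name")
    if not isinstance(name, str) or name not in ALLOWED_TOOLS:
        raise ValueError(f"tool not allowed: {name!r}")
    args = call.get("arguments")
    if not isinstance(args, dict):
        raise ValueError("arguments must be a JSON object")
    schema = ALLOWED_TOOLS[name]
    if set(args) != set(schema):
        raise ValueError(f"unexpected or missing arguments for {name!r}")
    for key, expected_type in schema.items():
        if not isinstance(args[key], expected_type):
            raise ValueError(f"argument {key!r} must be {expected_type.__name__}")
    return call


checked = validate_tool_call(
    '{"name": "calculator", "arguments": {"expression": "144 + 256"}}'
)
```

Validation of this kind only checks the shape of a call. The tool itself should still run inside a sandbox.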
+
+## Project Links
+
+- Model: https://huggingface.co/Shahansha/Manthan-1.5B
+- Dataset: https://huggingface.co/datasets/Shahansha/manthan-tool-reasoning-v1
+- Code: https://github.com/shaik-shahansha/manthan
+- Deployment guide: https://github.com/shaik-shahansha/manthan/blob/main/docs/HUGGINGFACE_DEPLOY.md
+- Author: https://shahansha.com
+- Org: https://genesisagi.in
+
+## Citation
+
+```bibtex
+@misc{shaik2026manthan,
+    title={Genesis Manthan-1.5B: Tool-Mediated Reasoning for Small Language Models},
+    author={Shahansha Shaik},
+    year={2026},
+    url={https://huggingface.co/Shahansha/Manthan-1.5B}
+}
+```
+
+---