Initialize project; model provided by the ModelHub XC community
Model: applexml/kimi-k2 Source: Original Platform
151
README.md
Normal file
@@ -0,0 +1,151 @@
---
language:
- en
license: apache-2.0
tags:
- llm
- tool-calling
- lightweight
- agentic-tasks
- react
- mlx
model-index:
- name: NanoAgent
  results: []
datasets:
- microsoft/orca-agentinstruct-1M-v1
- microsoft/orca-math-word-problems-200k
- allenai/tulu-3-sft-personas-instruction-following
- xingyaoww/code-act
- m-a-p/Code-Feedback
- weijie210/gsm8k_decomposed
- Locutusque/function-calling-chatml
- HuggingFaceTB/smoltalk
base_model:
- HuggingFaceTB/SmolLM2-135M-Instruct
pipeline_tag: text-generation
---
# POC

# FORKED FROM
# 🧠 NanoAgent: 135M Parameter Agentic LLM

NanoAgent is a compact 135M-parameter language model with an 8k context length, trained to **perform tool calls** and **generate responses based on tool outputs**.
Despite its small size (~135 MB in 8-bit precision), it is optimized for agentic use cases and runs easily on personal devices.

**GitHub:** [NanoAgent](https://github.com/QuwsarOhi/NanoAgent)

**Inference resource:** [link](https://github.com/QuwsarOhi/NanoAgent/blob/main/notebooks/inference.ipynb)

---

## ✨ Features

- 🧰 **Tool Calling**: emits structured tool calls and responds based on the tool outputs.
- 🧭 **Instruction Following**: fine-tuned on instruction-following data.
- 🧠 **Basic Reasoning**: handles lightweight reasoning and ReAct-style interactions.
- ⚡ **Lightweight**: runs on local hardware with minimal resources.

---

## 🧪 Training Overview

- **Base model:** [`SmolLM2-135M-Instruct`](https://huggingface.co/HuggingFaceTB/SmolLM2-135M-Instruct)
- **Fine-tuning method:** [Dynamic Fine-Tuning (DFT)](https://github.com/yongliang-wu/DFT/tree/master)
- **Hardware:** Apple Mac M1 (16 GB unified memory), using MLX.
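
The linked DFT repository describes rescaling each token's fine-tuning loss by the probability the model assigns to that token. A minimal sketch of that idea in plain Python, assuming the per-token probabilities are already available, might look like the following. The function names are hypothetical, this is not the author's MLX training code, and in a real training loop the probability weight would be excluded from gradient computation.

```python
import math


def sft_token_losses(token_probs):
    """Standard SFT: negative log-likelihood of each target token."""
    return [-math.log(p) for p in token_probs]


def dft_token_losses(token_probs):
    """DFT-style sketch: scale each token's NLL by that token's probability.

    Assumption: the weight p is treated as a constant (stop-gradient) during
    training, so it only rescales the size of each token's update.
    """
    return [-p * math.log(p) for p in token_probs]


# Hypothetical probabilities the model assigns to three target tokens.
probs = [0.9, 0.5, 0.05]
for p, sft, dft in zip(probs, sft_token_losses(probs), dft_token_losses(probs)):
    # Low-probability tokens contribute far less under the DFT-style weighting.
    print(f"p={p:.2f}  sft={sft:.3f}  dft={dft:.3f}")
```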

### 📚 Datasets Used
- `microsoft/orca-agentinstruct-1M-v1`: agentic tasks, RAG answers, classification
- `microsoft/orca-math-word-problems-200k`: lightweight reasoning
- `allenai/tulu-3-sft-personas-instruction-following`: instruction following
- `xingyaoww/code-act`: ReAct-style reasoning and action
- `m-a-p/Code-Feedback`: alignment via feedback
- `HuggingFaceTB/smoltalk` + `/apigen`: tool calling stabilization
- `weijie210/gsm8k_decomposed`: question decomposition
- `Locutusque/function-calling-chatml`: tool call response structure

---

## ⚠️ Disclaimer

This is a **beta model**.
- It may produce **incorrect** or **incomplete** outputs.
- Tool call execution is **basic** and can fail in some cases.
- Intended for **research and experimentation** only, not production use.

---

## 🧭 Roadmap

- ✅ Initial release with DFT fine-tuning
- 🧪 Benchmarking on agentic tasks
- ~~🔬 Experimenting with GRPO for tool calling (failed)~~
- 🧠 Weight merging experiments for improved performance
- Add more tool-calling datasets

---

## 📥 Model Size

- 135M parameters
- ~135 MB in 8-bit precision
- 8k context length
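
The ~135 MB figure follows from 135M parameters stored at one byte each. A rough sketch of that arithmetic is shown here; `weight_size_mb` is a hypothetical helper that counts only the weights and ignores quantization metadata, activations, and the KV cache, so real memory use will be somewhat higher.

```python
def weight_size_mb(n_params, bits_per_param):
    """Approximate size of the weights alone, in megabytes (1 MB = 10**6 bytes)."""
    return n_params * bits_per_param / 8 / 1e6


N_PARAMS = 135_000_000  # parameter count stated in this README

for bits in (16, 8, 4):
    # 8-bit precision gives 135.0 MB, matching the figure above.
    print(f"{bits}-bit: ~{weight_size_mb(N_PARAMS, bits):.1f} MB")
```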

---

## ⚡ Example Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "quwsarohi/NanoAgent-135M"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

def inference(messages, max_new_tokens=256, temperature=0.3, min_p=0.15, **kwargs):
    # Render the chat messages into the model's prompt format.
    input_text = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer.encode(input_text, return_tensors="pt")
    outputs = model.generate(
        inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        min_p=min_p,  # pass the min_p argument through to sampling
        temperature=temperature,
        **kwargs
    )
    # Decode only the newly generated tokens, skipping the prompt.
    return tokenizer.decode(outputs[0][inputs.shape[1] :], skip_special_tokens=True)

messages = [{"role": "user", "content": "Hi! Do you have a name?"}]
print(inference(messages))
```

Use the following template for tool calling:
```python
TOOL_TEMPLATE = """You are a helpful AI assistant. You have a set of possible functions/tools inside <tools></tools> tags.
Based on question, you may need to make one or more function/tool calls to answer user.

You have access to the following tools/functions:
<tools>{tools}</tools>

For each function call, return a JSON list object with function name and arguments within <tool_call></tool_call> tags."""
```
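
One way to fill in the `{tools}` placeholder is to serialize the tool definitions as JSON and pass the result as a system message. The sketch below assumes that approach; `build_tool_messages` is a hypothetical helper, and the system-message placement and the shortened tool definition are assumptions, so check the linked inference notebook for the exact format used during training.

```python
import json

# Same template text as above, repeated so this sketch runs on its own.
TOOL_TEMPLATE = """You are a helpful AI assistant. You have a set of possible functions/tools inside <tools></tools> tags.
Based on question, you may need to make one or more function/tool calls to answer user.

You have access to the following tools/functions:
<tools>{tools}</tools>

For each function call, return a JSON list object with function name and arguments within <tool_call></tool_call> tags."""


def build_tool_messages(tools, user_query):
    """Hypothetical helper: put the filled-in template into a system message."""
    system_prompt = TOOL_TEMPLATE.format(tools=json.dumps(tools))
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_query},
    ]


# Shortened, illustrative tool definition.
tools = [{"name": "web_search", "description": "Performs a web search for a query."}]
messages = build_tool_messages(tools, "What is MLX?")
print(messages[0]["content"])
```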

Sample tool call definition:
```json
{
  "name": "web_search",
  "description": "Performs a web search for a query and returns a string of the top search results formatted as markdown with titles, links, and descriptions.",
  "parameters": {
    "type": "object",
    "properties": {
      "query": {
        "type": "string",
        "description": "The search query to perform."
      }
    },
    "required": ["query"]
  }
}
```
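
Since the template asks the model to return a JSON list inside `<tool_call></tool_call>` tags, the reply has to be parsed before a tool can be run. A minimal parsing sketch follows; `parse_tool_calls` is a hypothetical helper, and the `name` and `arguments` keys in the sample reply are an assumption about the model's output shape rather than a documented format.

```python
import json
import re

# Non-greedy match so several <tool_call> blocks in one reply are found separately.
TOOL_CALL_RE = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)


def parse_tool_calls(text):
    """Return every tool call found in the model output as a list of dicts."""
    calls = []
    for block in TOOL_CALL_RE.findall(text):
        try:
            parsed = json.loads(block)
        except json.JSONDecodeError:
            continue  # a small model can emit malformed JSON; skip that block
        # Accept either a JSON list of calls or a single call object.
        calls.extend(parsed if isinstance(parsed, list) else [parsed])
    return calls


# Illustrative reply; the key names are assumed.
sample = '<tool_call>[{"name": "web_search", "arguments": {"query": "MLX"}}]</tool_call>'
print(parse_tool_calls(sample))
```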