---
language:
- en
license: apache-2.0
tags:
- llm
- tool-calling
- lightweight
- agentic-tasks
- react
- mlx
model-index:
- name: NanoAgent
  results: []
datasets:
- microsoft/orca-agentinstruct-1M-v1
- microsoft/orca-math-word-problems-200k
- allenai/tulu-3-sft-personas-instruction-following
- xingyaoww/code-act
- m-a-p/Code-Feedback
- weijie210/gsm8k_decomposed
- Locutusque/function-calling-chatml
- HuggingFaceTB/smoltalk
base_model:
- HuggingFaceTB/SmolLM2-135M-Instruct
pipeline_tag: text-generation
---

# POC

# FORKED FROM

# 🧠 NanoAgent: 135M Parameter Agentic LLM

NanoAgent is a compact 135M-parameter language model with an 8k context length, trained to **perform tool calls** and **generate responses based on tool outputs**.
Despite its small size (~135 MB in 8-bit precision), it is optimized for agentic use cases and runs easily on personal devices.

**GitHub:** [NanoAgent](https://github.com/QuwsarOhi/NanoAgent)

**Inference resource:** [inference notebook](https://github.com/QuwsarOhi/NanoAgent/blob/main/notebooks/inference.ipynb)

---

## ✨ Features

- 🧰 **Tool Calling**: issues structured tool calls and responds using the tool outputs.
- 🧭 **Instruction Following**: strong instruction-following abilities.
- 🧠 **Basic Reasoning**: handles lightweight reasoning and ReAct-style interactions.
- ⚡ **Lightweight**: runs on local hardware with minimal resources.

---

## 🧪 Training Overview

- **Base model:** [`SmolLM2-135M-Instruct`](https://huggingface.co/HuggingFaceTB/SmolLM2-135M-Instruct)
- **Fine-tuning method:** [Dynamic Fine-Tuning (DFT)](https://github.com/yongliang-wu/DFT/tree/master)
- **Hardware:** Apple Mac M1 (16 GB unified memory) using MLX.

### 📚 Datasets Used

- `microsoft/orca-agentinstruct-1M-v1`: agentic tasks, RAG answers, classification
- `microsoft/orca-math-word-problems-200k`: lightweight reasoning
- `allenai/tulu-3-sft-personas-instruction-following`: instruction following
- `xingyaoww/code-act`: ReAct-style reasoning and action
- `m-a-p/Code-Feedback`: alignment via feedback
- `HuggingFaceTB/smoltalk` + `/apigen`: tool-calling stabilization
- `weijie210/gsm8k_decomposed`: question decomposition
- `Locutusque/function-calling-chatml`: tool-call response structure

---

## ⚠️ Disclaimer

This is a **beta model**.

- It may produce **incorrect** or **incomplete** outputs.
- Tool call execution is **basic** and can fail in some cases.
- Intended for **research and experimentation** only, not production use.

---

## 🧭 Roadmap

- ✅ Initial release with DFT fine-tuning
- 🧪 Benchmarking on agentic tasks
- ~~🔬 Experimenting with GRPO for tool calling (failed)~~
- 🧠 Weight merging experiments for improved performance
- Add more tool-calling datasets

---

## 📥 Model Size

- 135M parameters
- ~135 MB in 8-bit precision
- 8k context length
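
The 8-bit figure in the list above follows from one byte per parameter. The sketch below repeats that arithmetic for a few other precisions. It is an estimate only: it counts weights alone, uses decimal megabytes, and ignores activations, the KV cache, and any quantization metadata. The helper name `weight_size_mb` is illustrative and not part of the NanoAgent code.

```python
# Rough weight-only size estimate: parameters x bytes per parameter.
N_PARAMS = 135_000_000  # parameter count stated in this model card


def weight_size_mb(n_params: int, bits_per_param: int) -> float:
    """Approximate weight size in decimal megabytes (1 MB = 10**6 bytes)."""
    return n_params * bits_per_param / 8 / 1_000_000


# Print the estimate for a few common precisions.
for bits in (32, 16, 8, 4):
    print(f"{bits:>2}-bit: ~{weight_size_mb(N_PARAMS, bits):.1f} MB")
```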

---

## ⚡ Example Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "quwsarohi/NanoAgent-135M"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)


def inference(messages, max_new_tokens=256, temperature=0.3, min_p=0.15, **kwargs):
    # Render the chat history into the model's prompt format.
    input_text = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer.encode(input_text, return_tensors="pt")
    outputs = model.generate(
        inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        min_p=min_p,  # forwarded from the function argument
        temperature=temperature,
        **kwargs,
    )
    # Decode only the newly generated tokens, skipping the prompt.
    return tokenizer.decode(outputs[0][inputs.shape[1] :], skip_special_tokens=True)


messages = [{"role": "user", "content": "Hi! Do you have a name?"}]
print(inference(messages))
```

Use the following template for tool calling:

```python
TOOL_TEMPLATE = """You are a helpful AI assistant. You have a set of possible functions/tools inside <tools></tools> tags.
Based on question, you may need to make one or more function/tool calls to answer user.

You have access to the following tools/functions:
<tools>{tools}</tools>

For each function call, return a JSON list object with function name and arguments within <tool_call></tool_call> tags."""
```
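
One way to fill in the template is sketched below. Two things here are assumptions, not documented behavior: that the tools are serialized into `{tools}` as a JSON list, and that the filled template is sent as the system message. The helper `build_messages` is a hypothetical name, and the `web_search` schema is this card's sample tool definition. The inference notebook linked above is the reference for the exact format.

```python
import json

# Template copied verbatim from this model card.
TOOL_TEMPLATE = """You are a helpful AI assistant. You have a set of possible functions/tools inside <tools></tools> tags.
Based on question, you may need to make one or more function/tool calls to answer user.

You have access to the following tools/functions:
<tools>{tools}</tools>

For each function call, return a JSON list object with function name and arguments within <tool_call></tool_call> tags."""

# The sample tool definition from this card, written as a Python dict.
web_search_tool = {
    "name": "web_search",
    "description": "Performs a web search for a query and returns a string of the top search results formatted as markdown with titles, links, and descriptions.",
    "parameters": {
        "type": "object",
        "properties": {
            "query": {"type": "string", "description": "The search query to perform."}
        },
        "required": ["query"],
    },
}


def build_messages(user_query, tools):
    """Hypothetical helper: put the filled template in the system turn (assumption)."""
    system_prompt = TOOL_TEMPLATE.format(tools=json.dumps(tools))
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_query},
    ]


messages = build_messages("Find recent news about the MLX framework.", [web_search_tool])
print(messages[0]["content"])
```
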
Sample tool definition:

```json
{
  "name": "web_search",
  "description": "Performs a web search for a query and returns a string of the top search results formatted as markdown with titles, links, and descriptions.",
  "parameters": {
    "type": "object",
    "properties": {
      "query": {
        "type": "string",
        "description": "The search query to perform."
      }
    },
    "required": ["query"]
  }
}
```
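
The template asks the model to return a JSON list inside `<tool_call></tool_call>` tags, so the application has to extract and run those calls itself. The sketch below shows one possible way to do that. The `name` and `arguments` keys are an assumption based on the template wording, and `parse_tool_calls`, `run_tool_calls`, the sample output string, and the stub `web_search` function are all hypothetical.

```python
import json
import re

# Matches the payload between <tool_call> and </tool_call>, across newlines.
TOOL_CALL_RE = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)


def parse_tool_calls(text):
    """Return the tool calls found in the model output as a list of dicts."""
    calls = []
    for payload in TOOL_CALL_RE.findall(text):
        try:
            parsed = json.loads(payload)
        except json.JSONDecodeError:
            continue  # malformed JSON is skipped rather than raised
        # Accept a single object as well as the list the template asks for.
        items = parsed if isinstance(parsed, list) else [parsed]
        calls.extend(item for item in items if isinstance(item, dict))
    return calls


def run_tool_calls(calls, registry):
    """Look up each call in `registry` and run it with its keyword arguments."""
    results = []
    for call in calls:
        name = call.get("name")
        args = call.get("arguments") or {}
        func = registry.get(name) if isinstance(name, str) else None
        if func is None or not isinstance(args, dict):
            results.append({"name": name, "error": "unknown tool or bad arguments"})
            continue
        results.append({"name": name, "result": func(**args)})
    return results


# Hypothetical model output and a stub tool, for illustration only.
sample_output = '<tool_call>[{"name": "web_search", "arguments": {"query": "MLX framework"}}]</tool_call>'
registry = {"web_search": lambda query: f"(stub) results for {query!r}"}

calls = parse_tool_calls(sample_output)
print(run_tool_calls(calls, registry))
```

How the tool result is passed back to the model (role name and message format) is not specified in this card, so check the inference notebook before wiring this into a full loop.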