Initialize project; model provided by the ModelHub XC community

Model: quotientai/limbic-tool-use-0.5B-32K Source: Original Platform
2026-06-04 16:47:12 +08:00
commit d679c9b775
13 changed files with 612 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,160 @@
+---
+base_model: Qwen/Qwen2.5-0.5B-Instruct
+tags:
+- text-generation-inference
+- transformers
+- unsloth
+- qwen2
+license: apache-2.0
+language:
+- en
+datasets:
+- quotientai/limbic-eval-tool-use-mcp
+---
+
+# Limbic-Tool-Use MCP Function Call Evaluator
+
+This model is a fine-tuned version of Qwen2.5-0.5B-Instruct specifically designed for evaluating function calls in the context of Model Context Protocol (MCP) tools. It can assess whether a function call is correct, uses the wrong tool, has incorrect parameter names, or has incorrect parameter values.
+
+## Model Details
+
+- **Base Model**: [Qwen/Qwen2.5-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct)
+- **Fine-tuning Method**: LoRA (Low-Rank Adaptation)
+- **Task**: Function Call Evaluation for MCP (Model Context Protocol)
+- **Training Data**: Tool data from public MCP servers, extended with augmentation and synthetic data generation
+- **Model Size**: ~40MB (LoRA adapters only)
+- **Context Length**: 32,768 tokens
+
+# Model Usage
+
+## Model Prompts
+
+The prompt for the model takes two inputs:
+- `available_tools` - a list of the tool schemas
+- `message_history` - the user request and the model's tool-call response, as a list of JSON objects
+
+```python
+EVALUATOR_PROMPT = """\
+# TOOL CALL EVALUATION RUBRIC
+
+## EVALUATION CRITERIA
+
+### 1. TOOL SELECTION
+- [ ] Function name exists in available tools
+- [ ] Function purpose matches user intent
+
+### 2. PARAMETER STRUCTURE  
+- [ ] All required and relevant parameters are present
+- [ ] No hallucinated parameter names
+- [ ] Parameter names match tool schema exactly
+
+### 3. PARAMETER VALUES
+- [ ] Data types match expected types
+- [ ] Values align with user request
+- [ ] No fabricated or incorrect values
+
+## CLASSIFICATION RULES
+- All criteria passed → `correct`
+- Failed criteria 1 → `incorrect_tool`
+- Failed criteria 2 → `incorrect_parameter_names`  
+- Failed criteria 3 → `incorrect_parameter_values`
+
+---
+### AVAILABLE TOOLS
+{available_tools}
+
+---
+### MESSAGE HISTORY
+{message_history}
+
+---
+## OUTPUT REQUIREMENT
+{{
+    "score": < correct | incorrect_tool | incorrect_parameter_names | incorrect_parameter_values >,
+    "reason": < [if incorrect, provide a brief list of reasons] >
+}}
+
+### EVALUATION:
+"""
+```
+```python
+SYSTEM_PROMPT = "You are an expert evaluator of function calls. You will be given a function call and a list of available tools. You will need to evaluate the function call and return a score and a reason for the score."
+```
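
The structural parts of the rubric (tool name exists, parameter names match the schema, value types match) can be illustrated with a small deterministic check. This is only a sketch for intuition, not how the fine-tuned model works: `rule_based_score` and `JSON_TYPES` are hypothetical names, the call is assumed to be a dict with `name` and `arguments` keys, the order in which criteria are tested is an assumption, and the intent-related criteria (purpose matches user intent, values align with the request) are not covered at all.

```python
# Hypothetical helper: mirrors only the schema-level checks of the rubric.
JSON_TYPES = {
    "string": str,
    "number": (int, float),
    "integer": int,
    "boolean": bool,
    "object": dict,
    "array": list,
}


def rule_based_score(tools, call):
    """Return a rubric label using schema checks alone (no user-intent reasoning)."""
    schemas = {tool["name"]: tool["input_schema"] for tool in tools}
    # Criterion 1: the function name must exist in the available tools.
    if call["name"] not in schemas:
        return "incorrect_tool"
    schema = schemas[call["name"]]
    properties = schema.get("properties", {})
    args = call["arguments"]
    # Criterion 2: no unknown parameter names, and every required one is present.
    unknown = set(args) - set(properties)
    missing = set(schema.get("required", [])) - set(args)
    if unknown or missing:
        return "incorrect_parameter_names"
    # Criterion 3 (types only): each value must match its declared JSON type.
    for name, value in args.items():
        declared = properties[name].get("type")
        expected = JSON_TYPES.get(declared)
        if expected is None:
            continue  # no recognised type declared, nothing to check
        # bool is a subclass of int in Python, so reject it for non-boolean types.
        if isinstance(value, bool) and declared != "boolean":
            return "incorrect_parameter_values"
        if not isinstance(value, expected):
            return "incorrect_parameter_values"
    return "correct"


tools = [{
    "name": "google-play-developer",
    "input_schema": {
        "type": "object",
        "properties": {"devId": {"type": "string"}, "num": {"type": "number"}},
        "required": ["devId"],
    },
}]
call = {"name": "google-play-developer",
        "arguments": {"devId": "com.example.developer", "num": 50}}
print(rule_based_score(tools, call))
```

A check like this could serve as a cheap sanity baseline next to the model's output; it cannot judge whether a well-formed call actually answers the user's request.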
+
+### Example Inputs
+```python
+available_tools = [
+    {
+        "name": "google-play-developer",
+        "description": "Get apps by a developer on Google Play",
+        "input_schema": {
+            "type": "object",
+            "properties": {
+                "devId": {"type": "string", "description": "Developer ID"},
+                "num": {"type": "number", "default": 60, "description": "Number of results"},
+                "lang": {"type": "string", "default": "en", "description": "Language code"},
+                "country": {"type": "string", "default": "us", "description": "Country code"}
+            },
+            "required": ["devId"]
+        }
+    }
+]
+
+message_history = [
+    {"role": "user", "content": "I'm looking to evaluate the performance of all the apps developed by 'Example Developer' on the Google Play Store. Could you provide me with a list of their recent applications, specifically in English and focused on the US market? Please limit the results to 50 apps for a quicker review."},
+    {"role": "assistant", "content": {"function": {"name": "google-play-developer", "arguments": {"devId": "com.example.developer", "num": 50, "lang": "en", "country": "us"}}}}
+]
+```
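
The doubled braces (`{{` and `}}`) in the evaluator prompt suggest it is meant to be filled with `str.format`. A minimal sketch of that step follows. How the two inputs should be serialized is not stated above, so dumping them as indented JSON is an assumption, and `PROMPT_TEMPLATE` and `build_user_prompt` are hypothetical names; the template here is a shortened stand-in for the full `EVALUATOR_PROMPT`.

```python
import json

# Shortened stand-in for EVALUATOR_PROMPT; the full rubric text is omitted here.
PROMPT_TEMPLATE = """\
### AVAILABLE TOOLS
{available_tools}

### MESSAGE HISTORY
{message_history}

## OUTPUT REQUIREMENT
{{
    "score": <label>,
    "reason": <list of reasons>
}}
"""


def build_user_prompt(template, available_tools, message_history):
    """Fill both placeholders with JSON text (the serialization is an assumption)."""
    return template.format(
        available_tools=json.dumps(available_tools, indent=2),
        message_history=json.dumps(message_history, indent=2),
    )


available_tools = [{"name": "google-play-developer", "input_schema": {"type": "object"}}]
message_history = [{"role": "user", "content": "List apps by Example Developer."}]

user_prompt = build_user_prompt(PROMPT_TEMPLATE, available_tools, message_history)
print(user_prompt)
```

The resulting string is what would go in the `content` of the user message. Braces inside the substituted JSON are left untouched, because `str.format` only interprets braces in the template itself.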
+
+## Output Format
+The model outputs evaluations in JSON format:
+
+```json
+{
+    "score": "correct|incorrect_tool|incorrect_parameter_names|incorrect_parameter_values",
+    "reason": ["reasons for failure if incorrect"]
+}
+```
+
+### Score Categories
+
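+- **correct**: The call passes every rubric criterion: the right tool, valid parameter names, and appropriate values
+- **incorrect_tool**: The function name doesn't exist in the available tools, or the tool's purpose doesn't match the user's intent
+- **incorrect_parameter_names**: The tool is right, but a required parameter is missing or a parameter name doesn't match the tool schema
+- **incorrect_parameter_values**: The tool and parameter names are right, but a value has the wrong type, doesn't align with the user request, or is fabricated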
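
Since the output is JSON with a fixed set of score labels, the decoded text can be parsed and validated before it is used downstream. The sketch below assumes the decoded output may carry stray text around the JSON object; `parse_evaluation` and `VALID_SCORES` are hypothetical names, and normalising `reason` to a list is a design choice of this example rather than documented model behaviour.

```python
import json

VALID_SCORES = {
    "correct",
    "incorrect_tool",
    "incorrect_parameter_names",
    "incorrect_parameter_values",
}


def parse_evaluation(raw_text):
    """Extract the first JSON object from the decoded text and validate its score."""
    start = raw_text.find("{")
    if start == -1:
        raise ValueError("no JSON object found in model output")
    # raw_decode parses one JSON value and ignores any trailing text.
    evaluation, _ = json.JSONDecoder().raw_decode(raw_text[start:])
    if evaluation.get("score") not in VALID_SCORES:
        raise ValueError(f"unexpected score: {evaluation.get('score')!r}")
    # Normalise a missing or string-valued reason to a list.
    reason = evaluation.get("reason") or []
    evaluation["reason"] = [reason] if isinstance(reason, str) else list(reason)
    return evaluation


sample = 'Result: {"score": "incorrect_tool", "reason": ["tool not in list"]} done'
print(parse_evaluation(sample))
```

Malformed JSON raises `json.JSONDecodeError`, which is a subclass of `ValueError`, so a single `except ValueError` covers both failure modes.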
+
+
+## Load the Model
+```python
+from transformers import AutoTokenizer, AutoModelForCausalLM
+
+tokenizer = AutoTokenizer.from_pretrained("quotientai/limbic-tool-use-0.5B-32K")
+model = AutoModelForCausalLM.from_pretrained("quotientai/limbic-tool-use-0.5B-32K")
+```
+
+## Generate a Prediction
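+To make a prediction, wrap the system prompt and your formatted evaluator prompt in a list of chat messages, then apply the tokenizer's chat template.
+```python
+messages = [
+  {"role": "system", "content": SYSTEM_PROMPT},
+  {"role": "user", "content": "<your-formatted-user-prompt>"}
+]
+# Apply the chat template
+text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+
+# Tokenize with truncation and move the tensors to the model's device
+inputs = tokenizer(text, return_tensors="pt", truncation=True).to(model.device)
+
+# Generate your prediction
+result = model.generate(**inputs, max_new_tokens=128, use_cache=True)
+
+# Decode only the newly generated tokens (skip the prompt tokens)
+new_tokens = result[0][inputs["input_ids"].shape[1]:]
+print(tokenizer.decode(new_tokens, skip_special_tokens=True))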
+```
+
+## Citation
+```bibtex
+@misc{limbic-tool-use-0.5B-32K,
+  title={Limbic Tool Use Evaluator},
+  author={QuotientAI},
+  year={2025},
+  url={https://huggingface.co/quotientai/limbic-tool-use-0.5B-32K}
+}
+```