初始化项目，由ModelHub XC社区提供模型

Model: intuit/agent-tool-optimizer Source: Original Platform
2026-06-04 12:57:56 +08:00
commit 4b295dcee9
18 changed files with 152520 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,223 @@
+---
+license: apache-2.0
+language:
+- en
+datasets:
+- intuit/tool-optimizer-dataset
+base_model:
+- Qwen/Qwen3-4B-Instruct-2507
+pipeline_tag: text-generation
+library_name: transformers
+tags:
+- agents
+- tool-use
+- sft
+- documentation
+- text-generation
+---
+
+# Agent Tool Optimizer (`intuit/agent-tool-optimizer`)
+
+`intuit/agent-tool-optimizer` is a **supervised fine-tuned (SFT)** model that rewrites **tool / API descriptions** to be more usable by **LLM agents**. Given a tool name, a parameter schema, and a baseline (often human-written) description, the model produces an improved description that helps an agent:
+
+- decide **when to use vs. not use** the tool
+- generate **valid parameters** (required vs optional, constraints, defaults)
+- avoid common mistakes and likely validation failures
+
+This model is trained to work in a **trace-free** setting at inference time (i.e., **no tool execution traces required**).
+
+For the accompanying codebase (inference + training), see: [Agent Tool Interface Optimizer](https://github.com/intuit-ai-research/tool-optimizer).
+
+---
+
+## What problem does this solve?
+
+Tool interfaces (descriptions + parameter schemas) are the “contract” between agents and tools, but are typically written for humans. When descriptions under-specify **required parameters**, omit **constraints**, or fail to explain **tool boundaries**, agent performance can plateau and can degrade as the number of available tools increases.
+
+We study tool interface improvement as a scalable complement to agent fine-tuning, and propose **Trace-Free+**: a curriculum-learning approach that transfers knowledge learned from trace-rich training to trace-free inference for unseen tools.
+
+---
+
+## Paper (arXiv)
+
+This model is released alongside the preprint:
+
+- **Learning to Rewrite Tool Descriptions for Reliable LLM-Agent Tool Use**  
+  Ruocheng Guo, Kaiwen Dong, Xiang Gao, Kamalika Das  
+  arXiv:2602.20426 (2026) — [paper](https://arxiv.org/abs/2602.20426)
+
+### Citation
+
+```bibtex
+@misc{guo2026learningrewritetooldescriptions,
+      title={Learning to Rewrite Tool Descriptions for Reliable LLM-Agent Tool Use}, 
+      author={Ruocheng Guo and Kaiwen Dong and Xiang Gao and Kamalika Das},
+      year={2026},
+      eprint={2602.20426},
+      archivePrefix={arXiv},
+      primaryClass={cs.AI},
+      url={https://arxiv.org/abs/2602.20426}, 
+}
+```
+
+---
+
+## Recommended prompt (trace-free)
+
+This is the **canonical inference prompt** used for trace-free tool description generation (also available as `tool_prompt.txt` in the `tool-optimizer` repo).
+
+```
+You are an API documentation specialist.
+
+Rewrite the API description so an AI agent can:
+1) Decide when to use this API
+2) Generate valid parameters
+
+Inputs:
+- API name: {tool_name}
+- Parameter schema: {parameter_json}
+- Baseline description: {original_description}
+
+Infer (do not output):
+- When to use vs not use this API
+- Required vs optional parameters
+- Parameter meanings and constraints
+- Cross-parameter dependencies or exclusions
+- Common parameter mistakes
+  - no examples are provided, infer from the schema and baseline description only
+
+Write a clear API description that:
+- States when to use and NOT use the API
+- Does not invent or reference non-provided APIs
+- Explains each parameter's meaning, type, required/optional status, constraints, and defaults
+- Describes likely validation failures and how to avoid them
+- Abstracts patterns into general rules
+- Does not restate the full schema verbatim
+- Does not mention whether examples were provided
+
+You may replace the baseline description entirely.
+
+Output ONLY valid JSON (no markdown, no code blocks):
+{{"description": "<your improved API description here>"}}
+```
+
+### Inputs
+
+- **`tool_name`**: the tool/API name
+- **`parameter_json`**: a JSON string describing the parameter schema (treat this as authoritative)
+- **`original_description`**: the baseline description you want to improve
+
+### Output
+
+The model is trained to output **only valid JSON** with a single field:
+
+- **`description`**: the improved tool description (string)
+
+---
+
+## Prompt variation guidance (SFT-sensitive)
+
+Because this model is SFT to follow a specific prompt and output contract, it can be sensitive to prompt changes. The safest strategy is to treat the prompt as a template and apply only **minimal, well-scoped edits**.
+
+### Prompt invariants (do not change)
+
+- Keep the three input slots exactly: `{tool_name}`, `{parameter_json}`, `{original_description}`
+- Keep: **“Output ONLY valid JSON (no markdown, no code blocks)”**
+- Keep the output schema exactly: `{"description": "..."}` (same key name; no extra keys)
+
+### Safe, minimal edits (usually OK)
+
+- Add 1–3 bullets under **“Infer (do not output)”** to clarify what to pay attention to
+- Add constraints under **“Write a clear API description that:”** as additional bullets
+- Add brief reminders about schema authority, parameter-name exactness, or concision
+
+### Risky edits (often break JSON / behavior)
+
+- Reordering or removing the output-format lines
+- Asking for examples, multi-part outputs, markdown, or extra keys
+- Changing placeholder names or introducing new “inputs” not present during training
+
+### Concrete example: minimal diff that still tends to work
+
+The prompt below is a conservative variation. It adds clarifications without changing the core structure or output contract:
+
+```diff
+ Infer (do not output):
+ - Preserve key lexical tokens from the baseline description that may match user queries
+ - Clarify boundaries if this API could be confused with similar tools
+
+ Write a clear API description that:
+ - Treats the parameter schema as authoritative and does not introduce fields, types, or requirements not defined in it
+ - Explains each parameter's meaning ... while keeping parameter names exactly as defined in the schema
+ - Lists REQUIRED parameters before optional ones
+ - Uses enumerated or candidate values exactly as defined in the schema when applicable
+ - Describes likely validation failures strictly based on schema-defined constraints ...
+ - Keeps the description concise and avoids unnecessary verbosity
+```
+
+---
+
+## Inference
+
+### Option A: Use the `tool-optimizer` library (recommended)
+
+The open-source repo includes a working CLI that runs this model with either **vLLM** or **Hugging Face Transformers**:
+
+```bash
+git clone https://github.com/intuit-ai-research/tool-optimizer
+cd tool-optimizer
+
+# Install (one option)
+python -m pip install -e .
+
+# Run inference (vLLM default)
+python src/agent_tool_optimizer/inference_main.py \
+  --model_name intuit/agent-tool-optimizer \
+  --dataset_id intuit/tool-optimizer-dataset
+```
+
+Notes:
+
+- `--inference_engine vllm` (default) or `--inference_engine hf`
+- The dataset is expected to have a `test` split with a `prompt` field.
+
+### Option B: Transformers (direct)
+
+```python
+import json
+from transformers import pipeline
+import torch
+
+model_id = "intuit/agent-tool-optimizer"
+gen = pipeline(
+    "text-generation",
+    model=model_id,
+    torch_dtype=torch.bfloat16,
+    device_map="auto",
+    trust_remote_code=True,
+)
+
+prompt = """<prompt above>"""
+
+out = gen(
+    [{\"role\": \"user\", \"content\": prompt}],
+    max_new_tokens=512,
+    do_sample=True,
+    temperature=0.6,
+    top_p=0.95,
+    top_k=40,
+    return_full_text=False,
+)
+result = out[0][\"generated_text\"]
+print(result)
+
+# Optional: validate JSON
+json.loads(result)
+```
+
+---
+
+## Example (Before vs After)
+
+![Screenshot 2026-02-20 at 5.23.36 PM](https://cdn-uploads.huggingface.co/production/uploads/65dcb410bda21d181b38321b/dFj0XgXancXD51iyGxC83.png)
+