Initialize project; model provided by the ModelHub XC community

Model: katanemo/Plano-Orchestrator-4B Source: Original Platform
2026-05-21 01:16:17 +08:00
commit 41f77a59f0
14 changed files with 152388 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,125 @@
+---
+license: other
+license_name: katanemo-research
+license_link: >-
+  https://huggingface.co/katanemo/Plano-Orchestrator-4B/blob/main/LICENSE
+base_model:
+- Qwen/Qwen3-4B-Instruct-2507
+language:
+- en
+pipeline_tag: text-generation
+---
+# katanemo/Plano-Orchestrator-4B
+
+## Overview
+
+**Plano-Orchestrator** is a family of state-of-the-art routing and orchestration models that decide which agent(s) or LLM(s) should handle each request, and in what sequence. Built for multi-agent orchestration systems, Plano-Orchestrator excels at analyzing user intent and conversation context to make precise routing and orchestration decisions. Designed for real-world deployments, it delivers strong performance across general conversations, coding tasks, and long-context multi-turn conversations, while remaining efficient enough for low-latency production environments. 
+
+#### Key capabilities
+- **Multi-turn Context Understanding**: Makes routing decisions based on full conversation history, maintaining contextual awareness across extended dialogues with evolving user needs.
+- **Multi-intent Detection**: Identifies when a single user message requires multiple agents simultaneously, enabling parallel/sequential routing to fulfill complex requests.
+- **Context-dependent Routing**: Correctly interprets ambiguous or referential messages by leveraging prior conversation context for accurate routing decisions.
+- **Conversational Flow Handling**: Understands diverse interaction patterns including follow-ups, clarifications, confirmations, and corrections within ongoing conversations.
+- **Negative Case Detection**: Recognizes when no specialized routing is needed, avoiding unnecessary LLM or agent calls for casual conversation.
+
+## Benchmark
+
+We evaluate on **1,958 user messages** across **605 multi-turn conversations** with more than **130 different agents**, covering three scenarios:
+
+- **General** (1,438 messages): Everyday conversational queries spanning diverse topics and agent types
+- **Coding** (285 messages): Development-focused conversations including debugging, code generation, and technical assistance
+- **Long-context** (235 messages): Extended conversations requiring understanding of extensive prior context
+
+Each message is annotated with routing-relevant attributes, including but not limited to intent multiplicity, context dependency, and continuation type. The evaluation
+results are shown below.
+
+<div align="center">
+  <img width="100%" height="auto" src="./assets/Plano-Orchestrator.png">
+</div>
+
+> [!NOTE]
+> All models were evaluated with minimal reasoning so that routing remains efficient.
+
+## Example
+
+```python
+import json
+import torch
+
+from transformers import AutoTokenizer, AutoModelForCausalLM
+
+
+ORCHESTRATION_PROMPT = (
+    "You are a helpful assistant that selects the most suitable routes based on user intent.\n"
+    "You are provided with a list of available routes enclosed within <routes></routes> XML tags:\n"
+    "<routes>\n{routes}\n</routes>\n\n"
+    "You are also given the conversation context enclosed within <conversation></conversation> XML tags:\n"
+    "<conversation>\n{conversation}\n</conversation>\n\n"
+    "## Instructions\n"
+    "1. Analyze the latest user intent from the conversation.\n"
+    "2. Compare it against the available routes to find which routes can help fulfill the request.\n"
+    "3. Respond only with the exact route names from <routes>.\n"
+    "4. If no routes can help or the intent is already fulfilled, return an empty list.\n\n"
+    "## Response Format\n"
+    "Return your answer strictly in JSON as follows:\n"
+    '{{"route": ["route_name_1", "route_name_2", "..."]}}\n'
+    "If no routes are needed, return an empty list for `route`."
+)
+
+def convert_agents_to_routes(agents):
+    tools = [
+        {
+            "name": agent["name"],
+            "description": agent["description"],
+        }
+        for agent in agents
+    ]
+    return "\n".join([json.dumps(tool, ensure_ascii=False) for tool in tools])
+
+def build_messages(available_agents, conversation):
+    routes = convert_agents_to_routes(available_agents)
+    conversation_str = json.dumps(conversation, indent=4, ensure_ascii=False)
+    prompt = ORCHESTRATION_PROMPT.format(routes=routes, conversation=conversation_str)
+    return [{"role": "user", "content": prompt}]
+
+# Load model
+model_name = "katanemo/Plano-Orchestrator-4B"
+tokenizer = AutoTokenizer.from_pretrained(model_name)
+model = AutoModelForCausalLM.from_pretrained(
+    model_name,
+    torch_dtype=torch.float16,
+    device_map="auto"
+)
+
+# Define available agents
+available_agents = [
+    {"name": "WeatherAgent", "description": "Provides weather forecasts and current conditions for any location"},
+    {"name": "CodeAgent", "description": "Generates, debugs, explains, and reviews code in multiple programming languages"}
+]
+
+# Conversation history
+conversation = [
+    {"role": "user", "content": "What's the weather like today?"},
+    {"role": "assistant", "content": "I can help you with that. Could you tell me your location?"},
+    {"role": "user", "content": "San Francisco"},
+]
+
+messages = build_messages(available_agents, conversation)  # Build the prompt messages, then generate
+model_inputs = tokenizer.apply_chat_template(
+    messages, add_generation_prompt=True, return_tensors="pt", return_dict=True
+).to(model.device)
+
+generated_ids = model.generate(**model_inputs, max_new_tokens=32768)
+generated_ids = [
+    output_ids[len(input_ids) :]
+    for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
+]
+
+response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
+print(response)
+# Output: {"route": ["WeatherAgent"]}
+```
+
+## License
+
+The Plano-Orchestrator collection is distributed under the [Katanemo license](https://huggingface.co/katanemo/Plano-Orchestrator-4B/blob/main/LICENSE).
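
The README example prints the raw model response and leaves parsing to the caller. Below is a minimal sketch of how that response might be parsed and checked against the agent list before dispatching. `parse_routes` is a hypothetical helper, not part of the released code. Falling back to an empty list on malformed output and dropping unknown or duplicate names are both assumptions made for illustration.

```python
import json


def parse_routes(response, available_agents):
    """Return route names from a model response, keeping only known agents.

    Hypothetical helper: returns an empty list when the response is not valid
    JSON or does not match the {"route": [...]} shape the prompt asks for.
    """
    known = {agent["name"] for agent in available_agents}
    try:
        payload = json.loads(response.strip())
    except json.JSONDecodeError:
        return []
    if not isinstance(payload, dict):
        return []
    routes = payload.get("route", [])
    if not isinstance(routes, list):
        return []
    # Preserve the model's ordering; drop unknown or repeated names.
    seen = set()
    selected = []
    for name in routes:
        if isinstance(name, str) and name in known and name not in seen:
            seen.add(name)
            selected.append(name)
    return selected


agents = [
    {"name": "WeatherAgent", "description": "Provides weather forecasts"},
    {"name": "CodeAgent", "description": "Generates and reviews code"},
]
print(parse_routes('{"route": ["WeatherAgent"]}', agents))
```

An empty result covers both the "no route needed" case from the prompt and unparseable output. A caller that needs to tell the two apart could raise an exception in the error branches instead.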