初始化项目，由ModelHub XC社区提供模型

Model: watt-ai/watt-tool-8B Source: Original Platform
2026-05-21 10:52:45 +08:00
commit ba4077a005
11 changed files with 412853 additions and 0 deletions
--- a/.gitattributes
+++ b/.gitattributes
@@ -0,0 +1,35 @@
 *.7z filter=lfs diff=lfs merge=lfs -text
 *.arrow filter=lfs diff=lfs merge=lfs -text
 *.bin filter=lfs diff=lfs merge=lfs -text
 *.bz2 filter=lfs diff=lfs merge=lfs -text
 *.ckpt filter=lfs diff=lfs merge=lfs -text
 *.ftz filter=lfs diff=lfs merge=lfs -text
 *.gz filter=lfs diff=lfs merge=lfs -text
 *.h5 filter=lfs diff=lfs merge=lfs -text
 *.joblib filter=lfs diff=lfs merge=lfs -text
 *.lfs.* filter=lfs diff=lfs merge=lfs -text
 *.mlmodel filter=lfs diff=lfs merge=lfs -text
 *.model filter=lfs diff=lfs merge=lfs -text
 *.msgpack filter=lfs diff=lfs merge=lfs -text
 *.npy filter=lfs diff=lfs merge=lfs -text
 *.npz filter=lfs diff=lfs merge=lfs -text
 *.onnx filter=lfs diff=lfs merge=lfs -text
 *.ot filter=lfs diff=lfs merge=lfs -text
 *.parquet filter=lfs diff=lfs merge=lfs -text
 *.pb filter=lfs diff=lfs merge=lfs -text
 *.pickle filter=lfs diff=lfs merge=lfs -text
 *.pkl filter=lfs diff=lfs merge=lfs -text
 *.pt filter=lfs diff=lfs merge=lfs -text
 *.pth filter=lfs diff=lfs merge=lfs -text
 *.rar filter=lfs diff=lfs merge=lfs -text
 *.safetensors filter=lfs diff=lfs merge=lfs -text
 saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.tar.* filter=lfs diff=lfs merge=lfs -text
 *.tar filter=lfs diff=lfs merge=lfs -text
 *.tflite filter=lfs diff=lfs merge=lfs -text
 *.tgz filter=lfs diff=lfs merge=lfs -text
 *.wasm filter=lfs diff=lfs merge=lfs -text
 *.xz filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
--- a/README.md
+++ b/README.md
@@ -0,0 +1,121 @@
 ---
 license: apache-2.0
 language:
 - en
 base_model:
 - meta-llama/Llama-3.1-8B-Instruct
 tags:
 - function-calling
 - tool-use
 - llama
 - bfcl
 ---
 # watt-tool-8B
 watt-tool-8B is a fine-tuned language model based on LLaMa-3.1-8B-Instruct, optimized for tool usage and multi-turn dialogue. It achieves state-of-the-art performance on the Berkeley Function-Calling Leaderboard (BFCL).
 ## Model Description
 This model is specifically designed to excel at complex tool usage scenarios that require multi-turn interactions, making it ideal for empowering platforms like [Lupan](https://lupan.watt.chat), an AI-powered workflow building tool. By leveraging a carefully curated and optimized dataset, watt-tool-8B demonstrates superior capabilities in understanding user requests, selecting appropriate tools, and effectively utilizing them across multiple turns of conversation.
 Target Application: AI Workflow Building as in [https://lupan.watt.chat/](https://lupan.watt.chat/) and [Coze](https://www.coze.com/).
 ## Key Features
 *   **Enhanced Tool Usage:** Fine-tuned for precise and efficient tool selection and execution.
 *   **Multi-Turn Dialogue:** Optimized for maintaining context and effectively utilizing tools across multiple turns of conversation, enabling more complex task completion.
 *   **State-of-the-Art Performance:** Achieves top performance on the BFCL, demonstrating its capabilities in function calling and tool usage.
 ## Training Methodology
 watt-tool-8B is trained using supervised fine-tuning on a specialized dataset designed for tool usage and multi-turn dialogue. We use CoT techniques to synthesize high-quality multi-turn dialogue data.
 The training process is inspired by the principles outlined in the paper: ["Direct Multi-Turn Preference Optimization for Language Agents"](https://arxiv.org/abs/2406.14868).
 We use SFT and DMPO to further enhance the model's performance in multi-turn agent tasks.
 ## How to Use
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
 model_id = "watt-ai/watt-tool-8B"
 tokenizer = AutoTokenizer.from_pretrained(model_id)
 model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype='auto', device_map="auto")
 # Example usage (adapt as needed for your specific tool usage scenario)
 """You are an expert in composing functions. You are given a question and a set of possible functions. Based on the question, you will need to make one or more function/tool calls to achieve the purpose.
 If none of the function can be used, point it out. If the given question lacks the parameters required by the function, also point it out.
 You should only return the function call in tools call sections.
 If you decide to invoke any of the function(s), you MUST put it in the format of [func_name1(params_name1=params_value1, params_name2=params_value2...), func_name2(params)]
 You SHOULD NOT include any other text in the response.
 Here is a list of functions in JSON format that you can invoke.\n{functions}\n
 """
 # User query
 query = "Find me the sales growth rate for company XYZ for the last 3 years and also the interest coverage ratio for the same duration."
 tools = [
    {
        "name": "financial_ratios.interest_coverage", "description": "Calculate a company's interest coverage ratio given the company name and duration",
        "arguments": {
            "type": "dict",
            "properties": {
                "company_name": {
                    "type": "string",
                    "description": "The name of the company."
                }, 
                "years": {
                    "type": "integer",
                    "description": "Number of past years to calculate the ratio."
                }
            }, 
            "required": ["company_name", "years"]
        }
    },
    {
        "name": "sales_growth.calculate",
        "description": "Calculate a company's sales growth rate given the company name and duration",
        "arguments": {
            "type": "dict", 
            "properties": {
                "company": {
                    "type": "string",
                    "description": "The company that you want to get the sales growth rate for."
                }, 
                "years": {
                    "type": "integer",
                    "description": "Number of past years for which to calculate the sales growth rate."
                }
            }, 
            "required": ["company", "years"]
        }
    },
    {
        "name": "weather_forecast",
        "description": "Retrieve a weather forecast for a specific location and time frame.",
        "arguments": {
            "type": "dict",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "The city that you want to get the weather for."
                }, 
                "days": {
                    "type": "integer",
                    "description": "Number of days for the forecast."
                }
            },
            "required": ["location", "days"]
        }
    }
 ]
 messages = [
    {'role': 'system', 'content': system_prompt.format(functions=tools)},
    {'role': 'user', 'content': query}
 ]
 inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
 outputs = model.generate(inputs, max_new_tokens=512, do_sample=False, num_return_sequences=1, eos_token_id=tokenizer.eos_token_id)
 print(tokenizer.decode(outputs[0][len(inputs[0]):], skip_special_tokens=True))
--- a/config.json
+++ b/config.json
@@ -0,0 +1,39 @@
 {
  "architectures": [
    "LlamaForCausalLM"
  ],
  "attention_bias": false,
  "attention_dropout": 0.0,
  "bos_token_id": 128000,
  "eos_token_id": [
    128001,
    128008,
    128009
  ],
  "head_dim": 128,
  "hidden_act": "silu",
  "hidden_size": 4096,
  "initializer_range": 0.02,
  "intermediate_size": 14336,
  "max_position_embeddings": 131072,
  "mlp_bias": false,
  "model_type": "llama",
  "num_attention_heads": 32,
  "num_hidden_layers": 32,
  "num_key_value_heads": 8,
  "pretraining_tp": 1,
  "rms_norm_eps": 1e-05,
  "rope_scaling": {
    "factor": 8.0,
    "high_freq_factor": 4.0,
    "low_freq_factor": 1.0,
    "original_max_position_embeddings": 8192,
    "rope_type": "llama3"
  },
  "rope_theta": 500000.0,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.45.2",
  "use_cache": true,
  "vocab_size": 128256
 }
--- a/model-00001-of-00004.safetensors
+++ b/model-00001-of-00004.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:0307d16f5db00de594f14ce4b9afddbe7f2ea0fd9b5be8bbe1bfff06ed2d3267
 size 4953586384
--- a/model-00002-of-00004.safetensors
+++ b/model-00002-of-00004.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:4de644f0ac53b290ea639895637c18247e61c551ecdc1c4d0fdc9258726d8e2a
 size 4999819336
--- a/model-00003-of-00004.safetensors
+++ b/model-00003-of-00004.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:c9c41563f27545d9bead7b1bc583222a9645ef4f1ac7b1b2238c24508c74a234
 size 4915916144
--- a/model-00004-of-00004.safetensors
+++ b/model-00004-of-00004.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:520deaaf56c628695b6f9be942d50402d584218bed3eeadfbba4a12764c347e5
 size 1191234472
--- a/model.safetensors.index.json
+++ b/model.safetensors.index.json
--- a/special_tokens_map.json
+++ b/special_tokens_map.json
@@ -0,0 +1,17 @@
 {
  "bos_token": {
    "content": "<|begin_of_text|>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "eos_token": {
    "content": "<|eot_id|>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": "<|eot_id|>"
 }
--- a/tokenizer.json
+++ b/tokenizer.json
--- a/tokenizer_config.json
+++ b/tokenizer_config.json