Initialize project; model provided by the ModelHub XC community
Model: MadeAgents/Hammer-1.5b  Source: Original Platform
.gitattributes (vendored, Normal file, 35 lines)
@@ -0,0 +1,35 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
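These `.gitattributes` patterns route large files through Git LFS. As an illustration only (not part of the repository), Python's `fnmatch` can approximate which file names the patterns catch; note that `fnmatch` globbing is only an approximation of git's pattern semantics (git's `**` is not identical to `*`):

```python
from fnmatch import fnmatch

# A representative subset of the LFS-tracked patterns listed above
lfs_patterns = ["*.safetensors", "*.bin", "*.gz", "saved_model/**/*", "*tfevents*"]

def is_lfs_tracked(filename: str) -> bool:
    """Return True if the file name matches any LFS-tracked pattern."""
    return any(fnmatch(filename, p) for p in lfs_patterns)

print(is_lfs_tracked("model.safetensors"))       # True
print(is_lfs_tracked("README.md"))               # False
print(is_lfs_tracked("events.out.tfevents.123")) # True
```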
README.md (Normal file, 159 lines)
@@ -0,0 +1,159 @@
---
license: cc-by-4.0
datasets:
- Salesforce/xlam-function-calling-60k
- MadeAgents/xlam-irrelevance-7.5k
base_model: Qwen/Qwen2-1.5B-Instruct
---

# Hammer-1.5b Function Calling Model

## <font color=red>\[Updates!!!\]</font> The Hammer 2.0 Series Has Been Published

We're excited to release the lightweight Hammer 2.0 models ([0.5B](https://huggingface.co/MadeAgents/Hammer2.0-0.5b), [1.5B](https://huggingface.co/MadeAgents/Hammer2.0-1.5b), [3B](https://huggingface.co/MadeAgents/Hammer2.0-3b), and [7B](https://huggingface.co/MadeAgents/Hammer2.0-7b)) with strong function-calling capability, which empower developers to build personalized, on-device agentic applications.

## Introduction

**Hammer** is a series of cutting-edge Large Language Models (LLMs) crafted to boost a critical capability of AI agents: function calling. Unlike existing models that focus on refining training data, Hammer optimizes performance primarily through advanced training techniques. Targeting on-device applications, we release models ranging from [1.5B](https://huggingface.co/MadeAgents/Hammer-1.5b) and [4B](https://huggingface.co/MadeAgents/Hammer-4b) to [7B](https://huggingface.co/MadeAgents/Hammer-7b) parameters.

## Model Details

Hammer is finetuned from the [Qwen 2.0 series](https://huggingface.co/collections/Qwen/qwen2-6659360b33528ced941e557f) using a function-masking technique. It is trained on the [APIGen Function Calling Datasets](https://huggingface.co/datasets/Salesforce/xlam-function-calling-60k), which contain 60,000 samples, supplemented by [xlam-irrelevance-7.5k](https://huggingface.co/datasets/MadeAgents/xlam-irrelevance-7.5k), which we generated. Hammer achieves exceptional performance across numerous function-calling benchmarks. For more details, please refer to [Hammer: Robust Function-Calling for On-Device Language Models via Function Masking](https://arxiv.org/abs/2410.04587) and the [Hammer GitHub repository](https://github.com/MadeAgents/Hammer).

## Evaluation

First, we evaluate the Hammer series on the Berkeley Function-Calling Leaderboard (BFCL-v2):

<div style="text-align: center;">
<img src="figures/bfcl.PNG" alt="overview" width="1480" style="margin: auto;">
</div>

The table above indicates that within the BFCL framework, the Hammer series consistently achieves state-of-the-art performance at comparable scales; in particular, Hammer-7B's overall performance ranks second only to the proprietary GPT-4.

In addition, we evaluated the Hammer series (1.5b, 4b, 7b) on other academic benchmarks to further demonstrate the models' generalization ability:

<div style="text-align: center;">
<img src="figures/others.PNG" alt="overview" width="1000" style="margin: auto;">
</div>

The Hammer models show highly stable performance, suggesting the robustness of the series, whereas the baseline approaches display varying levels of effectiveness.

## Requirements

The code for Hammer is supported in the latest Hugging Face Transformers, and we advise you to install `transformers>=4.37.0`.

## How to Use

Here is a simple example of how to use our model.
~~~python
import json
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "MadeAgents/Hammer-1.5b"
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", torch_dtype="auto", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Please use our provided instruction prompt for best performance
TASK_INSTRUCTION = """You are a tool calling assistant. In order to complete the user's request, you need to select one or more appropriate tools from the following tools and fill in the correct values for the tool parameters. Your specific tasks are:
1. Make one or more function/tool calls to meet the request based on the question.
2. If none of the function can be used, point it out and refuse to answer.
3. If the given question lacks the parameters required by the function, also point it out.
"""

FORMAT_INSTRUCTION = """
The output MUST strictly adhere to the following JSON format, and NO other text MUST be included.
The example format is as follows. Please make sure the parameter type is correct. If no function call is needed, please directly output an empty list '[]'
```
[
    {"name": "func_name1", "arguments": {"argument1": "value1", "argument2": "value2"}},
    ... (more tool calls as required)
]
```
"""

# Define the input query and available tools
query = "Where can I find live giveaways for beta access and games? And what's the weather like in New York, US?"

live_giveaways_by_type = {
    "name": "live_giveaways_by_type",
    "description": "Retrieve live giveaways from the GamerPower API based on the specified type.",
    "parameters": {
        "type": "object",
        "properties": {
            "type": {
                "type": "string",
                "description": "The type of giveaways to retrieve (e.g., game, loot, beta).",
                "default": "game"
            }
        },
        "required": ["type"]
    }
}
get_current_weather = {
    "name": "get_current_weather",
    "description": "Get the current weather",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {
                "type": "string",
                "description": "The city and state, e.g. San Francisco, CA"
            }
        },
        "required": ["location"]
    }
}
get_stock_price = {
    "name": "get_stock_price",
    "description": "Retrieves the current stock price for a given ticker symbol. The ticker symbol must be a valid symbol for a publicly traded company on a major US stock exchange like NYSE or NASDAQ. The tool will return the latest trade price in USD. It should be used when the user asks about the current or most recent price of a specific stock. It will not provide any other information about the stock or company.",
    "parameters": {
        "type": "object",
        "properties": {
            "ticker": {
                "type": "string",
                "description": "The stock ticker symbol, e.g. AAPL for Apple Inc."
            }
        },
        "required": ["ticker"]
    }
}

def convert_to_format_tool(tools):
    """Convert OpenAI-style tool schemas into the compact format Hammer expects."""
    if isinstance(tools, dict):
        format_tools = {
            "name": tools["name"],
            "description": tools["description"],
            "parameters": tools["parameters"].get("properties", {}),
        }
        required = tools["parameters"].get("required", [])
        for param in required:
            format_tools["parameters"][param]["required"] = True
        for param in format_tools["parameters"].keys():
            if "default" in format_tools["parameters"][param]:
                default = format_tools["parameters"][param]["default"]
                format_tools["parameters"][param]["description"] += f" default is '{default}'"
        return format_tools
    elif isinstance(tools, list):
        return [convert_to_format_tool(tool) for tool in tools]
    else:
        return tools

# Helper function to build the input prompt for our model
def build_prompt(task_instruction: str, format_instruction: str, tools: list, query: str):
    prompt = f"[BEGIN OF TASK INSTRUCTION]\n{task_instruction}\n[END OF TASK INSTRUCTION]\n\n"
    prompt += f"[BEGIN OF AVAILABLE TOOLS]\n{json.dumps(tools)}\n[END OF AVAILABLE TOOLS]\n\n"
    prompt += f"[BEGIN OF FORMAT INSTRUCTION]\n{format_instruction}\n[END OF FORMAT INSTRUCTION]\n\n"
    prompt += f"[BEGIN OF QUERY]\n{query}\n[END OF QUERY]\n\n"
    return prompt

# Build the input and start the inference
openai_format_tools = [live_giveaways_by_type, get_current_weather, get_stock_price]
format_tools = convert_to_format_tool(openai_format_tools)
content = build_prompt(TASK_INSTRUCTION, FORMAT_INSTRUCTION, format_tools, query)

messages = [
    {"role": "user", "content": content}
]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)

# tokenizer.eos_token_id is the id of the <|im_end|> token
outputs = model.generate(inputs, max_new_tokens=512, do_sample=False, num_return_sequences=1, eos_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(outputs[0][len(inputs[0]):], skip_special_tokens=True))
~~~
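The model's reply is plain text that, per `FORMAT_INSTRUCTION`, should contain only a JSON list of tool calls. A minimal post-processing sketch (ours, not part of the model card; the `raw_output` string below is a hypothetical example reply, not actual model output):

```python
import json

# Hypothetical model reply following FORMAT_INSTRUCTION (not real model output)
raw_output = (
    '[{"name": "live_giveaways_by_type", "arguments": {"type": "beta"}}, '
    '{"name": "get_current_weather", "arguments": {"location": "New York, US"}}]'
)

def parse_tool_calls(text: str) -> list:
    """Parse the JSON answer; return [] for anything malformed."""
    try:
        calls = json.loads(text.strip())
    except json.JSONDecodeError:
        return []
    if not isinstance(calls, list):
        return []
    # Keep only well-formed {"name": ..., "arguments": {...}} entries
    return [c for c in calls
            if isinstance(c, dict) and "name" in c and isinstance(c.get("arguments"), dict)]

for call in parse_tool_calls(raw_output):
    print(call["name"], call["arguments"])
```

From here, each entry can be dispatched to the corresponding real function by name.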
added_tokens.json (Normal file, 5 lines)
@@ -0,0 +1,5 @@
{
  "<|endoftext|>": 151643,
  "<|im_end|>": 151645,
  "<|im_start|>": 151644
}
config.json (Normal file, 28 lines)
@@ -0,0 +1,28 @@
{
  "_name_or_path": "/home/notebook/data/group/ComplexTaskDecision/Hammer/ckpt/select_caller/xlam_1b/xlam_mask3_0.33_hammer_qwen1.5b_batch32/merge_step4220",
  "architectures": [
    "Qwen2ForCausalLM"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 1536,
  "initializer_range": 0.02,
  "intermediate_size": 8960,
  "max_position_embeddings": 32768,
  "max_window_layers": 28,
  "model_type": "qwen2",
  "num_attention_heads": 12,
  "num_hidden_layers": 28,
  "num_key_value_heads": 2,
  "rms_norm_eps": 1e-06,
  "rope_theta": 1000000.0,
  "sliding_window": null,
  "tie_word_embeddings": true,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.41.2",
  "use_cache": true,
  "use_sliding_window": false,
  "vocab_size": 151646
}
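The attention layout follows from these values: 12 query heads sharing 2 key/value heads means grouped-query attention, and the per-head dimension is `hidden_size / num_attention_heads`. A quick check (our sketch; the variable names are ours, not from the file):

```python
# Values taken from config.json above
hidden_size = 1536
num_attention_heads = 12
num_key_value_heads = 2

head_dim = hidden_size // num_attention_heads                 # dimension of each attention head
queries_per_kv = num_attention_heads // num_key_value_heads   # GQA group size

print(head_dim)        # 128
print(queries_per_kv)  # 6
```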
BIN figures/bfcl.PNG (Normal file, binary file not shown; 324 KiB)
BIN figures/others.PNG (Normal file, binary file not shown; 111 KiB)
generation_config.json (Normal file, 14 lines)
@@ -0,0 +1,14 @@
{
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "repetition_penalty": 1.1,
  "temperature": 0.7,
  "top_k": 20,
  "top_p": 0.8,
  "transformers_version": "4.41.2"
}
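The default sampling settings above (temperature 0.7, top_k 20, top_p 0.8) can be illustrated with a pure-Python filter over a toy distribution. This is our own simplified sketch, not the repository's code or the exact Transformers implementation (which masks logits rather than returning indices):

```python
import math

def filter_logits(logits, temperature=0.7, top_k=20, top_p=0.8):
    """Apply temperature, then keep the highest-probability tokens (at most
    top_k of them) until their cumulative probability reaches top_p.
    Returns the indices of surviving tokens, most probable first."""
    scaled = [l / temperature for l in logits]
    # Numerically stable softmax
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Candidate indices sorted by probability, truncated to top_k
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= top_p:
            break
    return kept

print(filter_logits([2.0, 1.0, 0.1, -1.0]))  # [0, 1]
```

With these toy logits, the top token alone holds about 76% of the mass, so the second token is needed to cross the 0.8 nucleus threshold and the last two are discarded.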
merges.txt (Normal file, 151388 lines)
File diff suppressed because it is too large
model.safetensors (Normal file, 3 lines)
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:fb957615ebe2f1cc490e4fe8401d94aac0bd972f6eba9f30a5fb7ac40483bb67
size 3086576264
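What git stores here is a Git LFS pointer file, not the weights themselves; each line is a `key value` pair. A small sketch of ours parses it and confirms the size is about 3.1 GB, consistent with roughly 1.5B parameters in bfloat16:

```python
# The LFS pointer content shown above
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:fb957615ebe2f1cc490e4fe8401d94aac0bd972f6eba9f30a5fb7ac40483bb67
size 3086576264"""

# Each pointer line is "key value"; split on the first space only
fields = dict(line.split(" ", 1) for line in pointer.splitlines())
size_gb = int(fields["size"]) / 1e9
print(fields["oid"], f"{size_gb:.2f} GB")
```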
special_tokens_map.json (Normal file, 20 lines)
@@ -0,0 +1,20 @@
{
  "additional_special_tokens": [
    "<|im_start|>",
    "<|im_end|>"
  ],
  "eos_token": {
    "content": "<|im_end|>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": {
    "content": "<|endoftext|>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}
tokenizer.json (Normal file, 303112 lines)
File diff suppressed because it is too large
tokenizer_config.json (Normal file, 44 lines)
@@ -0,0 +1,44 @@
{
  "add_prefix_space": false,
  "added_tokens_decoder": {
    "151643": {
      "content": "<|endoftext|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151644": {
      "content": "<|im_start|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "151645": {
      "content": "<|im_end|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "additional_special_tokens": [
    "<|im_start|>",
    "<|im_end|>"
  ],
  "bos_token": null,
  "chat_template": "{% set system_message = 'You are a helpful assistant.' %}{% if messages[0]['role'] == 'system' %}{% set system_message = messages[0]['content'] %}{% endif %}{% if system_message is defined %}{{ '<|im_start|>system\n' + system_message + '<|im_end|>\n' }}{% endif %}{% for message in messages %}{% set content = message['content'] %}{% if message['role'] == 'user' %}{{ '<|im_start|>user\n' + content + '<|im_end|>\n<|im_start|>assistant\n' }}{% elif message['role'] == 'assistant' %}{{ content + '<|im_end|>' + '\n' }}{% endif %}{% endfor %}",
  "clean_up_tokenization_spaces": false,
  "eos_token": "<|im_end|>",
  "errors": "replace",
  "model_max_length": 32768,
  "pad_token": "<|endoftext|>",
  "padding_side": "right",
  "split_special_tokens": false,
  "tokenizer_class": "Qwen2Tokenizer",
  "unk_token": null
}
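The Jinja `chat_template` above wraps each turn in ChatML markers and opens an assistant turn right after every user message. A plain-Python re-implementation of its logic (our sketch to make the rendered prompt concrete; the tokenizer itself renders the Jinja template, not this function):

```python
def render_chatml(messages):
    """Mimic the chat_template: a system turn first (default text if none is
    given), then user/assistant turns wrapped in <|im_start|>/<|im_end|>."""
    system = "You are a helpful assistant."
    if messages and messages[0]["role"] == "system":
        system = messages[0]["content"]
    out = f"<|im_start|>system\n{system}<|im_end|>\n"
    for m in messages:
        if m["role"] == "user":
            # The template opens the assistant turn immediately after each user turn
            out += f"<|im_start|>user\n{m['content']}<|im_end|>\n<|im_start|>assistant\n"
        elif m["role"] == "assistant":
            out += m["content"] + "<|im_end|>\n"
    return out

print(render_chatml([{"role": "user", "content": "Hi"}]))
```

The trailing `<|im_start|>assistant\n` is why `tokenizer.apply_chat_template(..., add_generation_prompt=True)` leaves the model positioned to generate its reply.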
vocab.json (Normal file, 1 line)
File diff suppressed because one or more lines are too long