Initialize project; model provided by the ModelHub XC community

Model: dicta-il/DictaLM-3.0-1.7B-Instruct
Source: Original Platform
ModelHub XC
2026-06-04 10:52:17 +08:00
commit 796eba0e16
13 changed files with 151978 additions and 0 deletions

36
.gitattributes vendored Normal file

@@ -0,0 +1,36 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
tokenizer.json filter=lfs diff=lfs merge=lfs -text

0
COMPLETED Normal file

92
README.md Normal file

@@ -0,0 +1,92 @@
---
license: apache-2.0
pipeline_tag: text-generation
language:
- en
- he
tags:
- pretrained
inference:
  parameters:
    temperature: 0.6
---
[<img src="https://i.ibb.co/5Lbwyr1/dicta-logo.jpg" width="300px"/>](https://dicta.org.il)
# Dicta-LM 3.0: Advancing The Frontier of Hebrew Sovereign LLMs
Dicta-LM 3.0 is a powerful open-weight collection of LLMs trained on extensive corpora of Hebrew and English text. The models are available for download and unlimited use, and they set a new state of the art (SOTA) for Hebrew in their weight class, both as base models and as chat models.
This is the 1.7-billion-parameter instruct model in full precision (BF16), originally initialized from [Qwen3-1.7B-Base](https://huggingface.co/Qwen/Qwen3-1.7B-Base).
For full details of this model please read our [release blog post](https://dicta.org.il/dicta-lm-3) or the [technical report](https://www.dicta.org.il/publications/DictaLM_3_0___Techincal_Report.pdf).
The full `DictaLM 3.0` collection, covering base and instruct models in both unquantized and quantized versions, is available [here](https://huggingface.co/collections/dicta-il/dictalm-30-collection).
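As a rough sanity check on the "1.7-billion-parameter" and BF16 figures, the sketch below counts weights from the values in this repository's `config.json`. The per-layer layout (bias-free q/k/v/o projections, a gated MLP with three matrices, two RMSNorm vectors per layer plus per-head q/k norms, and an output head tied to the embeddings) is an assumption based on the Qwen3 architecture named in the config; it is not stated in this README.

```python
# Values copied from config.json in this repository.
HIDDEN_SIZE = 2048
INTERMEDIATE_SIZE = 6144
NUM_LAYERS = 28
NUM_HEADS = 16
NUM_KV_HEADS = 8
HEAD_DIM = 128
VOCAB_SIZE = 151936
BYTES_PER_PARAM = 2  # bfloat16


def count_parameters() -> int:
    """Estimate the weight count, assuming a Qwen3-style decoder layout."""
    # Embedding matrix, shared with the output head (tie_word_embeddings is true).
    embeddings = VOCAB_SIZE * HIDDEN_SIZE
    attention = (
        HIDDEN_SIZE * NUM_HEADS * HEAD_DIM            # q_proj
        + 2 * HIDDEN_SIZE * NUM_KV_HEADS * HEAD_DIM   # k_proj and v_proj
        + NUM_HEADS * HEAD_DIM * HIDDEN_SIZE          # o_proj
    )
    mlp = 3 * HIDDEN_SIZE * INTERMEDIATE_SIZE  # gate, up and down projections
    norms = 2 * HIDDEN_SIZE + 2 * HEAD_DIM     # two layer norms plus q/k norms (assumed)
    per_layer = attention + mlp + norms
    return embeddings + NUM_LAYERS * per_layer + HIDDEN_SIZE  # plus the final norm


total = count_parameters()
print(f"{total:,} parameters")
print(f"{total * BYTES_PER_PARAM:,} bytes of weights in BF16")
```

Under these assumptions the estimate comes to about 1.72 billion parameters, or roughly 3.44 GB in BF16. That is within a few tens of kilobytes of the `model.safetensors` size recorded in this commit (3,441,185,576 bytes); the small remainder would be consistent with file header metadata.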
## Instruction format
To benefit from the instruction fine-tuning, render your prompt with the chat template shipped with this model. Most libraries apply the template automatically, so you usually do not need to do this by hand.
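To make the format concrete, here is a minimal hand-written sketch of what the template in `chat_template.jinja` produces for plain text conversations. It is an approximation for illustration only: `render_prompt` is a hypothetical helper, it ignores tools and tool responses, and in practice the library's own template rendering should be preferred.

```python
# Default system prompt, copied from chat_template.jinja in this repository.
DEFAULT_SYSTEM_PROMPT = (
    "You are a helpful AI assistant named Dicta-LM 3.0, Trained by Dicta, "
    "the Israel Center for Text Analysis. Your role is to provide accurate, "
    "helpful, and well-structured responses to user questions and requests.\n"
    "Provide clear, logical, and precise answers that thoroughly address what "
    "the user is asking for. Structure your responses in a way that is easy "
    "to understand and follow."
)


def render_prompt(messages, add_generation_prompt=True):
    """Render text-only chat messages in the template's ChatML layout (no tools)."""
    # A leading system message replaces the default system prompt.
    system = ""
    if messages and messages[0]["role"] == "system":
        system = messages[0]["content"].rstrip()
    parts = ["<|im_start|>system\n", system or DEFAULT_SYSTEM_PROMPT, "<|im_end|>"]

    for index, message in enumerate(messages):
        role = message["role"]
        if role == "system" and index == 0:
            continue  # already emitted in the header above
        if role not in ("user", "system", "assistant"):
            raise ValueError(f"role {role!r} is not handled by this sketch")
        parts.append(f"\n<|im_start|>{role}\n{message['content']}<|im_end|>")

    parts.append("\n")
    if add_generation_prompt:
        # Open an assistant turn for the model to complete.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


print(render_prompt([{"role": "user", "content": "Hello, how are you?"}]))
```

Each turn is wrapped in `<|im_start|>` and `<|im_end|>` markers, and the prompt ends with an open assistant turn when `add_generation_prompt` is set.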
## Usage
We recommend using vLLM, but you can use Transformers as well:
### Transformers
```python
from transformers import pipeline

generator = pipeline('text-generation', model="dicta-il/DictaLM-3.0-1.7B-Instruct")

messages = [
    {"role": "user", "content": "איזה רוטב אהוב עליך?"},
    {"role": "assistant", "content": "טוב, אני די מחבב כמה טיפות מיץ לימון סחוט טרי. זה מוסיף בדיוק את הכמות הנכונה של טעם חמצמץ לכל מה שאני מבשל במטבח!"},
    {"role": "user", "content": "האם יש לך מתכונים למיונז?"}
]

print(generator(messages)[0]['generated_text'][-1])  # just print the last message
# {'role': 'assistant', 'content': 'בהחלט! הנה מתכון פשוט למיונז בסיסי:\n\nמרכיבים:\n- 1 ביצה\n- 1 כף חומץ לבן (חומץ תפוחים או חומץ בן יין לבן עובד היטב)\n- 1/4 כוס שמנת כבדה או חלב מרוכז\n- 1/4 כפית מלח\n- 1/4 כפית פלפל שחור גרוס\n- 1 כף שום גבישי, קצוץ דק\n\nהוראות:\n1. בקערה קטנה, טורפים יחד את הביצה, חומץ, שמנת, מלח ופלפל.\n2. מניחים את הבלילה על מגש, מקפלים אותה ליצירת גלילים דקים.\n3. מכסים את המגש בניילון נצמד ומתפיחים במקרר למשך שעה לפחות.\n4. מקמחים קלות את הגלילים, ואז מגלגלים אותם בין שני קצוות של נייר אפייה עד שהצד החלק כלפי מטה.\n5. מחממים את התנור ל-175°C (350'}
```
### vLLM
```bash
vllm serve dicta-il/DictaLM-3.0-1.7B-Instruct --enable-auto-tool-choice --tool-call-parser hermes
```
And then you can access it via the openai library:
```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="sk-no-key-required"
)

response = client.chat.completions.create(
    model="dicta-il/DictaLM-3.0-1.7B-Instruct",
    messages=[
        {"role": "user", "content": "Hello, how are you?"}
    ],
)

print(response.choices[0].message.content)
```
The model supports tool calling, enabling integration with external tools and APIs. For examples of how to use tool calling, see the [vLLM documentation](https://docs.vllm.ai/en/stable/features/tool_calling/#tool-calling).
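For readers working with raw generated text rather than a server that parses tool calls for them, the following sketch extracts calls in the `<tool_call>` JSON format that this model's chat template describes. The helper name `extract_tool_calls` and the `get_weather` tool are hypothetical, and the sketch simply skips blocks that are not valid JSON.

```python
import json
import re

# Matches the body of each <tool_call>...</tool_call> block in generated text.
TOOL_CALL_PATTERN = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)


def extract_tool_calls(text):
    """Return a list of {"name", "arguments"} dicts found in the text."""
    calls = []
    for match in TOOL_CALL_PATTERN.finditer(text):
        try:
            payload = json.loads(match.group(1))
        except json.JSONDecodeError:
            continue  # ignore malformed blocks in this sketch
        if isinstance(payload, dict) and "name" in payload:
            calls.append(
                {"name": payload["name"], "arguments": payload.get("arguments", {})}
            )
    return calls


# Hypothetical model output containing one call to a made-up tool.
generated = (
    "Let me check.\n"
    '<tool_call>\n{"name": "get_weather", "arguments": {"city": "Jerusalem"}}\n</tool_call>'
)
print(extract_tool_calls(generated))
```

When the server is started with a tool-call parser, as in the `vllm serve` command above, this parsing should normally not be needed on the client side.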
## Citation
If you use this model, please cite:
```bibtex
@article{Shmidman2025DictaLM3,
title={{Dicta-LM 3.0: Advancing The Frontier of Hebrew Sovereign LLMs}},
author={Shaltiel Shmidman and Avi Shmidman and Amir DN Cohen and Moshe Koppel},
year={2025},
publisher={{DICTA / Jerusalem, Israel}},
note={https://www.dicta.org.il/publications/DictaLM_3_0___Techincal_Report.pdf}
}
```

28
added_tokens.json Normal file

@@ -0,0 +1,28 @@
{
"</think>": 151668,
"</tool_call>": 151658,
"</tool_response>": 151666,
"<think>": 151667,
"<tool_call>": 151657,
"<tool_response>": 151665,
"<|box_end|>": 151649,
"<|box_start|>": 151648,
"<|endoftext|>": 151643,
"<|file_sep|>": 151664,
"<|fim_middle|>": 151660,
"<|fim_pad|>": 151662,
"<|fim_prefix|>": 151659,
"<|fim_suffix|>": 151661,
"<|im_end|>": 151645,
"<|im_start|>": 151644,
"<|image_pad|>": 151655,
"<|object_ref_end|>": 151647,
"<|object_ref_start|>": 151646,
"<|quad_end|>": 151651,
"<|quad_start|>": 151650,
"<|repo_name|>": 151663,
"<|video_pad|>": 151656,
"<|vision_end|>": 151653,
"<|vision_pad|>": 151654,
"<|vision_start|>": 151652
}

77
chat_template.jinja Normal file

@@ -0,0 +1,77 @@
{# ───── header (system message) ───── #}
{{- "<|im_start|>system\n" -}}
{# ───── get the custom instructions + optional thinking override ───── #}
{%- if messages[0].role == "system" -%}
{%- set custom_instructions = messages[0].content.rstrip() -%}
{%- endif -%}
{# ───── set the system prompt ───── #}
{%- if custom_instructions -%}
{{- custom_instructions -}}
{%- else -%}
{{- "You are a helpful AI assistant named Dicta-LM 3.0, Trained by Dicta, the Israel Center for Text Analysis. Your role is to provide accurate, helpful, and well-structured responses to user questions and requests.\nProvide clear, logical, and precise answers that thoroughly address what the user is asking for. Structure your responses in a way that is easy to understand and follow." -}}
{%- endif -%}
{%- set ns = namespace(multi_step_tool=true, last_query_index=messages | length - 1) -%}
{%- for message in messages[::-1] -%}
{%- set index = messages | length - 1 - loop.index0 -%}
{%- if ns.multi_step_tool and message.role == "user" and message.content is string and not (message.content.startswith("<tool_response>") and message.content.endswith("</tool_response>")) -%}
{%- set ns.multi_step_tool = false -%}
{%- set ns.last_query_index = index -%}
{%- endif -%}
{%- endfor -%}
{%- if tools -%}
{{- "\n\n## Tools\n\nYou may call one or more functions to assist with the user query.\n\nYou are provided with function signatures within <tools></tools> XML tags:\n<tools>" -}}
{%- for tool in tools -%}
{{- "\n" -}}
{{- tool | tojson -}}
{%- endfor -%}
{{- "\n</tools>\n\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\n<tool_call>\n{\"name\": <function-name>, \"arguments\": <args-json-object>}\n</tool_call>" -}}
{%- endif -%}
{{- "<|im_end|>" -}}
{%- for message in messages -%}
{%- if message.content is string -%}
{%- set content = message.content -%}
{%- else -%}
{%- set content = "" -%}
{%- endif -%}
{%- if message.role == "user" or message.role == "system" and not loop.first -%}
{{- "\n<|im_start|>" + message.role + "\n" + content + "<|im_end|>" -}}
{%- elif message.role == "assistant" -%}
{% generation %}
{{- "\n<|im_start|>" + message.role + "\n" + content -}}
{%- if message.tool_calls -%}
{%- for tool_call in message.tool_calls -%}
{%- if loop.first and content or not loop.first -%}
{{- "\n" -}}
{%- endif -%}
{%- if tool_call.function -%}
{%- set tool_call = tool_call.function -%}
{%- endif -%}
{{- "<tool_call>\n{\"name\": \"" -}}
{{- tool_call.name -}}
{{- "\", \"arguments\": " -}}
{%- if tool_call.arguments is string -%}
{{- tool_call.arguments -}}
{%- else -%}
{{- tool_call.arguments | tojson -}}
{%- endif -%}
{{- "}\n</tool_call>" -}}
{%- endfor -%}
{%- endif -%}
{{- "<|im_end|>" -}}
{% endgeneration %}
{%- elif message.role == "tool" -%}
{%- if loop.first or messages[loop.index0 - 1].role != "tool" -%}
{{- "\n<|im_start|>user" -}}
{%- endif -%}
{{- "\n<tool_response>\n" -}}
{{- content -}}
{{- "\n</tool_response>" -}}
{%- if loop.last or messages[loop.index0 + 1].role != "tool" -%}
{{- "<|im_end|>" -}}
{%- endif -%}
{%- endif -%}
{%- endfor -%}
{{- "\n" -}}
{%- if add_generation_prompt -%}
{{- "<|im_start|>assistant\n" -}}
{%- endif -%}

58
config.json Normal file

@@ -0,0 +1,58 @@
{
"architectures": [
"Qwen3ForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"head_dim": 128,
"hidden_act": "silu",
"hidden_size": 2048,
"initializer_range": 0.02,
"intermediate_size": 6144,
"layer_types": [
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention"
],
"max_position_embeddings": 62080,
"max_window_layers": 28,
"model_type": "qwen3",
"num_attention_heads": 16,
"num_hidden_layers": 28,
"num_key_value_heads": 8,
"rms_norm_eps": 1e-06,
"rope_scaling": null,
"rope_theta": 1000000.0,
"sliding_window": null,
"tie_word_embeddings": true,
"torch_dtype": "bfloat16",
"transformers_version": "4.55.4",
"use_cache": true,
"use_sliding_window": false,
"vocab_size": 151936
}

14
generation_config.json Normal file

@@ -0,0 +1,14 @@
{
"bos_token_id": 151643,
"do_sample": true,
"eos_token_id": [
151645,
151643
],
"pad_token_id": 151643,
"temperature": 0.6,
"top_k": 20,
"top_p": 0.95,
"transformers_version": "4.55.4",
"trust_remote_code": true
}
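
The sampling defaults above (`temperature`, `top_k`, `top_p`) can be illustrated with a small pure-Python sketch. It shows one common way to combine the three settings: scale the logits, keep the top k candidates, then keep the smallest high-probability set whose mass reaches `top_p`. The function names are hypothetical, and this is not the implementation used by any particular inference library.

```python
import math
import random

# Values copied from generation_config.json in this repository.
TEMPERATURE = 0.6
TOP_K = 20
TOP_P = 0.95


def filtered_distribution(logits, temperature=TEMPERATURE, top_k=TOP_K, top_p=TOP_P):
    """Return {token_id: probability} after temperature, top-k and top-p filtering."""
    if not logits:
        raise ValueError("logits must not be empty")
    # Temperature: values below 1.0 sharpen the distribution.
    scaled = [(logit / temperature, token_id) for token_id, logit in enumerate(logits)]
    # Top-k: keep only the k highest-scoring candidates.
    candidates = sorted(scaled, reverse=True)[:top_k]
    # Softmax over the survivors (subtract the max for numerical stability).
    peak = candidates[0][0]
    weights = [(math.exp(score - peak), token_id) for score, token_id in candidates]
    total = sum(weight for weight, _ in weights)
    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for weight, token_id in weights:
        kept.append((weight, token_id))
        cumulative += weight / total
        if cumulative >= top_p:
            break
    kept_total = sum(weight for weight, _ in kept)
    return {token_id: weight / kept_total for weight, token_id in kept}


def sample(logits, rng=random):
    """Draw one token id from the filtered distribution."""
    distribution = filtered_distribution(logits)
    token_ids = list(distribution)
    weights = [distribution[token_id] for token_id in token_ids]
    return rng.choices(token_ids, weights=weights, k=1)[0]


print(filtered_distribution([2.0, 1.0, 0.0, -1.0]))
```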

151388
merges.txt Normal file

File diff suppressed because it is too large Load Diff

3
model.safetensors Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e5caa19697d69a45380861dd4af6efa043c17da4466f1d92045d7557aa998e21
size 3441185576
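
The three lines above are a Git LFS pointer: the repository stores this small text file, while the weights themselves live in LFS storage. As a sketch of how such a pointer could be used to check a downloaded file, the code below parses the fields and compares size and SHA-256 digest. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the sketch assumes the `oid` always uses `sha256`, as it does here.

```python
import hashlib

# Pointer contents copied from model.safetensors in this commit.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:e5caa19697d69a45380861dd4af6efa043c17da4466f1d92045d7557aa998e21
size 3441185576
"""


def parse_lfs_pointer(text):
    """Parse a Git LFS pointer into version, hash algorithm, digest and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algorithm, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algorithm": algorithm,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Check a local file against the pointer's size and SHA-256 digest."""
    hasher = hashlib.sha256()
    size = 0
    with open(path, "rb") as handle:
        # Read in chunks so multi-gigabyte files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["algorithm"], pointer["size"])
```

Calling `matches_pointer("model.safetensors", pointer)` on a fully downloaded copy would then be expected to return `True`.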

38
special_tokens_map.json Normal file

@@ -0,0 +1,38 @@
{
"additional_special_tokens": [
"<|im_start|>",
"<|im_end|>",
"<|object_ref_start|>",
"<|object_ref_end|>",
"<|box_start|>",
"<|box_end|>",
"<|quad_start|>",
"<|quad_end|>",
"<|vision_start|>",
"<|vision_end|>",
"<|vision_pad|>",
"<|image_pad|>",
"<|video_pad|>"
],
"eos_token": {
"content": "<|endoftext|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"pad_token": {
"content": "<|endoftext|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"sep_token": {
"content": "<|endoftext|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
}
}

3
tokenizer.json Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3c3dfe474a8bbe89b0e83627fd9ff784ad71027f12fd8c618708c818e808789d
size 11422648

240
tokenizer_config.json Normal file

@@ -0,0 +1,240 @@
{
"add_bos_token": false,
"add_prefix_space": false,
"added_tokens_decoder": {
"151643": {
"content": "<|endoftext|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151644": {
"content": "<|im_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151645": {
"content": "<|im_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151646": {
"content": "<|object_ref_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151647": {
"content": "<|object_ref_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151648": {
"content": "<|box_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151649": {
"content": "<|box_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151650": {
"content": "<|quad_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151651": {
"content": "<|quad_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151652": {
"content": "<|vision_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151653": {
"content": "<|vision_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151654": {
"content": "<|vision_pad|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151655": {
"content": "<|image_pad|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151656": {
"content": "<|video_pad|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"151657": {
"content": "<tool_call>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151658": {
"content": "</tool_call>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151659": {
"content": "<|fim_prefix|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151660": {
"content": "<|fim_middle|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151661": {
"content": "<|fim_suffix|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151662": {
"content": "<|fim_pad|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151663": {
"content": "<|repo_name|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151664": {
"content": "<|file_sep|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151665": {
"content": "<tool_response>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151666": {
"content": "</tool_response>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151667": {
"content": "<think>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"151668": {
"content": "</think>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
}
},
"additional_special_tokens": [
"<|im_start|>",
"<|im_end|>",
"<|object_ref_start|>",
"<|object_ref_end|>",
"<|box_start|>",
"<|box_end|>",
"<|quad_start|>",
"<|quad_end|>",
"<|vision_start|>",
"<|vision_end|>",
"<|vision_pad|>",
"<|image_pad|>",
"<|video_pad|>"
],
"bos_token": null,
"clean_up_tokenization_spaces": false,
"eos_token": "<|endoftext|>",
"errors": "replace",
"extra_special_tokens": {},
"model_max_length": 131072,
"pad_token": "<|endoftext|>",
"sep_token": "<|endoftext|>",
"split_special_tokens": false,
"tokenizer_class": "Qwen2Tokenizer",
"unk_token": null
}

1
vocab.json Normal file

File diff suppressed because one or more lines are too long