diff --git a/.gitattributes b/.gitattributes
index 53d7257..d2fc868 100644
--- a/.gitattributes
+++ b/.gitattributes
@@ -44,4 +44,8 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
-*tfevents* filter=lfs diff=lfs merge=lfs -text
\ No newline at end of file
+*tfevents* filter=lfs diff=lfs merge=lfs -text
+
+merges.txt filter=lfs diff=lfs merge=lfs -text
+tokenizer.json filter=lfs diff=lfs merge=lfs -text
+vocab.json filter=lfs diff=lfs merge=lfs -text
\ No newline at end of file
diff --git a/README.md b/README.md
index 5f8b3b3..c2636f0 100644
--- a/README.md
+++ b/README.md
@@ -1,47 +1,111 @@
---
-license: Apache License 2.0
-
-#model-type:
-## e.g. gpt, phi, llama, chatglm, baichuan
-#- gpt
-
-#domain:
-## e.g. nlp, cv, audio, multi-modal
-#- nlp
-
-#language:
-## list of language codes: https://help.aliyun.com/document_detail/215387.html?spm=a2c4g.11186623.0.0.9f8d7467kni6Aa
-#- cn
-
-#metrics:
-## e.g. CIDEr, BLEU, ROUGE
-#- CIDEr
-
-#tags:
-## custom tags, including training methods such as pretrained, fine-tuned, instruction-tuned, RL-tuned, and others
-#- pretrained
-
-#tools:
-## e.g. vllm, fastchat, llamacpp, AdaSeq
-#- vllm
+license: apache-2.0
+base_model:
+- Qwen/Qwen3-1.7B
+datasets:
+- prithivMLmods/Demeter-LongCoT-400K
+language:
+- en
+pipeline_tag: text-generation
+library_name: transformers
+tags:
+- text-generation-inference
+- LongCoT
+- moe
+- trl
+- math
+- code
+- stem
---
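-### The contributors of this model have not provided a more detailed model description. The model files and weights are available on the "Model Files" page.
-#### You can download the model with the git clone command below or with the ModelScope SDK
-SDK download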
-```bash
-# Install ModelScope
-pip install modelscope
-```
+
+
+# **Demeter-LongCoT-Qwen3-1.7B**
+
+> **Demeter-LongCoT-Qwen3-1.7B** is a reasoning-focused model fine-tuned from **Qwen/Qwen3-1.7B** on the **Demeter-LongCoT-400K** dataset.
+> It is designed for **math and code chain-of-thought reasoning** and for well-formed structured output, and is aimed at developers, educators, and researchers who need step-by-step reasoning.
+
+> [!note]
+> GGUF: [https://huggingface.co/prithivMLmods/Demeter-LongCoT-Qwen3-1.7B-GGUF](https://huggingface.co/prithivMLmods/Demeter-LongCoT-Qwen3-1.7B-GGUF)
+
+---
+
+## **Key Features**
+
+1. **Unified Reasoning in Math & Code**
+ Fine-tuned on **Demeter-LongCoT-400K**, which emphasizes extended chain-of-thought reasoning in mathematics, algorithms, and programming workflows.
+
+2. **Advanced Code Understanding & Generation**
+ Handles multi-language programming tasks with explanations, optimization hints, and error detection—suited for algorithm synthesis, debugging, and prototyping.
+
+3. **Mathematical Problem Solving**
+ Excels at step-by-step derivations, symbolic manipulations, and applied problem solving across calculus, algebra, and logic-based reasoning.
+
+4. **Chain-of-Thought Focused Reasoning**
+ Optimized to produce clear, structured thought processes for both **STEM explanations** and **computational logic** tasks.
+
+5. **Structured Output Mastery**
+ Generates well-formed outputs in **LaTeX**, **Markdown**, **JSON**, **CSV**, and **YAML**, enabling smooth integration with research pipelines and technical documentation.
+
+6. **Balanced Performance for Deployment**
+ Designed to deliver strong reasoning under moderate compute budgets, deployable on **mid-range GPUs**, **offline clusters**, and **specialized edge AI systems**.
+
+---
+
+## **Quickstart with Transformers**
+
```python
-# Download the model with the SDK
-from modelscope import snapshot_download
-model_dir = snapshot_download('prithivMLmods/Demeter-LongCoT-Qwen3-1.7B')
-```
-Git download
-```
-# Download the model with Git
-git clone https://www.modelscope.cn/prithivMLmods/Demeter-LongCoT-Qwen3-1.7B.git
+from transformers import AutoModelForCausalLM, AutoTokenizer
+
+model_name = "prithivMLmods/Demeter-LongCoT-Qwen3-1.7B"
+
+model = AutoModelForCausalLM.from_pretrained(
+ model_name,
+ torch_dtype="auto",
+ device_map="auto"
+)
+tokenizer = AutoTokenizer.from_pretrained(model_name)
+
+prompt = "Solve the integral of x^2 * e^x step by step."
+
+messages = [
+ {"role": "system", "content": "You are a tutor skilled in math, code, and step-by-step reasoning."},
+ {"role": "user", "content": prompt}
+]
+
+text = tokenizer.apply_chat_template(
+ messages,
+ tokenize=False,
+ add_generation_prompt=True
+)
+
+model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
+
+generated_ids = model.generate(
+ **model_inputs,
+ max_new_tokens=512
+)
+generated_ids = [
+ output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
+]
+
+response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
+print(response)
```
-
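If you are a contributor to this model, we invite you to complete the model card promptly, following the model contribution documentation.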
\ No newline at end of file
+---
+
+## **Intended Use**
+
+* Step-by-step math tutoring and symbolic derivation
+* Advanced coding assistant for algorithms, debugging, and structured reasoning
+* Chain-of-thought generation for research and education tools
+* Producing structured outputs for technical documentation and computational pipelines
+* Deployments requiring reliable reasoning under constrained compute
+
+## **Limitations**
+
+* Not tuned for general-purpose or conversational tasks
+* May underperform in long-form multi-document contexts
+* Specialized in math and code—general writing or casual dialogue may be weak
+* Prioritizes structured reasoning over natural or emotional tone generation
\ No newline at end of file
diff --git a/added_tokens.json b/added_tokens.json
new file mode 100644
index 0000000..b54f913
--- /dev/null
+++ b/added_tokens.json
@@ -0,0 +1,28 @@
+{
+ "</think>": 151668,
+ "</tool_call>": 151658,
+ "</tool_response>": 151666,
+ "<think>": 151667,
+ "<tool_call>": 151657,
+ "<tool_response>": 151665,
+ "<|box_end|>": 151649,
+ "<|box_start|>": 151648,
+ "<|endoftext|>": 151643,
+ "<|file_sep|>": 151664,
+ "<|fim_middle|>": 151660,
+ "<|fim_pad|>": 151662,
+ "<|fim_prefix|>": 151659,
+ "<|fim_suffix|>": 151661,
+ "<|im_end|>": 151645,
+ "<|im_start|>": 151644,
+ "<|image_pad|>": 151655,
+ "<|object_ref_end|>": 151647,
+ "<|object_ref_start|>": 151646,
+ "<|quad_end|>": 151651,
+ "<|quad_start|>": 151650,
+ "<|repo_name|>": 151663,
+ "<|video_pad|>": 151656,
+ "<|vision_end|>": 151653,
+ "<|vision_pad|>": 151654,
+ "<|vision_start|>": 151652
+}
diff --git a/chat_template.jinja b/chat_template.jinja
new file mode 100644
index 0000000..e079919
--- /dev/null
+++ b/chat_template.jinja
@@ -0,0 +1,98 @@
+{%- if tools %}
+ {{- '<|im_start|>system\n' }}
+ {%- if messages[0].role == 'system' %}
+ {{- messages[0].content + '\n\n' }}
+ {%- endif %}
+ {{- "# Tools\n\nYou may call one or more functions to assist with the user query.\n\nYou are provided with function signatures within <tools></tools> XML tags:\n<tools>" }}
+ {%- for tool in tools %}
+ {{- "\n" }}
+ {{- tool | tojson }}
+ {%- endfor %}
+ {{- "\n</tools>\n\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\n<tool_call>\n{\"name\": <function-name>, \"arguments\": <args-json-object>}\n</tool_call><|im_end|>\n" }}
+{%- else %}
+ {%- if messages[0].role == 'system' %}
+ {{- '<|im_start|>system\n' + messages[0].content + '<|im_end|>\n' }}
+ {%- endif %}
+{%- endif %}
+{%- set ns = namespace(multi_step_tool=true, last_query_index=messages|length - 1) %}
+{%- for forward_message in messages %}
+ {%- set index = (messages|length - 1) - loop.index0 %}
+ {%- set message = messages[index] %}
+ {%- set current_content = message.content if message.content is not none else '' %}
+ {%- set tool_start = '<tool_response>' %}
+ {%- set tool_start_length = tool_start|length %}
+ {%- set start_of_message = current_content[:tool_start_length] %}
+ {%- set tool_end = '</tool_response>' %}
+ {%- set tool_end_length = tool_end|length %}
+ {%- set start_pos = (current_content|length) - tool_end_length %}
+ {%- if start_pos < 0 %}
+ {%- set start_pos = 0 %}
+ {%- endif %}
+ {%- set end_of_message = current_content[start_pos:] %}
+ {%- if ns.multi_step_tool and message.role == "user" and not(start_of_message == tool_start and end_of_message == tool_end) %}
+ {%- set ns.multi_step_tool = false %}
+ {%- set ns.last_query_index = index %}
+ {%- endif %}
+{%- endfor %}
+{%- for message in messages %}
+ {%- if (message.role == "user") or (message.role == "system" and not loop.first) %}
+ {{- '<|im_start|>' + message.role + '\n' + message.content + '<|im_end|>' + '\n' }}
+ {%- elif message.role == "assistant" %}
+ {%- set content = message.content %}
+ {%- set reasoning_content = '' %}
+ {%- if message.reasoning_content is defined and message.reasoning_content is not none %}
+ {%- set reasoning_content = message.reasoning_content %}
+ {%- else %}
+ {%- if '</think>' in message.content %}
+ {%- set content = (message.content.split('</think>')|last).lstrip('\n') %}
+ {%- set reasoning_content = (message.content.split('</think>')|first).rstrip('\n') %}
+ {%- set reasoning_content = (reasoning_content.split('<think>')|last).lstrip('\n') %}
+ {%- endif %}
+ {%- endif %}
+ {%- if loop.index0 > ns.last_query_index %}
+ {%- if loop.last or (not loop.last and reasoning_content) %}
+ {{- '<|im_start|>' + message.role + '\n<think>\n' + reasoning_content.strip('\n') + '\n</think>\n\n' + content.lstrip('\n') }}
+ {%- else %}
+ {{- '<|im_start|>' + message.role + '\n' + content }}
+ {%- endif %}
+ {%- else %}
+ {{- '<|im_start|>' + message.role + '\n' + content }}
+ {%- endif %}
+ {%- if message.tool_calls %}
+ {%- for tool_call in message.tool_calls %}
+ {%- if (loop.first and content) or (not loop.first) %}
+ {{- '\n' }}
+ {%- endif %}
+ {%- if tool_call.function %}
+ {%- set tool_call = tool_call.function %}
+ {%- endif %}
+ {{- '<tool_call>\n{"name": "' }}
+ {{- tool_call.name }}
+ {{- '", "arguments": ' }}
+ {%- if tool_call.arguments is string %}
+ {{- tool_call.arguments }}
+ {%- else %}
+ {{- tool_call.arguments | tojson }}
+ {%- endif %}
+ {{- '}\n</tool_call>' }}
+ {%- endfor %}
+ {%- endif %}
+ {{- '<|im_end|>\n' }}
+ {%- elif message.role == "tool" %}
+ {%- if loop.first or (messages[loop.index0 - 1].role != "tool") %}
+ {{- '<|im_start|>user' }}
+ {%- endif %}
+ {{- '\n<tool_response>\n' }}
+ {{- message.content }}
+ {{- '\n</tool_response>' }}
+ {%- if loop.last or (messages[loop.index0 + 1].role != "tool") %}
+ {{- '<|im_end|>\n' }}
+ {%- endif %}
+ {%- endif %}
+{%- endfor %}
+{%- if add_generation_prompt %}
+ {{- '<|im_start|>assistant\n' }}
+ {%- if enable_thinking is defined and enable_thinking is false %}
+ {{- '<think>\n\n</think>\n\n' }}
+ {%- endif %}
+{%- endif %}
\ No newline at end of file
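
Review note on chat_template.jinja: the assistant branch of the template separates a stored message into reasoning and visible answer by splitting on the closing think marker. The sketch below mirrors that string logic in plain Python so it can be checked without loading the tokenizer. It assumes the standard Qwen3 `<think>` / `</think>` markers; `split_reasoning` is a hypothetical helper name and is not part of transformers.

```python
def split_reasoning(message: str) -> tuple[str, str]:
    """Return (reasoning, answer), following the template's assistant branch."""
    open_tag, close_tag = "<think>", "</think>"  # assumed Qwen3 markers
    if close_tag not in message:
        # No reasoning block: the whole message is the answer.
        return "", message
    # Text after the last closing marker is the visible answer.
    answer = message.split(close_tag)[-1].lstrip("\n")
    # Text before the first closing marker, minus the opening marker, is the reasoning.
    reasoning = message.split(close_tag)[0].rstrip("\n")
    reasoning = reasoning.split(open_tag)[-1].lstrip("\n")
    return reasoning, answer


raw = "<think>\nIntegrate by parts twice.\n</think>\n\nThe result is e^x (x^2 - 2x + 2) + C."
reasoning, answer = split_reasoning(raw)
print(reasoning)
print(answer)
```

The same helper could be applied to the decoded `response` from the README quickstart, provided the markers survive decoding as ordinary text.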
diff --git a/config.json b/config.json
new file mode 100644
index 0000000..a520b07
--- /dev/null
+++ b/config.json
@@ -0,0 +1,60 @@
+{
+ "architectures": [
+ "Qwen3ForCausalLM"
+ ],
+ "attention_bias": false,
+ "attention_dropout": 0.0,
+ "eos_token_id": 151645,
+ "head_dim": 128,
+ "hidden_act": "silu",
+ "hidden_size": 2048,
+ "initializer_range": 0.02,
+ "intermediate_size": 6144,
+ "layer_types": [
+ "full_attention",
+ "full_attention",
+ "full_attention",
+ "full_attention",
+ "full_attention",
+ "full_attention",
+ "full_attention",
+ "full_attention",
+ "full_attention",
+ "full_attention",
+ "full_attention",
+ "full_attention",
+ "full_attention",
+ "full_attention",
+ "full_attention",
+ "full_attention",
+ "full_attention",
+ "full_attention",
+ "full_attention",
+ "full_attention",
+ "full_attention",
+ "full_attention",
+ "full_attention",
+ "full_attention",
+ "full_attention",
+ "full_attention",
+ "full_attention",
+ "full_attention"
+ ],
+ "max_position_embeddings": 40960,
+ "max_window_layers": 28,
+ "model_type": "qwen3",
+ "num_attention_heads": 16,
+ "num_hidden_layers": 28,
+ "num_key_value_heads": 8,
+ "pad_token_id": 151654,
+ "rms_norm_eps": 1e-06,
+ "rope_scaling": null,
+ "rope_theta": 1000000,
+ "sliding_window": null,
+ "tie_word_embeddings": true,
+ "torch_dtype": "float16",
+ "transformers_version": "4.55.2",
+ "use_cache": true,
+ "use_sliding_window": false,
+ "vocab_size": 151936
+}
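
Review note on config.json: the fields above are enough to estimate the weight count and the key/value cache footprint. The sketch below is back-of-the-envelope arithmetic, not an official accounting. It takes `attention_bias: false` and `tie_word_embeddings: true` from the config, and it assumes gated MLPs with three projections plus per-head query/key RMSNorm weights, as in the Qwen3 architecture. `count_parameters` and `kv_cache_bytes` are hypothetical helper names.

```python
# Values copied from config.json in this diff.
cfg = {
    "hidden_size": 2048,
    "intermediate_size": 6144,
    "head_dim": 128,
    "num_attention_heads": 16,
    "num_key_value_heads": 8,
    "num_hidden_layers": 28,
    "vocab_size": 151936,
    "max_position_embeddings": 40960,
}


def count_parameters(c: dict) -> int:
    """Estimate the weight count, assuming bias-free projections and tied embeddings."""
    q_dim = c["num_attention_heads"] * c["head_dim"]
    kv_dim = c["num_key_value_heads"] * c["head_dim"]
    attention = c["hidden_size"] * (2 * q_dim + 2 * kv_dim)  # q, o, k, v projections
    mlp = 3 * c["hidden_size"] * c["intermediate_size"]      # gate, up, down (assumed)
    norms = 2 * c["hidden_size"] + 2 * c["head_dim"]         # layer norms + assumed q/k norms
    embedding = c["vocab_size"] * c["hidden_size"]           # shared with the output head
    final_norm = c["hidden_size"]
    return embedding + c["num_hidden_layers"] * (attention + mlp + norms) + final_norm


def kv_cache_bytes(c: dict, tokens: int, bytes_per_value: int = 2) -> int:
    """Keys plus values for every layer; 2 bytes per value corresponds to float16."""
    values_per_token = 2 * c["num_hidden_layers"] * c["num_key_value_heads"] * c["head_dim"]
    return values_per_token * tokens * bytes_per_value


print(f"{count_parameters(cfg):,} parameters")
print(f"{kv_cache_bytes(cfg, cfg['max_position_embeddings']) / 2**30} GiB of KV cache at full context")
```

Under these assumptions the estimate comes to about 1.72 billion weights, consistent with the 1.7B in the model name, and a full 40960-token context needs about 4.4 GiB of float16 KV cache for a single sequence.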
diff --git a/configuration.json b/configuration.json
new file mode 100644
index 0000000..bbeeda1
--- /dev/null
+++ b/configuration.json
@@ -0,0 +1 @@
+{"framework": "pytorch", "task": "text-generation", "allow_remote": true}
\ No newline at end of file
diff --git a/generation_config.json b/generation_config.json
new file mode 100644
index 0000000..215aaea
--- /dev/null
+++ b/generation_config.json
@@ -0,0 +1,14 @@
+{
+ "bos_token_id": 151643,
+ "do_sample": true,
+ "eos_token_id": [
+ 151645,
+ 151643
+ ],
+ "max_length": 40960,
+ "pad_token_id": 151654,
+ "temperature": 0.6,
+ "top_k": 20,
+ "top_p": 0.95,
+ "transformers_version": "4.55.2"
+}
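
Review note on generation_config.json: with `do_sample: true`, the defaults `temperature: 0.6`, `top_k: 20`, and `top_p: 0.95` shape the distribution each token is drawn from. The sketch below shows one common way to combine the three filters in pure Python. It is an illustration of the idea, not the transformers implementation, and `sampling_distribution` is a hypothetical name.

```python
import math


def sampling_distribution(logits, temperature=0.6, top_k=20, top_p=0.95):
    """Return {token_index: probability} after temperature, top-k, then top-p."""
    scaled = [x / temperature for x in logits]
    # Top-k: keep the indices of the k highest scaled logits.
    ranked = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)[:top_k]
    # Softmax over the survivors (subtract the max for numerical stability).
    peak = scaled[ranked[0]]
    weights = [math.exp(scaled[i] - peak) for i in ranked]
    total = sum(weights)
    probs = [w / total for w in weights]
    # Top-p: keep the shortest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for i, p in zip(ranked, probs):
        kept.append((i, p))
        cumulative += p
        if cumulative >= top_p:
            break
    norm = sum(p for _, p in kept)
    return {i: p / norm for i, p in kept}


dist = sampling_distribution([2.0, 1.0, 0.5, -3.0], top_k=3)
print(dist)
```

A temperature below 1 sharpens the distribution before the cutoffs apply, so with these defaults the low-probability tail is usually removed by `top_p` well before `top_k` becomes the binding limit.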
diff --git a/merges.txt b/merges.txt
new file mode 100644
index 0000000..80c1a19
--- /dev/null
+++ b/merges.txt
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8831e4f1a044471340f7c0a83d7bd71306a5b867e95fd870f74d0c5308a904d5
+size 1671853
diff --git a/model.safetensors b/model.safetensors
new file mode 100644
index 0000000..ab4b037
--- /dev/null
+++ b/model.safetensors
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:79ea242756a4433459646b0c69336be5070c71397bdd6612ec815e27fcc63af6
+size 135
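
Review note on model.safetensors: the pointer above records a size of 135 bytes, which is about the size of a Git LFS pointer file itself and far below what float16 weights for a model of this scale would occupy (several gigabytes). One possible reading is that a pointer file, rather than the weights, was committed. The sketch below parses a pointer and flags an implausibly small size. `parse_lfs_pointer` and the `MIN_PLAUSIBLE_BYTES` threshold are hypothetical and chosen only for illustration.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    fields["size"] = int(fields["size"])
    return fields


pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:79ea242756a4433459646b0c69336be5070c71397bdd6612ec815e27fcc63af6\n"
    "size 135\n"
)

# Hypothetical threshold: real weight files for this model should be far larger.
MIN_PLAUSIBLE_BYTES = 1_000_000_000
looks_truncated = pointer["size"] < MIN_PLAUSIBLE_BYTES
print(pointer["size"], looks_truncated)
```

A check like this could run before release so that an upload containing only a pointer is caught early.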
diff --git a/special_tokens_map.json b/special_tokens_map.json
new file mode 100644
index 0000000..9b8043f
--- /dev/null
+++ b/special_tokens_map.json
@@ -0,0 +1,31 @@
+{
+ "additional_special_tokens": [
+ "<|im_start|>",
+ "<|im_end|>",
+ "<|object_ref_start|>",
+ "<|object_ref_end|>",
+ "<|box_start|>",
+ "<|box_end|>",
+ "<|quad_start|>",
+ "<|quad_end|>",
+ "<|vision_start|>",
+ "<|vision_end|>",
+ "<|vision_pad|>",
+ "<|image_pad|>",
+ "<|video_pad|>"
+ ],
+ "eos_token": {
+ "content": "<|im_end|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "pad_token": {
+ "content": "<|vision_pad|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ }
+}
diff --git a/tokenizer.json b/tokenizer.json
new file mode 100644
index 0000000..cd71f61
--- /dev/null
+++ b/tokenizer.json
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:aeb13307a71acd8fe81861d94ad54ab689df773318809eed3cbe794b4492dae4
+size 11422654
diff --git a/tokenizer_config.json b/tokenizer_config.json
new file mode 100644
index 0000000..f8a4de0
--- /dev/null
+++ b/tokenizer_config.json
@@ -0,0 +1,240 @@
+{
+ "add_bos_token": false,
+ "add_prefix_space": false,
+ "added_tokens_decoder": {
+ "151643": {
+ "content": "<|endoftext|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "151644": {
+ "content": "<|im_start|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "151645": {
+ "content": "<|im_end|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "151646": {
+ "content": "<|object_ref_start|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "151647": {
+ "content": "<|object_ref_end|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "151648": {
+ "content": "<|box_start|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "151649": {
+ "content": "<|box_end|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "151650": {
+ "content": "<|quad_start|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "151651": {
+ "content": "<|quad_end|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "151652": {
+ "content": "<|vision_start|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "151653": {
+ "content": "<|vision_end|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "151654": {
+ "content": "<|vision_pad|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "151655": {
+ "content": "<|image_pad|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "151656": {
+ "content": "<|video_pad|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "151657": {
+ "content": "<tool_call>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "151658": {
+ "content": "</tool_call>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "151659": {
+ "content": "<|fim_prefix|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "151660": {
+ "content": "<|fim_middle|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "151661": {
+ "content": "<|fim_suffix|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "151662": {
+ "content": "<|fim_pad|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "151663": {
+ "content": "<|repo_name|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "151664": {
+ "content": "<|file_sep|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "151665": {
+ "content": "<tool_response>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "151666": {
+ "content": "</tool_response>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "151667": {
+ "content": "<think>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "151668": {
+ "content": "</think>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ }
+ },
+ "additional_special_tokens": [
+ "<|im_start|>",
+ "<|im_end|>",
+ "<|object_ref_start|>",
+ "<|object_ref_end|>",
+ "<|box_start|>",
+ "<|box_end|>",
+ "<|quad_start|>",
+ "<|quad_end|>",
+ "<|vision_start|>",
+ "<|vision_end|>",
+ "<|vision_pad|>",
+ "<|image_pad|>",
+ "<|video_pad|>"
+ ],
+ "bos_token": null,
+ "clean_up_tokenization_spaces": false,
+ "eos_token": "<|im_end|>",
+ "errors": "replace",
+ "extra_special_tokens": {},
+ "model_max_length": 40960,
+ "pad_token": "<|vision_pad|>",
+ "padding_side": "right",
+ "split_special_tokens": false,
+ "tokenizer_class": "Qwen2Tokenizer",
+ "unk_token": null
+}
diff --git a/vocab.json b/vocab.json
new file mode 100644
index 0000000..6c49fc6
--- /dev/null
+++ b/vocab.json
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ca10d7e9fb3ed18575dd1e277a2579c16d108e32f27439684afa0e10b1440910
+size 2776833