Initialize project; model provided by the ModelHub XC community
Model: ValiantLabs/Qwen3-4B-Thinking-2507-Esper3.1 Source: Original Platform
README.md
---
language:
- en
library_name: transformers
pipeline_tag: text-generation
tags:
- esper
- esper-3.1
- esper-3
- valiant
- valiant-labs
- qwen
- qwen-3
- qwen-3-4b
- qwen3-4b-thinking-2507
- 4b
- reasoning
- code
- code-instruct
- python
- javascript
- dev-ops
- jenkins
- terraform
- ansible
- docker
- kubernetes
- helm
- grafana
- prometheus
- shell
- bash
- azure
- aws
- gcp
- cloud
- scripting
- powershell
- problem-solving
- architect
- engineer
- developer
- creative
- analytical
- expert
- rationality
- conversational
- chat
- instruct
base_model: Qwen/Qwen3-4B-Thinking-2507
datasets:
- sequelbox/Titanium3-DeepSeek-V3.1-Terminus
- sequelbox/Tachibana3-Part1-DeepSeek-V3.1-Terminus
- sequelbox/Tachibana3-Part2-DeepSeek-V3.2
- sequelbox/Mitakihara-DeepSeek-R1-0528
license: apache-2.0
---

**[Support our open-source dataset and model releases!](https://huggingface.co/spaces/sequelbox/SupportOpenSource)**

Esper 3.1: [Ministral-3-3B-Reasoning-2512](https://huggingface.co/ValiantLabs/Ministral-3-3B-Reasoning-2512-Esper3.1), [Qwen3-4B-Thinking-2507](https://huggingface.co/ValiantLabs/Qwen3-4B-Thinking-2507-Esper3.1), [Ministral-3-8B-Reasoning-2512](https://huggingface.co/ValiantLabs/Ministral-3-8B-Reasoning-2512-Esper3.1), [Ministral-3-14B-Reasoning-2512](https://huggingface.co/ValiantLabs/Ministral-3-14B-Reasoning-2512-Esper3.1), [gpt-oss-20b](https://huggingface.co/ValiantLabs/gpt-oss-20b-Esper3.1), [Qwen3.5-27B](https://huggingface.co/ValiantLabs/Qwen3.5-27B-Esper3.1), [Qwen3.6-27B](https://huggingface.co/ValiantLabs/Qwen3.6-27B-Esper3.1), [Qwen3.6-35B-A3B](https://huggingface.co/ValiantLabs/Qwen3.6-35B-A3B-Esper3.1)

Esper 3.1 is a coding, architecture, and DevOps reasoning specialist built on Qwen 3.
- Your dedicated DevOps expert: Esper 3.1 maximizes DevOps and architecture helpfulness, powered by [high-difficulty DevOps and architecture data](https://huggingface.co/datasets/sequelbox/Titanium3-DeepSeek-V3.1-Terminus) generated with DeepSeek-V3.1-Terminus!
- Improved coding performance: challenging code-reasoning datasets stretch [DeepSeek-V3.1-Terminus](https://huggingface.co/datasets/sequelbox/Tachibana3-Part1-DeepSeek-V3.1-Terminus) and [DeepSeek-V3.2](https://huggingface.co/datasets/sequelbox/Tachibana3-Part2-DeepSeek-V3.2) to the limits, allowing Esper 3.1 to tackle harder coding tasks!
- AI to build AI: our [high-difficulty AI expertise data](https://huggingface.co/datasets/sequelbox/Mitakihara-DeepSeek-R1-0528) boosts Esper 3.1's MLOps, AI architecture, AI research, and general reasoning skills.
- Small model sizes allow running on local desktop and mobile, plus super-fast server inference!

## Prompting Guide
Esper 3.1 uses the [Qwen3-4B-Thinking-2507](https://huggingface.co/Qwen/Qwen3-4B-Thinking-2507) prompt format.
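
For orientation, Qwen3-family models use a ChatML-style layout in which each turn is wrapped in `<|im_start|>` and `<|im_end|>` markers. The sketch below hand-builds such a prompt for a single user message. The helper name `render_chatml` and the sample message are hypothetical, and the opening `<think>` tag appended to the assistant turn is an assumption about how the Thinking-2507 template starts generation. The exact string should always come from `tokenizer.apply_chat_template`.

```python
def render_chatml(messages, add_generation_prompt=True, open_think=True):
    """Hand-rolled approximation of a ChatML-style prompt (illustration only)."""
    parts = []
    for message in messages:
        # Each turn: start marker, role, newline, content, end marker.
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Open the assistant turn so the model continues from here.
        parts.append("<|im_start|>assistant\n")
        if open_think:
            # Assumption: the thinking template opens the reasoning block itself.
            parts.append("<think>\n")
    return "".join(parts)


example = render_chatml(
    [{"role": "user", "content": "Write a Dockerfile for a Flask app."}]
)
print(example)
```

If the template does open the reasoning block in the prompt, the generated text would contain only the closing `</think>` tag, which is why the script in this guide searches for that tag alone.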

Example inference script to get started:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "ValiantLabs/Qwen3-4B-Thinking-2507-Esper3.1"

# load the tokenizer and the model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)

# prepare the model input
prompt = "Write a Terraform configuration that uses the `aws_ami` data source to find the latest Amazon Linux 2 AMI. Then, provision an EC2 instance using this dynamically determined AMI ID."
messages = [
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    # Kept from the general Qwen3 example. The Thinking-2507 base model is
    # documented as thinking-only, so this flag is likely a no-op here.
    enable_thinking=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# conduct text completion
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=32768
)
# keep only the newly generated tokens (drop the prompt)
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()

# parsing thinking content
try:
    # rindex finding 151668 (</think>)
    index = len(output_ids) - output_ids[::-1].index(151668)
except ValueError:
    # no </think> found: treat the whole output as the final answer
    index = 0

thinking_content = tokenizer.decode(output_ids[:index], skip_special_tokens=True).strip("\n")
content = tokenizer.decode(output_ids[index:], skip_special_tokens=True).strip("\n")

print("thinking content:", thinking_content)
print("content:", content)
```
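
The parsing step at the end of the script can be isolated into a small helper that works on plain lists of token IDs, which makes it easy to check without loading the model. This is only a sketch: the function name `split_at_last_marker` is hypothetical, and the ID 151668 for `</think>` is taken from the comment in the script above rather than verified against the tokenizer.

```python
END_THINK_ID = 151668  # ID for </think>, as stated in the script's comment


def split_at_last_marker(output_ids, marker_id=END_THINK_ID):
    """Split token IDs just after the last occurrence of marker_id.

    Returns (thinking_ids, answer_ids). If the marker is absent, everything
    is treated as the answer, mirroring the `index = 0` fallback.
    """
    try:
        # Search the reversed list so the last marker is the one found.
        index = len(output_ids) - output_ids[::-1].index(marker_id)
    except ValueError:
        index = 0
    return output_ids[:index], output_ids[index:]


# Toy IDs stand in for real model output.
thinking_ids, answer_ids = split_at_last_marker([11, 22, END_THINK_ID, 33, 44])
print(thinking_ids)
print(answer_ids)
```

With this split the marker itself lands in the thinking half, so the answer half starts with the first token after `</think>`.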

Esper 3.1 is created by [Valiant Labs](http://valiantlabs.ca/).

[Check out our HuggingFace page to see all of our models!](https://huggingface.co/ValiantLabs)

We care about open source. For everyone to use.