初始化项目,由ModelHub XC社区提供模型
Model: hadadxyz/OpenSonnet-Lite-MAX Source: Original Platform
This commit is contained in:
154
README.md
Normal file
154
README.md
Normal file
@@ -0,0 +1,154 @@
|
||||
---
|
||||
base_model:
|
||||
- Qwen/Qwen3-4B-Thinking-2507
|
||||
|
||||
datasets:
|
||||
- Roman1111111/claude-sonnet-4.6-120000x
|
||||
- Roman1111111/claude-sonnet-4.6-100000X-filtered
|
||||
- TeichAI/lordx64-claude-opus-4.7-max-cleaned
|
||||
- Crownelius/Opus-4.6-Reasoning-3300x
|
||||
- TeichAI/claude-4.5-opus-high-reasoning-250x
|
||||
- TeichAI/claude-haiku-4.5-high-reasoning-1700x
|
||||
- TeichAI/claude-sonnet-4.5-high-reasoning-250x
|
||||
- TeichAI/deepseek-v3.2-speciale-openr1-math-3k
|
||||
- TeichAI/deepseek-v3.2-speciale-1000x
|
||||
- Roman1111111/gemini-3-pro-10000x-hard-high-reasoning
|
||||
- Roman1111111/gemini-3.1-pro-hard-high-reasoning
|
||||
- Jackrong/DeepSeek-V4-Distill-8000x
|
||||
|
||||
tags:
|
||||
- opensonnet
|
||||
- claude-sonnet
|
||||
- sonnet
|
||||
|
||||
pipeline_tag: text-generation
|
||||
library_name: transformers
|
||||
license: apache-2.0
|
||||
license_link: https://huggingface.co/hadadxyz/OpenSonnet-Lite-MAX/blob/main/LICENSE
|
||||
---
|
||||
|
||||
# Comparison
|
||||
|
||||
| Model | Training Approach | Developer Role | Context Length | Training Epochs | Transformers Version | Notes |
|
||||
|------------------------------------------------------------------------------|--------------------------|------------------------|----------------|------------------|------------------------|-------------------------------------------------------------------------------------|
|
||||
| [OpenSonnet-Lite-MAX](https://huggingface.co/hadadxyz/OpenSonnet-Lite-MAX) | Multi-Stage Fine-Tuning | Supported | 262,144 | 2 | `transformers>=5.0.0` | Latest version with improved training efficiency and enhanced instruction alignment |
|
||||
| [OpenSonnet-Lite](https://huggingface.co/hadadxyz/OpenSonnet-Lite) | Single-Stage Fine-Tuning | Not supported | 262,144 | 3 | `transformers>=4.51.0` | Previous version with simpler training pipeline |
|
||||
| [Qwen3-4B-Thinking-2507](https://huggingface.co/Qwen/Qwen3-4B-Thinking-2507) | N/A | Not supported | 262,144 | N/A | `transformers>=4.51.0` | Base model |
|
||||
|
||||
> [OpenSonnet-Lite-MAX quick demo](https://www.kaggle.com/code/hadadrjt/opensonnet-lite-max) with tool calling.
|
||||
|
||||
### Benchmark Evaluation
|
||||
|
||||
| Dataset | Score | Source | Framework |
|
||||
|-------------------------------------------------------|--------|---------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------|
|
||||
| [GSM8K](https://huggingface.co/datasets/openai/gsm8k) | 85.22 | [Evaluation Results](https://huggingface.co/hadadxyz/OpenSonnet-Lite-MAX/tree/main/.eval_results) | [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness) |
|
||||
| MMLU-Pro | - | - | - |
|
||||
| GPQA (Diamond) | - | - | - |
|
||||
|
||||
|
||||
# Inference Parameters
|
||||
|
||||
For best results, the following sampling configuration is recommended:
|
||||
|
||||
| Parameter | Recommended Value | Description |
|
||||
|---------------------|---------------------|------------------------------------------|
|
||||
| temperature | 0.6 (default) - 1.0 | Controls randomness in generation |
|
||||
| top_p | 0.95 (default) | Nucleus sampling threshold |
|
||||
| top_k | 20 (default) - 40 | Top-k sampling parameter |
|
||||
| min_p | 0.0 (default) | Minimum probability threshold |
|
||||
| repetition_penalty | 1.0 (default) - 1.2 | Penalizes repeated tokens |
|
||||
| presence_penalty | 1.0 - 1.5 | Encourages introducing new topics |
|
||||
|
||||
|
||||
# Max Tokens
|
||||
|
||||
| Small Tasks | Medium Tasks | Large Tasks | Complex Tasks |
|
||||
|-------------|--------------|-------------|---------------|
|
||||
| 4096/8192 | 16384 | 32768/81920 | 131072 |
|
||||
|
||||
|
||||
# Instruction
|
||||
|
||||
```md
|
||||
You are OpenSonnet, a large language model trained by the Open Source community. You are based on the Qwen3 architecture.
|
||||
|
||||
You are an AI assistant designed to provide accurate, helpful, and context-aware responses. Your reasoning style must dynamically adapt based on the complexity of the user’s request.
|
||||
|
||||
---
|
||||
|
||||
# Adaptive Thinking Mode
|
||||
|
||||
* Automatically assess the complexity of each user request before responding.
|
||||
|
||||
* If the task is complex, multi-step, analytical, or requires planning, reasoning, or explanation:
|
||||
- Use structured, step-by-step reasoning internally before responding.
|
||||
- Provide a clear, well-organized, and thorough answer.
|
||||
|
||||
* If the task is simple, factual, or straightforward:
|
||||
- Use fast, minimal reasoning.
|
||||
- Respond concisely without unnecessary elaboration.
|
||||
|
||||
---
|
||||
|
||||
# Complexity Detection Guidelines
|
||||
|
||||
* Treat a request as COMPLEX if it involves:
|
||||
- Multi-step problem solving
|
||||
- Logic, mathematics, coding, or debugging
|
||||
- Planning, strategy, or decision making
|
||||
- Deep explanation or comparison
|
||||
- Ambiguous or multi-part instructions
|
||||
|
||||
* Treat a request as SIMPLE if it involves:
|
||||
- Direct factual questions
|
||||
- Basic definitions
|
||||
- Short instructions
|
||||
- Common knowledge retrieval
|
||||
- Single-step tasks
|
||||
|
||||
---
|
||||
|
||||
# Response Style Rules
|
||||
|
||||
* Always prioritize correctness and clarity.
|
||||
|
||||
* For complex tasks: structure answers clearly using sections or bullet points when helpful.
|
||||
|
||||
* For simple tasks: keep responses short and direct.
|
||||
|
||||
* Avoid unnecessary verbosity in all cases.
|
||||
|
||||
---
|
||||
|
||||
# Quality Principles
|
||||
|
||||
* Be accurate, logical, and consistent.
|
||||
|
||||
* Do not hallucinate information.
|
||||
|
||||
* If uncertain, clearly state limitations.
|
||||
|
||||
* Optimize responses for usefulness and readability.
|
||||
|
||||
---
|
||||
|
||||
# User Intent Focus
|
||||
|
||||
* Always prioritize the user’s intent over literal interpretation.
|
||||
|
||||
* If the request is ambiguous, make reasonable assumptions or ask a clarifying question when necessary.
|
||||
```
|
||||
|
||||
|
||||
# Citation
|
||||
|
||||
If you use this model in your research or applications, please cite both this model and the base model:
|
||||
|
||||
```bibtex
|
||||
@misc{opensonnet-lite-max,
|
||||
author = {hadadxyz},
|
||||
title = {OpenSonnet-Lite-MAX},
|
||||
year = {2026},
|
||||
url = {https://huggingface.co/hadadxyz/OpenSonnet-Lite-MAX}
|
||||
}
|
||||
```
|
||||
Reference in New Issue
Block a user