Initialize project; model provided by the ModelHub XC community

Model: KnutJaegersberg/Tri-21B-Think-gguf
Source: Original Platform
This commit is contained in:
ModelHub XC
2026-04-13 03:37:56 +08:00
commit 85f3f2cb35
12 changed files with 298 additions and 0 deletions

44
.gitattributes vendored Normal file

@@ -0,0 +1,44 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
Tri-21B-Think_4bit.gguf filter=lfs diff=lfs merge=lfs -text
Tri-21B-Think_5bit.gguf filter=lfs diff=lfs merge=lfs -text
Tri-21B-Think_6bit.gguf filter=lfs diff=lfs merge=lfs -text
Tri-21B-Think-fp16.gguf filter=lfs diff=lfs merge=lfs -text
Tri-21B-Think-Preview-fp16.gguf filter=lfs diff=lfs merge=lfs -text
Tri-21B-Think_8bit.gguf filter=lfs diff=lfs merge=lfs -text
Tri-21B-Think-Preview_8bit.gguf filter=lfs diff=lfs merge=lfs -text
Tri-21B-8bit.gguf filter=lfs diff=lfs merge=lfs -text
Tri-21B-fp16.gguf filter=lfs diff=lfs merge=lfs -text

79
LICENSE Normal file

@@ -0,0 +1,79 @@
TRILLION LABS AI MODEL LICENSE AGREEMENT (Version 1.0)
This License Agreement ("Agreement") is entered into between you ("Licensee") and Trillion Labs ("Licensor"), governing the use of the Trillion Labs AI Model series ("Model"). By downloading, installing, copying, or using the Model, you agree to comply with and be bound by the terms of this Agreement. If you do not agree to all the terms, you must not download, install, copy, or use the Model.
1. Definitions
1.1 Model: The artificial intelligence model series provided by Licensor ("Tri-" series), including software, algorithms, machine learning models, and related components provided by Licensor, including all updates, enhancements, improvements, bug fixes, patches, or other modifications.
1.2 Derivatives: Modifications, alterations, enhancements, improvements, adaptations, or derivative works of the Model created by Licensee or third parties, including changes to the Model's architecture, parameters, data processing methods, or functionality.
1.3 Output: Any data, results, content, predictions, analyses, insights, or other materials generated by the Model or Derivatives, whether in original form or further processed or modified.
1.4 Licensor: Trillion Labs, the owner, developer, and provider of the Model, holding all rights, title, and interest in the Model.
1.5 Licensee: The individual or entity using or intending to use the Model under this Agreement.
2. License Grant
2.1 Grant of License: Subject to the terms outlined herein, Licensor grants Licensee a limited, non-exclusive, non-transferable, worldwide, revocable license to:
a. Access, download, install, and use the Model for research purposes, including evaluation, testing, academic research, experimentation, and non-commercial competitions.
b. Publicly disclose research results and findings derived from the Model or Derivatives, including publishing papers or presentations, with appropriate attribution to Licensor.
c. Modify the Model and create Derivatives for research purposes. Derivatives must include "Tri-" followed by the original Model version in their names (e.g., "Tri-21B-xxx").
d. Distribute the Model and Derivatives, each accompanied by a copy of this Agreement.
e. Commercially use the Model or Derivatives only if Licensee's Monthly Active Users (MAU) are fewer than 1 million or Annual Recurring Revenue (ARR) is less than $10 million USD, and with explicit prior written approval from Licensor.
2.2 Scope of License: Any use beyond the explicitly stated permissions, particularly commercial use, requires a separate commercial license agreement approved in writing by Licensor.
3. Restrictions
3.1 Commercial Use: Licensee exceeding the defined thresholds (≥1M MAU or ≥$10M ARR) must secure a separate commercial license from Licensor before commercial deployment.
3.2 Reverse Engineering: Licensee shall not reverse engineer, decompile, disassemble, or attempt to derive proprietary source code, underlying ideas, algorithms, or structure of the Model.
3.3 Unlawful Use: Licensee shall not use the Model or Derivatives for illegal, fraudulent, unauthorized, or unethical purposes or violate applicable laws and regulations.
3.4 Ethical Use: Licensee agrees to ethical and responsible use of the Model and Derivatives, avoiding generation or distribution of misleading, discriminatory, defamatory, harmful, or infringing content.
4. Ownership
4.1 Intellectual Property: All rights, title, and interest in the Model, including modifications, Derivatives, and documentation, remain exclusively with Licensor.
4.2 Output: Licensor retains ownership of all rights, title, and interest in Output generated by the Model or Derivatives. Licensee may use Output solely for permitted research purposes. Licensee shall not commercially exploit or claim ownership of Output unless explicitly permitted under a separate commercial license agreement with Licensor.
4.3 Attribution: Licensee must attribute Licensor, citing the Model's name and version, in any publications or presentations derived from using the Model.
5. No Warranty
5.1 "As-Is" Basis: The Model, Derivatives, and Output are provided "as-is" and "as-available," without warranties of any kind, express or implied, including merchantability, fitness for a particular purpose, accuracy, reliability, non-infringement, or warranties arising from usage or trade.
6. Limitation of Liability
6.1 No Liability for Damages: Licensor shall not be liable for special, incidental, indirect, consequential, exemplary, or punitive damages arising from or connected to the use or inability to use the Model, Derivatives, or Output.
6.2 Indemnification: Licensee agrees to indemnify and hold harmless Licensor from claims, liabilities, damages, losses, or expenses arising from Licensee's use or distribution of the Model, Derivatives, or Output.
7. Termination
7.1 Termination by Licensor: Licensor reserves the right to terminate this Agreement at any time, with or without cause, immediately upon notice, particularly in case of breach of Agreement terms.
7.2 Effect of Termination: Upon termination, Licensee must cease use and destroy all copies of the Model, Derivatives, and Output in its possession.
7.3 Survival: Ownership, No Warranty, Limitation of Liability, and Termination sections survive termination.
8. Governing Law
8.1 This Agreement shall be governed by and construed under the laws of the State of California, without regard to conflict of law principles.
9. Alterations
9.1 Modifications: Licensor reserves the right to amend this Agreement at any time. Continued use of the Model constitutes acceptance of amendments.
9.2 Entire Agreement: This Agreement constitutes the entire agreement between Licensee and Licensor regarding its subject matter, superseding all prior agreements or understandings.
By downloading, installing, or using the Trillion Labs AI Model, Licensee acknowledges that it has read, understood, and agrees to be bound by this Agreement.

148
README.md Normal file

@@ -0,0 +1,148 @@
---
license: other
license_name: trillion
license_link: LICENSE
tags:
- finetuned
- chat
- reasoning
language:
- en
- ko
- ja
pipeline_tag: text-generation
library_name: transformers
base_model:
- trillionlabs/Tri-21B
---
The base model Tri-21B is covered by the Trillion license included in this repo; the Think and Think-Preview versions are licensed under Apache 2.0.
<p align="center">
<picture>
<img src="https://raw.githubusercontent.com/trillion-labs/.github/main/Tri-21B-Think.png" alt="Tri-21B-Think-Preview" style="width: 80%;">
</picture>
</p>
## Introduction
**Tri-21B-Think-Preview** is an intermediate checkpoint of [Tri-21B-Think](https://huggingface.co/trillionlabs/Tri-21B-Think), featuring mid-training context length expansion to 32K tokens and instruction tuning for chain-of-thought reasoning and tool use.
### Model Specifications
- Type: Causal Language Model (Reasoning-Enhanced)
- Base Model: [Tri-21B](https://huggingface.co/trillionlabs/Tri-21B)
- Architecture: Transformer Decoder with RoPE, SwiGLU, RMSNorm, and GQA
- Number of Parameters: 20.73B
- Number of Layers: 40
- Number of Attention Heads: 32 (Query) / 8 (Key, Value)
- Head Dimension: 160
- Hidden Size: 5,120
- Intermediate Size: 27,392
- Context Length: 32,768 (up to 262,144 with YaRN)
- Vocab Size: 124,416
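The specs above note that the native 32,768-token context can be extended to 262,144 tokens with YaRN (a scaling factor of 8). A hedged sketch of the corresponding `config.json` fragment, following the `rope_scaling` convention used by comparable transformers models — the exact field names for this model are an assumption, not confirmed by the README:

```json
{
  "rope_scaling": {
    "type": "yarn",
    "factor": 8.0,
    "original_max_position_embeddings": 32768
  }
}
```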
## Quickstart
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
model_name = "trillionlabs/Tri-21B-Think-Preview"
model = AutoModelForCausalLM.from_pretrained(
model_name,
torch_dtype=torch.bfloat16,
device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
prompt = "Solve the following step by step: What is the sum of the first 100 prime numbers?"
messages = [
{"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
messages,
tokenize=False,
add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
generated_ids = model.generate(
**model_inputs,
max_new_tokens=4096,
temperature=0.6,
top_p=0.9
)
generated_ids = [
output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
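Since the model emits its chain-of-thought inside `<think>`/`</think>` tags (see the fine-tuning notes below), the raw generation from the quickstart can be split into reasoning and final answer. A minimal sketch — `split_think` is an illustrative helper, not part of the model's API:

```python
def split_think(response: str) -> tuple[str, str]:
    """Split a generation into (reasoning, answer) on the closing </think> tag.

    If no closing tag is present, the whole response is treated as the answer.
    """
    tag = "</think>"
    if tag in response:
        reasoning, _, answer = response.partition(tag)
        return reasoning.replace("<think>", "").strip(), answer.strip()
    return "", response.strip()

# Example on a hand-written response string
reasoning, answer = split_think(
    "<think>2, 3, 5, ... the 100th prime is 541.</think>"
    "The sum of the first 100 primes is 24133."
)
```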
### vLLM & SGLang Deployment
vLLM and SGLang support for the Trillion models is on the way. Stay tuned!
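Until then, the GGUF files in this repo can be run locally with llama.cpp. A usage sketch — the quantization choice and context size are illustrative, and flag behavior may vary across llama.cpp versions:

```shell
# Interactive chat with the 4-bit quantization (smallest file, ~12.6 GB)
llama-cli -m Tri-21B-Think_4bit.gguf -cnv -c 8192

# Or expose an OpenAI-compatible HTTP endpoint
llama-server -m Tri-21B-Think_8bit.gguf -c 8192 --port 8080
```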
## Fine-tuning Notes
> **Note on `<think>` tags:** This model was trained without `<think>` and `</think>` as special tokens. They were added post-training for compatibility with reasoning parsers. If you plan to fine-tune this model, you'll need to modify `tokenizer_config.json` to avoid indexing errors.
Replace the entries for token IDs 123975 and 123976 in `tokenizer_config.json` with reserved placeholders:
```json
"123975": {
"content": "<|reserved_special_token_9|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"123976": {
"content": "<|reserved_special_token_10|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
}
```
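The replacement above can be scripted. A sketch assuming the token entries live under the `added_tokens_decoder` key of `tokenizer_config.json` (the standard transformers layout); `patch_tokenizer_config` is an illustrative helper, and the file paths in the usage comment are assumptions:

```python
import json

# Swap the post-hoc <think> special tokens back to reserved placeholders
# so fine-tuning frameworks do not hit indexing errors.
REPLACEMENTS = {
    "123975": "<|reserved_special_token_9|>",
    "123976": "<|reserved_special_token_10|>",
}

def patch_tokenizer_config(config: dict) -> dict:
    decoder = config.get("added_tokens_decoder", {})
    for token_id, content in REPLACEMENTS.items():
        if token_id in decoder:
            decoder[token_id]["content"] = content
    return config

# Usage (paths assumed):
# with open("tokenizer_config.json") as f:
#     cfg = patch_tokenizer_config(json.load(f))
# with open("tokenizer_config.json", "w") as f:
#     json.dump(cfg, f, indent=2)
```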
## Evaluation
| Category | Benchmark | Description | Tri-21B-Think-Preview |
| :--- | :--- | :--- | :---: |
| **Reasoning** | GPQA-Diamond | Graduate-level science questions across physics, chemistry, and biology (PhD-level) | 54 |
| | AIME 2025 | American Invitational Mathematics Examination 2025 | 50.0 |
| | MMLU-Pro | Massive Multitask Language Understanding with more answer choices and reasoning-focused questions | 65.19 |
| | HLE | Humanity's Last Exam — 2,500 expert-level questions across 100+ subjects created by nearly 1,000 domain experts | 5.12 |
| **Coding** | LiveCodeBench v6 | Competitive programming benchmark with problems sourced from recent programming contests | 48.57 |
| | SciCode | Code generation across 338 subproblems in 16 natural science fields drawn from real research workflows | 18 |
| **Instruction Following** | IFEval | Tests ability to follow precise formatting and output constraint instructions | 84.05 |
| | IFBench | Evaluates generalization to novel, verifiable output constraints not seen during training (Allen AI) | 51.02 |
| **Agentic** | TAU2-Bench (Telecom) | Dual-control conversational benchmark where both agent and user use tools to resolve telecom scenarios (Sierra) | 93 |
| | AA-LCR | Long-context reasoning over multiple documents at 10K–100K tokens (Artificial Analysis) | 15 |
| | AA-Omniscience | Factual reliability across 6,000 questions in 42 subtopics, penalizing hallucinations (Artificial Analysis) | -48.55 |
| **Korean** | KMMLU-Pro | 2,822 questions from 14 Korean National Professional Licensure exams (LG AI Research) | 54.18 |
| | CLIcK | 1,995 Korean cultural and linguistic knowledge questions sourced from official exams and textbooks (KAIST) | 77.94 |
| | KoBALT | Korean linguistic understanding across syntax, semantics, pragmatics, phonetics, and morphology (SNU) | 47.29 |
## Limitations
- **Language Support**: Optimized for English, Korean, and Japanese. Other languages may show degraded performance.
- **Knowledge Cutoff**: February 2025.
- **Intermediate Checkpoint**: See [Tri-21B-Think](https://huggingface.co/trillionlabs/Tri-21B-Think) for the final model.
## License
This model is licensed under the [Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0).
## Contact
For inquiries: [info@trillionlabs.co](mailto:info@trillionlabs.co)

3
Tri-21B-8bit.gguf Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:95338c37be401d6c3af6563eb11a5fa8e8cb57114cbafed61278b09a8d9164d3
size 22029446688


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:7e6a1b5e2ff7d5bd204731e5a950af00eb2a52c5fbb65f99d43165b4727ae0dc
size 41459232864


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:80b0724d721a3750985ecf7fea53246bfe4ae3ce8d112c24abe25a820577463c
size 22029447264

3
Tri-21B-Think-fp16.gguf Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:0777e811c281ac31b87cb27f011006954139f57ae94b366dae92f669f87fb0b0
size 41459232832

3
Tri-21B-Think_4bit.gguf Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:744267e765abf9db198836acf705b2df5962b28d6126ea96cafb803b2fe3720f
size 12588064832

3
Tri-21B-Think_5bit.gguf Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f401fdb8a03d687147f8874553de9d0756f840c53ed189bd6443d82cfdead5e3
size 14732075072

3
Tri-21B-Think_6bit.gguf Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:195318d6f3dbd4d73b2a23bf5ada34c751b0be9999d343db08d8da0ed64f0f8a
size 17010085952

3
Tri-21B-Think_8bit.gguf Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:4f70806203577fc082247ac43f2d34a5525df2215a44db1e90905f620be17aef
size 22029447232

3
Tri-21B-fp16.gguf Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:60f281ffda9edeaf5efa3aa1671f860076a464c0ac67e1c37256536670b1838b
size 41459232288
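The LFS pointers above record exact file sizes. Combined with the 20.73B parameter count from the README, they allow a rough sanity check of each quantization's effective bits per weight — a back-of-the-envelope calculation that ignores tokenizer, metadata, and embedding overhead:

```python
PARAMS = 20.73e9  # parameter count from the README

# File sizes in bytes, taken from the LFS pointers in this commit
sizes = {
    "Tri-21B-Think_4bit.gguf": 12_588_064_832,
    "Tri-21B-Think_5bit.gguf": 14_732_075_072,
    "Tri-21B-Think_6bit.gguf": 17_010_085_952,
    "Tri-21B-Think_8bit.gguf": 22_029_447_232,
    "Tri-21B-Think-fp16.gguf": 41_459_232_832,
}

def bits_per_weight(size_bytes: int, params: float = PARAMS) -> float:
    """Effective bits stored per parameter, ignoring file overhead."""
    return size_bytes * 8 / params

for name, size in sizes.items():
    print(f"{name}: {bits_per_weight(size):.2f} bpw")
```

As expected, the fp16 file works out to almost exactly 16 bits per weight, while the "N-bit" quantizations land somewhat above N because GGUF quantization schemes carry per-block scale data.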