Initialize project; model provided by the ModelHub XC community

Model: trillionlabs/Tri-7B-Base Source: Original Platform
2026-05-20 22:44:12 +08:00
commit 474eadc1fc
13 changed files with 1790 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,68 @@
+---
+license: apache-2.0
+tags:
+- pretrained
+- base-model
+language:
+- en
+- ko
+- ja
+pipeline_tag: text-generation
+library_name: transformers
+extra_gated_fields:
+  Full Name: text
+  Email: text
+  Organization: text
+---
+
+<p align="center">
+<picture>
+  <img src="https://raw.githubusercontent.com/trillion-labs/.github/main/Tri-7B.png" alt="Tri-7B-Base" style="width: 80%;">
+</picture>
+</p>
+
+# Tri-7B-Base
+
+## Introduction
+
+We present **Tri-7B-Base**, a foundation language model that serves as the pre-trained base for our Tri-7B model family. It was trained with an emphasis on compute efficiency and is intended as a starting point for downstream fine-tuning and adaptation.
+
+### Key Features
+* **Foundation Architecture**: Decoder-only transformer with RoPE, SwiGLU, and RMSNorm, optimized for efficiency
+* **Multilingual Foundation**: Pre-trained on diverse data in Korean, English, and Japanese
+* **Efficient Training**: Optimized training methodology for computational efficiency
+
+### Model Specifications
+
+#### Tri-7B-Base
+- Type: Causal Language Model
+- Training Stage: Pre-training
+- Architecture: Transformer Decoder with RoPE, SwiGLU, RMSNorm
+- Number of Parameters: 7.76B
+- Number of Layers: 32
+- Number of Attention Heads: 32
+- Context Length: 4,096 tokens
+- Vocabulary Size: 128,128
+
+## Use Cases
+
+As a base model, Tri-7B-Base is designed to serve as a foundation for various downstream applications:
+
+- **Fine-tuning**: Adapt to specific domains or tasks
+- **Instruction Tuning**: Create chat or assistant models
+- **Domain Specialization**: Customize for specific industries or use cases
+- **Research**: Explore model behaviors and capabilities
+- **Language Generation**: General text completion and generation tasks
+
+## Limitations
+
+- **Base Model Nature**: This is a pre-trained base model without instruction tuning or alignment. For chat or assistant capabilities, consider fine-tuned variants.
+- **Language Support**: The model is optimized for English, Korean, and Japanese. Usage with other languages may result in degraded performance.
+- **Knowledge Cutoff**: The model's information is limited to data available up to February 2025.
+- **Generation Quality**: As a base model, outputs may require post-processing or filtering for production use cases.
+
+## License
+This model is licensed under the Apache License 2.0.
+
+## Contact
+For inquiries, please contact: info@trillionlabs.co
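
Note on usage: the README added in this commit lists `library_name: transformers` and a 4,096-token context length but includes no loading snippet. The following is a minimal sketch of plain text completion, not the authors' documented usage. It assumes the checkpoint loads through the stock `AutoTokenizer` and `AutoModelForCausalLM` classes, and that access to the gated repository has already been granted and authenticated. The helper names `max_new_tokens_for` and `complete` are hypothetical and exist only for illustration.

```python
"""Sketch: text completion with Tri-7B-Base within its 4,096-token context."""

MODEL_ID = "trillionlabs/Tri-7B-Base"  # repository name from the commit header
CONTEXT_LENGTH = 4096                  # context length stated in the model card


def max_new_tokens_for(prompt_tokens: int, requested: int,
                       context_length: int = CONTEXT_LENGTH) -> int:
    """Clamp the requested generation length to the room left in the context."""
    if prompt_tokens < 0 or requested < 0:
        raise ValueError("token counts must be non-negative")
    remaining = max(context_length - prompt_tokens, 0)
    return min(requested, remaining)


def complete(prompt: str, requested_new_tokens: int = 64) -> str:
    """Continue `prompt` with the base model (downloads weights on first call)."""
    # Imported lazily so the helper above works without transformers installed.
    # Assumption: the checkpoint works with the stock Auto classes.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(prompt, return_tensors="pt")
    prompt_tokens = inputs["input_ids"].shape[1]
    budget = max_new_tokens_for(prompt_tokens, requested_new_tokens)
    if budget == 0:
        raise ValueError("prompt already fills the context window")

    output_ids = model.generate(**inputs, max_new_tokens=budget)
    # Decode only the newly generated tokens, not the echoed prompt.
    new_ids = output_ids[0][prompt_tokens:]
    return tokenizer.decode(new_ids, skip_special_tokens=True)
```

Because this is a base model without instruction tuning, `complete("The capital of Korea is")` would continue the text rather than answer as an assistant would. Clamping `max_new_tokens` keeps the prompt plus the continuation inside the stated context window.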