model: support solar (#8189)

Praneth Paruchuri
2025-09-15 23:51:13 +05:30
committed by GitHub
parent 28c79dc84a
commit a45d9a4ee8
3 changed files with 514 additions and 0 deletions


@@ -49,6 +49,9 @@ in the GitHub search bar.
| **ERNIE-4.5** (4.5, 4.5 MoE series) | `baidu/ERNIE-4.5-21B-A3B-PT` | Baidu's ERNIE-4.5 series, which consists of MoE models with 47B and 3B active parameters (the largest model has 424B total parameters), as well as a 0.3B dense model. |
| **Arcee AFM-4.5B** | `arcee-ai/AFM-4.5B-Base` | Arcee's foundation model series built for real-world reliability and edge deployments. |
| **Persimmon** (8B) | `adept/persimmon-8b-chat` | Adept's open 8B model with a 16K context window and fast inference; trained for broad usability and licensed under Apache 2.0. |
| **Solar** (10.7B) | `upstage/SOLAR-10.7B-Instruct-v1.0` | Upstage's 10.7B-parameter model, optimized for instruction following. The architecture uses depth up-scaling (DUS), which scales up a smaller base model by duplicating layers and continuing pretraining. |
| **Ling** (16.8B–290B) | `inclusionAI/Ling-lite`, `inclusionAI/Ling-plus` | InclusionAI's open MoE models. Ling-Lite has 16.8B total / 2.75B active parameters, and Ling-Plus has 290B total / 28.8B active parameters. They are designed for high performance on NLP and complex reasoning tasks. |
| **Granite 3.0, 3.1** (IBM) | `ibm-granite/granite-3.1-8b-instruct` | IBM's open dense foundation models optimized for reasoning, code, and business AI use cases. Integrated with Red Hat and watsonx systems. |
| **Granite 3.0 MoE** (IBM) | `ibm-granite/granite-3.0-3b-a800m-instruct` | IBM's Mixture-of-Experts models offering strong performance with cost-efficiency. MoE expert routing designed for enterprise deployment at scale. |
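
Any model in the table above can also be loaded directly with Hugging Face `transformers`. The sketch below uses the newly added Solar checkpoint as an example; the `### User:` / `### Assistant:` prompt layout in `build_prompt` is an assumption based on the SOLAR-Instruct model card, so check the model's chat template before relying on it.

```python
def build_prompt(user_message: str) -> str:
    """Format one chat turn for SOLAR-Instruct.

    The "### User:" / "### Assistant:" layout is an assumption taken
    from the model card; prefer tokenizer.apply_chat_template when the
    tokenizer ships a template.
    """
    return f"### User:\n{user_message}\n\n### Assistant:\n"


def generate(user_message: str, max_new_tokens: int = 128) -> str:
    """Lazily load the model and run a single generation turn.

    Imports and weight downloads happen inside the function so that
    merely importing this module stays cheap.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "upstage/SOLAR-10.7B-Instruct-v1.0"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    inputs = tokenizer(build_prompt(user_message), return_tensors="pt")
    inputs = inputs.to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

Note that the 10.7B checkpoint needs roughly 20 GB of memory in fp16, so quantized loading may be preferable on smaller GPUs.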