[Model] Add Olmo 3 model support (#11396)

This commit is contained in:
Shane A
2025-10-19 23:59:16 -07:00
committed by GitHub
parent 984fbeb16b
commit d383e6616e
8 changed files with 169 additions and 4 deletions

View File

@@ -33,6 +33,7 @@ in the GitHub search bar.
| **Gemma** (v1, v2, v3) | `google/gemma-3-1b-it` | Googles family of efficient multilingual models (1B27B); Gemma 3 offers a 128K context window, and its larger (4B+) variants support vision input. |
| **Phi** (Phi-1.5, Phi-2, Phi-3, Phi-4, Phi-MoE series) | `microsoft/Phi-4-multimodal-instruct`, `microsoft/Phi-3.5-MoE-instruct` | Microsofts Phi family of small models (1.3B5.6B); Phi-4-multimodal (5.6B) processes text, images, and speech, Phi-4-mini is a high-accuracy text model and Phi-3.5-MoE is a mixture-of-experts model. |
| **MiniCPM** (v3, 4B) | `openbmb/MiniCPM3-4B` | OpenBMBs series of compact LLMs for edge devices; MiniCPM 3 (4B) achieves GPT-3.5-level results in text tasks. |
| **OLMo** (2, 3) | `allenai/OLMo-2-1124-7B-Instruct` | Allen AIs series of Open Language Models designed to enable the science of language models. |
| **OLMoE** (Open MoE) | `allenai/OLMoE-1B-7B-0924` | Allen AIs open Mixture-of-Experts model (7B total, 1B active parameters) delivering state-of-the-art results with sparse expert activation. |
| **StableLM** (3B, 7B) | `stabilityai/stablelm-tuned-alpha-7b` | StabilityAIs early open-source LLM (3B & 7B) for general text generation; a demonstration model with basic instruction-following ability. |
| **Command-R** (Cohere) | `CohereForAI/c4ai-command-r-v01` | Coheres open conversational LLM (Command series) optimized for long context, retrieval-augmented generation, and tool use. |