Update README.md

committed by: system
parent 868d5588ac · commit 91a5426c06

README.md · 24 additions, 24 deletions
```diff
@@ -3,7 +3,7 @@ license: apache-2.0
 language:
 - en
 base_model:
-- Menlo/Jan-v1-4B
+- Qwen/Qwen3-4B-Thinking-2507
 pipeline_tag: text-generation
 ---
 # Jan-v1: Advanced Agentic Language Model
```
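For readability, this is the model card's YAML frontmatter after the hunk above is applied (reconstructed from the hunk's context and changed lines; nothing outside the hunk is shown):

```yaml
license: apache-2.0
language:
- en
base_model:
- Qwen/Qwen3-4B-Thinking-2507
pipeline_tag: text-generation
---
```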
```diff
@@ -26,21 +26,21 @@ Jan-v1 leverages the newly released [Qwen3-4B-thinking](https://huggingface.co/Q
 ### Question Answering (SimpleQA)
 
 For question-answering, Jan-v1 shows a significant performance gain from model scaling, achieving 91.2% accuracy.
 
 
 
 *The 91.2% SimpleQA accuracy represents a significant milestone in factual question answering for models of this scale, demonstrating the effectiveness of our scaling and fine-tuning approach.*
 
-### Report Generation & Factuality
+### Chat Benchmarks
 
-Evaluated on a benchmark testing factual report generation from web sources, using an LLM-as-judge. The benchmark includes our proprietary `Jan Exam - Longform` and the `DeepResearchBench`.
+These benchmarks evaluate the model's conversational and instructional capabilities.
 
-| Model | Average Overall Score |
-| :--- | :--- |
-| o4-mini | 7.30 |
-| **Jan-v1-4B (Ours)** | **7.17** |
-| gpt-4.1 | 6.90 |
-| Qwen3-4B-Thinking-2507 | 6.84 |
-| 4o-mini | 6.60 |
-| Jan-nano-128k | 5.63 |
+| Benchmark | JanV1 (Ours) | Qwen3-4B-Thinking-2507 | GPT-OSS-20B (High) | GPT-OSS-20B (Low) |
+| :--- | :--- | :--- | :--- | :--- |
+| EQBench | **83.61** | 82.61 | 78.35 | 78.35 |
+| CreativeWriting | **72.08** | 65.74 | 30.23 | 26.38 |
+| IFBench | **Prompt:** 0.3537<br>**Instruction:** 0.3910 | Prompt: 0.4490<br>Instruction: **0.4806** | Prompt: 0.5646<br>Instruction: 0.6000 | Prompt: 0.5034<br>Instruction: 0.5403 |
+| ArenaHardv2 | **25.3** | - | - | - |
 
 ## Quick Start
 
 ### Integration with Jan App
```
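The diff's Quick Start context mentions integration with the Jan App, which can serve models through an OpenAI-compatible local chat API. A minimal sketch of building such a request is below; the endpoint URL, port, and model id are assumptions, not taken from this diff, so check your Jan App settings for the real values.

```python
# Hypothetical sketch: preparing a request for a locally running Jan App
# server with an OpenAI-compatible chat endpoint. Host/port and model id
# are assumptions; verify them against your Jan App configuration.
import json
import urllib.request

JAN_SERVER = "http://localhost:1337/v1/chat/completions"  # assumed default


def build_chat_request(prompt: str, model: str = "Menlo/Jan-v1-4B") -> dict:
    """Build an OpenAI-style chat-completion payload for the local server."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
    }


payload = build_chat_request("Summarize the SimpleQA benchmark in one sentence.")

# Sending the request requires the Jan App server to be running:
# req = urllib.request.Request(
#     JAN_SERVER,
#     data=json.dumps(payload).encode(),
#     headers={"Content-Type": "application/json"},
# )
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["choices"][0]["message"]["content"])
```

The payload follows the widely used OpenAI chat-completions shape (`model`, `messages`, sampling parameters), which is why it works against any server exposing that interface.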