Initialize project; model provided by the ModelHub XC community

Model: larryhudson/context-1-PreTrainedTokenizerFast Source: Original Platform
2026-05-06 12:37:50 +08:00
commit 9e4449ed14
12 changed files with 622 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,74 @@
+---
+license: apache-2.0
+base_model:
+- openai/gpt-oss-20b
+library_name: transformers
+---
+
+# Chroma Context-1
+
+Context-1 is a 20B-parameter agentic search model trained
+to retrieve supporting documents for complex, multi-hop
+queries. It is designed to be used as a retrieval subagent
+alongside a frontier reasoning model: given a query,
+Context-1 decomposes it into subqueries, iteratively
+searches a corpus, and selectively edits its own context
+to free capacity for further exploration.
+
+Context-1 achieves retrieval performance comparable to
+frontier LLMs at a fraction of the cost, with up to 10x
+faster inference.
+
+**Technical report:**
+[Chroma Context-1: Training a Self-Editing Search Agent](https://trychroma.com/research/context-1)
+
+## Model Details
+
+- **Base model:** gpt-oss-20b
+- **Parameters:** 20B (Mixture of Experts)
+- **Training:** SFT + RL (CISPO) with a staged curriculum
+- **Precision:** BF16 (MXFP4 quantized checkpoint coming soon)
+
+## Key Capabilities
+
+- **Query decomposition:** Breaks complex multi-constraint
+  questions into targeted subqueries.
+- **Parallel tool calling:** Averages 2.56 tool calls per
+  turn, reducing total turns and end-to-end latency.
+- **Self-editing context:** Selectively prunes irrelevant
+  documents mid-search to sustain retrieval quality over
+  long horizons within a bounded context window (0.94
+  prune accuracy).
+- **Cross-domain generalization:** Trained on web, legal,
+  and finance tasks; generalizes to held-out domains and
+  public benchmarks (BrowseComp-Plus, SealQA, FRAMES,
+  HLE).
+
+## Important: Agent Harness Required
+
+Context-1 is trained to operate within a specific agent
+harness that manages tool execution, token budgets, context
+pruning, and deduplication. **The harness is not yet
+public.** Running the model without it will not reproduce
+the results reported in the technical report.
+
+We plan to release the full agent harness and evaluation
+code soon. In the meantime, the technical report describes
+the harness design in detail.
+
+## Citation
+
+```bibtex
+@techreport{bashir2026context1,
+  title = {Chroma Context-1: Training a Self-Editing Search Agent},
+  author = {Bashir, Hammad and Hong, Kelly and Jiang, Patrick and Shi, Zhiyi},
+  year = {2026},
+  month = {March},
+  institution = {Chroma},
+  url = {https://trychroma.com/research/context-1},
+}
+```
+
+## License
+
+Apache 2.0
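
Editor's note (not part of the commit): the README says the agent harness manages token budgets, context pruning, and deduplication, and that the harness itself is not yet public. The sketch below is only a toy illustration of how those three responsibilities could fit together. Every name in it (`Doc`, `ContextWindow`, `relevance`, the whitespace token count) is hypothetical and is not taken from Chroma's harness. In particular, Context-1 decides what to prune itself, while this sketch substitutes a fixed relevance score.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    doc_id: str
    text: str
    relevance: float  # hypothetical score; stands in for the model's own pruning decision


@dataclass
class ContextWindow:
    token_budget: int
    docs: dict = field(default_factory=dict)  # doc_id -> Doc, in insertion order

    @staticmethod
    def count_tokens(text: str) -> int:
        # Whitespace split as a stand-in for a real tokenizer (assumption).
        return len(text.split())

    def used(self) -> int:
        """Total tokens currently held in context."""
        return sum(self.count_tokens(d.text) for d in self.docs.values())

    def add(self, results):
        """Deduplication: skip documents whose id is already in context."""
        added = []
        for doc in results:
            if doc.doc_id in self.docs:
                continue
            self.docs[doc.doc_id] = doc
            added.append(doc)
        return added

    def prune(self):
        """Drop the least relevant documents until usage fits the budget."""
        removed = []
        # sorted() returns a new list, so deleting from self.docs here is safe.
        for doc in sorted(self.docs.values(), key=lambda d: d.relevance):
            if self.used() <= self.token_budget:
                break
            del self.docs[doc.doc_id]
            removed.append(doc.doc_id)
        return removed
```

A harness loop built on this would call `add` after each batch of tool results and `prune` whenever `used()` exceeds the budget, so that later search turns still have room for new documents.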