Model: AlienKevin/marin-8b-instruct-sft-terminalcorpus

| license | base_model | pipeline_tag |
|---|---|---|
| apache-2.0 | marin-community/marin-8b-instruct | text-generation |
# Marin-8B-Instruct SFT on TerminalCorpus
Marin-8B Instruct fine-tuned on nvidia/Nemotron-Terminal-Corpus (366K terminal agent trajectories).
## Model Details
| Parameter | Value |
|---|---|
| Base model | marin-community/marin-8b-instruct |
| Architecture | Llama 3 8B (32 layers, 4096 hidden, 32 heads, 8 KV heads) |
| Tokenizer | marin-community/marin-tokenizer |
| Training data | nvidia/Nemotron-Terminal-Corpus (366K examples, all 4 subsets) |
| Epochs | 2 |
| Training steps | 5,721 |
| Batch size | 128 |
| Sequence length | 32,768 |
| Learning rate | 2e-5 (cosine, 10% warmup) |
| Optimizer | AdamW (β=0.9/0.95), grad_clip=1.0, wd=1e-4 |
| TPU | v5p-64 |
| Final loss | 0.442 |
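As a sanity check, the reported step count and learning-rate schedule are consistent with the table above. A minimal sketch, assuming linear warmup followed by cosine decay to zero (the usual reading of "cosine, 10% warmup"; the exact schedule implementation may differ):

```python
import math

TOTAL_STEPS = 5_721              # reported training steps
WARMUP = int(0.10 * TOTAL_STEPS)  # 10% warmup -> 572 steps
PEAK_LR = 2e-5

def lr_at(step: int) -> float:
    """LR at a given step: linear warmup to PEAK_LR, then cosine decay to 0."""
    if step < WARMUP:
        return PEAK_LR * (step + 1) / WARMUP
    progress = (step - WARMUP) / (TOTAL_STEPS - WARMUP)
    return 0.5 * PEAK_LR * (1 + math.cos(math.pi * progress))

# 366K examples / batch 128, for 2 epochs, lands within a few steps of 5,721
# (the small gap is rounding in the "366K" figure).
steps_estimate = round(366_000 / 128) * 2
```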
## Evaluation Results
### Terminal-Bench 2.0
| Model | TB2 Accuracy |
|---|---|
| Marin-8B Instruct (no SFT) | 0/89 = 0% |
| Marin-8B Instruct + TerminalCorpus SFT | 1/89 = 1.1% |
| NemotronTerminal-8B (Qwen3-8B, paper) | 13.0% ± 2.2 |
| Marin Qwen3-8B SFT reproduction (exp3490b) | 14/88 = 15.9% |
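The gap between the Llama-3-based and Qwen3-based runs is large relative to sampling noise on 88–89 tasks. An illustrative Wilson-interval check (not part of the reported evaluation):

```python
import math

def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """95% Wilson score interval for a binomial proportion."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half

marin_lo, marin_hi = wilson_interval(1, 89)   # Marin-8B + SFT: 1/89
qwen_lo, qwen_hi = wilson_interval(14, 88)    # Qwen3-8B reproduction: 14/88
```

The two intervals do not overlap, so the difference is unlikely to be evaluation noise.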
### TBLite Progression
| Checkpoint | TBLite |
|---|---|
| Step 1500 (26%) | 1/100 = 1% |
| Step 3000 (52%) | 5/100 = 5% |
## Training Details
Training followed the NemotronTerminal-8B paper's hyperparameters. The model reaches a higher final loss (0.442 vs 0.360) than the Qwen3-8B reproduction and scores markedly lower on terminal benchmarks, likely due to architecture and tokenizer differences between Llama 3 and Qwen3.
Tracked in: marin-community/marin#4420
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("AlienKevin/marin-8b-instruct-sft-terminalcorpus")
tokenizer = AutoTokenizer.from_pretrained("AlienKevin/marin-8b-instruct-sft-terminalcorpus")
```
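For generation, the model's chat template can be applied to a terminal task. A sketch under stated assumptions: the system prompt, task string, and sampling settings below are illustrative and are not taken from the training scaffold.

```python
def build_messages(task: str) -> list[dict]:
    """Wrap a terminal task in a chat message list (system prompt is illustrative)."""
    return [
        {"role": "system", "content": "You are a terminal agent operating in a Linux shell."},
        {"role": "user", "content": task},
    ]

if __name__ == "__main__":
    from transformers import AutoModelForCausalLM, AutoTokenizer

    repo = "AlienKevin/marin-8b-instruct-sft-terminalcorpus"
    tokenizer = AutoTokenizer.from_pretrained(repo)
    model = AutoModelForCausalLM.from_pretrained(repo, torch_dtype="auto", device_map="auto")

    ids = tokenizer.apply_chat_template(
        build_messages("Find the largest file under /var/log."),
        add_generation_prompt=True,
        return_tensors="pt",
    ).to(model.device)
    out = model.generate(ids, max_new_tokens=256)
    print(tokenizer.decode(out[0][ids.shape[-1]:], skip_special_tokens=True))
```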
## License
Apache 2.0