---
license: apache-2.0
base_model: marin-community/marin-8b-instruct
datasets:
- nvidia/Nemotron-Terminal-Corpus
tags:
- terminal
- agentic
- sft
- llama
- marin
language:
- en
pipeline_tag: text-generation
---

# Marin-8B-Instruct SFT on TerminalCorpus

Marin-8B Instruct fine-tuned on [nvidia/Nemotron-Terminal-Corpus](https://huggingface.co/datasets/nvidia/Nemotron-Terminal-Corpus) (366K terminal agent trajectories).

## Model Details

| Parameter | Value |
|---|---|
| Base model | [marin-community/marin-8b-instruct](https://huggingface.co/marin-community/marin-8b-instruct) |
| Architecture | Llama 3 8B (32 layers, 4096 hidden, 32 heads, 8 KV heads) |
| Tokenizer | [marin-community/marin-tokenizer](https://huggingface.co/marin-community/marin-tokenizer) |
| Training data | nvidia/Nemotron-Terminal-Corpus (366K examples, all 4 subsets) |
| Epochs | 2 |
| Training steps | 5,721 |
| Batch size | 128 |
| Sequence length | 32,768 |
| Learning rate | 2e-5 (cosine, 10% warmup) |
| Optimizer | AdamW (β=0.9/0.95), grad_clip=1.0, wd=1e-4 |
| TPU | v5p-64 |
| Final loss | 0.442 |

## Evaluation Results

### Terminal-Bench 2.0

| Model | TB2 Accuracy |
|---|---|
| Marin-8B Instruct (no SFT) | 0/89 = 0% |
| **Marin-8B Instruct + TerminalCorpus SFT** | **1/89 = 1.1%** |
| NemotronTerminal-8B (Qwen3-8B, paper) | 13.0% ± 2.2 |
| Marin Qwen3-8B SFT reproduction (exp3490b) | 14/88 = 15.9% |

### TBLite Progression

| Checkpoint | TBLite |
|---|---|
| Step 1500 (26%) | 1/100 = 1% |
| Step 3000 (52%) | 5/100 = 5% |

## Training Details

Trained following the [NemotronTerminal-8B](https://arxiv.org/abs/2602.21193) paper hyperparameters. The model reaches a higher final loss (0.442 vs. 0.360) than the Qwen3-8B reproduction and scores significantly lower on terminal benchmarks, likely due to architecture and tokenizer differences between Llama 3 and Qwen3.
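For reference, the learning-rate schedule described above (2e-5 peak, cosine decay, 10% linear warmup, 5,721 total steps) can be sketched as below. This is an illustrative reconstruction from the hyperparameter table, not the actual training config; the function name and the decay-to-zero floor are assumptions.

```python
import math

def lr_at_step(step, total_steps=5721, peak_lr=2e-5, warmup_frac=0.10):
    """Cosine schedule with linear warmup, matching the table above.

    Assumption: the cosine decays to 0; the real run may use a
    nonzero minimum learning rate.
    """
    warmup_steps = int(total_steps * warmup_frac)  # 572 steps here
    if step < warmup_steps:
        # linear warmup from 0 to peak_lr
        return peak_lr * step / warmup_steps
    # cosine decay from peak_lr toward 0 over the remaining steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```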
Tracked in: [marin-community/marin#4420](https://github.com/marin-community/marin/issues/4420)

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("AlienKevin/marin-8b-instruct-sft-terminalcorpus")
tokenizer = AutoTokenizer.from_pretrained("AlienKevin/marin-8b-instruct-sft-terminalcorpus")
```

## License

Apache 2.0