---
base_model: Qwen/Qwen3-0.6B
library_name: transformers
tags:
- generated_from_trainer
- trl
- grpo
- cricket
- merged
license: mit
pipeline_tag: text-generation
---

# cricket-captain-qwen3-06b-merged

`Qwen/Qwen3-0.6B` with the [`pratinavseth/cricket-captain-qwen3-06b-stage2`](https://huggingface.co/pratinavseth/cricket-captain-qwen3-06b-stage2) LoRA adapter (stage 2 GRPO, step 50) **merged into the base weights**. This is a single-file model that loads without PEFT and works directly with `transformers`, `vllm`, or TGI.

- **Live Space:** https://huggingface.co/spaces/pratinavseth/cricket-captain-llm
- **Source code:** https://github.com/pratinavseth/cricket-captain-llm
- **Adapter (un-merged):** [`pratinavseth/cricket-captain-qwen3-06b-stage2`](https://huggingface.co/pratinavseth/cricket-captain-qwen3-06b-stage2)
- **W&B run:** https://wandb.ai/ptnv-s-research/huggingface

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("pratinavseth/cricket-captain-qwen3-06b-merged")
model = AutoModelForCausalLM.from_pretrained(
    "pratinavseth/cricket-captain-qwen3-06b-merged",
    torch_dtype="bfloat16",
    device_map="auto",
)
```

The model expects the cricket-captain prompt schema produced by the OpenEnv environment in the source repo (see `inference.py` for prompt construction).
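The exact prompt schema is defined in `inference.py`, which this card does not reproduce. As a hedged illustration only, Qwen-family models use ChatML-style turn markers, so a single-turn prompt has the shape below; the cricket question shown is a hypothetical placeholder, not the real cricket-captain schema:

```python
def build_prompt(user_msg: str) -> str:
    """Format a single-turn prompt in the ChatML-style layout Qwen models use.

    This mirrors what tokenizer.apply_chat_template(..., add_generation_prompt=True)
    produces for a single user turn; the actual cricket-captain schema comes from
    the OpenEnv environment in the source repo (see inference.py there).
    """
    return (
        f"<|im_start|>user\n{user_msg}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )

# Hypothetical example input, for illustration only:
prompt = build_prompt("Over 14.3, need 62 off 34: who should bowl next?")
```

In practice, prefer `tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)` over hand-built strings so the template stays in sync with the tokenizer config.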
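Since the weights are fully merged, the checkpoint can also be served without PEFT support. A minimal serving sketch, assuming a CUDA-capable host; the port and image tag are illustrative, not taken from the source repo:

```shell
# Serve with vLLM's OpenAI-compatible server:
vllm serve pratinavseth/cricket-captain-qwen3-06b-merged --dtype bfloat16

# Or with Text Generation Inference via Docker (image tag illustrative):
docker run --gpus all -p 8080:80 \
  ghcr.io/huggingface/text-generation-inference:latest \
  --model-id pratinavseth/cricket-captain-qwen3-06b-merged
```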