---
license: apache-2.0
base_model: unsloth/Qwen2.5-Coder-3B-Instruct-bnb-4bit
tags: [apex, salesforce, lwc, visualforce, aura, soql, sfdx, code, fine-tuned, qlora, unsloth]
datasets: [Gianloko/apex-coder-training-data]
language: [en]
pipeline_tag: text-generation
---

# ApexCoder-1.5B · Merged 16-bit Model

*Last updated: 2026-03-20, Cycle 2*

Production-ready merged model (base + LoRA fused into 16-bit weights).
Trained on a single NVIDIA A40 (44 GB) using Unsloth QLoRA + TRL SFTTrainer.

> **Looking for a smaller download?**
> Use the [LoRA adapter](Gianloko/apex-coder-1.5b-lora) (~150 MB) or the
> [GGUF Q4_K_M](Gianloko/apex-coder-1.5b-GGUF) (~986 MB) for Ollama.

## 📊 Evaluation: Cycle 2

| Metric | Value |
|---|---|
| **LLM-as-judge (avg)** | **12.6/15** |
| **Perplexity** | **1.14** |
| **Δ vs previous cycle** | **+12.6** |
| Training loss | 0.2274 |
| Training samples | 8,990 |
| Training steps | 1,100 |

### By reasoning type

| Type | Status | Score | Progress |
|---|---|---|---|

### Cycle history

| Cycle | Date | Score | PPL | Δ | vs Published |
|---|---|---|---|---|---|
| 1 | 2026-03-20 | 12.9/15 | 1.17 | +12.9 | 12.9 |
| 2 | 2026-03-20 | 12.6/15 | 1.14 | +12.6 | 13.2 |

## 🚀 Quick start

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model = AutoModelForCausalLM.from_pretrained(
    "Gianloko/apex-coder-1.5b",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("Gianloko/apex-coder-1.5b")

messages = [
    {"role": "system", "content": "You are ApexCoder, a world-class Salesforce expert."},
    {"role": "user", "content": "Write a bulkified Apex trigger on Opportunity that prevents status changes to Closed Won if no related Products exist."},
]

# Build the prompt with the model's chat template and move it to the model's device.
inputs = tokenizer.apply_chat_template(
    messages, return_tensors="pt", add_generation_prompt=True
).to(model.device)

# Greedy decoding: temperature has no effect when do_sample=False, so it is omitted.
output = model.generate(inputs, max_new_tokens=512, do_sample=False)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs.shape[1]:], skip_special_tokens=True))
```

## 🦙 Ollama (GGUF, recommended for local use)

```bash
ollama pull hf.co/Gianloko/apex-coder-1.5b-GGUF:Q4_K_M
ollama run hf.co/Gianloko/apex-coder-1.5b-GGUF:Q4_K_M
```

## 🔧 LoRA adapter

If you already have the base model loaded, use the [LoRA adapter](Gianloko/apex-coder-1.5b-lora) (~150 MB) instead:

```python
from peft import PeftModel

# base_model is the base model you have already loaded.
model = PeftModel.from_pretrained(base_model, "Gianloko/apex-coder-1.5b-lora")
```

## ⚙️ V6 pipeline notes

- **Warm-start training**: cycle 2+ initialises from the previous LoRA adapter
- **Best-ever gate**: publishing is blocked if the new model regresses against the published model
- **Data quality**: validated with langdetect + a non-ASCII ratio filter
- **CanaryCallback**: 3 probes per epoch; training aborts if a majority fail
- **Post-merge validation**: 3 sanity + 3 hallucination probes gate every push
- **Dataset versioned**: cycle tags on HuggingFace for full rollback capability

## License

Apache 2.0
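## Appendix: non-ASCII ratio filter (illustrative sketch)

The data-quality item in the V6 pipeline notes mentions a non-ASCII ratio filter. The sketch below shows one way such a filter might work. The function names and the `0.05` threshold are assumptions made for illustration, not the pipeline's actual code or settings, and the langdetect step is left out.

```python
def non_ascii_ratio(text: str) -> float:
    """Return the fraction of characters outside the 7-bit ASCII range."""
    if not text:
        return 0.0
    non_ascii = sum(1 for ch in text if ord(ch) > 127)
    return non_ascii / len(text)


def passes_non_ascii_filter(text: str, max_ratio: float = 0.05) -> bool:
    """Keep a sample only if its non-ASCII share is at or below max_ratio.

    max_ratio=0.05 is a hypothetical threshold chosen for this example.
    """
    return non_ascii_ratio(text) <= max_ratio


if __name__ == "__main__":
    samples = [
        "public class AccountService { }",  # plain ASCII Apex
        "\u0e22\u0e17" * 10,                # text made entirely of non-ASCII characters
    ]
    for sample in samples:
        print(passes_non_ascii_filter(sample), round(non_ascii_ratio(sample), 2))
```

A ratio-based check like this tolerates the occasional accented character in a comment or string literal while rejecting samples dominated by encoding debris.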