apex-coder-1.5b/README.md

---
license: apache-2.0
base_model: unsloth/Qwen2.5-Coder-3B-Instruct-bnb-4bit
tags: [apex, salesforce, lwc, visualforce, aura, soql, sfdx, code, fine-tuned, qlora, unsloth]
datasets: [Gianloko/apex-coder-training-data]
language: [en]
pipeline_tag: text-generation
---

# ApexCoder-1.5B · Merged 16-bit Model

*Last updated: 2026-03-20 — Cycle 2*

Production-ready merged model (base + LoRA fused into 16-bit weights).
Trained on a single NVIDIA A40 (44 GB) using Unsloth QLoRA + TRL SFTTrainer.

> **Looking for a smaller download?**
> Use the [LoRA adapter](https://huggingface.co/Gianloko/apex-coder-1.5b-lora) (~150 MB) or the
> [GGUF Q4_K_M](https://huggingface.co/Gianloko/apex-coder-1.5b-GGUF) (~986 MB) for Ollama.

## 📊 Evaluation — Cycle 2

| Metric | Value |
|---|---|
| **LLM-as-judge (avg)** | **12.6/15** |
| **Perplexity** | **1.14** |
| **Δ vs previous cycle** | **+12.6** |
| Training loss | 0.2274 |
| Training samples | 8,990 |
| Training steps | 1100 |
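
To relate the perplexity and loss figures above: perplexity is conventionally the exponential of the mean per-token cross-entropy, measured in nats. The sketch below assumes that convention applies to this evaluation, which the card does not state. The helper names are illustrative and not part of the training pipeline.

```python
import math


def perplexity_from_loss(mean_nll: float) -> float:
    """Perplexity for a mean per-token negative log-likelihood (nats)."""
    return math.exp(mean_nll)


def loss_from_perplexity(ppl: float) -> float:
    """Mean per-token negative log-likelihood (nats) for a given perplexity."""
    return math.log(ppl)


# Values taken from the table above.
print(round(loss_from_perplexity(1.14), 3))    # 0.131
print(round(perplexity_from_loss(0.2274), 3))  # 1.255
```

Under that assumption, the reported perplexity of 1.14 corresponds to roughly 0.131 nats per token. The training loss of 0.2274 is averaged differently and is not expected to match.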

### By reasoning type

| Type | Status | Score | Progress |
|---|---|---|---|


### Cycle history

| Cycle | Date | Score | PPL | Δ | vs Published |
|---|---|---|---|---|---|
| 1 | 2026-03-20 | 12.9/15 | 1.17 | +12.9 | 12.9 |
| 2 | 2026-03-20 | 12.6/15 | 1.14 | +12.6 | 13.2 |


## 🚀 Quick start

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the merged 16-bit weights; device_map="auto" requires the `accelerate` package.
model = AutoModelForCausalLM.from_pretrained(
    "Gianloko/apex-coder-1.5b",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("Gianloko/apex-coder-1.5b")

messages = [
    {"role": "system", "content": "You are ApexCoder, a world-class Salesforce expert."},
    {"role": "user",   "content": "Write a bulkified Apex trigger on Opportunity that prevents status changes to Closed Won if no related Products exist."},
]
# Render the chat template to token IDs and append the assistant turn prefix.
inputs = tokenizer.apply_chat_template(
    messages, return_tensors="pt", add_generation_prompt=True
).to(model.device)

# Greedy decoding. `temperature` only takes effect with do_sample=True, so it is omitted.
output = model.generate(inputs, max_new_tokens=512, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs.shape[1]:], skip_special_tokens=True))
```

## 🦙 Ollama (GGUF — recommended for local use)

```bash
ollama pull hf.co/Gianloko/apex-coder-1.5b-GGUF:Q4_K_M
ollama run hf.co/Gianloko/apex-coder-1.5b-GGUF:Q4_K_M
```

## 🔧 LoRA adapter

If you already have the base model loaded, use the
[LoRA adapter](https://huggingface.co/Gianloko/apex-coder-1.5b-lora) (~150 MB) instead:

```python
from peft import PeftModel

# `base_model` is the base model you have already loaded (see `base_model` in the card metadata).
model = PeftModel.from_pretrained(base_model, "Gianloko/apex-coder-1.5b-lora")
```

## ⚙️ V6 pipeline notes

- **Warm-start training** — cycle 2+ initialises from previous LoRA adapter
- **Best-ever gate** — publish blocked if new model regresses vs published model
- **Data quality** — validated with langdetect + non-ASCII ratio filter
- **CanaryCallback** — 3 probes per epoch, majority-fail aborts training
- **Post-merge validation** — 3 sanity + 3 hallucination probes gate every push
- **Dataset versioned** — cycle tags on HuggingFace for full rollback capability
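
The card does not include the pipeline code, so the following is only a minimal sketch of how three of the checks above could look: the non-ASCII ratio filter, the best-ever gate, and the majority-fail canary rule. All function names, the `0.2` threshold, and the pluggable `detect_lang` callable are assumptions made for illustration. In practice `langdetect.detect` could be passed as `detect_lang`.

```python
from typing import Callable, Optional, Sequence


def non_ascii_ratio(text: str) -> float:
    """Fraction of characters outside the ASCII range (0.0 for empty text)."""
    if not text:
        return 0.0
    return sum(ord(ch) > 127 for ch in text) / len(text)


def passes_quality_filter(
    text: str,
    max_non_ascii: float = 0.2,  # hypothetical threshold, not from the card
    detect_lang: Optional[Callable[[str], str]] = None,
    expected_lang: str = "en",
) -> bool:
    """Keep a sample if it is mostly ASCII and, when a detector is given, in the expected language."""
    if not text.strip():
        return False
    if non_ascii_ratio(text) > max_non_ascii:
        return False
    if detect_lang is not None and detect_lang(text) != expected_lang:
        return False
    return True


def should_publish(new_score: float, best_published: Optional[float]) -> bool:
    """Best-ever gate: allow a push only if the new score does not regress."""
    return best_published is None or new_score >= best_published


def canary_should_abort(probe_passed: Sequence[bool]) -> bool:
    """Abort training when a strict majority of canary probes failed."""
    failures = sum(not ok for ok in probe_passed)
    return failures * 2 > len(probe_passed)


# Illustrative values only; none of these come from the actual training runs.
samples = ["public class Foo { }", "这是一个中文句子"]
print([s for s in samples if passes_quality_filter(s)])
print(should_publish(new_score=12.0, best_published=12.5))
print(canary_should_abort([False, False, True]))
```

Keeping each gate as a small pure function makes the publish decision easy to unit-test separately from training.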

## License

Apache 2.0