CounseLLM — Empathy-Aligned Conversational Support LLM
An empathy-aligned conversational support model fine-tuned from Llama 3.1 8B Instruct using a two-stage alignment pipeline: Supervised Fine-Tuning (SFT) on 36K counseling examples followed by Direct Preference Optimization (DPO) on ~2K preference-filtered pairs.
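To make the second stage concrete, here is a minimal sketch of the standard per-pair DPO objective, written in plain Python. It is not the training code for this model, which the card does not include. The function name `dpo_loss`, its argument names, and the default `beta=0.1` are illustrative assumptions. The inputs are summed sequence log-probabilities of the chosen and rejected responses under the policy being trained and under the frozen SFT reference model.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one preference pair.

    beta=0.1 is an assumed value for illustration; the card does not
    state the beta used for this model.
    """
    # How much more (in log space) the policy likes each response
    # than the reference model does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(logits)) == log(1 + exp(-logits)), computed stably.
    if logits >= 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))


# Policy identical to the reference: loss is log(2).
baseline = dpo_loss(-50.0, -60.0, -50.0, -60.0)
# Policy has shifted probability toward the chosen response: loss drops.
improved = dpo_loss(-45.0, -65.0, -50.0, -60.0)
```

The loss falls below log(2) only when the policy raises the chosen response relative to the rejected one by more than the reference model does, which is what ties the DPO stage to the SFT checkpoint.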
Disclaimer: This is an AI research project and is not a substitute for professional mental health care. If you are in crisis, please contact the 988 Suicide & Crisis Lifeline (call or text 988) or your local emergency services.
The model was evaluated on 25 curated prompts spanning 18 mental health categories (including anxiety, depression, grief, crisis, relationships, and trauma).
How to Use
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Wothmag07/counseLLM"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {
        "role": "system",
        "content": (
            "You are a mental health counselor providing supportive, empathetic guidance. "
            "Respond by first acknowledging the person's feelings, then explore their "
            "situation with open-ended questions. Use techniques like reflective listening, "
            "validation, and gentle reframing. Keep responses warm, conversational, and "
            "non-judgmental."
        ),
    },
    {
        "role": "user",
        "content": "I've been feeling really anxious about work lately and I can't sleep.",
    },
]

# add_generation_prompt=True appends the assistant header so the model
# writes a reply instead of continuing the user's turn.
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
    return_dict=True,
).to(model.device)

# do_sample=True is required for temperature and top_p to take effect.
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    repetition_penalty=1.1,
)

# Decode only the newly generated tokens, skipping the prompt.
prompt_length = inputs["input_ids"].shape[1]
response = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True)
print(response)
Uses
Intended Use
Research and educational purposes in AI-assisted mental health support
Studying alignment techniques (SFT + DPO) applied to sensitive domains
Demonstrating empathy-aligned language model fine-tuning
Out-of-Scope Use
Clinical deployment — this model is not validated for clinical use
Crisis intervention — should not be relied upon for suicide prevention or emergency situations
Replacement for therapy — not a substitute for licensed mental health professionals
Bias, Risks, and Limitations
The model may reflect biases present in training data (both real and synthetic sources)
Responses may sometimes be generic or miss nuances of specific cultural contexts
The model may generate plausible-sounding but clinically inaccurate advice
Training data is predominantly English, so the model may not generalize to other languages
The model should not be deployed in production clinical settings without extensive safety review
Environmental Impact
Hardware: NVIDIA H100 80GB
Training Time: ~3.5 hours total (SFT: 3h, DPO: 30min)
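The figures above can be turned into a rough energy estimate. The sketch below is back-of-the-envelope arithmetic, not a measurement: the card does not state the number of GPUs, the H100 variant, or the measured power draw. It assumes a single GPU running at the 700 W board power limit of the H100 SXM (the PCIe variant is rated at 350 W), so the result is an upper bound on GPU energy under those assumptions. It excludes CPU, cooling, and other datacenter overhead. The function name `training_energy_kwh` and the constants are illustrative.

```python
def training_energy_kwh(avg_power_watts, hours, num_gpus=1):
    """Energy in kWh for num_gpus devices drawing avg_power_watts for hours."""
    return avg_power_watts * hours * num_gpus / 1000.0


# Training times reported in this card.
SFT_HOURS = 3.0
DPO_HOURS = 0.5

# Assumptions, not stated in the card:
ASSUMED_POWER_W = 700.0   # H100 SXM board power limit; PCIe variant is 350 W
ASSUMED_NUM_GPUS = 1      # single-GPU run

total_hours = SFT_HOURS + DPO_HOURS
energy = training_energy_kwh(ASSUMED_POWER_W, total_hours, ASSUMED_NUM_GPUS)
print(f"Upper-bound GPU energy: {energy:.2f} kWh")  # prints 2.45 kWh
```

To convert this to an emissions estimate, multiply the energy by the carbon intensity of the grid where training ran, a value the card does not report.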