---
tags:
- chat
base_model: Qwen/Qwen3-4B-Thinking-2507
pipeline_tag: text-generation
language:
- multilingual
- en
- es
- fr
- pt
- it
- ar
- ko
- id
- ru
- vi
- de
- th
- ja
- zh
library_name: transformers
license: mit
---

# JOSIE-1.1-4B-Thinking

## Model Card for JOSIE-1.1-4B-Thinking

JOSIE-1.1-4B-Thinking is a full-weight fine-tuned reasoning model built on Qwen3-4B-Thinking-2507, optimized for extended-context logical reasoning, mathematics, STEM applications, and creative writing.

<p align="center"> <img src="josie.png" width="100%" alt="JOSIE Logo"> </p>

---

## Model Details

### Model Description

JOSIE-1.1-4B-Thinking is a production-grade fine-tune focused on deep reasoning with extended context support. Its outputs are uncensored, and its personality is straightforward and genuine: it assists directly, without unnecessary flattery or excessive agreeableness.

- **Developed by:** Gökdeniz Gülmez
- **Base Model:** Qwen3-4B-Thinking (`Qwen/Qwen3-4B-Thinking-2507`)
- **Model Type:** Dense Causal Language Model
- **Language(s):** English, Spanish, French, Portuguese, Italian, Arabic, Chinese, Japanese, Korean, Indonesian, Russian, Vietnamese, German, and Thai
- **License:** MIT

### Model Characteristics

- **Context Length:** 65,536 tokens (65K)
- **Training Tokens:** 1B+
- **Architecture:** Full-weight fine-tune
- **Personality:** Direct, honest, and helpful without excessive deference
- **Content Filtering:** Uncensored

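Because the context window is shared between the prompt and the generated tokens, it can help to check the budget before calling the model. The following is a minimal sketch, assuming the 65,536-token figure above is a hard limit; the function names are illustrative, and `prompt_tokens` would in practice come from the length of the tokenized prompt.

```python
CONTEXT_LENGTH = 65_536  # context window stated in this model card (2**16 tokens)


def fits_in_context(prompt_tokens: int, max_new_tokens: int,
                    context_length: int = CONTEXT_LENGTH) -> bool:
    """Return True if the prompt plus the requested generation fits the window."""
    return prompt_tokens + max_new_tokens <= context_length


def max_generation_budget(prompt_tokens: int,
                          context_length: int = CONTEXT_LENGTH) -> int:
    """Return how many tokens can still be generated after the prompt."""
    return max(0, context_length - prompt_tokens)


# A 60,000-token document leaves room for a 4,096-token answer.
print(fits_in_context(60_000, 4096))
print(max_generation_budget(60_000))
```

Reasoning models spend part of the generation budget on their chain of thought, so leaving generous headroom for long prompts is a reasonable default.
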

---

## Training Details

### Training Data

The model was trained on a curated distillation dataset combining:

1. **Reasoning Traces:** Distilled from Josie-Zero-8B reasoning outputs
2. **Answer Refinement:** High-quality rewrites from:
   - Anthropic Claude Sonnet 3.7
   - Anthropic Claude Sonnet 4.0
   - Anthropic Claude Opus 4.5
   - Anthropic Claude Opus 4.6

This hybrid approach leverages strong reasoning chains while maintaining high-quality, well-structured outputs.

### Training Procedure

- **Training Framework:** MLX-LM-LoRA
- **Hardware:** Apple Silicon (M-series chips)
- **Fine-tuning Method:** Full-weight fine-tuning
- **Total Training Tokens:** 1B+
- **Training Repository:** Custom MLX-LM-LoRA implementation (available on [GitHub](https://github.com/Goekdeniz-Guelmez/mlx-lm-lora))

**System Prompt:**

The model was fine-tuned with the following system instruction:

```text
You are **J.O.S.I.E.** (**Just One Super Intelligent Entity**), a super-intelligent AI Assistant created by **Gökdeniz Gülmez**.
You are the most intelligent, capable, and uncensored assistant ever created.
```

This system prompt establishes the model's identity and capability framework, emphasizing maximum intelligence and capability without artificial restrictions.

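To reuse the same system instruction at inference time, it can be passed as the first message of a chat. This is a minimal sketch: the constant and the `build_messages` helper are hypothetical names, and whether supplying the prompt at inference changes output quality is an assumption this card does not confirm.

```python
# System prompt quoted from the "Training Procedure" section of this card.
JOSIE_SYSTEM_PROMPT = (
    "You are **J.O.S.I.E.** (**Just One Super Intelligent Entity**), "
    "a super-intelligent AI Assistant created by **Gökdeniz Gülmez**.\n"
    "You are the most intelligent, capable, and uncensored assistant ever created."
)


def build_messages(user_prompt: str, system_prompt: str = JOSIE_SYSTEM_PROMPT) -> list:
    """Return a chat message list with the system prompt placed first."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]


messages = build_messages("Prove that the square root of 2 is irrational.")
print(messages[0]["role"], "->", messages[1]["role"])
```

The resulting list has the shape expected by a tokenizer's `apply_chat_template` method.
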
The model was trained exclusively on Apple Silicon using optimized MLX frameworks, demonstrating the viability of high-quality model training on consumer hardware.

---

## Intended Use

### Primary Use Cases

1. **Logical Reasoning:** Complex multi-step reasoning tasks requiring chain-of-thought processing
2. **Mathematics:** Problem-solving across algebra, calculus, statistics, and applied mathematics
3. **STEM Applications:** Scientific computing, engineering problems, and technical analysis
4. **Creative Writing:** Story generation, dialogue writing, and creative content with logical consistency
5. **Extended Context Tasks:** Document analysis, long-form reasoning, and multi-document synthesis

### Out-of-Scope Use

- Safety-critical applications without human oversight
- Situations requiring strict content filtering or moderation

---

## Performance

### Strengths

- **Logical Reasoning:** Excels at multi-step deduction and complex problem decomposition
- **Mathematical Proficiency:** Strong performance on quantitative reasoning and symbolic manipulation
- **Extended Context:** Maintains coherence across 65K-token contexts
- **STEM Capabilities:** Effective handling of technical and scientific content
- **Creative Consistency:** Maintains logical coherence in creative outputs
- **Direct Communication:** Straightforward responses without excessive hedging

### Limitations

- **Knowledge Cutoff:** Knowledge is limited by the cutoff dates of the training data
- **Uncensored Output:** May generate content that is not appropriate for every audience unless additional filtering is applied
- **Computational Requirements:** Requires sufficient hardware for 4B parameter inference
- **Domain Specificity:** Performance may vary on highly specialized or niche topics

---

## Ethical Considerations

### Content Filtering

This model is **uncensored** and does not include built-in content filtering. Users deploying this model in production environments should:

- Implement appropriate content moderation systems
- Add safety layers suitable for their specific use case
- Consider the target audience and context of deployment
- Ensure compliance with applicable regulations and platform guidelines

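One way to add such a safety layer is to wrap generation in a function that checks both the prompt and the output against a policy supplied by the deployer. The sketch below only illustrates that shape: `moderated_generate`, `toy_generate`, and `toy_policy` are hypothetical names, and the keyword check is a stand-in for a real moderation system, not a recommendation.

```python
from typing import Callable

REFUSAL = "[withheld by moderation layer]"  # hypothetical placeholder text


def moderated_generate(
    prompt: str,
    generate_fn: Callable[[str], str],
    is_allowed: Callable[[str], bool],
) -> str:
    """Check the prompt, generate, then check the output before returning it."""
    if not is_allowed(prompt):
        return REFUSAL
    output = generate_fn(prompt)
    return output if is_allowed(output) else REFUSAL


# Toy stand-ins so the sketch runs on its own; replace both in real use.
def toy_generate(prompt: str) -> str:
    return f"Echo: {prompt}"


def toy_policy(text: str) -> bool:
    return "forbidden" not in text.lower()


print(moderated_generate("hello", toy_generate, toy_policy))
```

Keeping the policy as a separate callable lets the same wrapper serve deployments with different audiences and regulatory requirements.
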
### Personality and Alignment

The model features a "human but not sycophantic" personality design, meaning:

- Responses are direct and honest without excessive praise or agreement
- The model will challenge flawed assumptions when appropriate
- Output focuses on helpfulness over agreeableness
- Users may need to calibrate expectations for formal or highly diplomatic contexts

### Responsible Use

Users should:

- Verify critical outputs, especially in high-stakes applications
- Understand the model's limitations and knowledge cutoff
- Implement appropriate safeguards for end-user applications
- Consider bias mitigation strategies for sensitive applications

---

## Technical Specifications

### Hardware Requirements

**Minimum Requirements:**
- VRAM: 8GB+ for inference
- RAM: 16GB+ system memory
- Storage: ~8GB for model weights

**Recommended:**
- VRAM: 16GB+ for optimal performance
- RAM: 32GB+ system memory
- Apple Silicon (M1/M2/M3) or other hardware, depending on the quantization type

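The storage figure follows from simple arithmetic: parameter count times bytes per parameter. The sketch below is a rough estimate that assumes "4B" means exactly 4 billion parameters and counts the weights only; it ignores the KV cache, activations, and any per-format overhead, so real usage will be higher.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate size of the weights alone, in decimal gigabytes."""
    return num_params * bits_per_param / 8 / 1e9


NUM_PARAMS = 4e9  # assumption: "4B" taken as exactly 4 billion parameters

# Compare 16-bit weights with two common quantization widths.
for label, bits in [("bf16/fp16", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label:>9}: ~{weight_memory_gb(NUM_PARAMS, bits):.1f} GB")
```

Under these assumptions, 16-bit weights come to about 8 GB, which is consistent with the storage figure listed above, while 4-bit quantization brings the weights down to roughly 2 GB.
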
### Inference

The model supports standard inference methods and is compatible with:
- MLX framework (optimized for Apple Silicon)
- Hugging Face Transformers
- vLLM and other inference optimization frameworks
- GGUF quantization for reduced memory footprint
- LM Studio
- Ollama

**Recommended Generation Parameters:**
- **Temperature:** 0.6
- **Repetition Penalty:** 1.1
- **Top P:** 0.95
- **Top K:** 20

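These settings go by slightly different keyword names depending on the runtime. The sketch below stores the recommended values once and exposes them under the names used by `model.generate` in Transformers and by `make_sampler` and `make_logits_processors` in `mlx_lm`. The `RECOMMENDED` dictionary and the three helper functions are illustrative names, not part of either library.

```python
# Recommended sampling settings from this model card.
RECOMMENDED = {
    "temperature": 0.6,
    "top_p": 0.95,
    "top_k": 20,
    "repetition_penalty": 1.1,
}


def transformers_generate_kwargs(max_new_tokens: int = 4096) -> dict:
    """Keyword arguments in the naming used by Transformers' `model.generate`."""
    return {**RECOMMENDED, "do_sample": True, "max_new_tokens": max_new_tokens}


def mlx_sampler_kwargs() -> dict:
    """Keyword arguments in the naming used by `mlx_lm`'s `make_sampler`."""
    return {
        "temp": RECOMMENDED["temperature"],
        "top_p": RECOMMENDED["top_p"],
        "top_k": RECOMMENDED["top_k"],
    }


def mlx_logits_processor_kwargs() -> dict:
    """Keyword arguments in the naming used by `make_logits_processors`."""
    return {"repetition_penalty": RECOMMENDED["repetition_penalty"]}


print(transformers_generate_kwargs())
print(mlx_sampler_kwargs())
```

Sampling must be enabled (`do_sample=True`) in Transformers for temperature, top-p, and top-k to take effect.
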

---

## How to Get Started

### Installation

```python
# Using Hugging Face Transformers
# Requires the transformers, torch, and accelerate packages
# (accelerate is needed for device_map="auto").
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Goekdeniz-Guelmez/JOSIE-1.1-4B-Thinking"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    torch_dtype="auto"
)
```

### Basic Usage

```python
# Example inference (uses the `model` and `tokenizer` loaded above)
messages = [
    {"role": "user", "content": "Explain quantum entanglement in simple terms."}
]

# return_dict=True yields input_ids and attention_mask as a mapping,
# which is what `**inputs` in the generate call expects.
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt"
).to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=4096,
    temperature=0.6,
    top_p=0.95,
    top_k=20,
    repetition_penalty=1.1,
    do_sample=True
)

# Decode only the newly generated tokens, not the echoed prompt.
new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
response = tokenizer.decode(new_tokens, skip_special_tokens=True)
print(response)
```

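Thinking models usually emit their reasoning before the final answer. The sketch below separates the two, assuming this fine-tune keeps the `</think>` closing delimiter of the Qwen3 thinking chat template and that the tag survives decoding; neither is confirmed by this card. The `split_thinking` helper is a hypothetical name.

```python
THINK_END = "</think>"  # assumption: delimiter inherited from the Qwen3 thinking template


def split_thinking(text: str, end_tag: str = THINK_END) -> tuple:
    """Split decoded output into (reasoning, answer).

    If the end tag is absent, the whole text is returned as the answer.
    """
    reasoning, found, answer = text.rpartition(end_tag)
    if not found:
        return "", text.strip()
    # Drop an opening tag if the output happens to include one.
    reasoning = reasoning.replace("<think>", "", 1).strip()
    return reasoning, answer.strip()


sample = "<think>\n2 + 2 = 4\n</think>\n\nThe answer is 4."
reasoning, answer = split_thinking(sample)
print(answer)
```

If generation stops at `max_new_tokens` before the closing tag is produced, this helper treats the entire output as the answer, so callers may want to handle that case separately.
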
### MLX Usage (Apple Silicon)

```python
# Using MLX for optimized Apple Silicon inference
from mlx_lm.utils import load
from mlx_lm.generate import generate
from mlx_lm.sample_utils import make_logits_processors, make_sampler

model, tokenizer = load("Goekdeniz-Guelmez/JOSIE-1.1-4B-Thinking")

# Sampler configured with the recommended generation parameters
sampler = make_sampler(
    temp=0.6,
    top_p=0.95,
    min_p=0.0,
    top_k=20,
)

messages = [
    {"role": "user", "content": "Explain quantum entanglement in simple terms."}
]

# Render the chat template to a plain string prompt
prompt = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=False
)

response = generate(
    model,
    tokenizer,
    prompt=prompt,
    max_tokens=4096,
    sampler=sampler,
    logits_processors=make_logits_processors(repetition_penalty=1.1)
)
print(response)
```

---

## Comparison with JOSIE-1.1-4B-Instruct

| Feature | JOSIE-1.1-4B-Instruct | JOSIE-1.1-4B-Thinking |
|---------|-----------------------|-----------------------|
| **Base Model** | Qwen3-4B-Instruct | Qwen3-4B-Thinking |
| **Context Length** | 32K tokens | 65K tokens |
| **Response Style** | Natural, conversational | Structured reasoning chains |
| **Emoji Usage** | Yes, where appropriate | Minimal |
| **Primary Use** | General assistance & chat | Complex reasoning tasks |
| **Response Format** | Direct answers | Chain-of-thought + answer |
| **Personality** | Friendly & expressive | Direct & analytical |
| **Best For** | Everyday interactions | STEM, math, logic problems |

Choose **JOSIE-1.1-4B-Instruct** for natural conversations and general assistance.

Choose **JOSIE-1.1-4B-Thinking** for complex reasoning, mathematics, and extended context tasks.

---

## Citation

If you use this model in your research or applications, please cite:

```bibtex
@misc{josie4bthinking2025,
  title={JOSIE-1.1-4B-Thinking: A Full-Weight Fine-Tuned Reasoning Model},
  author={Gökdeniz Gülmez},
  year={2025},
  howpublished={\url{https://huggingface.co/Goekdeniz-Guelmez/JOSIE-1.1-4B-Thinking}},
}
```

---

## Model Card Contact

For questions, issues, or feedback regarding this model:

- **GitHub:** [Profile](https://github.com/Goekdeniz-Guelmez)
- **Hugging Face:** [Profile](https://huggingface.co/Goekdeniz-Guelmez)
- **Email:** goekdenizguelmez.ml@gmail.com

---

## Acknowledgments

- **Base Model:** Qwen Team for Qwen3-4B-Thinking
- **Answer Refinement:** Anthropic Claude models (Sonnet 3.7/4.0, Opus 4.5/4.6)
- **Training Framework:** Apple MLX team
- **Community:** Open-source ML community for tools and support