---
language: en
license: apache-2.0
library_name: mlx
pipeline_tag: text-generation
tags:
- text-generation
- mlx
- apple-silicon
base_model: YoAbriel/KodaLite-1.3B
---

# KodaLite-1.3B - MLX (fp16)

MLX conversion of [YoAbriel/KodaLite-1.3B](https://huggingface.co/YoAbriel/KodaLite-1.3B), for running on Apple Silicon (M1/M2/M3/M4).

**Size**: ~2.5 GB | **Precision**: bfloat16

## Usage

```bash
pip install mlx-lm
```

```python
from mlx_lm import load, generate

# Download (on first use) and load the model weights and tokenizer.
model, tok = load("YoAbriel/KodaLite-1.3B-mlx")

# Wrap the user message in the model's chat template.
prompt = tok.apply_chat_template(
    [{"role": "user", "content": "What is the capital of France?"}],
    tokenize=False,
    add_generation_prompt=True,
)
print(generate(model, tok, prompt=prompt, max_tokens=80))
```

Or from the command line:

```bash
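# The prompt already contains the chat markers, so skip the tokenizer's chat template.
mlx_lm.generate --model YoAbriel/KodaLite-1.3B-mlx \
  --ignore-chat-template \
  --prompt "<|user|>\nHello\n<|assistant|>\n" --max-tokens 80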
```
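The raw prompt in the command above suggests a simple turn format: a `<|user|>` marker, the message, then an `<|assistant|>` marker, each followed by a newline. The following is a minimal sketch of building such a prompt by hand, assuming that format carries over to longer conversations. `build_raw_prompt` is a hypothetical helper, not part of `mlx-lm`, and the tokenizer's own chat template remains the authoritative definition.

```python
def build_raw_prompt(messages):
    # Assumption: the turn format is inferred from the CLI example's prompt
    # string; the tokenizer's chat template is the authoritative source.
    parts = []
    for message in messages:
        parts.append(f"<|{message['role']}|>\n{message['content']}\n")
    # Leave the assistant turn open so the model writes the reply.
    parts.append("<|assistant|>\n")
    return "".join(parts)


prompt = build_raw_prompt([{"role": "user", "content": "Hello"}])
print(repr(prompt))  # '<|user|>\nHello\n<|assistant|>\n'
```

In Python code, `tok.apply_chat_template` is usually the safer choice, since it follows the template shipped with the tokenizer rather than a hand-written copy.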

## Other quantizations

- [YoAbriel/KodaLite-1.3B-mlx-8bit](https://huggingface.co/YoAbriel/KodaLite-1.3B-mlx-8bit): 1.4 GB, 8-bit
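The listed sizes are roughly what the parameter count predicts. Here is a back-of-the-envelope sketch, assuming 1.27B parameters (the figure quoted under Limitations), 16 bits per weight for this repository, and about 8.5 effective bits per weight for the 8-bit variant. The 8.5-bit figure assumes 8-bit values plus a 16-bit scale and a 16-bit bias per group of 64 weights, which is a guess at how that repository was quantized. `estimate_size_gb` is a hypothetical helper.

```python
def estimate_size_gb(num_params, bits_per_weight):
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


NUM_PARAMS = 1.27e9  # parameter count quoted in this model card

# 16-bit weights (bfloat16 or float16): 2 bytes each.
size_16bit = estimate_size_gb(NUM_PARAMS, 16)

# Assumed 8-bit layout: 8 bits per weight plus a 16-bit scale and a 16-bit
# bias shared by each group of 64 weights, i.e. 8 + 32 / 64 = 8.5 bits.
size_8bit = estimate_size_gb(NUM_PARAMS, 8 + 32 / 64)

print(f"16-bit: ~{size_16bit:.2f} GB")
print(f"8-bit:  ~{size_8bit:.2f} GB")
```

Both estimates land close to the listed ~2.5 GB and 1.4 GB. The small remaining gap for the 8-bit figure may come from tensors kept at higher precision and from file metadata.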

## Limitations

This is a small model (1.27B parameters) and it is undertrained (1.64B training tokens). See the [base model card](https://huggingface.co/YoAbriel/KodaLite-1.3B) for full details.
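To put "undertrained" in numbers, the sketch below compares training tokens per parameter against the commonly cited rule of thumb of about 20 tokens per parameter for compute-optimal training. That rule of thumb is an outside reference point and an assumption here, not a figure from this model card.

```python
NUM_PARAMS = 1.27e9    # parameters, from this model card
TRAIN_TOKENS = 1.64e9  # training tokens, from this model card
RULE_OF_THUMB = 20     # assumed tokens per parameter (Chinchilla-style heuristic)

tokens_per_param = TRAIN_TOKENS / NUM_PARAMS
suggested_tokens = RULE_OF_THUMB * NUM_PARAMS

print(f"tokens per parameter: {tokens_per_param:.2f}")
print(f"heuristic token budget: {suggested_tokens / 1e9:.1f}B")
print(f"fraction of that budget seen: {TRAIN_TOKENS / suggested_tokens:.1%}")
```

Under these assumptions the model saw a little over one token per parameter, a small fraction of what the heuristic would suggest.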

## License

Apache 2.0