Model: Mathieu-Thomas-JOSSET/joke-finetome-model-gguf-phi4-20260112-081758 (source: original platform)

| pipeline_tag | tags | base_model | datasets |
|---|---|---|---|
| text-generation | | | |
joke-finetome-model-gguf-phi4-20260112-081758: GGUF
This model was finetuned and converted to GGUF format using Unsloth.
Example usage:
- For text-only LLMs:
  ./llama.cpp/llama-cli -hf Mathieu-Thomas-JOSSET/joke-finetome-model-gguf-phi4-20260112-081758 --jinja
- For multimodal models:
  ./llama.cpp/llama-mtmd-cli -hf Mathieu-Thomas-JOSSET/joke-finetome-model-gguf-phi4-20260112-081758 --jinja
Available Model files:
phi-4.Q8_0.gguf
Ollama
An Ollama Modelfile is included for easy deployment.
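The Modelfile itself ships with the repository; the commands below are a minimal sketch of registering and running it, assuming Ollama is installed, the repo files (the Modelfile and phi-4.Q8_0.gguf) have been downloaded locally, and using `joke-phi4` as an arbitrary local model name.

```shell
# Register the model with Ollama from the repo's Modelfile
# (run from the directory containing the Modelfile and the .gguf file).
ollama create joke-phi4 -f ./Modelfile

# Chat with the registered model.
ollama run joke-phi4 "Tell me a joke about compilers."
```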
This was trained 2x faster with Unsloth

Training artifacts
- Plot (interactive): reports/training_loss_step.html
- Run manifest: reports/run_manifest.json
- Inference sample: reports/inference_sample.json
- Config snapshot: reports/config_snapshot.json
Inference
This repository contains a GGUF model intended to be used with llama.cpp and/or deployed on Hugging Face Inference Endpoints (llama.cpp container).
Recommended Inference Endpoints knobs:
- Max tokens / request: 1024
- Max concurrent requests: 2
Local llama.cpp (Phi-4 template)
llama-cli -hf Mathieu-Thomas-JOSSET/joke-finetome-model-gguf-phi4-20260112-081758:q8_0 -cnv --chat-template phi4
Hugging Face Inference Endpoint (llama.cpp)
When creating an endpoint, select this repo and the GGUF file phi-4.Q8_0.gguf (quant: q8_0).
Recommended settings are stored in: inference/endpoint_recipe.json.
Python client example: inference/hf_endpoint_client.py
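Since inference/hf_endpoint_client.py is not reproduced here, the following is a minimal sketch of such a client, assuming the endpoint exposes the OpenAI-compatible /v1/chat/completions route that llama.cpp's server provides, and that the `HF_ENDPOINT_URL` and `HF_TOKEN` environment variables (names chosen here for illustration) are set by the user.

```python
import json
import os
import urllib.request

# Placeholder endpoint URL and token, read from the environment.
ENDPOINT_URL = os.environ.get("HF_ENDPOINT_URL", "https://<your-endpoint>.endpoints.huggingface.cloud")
HF_TOKEN = os.environ.get("HF_TOKEN", "")


def build_chat_request(prompt: str, max_tokens: int = 1024) -> dict:
    """Build an OpenAI-style chat payload; max_tokens matches the 1024/request cap recommended above."""
    return {
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": 0.7,
    }


def query(prompt: str) -> str:
    """POST the payload to the endpoint and return the assistant's reply text."""
    req = urllib.request.Request(
        ENDPOINT_URL.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(build_chat_request(prompt)).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {HF_TOKEN}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Keeping requests sequential on the client side (or capping a pool at 2 workers) lines up with the recommended concurrency limit above.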
Description
Model synced from source: Mathieu-Thomas-JOSSET/joke-finetome-model-gguf-phi4-20260112-081758