joke-finetome-model-gguf-ph…/inference/llama_cli_examples.md


### Local inference (llama.cpp)
```bash
# -hf pulls the GGUF directly from Hugging Face; -cnv starts an interactive chat session
llama-cli -hf {REPO_ID}:q8_0 -cnv --chat-template phi4
```

### Server (OpenAI-compatible)
```bash
llama-server -hf {REPO_ID}:q8_0
# exposes an OpenAI-compatible API at /v1/chat/completions (default port 8080)
```
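Once the server is up, any OpenAI-compatible client can talk to it. A minimal sketch using only the Python standard library, assuming the default llama-server port (8080); the `build_request` and `ask` helpers are illustrative, not part of llama.cpp:

```python
import json
import urllib.request

# Assumed default llama-server address; adjust host/port if you changed them.
URL = "http://localhost:8080/v1/chat/completions"

def build_request(prompt: str) -> bytes:
    """Build an OpenAI-style chat-completion payload as a JSON body."""
    payload = {
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
    }
    return json.dumps(payload).encode("utf-8")

def ask(prompt: str) -> str:
    """POST the prompt to the local server and return the assistant's reply."""
    req = urllib.request.Request(
        URL,
        data=build_request(prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # Response follows the OpenAI chat-completions schema.
    return body["choices"][0]["message"]["content"]
```

With the server running, `ask("Tell me a joke.")` returns the model's reply as a plain string.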