### Local inference (llama.cpp)

```bash
llama-cli -hf {REPO_ID}:q8_0 -cnv --chat-template phi4
```

### Server (OpenAI-compatible)

```bash
llama-server -hf {REPO_ID}:q8_0
# /v1/chat/completions will be available (OpenAI-compatible)
```
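Once `llama-server` is up, any OpenAI-compatible client can talk to the endpoint. A minimal sketch with `curl`, assuming the server's default address (`http://localhost:8080`); the message content is just an illustrative example:

```shell
# JSON body for a chat completion request (role/content are example values).
PAYLOAD='{"messages":[{"role":"user","content":"Say hello."}]}'

# With the server running on its default port, send the request like so:
#   curl -s http://localhost:8080/v1/chat/completions \
#        -H "Content-Type: application/json" \
#        -d "$PAYLOAD"
echo "$PAYLOAD"
```

The response follows the OpenAI chat completions schema, so existing OpenAI client libraries can be pointed at this base URL instead of a custom integration.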