Support stream=True in v1/completions (#49)

Cody Yu
2024-01-18 17:00:56 -08:00
committed by GitHub
parent 98a3e8ef78
commit 61d4c93962
7 changed files with 233 additions and 39 deletions

@@ -238,9 +238,25 @@ curl http://localhost:30000/generate \
}
}'
```
Learn more about the argument format [here](docs/sampling_params.md).
### OpenAI Compatible API
In addition, the server supports an experimental OpenAI-compatible API.
```python
import openai
client = openai.Client(
    base_url="http://127.0.0.1:30000/v1", api_key="EMPTY")
response = client.completions.create(
    model="default",
    prompt="The capital of France is",
    temperature=0,
    max_tokens=32,
)
print(response)
```
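Since this change adds `stream=True` support to `v1/completions`, the same client should be able to consume a completion incrementally. The following is a minimal sketch, not the project's reference usage. It assumes the server is running at `127.0.0.1:30000` and that streamed chunks follow the OpenAI completions chunk shape (`chunk.choices[0].text`). The helper names `stream_completion` and `main` are hypothetical and used only for illustration.

```python
def stream_completion(client, prompt, max_tokens=32):
    """Print completion text as it arrives and return the full string.

    `client` is assumed to be an OpenAI-style client (see `main` below).
    """
    # stream=True asks the server to send partial results as they are generated.
    stream = client.completions.create(
        model="default",
        prompt=prompt,
        temperature=0,
        max_tokens=max_tokens,
        stream=True,
    )
    pieces = []
    for chunk in stream:
        # Skip chunks that carry no choices.
        if not chunk.choices:
            continue
        text = chunk.choices[0].text or ""
        print(text, end="", flush=True)
        pieces.append(text)
    print()
    return "".join(pieces)


def main():
    # Imported here so the helper above can be exercised without the package.
    import openai

    client = openai.Client(
        base_url="http://127.0.0.1:30000/v1", api_key="EMPTY")
    stream_completion(client, "The capital of France is")
```

With the server launched, calling `main()` should print the completion piece by piece instead of waiting for the full response.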
### Additional Arguments
- Add `--tp 2` to enable tensor parallelism.
```