Support stream=True in v1/completions (#49)
18
README.md
@@ -238,9 +238,25 @@ curl http://localhost:30000/generate \
}
}'
```

Learn more about the argument format [here](docs/sampling_params.md).

### OpenAI Compatible API

In addition, the server supports an experimental OpenAI-compatible API.

```python
import openai
client = openai.Client(
    base_url="http://127.0.0.1:30000/v1", api_key="EMPTY")
response = client.completions.create(
    model="default",
    prompt="The capital of France is",
    temperature=0,
    max_tokens=32,
)
print(response)
```
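
The commit title mentions `stream=True` for `v1/completions`. With the `openai` client, streaming is normally requested by passing `stream=True` to `client.completions.create(...)` and iterating over the returned chunks. The sketch below shows what a client might do with the raw response instead, assuming the server follows the OpenAI server-sent-events convention (`data: {json}` lines terminated by `data: [DONE]`), which this diff does not confirm. The helper name `iter_completion_text` and the sample lines are hypothetical and written only for illustration.

```python
import json
from typing import Iterable, Iterator


def iter_completion_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style server-sent event lines.

    Assumption: each event is a line of the form `data: {json}` and the
    stream ends with `data: [DONE]`.
    """
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel in the OpenAI convention
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            yield choice.get("text", "")


# Hypothetical sample stream, hand-written for illustration only;
# it is not captured from a real server.
sample = [
    'data: {"choices": [{"index": 0, "text": " Paris"}]}',
    "",
    'data: {"choices": [{"index": 0, "text": "."}]}',
    "",
    "data: [DONE]",
]

streamed = "".join(iter_completion_text(sample))
print(streamed)
```

Yielding fragments one at a time lets the caller print tokens as they arrive rather than waiting for the full completion.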

### Additional Arguments
- Add `--tp 2` to enable tensor parallelism.
```