Support stream=True in v1/completions (#49)
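
A minimal sketch of how a client might consume the streaming mode named in the title. It reuses the base URL, API key, and model name from the README example in this diff. It assumes that each streamed chunk follows the OpenAI completions shape (`chunk.choices[0].text`); this diff excerpt does not confirm that. `collect_stream` and `stream_from_server` are hypothetical helper names, not part of this change.

```python
from types import SimpleNamespace


def collect_stream(chunks):
    """Join the text pieces from an iterable of streamed completion chunks."""
    parts = []
    for chunk in chunks:
        # Assumption: chunks mirror the OpenAI completions shape (choices[0].text).
        if chunk.choices:
            parts.append(chunk.choices[0].text or "")
    return "".join(parts)


def stream_from_server():
    """Request a streamed completion; needs a server running on port 30000."""
    # Imported lazily so collect_stream stays usable without the openai package.
    import openai

    client = openai.Client(
        base_url="http://127.0.0.1:30000/v1", api_key="EMPTY")
    stream = client.completions.create(
        model="default",
        prompt="The capital of France is",
        temperature=0,
        max_tokens=32,
        stream=True,  # yields chunks incrementally instead of one response
    )
    return collect_stream(stream)


# Offline check with hand-built chunks (hypothetical data, not server output).
fake_chunks = [
    SimpleNamespace(choices=[SimpleNamespace(text=" Paris")]),
    SimpleNamespace(choices=[SimpleNamespace(text=".")]),
]
demo_text = collect_stream(fake_chunks)
```

Keeping the chunk-joining logic separate from the network call lets it be exercised without a live server; call `stream_from_server()` only once the server is up.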

2024-01-18 17:00:56 -08:00
parent 98a3e8ef78
commit 61d4c93962
7 changed files with 233 additions and 39 deletions
--- a/README.md
+++ b/README.md
@@ -238,9 +238,25 @@ curl http://localhost:30000/generate \
    }
  }'
 ```
-
 Learn more about the argument format [here](docs/sampling_params.md).

+### OpenAI Compatible API
+
+In addition, the server supports an experimental OpenAI-compatible API.
+
+```python
+import openai
+client = openai.Client(
+    base_url="http://127.0.0.1:30000/v1", api_key="EMPTY")
+response = client.completions.create(
+    model="default",
+    prompt="The capital of France is",
+    temperature=0,
+    max_tokens=32,
+)
+print(response)
+```
+
 ### Additional Arguments
 - Add `--tp 2` to enable tensor parallelism.
 ```