Support Thinking Budget (via custom_logit_processor for OpenAI API) [Fix #6572] (#11416)

Signed-off-by: ybyang <ybyang7@iflytek.com>
Co-authored-by: YorkSu <york_su@qq.com>
ybyang
2025-10-21 16:27:56 +08:00
committed by GitHub
parent c1e1600373
commit dbb16bedd5
7 changed files with 239 additions and 1 deletions

@@ -319,3 +319,27 @@ response = requests.post(
)
print(response.json())
```
Send an OpenAI chat completion request:
```python
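import openai
from sglang.utils import print_highlight

# Point the OpenAI client at the local SGLang server's OpenAI-compatible endpoint.
client = openai.Client(base_url="http://127.0.0.1:30000/v1", api_key="None")

# DeterministicLogitProcessor is not defined in this snippet; it is assumed to be
# the class defined earlier in this document.
response = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3-8B-Instruct",
    messages=[
        {"role": "user", "content": "List 3 countries and their capitals."},
    ],
    temperature=0.0,
    max_tokens=32,
    # The OpenAI client has no native fields for these options, so they are
    # passed through extra_body.
    extra_body={
        "custom_logit_processor": DeterministicLogitProcessor().to_str(),
        "custom_params": {"token_id": 5},
    },
)
print_highlight(f"Response: {response}")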
```
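
To illustrate how a logit processor can enforce a thinking budget, here is a minimal, framework-free sketch. It is not the implementation added by this commit: the function names (`force_token`, `apply_thinking_budget`), the parameters (`think_start_id`, `think_end_id`, `budget`), and the use of plain Python lists instead of tensors are all assumptions made for illustration.

```python
import math
from typing import List

NEG_INF = -math.inf


def force_token(logits: List[float], token_id: int) -> List[float]:
    """Return logits in which only `token_id` can be sampled (hypothetical helper)."""
    out = [NEG_INF] * len(logits)
    out[token_id] = 0.0
    return out


def apply_thinking_budget(
    logits: List[float],
    generated_ids: List[int],
    think_start_id: int,
    think_end_id: int,
    budget: int,
) -> List[float]:
    """Force `think_end_id` once `budget` tokens have followed `think_start_id`.

    Hypothetical sketch: the logits are returned unchanged when the model is
    not currently inside a thinking span.
    """
    # Thinking has not started, or it has already ended: leave logits untouched.
    if think_start_id not in generated_ids or think_end_id in generated_ids:
        return logits
    # Count the tokens generated after the thinking-start token.
    start = generated_ids.index(think_start_id)
    used = len(generated_ids) - start - 1
    if used >= budget:
        return force_token(logits, think_end_id)
    return logits


# Example with a 4-token vocabulary: token 1 opens thinking, token 2 closes it.
logits = [0.1, 0.2, 0.3, 0.4]
forced = apply_thinking_budget(
    logits, generated_ids=[1, 0, 0], think_start_id=1, think_end_id=2, budget=2
)
print(forced)
```

In this sketch, once two tokens have been generated after the thinking-start token, every logit except the one for the thinking-end token is set to negative infinity, so the next sampled token must close the thinking span. A real processor would operate on batched tensors and read the budget from `custom_params`.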