Classification API
This document describes the /v1/classify API endpoint implementation in SGLang, which is compatible with vLLM's classification API format.
Overview
The classification API classifies text inputs using classification models. This implementation follows the same request and response format as the classification API in vLLM 0.7.0.
API Endpoint
POST /v1/classify
Request Format
{
  "model": "model_name",
  "input": "text to classify"
}
Parameters
- model (string, required): The name of the classification model to use
- input (string, required): The text to classify
- user (string, optional): User identifier for tracking
- rid (string, optional): Request ID for tracking
- priority (integer, optional): Request priority
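The parameters above can be assembled into a request body as in the following minimal sketch. The helper name build_classify_payload is hypothetical, and the sketch assumes that leaving an optional field out of the body is equivalent to not setting it, which this document does not state explicitly.

```python
from typing import Any, Dict, Optional


def build_classify_payload(
    model: str,
    text: str,
    user: Optional[str] = None,
    rid: Optional[str] = None,
    priority: Optional[int] = None,
) -> Dict[str, Any]:
    """Build a /v1/classify request body (hypothetical helper).

    Required fields are always present; optional fields are included
    only when the caller sets them.
    """
    payload: Dict[str, Any] = {"model": model, "input": text}
    optional = {"user": user, "rid": rid, "priority": priority}
    # Assumption: omitting an optional field is the same as not setting it.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


payload = build_classify_payload(
    "jason9693/Qwen2.5-1.5B-apeach",
    "text to classify",
    priority=1,
)
print(payload)
```

The resulting dictionary can be passed as the json argument of requests.post.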
Response Format
{
  "id": "classify-9bf17f2847b046c7b2d5495f4b4f9682",
  "object": "list",
  "created": 1745383213,
  "model": "jason9693/Qwen2.5-1.5B-apeach",
  "data": [
    {
      "index": 0,
      "label": "Default",
      "probs": [0.565970778465271, 0.4340292513370514],
      "num_classes": 2
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "total_tokens": 10,
    "completion_tokens": 0,
    "prompt_tokens_details": null
  }
}
Response Fields
- id: Unique identifier for the classification request
- object: Always "list"
- created: Unix timestamp when the request was created
- model: The model used for classification
- data: Array of classification results
  - index: Index of the result
  - label: Predicted class label
  - probs: Array of probabilities for each class
  - num_classes: Total number of classes
- usage: Token usage information
  - prompt_tokens: Number of input tokens
  - total_tokens: Total number of tokens
  - completion_tokens: Number of completion tokens (always 0 for classification)
  - prompt_tokens_details: Additional token details (optional)
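To illustrate how these fields fit together, here is a minimal sketch that walks the data array of a parsed response. The function name summarize_results is hypothetical. Reporting max(probs) as the confidence of the predicted label assumes that the label corresponds to the largest probability, which this document does not state explicitly.

```python
from typing import Any, Dict, List, Tuple


def summarize_results(response: Dict[str, Any]) -> List[Tuple[int, str, float]]:
    """Return (index, label, highest probability) for each result.

    Hypothetical helper; assumes the response follows the format above.
    """
    summary = []
    for item in response["data"]:
        probs = item["probs"]
        # Sanity check: one probability per class.
        if len(probs) != item["num_classes"]:
            raise ValueError("probs length does not match num_classes")
        # Assumption: the predicted label has the largest probability.
        summary.append((item["index"], item["label"], max(probs)))
    return summary


# Parsed form of the sample response shown above (usage omitted for brevity).
sample_response = {
    "id": "classify-9bf17f2847b046c7b2d5495f4b4f9682",
    "object": "list",
    "created": 1745383213,
    "model": "jason9693/Qwen2.5-1.5B-apeach",
    "data": [
        {
            "index": 0,
            "label": "Default",
            "probs": [0.565970778465271, 0.4340292513370514],
            "num_classes": 2,
        }
    ],
}

for index, label, confidence in summarize_results(sample_response):
    print(f"result {index}: {label} ({confidence:.3f})")
```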
Example Usage
Using curl
curl -v "http://127.0.0.1:8000/v1/classify" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "jason9693/Qwen2.5-1.5B-apeach",
    "input": "Loved the new café—coffee was great."
  }'
Using Python
import requests
import json

# Make classification request
response = requests.post(
    "http://127.0.0.1:8000/v1/classify",
    headers={"Content-Type": "application/json"},
    json={
        "model": "jason9693/Qwen2.5-1.5B-apeach",
        "input": "Loved the new café—coffee was great."
    }
)

# Parse response
result = response.json()
print(json.dumps(result, indent=2))
Supported Models
The classification API works with any classification model supported by SGLang, including:
Classification Models (Multi-class)
- LlamaForSequenceClassification - Multi-class classification
- Qwen2ForSequenceClassification - Multi-class classification
- Qwen3ForSequenceClassification - Multi-class classification
- BertForSequenceClassification - Multi-class classification
- Gemma2ForSequenceClassification - Multi-class classification
Label Mapping: The API automatically uses the id2label mapping from the model's config.json file to provide meaningful label names instead of generic class names. If id2label is not available, it falls back to LABEL_0, LABEL_1, etc., or Class_0, Class_1 as a last resort.
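The lookup order described above can be sketched as follows. This is an illustration only, not the code in serving_classify.py: the function name resolve_label and the example label names are hypothetical, and the sketch covers only the id2label lookup and the LABEL_N fallback, because the condition under which Class_N is used is not specified here.

```python
import json
from typing import Any, Dict


def resolve_label(config: Dict[str, Any], class_index: int) -> str:
    """Map a class index to a label name (hypothetical helper)."""
    id2label = config.get("id2label") or {}
    # Keys in config.json are strings ("0", "1", ...), while a config
    # built in memory may use integers, so try both forms.
    for key in (class_index, str(class_index)):
        if key in id2label:
            return id2label[key]
    # Fallback when id2label is missing or has no entry for this index.
    return f"LABEL_{class_index}"


# Hypothetical config.json excerpt; the label names are made up.
config = json.loads('{"id2label": {"0": "negative", "1": "positive"}}')

print(resolve_label(config, 1))
print(resolve_label({}, 0))
```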
Reward Models (Single score)
- InternLM2ForRewardModel - Single reward score
- Qwen2ForRewardModel - Single reward score
- LlamaForSequenceClassificationWithNormal_Weights - Special reward model
Note: The /classify endpoint in SGLang was originally designed for reward models but now supports all non-generative models. Our /v1/classify endpoint provides a standardized vLLM-compatible interface for classification tasks.
Error Handling
The API returns appropriate HTTP status codes and error messages:
- 400 Bad Request: Invalid request format or missing required fields
- 500 Internal Server Error: Server-side processing error
Error response format:
{
  "error": "Error message",
  "type": "error_type",
  "code": 400
}
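A client can turn this error body into an exception, as in the minimal sketch below. ClassifyError and raise_for_error are hypothetical names, and the defaults used when a field is missing are assumptions rather than documented behavior.

```python
from typing import Any, Dict


class ClassifyError(Exception):
    """Raised when /v1/classify returns an error response (hypothetical)."""

    def __init__(self, message: str, error_type: str, code: int) -> None:
        super().__init__(f"[{code}] {error_type}: {message}")
        self.message = message
        self.error_type = error_type
        self.code = code


def raise_for_error(status_code: int, body: Dict[str, Any]) -> None:
    """Raise ClassifyError if the HTTP status indicates a failure."""
    if status_code < 400:
        return
    raise ClassifyError(
        # Assumption: fall back to generic values if a field is absent.
        message=body.get("error", "unknown error"),
        error_type=body.get("type", "unknown"),
        code=body.get("code", status_code),
    )


try:
    raise_for_error(400, {"error": "Error message", "type": "error_type", "code": 400})
except ClassifyError as exc:
    print(exc.code, exc.error_type, exc.message)
```

With the requests library, this would be called as raise_for_error(response.status_code, response.json()) before reading the classification results.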
Implementation Details
The classification API is implemented using:
- Rust Router: Handles routing and request/response models in sgl-router/src/protocols/spec.rs
- Python HTTP Server: Implements the actual endpoint in python/sglang/srt/entrypoints/http_server.py
- Classification Service: Handles the classification logic in python/sglang/srt/entrypoints/openai/serving_classify.py
Testing
Use the provided test script to verify the implementation:
python test_classify_api.py
Compatibility
This implementation is compatible with vLLM's classification API format, allowing seamless migration from vLLM to SGLang for classification tasks.