# Using vLLM

First, vLLM must be [installed](../getting_started/installation/README.md) for your chosen device in either a Python or Docker environment.

Once installed, vLLM supports the following usage patterns:

- [Inference and Serving](../serving/offline_inference.md): Run a single instance of a model.
- [Deployment](../deployment/docker.md): Scale up model instances for production.
- [Training](../training/rlhf.md): Train or fine-tune a model.

Sync from v0.13 2026-01-19 10:38:50 +08:00
