xc-llm-ascend/docs/source/community/user_stories/llamafactory.md
Cao Yi 6de207de88 [main][Docs] Fix typos across documentation (#6728)
## Summary

Fix typos and improve grammar consistency across 50 documentation files.
 
### Changes include:
- Spelling corrections (e.g., "Facotory" → "Factory", "certainty" →
"determinism")
- Grammar improvements (e.g., "multi-thread" → "multi-threaded",
"re-routed" → "re-run")
- Punctuation fixes (semicolon consistency in filter parameters)
- Code style fixes (correct flag name `--num-prompts` instead of
`--num-prompt`)
- Capitalization consistency (e.g., "python" → "Python", "ascend" →
"Ascend")
- vLLM version: v0.15.0
- vLLM main:
9562912cea

---------

Signed-off-by: SlightwindSec <slightwindsec@gmail.com>
2026-02-13 15:50:05 +08:00


# LLaMA-Factory

## Introduction

LLaMA-Factory is an easy-to-use and efficient platform for training and fine-tuning large language models. With LLaMA-Factory, you can fine-tune hundreds of pre-trained models locally without writing any code.

LLaMA-Factory users need to evaluate the model and run inference with it after fine-tuning.

## Business challenge

LLaMA-Factory uses Transformers to perform inference on Ascend NPUs, but inference speed is slow.

## Benefits with vLLM Ascend

Through the joint efforts of LLaMA-Factory and vLLM Ascend (LLaMA-Factory#7739), LLaMA-Factory has achieved significant performance gains during model inference. Benchmark results show that its inference speed is now up to 2× faster than the Transformers implementation.
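As an illustration of how the integration is used, LLaMA-Factory selects its inference backend through a YAML config file. A minimal sketch, assuming LLaMA-Factory's documented config keys (`model_name_or_path`, `template`, `infer_backend`); the model path here is a hypothetical example:

```yaml
# Hypothetical inference config (sketch, not from this doc).
# Setting infer_backend to vllm routes generation through vLLM
# (vLLM Ascend on NPU) instead of the default Transformers backend.
model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct  # assumed example model
template: llama3
infer_backend: vllm  # switch from the Transformers backend to vLLM
```

Assuming this file were saved as `infer.yaml`, it would be passed to a LLaMA-Factory CLI command such as `llamafactory-cli chat infer.yaml`.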

## Learn more

See more details about LLaMA-Factory and how it uses vLLM Ascend for inference on Ascend NPUs in LLaMA-Factory Ascend NPU Inference.