[Doc] Refactor and init user story page (#1224)

### What this PR does / why we need it?
This PR refactors the user stories page:
- Move it to community
- Add initial info of LLaMA-Factory, Huggingface/trl, MindIE Turbo,
GPUStack, verl
- Add a new page for LLaMA-Factory

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Preview locally

Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
Yikun Jiang
2025-06-17 09:36:35 +08:00
committed by GitHub
parent 9d3cbc0953
commit 05dec7eda9
5 changed files with 39 additions and 44 deletions

@@ -0,0 +1,19 @@
# LLaMA-Factory
**About / Introduction**
[LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory) is an easy-to-use and efficient platform for training and fine-tuning large language models. With LLaMA-Factory, you can fine-tune hundreds of pre-trained models locally without writing any code.
LLaMA-Factory users need to evaluate the model and run inference on it after fine-tuning.
**The Business Challenge**
LLaMA-Factory used transformers to perform inference on Ascend NPU, but the speed was slow.
**Solving Challenges and Benefits with vLLM Ascend**
With the joint efforts of LLaMA-Factory and vLLM Ascend ([LLaMA-Factory#7739](https://github.com/hiyouga/LLaMA-Factory/pull/7739)), the performance of LLaMA-Factory in the model inference stage has been significantly improved. According to the test results, the inference speed of LLaMA-Factory is now 2x that of the transformers version.
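A speedup figure like the one quoted above is usually obtained by running the same prompts through both backends and comparing throughput. The sketch below is a minimal, hypothetical harness, not the benchmark behind this user story: `measure_throughput`, `speedup`, and `fake_backend` are illustrative names, and a real comparison would pass callables that wrap the transformers and vLLM Ascend backends.

```python
import time
from typing import Callable, Sequence


def measure_throughput(generate: Callable[[str], Sequence[str]],
                       prompts: Sequence[str]) -> float:
    """Return generated tokens per second across all prompts.

    `generate` is a hypothetical callable that takes a prompt and
    returns the generated tokens for it.
    """
    start = time.perf_counter()
    total_tokens = sum(len(generate(prompt)) for prompt in prompts)
    elapsed = time.perf_counter() - start
    # Guard against a zero reading on very coarse clocks.
    return total_tokens / elapsed if elapsed > 0 else float("inf")


def speedup(baseline_tps: float, candidate_tps: float) -> float:
    """Ratio of candidate throughput to baseline throughput."""
    if baseline_tps <= 0:
        raise ValueError("baseline throughput must be positive")
    return candidate_tps / baseline_tps


def fake_backend(prompt: str) -> Sequence[str]:
    """Stand-in backend (assumption): echoes whitespace-split tokens."""
    return prompt.split()


if __name__ == "__main__":
    tps = measure_throughput(fake_backend, ["hello world", "a b c"])
    print(f"stand-in throughput: {tps:.1f} tokens/s")
```

For a fair comparison, both backends should see identical prompts, generation settings, and hardware, and a warm-up pass should be excluded from the timing.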
**Learn more**
See more about LLaMA-Factory and how it uses vLLM Ascend for inference on the Ascend NPU in the following documentation: [LLaMA-Factory Ascend NPU Inference](https://llamafactory.readthedocs.io/en/latest/advanced/npu_inference.html).