[v0.11.0][Doc] Update doc (#3852)
### What this PR does / why we need it?
Update doc

Signed-off-by: hfadzxy <starmoon_zhang@163.com>
# LLaMA-Factory
**Introduction**
[LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory) is an easy-to-use and efficient platform for training and fine-tuning large language models. With LLaMA-Factory, you can fine-tune hundreds of pre-trained models locally without writing any code.
LLaMA-Factory users need to evaluate and run inference with the model after fine-tuning.
**Business challenge**
Previously, LLaMA-Factory used Hugging Face Transformers to perform inference on Ascend NPUs, and inference speed was slow.
**Benefits with vLLM Ascend**
With the joint efforts of LLaMA-Factory and vLLM Ascend ([LLaMA-Factory#7739](https://github.com/hiyouga/LLaMA-Factory/pull/7739)), LLaMA-Factory has achieved significant performance gains during model inference. Benchmark results show that its inference speed is now up to 2× faster than the Transformers implementation.
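For readers who want to try this, switching LLaMA-Factory's inference backend to vLLM is a small configuration change. A minimal sketch of an inference config is shown below; the model and template are placeholders, and the exact key names should be verified against the LLaMA-Factory documentation:

```yaml
# Sketch of a LLaMA-Factory inference config (values are placeholders).
model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct
template: llama3
infer_backend: vllm   # use vLLM (vLLM Ascend on NPU) instead of Transformers
```

Such a config would then be launched with the usual CLI, e.g. `llamafactory-cli chat <config>.yaml`.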
**Learn more**
See more details about LLaMA-Factory and how it uses vLLM Ascend for inference on Ascend NPUs in [LLaMA-Factory Ascend NPU Inference](https://llamafactory.readthedocs.io/en/latest/advanced/npu_inference.html).