Commit Graph

  • affca6f348 [Test] Add accuracy test report workflow (#542) hfadzxy 2025-04-30 14:53:58 +08:00
  • ba9714ccee Optimize qwen2_vl and qwen2_5_vl (#701) zouyida2052 2025-04-30 14:22:38 +08:00
  • 90aabaeb2e [Doc] Add benchmark guide (#635) Li Wang 2025-04-30 09:17:59 +08:00
  • f8350569e6 [CI] upgrade vllm to 0.8.5 (#715) wangxiyuan 2025-04-30 09:15:50 +08:00
  • 95e7aa4736 [Platform] format platform to make it more clear (#610) wangxiyuan 2025-04-30 09:03:10 +08:00
  • b917361ca5 [MISC] Clean up torch_npu (#688) wangxiyuan 2025-04-29 18:03:38 +08:00
  • 0329fad927 [Perf] Deepseekv3 performance optimization for eager mode (#598) Pleaplusone 2025-04-29 17:12:03 +08:00
  • 87975fa058 [Bugfix] Fix early return in CustomDeepseekV2MoE.forward during profile_run (#682) ApsarasX 2025-04-29 17:06:19 +08:00
  • 7aee9228f0 [CI] Add nightly CI (#668) Li Wang 2025-04-29 16:35:52 +08:00
  • d6be63e11d [CI] Add Qwen3-0.6B-Base test (#717) Li Wang 2025-04-29 14:35:19 +08:00
  • 0dae55a9a3 [MISC] fix format check error (#654) wangxiyuan 2025-04-29 11:14:19 +08:00
  • 1fce70a2fb [Model] Support common fused moe ops for moe model, such as Qwen3Moe (#709) wangxiyuan 2025-04-28 21:57:01 +08:00
  • 40bd602485 [Feature] Use reshape_and_cache fused op (#706) Jade Zheng 2025-04-28 21:54:42 +08:00
  • d39855b075 Update installation and tutorial doc (#711) Yikun Jiang 2025-04-28 21:52:17 +08:00
  • 5995d23532 [Doc] Add 0.8.4rc2 release note (#705) wangxiyuan 2025-04-28 21:51:35 +08:00
  • 54c0e63df7 [MTP] follow custom deepseek modeling changes to support graph mode (#636) wemaster 2025-04-28 21:18:53 +08:00
  • be9e3e8545 [Bugfix] Fix triton placeholder patch period (#704) Mengqing Cao 2025-04-28 18:52:03 +08:00
  • 58f9d932d3 [Doc] Update faqs (#699) Li Wang 2025-04-28 18:48:23 +08:00
  • d0a0c81ced [Doc] Add deepseek-v2-lite w8a8 quantization tutorial (#630) Li Wang 2025-04-28 17:14:26 +08:00
  • 5de3646522 [MISC] Make vllm version configurable (#651) wangxiyuan 2025-04-28 14:19:06 +08:00
  • 8849cf1eda Bump actions/setup-python from 5.5.0 to 5.6.0 (#697) dependabot[bot] 2025-04-28 14:06:38 +08:00
  • ee7a0e2cd4 Update openEuler dockerfile for COMPILE_CUSTOM_KERNELS=1 (#689) Icey 2025-04-28 11:45:46 +08:00
  • 38f34e359f [Fix] fix deepseek v0 attention eager mode (#671) Pleaplusone 2025-04-28 08:53:06 +08:00
  • 413657ae43 [FOLLOWUP][DOC] Fix pip install cmd in installation.md (#680) Yikun Jiang 2025-04-27 18:37:25 +08:00
  • 2e20797934 [BUILD] Upgrade torch-npu to 2.5.1 (#661) Yikun Jiang 2025-04-27 17:28:29 +08:00
  • fa4a5d980e [Bugfix] Remove redundant tensor creation and unused code (#656) Jade Zheng 2025-04-27 14:09:16 +08:00
  • ba3d8aae94 [Model][MiniCPM] support MiniCPM (#645) Mengqing Cao 2025-04-27 11:27:24 +08:00
  • 742f679c7d Remove prompt string from engine core data structures (#663) Yikun Jiang 2025-04-26 23:15:58 +08:00
  • c99c4c8c70 [Doc] Update feature support list (#650) wangxiyuan 2025-04-26 10:27:29 +08:00
  • 3879d9cad9 [CI] Fix sample backward compatibility problem (#648) wangxiyuan 2025-04-25 11:53:26 +08:00
  • d785e78563 [V1] Make V1 engine backward compatible (#637) yiz-liu 2025-04-24 17:20:11 +08:00
  • bd70ce828c [CI] Add qwen2.5-vl test (#643) Li Wang 2025-04-24 17:12:12 +08:00
  • a9c6b52205 [Bugfix] Fix qwen2.5-vl position input bug (#639) Li Wang 2025-04-24 15:21:57 +08:00
  • 866ce7168c [Benchmark] Download model from modelscope (#634) Li Wang 2025-04-24 14:48:24 +08:00
  • 05bdcbeae4 support aclgraph (#426) Bug Hunter Yan 2025-04-23 20:56:24 +08:00
  • 5c6d05a59e support deepseek quant & mix-parallel with graphmode (#585) zzzzwwjj 2025-04-23 16:23:25 +08:00
  • e74331a1ed Add dp initialize patch with hccl backend (#626) Pleaplusone 2025-04-23 15:47:51 +08:00
  • 848e041a54 Using EvalScope evaluation (#611) RongRongStudio 2025-04-23 00:50:09 +08:00
  • 4a0ce3660e [Misc] Remove some parts of metrics patch (#603) Shanshan Shen 2025-04-22 18:45:21 +08:00
  • cf6ab42ee2 [CI]Add guided decoding test (#422) Li Wang 2025-04-22 17:50:06 +08:00
  • 538a69c145 [Patch] format patch module to make it more clear (#601) wangxiyuan 2025-04-22 14:13:00 +08:00
  • ad845bfe82 fix doc to mention env setting for v0.7.3-dev (#602) Shuqiao Li 2025-04-22 14:11:41 +08:00
  • d12a057df8 Add note for deepseek related docs and remove unnecessary comments (#590) Pleaplusone 2025-04-22 09:59:09 +08:00
  • c5850d302d [Doc] Update installation (#596) Mengqing Cao 2025-04-22 09:04:20 +08:00
  • a8d633f629 [Bugfix] fix import error (#600) paulyu12 2025-04-22 08:57:25 +08:00
  • 0ae9ee0f8a [BUGFIX] main-sd-bugfix && [UT] add mtp UT (#593) wemaster 2025-04-21 19:25:51 +08:00
  • 5442b463fd add doc for patch_config (#574) Shuqiao Li 2025-04-21 10:33:38 +08:00
  • 96d6fa7c90 [Docker] Fix openEuler image suffix (#586) Yikun Jiang 2025-04-21 08:55:26 +08:00
  • 12cae04db9 [quantization] Support w8a8 quantization (#580) Yikun Jiang 2025-04-20 18:14:05 +08:00
  • 1a1f9a6d89 port deepseekv2 and mtp to main branch (#429) Pleaplusone 2025-04-19 17:38:18 +08:00
  • 086423dc35 [Docker] Bump Dockerfile version to v0.8.4 (#577) Yikun Jiang 2025-04-18 19:15:17 +08:00
  • a127cc83f8 catch ImportError when C code not compiled (#575) Shuqiao Li 2025-04-18 18:11:49 +08:00
  • 985b0548b0 [Doc] Update v0.8.4 release note, add contents for structured output feature (#576) Shanshan Shen 2025-04-18 17:44:16 +08:00
  • 65c1f4579f [V1][Structured Output] Add apply_grammar_bitmask() method to model runner (#555) Shanshan Shen 2025-04-18 16:47:55 +08:00
  • 2c903bc7ac [Doc] Update doc for custom ops build (#570) Mengqing Cao 2025-04-18 15:35:10 +08:00
  • b91f9a5afd [Doc][Build] Update build doc and faq (#568) Mengqing Cao 2025-04-18 14:16:41 +08:00
  • e66ded5679 [Doc] Add release note for 0.8.4rc1 (#557) wangxiyuan 2025-04-18 13:24:36 +08:00
  • 7eeff60715 [Doc] Update FAQ doc (#561) Shanshan Shen 2025-04-18 13:13:13 +08:00
  • 84563fc65d Add sleep mode feature for Ascend NPU (#513) Shuqiao Li 2025-04-18 13:11:39 +08:00
  • 42c7fbb10e [Misc] Fix import error and address nits to make CI happy (#563) wangxiyuan 2025-04-18 12:23:32 +08:00
  • 66a0837963 adopt rope in vllm-ascend (#530) Pleaplusone 2025-04-18 08:56:05 +08:00
  • 23f85e3f74 [BugFix] Fix scheduler problems in last PR. (#558) whx 2025-04-18 08:49:48 +08:00
  • 6ee7f5cf71 [SpecDecode] Add spec decode support (#500) Mengqing Cao 2025-04-17 20:16:32 +08:00
  • b71f193cb0 [Model][Doc] Update model support list (#552) Mengqing Cao 2025-04-17 19:32:20 +08:00
  • 20dff4deff [Scheduler] Add AscendScheduler. (#543) whx 2025-04-17 19:31:50 +08:00
  • 697908f5cd [Platform][Worker][ModelRunner] Add LoRA & Multi-LoRA support (#521) paulyu12 2025-04-17 16:48:46 +08:00
  • 9935d45728 [CI]Add model basic accuracy test(Qwen2.5-0.5B-Instruct) (#460) hfadzxy 2025-04-17 14:59:56 +08:00
  • c3d1a3782a Add pyhccl (#503) Huazhong Ji 2025-04-17 14:57:52 +08:00
  • 64fdf4cbef [Doc]Update faq (#536) Li Wang 2025-04-17 14:56:51 +08:00
  • 6061f33670 [Bugfix][Model] Fix api in DeepSeek model (#545) Mengqing Cao 2025-04-17 11:56:05 +08:00
  • 9859e7313f [CI]Add global env to runner (#537) Li Wang 2025-04-17 10:08:00 +08:00
  • 00de2ee6ad [Doc] update faq about progress bar display issue (#538) hfadzxy 2025-04-16 16:07:08 +08:00
  • fe13cd9ea5 [Doc] update faq about w8a8 (#534) Mengqing Cao 2025-04-16 09:37:21 +08:00
  • 415ed027fa [V1][Platform] Remove supports_structured_output() in platform (#531) Shanshan Shen 2025-04-16 09:30:33 +08:00
  • bbe7ccd366 [MISC] Add patch module (#526) wangxiyuan 2025-04-16 09:28:58 +08:00
  • 434749d299 [CI] update 0.8.3 to 0.8.4 (#528) wangxiyuan 2025-04-16 09:26:30 +08:00
  • 13480d1238 [CI]Fix workflow (#532) Li Wang 2025-04-15 19:55:41 +08:00
  • bcbc04f92b [Doc] Add environment variables doc (#519) Shanshan Shen 2025-04-15 16:09:36 +08:00
  • 44a8301424 [Feature] Add PD separation feature (#432) eeethenQ 2025-04-15 15:11:35 +08:00
  • c7f6584d75 [V1] clean up V1 code (#505) wangxiyuan 2025-04-15 10:24:02 +08:00
  • f6af1d2471 [MISC] fix logger (#515) wangxiyuan 2025-04-15 10:18:05 +08:00
  • 5c6d79687c [Doc] Update FAQ (#518) wangxiyuan 2025-04-15 10:17:56 +08:00
  • 5fa70b6393 [Build] Update doc (#509) wangxiyuan 2025-04-14 14:38:50 +08:00
  • 11ecbfdb31 [Doc] Update FAQ doc (#504) Shanshan Shen 2025-04-14 11:11:40 +08:00
  • 9c7428b3d5 [CI] enable custom ops build (#466) wangxiyuan 2025-04-12 10:24:53 +08:00
  • d05ea17427 Add openEuler based container image for vLLM Ascend (#489) Icey 2025-04-10 14:30:49 +08:00
  • afdbf77483 [CI] Add new runner and enable QwQ multinpu test (#417) Li Wang 2025-04-08 16:52:45 +08:00
  • 5d6239306b [DOC] Update multi_node.md (#468) jinyuxin 2025-04-08 14:19:57 +08:00
  • f6cf92e7d5 [quant][bugfix] fix deepseek quant bug (#478) Mengqing Cao 2025-04-08 09:15:56 +08:00
  • 579d858a20 Set torchvision<0.21.0 to match torch/torch_npu version (#479) Yikun Jiang 2025-04-08 09:15:42 +08:00
  • 1d88dacf9f [V1][Platform] Add supports_structured_output() method to Platform (#475) Shanshan Shen 2025-04-07 19:11:51 +08:00
  • adabdeea7f Set numpy < 2.0.0 to resolve numpy VersionConflict (#476) Yikun Jiang 2025-04-07 16:07:21 +08:00
  • 344228a5da [deepseek][bugfix] support deepseek quant (#469) Mengqing Cao 2025-04-07 10:56:12 +08:00
  • 3f9752f8ee [Bugfix]Lazy import vllm config (#462) Li Wang 2025-04-03 16:03:08 +08:00
  • ce8259975e [core] Support custom ascendc kernels in vllm-ascend (#233) Pleaplusone 2025-04-03 14:52:34 +08:00
  • 14d9a64047 [ModelRunner][V1] Optimize V1 attention mask (#442) Shanshan Shen 2025-04-02 10:33:53 +08:00
  • 94bf9c379e [Doc]Add developer guide for using lm-eval (#456) hfadzxy 2025-04-01 23:43:51 +08:00
  • 78083d405e Bump actions/setup-python from 5.4.0 to 5.5.0 (#440) dependabot[bot] 2025-04-01 14:34:33 +08:00
  • 2dbd763584 [CI] Fix mypy CI (#443) Mengqing Cao 2025-04-01 09:25:33 +08:00
  • c42e21a5aa [Docs] Add install system dependencies in install doc (#438) Yikun Jiang 2025-03-31 14:17:55 +08:00