Commit Graph

  • a8d633f629 [Bugfix] fix import error (#600) paulyu12 2025-04-22 08:57:25 +08:00
  • 0ae9ee0f8a [BUGFIX] main-sd-bugfix && [UT] add mtp UT (#593) wemaster 2025-04-21 19:25:51 +08:00
  • 5442b463fd add doc for patch_config (#574) Shuqiao Li 2025-04-21 10:33:38 +08:00
  • 96d6fa7c90 [Docker] Fix openEuler image suffix (#586) Yikun Jiang 2025-04-21 08:55:26 +08:00
  • 12cae04db9 [quantization] Support w8a8 quantization (#580) Yikun Jiang 2025-04-20 18:14:05 +08:00
  • 1a1f9a6d89 port deepseekv2 and mtp to main branch (#429) Pleaplusone 2025-04-19 17:38:18 +08:00
  • 086423dc35 [Docker] Bump Dockerfile version to v0.8.4 (#577) Yikun Jiang 2025-04-18 19:15:17 +08:00
  • a127cc83f8 catch ImportError when C code not compiled (#575) Shuqiao Li 2025-04-18 18:11:49 +08:00
  • 985b0548b0 [Doc] Update v0.8.4 release note, add contents for structured output feature (#576) Shanshan Shen 2025-04-18 17:44:16 +08:00
  • 65c1f4579f [V1][Structured Output] Add apply_grammar_bitmask() method to model runner (#555) Shanshan Shen 2025-04-18 16:47:55 +08:00
  • 2c903bc7ac [Doc] Update doc for custom ops build (#570) Mengqing Cao 2025-04-18 15:35:10 +08:00
  • b91f9a5afd [Doc][Build] Update build doc and faq (#568) Mengqing Cao 2025-04-18 14:16:41 +08:00
  • e66ded5679 [Doc] Add release note for 0.8.4rc1 (#557) wangxiyuan 2025-04-18 13:24:36 +08:00
  • 7eeff60715 [Doc] Update FAQ doc (#561) Shanshan Shen 2025-04-18 13:13:13 +08:00
  • 84563fc65d Add sleep mode feature for Ascend NPU (#513) Shuqiao Li 2025-04-18 13:11:39 +08:00
  • 42c7fbb10e [Misc] Fix import error and address nits to make CI happy (#563) wangxiyuan 2025-04-18 12:23:32 +08:00
  • 66a0837963 adopt rope in vllm-ascend (#530) Pleaplusone 2025-04-18 08:56:05 +08:00
  • 23f85e3f74 [BugFix] Fix scheduler problems in last PR. (#558) whx 2025-04-18 08:49:48 +08:00
  • 6ee7f5cf71 [SpecDecode] Add spec decode support (#500) Mengqing Cao 2025-04-17 20:16:32 +08:00
  • b71f193cb0 [Model][Doc] Update model support list (#552) Mengqing Cao 2025-04-17 19:32:20 +08:00
  • 20dff4deff [Scheduler] Add AscendScheduler. (#543) whx 2025-04-17 19:31:50 +08:00
  • 697908f5cd [Platform][Worker][ModelRunner] Add LoRA & Multi-LoRA support (#521) paulyu12 2025-04-17 16:48:46 +08:00
  • 9935d45728 [CI]Add model basic accuracy test(Qwen2.5-0.5B-Instruct) (#460) hfadzxy 2025-04-17 14:59:56 +08:00
  • c3d1a3782a Add pyhccl (#503) Huazhong Ji 2025-04-17 14:57:52 +08:00
  • 64fdf4cbef [Doc]Update faq (#536) Li Wang 2025-04-17 14:56:51 +08:00
  • 6061f33670 [Bugfix][Model] Fix api in DeepSeek model (#545) Mengqing Cao 2025-04-17 11:56:05 +08:00
  • 9859e7313f [CI]Add global env to runner (#537) Li Wang 2025-04-17 10:08:00 +08:00
  • 00de2ee6ad [Doc] update faq about progress bar display issue (#538) hfadzxy 2025-04-16 16:07:08 +08:00
  • fe13cd9ea5 [Doc] update faq about w8a8 (#534) Mengqing Cao 2025-04-16 09:37:21 +08:00
  • 415ed027fa [V1][Platform] Remove supports_structured_output() in platform (#531) Shanshan Shen 2025-04-16 09:30:33 +08:00
  • bbe7ccd366 [MISC] Add patch module (#526) wangxiyuan 2025-04-16 09:28:58 +08:00
  • 434749d299 [CI] update 0.8.3 to 0.8.4 (#528) wangxiyuan 2025-04-16 09:26:30 +08:00
  • 13480d1238 [CI]Fix workflow (#532) Li Wang 2025-04-15 19:55:41 +08:00
  • bcbc04f92b [Doc] Add environment variables doc (#519) Shanshan Shen 2025-04-15 16:09:36 +08:00
  • 44a8301424 [Feature] Add PD separation feature (#432) eeethenQ 2025-04-15 15:11:35 +08:00
  • c7f6584d75 [V1] clean up V1 code (#505) wangxiyuan 2025-04-15 10:24:02 +08:00
  • f6af1d2471 [MISC] fix logger (#515) wangxiyuan 2025-04-15 10:18:05 +08:00
  • 5c6d79687c [Doc] Update FAQ (#518) wangxiyuan 2025-04-15 10:17:56 +08:00
  • 5fa70b6393 [Build] Update doc (#509) wangxiyuan 2025-04-14 14:38:50 +08:00
  • 11ecbfdb31 [Doc] Update FAQ doc (#504) Shanshan Shen 2025-04-14 11:11:40 +08:00
  • 9c7428b3d5 [CI] enable custom ops build (#466) wangxiyuan 2025-04-12 10:24:53 +08:00
  • d05ea17427 Add openEuler based container image for vLLM Ascend (#489) Icey 2025-04-10 14:30:49 +08:00
  • afdbf77483 [CI] Add new runner and enable QwQ multinpu test (#417) Li Wang 2025-04-08 16:52:45 +08:00
  • 5d6239306b [DOC] Update multi_node.md (#468) jinyuxin 2025-04-08 14:19:57 +08:00
  • f6cf92e7d5 [quant][bugfix] fix deepseek quant bug (#478) Mengqing Cao 2025-04-08 09:15:56 +08:00
  • 579d858a20 Set torchvision<0.21.0 to match torch/torch_npu version (#479) Yikun Jiang 2025-04-08 09:15:42 +08:00
  • 1d88dacf9f [V1][Platform] Add supports_structured_output() method to Platform (#475) Shanshan Shen 2025-04-07 19:11:51 +08:00
  • adabdeea7f Set numpy < 2.0.0 to resolve numpy VersionConflict (#476) Yikun Jiang 2025-04-07 16:07:21 +08:00
  • 344228a5da [deepseek][bugfix] support deepseek quant (#469) Mengqing Cao 2025-04-07 10:56:12 +08:00
  • 3f9752f8ee [Bugfix]Lazy import vllm config (#462) Li Wang 2025-04-03 16:03:08 +08:00
  • ce8259975e [core] Support custom ascendc kernels in vllm-ascend (#233) Pleaplusone 2025-04-03 14:52:34 +08:00
  • 14d9a64047 [ModelRunner][V1] Optimize V1 attention mask (#442) Shanshan Shen 2025-04-02 10:33:53 +08:00
  • 94bf9c379e [Doc]Add developer guide for using lm-eval (#456) hfadzxy 2025-04-01 23:43:51 +08:00
  • 78083d405e Bump actions/setup-python from 5.4.0 to 5.5.0 (#440) dependabot[bot] 2025-04-01 14:34:33 +08:00
  • 2dbd763584 [CI] Fix mypy CI (#443) Mengqing Cao 2025-04-01 09:25:33 +08:00
  • c42e21a5aa [Docs] Add install system dependencies in install doc (#438) Yikun Jiang 2025-03-31 14:17:55 +08:00
  • 7beb4339dc [Doc]Add developer guide for using OpenCompass (#368) hfadzxy 2025-03-31 00:24:25 +08:00
  • b6499ed97d [CI] Use CI pool (#428) wangxiyuan 2025-03-29 12:42:59 +08:00
  • ca8b1c3e47 [Doc] Add 0.7.3rc2 release note (#419) wangxiyuan 2025-03-29 09:02:08 +08:00
  • 31f29b9f30 [Core] Make V1 work and enable V1 engine test (#389) wangxiyuan 2025-03-28 19:34:23 +08:00
  • 57a84bb7be [Bug Fix] Fix bug of platform for parameter checking (#411) wuhuikx 2025-03-28 16:31:27 +08:00
  • b1557abab6 fix multistep bug,remove uselesscodes (#355) Tony 2025-03-28 09:55:35 +08:00
  • 1864c40520 Add vLLM Ascend Weekly meeting link (#400) Yikun Jiang 2025-03-27 09:00:21 +08:00
  • 4804b74e95 Update 110-user-story.yml (#402) Zhenyu Zheng 2025-03-27 08:58:57 +08:00
  • 0b5a9643fd Add an example for user stories (#399) Zhenyu Zheng 2025-03-26 16:25:57 +08:00
  • 122505208f FastPatch: Optimized Patch Embedding for Qwen2VL (#345) BAI Fan 2025-03-26 14:28:20 +08:00
  • d4accf4ec2 [Doc][Model] update LLaVA 1.6 support (#373) Mengqing Cao 2025-03-26 09:07:55 +08:00
  • 6295d2e9bc [CI/Build][Doc] upgrade torch-npu to 0320 (#392) Mengqing Cao 2025-03-26 09:04:12 +08:00
  • 3fb3b5cf75 [Doc] Update model support doc (add QwQ-32B) (#388) Shanshan Shen 2025-03-25 11:40:50 +08:00
  • 8996733307 [CI] fix vllm test (#365) Mengqing Cao 2025-03-24 16:09:06 +08:00
  • 89ca63a2c2 [Bugfix] Disable torch.compile() (#370) Shanshan Shen 2025-03-21 15:55:51 +08:00
  • 9a175ca0fc [Doc]Add benchmark scripts (#74) Li Wang 2025-03-21 15:54:34 +08:00
  • befbee5883 Update README and add collect_env info (#369) wangxiyuan 2025-03-21 15:43:43 +08:00
  • 243ed4da69 Add vLLM forum info and update readme (#366) Yikun Jiang 2025-03-21 09:32:42 +08:00
  • c06af8b2e0 [V1][Core] Add support for V1 Engine (#295) Shanshan Shen 2025-03-20 19:34:44 +08:00
  • 663dca7578 [CI] fix race condition problem (#353) wangxiyuan 2025-03-19 17:04:36 +08:00
  • 441a62e937 [Doc] Fix bugs of installation doc and format tool (#330) Shanshan Shen 2025-03-14 10:21:35 +08:00
  • ac1ba1d8d2 [Build] Fix x86 image build (#327) wangxiyuan 2025-03-14 09:41:57 +08:00
  • c25631ec7b [Doc] Add the release note for 0.7.3rc1 (#285) wangxiyuan 2025-03-13 17:57:06 +08:00
  • 41aba1cfc1 [Doc]Fix tutorial doc expression (#319) Li Wang 2025-03-13 15:24:05 +08:00
  • 59ea23d0d3 [Doc] Add Single NPU (Qwen2.5-VL-7B) tutorial (#311) xiemingda 2025-03-12 20:37:12 +08:00
  • 7330416de3 [BugFix] Fix bugs when using ascend quantization (#275) Angazenn 2025-03-12 11:33:21 +08:00
  • 5c7a95b01d [Attn] Support encoder-only attention with torch sdpa (#290) Mengqing Cao 2025-03-12 08:57:29 +08:00
  • 12aa7115b5 bugfix for qwen2_vl (#301) zouyida2002 2025-03-12 08:39:50 +08:00
  • 9450e9811b [CI] Uninstall triton in dockerfile (#298) wangxiyuan 2025-03-12 07:14:57 +08:00
  • 0db6670bfa [Feature] Implement EP-compatible fused_moe (#121) yiz-liu 2025-03-11 21:08:02 +08:00
  • 4c9d78a035 support multistep decode (#299) Tony 2025-03-11 19:20:06 +08:00
  • feb6bdb12e [Platform][Model Runner] Add hash of request_ids; Change blocksize back to 128. (#293) whx 2025-03-11 18:50:28 +08:00
  • 007aeaa48b [Doc] Change distributed_executor_backend to mp (#287) Yikun Jiang 2025-03-10 11:27:26 +08:00
  • 38334f5daa [Docs] Re-arch on doc and make QwQ doc work (#271) Yikun Jiang 2025-03-10 09:27:48 +08:00
  • 18bb8d1f52 Adapt vLLM requirements changes to fix main CI (#279) Yikun Jiang 2025-03-09 16:07:45 +08:00
  • 268da28961 Pin modelscope<1.23.0 on vLLM v0.7.3 (#272) Yikun Jiang 2025-03-09 15:59:42 +08:00
  • be58d5f3d8 Bump torch_npu version to dev20250308.3 (#276) Yikun Jiang 2025-03-09 15:59:15 +08:00
  • 91f7d8115d [CI/Build] Bump torch_npu to dev20250307.3 (#265) Mengqing Cao 2025-03-07 20:34:07 +08:00
  • faf8cd89cb register qwen2_vl to rewrite qwen2_vl forwad (#241) zouyida2002 2025-03-07 15:41:47 +08:00
  • 35cb7b5234 [CI] Add dispatch job to leverage dynamic devices (#251) Yikun Jiang 2025-03-07 09:47:13 +08:00
  • 3217f0d10f [Feature] Modify description and api for ascend quantization (#243) Angazenn 2025-03-06 15:17:25 +08:00
  • cff08f9df8 [Doc] Add initial FAQs (#247) Yikun Jiang 2025-03-06 10:42:42 +08:00
  • dcd0005058 [Fix] Remove npu_group_topk before CANN version update (#242) HongtaoYang 2025-03-06 09:02:46 +08:00
  • 0d3463400a [Performance] Change the shape of kv_cache to avoid view of k_cache and v_cache. (#204) whx 2025-03-05 10:51:07 +08:00