Pull requests: vllm-project/vllm
[Bugfix][Core] Fix tekken edge case for mistral tokenizer
#8640
opened Sep 19, 2024 by
patrickvonplaten
[Bugfix] Handle best_of>1 case by disabling multi-step speculation.
#8637
opened Sep 19, 2024 by
afeldman-nm
[Bugfix][Model] enable internvl running with num_scheduler_steps > 1
#8614
opened Sep 19, 2024 by
DefTruth
fix validation: Only set tool_choice auto if at least one tool is provided
#8568
opened Sep 18, 2024 by
chiragjn
Fix typical acceptance sampler with correct recovered token ids
#8562
opened Sep 18, 2024 by
jiqing-feng
[Bugfix] Fix potentially unsafe custom allreduce synchronization
#8558
opened Sep 18, 2024 by
hanzhi713
[CI/Build] Re-enabling Entrypoints tests on ROCm, excluding ones that fail
ready
ONLY add when PR is ready to merge/full CI is needed
#8551
opened Sep 18, 2024 by
alexeykondrat
[Doc] update the debugging document to add more explanation on gpu_memory_utilization and CUDA OOM issues
#8541
opened Sep 17, 2024 by
yangalan123
[Bugfix] fix OpenAI API server startup with --disable-frontend-multiprocessing
ready
#8537
opened Sep 17, 2024 by
dtrifiro
[Kernel][Model] Varlen prefill + Prefill chunking support for mamba kernels
#8533
opened Sep 17, 2024 by
mzusman
[CI/Build][Misc] Comparing between block manager v1 and v2, under full prefix sharing and no prefix sharing case.
ready
#8528
opened Sep 16, 2024 by
KuntaiDu
[dbrx] refactor dbrx experts to extend FusedMoe class
ready
#8518
opened Sep 16, 2024 by
divakar-amd
[Doc] Compatibility matrix for mutual exclusive features
#8512
opened Sep 16, 2024 by
wallashss
[Core] Implementing disaggregated prefilling, and caching KV cache in CPU/disk/database.
#8498
opened Sep 16, 2024 by
KuntaiDu
[Bugfix] Fix incorrect llava next feature size calculation
#8496
opened Sep 15, 2024 by
zyddnys
[Model][VLM] Add LLaVA-Onevision model support
#8486
opened Sep 14, 2024 by
litianjian
2 of 3 tasks