vllm-project / vllm Public

Pull requests: vllm-project/vllm

411 Open 3,355 Closed

Pull requests list

[Bugfix][Core] Fix tekken edge case for mistral tokenizer

#8640 opened Sep 19, 2024 by patrickvonplaten

[Bugfix] Handle best_of>1 case by disabling multi-step speculation.

#8637 opened Sep 19, 2024 by afeldman-nm

[Bugfix][Model] enable internvl running with num_scheduler_steps > 1

#8614 opened Sep 19, 2024 by DefTruth

[Misc] guard against change in cuda library name

#8609 opened Sep 19, 2024 by bnellnm

Support FP8 MoE for compressed-tensors

#8588 opened Sep 19, 2024 by mgoin • Draft

[Bugfix] Move health checks to separate thread

#8583 opened Sep 18, 2024 by joerunde

[Core] Allow IPv6 in VLLM_HOST_IP with zmq

#8575 opened Sep 18, 2024 by russellb

[Bugfix] Fix Phi3.5 mini and MoE LoRA inference

#8571 opened Sep 18, 2024 by garg-amit

fix validation: Only set tool_choice auto if at least one tool is provided

#8568 opened Sep 18, 2024 by chiragjn

Fix typical acceptance sampler with correct recovered token ids

#8562 opened Sep 18, 2024 by jiqing-feng

[Bugfix] Fix potentially unsafe custom allreduce synchronization

#8558 opened Sep 18, 2024 by hanzhi713

[MISC] add support custom_op check

#8557 opened Sep 18, 2024 by jikunshang

[Misc] Fix api_server args

#8556 opened Sep 18, 2024 by Juelianqvq

[CI/Build] Re-enabling Entrypoints tests on ROCm, excluding ones that fail

#8551 opened Sep 18, 2024 by alexeykondrat • ready

[Bugfix] Validate SamplingParam n is an int

#8548 opened Sep 17, 2024 by saumya-saran

[Doc] update the debugging document to add more explanation on gpu_memory_utilization and CUDA OOM issues

#8541 opened Sep 17, 2024 by yangalan123

[Bugfix] fix OpenAI API server startup with --disable-frontend-multiprocessing

#8537 opened Sep 17, 2024 by dtrifiro • ready

[Kernel][Model] Varlen prefill + Prefill chunking support for mamba kernels

#8533 opened Sep 17, 2024 by mzusman

ppc64le: Dockerfile and CI fix

#8529 opened Sep 17, 2024 by sumitd2

[CI/Build][Misc] Comparing between block manager v1 and v2, under full prefix sharing and no prefix sharing case.

#8528 opened Sep 16, 2024 by KuntaiDu • ready

[dbrx] refactor dbrx experts to extend FusedMoe class

#8518 opened Sep 16, 2024 by divakar-amd • ready

[Doc] Compatibility matrix for mutual exclusive features

#8512 opened Sep 16, 2024 by wallashss

[Core] Implementing disaggregated prefilling, and caching KV cache in CPU/disk/database.

#8498 opened Sep 16, 2024 by KuntaiDu

[Bugfix] Fix incorrect llava next feature size calculation

#8496 opened Sep 15, 2024 by zyddnys

[Model][VLM] Add LLaVA-Onevision model support

#8486 opened Sep 14, 2024 by litianjian

2 of 3 tasks
