Pull requests: NVIDIA/TensorRT-LLM

Pull requests list

Fix errors when quantizing Llama model
#2264 opened Sep 28, 2024 by dleunji
fix: none prompt to string
#2259 opened Sep 26, 2024 by dongs0104
README.md: Add 3rd Party Inference Speed Dashboard
#2244 opened Sep 22, 2024 by matichon-vultureprime
Labels: documentation
Modify small-batched weight only quantization
#2213 opened Sep 10, 2024 by dasistwo
Labels: quantization, triaged
Create sync.yml
#2154 opened Aug 27, 2024 by inkimikoko
typo fix quick-start-guide.md
#2075 opened Aug 1, 2024 by sweetning0809
fix GemmFpAIntB MMa::IteratorB::Layout
#2070 opened Jul 31, 2024 by luliyucoordinate
fix wrong arg in Engine Building Command in docs/source/performance/perf-overview.md
#2057 opened Jul 30, 2024 by RuibaiXu
Labels: documentation, Merged
Correct the version
#1936 opened Jul 12, 2024 by Shixiaowei02
Fix default min length
#1935 opened Jul 11, 2024 by akhoroshev
Labels: triaged
Add support for custom tokenizer and batch size
#1927 opened Jul 9, 2024 by uppalutkarsh
Add support for falcon2
#1926 opened Jul 9, 2024 by puneeshkhanna
Labels: triaged
Dev sm87 trt101
#1880 opened Jul 3, 2024 by sunnyqgg
Bump transformers from 4.36.2 to 4.38.0 in /examples/multimodal
#1689 opened May 28, 2024 by dependabot (bot)
Labels: bug, dependencies, triaged, waiting for feedback
add cached generation buffer
#1685 opened May 28, 2024 by michael200892458
Labels: triaged, waiting for feedback
Fix CUDA OOM when creating Mixtral checkpoint
#1629 opened May 19, 2024 by VivekBits2210
Labels: triaged, waiting for feedback
Add support for non-power-of-two heads with Alibi
#1611 opened May 15, 2024 by vmarkovtsev
Labels: triaged