
fix SmoothQuantGatedMLP ffn_hidden_size bug #1712

Closed · wants to merge 3 commits

Conversation

@michael200892458
In SmoothQuantMLP, mlp_hidden_size is not equal to config.intermediate_size for the Qwen model.
mlp_hidden_size should be layer.mlp.ffn_hidden_size.
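
A minimal sketch of the reported bug and the proposed fix (the function and surrounding names here are illustrative, not the actual TensorRT-LLM sources): for models such as Qwen, the gated MLP's width can differ from config.intermediate_size, so the per-layer ffn_hidden_size recorded on the MLP module is the reliable value.

```python
# Illustrative sketch only; smooth_quant_gated_mlp and its arguments are
# hypothetical stand-ins for the TensorRT-LLM quantization helper.
def smooth_quant_gated_mlp(layer, config):
    # Buggy: assumes every model's MLP width matches the config field.
    # For Qwen, config.intermediate_size does not match the gated MLP width.
    # mlp_hidden_size = config.intermediate_size

    # Fixed: read the width recorded on the layer's own MLP module.
    mlp_hidden_size = layer.mlp.ffn_hidden_size
    return mlp_hidden_size
```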

@nv-guomingz (Collaborator)

nv-guomingz commented Jun 3, 2024

Hi @michael200892458, thanks for contributing to tensorrt-llm.
May I know the background for this MR's change to generation.py? Is it a duplicate of MR #1685? If so, I suggest we keep this MR focused on the ffn_hidden_size bug only.

@nv-guomingz (Collaborator)

Hi @michael200892458, after checking the latest main code, this issue has already been fixed in the main branch.

nv-guomingz closed this Jun 5, 2024
nv-guomingz added the triaged label (Issue has been triaged by maintainers) Jun 5, 2024