Commit dde1fa1

[Misc] Improve BNB loader to handle mixture of sharded and merged weights with same suffix (#11566)
Signed-off-by: Isotr0py <[email protected]>
1 parent 0240402 commit dde1fa1

vllm/model_executor/model_loader/loader.py

Lines changed: 5 additions & 2 deletions
@@ -1001,8 +1001,11 @@ def _get_bnb_target_modules(self, model: nn.Module) -> None:
                     for sub_name in sub_modules:
                         self.target_modules.append(
                             name.replace(last_name, sub_name))
-                else:
-                    self.target_modules.append(name)
+                # Add original module name even if the module has stacked map,
+                # in case model has a mixture of disk-merged and disk-splitted
+                # weights with same last name.
+                self.target_modules.append(name)
+
         assert (self.target_modules
                 ), "vllm currently does not support BNB quantization for"
         f" {type(model).__name__}"
