[Question]: Loading the English model rocketqav2-en-marco-cross-encoder with the Taskflow API for text similarity fails with "classifier.weight receives a shape [768, 2], but the expected shape is [768, 1]." #9097

K-K-SD · 2024-09-06T04:36:06Z

Environment:
paddlenlp 2.6.1
paddlepaddle 2.6.0

Code:
from paddlenlp import Taskflow
similarity = Taskflow("text_similarity", model='rocketqav2-en-marco-cross-encoder')
text1 = "i am hungry"
text2 = "i am very happy when i play soccer."
text3 = "i like apples."
text4 = "The weather is cold."
text5 = "you look beautiful."
text6 = "you look beautiful."
text7 = "he was very sad to lose this game"
text8 = "the opponent won this game, he's not feeling well"
print(similarity([[text1, text2],[text3, text4],[text5, text6],[text7, text8]]))

Error:
File "/usr/local/lib/python3.8/dist-packages/paddlenlp/taskflow/taskflow.py", line 809, in __init__
self.task_instance = task_class(
File "/usr/local/lib/python3.8/dist-packages/paddlenlp/taskflow/text_similarity.py", line 175, in __init__
self._get_inference_model()
File "/usr/local/lib/python3.8/dist-packages/paddlenlp/taskflow/task.py", line 341, in _get_inference_model
self._construct_model(self.model)
File "/usr/local/lib/python3.8/dist-packages/paddlenlp/taskflow/text_similarity.py", line 199, in _construct_model
self._model = ErnieCrossEncoder(self._task_path, num_classes=1, reinitialize=True)
File "/usr/local/lib/python3.8/dist-packages/paddlenlp/transformers/semantic_search/modeling.py", line 255, in __init__
self.ernie = ErnieEncoder.from_pretrained(pretrain_model_name_or_path, num_classes=num_classes, ignore_mismatched_sizes=False)
File "/usr/local/lib/python3.8/dist-packages/paddlenlp/transformers/model_utils.py", line 2334, in from_pretrained
model, missing_keys, unexpected_keys, mismatched_keys = cls._load_pretrained_model(
File "/usr/local/lib/python3.8/dist-packages/paddlenlp/transformers/model_utils.py", line 2020, in _load_pretrained_model
raise RuntimeError(f"Error(s) in loading state_dict for {model.__class__.__name__}:\n\t{error_msg}")
RuntimeError: Error(s) in loading state_dict for ErnieEncoder:
Skip loading for classifier.weight. classifier.weight receives a shape [768, 2], but the expected shape is [768, 1].
Skip loading for classifier.bias. classifier.bias receives a shape [2], but the expected shape is [1].
You may consider adding ignore_mismatched_sizes=True in the model from_pretrained method.
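
The messages above suggest that the downloaded checkpoint stores a two-output classifier head, while Taskflow builds the model with num_classes=1. A minimal sketch of that comparison follows. It uses plain dictionaries of shapes rather than real PaddleNLP state dicts; the helper name find_shape_mismatches and both dictionaries are hypothetical, with the shapes copied from the error message.

```python
def find_shape_mismatches(checkpoint_shapes, model_shapes):
    """Return {name: (checkpoint_shape, model_shape)} for parameters whose shapes differ."""
    mismatches = {}
    for name, expected in model_shapes.items():
        received = checkpoint_shapes.get(name)
        # Only report parameters present in both with different shapes.
        if received is not None and list(received) != list(expected):
            mismatches[name] = (list(received), list(expected))
    return mismatches


# Shapes copied from the error message; not read from a real checkpoint.
checkpoint_shapes = {"classifier.weight": [768, 2], "classifier.bias": [2]}
model_shapes = {"classifier.weight": [768, 1], "classifier.bias": [1]}

for name, (received, expected) in find_shape_mismatches(checkpoint_shapes, model_shapes).items():
    print(f"{name} receives a shape {received}, but the expected shape is {expected}.")
```

Both classifier parameters differ only in the output dimension (2 versus 1), which is consistent with a head trained for two classes being loaded into a single-score head.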

Following the hint, I changed this line in /usr/local/lib/python3.8/dist-packages/paddlenlp/transformers/semantic_search/modeling.py:
self.ernie = ErnieEncoder.from_pretrained(pretrain_model_name_or_path, num_classes=num_classes)
to:
self.ernie = ErnieEncoder.from_pretrained(pretrain_model_name_or_path, num_classes=num_classes, ignore_mismatched_sizes=True)
After that, the model downloads, its parameters load, and prediction runs, but the predicted similarity values fall outside the 0-1 range:
[{'text1': 'i am hungry', 'text2': 'i am very happy when i play soccer.', 'similarity': -0.31699442863464355}, {'text1': 'i like apples.', 'text2': 'The weather is cold.', 'similarity': -0.05174262821674347}, {'text1': 'you look beautiful.', 'text2': 'you look beautiful.', 'similarity': -0.7865103483200073}, {'text1': 'he was very sad to lose this game', 'text2': "the opponent won this game, he's not feeling well", 'similarity': -1.3662326335906982}]
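
One possible reading of these values, offered as an assumption rather than a confirmed diagnosis: with ignore_mismatched_sizes=True the mismatched classifier weights are skipped (as the "Skip loading" messages indicate), so the one-output head is presumably left at its initial values and its raw output is an unbounded logit rather than a probability. The sketch below only illustrates the arithmetic of turning a two-class logit pair into a score in [0, 1] with softmax. The function name, the example logits, and the choice of index 1 as the "similar" class are all hypothetical and not taken from PaddleNLP.

```python
import math


def positive_class_probability(logits, positive_index=1):
    """Apply softmax to the logits and return the probability of one class."""
    max_logit = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - max_logit) for x in logits]
    return exps[positive_index] / sum(exps)


# Made-up logits for illustration; not produced by any model.
example_logits = [-0.3, 1.2]
score = positive_class_probability(example_logits)
print(round(score, 4))
```

Whatever the logits are, the softmax output always lies between 0 and 1, unlike the raw values printed above.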

In addition, the English model ernie-search-large-cross-encoder-marco-en has the same problem, whereas Chinese models such as simbert-base-chinese and rocketqa-base-cross-encoder work fine.

How should I handle this?

K-K-SD added the question label on Sep 6, 2024

paddle-bot assigned gongel on Sep 6, 2024
