[Question]: Loading the English model rocketqav2-en-marco-cross-encoder with the Taskflow API for text similarity fails with "classifier.weight receives a shape [768, 2], but the expected shape is [768, 1]." #9097

Open
K-K-SD opened this issue Sep 6, 2024 · 0 comments
Labels
question Further information is requested

K-K-SD commented Sep 6, 2024

Environment:
paddlenlp 2.6.1
paddlepaddle 2.6.0

Code:
from paddlenlp import Taskflow
similarity = Taskflow("text_similarity", model='rocketqav2-en-marco-cross-encoder')
text1 = "i am hungry"
text2 = "i am very happy when i play soccer."
text3 = "i like apples."
text4 = "The weather is cold."
text5 = "you look beautiful."
text6 = "you look beautiful."
text7 = "he was very sad to lose this game"
text8 = "the opponent won this game, he's not feeling well"
print(similarity([[text1, text2],[text3, text4],[text5, text6],[text7, text8]]))

Error:
  File "/usr/local/lib/python3.8/dist-packages/paddlenlp/taskflow/taskflow.py", line 809, in __init__
    self.task_instance = task_class(
  File "/usr/local/lib/python3.8/dist-packages/paddlenlp/taskflow/text_similarity.py", line 175, in __init__
    self._get_inference_model()
  File "/usr/local/lib/python3.8/dist-packages/paddlenlp/taskflow/task.py", line 341, in _get_inference_model
    self._construct_model(self.model)
  File "/usr/local/lib/python3.8/dist-packages/paddlenlp/taskflow/text_similarity.py", line 199, in _construct_model
    self._model = ErnieCrossEncoder(self._task_path, num_classes=1, reinitialize=True)
  File "/usr/local/lib/python3.8/dist-packages/paddlenlp/transformers/semantic_search/modeling.py", line 255, in __init__
    self.ernie = ErnieEncoder.from_pretrained(pretrain_model_name_or_path, num_classes=num_classes, ignore_mismatched_sizes=False)
  File "/usr/local/lib/python3.8/dist-packages/paddlenlp/transformers/model_utils.py", line 2334, in from_pretrained
    model, missing_keys, unexpected_keys, mismatched_keys = cls._load_pretrained_model(
  File "/usr/local/lib/python3.8/dist-packages/paddlenlp/transformers/model_utils.py", line 2020, in _load_pretrained_model
    raise RuntimeError(f"Error(s) in loading state_dict for {model.__class__.__name__}:\n\t{error_msg}")
RuntimeError: Error(s) in loading state_dict for ErnieEncoder:
    Skip loading for classifier.weight. classifier.weight receives a shape [768, 2], but the expected shape is [768, 1].
    Skip loading for classifier.bias. classifier.bias receives a shape [2], but the expected shape is [1].
    You may consider adding ignore_mismatched_sizes=True in the model from_pretrained method.
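
To make the error concrete: the model is built with num_classes=1, so it expects a one-output classifier head, while the checkpoint stores a two-output head. The sketch below only mimics that shape comparison in plain Python. The helper name find_mismatched is hypothetical and this is not PaddleNLP's actual loading code; the shapes are copied from the error message.

```python
def find_mismatched(expected_shapes, checkpoint_shapes):
    """Return (key, checkpoint_shape, expected_shape) for shared keys whose shapes differ."""
    return [
        (key, checkpoint_shapes[key], expected)
        for key, expected in expected_shapes.items()
        if key in checkpoint_shapes and checkpoint_shapes[key] != expected
    ]


# Shapes the model expects when constructed with num_classes=1.
expected = {"classifier.weight": [768, 1], "classifier.bias": [1]}
# Shapes stored in the checkpoint, as reported by the error message.
checkpoint = {"classifier.weight": [768, 2], "classifier.bias": [2]}

for key, got, want in find_mismatched(expected, checkpoint):
    print(f"{key} receives a shape {got}, but the expected shape is {want}.")
```

Both classifier tensors are flagged, which matches the two "Skip loading" lines in the traceback.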

Following the hint, I changed this line in /usr/local/lib/python3.8/dist-packages/paddlenlp/transformers/semantic_search/modeling.py from
self.ernie = ErnieEncoder.from_pretrained(pretrain_model_name_or_path, num_classes=num_classes)
to
self.ernie = ErnieEncoder.from_pretrained(pretrain_model_name_or_path, num_classes=num_classes, ignore_mismatched_sizes=True)
With this change the model downloads, its parameters load, and prediction runs, but the predicted similarity values fall outside the 0-1 range:
[{'text1': 'i am hungry', 'text2': 'i am very happy when i play soccer.', 'similarity': -0.31699442863464355}, {'text1': 'i like apples.', 'text2': 'The weather is cold.', 'similarity': -0.05174262821674347}, {'text1': 'you look beautiful.', 'text2': 'you look beautiful.', 'similarity': -0.7865103483200073}, {'text1': 'he was very sad to lose this game', 'text2': "the opponent won this game, he's not feeling well", 'similarity': -1.3662326335906982}]
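
A possible reading of these numbers, offered as an assumption rather than a confirmed diagnosis: with ignore_mismatched_sizes=True the two mismatched classifier tensors are skipped (as the "Skip loading" messages indicate), so the one-output head likely keeps its fresh initialization and the scores are raw, unbounded logits. The sketch below uses only the standard library to show how a logit differs from a probability. The two-class logits and the [not relevant, relevant] ordering are hypothetical, chosen purely for illustration.

```python
import math


def sigmoid(logit):
    """Map a single raw logit into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-logit))


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Raw scores copied from the output above.
raw_scores = [
    -0.31699442863464355,
    -0.05174262821674347,
    -0.7865103483200073,
    -1.3662326335906982,
]
# Bounded, but still meaningless if the head was never trained.
squashed = [sigmoid(s) for s in raw_scores]

# Hypothetical two-class logits, assumed order [not relevant, relevant].
example_logits = [-1.2, 2.3]
relevant_prob = softmax(example_logits)[1]
```

Squashing the raw scores only fixes the range. If the head really was re-initialized, the ranking itself carries no information, which would be consistent with the identical pair ("you look beautiful." twice) not receiving the highest score.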

The English model ernie-search-large-cross-encoder-marco-en has the same problem, whereas Chinese models such as simbert-base-chinese and rocketqa-base-cross-encoder work fine.

How should I handle this?

@K-K-SD K-K-SD added the question Further information is requested label Sep 6, 2024