Error when testing the GPQA dataset with Qwen2-72B-Base: NotImplementedError: OpenAI does not support ppl-based evaluation yet, try gen-based instead. #1526

13416157913 · 2024-09-12T11:55:36Z

Prerequisite

I have searched Issues and Discussions but cannot get the expected help.
The bug has not been fixed in the latest version.

Type

I'm evaluating with the officially supported tasks/models/datasets.

Environment

1

Reproduces the problem - code/configuration sample

from mmengine.config import read_base
from opencompass.models import OpenAI
from opencompass.partitioners import NaivePartitioner
from opencompass.runners import LocalRunner
from opencompass.tasks import OpenICLInferTask

with read_base():
    from .datasets.collections.chat_medium import datasets
    from .summarizers.medium import summarizer
    from .datasets.gpqa.gpqa_gen import gpqa_datasets

api_meta_template = dict(
    round=[
        dict(role='HUMAN', api_role='HUMAN'),
        dict(role='BOT', api_role='BOT', generate=True),
    ],
)
datasets = [*gpqa_datasets]
models = [
    dict(abbr='xxxx',
         type=OpenAI, path='xxxx',
         key='http://xxx.xxx.xxx.xxx:xxxx',  # The key will be obtained from $OPENAI_API_KEY, but you can write down your key here as well
         meta_template=api_meta_template,
         query_per_second=1,
         max_out_len=8192, max_seq_len=8192, batch_size=8),
]
infer = dict(
    partitioner=dict(type=NaivePartitioner),
    runner=dict(
        type=LocalRunner,
        max_num_workers=1,
        task=dict(type=OpenICLInferTask)),
)

Reproduces the problem - command or script

1

Reproduces the problem - error message

Traceback (most recent call last):
  File "/home/opencompass/opencompass/tasks/openicl_infer.py", line 152, in <module>
    inferencer.run()
  File "/home/opencompass/opencompass/tasks/openicl_infer.py", line 81, in run
    self._inference()
  File "/home/opencompass/opencompass/tasks/openicl_infer.py", line 125, in _inference
    inferencer.inference(retriever,
  File "/home/anaconda3/lib/python3.10/site-packages/opencompass/openicl/icl_inferencer/icl_ppl_inferencer.py", line 159, in inference
    sub_res = self.model.get_ppl_from_template(sub_prompt_list).tolist()
  File "/home/anaconda3/lib/python3.10/site-packages/opencompass/models/base.py", line 152, in get_ppl_from_template
    return self.get_ppl(inputs, mask_length)
  File "/home/anaconda3/lib/python3.10/site-packages/opencompass/models/base_api.py", line 124, in get_ppl
    raise NotImplementedError(f'{self.__class__.__name__} does not support'
NotImplementedError: OpenAI does not support ppl-based evaluation yet, try gen-based instead.

Other information

The configuration file used for the GPQA dataset is: gpqa_ppl_6bf57a.py
Error message: NotImplementedError: OpenAI does not support ppl-based evaluation yet, try gen-based instead.
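
One way to catch this before launching a run is to scan the dataset list for PPL-based inferencers whenever the model is API-based. The sketch below is only an illustration: it assumes each dataset config is a dict with an `infer_cfg` -> `inferencer` -> `type` entry, and it uses hypothetical stand-in classes named `PPLInferencer` and `GenInferencer` rather than importing anything from OpenCompass. The helper `find_ppl_datasets` is likewise made up for this example.

```python
# Stand-in classes so the sketch runs without OpenCompass installed.
# They only mimic the *names* of inferencer types (an assumption).
class PPLInferencer:
    pass


class GenInferencer:
    pass


def find_ppl_datasets(datasets):
    """Return the abbr of every dataset whose inferencer type name contains 'PPL'.

    Assumes each dataset config is a dict shaped like
    dict(abbr=..., infer_cfg=dict(inferencer=dict(type=SomeClass))).
    """
    flagged = []
    for ds in datasets:
        inferencer = ds.get('infer_cfg', {}).get('inferencer', {})
        inferencer_type = inferencer.get('type')
        # Accept either a class object or a plain string as the type.
        name = getattr(inferencer_type, '__name__', str(inferencer_type))
        if 'PPL' in name:
            flagged.append(ds.get('abbr', '<unnamed>'))
    return flagged


# Hypothetical dataset configs, for illustration only.
example_datasets = [
    dict(abbr='GPQA_ppl', infer_cfg=dict(inferencer=dict(type=PPLInferencer))),
    dict(abbr='GPQA_gen', infer_cfg=dict(inferencer=dict(type=GenInferencer))),
]

flagged = find_ppl_datasets(example_datasets)
print('PPL-based datasets that an API model cannot run:', flagged)
```

If the returned list is non-empty for an API model, swapping those entries for their gen-based counterparts should avoid the NotImplementedError above.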

MaiziXiao · 2024-09-13T09:26:35Z

The error message is straightforward. The OpenAI model does not support PPL-based evaluation (which relies on output logits). Try one of the GPQA generation-based settings instead (these rely only on input and output strings).
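
To make the distinction concrete, here is a minimal, self-contained sketch of the two scoring styles. It is not OpenCompass code: the function names, the made-up log-probabilities, and the simple regex for answer extraction are all assumptions chosen for illustration. The point is that the PPL path needs per-token log-probabilities from the model, while the gen path needs only the generated string.

```python
import math
import re


def perplexity(token_logprobs):
    """Perplexity of a sequence given its per-token log-probabilities."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def ppl_choice(option_logprobs):
    """PPL-based scoring: pick the option whose prompt has the lowest perplexity.

    Requires token-level log-probs, which a text-only API does not return.
    """
    return min(option_logprobs, key=lambda opt: perplexity(option_logprobs[opt]))


def gen_choice(generated_text):
    """Gen-based scoring: pull the first standalone A-D letter from the output.

    Deliberately simplistic; real answer extraction is usually more careful.
    """
    match = re.search(r'\b([ABCD])\b', generated_text)
    return match.group(1) if match else None


# Made-up log-probs for four answer options (illustrative values only).
fake_logprobs = {
    'A': [-2.0, -1.5, -3.0],
    'B': [-0.5, -0.4, -0.6],
    'C': [-1.0, -2.0, -1.0],
    'D': [-2.5, -2.5, -2.5],
}

ppl_answer = ppl_choice(fake_logprobs)
gen_answer = gen_choice('The correct answer is (B).')
print(ppl_answer, gen_answer)
```

Only `gen_choice` can be applied to a model reached through a chat-style API, because it never looks at logits.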

13416157913 · 2024-09-13T11:38:31Z

> The error message is straightforward. The OpenAI model does not support PPL-based evaluation (which relies on output logits). Try one of the GPQA generation-based settings instead (these rely only on input and output strings).

Hello, thanks for your answer. Does the Qwen2 model support PPL-based evaluation?

mm-assistant bot assigned acylam Sep 12, 2024
