Load dataset from hf failed #68

Open
murphypei opened this issue Jul 16, 2024 · 4 comments
murphypei commented Jul 16, 2024

from datasets import load_dataset

datasets = ['hotpotqa', '2wikimqa', 'musique', 'narrativeqa', 'qasper', 'multifieldqa_en', 'gov_report', 'qmsum', 'trec', 'samsum', 'triviaqa', 'passage_count', 'passage_retrieval_en', 'multi_news']
for dataset in datasets:
    print(f"Loading dataset {dataset}")
    data = load_dataset("THUDM/LongBench", dataset, split="test")
    output_path = f"{output_dir}/pred/{dataset}.jsonl"  # output_dir is defined elsewhere in the script

File "/usr/local/lib/python3.9/dist-packages/datasets/packaged_modules/cache/cache.py", line 65, in _find_hash_in_cache
raise ValueError(
ValueError: Couldn't find cache for THUDM/LongBench for config '2wikimqa'
Available configs in the cache: ['dureader', 'hotpotqa', 'multifieldqa_en_e', 'qasper_e']
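
One way to read the error above: the lookup appears to have gone to the local cache, and only the configs listed in the message are present there. A minimal sketch that compares the requested configs with the cached ones (both lists are copied from this report; the helper name `missing_configs` is made up for illustration):

```python
# Config names requested in the loop above.
requested = ['hotpotqa', '2wikimqa', 'musique', 'narrativeqa', 'qasper',
             'multifieldqa_en', 'gov_report', 'qmsum', 'trec', 'samsum',
             'triviaqa', 'passage_count', 'passage_retrieval_en', 'multi_news']

# Configs the error message reports as available in the local cache.
cached = ['dureader', 'hotpotqa', 'multifieldqa_en_e', 'qasper_e']


def missing_configs(requested, cached):
    """Return the requested configs that are absent from the cache, in order."""
    cached_set = set(cached)
    return [name for name in requested if name not in cached_set]


missing = missing_configs(requested, cached)
print(missing)
```

Of the requested configs, only `hotpotqa` is cached, which is consistent with the loop failing at the second entry, `2wikimqa`.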

bys0318 (Member) commented Jul 17, 2024

Hi, can you try deleting the cached files and downloading everything again?
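
For readers who want to try this suggestion, here is a rough sketch of clearing the cached copy. It assumes the default cache layout (`HF_DATASETS_CACHE`, else `HF_HOME/datasets`, else `~/.cache/huggingface/datasets`) and that the cached directories for this dataset start with `THUDM___`; both are assumptions to verify on your machine before deleting anything. The function names are hypothetical.

```python
import os
import shutil
from pathlib import Path


def default_cache_root():
    """Best guess at the datasets cache directory (assumes the default layout)."""
    env = os.environ.get("HF_DATASETS_CACHE")
    if env:
        return Path(env)
    hf_home = os.environ.get("HF_HOME", str(Path.home() / ".cache" / "huggingface"))
    return Path(hf_home) / "datasets"


def clear_cached_dataset(cache_root, prefix="THUDM___"):
    """Delete cache subdirectories whose name starts with `prefix` (case-insensitive).

    The prefix is an assumption about how the cache names its directories.
    Returns the names of the directories that were removed.
    """
    removed = []
    root = Path(cache_root)
    if not root.is_dir():
        return removed
    for entry in sorted(root.iterdir()):
        if entry.is_dir() and entry.name.lower().startswith(prefix.lower()):
            shutil.rmtree(entry)
            removed.append(entry.name)
    return removed


def reload_from_hub(config):
    """Alternative: ask the library to ignore its cache and download again."""
    from datasets import load_dataset  # lazy import; needs `pip install datasets`
    return load_dataset("THUDM/LongBench", config, split="test",
                        download_mode="force_redownload")
```

Calling `clear_cached_dataset(default_cache_root())` removes the matching directories; `reload_from_hub("2wikimqa")` is the gentler option because it leaves other cached datasets alone.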

murphypei (Author) commented Jul 17, 2024

> Hi, can you try deleting the cached files and downloading everything again?

Yes, I tried that many times, both on my local machine and in a Docker environment. I don't know whether you can reproduce this error; maybe it is just a mistake on my side. Thanks for your reply.

In the end I had to download the jsonl file and load it from local disk, and that works.
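
The local-disk workaround described here could look roughly like the sketch below. It assumes one file per config named `<config>.jsonl` inside a directory of your choosing; that layout, and the names `read_jsonl` and `load_local_config`, are illustrative assumptions rather than anything the project prescribes.

```python
import json
from pathlib import Path


def read_jsonl(path):
    """Read a JSON Lines file into a list of dicts, skipping blank lines."""
    records = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


def load_local_config(data_dir, config):
    """Same file, loaded through the datasets library's generic JSON loader."""
    from datasets import load_dataset  # lazy import; needs `pip install datasets`
    data_file = str(Path(data_dir) / f"{config}.jsonl")
    # With a single data file and no split mapping, the data lands in "train".
    return load_dataset("json", data_files=data_file, split="train")
```

`read_jsonl` needs only the standard library, while `load_local_config` returns a `Dataset` object, so downstream code that expects one keeps working.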

I can still use this dataset, but I think this error may discourage others from using it.

bys0318 (Member) commented Jul 17, 2024

Glad to hear you've loaded the dataset! Perhaps this error is due to an outdated version of the datasets package. You can try updating it:

pip install -U datasets
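
Since the failure was seen both on a local machine and in Docker, it may help to confirm which version the failing interpreter actually sees, before and after the upgrade. A small sketch using only the standard library (the helper name `installed_version` is made up for illustration):

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package="datasets"):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


print(installed_version("datasets"))
```

Run it with the same Python executable that produces the traceback, for example inside the container rather than on the host.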

murphypei (Author) commented

> Glad to hear you've loaded the dataset! Perhaps this error is due to an outdated version of the datasets package. You can try updating it:
>
> pip install -U datasets

I have already upgraded it to the latest version, but that didn't work. Maybe it's a Hugging Face issue?
