Wav2Vec2 Pretraining #5533

Open
rajeevbaalwan opened this issue Aug 8, 2024 · 3 comments

Comments

@rajeevbaalwan

❓ Questions and Help

I want to pretrain wav2vec2 from scratch. The documentation at https://github.com/facebookresearch/fairseq/tree/main/examples/wav2vec says that all audio clips should be in a single directory. The issue is that I have too much data to keep in a single directory.

My data is spread across multiple directories on different disks, and I can't move all of it into a single directory because of storage constraints. Is it possible to pretrain the model in this scenario?

@gau-nernst

You can try using symlinks.
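A minimal sketch of the symlink idea, assuming a POSIX filesystem and `.wav` files: it creates one link per audio file inside a single directory, so no audio data is copied. The function name `link_audio_dirs`, the `d{i}_` prefix, and the example paths are hypothetical and not part of fairseq.

```python
import os
from pathlib import Path


def link_audio_dirs(source_dirs, target_dir, ext=".wav"):
    """Symlink every `ext` file found under `source_dirs` into `target_dir`.

    Link names are built from the index of the source directory plus the
    file's relative path (separators replaced by "_"), so files with the
    same name in different directories get different link names.
    Returns the list of link paths.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    links = []
    for i, src in enumerate(source_dirs):
        src = Path(src)
        for path in sorted(src.rglob(f"*{ext}")):
            rel = path.relative_to(src)
            link = target / (f"d{i}_" + "_".join(rel.parts))
            # Skip links left over from a previous run.
            if not (link.is_symlink() or link.exists()):
                os.symlink(path.resolve(), link)
            links.append(link)
    return links
```

For example, `link_audio_dirs(["/disk1/audio", "/disk2/audio"], "/data/all_audio")` would leave `/data/all_audio` holding only links, which can then be given to the manifest step as the single audio directory.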

@rajeevbaalwan

@gau-nernst thanks for your response.
Is there any other way to do this, such as modifying the code so it can handle paths from multiple directories? In the end, the data loader just loads each file by its path, so why does this constraint exist?

@gau-nernst

Of course it is possible to modify the code, but then you have to do it yourself.
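One possible starting point that may avoid touching the loader at all, offered as an unverified sketch: write the manifest yourself. This assumes the manifest is a text file whose first line is a root directory and whose remaining lines are `relative_path<TAB>num_samples`; check that against your fairseq version. The function `write_manifest` is hypothetical, and the standard-library `wave` module is used here only to count samples in PCM WAV files (other formats would need a different reader).

```python
import os
import wave
from pathlib import Path


def write_manifest(source_dirs, manifest_path, ext=".wav"):
    """Write one manifest covering several audio directories.

    The root line is the deepest directory that contains every source
    directory ("/" if they only share the filesystem root); each entry
    is then a path relative to that root. Returns the number of entries.
    """
    dirs = [os.path.abspath(d) for d in source_dirs]
    root = os.path.commonpath(dirs)
    count = 0
    with open(manifest_path, "w") as out:
        print(root, file=out)
        for d in dirs:
            for path in sorted(Path(d).rglob(f"*{ext}")):
                # Number of sample frames in a PCM WAV file.
                with wave.open(str(path), "rb") as wav:
                    frames = wav.getnframes()
                print(f"{os.path.relpath(path, root)}\t{frames}", file=out)
                count += 1
    return count
```

With this layout nothing is moved or linked; the trade-off is that you also have to produce the train/validation split yourself.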
