Wav2Vec2 Pretraining #5533

rajeevbaalwan · 2024-08-08T07:49:08Z

❓ Questions and Help

I want to pretrain wav2vec2 from scratch. The documentation at https://github.com/facebookresearch/fairseq/tree/main/examples/wav2vec says that all audio clips should be in a single directory. The issue is that I have too much data to keep in a single directory.

My data is spread across multiple directories on different disks, and I can't move all of it into a single directory because of storage constraints. Is it possible to pretrain the model in this scenario?

gau-nernst · 2024-08-27T03:13:59Z

You can try using symlinks.
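
A minimal sketch of the symlink approach, assuming a POSIX system: it gathers audio files from several source directories into one directory of symbolic links, so no audio data is copied and the links can point across disks. The function name `link_audio_dirs`, the extension set, and the `d<index>_` prefix used to avoid file name collisions are all hypothetical choices for illustration, not part of fairseq.

```python
import os
from pathlib import Path

# Assumed set of audio extensions; adjust to match your data.
AUDIO_EXTS = {".wav", ".flac"}


def link_audio_dirs(source_dirs, link_dir):
    """Create one symlink in link_dir for every audio file under source_dirs."""
    link_dir = Path(link_dir)
    link_dir.mkdir(parents=True, exist_ok=True)
    created = []
    for idx, src in enumerate(source_dirs):
        src = Path(src).resolve()
        for path in sorted(src.rglob("*")):
            if not path.is_file() or path.suffix.lower() not in AUDIO_EXTS:
                continue
            # Flatten the relative path and prefix it with the source index
            # so that identically named files from different disks do not clash.
            flat_name = "_".join(path.relative_to(src).parts)
            link = link_dir / f"d{idx}_{flat_name}"
            if not (link.is_symlink() or link.exists()):
                os.symlink(path, link)
            created.append(link)
    return created
```

The resulting link directory can then be passed to whatever step expects a single audio directory. Whether that step follows symlinks should be checked against the fairseq version in use.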

rajeevbaalwan · 2024-09-02T06:58:39Z

@gau-nernst thanks for your response.
Is there another way to do this, such as modifying the code so it can handle paths from multiple directories? In the end, the data loader just loads a path, so why does this constraint exist?

gau-nernst · 2024-09-02T13:43:21Z

Of course it is possible to modify the code, but then you have to do it yourself.
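
One route that may avoid changing the loader is to write the training manifest yourself. The sketch below assumes the manifest is a TSV file whose first line is a root directory and whose remaining lines are `relative/path<TAB>number_of_samples`, which is the layout the example's manifest script appears to produce; verify this against your fairseq version before relying on it. The function name `write_manifest` is hypothetical, and the standard-library `wave` module is used here only to keep the sketch self-contained, so it handles uncompressed `.wav` files only.

```python
import os
import wave
from pathlib import Path


def write_manifest(source_dirs, manifest_path):
    """Write a root line plus 'relative_path<TAB>num_samples' lines."""
    files = []
    for src in source_dirs:
        files.extend(sorted(Path(src).resolve().rglob("*.wav")))
    if not files:
        raise ValueError("no .wav files found in the given directories")

    # Deepest directory shared by all files; for data on different disks
    # this may simply be "/" or a mount parent such as "/mnt".
    root = os.path.commonpath([str(p.parent) for p in files])

    with open(manifest_path, "w") as out:
        print(root, file=out)
        for path in files:
            with wave.open(str(path), "rb") as wav:
                num_samples = wav.getnframes()
            print(f"{os.path.relpath(path, root)}\t{num_samples}", file=out)
    return root
```

With this layout every clip stays where it is, and only the manifest needs to know the common root. A validation manifest would need to be written the same way from a held-out subset of files.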

rajeevbaalwan added the "needs triage" and "question" labels · Aug 8, 2024
