Pretrain HuBERT on English and Chinese speech dataset #5526

Hi, I'm trying to pretrain HuBERT from scratch on an English and Chinese speech dataset. During pretraining, the first-iteration loss dropped from 6.7 to 3.3, and the second-iteration loss dropped from 11.2 to 4.0. Both iterations' losses look very large. Is this a normal phenomenon?
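As a hedged sanity check on those starting values (assuming fairseq's HuBERT criterion reports loss in bits, i.e. divided by ln 2 when metrics are reduced, and that the first iteration uses the common 100-cluster MFCC codebook): a predictor that guesses uniformly over K cluster labels scores log2 K bits, so the loss at step 0 should sit near log2 of the codebook size.

```python
import math

# A uniform guess over K cluster labels costs log2(K) bits, so if the
# reported loss is in bits, early-training loss should land near
# log2(codebook size). The codebook sizes below are common choices,
# not the poster's confirmed settings.
for k in (100, 500):
    print(f"K={k}: uniform-predictor loss ~ {math.log2(k):.2f} bits")
# K=100: ~6.64 bits, close to the 6.7 reported for the first iteration
# K=500: ~8.97 bits
```

If the numbers line up this way, a high starting loss mostly reflects the codebook size rather than a training problem; what matters is how far below that baseline the loss falls.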
Can you show the config of your training?
I used hubert_base_librispeech.yaml for pretraining, changing only the ddp_backend and max_sample_size.
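For reference, those two fields live under the distributed_training and task groups in fairseq's config layout. A minimal sketch of that kind of override (the values here are illustrative, not the poster's actual settings):

```yaml
# Overrides on top of hubert_base_librispeech.yaml; values illustrative.
distributed_training:
  ddp_backend: legacy_ddp   # e.g. switched from the config's default

task:
  max_sample_size: 250000   # max crop length in samples (~15.6 s at 16 kHz)
```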
On my side, the HuBERT training loss eventually converges to around 2.5. I used WenetSpeech, which provides 10,000 hours of pure Chinese data, as the pretraining dataset.
We believe the key when training a HuBERT base model is the performance of the pre-trained model on your main downstream tasks: fine-tune the model produced by your recipe, then test its accuracy on those tasks.
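Fine-tuning a pre-trained checkpoint with fairseq's hydra entry point looks roughly like this (a sketch following the hubert example configs; all paths are placeholders):

```sh
fairseq-hydra-train \
  --config-dir fairseq/examples/hubert/config/finetune \
  --config-name base_10h \
  task.data=/path/to/manifests \
  task.label_dir=/path/to/labels \
  model.w2v_path=/path/to/your_pretrained_checkpoint.pt
```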
OK, thank you for your reply! We have tried training SpeechTokenizer on features from HuBERT, and the reconstructed speech is also good. We will try more experiments on downstream tasks.
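A minimal sketch of pulling HuBERT features out of a fairseq checkpoint for a downstream model like that, assuming the usual fairseq loading API; the checkpoint path and layer index are placeholders:

```python
import torch
from fairseq import checkpoint_utils

# Load a pre-trained HuBERT checkpoint (path is a placeholder).
models, cfg, task = checkpoint_utils.load_model_ensemble_and_task(
    ["/path/to/hubert_checkpoint.pt"]
)
model = models[0].eval()

wav = torch.randn(1, 16000)  # stand-in for 1 s of 16 kHz mono audio
with torch.no_grad():
    # mask=False disables span masking; output_layer selects which
    # transformer layer's activations to return.
    feats, _ = model.extract_features(
        source=wav, padding_mask=None, mask=False, output_layer=9
    )
print(feats.shape)  # (1, num_frames, 768) for the base model
```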