Why does my 16000 Hz speech model perform poorly? #7

JohnHerry · 2022-11-30T10:18:21Z

Thanks for the great work.
We are looking for a vocoder for TTS and tried SawSingSub. Our audio sample rate is 16000 Hz. To match our acoustic model, I changed the parameters in preprocess.py, setting hop_length=200, win_length=800, and n_mel_channels=80. I also changed sawsingsub.yaml accordingly, setting block_size=200, and left the other settings unchanged.
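
For reference, the settings described above can be sanity-checked with a short sketch like the one below. It is only an illustration: the dictionary names and keys are hypothetical and do not reflect the repository's actual config schema, and it assumes that block_size is meant to equal hop_length, as the change described above implies.

```python
# Hypothetical stand-ins for the edited settings; the names and keys are
# illustrative only, not the real layout of preprocess.py or sawsingsub.yaml.
preprocess_cfg = {
    "sampling_rate": 16000,
    "hop_length": 200,
    "win_length": 800,
    "n_mel_channels": 80,
}
model_cfg = {
    "sampling_rate": 16000,
    "block_size": 200,
}


def check_consistency(pre, model):
    """Return a list of mismatches between preprocessing and model settings."""
    problems = []
    if pre["sampling_rate"] != model["sampling_rate"]:
        problems.append("sampling_rate differs between preprocessing and model")
    # Assumption: one mel frame should correspond to one synthesis block.
    if pre["hop_length"] != model["block_size"]:
        problems.append("block_size does not match hop_length")
    if pre["win_length"] < pre["hop_length"]:
        problems.append("win_length is shorter than hop_length")
    return problems


def samples_to_ms(n_samples, sampling_rate):
    """Convert a length in samples to milliseconds."""
    return 1000.0 * n_samples / sampling_rate


problems = check_consistency(preprocess_cfg, model_cfg)
hop_ms = samples_to_ms(preprocess_cfg["hop_length"], preprocess_cfg["sampling_rate"])
win_ms = samples_to_ms(preprocess_cfg["win_length"], preprocess_cfg["sampling_rate"])
# Highest frequency representable at this sample rate.
nyquist_hz = preprocess_cfg["sampling_rate"] // 2

print("problems:", problems)
print("hop: %.1f ms, window: %.1f ms, Nyquist: %d Hz" % (hop_ms, win_ms, nyquist_hz))
```

With these values the hop is 12.5 ms, the window is 50 ms, and the Nyquist frequency is 8000 Hz, so any frequency-range setting left at its default would need to stay at or below that limit.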

I have trained the model for nearly 1 million steps, but the quality of the generated waveform is far worse than that of HiFi-GAN at the same number of steps. Are there any parameter adjustments I have missed?

The blurred spectrogram of SawSingSub-generated speech
