[MMS-TTS] How to mark the tone of each character? Can't find any documentation #5520

Open
shuxiang opened this issue Jul 5, 2024 · 0 comments

Questions

I have already searched the issues and the docs, but couldn't find anything on this.

What is your question?

Some languages spoken in China and Vietnam are tonal: each character carries a tone, and the same syllable pronounced with a different tone has a different meaning.

For example, Hakka Chinese has six tones. I use https://hf-mirror.com/facebook/mms-tts-cnh to generate a WAV file, but I don't know how to specify the tones in my text, and I can't find any documentation about tones in MMS-TTS.

The code below generates a WAV file, but the tones sound wrong, and I don't know how to change them.

Code

from IPython.display import Audio
from transformers import VitsModel, AutoTokenizer
import torch

# Load the MMS-TTS Hakka checkpoint and its tokenizer
model = VitsModel.from_pretrained("facebook/mms-tts-hak")
tokenizer = AutoTokenizer.from_pretrained("facebook/mms-tts-hak")

text = "ngi shit fan liau mo"  # Hakka text, romanized without tone marks
inputs = tokenizer(text, return_tensors="pt")

# Synthesize the waveform without tracking gradients
with torch.no_grad():
    output = model(**inputs).waveform

# Audio encodes the samples as WAV bytes, which are then written to disk
au = Audio(output, rate=model.config.sampling_rate)
with open('test_hak.wav', 'wb') as f:
    f.write(au.data)
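
One way to narrow this down might be to check which characters of the input are actually present in the tokenizer's vocabulary, since a tone mark that is not in the vocabulary cannot influence the output. The sketch below is only an illustration: `unsupported_chars` is a hypothetical helper, `sample_vocab` is a made-up vocabulary, and the digit tone marks in `toned` are placeholders rather than a verified Hakka romanization.

```python
def unsupported_chars(text, vocab):
    """Return the characters of `text` that are missing from `vocab`,
    in order of first appearance."""
    missing = []
    for ch in text:
        if ch not in vocab and ch not in missing:
            missing.append(ch)
    return missing


# Made-up vocabulary for illustration only: a space plus lowercase ASCII letters.
# The real checkpoint's vocabulary may differ.
sample_vocab = {ch: i for i, ch in enumerate(" abcdefghijklmnopqrstuvwxyz")}

plain = "ngi shit fan liau mo"
toned = "ngi2 shit8 fan5 liau2 mo2"  # placeholder tone digits, not verified

print(unsupported_chars(plain, sample_vocab))  # []
print(unsupported_chars(toned, sample_vocab))  # ['2', '8', '5']
```

With the real model, the vocabulary could be taken from `tokenizer.get_vocab()` instead of `sample_vocab`. If the tone marks I try are reported as missing, that would suggest (though I have not confirmed it) that the checkpoint was trained on text without tone marks and that tones cannot be set through the input text.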