Use raw format for `AzureTTSService` output #519

amacapri · 2024-09-27T16:49:39Z

In the AzureTTSService class, there is a comment that states:

# Azure always sends a 44-byte header. Strip it off.
yield AudioRawFrame(
    audio=result.audio_data[44:],
    sample_rate=self._sample_rate,
    num_channels=1,
)

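I believe this happens because the default `speech_synthesis_output_format` is in use, which is most likely a WAV (RIFF) format; 44 bytes is the size of a canonical PCM WAV header. Consider requesting one of the "raw" formats instead, such as `SpeechSynthesisOutputFormat.Raw16Khz16BitMonoPcm`, so the service returns bare PCM and the hard-coded `[44:]` slice is no longer needed. The raw format chosen should presumably match `self._sample_rate`.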
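To illustrate the suggestion, here is a rough sketch rather than a patch for the actual service. The first two helpers use only the Python standard library to show that a canonical 16-bit mono PCM WAV container adds exactly 44 bytes in front of the samples, and how a header could be detected instead of blindly sliced. `make_wav`, `strip_wav_header`, and `make_raw_speech_config` are hypothetical names of mine, not part of the project. The last helper assumes the Azure Speech SDK for Python (`azure-cognitiveservices-speech`) is installed, and it is never called here; how `AzureTTSService` builds its config internally is not shown in this issue, so treat the wiring as an assumption.

```python
import io
import wave


def make_wav(pcm: bytes, sample_rate: int = 16000) -> bytes:
    """Wrap raw 16-bit mono PCM in a WAV (RIFF) container (stdlib only)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)      # mono
        w.setsampwidth(2)      # 16-bit samples
        w.setframerate(sample_rate)
        w.writeframes(pcm)
    return buf.getvalue()


def strip_wav_header(audio: bytes) -> bytes:
    """Hypothetical helper: return the PCM payload of a WAV blob.

    Input that does not start with the RIFF magic is assumed to be raw
    PCM already and is returned unchanged. Assumes a well-formed header.
    """
    if audio[:4] != b"RIFF":
        return audio
    with wave.open(io.BytesIO(audio), "rb") as w:
        return w.readframes(w.getnframes())


def make_raw_speech_config(key: str, region: str):
    """Hypothetical helper: build a SpeechConfig that requests raw PCM.

    Needs the Azure Speech SDK; the import is local so the rest of this
    sketch runs without it.
    """
    import azure.cognitiveservices.speech as speechsdk

    config = speechsdk.SpeechConfig(subscription=key, region=region)
    config.set_speech_synthesis_output_format(
        speechsdk.SpeechSynthesisOutputFormat.Raw16Khz16BitMonoPcm
    )
    return config


# 160 dummy 16-bit samples standing in for synthesized audio.
pcm = b"\x01\x02" * 160
wav = make_wav(pcm)
header_len = len(wav) - len(pcm)
print(header_len)  # 44
```

With a raw output format configured, `result.audio_data` should contain only samples, so the frame could be built from it directly without any slicing.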

amacapri changed the title ~~Use raw format for AzureTTS output~~ Use raw format for AzureTTSService output Sep 27, 2024
