Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Use Opus in the CBR mode #2757

Merged
merged 1 commit into from
Sep 21, 2024
Merged

fix: Use Opus in the CBR mode #2757

merged 1 commit into from
Sep 21, 2024

Conversation

nurupo
Copy link
Member

@nurupo nurupo commented May 11, 2024

VBR is susceptible to a transcription attack, where words can be
deducted from bandwidth fluctuations, even despite the audio being
encrypted. Toxcore does add padding, but it's just 0-7 bytes, to pad to
a 8 byte boundary, which might not be enough. CBR is safe from this
attack, it is the industry recommendation to use CBR: "Applications
conveying highly sensitive unstructured information SHOULD NOT use
codecs in VBR mode."[1], and is what other secure messengers use too,
e.g. Signal.

Here are some papers on this topic:

  • A. M. White, A. R. Matthews, K. Z. Snow and F. Monrose, "Phonotactic
    Reconstruction of Encrypted VoIP Conversations: Hookt on Fon-iks,"
    2011 IEEE Symposium on Security and Privacy, Oakland, CA, USA, 2011,
    pp. 3-18, doi: 10.1109/SP.2011.34.
  • L. A. Khan, M. S. Baig, and Amr M. Youssef. Speaker recognition
    from encrypted VoIP communications. Digit. Investig. 7, 1–2 (October,
    2010), 65–73. https://doi.org/10.1016/j.diin.2009.10.001
  • C. V. Wright, L. Ballard, S. E. Coull, F. Monrose and G. M. Masson,
    "Spot Me if You Can: Uncovering Spoken Phrases in Encrypted VoIP
    Conversations," 2008 IEEE Symposium on Security and Privacy (sp 2008),
    Oakland, CA, USA, 2008, pp. 35-49, doi: 10.1109/SP.2008.21.

Thanks to an IRC user who asked to remain anonymous for sending the
diff.

[1] https://datatracker.ietf.org/doc/html/rfc6562#section-3


This change is Reviewable

@nurupo nurupo added toxav Audio/video security Security labels May 11, 2024
@nurupo nurupo added this to the v0.2.20 milestone May 11, 2024
@nurupo
Copy link
Member Author

nurupo commented May 11, 2024

@iphydf cimple doesn't recognize OPUS_SET_VBR macro

c-toxcore/toxav/audio.c:388: definition of create_audio_encoder references undefined global function/constant OPUS_SET_VBR [-Wcallgraph]

It's defined in opus_defines.h, which is included by opus.h, included by audio.h, included by audio.c.

@nurupo nurupo marked this pull request as ready for review May 11, 2024 03:16
@zoff99
Copy link

zoff99 commented May 11, 2024

is this fully working?

#define OPUS_SET_VBR
Warning:
Only the MDCT mode of Opus can provide hard CBR behavior.

#define OPUS_SET_VBR_CONSTRAINT
Warning:
Only the MDCT mode of Opus currently heeds the constraint. Speech mode ignores it completely, hybrid mode may fail to obey it if the LPC layer uses more bitrate than the constraint would have permitted.

@nurupo
Copy link
Member Author

nurupo commented May 12, 2024

@zoff99 Are you saying that it's not fully working for you? How so?

Also, what are quoting? Can you link to the source of what are you quoting? Is that an output you get when compiling toxav? Perhaps an output of one of our CIs? Or are you just quoting the documentation?


#define OPUS_SET_VBR
Warning:
Only the MDCT mode of Opus can provide hard CBR behavior.

I saw this warning when reading earlier versions of the Opus documentation, e.g. in Opus 1.0.3 released on Jul 11, 2013. But later versions removed that warning, e.g. it seems to be gone in Opus 1.1.3 released on Jul 15, 2016, and the later versions. So it appears that this was resolved in Opus long time ago and that warning no longer applies.

Am I wrong? Do you think we need to enable the MDCT mode?


#define OPUS_SET_VBR_CONSTRAINT
Warning:
Only the MDCT mode of Opus currently heeds the constraint. Speech mode ignores it completely, hybrid mode may fail to obey it if the LPC layer uses more bitrate than the constraint would have permitted.

Could you clarify on how is OPUS_SET_VBR_CONSTRAINT relevant here?

The documentation says:

This setting is ignored when the encoder is in CBR mode.

so it appears that this setting does nothing when using the CBR mode, which is what this PR makes toxav use.

Am I missing something?

@zoff99
Copy link

zoff99 commented May 12, 2024

i activated CBR mode for opus some years ago in my fork, and then switched back to VBR again. can't remember the reason right now.

if we are switching to CBR mode i would want that $someone tests with the 2 (3) still maintained clients in some real world audio calls.
audio calls in UDP mode, audio calls in TCP mode, audio calls in TCP mode over TOR.
if that works without any issues (compared the what we have now) we can merge and i will also do that in my fork.

@nurupo
Copy link
Member Author

nurupo commented May 17, 2024

I'm unable to test it at the moment. Did you get any chance to test this, @zoff99?

There is a person on IRC claiming that it works for them using toxic and qtox clients, over localhost udp, tcp and tcp over tor. Tough given they are the same person who authored the PR diff, I'd prefer if someone else tested it too.

@zoff99
Copy link

zoff99 commented May 18, 2024

i will hopefully get the chance to test it in the next weeks

zoff99 added a commit to zoff99/c-toxcore that referenced this pull request May 24, 2024
setting VBR to false, and use CBR.
for details see: TokTok#2757
@zoff99
Copy link

zoff99 commented May 24, 2024

first test completed: tox_videoplayer -> toxblinkenwall (audio quality test with https://yewtu.be/watch?v=o8dEgTMZ81g)
no issues in quality with CBR mode on LAN

@zoff99
Copy link

zoff99 commented May 25, 2024

2nd test completed: making sure it actually has the wanted effect in the context and setup of toxav (caveat: using my fork)
(tested with tox_videplayer and speech)

with VBR:

===> opus bytes: 8
===> opus bytes: 8
===> opus bytes: 8
===> opus bytes: 8
===> opus bytes: 8
===> opus bytes: 8
===> opus bytes: 8
===> opus bytes: 8
===> opus bytes: 8
===> opus bytes: 629
===> opus bytes: 985
===> opus bytes: 962
===> opus bytes: 962
===> opus bytes: 962
===> opus bytes: 962
===> opus bytes: 956
===> opus bytes: 876
===> opus bytes: 874
===> opus bytes: 854
===> opus bytes: 950
===> opus bytes: 962
===> opus bytes: 962

with CBR:

===> opus bytes: 960
===> opus bytes: 960
===> opus bytes: 960
===> opus bytes: 960
===> opus bytes: 960
===> opus bytes: 960
===> opus bytes: 960
===> opus bytes: 960
===> opus bytes: 960
===> opus bytes: 960
===> opus bytes: 960
===> opus bytes: 960
===> opus bytes: 960
===> opus bytes: 960
===> opus bytes: 960
===> opus bytes: 960
===> opus bytes: 960
===> opus bytes: 960
===> opus bytes: 960
===> opus bytes: 960
===> opus bytes: 960
===> opus bytes: 960
===> opus bytes: 960
===> opus bytes: 960
===> opus bytes: 960
===> opus bytes: 960
===> opus bytes: 960
===> opus bytes: 960
===> opus bytes: 960

seems it does work properly.
but it does waste a lot of bandwidth especially if only 1 person is talking at a time in a call.

@zoff99
Copy link

zoff99 commented Jun 11, 2024

@JFreegman did anybody test this with toxic audio and video calls?

@Green-Sky
Copy link
Member

Green-Sky commented Jul 1, 2024

depends on TokTok/hs-tokstyle#254
edit: merged

@nurupo
Copy link
Member Author

nurupo commented Jul 6, 2024

The cimple CI check is still failing.

@Green-Sky
Copy link
Member

The cimple CI check is still failing.

yea, something still needs updating.

@Green-Sky
Copy link
Member

The cimple CI check is still failing.

yea, something still needs updating.

@nurupo is it possible we need to update cimple in the toktok-stack repo too?

@Green-Sky
Copy link
Member

Dependabot updates are paused

We noticed you haven't used Dependabot in a while, so we've paused automated Dependabot updates for this repository. To resume, simply interact with Dependabot.

lets see if it automatically creates a pr after I interacted with it.

@Green-Sky
Copy link
Member

Errored with the message "Dependabot cannot open any more pull requests"

@Green-Sky
Copy link
Member

toktok-stack docker images should be up-to-date now, @nurupo can you try to force push again?

@nurupo
Copy link
Member Author

nurupo commented Jul 19, 2024

The CI is passing now. Only a review approval is left.

Copy link
Member

@Green-Sky Green-Sky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code looks good. I compared it to other opus voip applications, and they do is exactly the same.

@zoff99
Copy link

zoff99 commented Jul 20, 2024

still missing the tests for toxic like me and nurupo like to see. did it get tested?

@Green-Sky
Copy link
Member

Green-Sky commented Sep 21, 2024

I just performed a test with 3 clients.
qtox and toxic where from the package repo, running the last toxcore release (unpatched)
tomato with latest mater + CBR patch

I routed the audio from qtox(mic) to tomato, and from tomato to toxic(headphones).
At tomato the opus stream was converted to pcm and back to opus using the normal toxav api.
So it was VBR(send) -> CBR(receive) ; CBR(send) -> VBR(receive) .

No audible difference was detected.

edit: the routing in tomato, I am still working on the ui.
image

VBR is susceptible to a transcription attack, where words can be
deducted from bandwidth fluctuations, even despite the audio being
encrypted. Toxcore does add padding, but it's just 0-7 bytes, to pad to
a 8 byte boundary, which might not be enough. CBR is safe from this
attack, it is the industry recommendation to use CBR: "Applications
conveying highly sensitive unstructured information SHOULD NOT use
codecs in VBR mode."[1], and is what other secure messengers use too,
e.g. Signal.

Here are some papers on this topic:
- A. M. White, A. R. Matthews, K. Z. Snow and F. Monrose, "Phonotactic
  Reconstruction of Encrypted VoIP Conversations: Hookt on Fon-iks,"
  2011 IEEE Symposium on Security and Privacy, Oakland, CA, USA, 2011,
  pp. 3-18, doi: 10.1109/SP.2011.34.
- L. A. Khan, M. S. Baig, and Amr M. Youssef. Speaker recognition
  from encrypted VoIP communications. Digit. Investig. 7, 1–2 (October,
  2010), 65–73. https://doi.org/10.1016/j.diin.2009.10.001
- C. V. Wright, L. Ballard, S. E. Coull, F. Monrose and G. M. Masson,
  "Spot Me if You Can: Uncovering Spoken Phrases in Encrypted VoIP
  Conversations," 2008 IEEE Symposium on Security and Privacy (sp 2008),
  Oakland, CA, USA, 2008, pp. 35-49, doi: 10.1109/SP.2008.21.

Thanks to an IRC user who asked to remain anonymous for sending the
diff.

[1] https://datatracker.ietf.org/doc/html/rfc6562#section-3
@Green-Sky
Copy link
Member

For some reason Cirrus CI / bazel-dbg does not exit after a successful run and times out.

@nurupo nurupo merged commit 03e9fbf into TokTok:master Sep 21, 2024
61 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
security Security toxav Audio/video
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants