Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make ToxAV implementation pluggable #1369

Open
strfry opened this issue Feb 17, 2020 · 4 comments
Open

Make ToxAV implementation pluggable #1369

strfry opened this issue Feb 17, 2020 · 4 comments
Assignees
Labels
enhancement New feature for the user, not a new feature for build script P2 Medium priority toxav Audio/video
Milestone

Comments

@strfry
Copy link

strfry commented Feb 17, 2020

Motivation

I'm currently doing some experiments with making ToxAV compatible with the standard RTP/WebRTC stack.
From my impression, Tox initially followed this path (encapsulating RTP in Tox), as evidenced by the naming and header fields of RTPMessage, but then diverged and invented custom solutions for standard problems (for example: Inventing a "Large Frame" protocol with the data_length and offset fields instead of RFC 7741 VP8 Payloading).

The "batteries-included" approach of ToxAV, meaning that library users only deal with uncompressed buffers, makes it easy to build an AV-enabled Tox client and shields the application programmer from the nitty gritty details of video streaming, but also makes some features, like hardware accelerated video decoding with zero-copy display impossible.

As a result, Tox now maintains an inadequate, incompatible implementation of an AV streaming stack, an entirely out-of-scope endeavour for a project this size.
IMHO, Tox should redefine it's role as being a distributed, secure transport layer for a standards compliant RTP implementation.
In the future, TokTok might still provide a basic AV implementation, but focus on application-layer compatibility, and avoid sinking developer time in developing isolated solutions for problems that have already been solved by a multitude of other projects.

Besides the benefits of using existing implementations, a RTP-compliant ToxAV would enable many other applications, like bridging to WebRTC peers, video-on-demand delivery and more.

Migration Path

Since legacy compatibility is an important requirement for Tox, replacing the protocol overnight isn't an option. A migration path could look like this:

  1. Carving out the specification of current/"legacy" ToxAV protocol
  2. Write adapter code to convert ToxAV->RTP / RTP->ToxAV
  3. Use those converters within Toxcore, to convert packets between legacy peers and the new RTP based implementation

The feasability of 2) is currently researched/demonstrated within github.com/strfry/gotox

API Changes

Signalling

Signalling commands (Call, Answer, Hangup, etc.) are currently sent on a special comm channel, and internally handled by msi.c, where a callback-style interface is provided.
This seemed like a good spot to hook on, since it's exactly what toxav_new does, which i intend to replace.
In my prototype, i just expose this internal API by means of the FFI ( TokTok/go-toxcore-c@master...strfry:feature/msi ), so no direct action is needed, but i think it is debatable whether the API described in msi.h would be a good cutting point for a public API.

Packet ID filtering

The lossy_packet interface basically enables us to send and receive AV packets.
Only problem, it explicitly checks for AV related packet IDs, to multiplex between ToxAV and userspace.
This just affects a few lines, that need to handle this condition in a different way: master...strfry:feature/pluggable_rtp
Of course statically disabling the ToxAV codepath isn't an option because it would break existing AV clients. Another option would be to disable these checks when tox is built without AV support, but this isn't an ideal solution other.

Maybe this could be dependend on the previous allocation of a ToxAV object?
I'm not sure if there would be unexpected side-effects for existing ToxAV-clients, that might get confused by video packets coming in through the custom_lossy_packet interface?

Open Issues

The (Web)RTC stack uses the Session Description Protocol (SDP, RFC4566) to negotiate various details, such as codecs, protocol extension and ICE connection candidates.
SDP is a common point of criticism of the WebRTC spec, and was mostly accepted for easier compatibility with existing SIP networks.
It's complexity, and integral support for features that are not necessary in the context of ToxAV make it unfavorable for inclusion in Tox.
It is yet to be researched how the necessary options can be mapped to Tox capabilities, and how much can be skipped through higher base specifications within Tox. For example, we can assume support for the Opus audio codec, and don't need to negotiate about things like "PCMU/8000".

Another topic of research is mapping the various RTCP commands to existing comm channel commands for bandwidth regulation.

@sudden6
Copy link

sudden6 commented Mar 4, 2020

Motivation

I generally agree with your motivation, ToxAV tries to re-invent the wheel in several places and has numerous issues. Therefore it should be replaced rather sooner than later.

Migration path

I'm not sure if you know about this project: https://github.com/toxext/toxext/

Right now it's only the basis for a PoC of a new messaging format, but it might save you some headaches when trying to keep backwards compatibility.

I'd propose to implement whatever idea you have in mind with this as an extension and once it's sufficiently widespread remove the original ToxAV.

Open Issues

It's complexity, and integral support for features that are not necessary in the context of ToxAV make it unfavorable for inclusion in Tox.

I agree here and I want to add some more reasons why such a protocol might be bad.
IMO the beauty of ToxAV at the moment is that it's only one supported video codec that is patent free and can be used anywhere. No complexity needed in selecting anything. Additionally I don't want to step into the minefield that is h264 or h265. I would be ok though if VPx was required to be supported and e.g. h264 optional.

Some more things you might want to think about:

  • Co-existence with large file transfers
  • Threading, currently all encoding/decoding is done in one thread. This doesn't scale with higher resolutions or multiple parallel calls or groups. Ideally we'd use as many threads as possible, with a run-time setable limit by the client

@sudden6 sudden6 added enhancement New feature for the user, not a new feature for build script toxav Audio/video labels Mar 4, 2020
@sudden6
Copy link

sudden6 commented Mar 10, 2020

@strfry you might also find some good ideas here: #1382 (comment)

It's not 100% fleshed out, but tell me what you think anyway.

@iphydf iphydf added this to the v0.2.x milestone Apr 24, 2020
@iphydf iphydf added P3 Low priority P1 High priority and removed P3 Low priority labels Apr 27, 2020
@iphydf
Copy link
Member

iphydf commented May 1, 2020

@strfry what would you need from toxcore's side to make this happen?

@iphydf iphydf modified the milestones: v0.2.x, v0.3.0 Feb 4, 2022
@iphydf iphydf added P2 Medium priority and removed P1 High priority labels Feb 4, 2022
@zoff99
Copy link

zoff99 commented Dec 20, 2023

also when this is merged #1431 you will be able to do this (mostly)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature for the user, not a new feature for build script P2 Medium priority toxav Audio/video
Projects
None yet
Development

No branches or pull requests

4 participants