ffprobe: add show_vt_info option #462

Draft
wants to merge 3 commits into
base: jellyfin

gnattu
Member

@gnattu gnattu commented Sep 20, 2024

Changes

This adds a new -show_vt_info option to ffprobe, available when VideoToolbox is configured at build time, which outputs decoder and encoder info as minified JSON. For example:

./ffprobe -hide_banner -show_vt_info
{"Decoders":["AV_CODEC_ID_H264","AV_CODEC_ID_HEVC","AV_CODEC_ID_VP9"],"Encoders":[{"Codec":"AV_CODEC_ID_H264","MaxWidth":4096,"Profiles":["FF_PROFILE_H264_BASELINE","FF_PROFILE_H264_CONSTRAINED_BASELINE","FF_PROFILE_H264_MAIN","FF_PROFILE_H264_HIGH","FF_PROFILE_H264_HIGH_422","FF_PROFILE_H264_HIGH_444_PREDICTIVE"],"SupportYuv444Encode":true,"MaxHeight":4096,"Support10bitEncode":true,"SupportHDREncode":true},{"Codec":"AV_CODEC_ID_HEVC","MaxWidth":8192,"Profiles":["FF_PROFILE_HEVC_MAIN","FF_PROFILE_HEVC_MAIN_10","FF_PROFILE_HEVC_MAIN_STILL_PICTURE","FF_PROFILE_HEVC_REXT"],"SupportYuv444Encode":true,"MaxHeight":8192,"Support10bitEncode":true,"SupportHDREncode":true},{"SupportHDREncode":false,"MaxWidth":8192,"Support10bitEncode":false,"MaxHeight":8192,"SupportYuv444Encode":false,"Codec":"AV_CODEC_ID_MJPEG"}]}

For decoders, only the names of codecs for which the system reports hardware acceleration are printed, with no extra information. This is because:

  • There isn't a straightforward API to fetch detailed information for decoders. You would need to emulate a real video input, including extra data like NAL units, to check if the decoding session can be created with software fallback disabled.

  • VideoToolbox gracefully handles software fallback internally, making such detailed checks redundant for our use cases.

For encoders, extra data such as the maximum supported dimensions is also printed, so that we can use it to pick encoders in the future.
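As a rough illustration of how a consumer might use the encoder fields to pick an encoder, here is a minimal Python sketch. The sample is an abbreviated copy of the example output above; the `pick_encoders` helper and its parameters are hypothetical and not part of this PR.

```python
import json

# Abbreviated copy of the example output above (some profiles omitted).
SAMPLE = """
{"Decoders": ["AV_CODEC_ID_H264", "AV_CODEC_ID_HEVC", "AV_CODEC_ID_VP9"],
 "Encoders": [
  {"Codec": "AV_CODEC_ID_H264", "MaxWidth": 4096, "MaxHeight": 4096,
   "SupportYuv444Encode": true, "Support10bitEncode": true, "SupportHDREncode": true},
  {"Codec": "AV_CODEC_ID_HEVC", "MaxWidth": 8192, "MaxHeight": 8192,
   "SupportYuv444Encode": true, "Support10bitEncode": true, "SupportHDREncode": true},
  {"Codec": "AV_CODEC_ID_MJPEG", "MaxWidth": 8192, "MaxHeight": 8192,
   "SupportYuv444Encode": false, "Support10bitEncode": false, "SupportHDREncode": false}]}
"""


def pick_encoders(info, width, height, need_10bit=False):
    """Return the codec names of encoders that can handle the requested frame.

    Hypothetical helper: filters on MaxWidth, MaxHeight and Support10bitEncode.
    """
    usable = []
    for enc in info.get("Encoders", []):
        # Skip encoders whose reported maximum dimensions are too small.
        if width > enc.get("MaxWidth", 0) or height > enc.get("MaxHeight", 0):
            continue
        # Skip encoders without 10-bit support when the caller requires it.
        if need_10bit and not enc.get("Support10bitEncode", False):
            continue
        usable.append(enc["Codec"])
    return usable


info = json.loads(SAMPLE)
print(pick_encoders(info, 7680, 4320, need_10bit=True))
```

With the sample above, an 8K 10-bit request leaves only the HEVC encoder, since H.264 is capped at 4096 and MJPEG reports no 10-bit support.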

I prefer adding this option to ffprobe rather than creating another standalone fftool. Creating an additional portable binary just for hardware info (which would be around 50-80MB) seems excessive to me.

We also need to standardize the JSON output format for hardware info across different platforms. The current draft for other hardware exposes too much unnecessary information and isn't very easy to work with.

Issues

@gnattu gnattu requested a review from a team September 20, 2024 06:06
Shadowghost
Shadowghost previously approved these changes Sep 20, 2024
@nyanmisaka nyanmisaka marked this pull request as draft September 20, 2024 07:05
@nyanmisaka
Member

Our lives would be much easier if Linux and Windows could also provide software fallbacks like Apple does, but unfortunately this is not going to happen.

There isn't a straightforward API to fetch detailed information for decoders. You would need to emulate a real video input, including extra data like NAL units, to check if the decoding session can be created with software fallback disabled.

We choose to trust the capabilities reported by the driver. But if it fails (which is likely to happen especially with Intel GPUs that lack firmware), I'd rather handle software fallback in Jellyfin.

I prefer adding this option to ffprobe rather than creating another standalone fftool. Creating an additional portable binary just for hardware info (which would be around 50-80MB) seems excessive to me.

I originally wanted to put the code in ffprobe as well, but I found that I didn't want to disturb ffprobe's existing logic. Consider the overhaul of ffmpeg.c in ffmpeg 7.0: almost no large patch to ffmpeg.c applies anymore.

  --disable-all            disable building components, libraries and programs

You can simply use --disable-all --enable-fftool... for this standalone fftool or, better, fix and build with --enable-shared.

We also need to standardize the JSON output format for hardware info across different platforms. The current draft for other hardware exposes too much unnecessary information and isn't very easy to work with.

I agree with unifying the common attributes of the JSON output, but some HWA/vendor-specific attributes are also needed, such as the driver version.

@nyanmisaka
Member

Another problem is what json writer to use. There is no NSJSONSerialization on platforms other than Apple, so we have to use the json utils provided by ffmpeg. But the data format must be the same on all platforms.

@gnattu
Member Author

gnattu commented Sep 20, 2024

There is no NSJSONSerialization on platforms other than Apple

VideoToolbox info is specific to Apple platforms, so it is fine to use NSJSONSerialization there, as it is easier to work with than the ffmpeg one. As long as the serialized JSON has the same schema, I think we should be fine.
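One way to keep outputs from different writers consistent would be a small schema check run against each platform's output. The sketch below is only an illustration: no schema has been agreed on in this thread, so the required fields are assumptions taken from the example output in the PR description, and `check_hw_info` is a hypothetical helper.

```python
def check_hw_info(doc):
    """Return a list of schema problems; an empty list means the document conforms.

    The expected shape (assumed, not an agreed schema):
    {"Decoders": [str, ...],
     "Encoders": [{"Codec": str, "MaxWidth": int, "MaxHeight": int, ...}, ...]}
    """
    if not isinstance(doc, dict):
        return ["top level must be a JSON object"]

    problems = []
    decoders = doc.get("Decoders")
    if not isinstance(decoders, list) or not all(isinstance(d, str) for d in decoders):
        problems.append("Decoders must be a list of codec names")

    encoders = doc.get("Encoders")
    if not isinstance(encoders, list):
        problems.append("Encoders must be a list")
        return problems

    for i, enc in enumerate(encoders):
        if not isinstance(enc, dict) or not isinstance(enc.get("Codec"), str):
            problems.append(f"Encoders[{i}] needs a string Codec field")
            continue
        for key in ("MaxWidth", "MaxHeight"):
            value = enc.get(key)
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, int):
                problems.append(f"Encoders[{i}].{key} must be an integer")
    return problems
```

Because the check only looks at the parsed document, it does not matter whether the JSON was produced by NSJSONSerialization or by ffmpeg's writer.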

I originally wanted to put the code in ffprobe as well, but I found that I didn't want to disturb ffprobe's existing logic.

The change to the original file is quite small though? You just push one more option into ffprobe.c and add a few includes; all other logic is handled in our own files, which makes conflicts easy to resolve. 44cd30b

You can simply use --disable-all --enable-fftool... for this standalone fftool or, better, fix and build with --enable-shared.

If we can disable everything else, couldn't we just not depend on ffmpeg at all and make a lightweight standalone tool ourselves? Fixing --enable-shared is not trivial due to ffmpeg's weird build system and clang's special linkers. The portable builds are expected to be fully static anyway, and Windows has historically been the only outlier. Is there anything ffmpeg provides, other than type enums, that has to be used here?

But if it fails (which is likely to happen especially with Intel GPUs that lack firmware), I'd rather handle software fallback in Jellyfin.

This requires a transcoder rewrite though. We catch the exit status and retry with pure software. Some problematic hardware does not even fail the pipeline but just outputs wrong pictures, and I think that will require admin action.
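The catch-and-retry idea can be sketched in a few lines. This is only an illustration in Python, not Jellyfin's transcoder code; `run_with_fallback` is a hypothetical helper, and the two commands are stand-ins for a hardware and a software ffmpeg invocation.

```python
import subprocess
import sys


def run_with_fallback(hw_cmd, sw_cmd):
    """Run hw_cmd; if it exits with a non-zero status, retry once with sw_cmd.

    Returns a (mode, returncode) tuple where mode is "hardware" or "software".
    """
    result = subprocess.run(hw_cmd)
    if result.returncode == 0:
        return "hardware", 0
    # The hardware pipeline failed: retry with the pure software command.
    result = subprocess.run(sw_cmd)
    return "software", result.returncode


# Stand-in commands: "failing" simulates a hardware pipeline that exits with 1.
failing = [sys.executable, "-c", "import sys; sys.exit(1)"]
passing = [sys.executable, "-c", "pass"]
print(run_with_fallback(failing, passing))
```

A wrapper like this only catches failures that surface as a non-zero exit status. Hardware that silently produces wrong pictures exits normally, so it would not trigger the retry.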

@Shadowghost Shadowghost dismissed their stale review September 20, 2024 08:07

Still needs some work

@nyanmisaka
Member

VideoToolbox info is specific to Apple platforms, so it is fine to use NSJSONSerialization there, as it is easier to work with than the ffmpeg one. As long as the serialized JSON has the same schema, I think we should be fine.

I guess you could mix NSJSON with ffmpeg's json writer. But if you don't use ffmpeg's writer at all, the output might be hard to handle, and you'd need to manually manage the sections to be consistent with other platforms. And the output is limited to json. ffprobe's default format is not supported.

The change to the original file is quite small though? You just push one more option into ffprobe.c and add a few includes; all other logic is handled in our own files, which makes conflicts easy to resolve. 44cd30b

It should be feasible to just add a few new options with OPT_EXIT flag to ffprobe.

If we can disable everything else, couldn't we just not depend on ffmpeg at all and make a lightweight standalone tool ourselves? Fixing --enable-shared is not trivial due to ffmpeg's weird build system and clang's special linkers. The portable builds are expected to be fully static anyway, and Windows has historically been the only outlier. Is there anything ffmpeg provides, other than type enums, that has to be used here?

Then it will be a problem to distribute this standalone tool downstream, for example in the Arch Linux repo, which builds from source. And there is no need to create a new package for this. In addition, some headers and compat (dynamically linked cuda/nvml) libs require ffmpeg/configure.

This requires a transcoder rewrite though. We catch the exit status and retry with pure software. Some problematic hardware does not even fail the pipeline but just outputs wrong pictures, and I think that will require admin action.

In the worst case the h26x_qsv decoder cannot find the sequence header in the bitstream, which causes it to keep trying the next packet and not exit gracefully.

@gnattu
Member Author

gnattu commented Sep 20, 2024

I guess you could mix NSJSON with ffmpeg's json writer.

It is not that easy, as the data types inside are completely different.

But if you don't use ffmpeg's writer at all, the output might be hard to handle, and you'd need to manually manage the sections to be consistent with other platforms.

The current sections are not suitable for VT-specific cases anyway. Consistency is easier to handle than you think once we have decided what the output JSON schema should look like.

And the output is limited to json. ffprobe's default format is not supported.

I think JSON-only output is okay for our use case. This is not a general CLI tool meant for humans to read, and JSON is human-friendly enough once prettified.
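For reference, prettifying the minified output takes only the standard library. A minimal sketch, using a shortened piece of the example output from the PR description:

```python
import json

# Shortened piece of the minified example output.
minified = '{"Decoders":["AV_CODEC_ID_H264","AV_CODEC_ID_HEVC","AV_CODEC_ID_VP9"]}'

# Re-serialize with indentation for human reading; the data is unchanged.
pretty = json.dumps(json.loads(minified), indent=2)
print(pretty)
```

On the command line, piping the output through `python3 -m json.tool` has the same effect.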

@gnattu
Member Author

gnattu commented Oct 1, 2024

It turns out that VT does have some private APIs that can get more info about the decoder and those APIs are used by Safari.

Those are not exposed directly in VideoToolbox.h, but they are callable by declaring an extern function signature.

For example: VTCopyHEVCDecoderCapabilitiesDictionary would return something like:

{
  "VTIsHDRAllowedOnDevice": true,
  "VTSupportedProfiles": [
    1,
    2,
    3,
    4
  ],
  "VTDoViIsHardwareAccelerated": true,
  "VTDoViSupportedLevels": [
    "01",
    "02",
    "03",
    "04",
    "05",
    "06",
    "07",
    "08",
    "09"
  ],
  "VTPerProfileSupport": {
    "1": {
      "VTMaxPlaybackLevel": 186,
      "VTIsHardwareAccelerated": true,
      "VTMaxDecodeLevel": 186
    },
    "2": {
      "VTMaxPlaybackLevel": 186,
      "VTIsHardwareAccelerated": true,
      "VTMaxDecodeLevel": 186
    },
    "3": {
      "VTMaxPlaybackLevel": 186,
      "VTIsHardwareAccelerated": true,
      "VTMaxDecodeLevel": 186
    },
    "4": {
      "VTMaxPlaybackLevel": 186,
      "VTIsHardwareAccelerated": true,
      "VTMaxDecodeLevel": 186
    }
  },
  "VTDoViSupportedProfiles": [
    "05",
    "08"
  ]
}

The profile keys here are the numeric HEVC general_profile_idc values, where 1 == Main, 2 == Main 10, 3 == Main Still Picture and 4 == RExt.
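To make the dictionary easier to read, the numeric keys and level values can be mapped to names. The sketch below assumes the 1 to 4 profile mapping given above and, as a further assumption not confirmed in this thread, that VTMaxDecodeLevel uses the HEVC general_level_idc encoding (30 times the level number, so 186 would be level 6.2). The helper names are hypothetical.

```python
# Profile mapping as described above (keys are numeric profile values).
HEVC_PROFILES = {1: "Main", 2: "Main 10", 3: "Main Still Picture", 4: "RExt"}


def hevc_level_name(level_idc):
    """Convert a general_level_idc style value (30 * level) to a string like "6.2"."""
    major, rest = divmod(level_idc, 30)
    return f"{major}.{rest // 3}"


def summarize(caps):
    """Map each profile key in VTPerProfileSupport to a readable summary."""
    summary = {}
    for key, entry in caps.get("VTPerProfileSupport", {}).items():
        name = HEVC_PROFILES.get(int(key), f"unknown ({key})")
        summary[name] = {
            "hardware": entry["VTIsHardwareAccelerated"],
            # Assumption: the level value is encoded like general_level_idc.
            "max_decode_level": hevc_level_name(entry["VTMaxDecodeLevel"]),
        }
    return summary


# Two entries copied from the dictionary shown above.
caps = {
    "VTPerProfileSupport": {
        "1": {"VTMaxPlaybackLevel": 186, "VTIsHardwareAccelerated": True, "VTMaxDecodeLevel": 186},
        "2": {"VTMaxPlaybackLevel": 186, "VTIsHardwareAccelerated": True, "VTMaxDecodeLevel": 186},
    }
}
print(summarize(caps))
```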

There is also a similar method for AV1: VTCopyAV1DecoderCapabilitiesDictionary, but that one returns null on M1 series.

I'm not very comfortable using private APIs like this, and we probably don't need this info because:

  • VT falls back to software gracefully
  • The DoVi-related capabilities are not fully exposed

@nyanmisaka
Member

I'm not very comfortable using private APIs like this, and we probably don't need this info because:

  • VT falls back to software gracefully
  • The DoVi-related capabilities are not fully exposed

Because of the software fallback, level info seems unnecessary. In addition, DoVi metadata is parsed in ffmpeg, and we only need VT to decode the HEVC bitstream/base layer.
