ffprobe: add show_vt_info option #462

gnattu · 2024-09-20T06:06:04Z

Changes

This adds a new option, -show_vt_info, to ffprobe when VideoToolbox is configured at build time. It outputs decoder and encoder info as minified JSON. For example:

./ffprobe -hide_banner -show_vt_info
{"Decoders":["AV_CODEC_ID_H264","AV_CODEC_ID_HEVC","AV_CODEC_ID_VP9"],"Encoders":[{"Codec":"AV_CODEC_ID_H264","MaxWidth":4096,"Profiles":["FF_PROFILE_H264_BASELINE","FF_PROFILE_H264_CONSTRAINED_BASELINE","FF_PROFILE_H264_MAIN","FF_PROFILE_H264_HIGH","FF_PROFILE_H264_HIGH_422","FF_PROFILE_H264_HIGH_444_PREDICTIVE"],"SupportYuv444Encode":true,"MaxHeight":4096,"Support10bitEncode":true,"SupportHDREncode":true},{"Codec":"AV_CODEC_ID_HEVC","MaxWidth":8192,"Profiles":["FF_PROFILE_HEVC_MAIN","FF_PROFILE_HEVC_MAIN_10","FF_PROFILE_HEVC_MAIN_STILL_PICTURE","FF_PROFILE_HEVC_REXT"],"SupportYuv444Encode":true,"MaxHeight":8192,"Support10bitEncode":true,"SupportHDREncode":true},{"SupportHDREncode":false,"MaxWidth":8192,"Support10bitEncode":false,"MaxHeight":8192,"SupportYuv444Encode":false,"Codec":"AV_CODEC_ID_MJPEG"}]}

For decoders, only the names of codecs for which the system reports hardware acceleration are printed, with no extra information. This is because:

There isn't a straightforward API to fetch detailed information for decoders. You would need to emulate a real video input, including extra data like NAL units, to check if the decoding session can be created with software fallback disabled.
VideoToolbox gracefully handles software fallback internally, making such detailed checks redundant for our use cases.

For encoders, extra data such as the maximum supported dimensions is also printed so that we can pick encoders in the future.
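As a rough illustration of how a consumer might use these encoder fields, here is a minimal Python sketch. The helper name `usable_encoders` and its parameters are hypothetical and not part of this PR, and the sample is a trimmed copy of the output above with the Profiles arrays omitted.

```python
import json

# Trimmed copy of the sample -show_vt_info output (Profiles arrays omitted).
SAMPLE = (
    '{"Decoders":["AV_CODEC_ID_H264","AV_CODEC_ID_HEVC","AV_CODEC_ID_VP9"],'
    '"Encoders":['
    '{"Codec":"AV_CODEC_ID_H264","MaxWidth":4096,"MaxHeight":4096,'
    '"Support10bitEncode":true,"SupportHDREncode":true,"SupportYuv444Encode":true},'
    '{"Codec":"AV_CODEC_ID_HEVC","MaxWidth":8192,"MaxHeight":8192,'
    '"Support10bitEncode":true,"SupportHDREncode":true,"SupportYuv444Encode":true},'
    '{"Codec":"AV_CODEC_ID_MJPEG","MaxWidth":8192,"MaxHeight":8192,'
    '"Support10bitEncode":false,"SupportHDREncode":false,"SupportYuv444Encode":false}'
    ']}'
)


def usable_encoders(info, width, height, need_10bit=False):
    """Return codec names whose reported limits cover the requested frame.

    Hypothetical helper: the selection policy here is an assumption,
    not something the PR defines.
    """
    result = []
    for enc in info.get("Encoders", []):
        # Skip encoders whose maximum dimensions are too small.
        if width > enc.get("MaxWidth", 0) or height > enc.get("MaxHeight", 0):
            continue
        # Skip encoders without 10-bit support when the caller needs it.
        if need_10bit and not enc.get("Support10bitEncode", False):
            continue
        result.append(enc["Codec"])
    return result


info = json.loads(SAMPLE)
print(usable_encoders(info, 7680, 4320))
print(usable_encoders(info, 3840, 2160, need_10bit=True))
```

With the sample above, an 8K request excludes H.264 (capped at 4096x4096), and a 10-bit request excludes MJPEG.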

I prefer adding this option to ffprobe rather than creating another standalone fftool. Creating an additional portable binary just for hardware info (which would be around 50-80MB) seems excessive to me.

We also need to standardize the JSON output format for hardware info across different platforms. The current draft for other hardware exposes too much unnecessary information and isn't very easy to work with.

Issues

nyanmisaka · 2024-09-20T07:06:00Z

Our lives would be much easier if Linux and Windows could also provide software fallbacks like Apple does, but unfortunately this is not going to happen.

There isn't a straightforward API to fetch detailed information for decoders. You would need to emulate a real video input, including extra data like NAL units, to check if the decoding session can be created with software fallback disabled.

We choose to trust the capabilities reported by the driver. But if it fails (which is likely to happen especially with Intel GPUs that lack firmware), I'd rather handle software fallback in Jellyfin.

I prefer adding this option to ffprobe rather than creating another standalone fftool. Creating an additional portable binary just for hardware info (which would be around 50-80MB) seems excessive to me.

I originally wanted to put the code in ffprobe as well, but I found that I didn't want to disturb ffprobe's existing logic. And just like the overhaul of ffmpeg.c in ffmpeg 7.0, almost any large patches to ffmpeg.c no longer apply.

  --disable-all            disable building components, libraries and programs

You can simply use --disable-all --enable-fftool... for this standalone fftool, or better, fix and build with --enable-shared.

We also need to standardize the JSON output format for hardware info across different platforms. The current draft for other hardware exposes too much unnecessary information and isn't very easy to work with.

I agree with unifying the common attributes of the JSON output, but some HWA/vendor-specific attributes are also needed, such as the driver version.

nyanmisaka · 2024-09-20T07:14:32Z

Another problem is what json writer to use. There is no NSJSONSerialization on platforms other than Apple, so we have to use the json utils provided by ffmpeg. But the data format must be the same on all platforms.

gnattu · 2024-09-20T07:24:18Z

There is no NSJSONSerialization on platforms other than Apple

VideoToolbox info is specific to Apple platforms, so it is fine to use NSJSONSerialization there, as it is easier to work with than the ffmpeg one. As long as the serialized JSON has the same schema, I think we should be fine.

I originally wanted to put the code in ffprobe as well, but I found that I didn't want to disturb ffprobe's existing logic.

The change to the original file is quite small though? You just push one more option into ffprobe.c and add a few includes; all other logic is handled in our own files, which makes conflicts easy to resolve. 44cd30b

You can simply use --disable-all --enable-fftool... for this standalone fftool, or better, fix and build with --enable-shared.

If we can disable everything else, we could just not depend on ffmpeg at all and make a lightweight standalone tool ourselves though? Fixing --enable-shared is not trivial due to ffmpeg's weird build system and clang's special linkers. The portable builds are expected to be fully static though, and Windows was historically the only outlier. Is there anything that ffmpeg provides, other than type enums, that has to be used here?

But if it fails (which is likely to happen especially with Intel GPUs that lack firmware), I'd rather handle software fallback in Jellyfin.

This requires a transcoder rewrite though. We catch the exit status and retry with pure software. Some problematic hardware does not even fail the pipeline but just outputs wrong pictures, and I think that will require admin action.
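The "catch the exit status and retry with pure software" idea can be sketched roughly as follows. This is only an illustration in Python, not Jellyfin's actual transcoder logic: `run_with_fallback` is a hypothetical helper, and the two commands are stand-ins that the caller would replace with real hardware and software transcoder invocations.

```python
import subprocess
import sys


def run_with_fallback(hw_cmd, sw_cmd):
    """Run hw_cmd; if it exits non-zero, retry once with sw_cmd.

    Both arguments are argv lists supplied by the caller.
    Returns a tuple (path_used, exit_code).
    """
    hw = subprocess.run(hw_cmd)
    if hw.returncode == 0:
        return "hardware", 0
    # Hardware path failed: retry with the pure software command.
    sw = subprocess.run(sw_cmd)
    return "software", sw.returncode


# Stand-in commands: a process that exits with status 1 and one that succeeds.
failing = [sys.executable, "-c", "import sys; sys.exit(1)"]
passing = [sys.executable, "-c", "pass"]

print(run_with_fallback(failing, passing))
```

A check like this only sees the exit status, so it cannot detect the case mentioned above where the hardware path succeeds but produces wrong pictures.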

Still needs some work

nyanmisaka · 2024-09-20T08:21:56Z

VideoToolbox info is specific to Apple platforms, so it is fine to use NSJSONSerialization there, as it is easier to work with than the ffmpeg one. As long as the serialized JSON has the same schema, I think we should be fine.

I guess you could mix NSJSON with ffmpeg's json writer. But if you don't use ffmpeg's writer at all, the output might be hard to handle, and you'd need to manually manage the sections to be consistent with other platforms. And the output is limited to json. ffprobe's default format is not supported.

The change to the original file is quite small though? You just push one more option into ffprobe.c and add a few includes; all other logic is handled in our own files, which makes conflicts easy to resolve. 44cd30b

It should be feasible to just add a few new options with OPT_EXIT flag to ffprobe.

If we can disable everything else, we could just not depend on ffmpeg at all and make a lightweight standalone tool ourselves though? Fixing --enable-shared is not trivial due to ffmpeg's weird build system and clang's special linkers. The portable builds are expected to be fully static though, and Windows was historically the only outlier. Is there anything that ffmpeg provides, other than type enums, that has to be used here?

Then distributing this standalone tool downstream becomes a problem, for example in the Arch Linux repo, which builds from source. And there is no need to create a new package for this. In addition, some headers and compat (dynamically linked cuda/nvml) libs require ffmpeg/configure.

This requires a transcoder rewrite though. We catch the exit status and retry with pure software. Some problematic hardware does not even fail the pipeline but just outputs wrong pictures, and I think that will require admin action.

In the worst case the h26x_qsv decoder cannot find the sequence header in the bitstream, which causes it to keep trying the next packet and not exit gracefully.

gnattu · 2024-09-20T08:34:13Z

I guess you could mix NSJSON with ffmpeg's json writer.

It is not that easy, as the data types inside are completely different.

But if you don't use ffmpeg's writer at all, the output might be hard to handle, and you'd need to manually manage the sections to be consistent with other platforms.

The current sections are not suitable for VT-specific cases anyway. Consistency is easier to handle than you think once we have decided what the output JSON schema should look like.

And the output is limited to json. ffprobe's default format is not supported.

I think JSON-only output is okay for our use case; this is not a general CLI tool meant for humans to read, and JSON is human-friendly enough once prettified.
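For reference, prettifying the minified output takes very little. The sketch below uses Python's standard json module on a shortened, made-up fragment in the same shape as the sample output; it is not output captured from a real device.

```python
import json

# Shortened, illustrative fragment in the same shape as the sample output.
minified = (
    '{"Decoders":["AV_CODEC_ID_H264"],'
    '"Encoders":[{"Codec":"AV_CODEC_ID_MJPEG","MaxWidth":8192,"MaxHeight":8192}]}'
)

# Parse and re-serialize with indentation for human reading.
pretty = json.dumps(json.loads(minified), indent=2)
print(pretty)
```

On the command line, piping the ffprobe output through `python3 -m json.tool` should give a similar result.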

gnattu · 2024-10-01T00:11:57Z

It turns out that VT does have some private APIs that can get more info about the decoder and those APIs are used by Safari.

Those are not exposed directly in VideoToolbox.h, but they are callable by declaring an extern function signature.

For example: VTCopyHEVCDecoderCapabilitiesDictionary would return something like:

{
  "VTIsHDRAllowedOnDevice": true,
  "VTSupportedProfiles": [
    1,
    2,
    3,
    4
  ],
  "VTDoViIsHardwareAccelerated": true,
  "VTDoViSupportedLevels": [
    "01",
    "02",
    "03",
    "04",
    "05",
    "06",
    "07",
    "08",
    "09"
  ],
  "VTPerProfileSupport": {
    "1": {
      "VTMaxPlaybackLevel": 186,
      "VTIsHardwareAccelerated": true,
      "VTMaxDecodeLevel": 186
    },
    "2": {
      "VTMaxPlaybackLevel": 186,
      "VTIsHardwareAccelerated": true,
      "VTMaxDecodeLevel": 186
    },
    "3": {
      "VTMaxPlaybackLevel": 186,
      "VTIsHardwareAccelerated": true,
      "VTMaxDecodeLevel": 186
    },
    "4": {
      "VTMaxPlaybackLevel": 186,
      "VTIsHardwareAccelerated": true,
      "VTMaxDecodeLevel": 186
    }
  },
  "VTDoViSupportedProfiles": [
    "05",
    "08"
  ]
}

The profile here is the numeric value of the HEVC general_profile_idc, where 1 == Main, 2 == Main10, 3 == MainStill and 4 == RExt.
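To make the dictionary easier to read, here is a small Python sketch that maps the profile numbers to names and converts the level values. It rests on an assumption not confirmed in this thread: that VTMaxDecodeLevel uses the HEVC general_level_idc convention (30 times the level number), under which 186 would mean level 6.2. The `summarize` helper is hypothetical, and the sample is a trimmed copy of the dictionary above.

```python
# Profile numbers as listed in the comment above.
HEVC_PROFILE_NAMES = {1: "Main", 2: "Main 10", 3: "Main Still Picture", 4: "RExt"}


def level_from_idc(level_idc):
    # HEVC defines general_level_idc as 30 times the level number.
    # ASSUMPTION: VTMaxDecodeLevel follows the same convention.
    return level_idc / 30


def summarize(caps):
    """Return (profile name, hardware accelerated, max decode level) per profile."""
    rows = []
    per_profile = caps.get("VTPerProfileSupport", {})
    # Keys are strings such as "1"; sort them numerically.
    for key in sorted(per_profile, key=int):
        entry = per_profile[key]
        name = HEVC_PROFILE_NAMES.get(int(key), "unknown (%s)" % key)
        rows.append((
            name,
            entry.get("VTIsHardwareAccelerated", False),
            level_from_idc(entry.get("VTMaxDecodeLevel", 0)),
        ))
    return rows


# Trimmed copy of the dictionary shown above (profiles 1 and 2 only).
caps = {
    "VTPerProfileSupport": {
        "1": {"VTMaxPlaybackLevel": 186, "VTIsHardwareAccelerated": True, "VTMaxDecodeLevel": 186},
        "2": {"VTMaxPlaybackLevel": 186, "VTIsHardwareAccelerated": True, "VTMaxDecodeLevel": 186},
    }
}

for row in summarize(caps):
    print(row)
```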

There is also a similar method for AV1: VTCopyAV1DecoderCapabilitiesDictionary, but that one returns null on M1 series.

I'm not very confident about using private APIs like this, and we probably don't need such info because:

VT falls back to software gracefully
The DoVi-related capabilities are not fully exposed

nyanmisaka · 2024-10-01T04:44:03Z

I'm not very confident about using private APIs like this, and we probably don't need such info because:

VT falls back to software gracefully

The DoVi-related capabilities are not fully exposed

Due to the existence of software fallback, Level info seems unnecessary. In addition, DoVi metadata is parsed in ffmpeg, and we only need to decode HEVC bitstream/base layer from VT.

ffprobe: add show_vt_info option

79d9f6f

gnattu requested a review from a team September 20, 2024 06:06

Shadowghost previously approved these changes Sep 20, 2024

nyanmisaka marked this pull request as draft September 20, 2024 07:05

ffprobe: fix include and options

44cd30b

gnattu force-pushed the ffprobe-add-vtinfo branch from d6887dc to 44cd30b Compare September 20, 2024 07:39

ffprobe: fix if macro

22afaab
