Support for Progressive AVIF encoding #761

Draft
wants to merge 18 commits into
base: main

tongyuantongyu
Contributor

  • Add support to write a1lx box for progressive AVIF, and interleave each layer of alpha and color AV1 payload when writing mdat.
  • Add layer config to avifEncoder and avifCodec impl to encode scalable AV1 stream. (aom only, rav1e and SVT-AV1 simply fail.)
  • New avifenc args: --progressive.
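A minimal sketch of the interleaving mentioned in the first bullet. The ordering is an assumption, since the description does not say which plane comes first: here alpha precedes color within each layer index, and a plane that has run out of layers is skipped. `LayerChunk` and `interleaveLayers` are hypothetical names, not libavif symbols.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record describing one payload chunk in mdat; not a libavif type. */
typedef struct {
    int isAlpha; /* 1 for an alpha layer payload, 0 for a color layer payload */
    int layer;   /* zero-based layer index within its plane */
} LayerChunk;

/* Fills `out` with the write order and returns the number of chunks written.
 * Assumption: within each layer index, alpha is written before color. */
static size_t interleaveLayers(int colorLayers, int alphaLayers, LayerChunk * out, size_t capacity)
{
    assert(colorLayers >= 0 && alphaLayers >= 0);
    size_t count = 0;
    int maxLayers = (colorLayers > alphaLayers) ? colorLayers : alphaLayers;
    for (int layer = 0; layer < maxLayers; ++layer) {
        if (layer < alphaLayers && count < capacity) {
            out[count++] = (LayerChunk){ 1, layer };
        }
        if (layer < colorLayers && count < capacity) {
            out[count++] = (LayerChunk){ 0, layer };
        }
    }
    return count;
}
```

With two color layers and one alpha layer, this yields alpha 0, color 0, color 1, so a decoder that has received the first two chunks can already show a complete first layer.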

@joedrago
Collaborator

First, thank you for your continued contributions to libavif.

This PR is going to take some time to think about. I haven't thought at all about how to expose progressive encoding as an API or how it impacts libavif's internals. This PR is quite large, makes some fairly large user-facing decisions (in both the API and in avifenc), and seems to be quite tailored to how libaom internally thinks about layer encoding. My first reactions to some of the new definitions are poor (I really don't like avifScalingMode, for example), but I don't know whether that is because I haven't thought it through and this is the only (or best) solution, or because it simply exposes what libaom offers without thinking through a more elegant approach. The avifenc SUB_CONFIG design alone is quite complicated, and the style of avifParseProgressiveConfig() is quite different from anything else in libavif, with its anonymous enum definitions and the FAIL_IF macro.

This PR needs much more time to digest than I can give it in the near future. I'm curious as to @wantehchang's first impressions. This PR needs a high-level design / API review before we can even get into the review of parts of the implementation. It's a lot.

@tongyuantongyu
Contributor Author

I understand your concern, and I agree the final interface will be quite different from the current state of this PR. Progressive rendering is an important feature for wider AVIF adoption (at least on the web), so I wrote this PR in the hope of making it available sooner. I'm also using it to experiment early with possible usages. Along the way, it uncovered two probable bugs in the current decoder implementation: #762 and #763.

I wrote this implementation in a fairly short time, and I agree that the design is weak in quite a few places. I've addressed some of the points you mentioned:

My first reactions to some of the new definitions are poor (I really don't like avifScalingMode, for example)

I've changed it to allow arbitrary ratios. The API aom currently provides is somewhat awkward: AV1E_SET_SVC_PARAMS can set any scale ratio but cannot set the horizontal and vertical ratios to different values; AOME_SET_SCALEMODE supports only a limited set of scale ratios but allows the horizontal and vertical ratios to differ.
By the way, @wantehchang, could you please check my usage of aom? When AV1E_SET_SVC_PARAMS is used to set the scale ratio with speed <= 6 on large images (1920x1080 can trigger it), aom accesses an invalid memory address while calculating SAD during motion vector search. I'm not sure whether I'm using it incorrectly or whether it is a bug in aom.

The avifenc SUB_CONFIG design alone is quite complicated

I'm not a fan of extra "script" files (like what cjpeg -scans uses), and I find a large number of command-line switches harder to understand, so I went this way. I haven't come up with other ideas.

anonymous enum definitions

I've changed them to named enums.

the FAIL_IF macro

I've changed it to the CHECK style as in internal.h.

@tongyuantongyu
Contributor Author

I realized that we can provide a different input for each layer to the encoder and declare them to be different layers of one image. We can, for example, pass a blurred version as the first layer to hide unpleasant compression artifacts. My current design can't support this usage, which requires specifying an input image for each layer (and consequently calling avifEncoderAddImage for each layer).

But doing this loses the ability to specify different layer configs for color and alpha. In my tests, adding layers does increase the file size (please point out if there is a config that avoids this), so we may want to reduce or disable progressive layers on alpha to save some bytes.

Hiding the detail that color and alpha are encoded separately seems to result in this awkward situation: we have to give up one of the two abilities above.

Collaborator

@y-guyon y-guyon left a comment

Thanks a lot for this PR, I used it recently and it worked as is. I believe this change should be merged without too many modifications. It can be improved if needed later on and it would make progressive encoding available in the meantime (unless it generates ill-encoded files).

Specifically, the following should be enough to submit, in my opinion:

  • Add // WARNING: Experimental above the new API in avif.h.
  • Postpone avifenc changes to another PR.
  • Apply nits.

If you do not plan to fix these in the coming weeks/months, that is alright; we may be able to take the PR over, with your permission.

There should be at least one simple test exercising this new API but adding it in another PR is fine.

apps/avifenc.c Outdated
printf(" Color planes have 2 layers, alpha plane is not layered.\n");
printf("\n");
printf(" ;30,1/2:10\n");
printf(" Color planes is not layered, alpha plane have 2 layers.\n");
Collaborator

This is quite different from the current syntax for encoding regular single-layered images, as Joe pointed out. Similar to what you did with avifEncoder, I would suggest --progressive only adds layers and does not replace the "default" layer nor override --min and --max values.

Going even further, we could reuse the current API for layers: instead of defining layers within --progressive and its own syntax, use a new "separator" flag. For example:

avifenc \
  --min ColorLayer0MinQ --max ColorLayer0MaxQ \
  --add-color-layer \
  --min ColorLayer1MinQ --max ColorLayer1MaxQ \
  --add-alpha-layer \
  --min AlphaLayer0MinQ --max AlphaLayer0MaxQ

This new flag (--progressive or any) should be tagged as "experimental" and prone to future changes.

By the way, the current avifenc flags are currently being redesigned (see PR #669 and #955).

Also, I remember someone arguing that avifenc should stay simple and that ffmpeg should be considered/improved instead, so I am not so sure about modifying avifenc at all actually. In any case, this file can be modified in another PR in order to move the API changes forward faster.

Contributor Author

I agree that breaking what the current --progressive does into multiple per-layer flags seems to be a better design.

@tongyuantongyu
Contributor Author

@y-guyon Thanks for your interest in this!

I have some further developments on progressive AVIF, but not on this branch:

There are two bugs in libaom's handling of resizing for frames with odd dimensions: https://bugs.chromium.org/p/aomedia/issues/detail?id=3203 and https://bugs.chromium.org/p/aomedia/issues/detail?id=3210. I've written patches for both but forgot to push them forward, because I was busy until very recently. We should probably either get them fixed first or limit input to even dimensions for now.

I'd like to continue working on this. Can you have a look at the two branches above and decide whether we want them included in this PR? I'll update this PR then.

@y-guyon
Collaborator

y-guyon commented Jun 2, 2022

There are two bugs in libaom's handling of resizing for frames with odd dimensions:
https://bugs.chromium.org/p/aomedia/issues/detail?id=3203 and https://bugs.chromium.org/p/aomedia/issues/detail?id=3210.
I've written patches for both but forgot to push them forward, because I was busy until very recently. We should probably either get them fixed first or limit input to even dimensions for now.

Both solutions seem ok to me. If the patches are trivial enough, it probably is the simplest option. Otherwise focus on pushing this PR, returning UNIMPLEMENTED in the cases where it would generate an image that cannot be decoded.

I'd like to continue working on this. Can you have a look at the two branches above and decide whether we want them included in this PR? I'll update this PR then.

Thanks!

https://github.com/tongyuantongyu/libavif/tree/progressive_encoding_layer_source

  • Note: I only looked at the changes to avif.h for this specific commit.
  • I prefer that API (using avifEncoderAddImage*() to add layers) to the one in this PR (extending avifEncoder). I believe having both is too much.
  • Which use cases are not covered by adding avifEncoderAddImage*() functions compared to extending avifEncoder? Different number of layers for alpha and for color? I wonder if the complexity added to the API for that is worth it.
  • Why not extend avifAddImageFlags with some new AVIF_ADD_IMAGE_FLAG_PROGRESSIVE or similar instead of adding avifEncoderAddImage*() functions? It could signal "still_picture" to AV1 encoders too and would keep changes to avif.h to a minimum.

https://github.com/tongyuantongyu/libavif/tree/progressive_encoding_layer_source

  • Note: I only looked at the changes to write.c for this specific commit.
  • It is unnecessary to include that change into this PR, right? If so, I would keep it in a different PR, especially since that is a breaking change.
  • Would AVIF files encoded without that change be valid according to the latest published specifications?

My current feeling is that we should decide which API change we would like to make right now, even if it is not final, since you researched several options. I would vote for the new avifAddImageFlags flag if it fills enough use cases, otherwise the avifEncoderAddImageProgressive() approach. The danger is to postpone this PR again because of too much required work or decision taking. What do you think?

Contributor Author

@tongyuantongyu tongyuantongyu left a comment

I've updated this PR and fixed the simple changes. GitHub lost track of some reviews, so I quoted them.

src/codec_aom.c Outdated
@@ -712,23 +748,115 @@ static avifResult aomCodecEncodeImage(avifCodec * codec,
return AVIF_RESULT_UNKNOWN_ERROR;
}
}

if (layerCount > 1) {
#if defined(AVIF_AOM_LAYER_CONFIG_PREFER_SVC_PARAMS)
Contributor Author

I'm experimenting here, and I haven't actually figured out how to make the AV1E_SET_SVC_PARAMS method work.

Are you familiar with aom's API, or can you invite someone who is, to review my usage here?

@y-guyon
Collaborator

y-guyon commented Jun 7, 2022

I've updated this PR and fixed the simple changes. GitHub lost track of some reviews, so I quoted them.

Please avoid force-pushing after a code review; now I cannot see the diff between the current version and the last time I read this pull request. If you still have the previous version somewhere, could you force-push it and send a separate commit with your recent changes? Otherwise I will reread all lines.

@tongyuantongyu
Contributor Author

Please avoid force-pushing after a code review; now I cannot see the diff between the current version and the last time I read this pull request. If you still have the previous version somewhere, could you force-push it and send a separate commit with your recent changes? Otherwise I will reread all lines.

Done. I prefer rebasing to keep the tree clean, but if it makes for a bad review experience, I'll avoid it. Can you "squash and merge" this PR when it's finished to make it cleaner?

@y-guyon
Collaborator

y-guyon commented Jun 7, 2022

Can you "squash and merge" this PR when it's finished to make it cleaner?

That is the plan, yes.

src/codec_aom.c Outdated
@@ -712,23 +748,115 @@ static avifResult aomCodecEncodeImage(avifCodec * codec,
return AVIF_RESULT_UNKNOWN_ERROR;
}
}

if (layerCount > 1) {
#if defined(AVIF_AOM_LAYER_CONFIG_PREFER_SVC_PARAMS)
Collaborator

Could you share more details? Is aom_codec_control(AV1E_SET_SVC_PARAMS) returning an error or is the encoding broken later on?

@jzern may have some insight on this or know who to ping.

Contributor Author

@tongyuantongyu tongyuantongyu left a comment

Sorry for the delay.

src/codec_aom.c Outdated
@@ -712,23 +748,115 @@ static avifResult aomCodecEncodeImage(avifCodec * codec,
return AVIF_RESULT_UNKNOWN_ERROR;
}
}

if (layerCount > 1) {
#if defined(AVIF_AOM_LAYER_CONFIG_PREFER_SVC_PARAMS)
Contributor Author

AV1E_SET_SVC_PARAMS produces a valid bitstream, but the scaling_factor_num and scaling_factor_den values in avifLayerConfig are not honored.

Collaborator

@y-guyon y-guyon left a comment

As discussed, I think we should revert the changes in avifenc.c and keep them for another cl. I also believe avifScalingMode and avifLayerConfig are overkill in the public API. avifEncoderAddImageProgressive() and avifEncoderAddImageProgressiveGrid() should be enough.

Also, adding a simple test in this PR would be reassuring, such as the following tests/gtest/avifprogressivetest.cc draft:

TEST(ProgressiveTest, EncodeDecode) {
  testutil::avifImagePtr image = testutil::createImage(width, height etc.);
  ASSERT_NE(image, nullptr);
  testutil::fillImageGradient(image.get());

  // Encode two layers.
  testutil::avifEncoderPtr encoder(avifEncoderCreate(), avifEncoderDestroy);
  ASSERT_NE(encoder, nullptr);
  encoder->speed = AVIF_SPEED_FASTEST;
  ASSERT_EQ(avifEncoderAddImageProgressive(encoder.get() etc.), AVIF_RESULT_OK);
  ASSERT_EQ(avifEncoderAddImageProgressive(encoder.get() etc.), AVIF_RESULT_OK);
  testutil::avifRWDataCleaner encodedAvif;
  ASSERT_EQ(avifEncoderFinish(encoder.get(), &encodedAvif), AVIF_RESULT_OK);

  // Decode
  testutil::avifImagePtr decoded(avifImageCreateEmpty(), avifImageDestroy);
  ASSERT_NE(decoded, nullptr);
  testutil::avifDecoderPtr decoder(avifDecoderCreate(), avifDecoderDestroy);
  ASSERT_NE(decoder, nullptr);
  // Ask the decoder to expose each layer as a separate image.
  decoder->allowProgressive = AVIF_TRUE;
  ASSERT_EQ(avifDecoderSetIOMemory(decoder.get(), encodedAvif.data, encodedAvif.size), AVIF_RESULT_OK);
  // The file must be parsed before any image can be decoded.
  ASSERT_EQ(avifDecoderParse(decoder.get()), AVIF_RESULT_OK);
  ASSERT_EQ(avifDecoderNextImage(decoder.get()), AVIF_RESULT_OK);
  // Check decoder->image (first layer)
  ASSERT_EQ(avifDecoderNextImage(decoder.get()), AVIF_RESULT_OK);
  // Check decoder->image (second layer)
}

More extensive testing can be done in following PRs.

@tongyuantongyu
Contributor Author

As discussed, I think we should revert the changes in avifenc.c and keep them for another cl.

Done

I also believe avifScalingMode and avifLayerConfig are overkill in the public API.

The AOM_SCALING_MODE enums provided by libaom are rather arbitrarily selected, and AV1 works with any scale ratio, so I defined avifScalingMode as a fraction.

avifLayerConfig holds quality settings, so I put it in avifEncoder like all other quality settings, instead of making it a parameter of avifEncoderAddImage*.

We need to set AOME_SET_NUMBER_SPATIAL_LAYERS before we start encoding, so we need to get all layers at once, instead of adding one layer for each call like encoding an animation.
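To make the "scaling mode as a fraction" idea above concrete, here is a minimal sketch. The type and function names are hypothetical and not taken from the PR, and rounding up is an assumption chosen so that a layer never collapses to zero pixels.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical types for illustration; the names are not taken from the PR. */
typedef struct {
    uint32_t num;
    uint32_t den;
} ScaleFraction;

typedef struct {
    ScaleFraction horizontal;
    ScaleFraction vertical;
} ScaleMode;

/* Scales one dimension by num/den, rounding up.
 * The rounding direction is an assumption, not something the PR specifies. */
static uint32_t scaleDimension(uint32_t size, ScaleFraction f)
{
    assert(f.den != 0 && f.num != 0 && f.num <= f.den);
    return (uint32_t)(((uint64_t)size * f.num + f.den - 1) / f.den);
}
```

Unlike a fixed enum of ratios, a `ScaleMode` of `{ {1, 2}, {3, 5} }` can describe independent horizontal and vertical ratios; `scaleDimension(1081, (ScaleFraction){ 1, 2 })` gives 541 for an odd width.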

namespace libavif {
namespace {

class ProgressiveTest : public testing::Test {};
Collaborator

Unnecessary line

Contributor Author

done

Collaborator

I was talking about class ProgressiveTest : public testing::Test {}; which can be removed, not about the empty line. Sorry for the confusion.

src/codec_aom.c Outdated
uint8_t layerCount = alpha ? encoder->layerCountAlpha : encoder->layerCount;
avifLayerConfig * layers = alpha ? encoder->layersAlpha : encoder->layers;
if (layerCount > 1) {
addImageFlags &= ~AVIF_ADD_IMAGE_FLAG_SINGLE;
Collaborator

What I meant was: I wonder if it would be safer to return an error if addImageFlags & AVIF_ADD_IMAGE_FLAG_SINGLE rather than silently setting the bit to 0.

src/codec_aom.c Outdated
@@ -712,23 +748,115 @@ static avifResult aomCodecEncodeImage(avifCodec * codec,
return AVIF_RESULT_UNKNOWN_ERROR;
}
}

if (layerCount > 1) {
#if defined(AVIF_AOM_LAYER_CONFIG_PREFER_SVC_PARAMS)
Collaborator

If you have a short snippet to reproduce the issue, I suggest filing a bug.

@tongyuantongyu
Contributor Author

What I meant was: I wonder if it would be safer to return an error if addImageFlags & AVIF_ADD_IMAGE_FLAG_SINGLE rather than silently setting the bit to 0.

Done. The code no longer tweaks the AVIF_ADD_IMAGE_FLAG_SINGLE flag, and users should not pass AVIF_ADD_IMAGE_FLAG_SINGLE when encoding a layered image. The encoder will still produce a valid AVIF anyway (with all inter-layer compression disabled), so maybe there is no need to return an error here?

If you have a short snippet to reproduce the issue, I suggest filing a bug.

I will try to do that. For now I removed the usage of AV1E_SET_SVC_PARAMS.

Collaborator

@y-guyon y-guyon left a comment

Thank you for your patience.

I removed the usage of AV1E_SET_SVC_PARAMS.

This is simpler, thanks.

The encoder will still produce a valid AVIF anyway

This may be tricky for the user: AVIF_ADD_IMAGE_FLAG_SINGLE will have an effect only without extra layers. This is already the case for grids, so I guess it is consistent, even if it is the other way around:

  • avifEncoderAddImageGrid() will always force AVIF_ADD_IMAGE_FLAG_SINGLE, which should be at least as good as omitting it (it "upgrades" the encoder settings so no penalty for the user).
  • avifEncoderAddImageProgressive() will always ignore AVIF_ADD_IMAGE_FLAG_SINGLE, but for the main image it may result in worse results. As an example, a 4k main image with a 1x1 layer will not benefit from AVIF_ADD_IMAGE_FLAG_SINGLE, but the encoder silently proceeds (it "downgrades" the encoder settings so there is a penalty for the user, compared with the same 4k main image without the 1x1 layer).

avifLayerConfig holds quality settings, so I put it in avifEncoder like all other quality settings, instead of making it a parameter of avifEncoderAddImage*.

We need to set AOME_SET_NUMBER_SPATIAL_LAYERS before we start encoding, so we need to get all layers at once, instead of adding one layer for each call like encoding an animation.

I still think extraLayerCount* and layers* in avifEncoder are misleading and obscure. All other fields of avifEncoder are used irrespective of how many times and which avifEncoderAddImage*() function is called.
I believe there are two options here:

  • Get rid of avifEncoderAddImageProgressive() and call avifEncoderAddImage() for each layer.
    • The downside is 8 new avifAddImageFlags:
      • AVIF_ADD_IMAGE_FLAG_LAYER to indicate a new layer.
      • AVIF_ADD_IMAGE_FLAG_LAYER_SCALE_H_ONETWO, _ONEFOUR, _ONEEIGHT to specify the optional horizontal scaling. This limits scaling to only three possibilities, but users can scale input images themselves if they need more flexibility.
      • Same for vertical scaling.
      • AVIF_ADD_IMAGE_FLAG_PERSISTENT_IMAGE to indicate that there is no need to create an internal temporary copy of the input image until avifEncoderFinish() is called. This is necessary because AOME_SET_NUMBER_SPATIAL_LAYERS must be set before encoding, as you said. Alternatively to adding this new flag, it could be written in the comment that input must persist until encoding is done, but this is dangerous.
  • Move extraLayerCount* and layers* from avifEncoder fields to avifEncoderAddImageProgressive() arguments.
    • The downside is the complexity if we get to layered grids one day, but this is another issue.

In order to move forward, I would suggest the following:

  • Return AVIF_RESULT_INVALID_LAYERS when AVIF_ADD_IMAGE_FLAG_SINGLE is passed to avifEncoderAddImageProgressive(). It can still be allowed later on (the other way around, permitting it now and then forbidding it in a later PR, is riskier).
  • Return AVIF_RESULT_INVALID_LAYERS when layers and grids are mixed. The code at head forces single-frame for grid images, and layers require several frames. I find it hard to grasp all the implications and corner cases of having grid layered images, so I would prefer to forbid them for now, and investigate that topic further in another PR, with a lot more test coverage alongside.
  • I still believe the current public API should be reworked. Maybe there are other possibilities than the options discussed so far, if they do not satisfy everyone. Any thought @wantehchang?
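The first two suggestions above amount to an up-front validation step. A minimal sketch, using stand-in constants rather than the real avif.h values (the enum names, the flag value, and `checkLayeredRequest` are all hypothetical):

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in constants for illustration only; the real values live in avif.h. */
typedef enum {
    LAYER_CHECK_OK = 0,
    LAYER_CHECK_INVALID_LAYERS = 1
} LayerCheckResult;

enum { EXAMPLE_FLAG_SINGLE = 1 << 1 };

/* Rejects the combinations the review proposes to forbid for now:
 * extra layers together with the "single image" flag, or with a grid. */
static LayerCheckResult checkLayeredRequest(uint32_t addImageFlags,
                                            uint32_t gridCols,
                                            uint32_t gridRows,
                                            uint32_t extraLayerCount)
{
    assert(gridCols >= 1 && gridRows >= 1);
    if (extraLayerCount == 0) {
        return LAYER_CHECK_OK; /* Not a layered image: nothing to forbid here. */
    }
    if (addImageFlags & EXAMPLE_FLAG_SINGLE) {
        return LAYER_CHECK_INVALID_LAYERS;
    }
    if (gridCols > 1 || gridRows > 1) {
        return LAYER_CHECK_INVALID_LAYERS;
    }
    return LAYER_CHECK_OK;
}
```

Failing early keeps the option of relaxing the rule later, which matches the reasoning that forbidding first and permitting later is the less risky direction.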

avifEncoderAddImageProgressive(encoder.get(), layer_image_ptrs.data(),
AVIF_ADD_IMAGE_FLAG_SINGLE),
AVIF_RESULT_OK);
avifImage* layer_image_ptrs[2] = {image.get(), image.get()};
Collaborator

const avifImage* layer_image_ptrs[2]

@@ -1041,6 +1041,8 @@ typedef struct avifEncoder
uint64_t timescale; // timescale of the media (Hz)

// Layers (used by progressive rendering)
// * Note: libavif currently can only properly decode images without alpha,
// or images whose extraLayerCount == extraLayerCountAlpha, if progressive decode is enabled.
Collaborator

Thanks for adding this warning.

What happens exactly with alpha and extraLayerCount != extraLayerCountAlpha? Does libavif return an error or does it proceed with the decoding but in some wrong way? I cannot tell from your comment.

@@ -879,13 +880,16 @@ avifResult avifEncoderAddImageGrid(avifEncoder * encoder,
avifAddImageFlags addImageFlags)
{
avifDiagnosticsClearError(&encoder->diag);
return avifEncoderAddImageInternal(encoder, gridCols, gridRows, cellImages, 1, addImageFlags | AVIF_ADD_IMAGE_FLAG_SINGLE); // only single image grids are supported
if (encoder->extraLayerCount == 0 && encoder->extraLayerCountAlpha == 0) {
Collaborator

if ((encoder->extraLayerCount == 0) && (encoder->extraLayerCountAlpha == 0)) {

@tongyuantongyu
Contributor Author

Sorry for the late reply. I'm having less spare time than I expected.


The progressive encoding API does indeed need a rework. Let's decide what the API would be like first. I have some ideas and considerations.

  • Regarding AVIF_ADD_IMAGE_FLAG_SINGLE :

AVIF_ADD_IMAGE_FLAG_SINGLE enables AOM_USAGE_ALL_INTRA, which also works for layered images or even animations, though it may not be very useful (same quality, but larger size); it also limits the maximum number of frames to encode to 1, which does not work for layered images.
In this PR's current state, when AVIF_ADD_IMAGE_FLAG_SINGLE is set while encoding a layered image, only the tweaks that are meaningful to layered images are applied. I was still testing which configs would work, so I didn't think carefully about whether this is reasonable.

I do agree avifEncoderAddImageProgressive() should simply fail if called with AVIF_ADD_IMAGE_FLAG_SINGLE, and avifEncoderAddImageGrid() should only force AVIF_ADD_IMAGE_FLAG_SINGLE when encoding single layer image, if we are going to keep them.

  • Regarding avifLayerConfig:

The main reason for avifLayerConfig was to support AV1E_SET_SVC_PARAMS, which needs this information before encoding. Now that we have dropped it, we can reconsider what is better.

There is a difference between a "progressive image" in traditional formats and a "layered image" in AVIF. A "progressive image" encodes only one image but can reconstruct it at lower quality from, say, half of the encoded stream. A "layered image", in contrast, encodes multiple layers (which can be arbitrary images), and half of the encoded stream contains the data of the leading layers, so it is more like an animation.

So avifEncoderAddImageProgressive() feels more like a traditional "progressive image": the user sends some pixels and gets an image file back. Calling avifEncoderAddImage() multiple times instead teaches the user that a "layered image is actually an animation that plays the next frame as soon as possible and stops at the last frame". I didn't know this at first, so I went the traditional way.

Now I prefer using avifEncoderAddImage() to add one layer per call. Later we may provide something like avifEncoderWriteProgressive() to encode progressive AVIF from single image in one call, for convenience of basic usage.
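The "layers arrive front to back" view described above can be sketched as a simple prefix computation. This is a simplification that ignores container overhead and the interleaving of alpha payloads; `completeLayers` is a hypothetical helper, not a libavif function.

```c
#include <assert.h>
#include <stddef.h>

/* Returns how many leading layers are fully contained in the first
 * `bytesAvailable` bytes, given each layer's payload size in stream order. */
static size_t completeLayers(const size_t * layerSizes, size_t layerCount, size_t bytesAvailable)
{
    assert(layerSizes != NULL || layerCount == 0);
    size_t used = 0;
    size_t done = 0;
    /* `used` never exceeds `bytesAvailable`, so the subtraction cannot wrap. */
    while (done < layerCount && layerSizes[done] <= bytesAvailable - used) {
        used += layerSizes[done];
        ++done;
    }
    return done;
}
```

For layer sizes of 100, 400, and 1500 bytes, a viewer holding the first 600 bytes can display the second layer while the last one is still downloading, which is the animation-like behavior the comment describes.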

We also need different quality settings for each layer, which requires updating the quality settings from avifEncoder in each avifEncoderAddImage() call. Currently libavif only sets them once before encoding.
Considering this, I prefer adding widthScaleMode and heightScaleMode as settings in avifEncoder and supporting encoder setting updates in general, as animation encoding may also benefit from this. Note that rav1e doesn't support updating settings, so we may add a flag AVIF_ADD_IMAGE_FLAG_UPDATE_SETTINGS to indicate the user's intention to update settings.


users can scale input images themselves

From the comments in aom/aom_encoder.h, it seems the input must have the presentation resolution, so we have to rely on the encoder to do the scaling? Do you know how to make the encoder accept an already scaled input frame?

  /*!\brief Width of the frame
   *
   * This value identifies the presentation resolution of the frame,
   * in pixels. Note that the frames passed as input to the encoder must
   * have this resolution. Frames will be presented by the decoder in this
   * resolution, independent of any spatial resampling the encoder may do.
   */
  unsigned int g_w;

By the way, in av1/encoder/encoder.c#av1_set_internal_size, AOM_SCALING_MODE is eventually converted into a layer size in pixels. If we have to rely on the encoder to do the scaling, I'd prefer AOM to allow any ratio to be used (or directly set scaled frame size), instead of choosing from a very limited set, so I don't like the AVIF_ADD_IMAGE_FLAG_LAYER_SCALE_* flags.

I find it hard to grasp all the implications and corner cases of having grid layered images

I'd like to point out that layered images with alpha are also hard to deal with. libavif currently only decodes correctly those that have the same number of layers in their color and alpha sub-images.


So here is my design:

  1. Support updating settings during encoding as a separate PR.
  2. Refactor this PR, remove avifEncoderAddImageProgressive() and use only avifEncoderAddImage() to encode layered image. Add scale mode settings in avifEncoder that user should update before each avifEncoderAddImage() call.

I would like to know your opinions.

@y-guyon
Collaborator

y-guyon commented Aug 12, 2022

Thanks for your detailed answer.

Let's decide what the API would be like first.

I agree.

I don't like the AVIF_ADD_IMAGE_FLAG_LAYER_SCALE_* flags.

Yeah, me neither. It is convenient for users that just want to scale down a layer a bit, but that's it.

From the comments in aom/aom_encoder.h, it seems the input must have the presentation resolution, so we have to rely on the encoder to do the scaling? Do you know how to make the encoder accept an already scaled input frame?

Indeed libaom's validate_img() will only accept same-size frames. Maybe libaom can be patched?

I'd prefer AOM to allow any ratio to be used (or directly set scaled frame size)

I would have preferred a way for the user to provide their own custom-sized layers as avifEncoderAddImage() input too, but it seems impractical with the current AV1 format and/or libaom implementation, so let's try to move forward with a feasible solution first.


  1. Support updating settings during encoding as a separate PR.

That sounds reasonable to me.
Suggestion for codecs that do not currently support updating settings:

  • Store the encoding settings during the first avifEncoderAddImage() call, and compare them at every subsequent avifEncoderAddImage() call. Return AVIF_RESULT_NOT_IMPLEMENTED if they changed. It would be simpler than adding a new flag such as AVIF_ADD_IMAGE_FLAG_UPDATE_SETTINGS.

  2. Refactor this PR: remove avifEncoderAddImageProgressive() and use only avifEncoderAddImage() to encode layered images. Add scale mode settings to avifEncoder that the user should update before each avifEncoderAddImage() call.

I approve that plan. AVIF_ADD_IMAGE_FLAG_LAYER will be needed then.
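
The suggestion above for codecs that cannot update settings mid-encode (snapshot the settings on the first call, compare on later calls) could look roughly like the sketch below. Every type and function name here is a hypothetical stand-in, not libavif's API; only the idea of returning a "not implemented" result when the settings changed comes from the discussion.

```c
/* Hypothetical stand-ins; none of these names come from libavif. */
typedef enum {
    EXAMPLE_RESULT_OK = 0,
    EXAMPLE_RESULT_NOT_IMPLEMENTED = 1
} ExampleResult;

typedef struct {
    int minQuantizer;
    int maxQuantizer;
    int speed;
} ExampleSettings;

typedef struct {
    ExampleSettings current;  /* Settings the caller may edit between calls. */
    ExampleSettings snapshot; /* Copy taken on the first add-image call. */
    int hasSnapshot;          /* Non-zero once the snapshot is valid. */
    int frameCount;           /* Number of frames accepted so far. */
} ExampleEncoder;

static int exampleSettingsEqual(const ExampleSettings * a, const ExampleSettings * b)
{
    /* Compare field by field; memcmp could be confused by struct padding. */
    return a->minQuantizer == b->minQuantizer &&
           a->maxQuantizer == b->maxQuantizer &&
           a->speed == b->speed;
}

static ExampleResult exampleAddImage(ExampleEncoder * enc)
{
    if (!enc->hasSnapshot) {
        /* First call: remember the settings the codec was configured with. */
        enc->snapshot = enc->current;
        enc->hasSnapshot = 1;
    } else if (!exampleSettingsEqual(&enc->snapshot, &enc->current)) {
        /* This codec cannot reconfigure itself, so reject the changed settings. */
        return EXAMPLE_RESULT_NOT_IMPLEMENTED;
    }
    ++enc->frameCount; /* Real encoding would happen here. */
    return EXAMPLE_RESULT_OK;
}
```

With this shape, a caller that never touches the settings sees no difference, and no extra flag is needed to signal an update: the comparison itself detects it.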

wantehchang added a commit to wantehchang/libavif that referenced this pull request Aug 27, 2022
Clean up the changes made in the following two pull requests to support
updating encoder settings during encoding:
AOMediaCodec#1033
AOMediaCodec#1058

In particular, restore the aomCodecEncodeImage() function in
src/codec_aom.c to its original structure, plus a new block of code to
handle encoder changes.

Rename some functions and data members. Edit some comments and messages.

In the avifEncoderChange enum, left-shift the unsigned int constant 1u
because if we left-shift the signed int constant 1 by 31 bits, it will
be shifted into the sign bit.

Other miscellaneous cosmetic changes.

AOMediaCodec#761
wantehchang added a commit that referenced this pull request Aug 29, 2022
Clean up the changes made in the following two pull requests to support
updating encoder settings during encoding:
#1033
#1058

In particular, restore the aomCodecEncodeImage() function in
src/codec_aom.c to its original structure, plus a new block of code to
handle encoder changes.

Rename some functions and data members. Edit some comments and messages.

In the avifEncoderChange enum, left-shift the unsigned int constant 1u
because if we left-shift the signed int constant 1 by 31 bits, it will
be shifted into the sign bit.

Other miscellaneous cosmetic changes.

#761
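
The remark in the commit message about shifting 1u rather than 1 can be illustrated with a small sketch. The flag names below are hypothetical and are not the actual avifEncoderChange values; the sketch also assumes a 32-bit int, where left-shifting the signed constant 1 by 31 bits is undefined behavior in C, while the same shift on the unsigned constant 1u is well defined.

```c
#include <stdint.h>

/* Hypothetical change flags; the names are illustrative only. */
typedef uint32_t ExampleChangeFlags;

/* Unsigned constants keep every shift well defined, including bit 31. */
#define EXAMPLE_CHANGE_QUANTIZER      ((ExampleChangeFlags)1u << 0)
#define EXAMPLE_CHANGE_TILING         ((ExampleChangeFlags)1u << 1)
#define EXAMPLE_CHANGE_CODEC_SPECIFIC ((ExampleChangeFlags)1u << 31)

/* Returns non-zero if `bit` is set in `flags`. */
static int exampleHasChange(ExampleChangeFlags flags, ExampleChangeFlags bit)
{
    return (flags & bit) != 0;
}
```

A caller would OR the flags together, for example `EXAMPLE_CHANGE_QUANTIZER | EXAMPLE_CHANGE_CODEC_SPECIFIC`, and test individual bits with `exampleHasChange()`.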
@jyrkialakuijala

Is progressive compression more or less dense than the sequential compression?

@y-guyon
Collaborator

y-guyon commented Oct 18, 2022

Is progressive compression more or less dense than the sequential compression?

Here is an out-of-date data point (using libavif 0f85943):

| image | results |
| --- | --- |
| Original image from unsplash.com resized to 640×480 pixels | PNG |
| Regular AVIF | 28362 bytes [ 0.74 bpp ] 15.16 dB* |
| Incremental AVIF using avifenc --grid 1x5 | 29010 bytes [ 0.76 bpp ] 15.09 dB* |
| Progressive AVIF using this PR | 31192 bytes [ 0.81 bpp ] 15.21 dB* |

*SSIM according to https://chromium.googlesource.com/codecs/libwebp2#get_disto
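
For reference, the bpp figures above follow from the file size and the 640×480 resolution. The helper names in this sketch are made up for illustration.

```c
#include <stddef.h>

/* Bits per pixel: total bits in the file divided by the pixel count. */
static double exampleBitsPerPixel(size_t fileSizeBytes, unsigned width, unsigned height)
{
    return (double)fileSizeBytes * 8.0 / ((double)width * (double)height);
}

/* Relative size increase of `candidateBytes` over `baselineBytes`, in percent. */
static double exampleOverheadPercent(size_t baselineBytes, size_t candidateBytes)
{
    return ((double)candidateBytes / (double)baselineBytes - 1.0) * 100.0;
}
```

For the regular AVIF, 28362 bytes × 8 / (640 × 480) ≈ 0.74 bpp. By the same arithmetic, the progressive file in this single data point is roughly 10% larger than the regular one and the grid-based incremental file roughly 2% larger.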

The settings were picked arbitrarily, so this is not a fair comparison. Does it still answer your question? Could you give more details on why you are asking? To engage in a longer conversation, please open a separate issue.

@niutech

niutech commented Nov 9, 2022

Now that the progressive decoding in #640 is merged, what needs to be done to merge this PR and when can we expect to support progressive AVIF in libavif and web browsers?

@y-guyon
Collaborator

y-guyon commented Nov 9, 2022

Now that the progressive decoding in #640 is merged

#640 is already merged. Are you talking about another issue or the issue 640 of another repository perhaps?

what needs to be done to merge this PR and when can we expect to support progressive AVIF in libavif and web browsers?

Progressive AVIF decoding is already supported in libavif and Chrome. This PR is about encoding support.

@niutech

niutech commented Nov 10, 2022

@y-guyon You didn't understand me. I acknowledged that progressive decoding in #640 is merged but I am asking about progressive encoding in this PR. In order to progressively decode an image, it has to be progressively encoded first. So when do you expect to have it shipped in libavif?

@y-guyon
Collaborator

y-guyon commented Nov 10, 2022

@y-guyon You didn't understand me.

Sorry about that. The question about web browser support made me think you were asking about the decoding side.

So when do you expect to have it shipped in libavif?

The author of this PR @tongyuantongyu is the main contributor of this feature. At the time of writing this comment, before moving forward with this #761, they sent three prior patches:

Unfortunately at this point I cannot give any timed road map for this feature.
