Improve rendering performance by reducing at least one frame rendering delay. #16896

Open
wants to merge 1 commit into
base: master

walterlv
Contributor

@walterlv walterlv commented Sep 3, 2024

What does the pull request do?

After several hours of testing, we found that an Avalonia app renders with a delay after an input event. Compared with equivalent code in WPF, the delay is very noticeable.

image

Please see the picture above:

  1. We draw a polyline point by point in the PointerMove event.
  2. Move the finger from left to right at a constant speed. Record a video or capture the screen to view the rendering delay.

The first one is the Avalonia app before this PR. The second one is the Avalonia app after this PR. The third one is the WPF app. We can see that:

  1. The WPF app renders very fast and tracks the finger very well. (The line appears ahead of the finger circle because the finger circle itself needs some time to render.)
  2. The Avalonia app before this PR renders with a long delay. (The end of the line is almost outside the finger circle.)
  3. The Avalonia app after this PR renders with a short delay. (The end of the line trails the center of the finger circle only slightly.)

What is the current behavior?

Before this PR, when a pointer event is triggered:

  1. The Pointer message is dispatched on the UI thread.
  2. A rendering request is sent to the render thread (a frame of the DwmRenderTimerLoop).
  3. A callback is called on the UI thread and it commits the final rendering.
  4. The next rendering (a frame of the DwmRenderTimerLoop) is ticked and all the rendering is done here.

As a result, after a pointer event, the rendering is done at the second frame. This is the reason why we see the rendering delay.

What is the updated/expected behavior with this PR?

After this PR, when a pointer event is triggered:

  1. The Pointer message is dispatched on the UI thread.
  2. We commit the final rendering without sending and waiting for the callback.
  3. When the next render tick comes, all the rendering is done.

After a pointer event, the rendering is done on the first frame. As a result, the rendering delay is reduced by at least one frame (about 16.7 ms on a 60 fps screen).
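The one-frame saving described above can be checked with a little arithmetic. The following is a minimal sketch in Python, not Avalonia code: `frame_interval_ms` and `visible_at_ms` are hypothetical helper names, and the model assumes that render ticks fall on exact multiples of the frame interval and that rendering itself takes no time.

```python
import math


def frame_interval_ms(refresh_hz: float) -> float:
    """Time between two refresh ticks, in milliseconds."""
    return 1000.0 / refresh_hz


def visible_at_ms(input_ms: float, refresh_hz: float, ticks_needed: int) -> float:
    """Time of the render tick on which an input becomes visible.

    ticks_needed is 2 for the flow described before this PR
    (UI -> render -> UI -> render) and 1 for the flow after it
    (UI -> render). Both values are assumptions taken from the
    description above, not measurements.
    """
    interval = frame_interval_ms(refresh_hz)
    # Index of the first render tick strictly after the input arrives.
    first_tick = math.floor(input_ms / interval) + 1
    return (first_tick + ticks_needed - 1) * interval


# A pointer event arriving 5 ms into a frame on a 60 fps screen.
before = visible_at_ms(5.0, 60.0, ticks_needed=2)
after = visible_at_ms(5.0, 60.0, ticks_needed=1)
print(f"before: {before:.1f} ms, after: {after:.1f} ms, saved: {before - after:.1f} ms")
```

Under these assumptions the saving is exactly one frame interval, whatever the arrival time of the input within the frame.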

How was the solution implemented (if it's not obvious)?

I can't figure out the best solution to fix this issue. So I want to discuss it with you.

Solution 1 (this PR):

  1. Add an option for developers to choose whether to render immediately or wait for the callback.
  2. If the option is enabled, we skip the callback and render immediately.

Solution 2:

  1. Always skip the callback and render immediately.

Solution 3:

  1. Any better suggestion is appreciated.

Checklist

Breaking changes

If someone creates the MediaContext via reflection, it will fail because the constructor has changed.

Obsoletions / Deprecations

Fixed issues

Before this commit: UI -> render -> UI -> render (then it rendered)
After this commit: UI -> render (then it rendered)
Yes, it reduces the rendering delay by at least one frame.
@walterlv walterlv force-pushed the t/walterlv/rendering-performance branch from c5607d0 to 26b2a62 Compare September 3, 2024 02:48
@avaloniaui-bot

You can test this PR using the following package version. 11.2.999-cibuild0051616-alpha. (feed url: https://nuget-feed-all.avaloniaui.net/v3/index.json) [PRBUILDID]

lindexi added a commit to lindexi/lindexi_gd that referenced this pull request Sep 3, 2024
@kekekeks
Member

kekekeks commented Sep 3, 2024

> 1. The Pointer message is dispatched on the UI thread.
> 2. A rendering request is sent to the render thread (a frame of the DwmRenderTimerLoop).
> 3. A callback is called on the UI thread and it commits the final rendering.
> 4. The next rendering (a frame of the DwmRenderTimerLoop) is ticked and all the rendering is done here.

It actually works (or is supposed to work) in the following way:

  1. The Pointer message is dispatched on the UI thread, and a layout/render pass is scheduled.
  2. If there is no frame pending processing (which is different from rendering), go to step 4.
  3. On the UI thread, wait for the previous frame to start being rendered (Processed event).
  4. Execute the layout/render pass with proper priorities.
  5. Send the frame to the render thread.

On each render frame the following happens (or is supposed to happen):

  1. Check for any pending frames/commits, process the data, and mark them as Processed (this should allow the UI thread to start processing the next frame).
  2. Render the frame (takes time).
  3. Call DwmFlush/Thread.Sleep or use another OS-specific way to wait for the next refresh tick.
  4. Go to step 1.
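To make the handshake between the two loops concrete, here is a toy discrete-event model in Python. It is a sketch under stated assumptions, not Avalonia's implementation: the function name `simulate`, the integer millisecond clock, and the idea that a layout/render pass takes zero time are all invented for illustration. It only counts how many layout/render passes the UI thread runs when it does or does not wait for the Processed signal.

```python
def simulate(input_times, tick_interval, num_ticks, backpressure):
    """Count layout/render passes executed on the (modelled) UI thread.

    input_times   -- times in ms at which pointer events arrive
    tick_interval -- ms between render-thread ticks
    num_ticks     -- number of render ticks to simulate
    backpressure  -- if True, the UI thread waits until the previous
                     commit is marked Processed before the next pass
    """
    TICK, INPUT = 0, 1  # ticks sort before inputs that share a timestamp
    events = [(t, INPUT) for t in input_times]
    events += [(k * tick_interval, TICK) for k in range(1, num_ticks + 1)]

    passes = 0
    pending_commit = False  # a commit was sent but is not yet Processed
    dirty = False           # input arrived but no pass has run for it yet

    for _, kind in sorted(events):
        if kind == TICK:
            # Render thread picks up the commit and marks it Processed.
            pending_commit = False
        else:
            dirty = True
        # UI thread: run a pass if there is work and nothing blocks it.
        if dirty and not (backpressure and pending_commit):
            passes += 1
            dirty = False
            pending_commit = True
    return passes


# Pointer events every 2 ms for 48 ms, render ticks every 16 ms.
inputs = list(range(0, 48, 2))
print("with backpressure:   ", simulate(inputs, 16, 3, backpressure=True))
print("without backpressure:", simulate(inputs, 16, 3, backpressure=False))
```

In this model the version with backpressure runs roughly one pass per render tick, while the version without it runs one pass per input event, which is the trade-off between perceived latency and the number of passes.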

image

Your PR skips step 3 on the UI thread, which removes all backpressure, so many more layout/render passes can be triggered by input events:
image

(Note that there are far more than 3 layout passes per frame; it's shown like this for readability's sake.)

Since you are doing the layout pass as soon as you have changes, the perceived input delay would indeed be lower. However, it comes at the cost of many more layout/render passes, which can have various consequences for more complex layouts:

  1. somewhat increased memory usage by the compositor transport
  2. greatly increased memory usage by controls that may allocate bitmaps on a per-frame basis
  3. dispatcher jobs with priority lower than input not running at all (which should be OK, I guess?)

Another concern is that WPF does the same thing we currently do: it doesn't try to schedule the next render pass until it gets a completion callback from the render thread - https://github.com/dotnet/wpf/blob/a37f6effb9304cd5479dc58427e13e3645f71c88/src/Microsoft.DotNet.Wpf/src/PresentationCore/System/Windows/Media/MediaContext.cs#L561-L568

@kekekeks
Member

kekekeks commented Sep 3, 2024

Another problem would be animations such as an indeterminate ProgressBar, which would tick in sync with the composition commit rate. Since animations are processed as part of the render/layout pass, the dispatcher might not be able to process jobs with Input priority, or user input in general.

@walterlv
Contributor Author

walterlv commented Sep 4, 2024

@kekekeks Thank you for the two diagrams you drew! The logic, which was previously assembled from various fragments while reading the code, has now become clear and comprehensive.
Of course, we are very eager to minimize rendering latency as much as possible, so we still need some solutions to address this delay. If WPF is doing the same as what Avalonia is currently doing, then there must be some other mechanisms in place to reduce such rendering delay, as evidenced by the actual performance.
With your help, I am continuing to read the relevant code of WPF and Avalonia, searching for the fundamental reasons behind WPF's lower rendering delay. Additionally, we hope you might have better solutions for resolving the delay issue.
