Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improving SKData performance #2127

Closed
wants to merge 11 commits into from
Closed

Improving SKData performance #2127

wants to merge 11 commits into from

Conversation

mattleibow
Copy link
Contributor

@mattleibow mattleibow commented Jun 26, 2022

Description of Change

When trying to make Skottie work better, I noticed a few places where there was a less than ideal code path.

One performance improvement that can be applied to specific instances is to skip the object registration with the handler dictionary. Some objects are never returned from an API - such as SKPaint - because they are user objects and are always given to the native APIs and never stored.

Other objects differ in that they are sometimes not stored in native code. In these cases we can skip the registration for specific methods - such as with the SKUnregisteredDynamicMemoryWStream used in the construction of the SKData instance. This stream is only alive for the duration of the Create method, so we do not need to register the stream with the handle dictionary. These special objects can be created by implementing the ISKSkipObjectRegistration interface on a private class.

The handler dictionary is required because some APIs will return a native object that we already have a binding instance for in managed code. To prevent multiple C# objects from sharing a single C++ object, we register the C++ handle and the C# object in a dictionary for lookups later.

Changes

  • SKData.AsStream() now is a writeable stream and can be used to copy from another stream

SKData Performance

The benchmarks can be run by opening the benchmarks\SkiaSharp.Benchmarks\SkiaSharp.Benchmarks.sln in the IDE or running with dotnet run -c Release -f net6.0 from the benchmarks\SkiaSharp.Benchmarks directory.

This PR has a few changes that make creating SKData a faster and slimmer implementation.

My benchmark for the construction of SKData is below. There are 3 benchmarks:

  • CreatePrevious - the previous implementation of the SKData construction
  • Create - the new, default implementation of SKData construction using .NET Stream objects
  • CreateUnregistered - a high-performance construction that bypasses all the typical object registration making it only suitable for cases where the data instance is read and then disposed of without the callee keeping a reference (such as loading a file and then deserializing it)
  • CreateRaw - an even higher-performance construction that bypasses everything managed and just is calling the native code via p/invoke

Source for benchmarks: https://github.com/mono/SkiaSharp/pull/2127/files#diff-85b6258ec115fee242f199220e54c7eb66db621306f800eebeee85fb054aa36c

// * Summary *

BenchmarkDotNet=v0.13.1, OS=Windows 10.0.22000
Intel Core i9-9980HK CPU 2.40GHz, 1 CPU, 16 logical and 8 physical cores
.NET SDK=6.0.400-preview.22301.10
  [Host]   : .NET 6.0.6 (6.0.622.26707), X64 RyuJIT
  .NET 6.0 : .NET 6.0.6 (6.0.622.26707), X64 RyuJIT

Job=.NET 6.0  Runtime=.NET 6.0  InvocationCount=1
UnrollFactor=1

|             Method | SizeMB | IsSeekable |     Mean |   Error |   StdDev | Ratio | RatioSD | Allocated |
|------------------- |------- |----------- |---------:|--------:|---------:|------:|--------:|----------:|
|     CreatePrevious |      1 |      False | 376.0 ms | 4.02 ms |  3.35 ms |  1.00 |    0.00 | 457,008 B |
|             Create |      1 |      False | 363.0 ms | 7.01 ms |  8.35 ms |  0.98 |    0.02 | 184,816 B |
| CreateUnregistered |      1 |      False | 390.2 ms | 5.53 ms |  5.17 ms |  1.04 |    0.02 | 160,816 B |
|          CreateRaw |      1 |      False | 356.5 ms | 3.56 ms |  2.98 ms |  0.95 |    0.01 |     816 B |
|                    |        |            |          |         |          |       |         |           |
|     CreatePrevious |      1 |       True | 300.1 ms | 2.95 ms |  2.76 ms |  1.00 |    0.00 | 352,816 B |
|             Create |      1 |       True | 275.4 ms | 5.45 ms |  7.08 ms |  0.93 |    0.03 | 200,816 B |
| CreateUnregistered |      1 |       True | 261.9 ms | 4.03 ms |  3.77 ms |  0.87 |    0.01 | 176,816 B |
|          CreateRaw |      1 |       True | 378.2 ms | 5.92 ms |  5.53 ms |  1.26 |    0.02 |     816 B |
|                    |        |            |          |         |          |       |         |           |
|     CreatePrevious |     10 |      False | 464.9 ms | 8.93 ms |  8.77 ms |  1.00 |    0.00 |  46,456 B |
|             Create |     10 |      False | 458.2 ms | 8.96 ms | 15.92 ms |  0.99 |    0.04 |  19,216 B |
| CreateUnregistered |     10 |      False | 461.2 ms | 8.70 ms |  8.94 ms |  0.99 |    0.03 |  17,008 B |
|          CreateRaw |     10 |      False | 426.4 ms | 8.43 ms |  8.28 ms |  0.92 |    0.02 |     856 B |
|                    |        |            |          |         |          |       |         |           |
|     CreatePrevious |     10 |       True | 302.5 ms | 5.96 ms |  7.95 ms |  1.00 |    0.00 |  36,016 B |
|             Create |     10 |       True | 217.6 ms | 4.22 ms |  3.74 ms |  0.72 |    0.03 |  20,816 B |
| CreateUnregistered |     10 |       True | 210.6 ms | 3.57 ms |  3.16 ms |  0.70 |    0.02 |  18,416 B |
|          CreateRaw |     10 |       True | 425.8 ms | 5.80 ms |  5.43 ms |  1.41 |    0.05 |     816 B |

// * Hints *
Outliers
  SkDataCreateBenchmark.CreatePrevious: .NET 6.0     -> 4 outliers were removed (392.43 ms..414.06 ms)
  SkDataCreateBenchmark.Create: .NET 6.0             -> 1 outlier  was  removed (399.04 ms)
  SkDataCreateBenchmark.CreateRaw: .NET 6.0          -> 2 outliers were removed (374.71 ms, 376.22 ms)
  SkDataCreateBenchmark.CreateRaw: .NET 6.0          -> 1 outlier  was  detected (366.11 ms)
  SkDataCreateBenchmark.CreatePrevious: .NET 6.0     -> 1 outlier  was  detected (434.34 ms)
  SkDataCreateBenchmark.CreateUnregistered: .NET 6.0 -> 2 outliers were detected (445.68 ms, 446.14 ms)
  SkDataCreateBenchmark.CreatePrevious: .NET 6.0     -> 2 outliers were removed, 3 outliers were detected (279.98 ms, 336.67 ms, 339.74 ms)
  SkDataCreateBenchmark.Create: .NET 6.0             -> 1 outlier  was  removed (228.25 ms)
  SkDataCreateBenchmark.CreateUnregistered: .NET 6.0 -> 1 outlier  was  removed (226.38 ms)
  SkDataCreateBenchmark.CreateRaw: .NET 6.0          -> 1 outlier  was  detected (410.67 ms)

// * Legends *
  SizeMB     : Value of the 'SizeMB' parameter
  IsSeekable : Value of the 'IsSeekable' parameter
  Mean       : Arithmetic mean of all measurements
  Error      : Half of 99.9% confidence interval
  StdDev     : Standard deviation of all measurements
  Ratio      : Mean of the ratio distribution ([Current]/[Baseline])
  RatioSD    : Standard deviation of the ratio distribution ([Current]/[Baseline])
  Allocated  : Allocated memory per single operation (managed only, inclusive, 1KB = 1024B)
  1 ms       : 1 Millisecond (0.001 sec)

@mattleibow mattleibow marked this pull request as draft June 27, 2022 17:19
@mattleibow
Copy link
Contributor Author

Not quite sure what is going on here since the direct p/invokes are slower than the code that has to do all sorts of things. Maybe the diff is so minimal that it is just the electrons flowing at random speeds due to the walk of the dino earth?

# Conflicts:
#	benchmarks/SkiaSharp.Benchmarks/Program.cs
#	benchmarks/SkiaSharp.Benchmarks/SkiaSharp.Benchmarks.csproj
@mattleibow
Copy link
Contributor Author

Closing this as it does not seem to make a difference.

@mattleibow mattleibow closed this Feb 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant