
posts/5-java-mundane-performance-tricks #30

Open
utterances-bot opened this issue Nov 29, 2021 · 11 comments

Comments

@utterances-bot

5 Mundane Java Performance Tips | Richard Startin’s Blog

Most of the time it isn’t really necessary to optimise software, but this post contains 5 tips to avoid making software written in Java slower for the sake of it.

https://richardstartin.github.io/posts/5-java-mundane-performance-tricks


Very informative


cbaldan commented Nov 29, 2021

Great tips!
Thanks for sharing


If we're going to go to the trouble of making a composite-key "wrapper" object (item 2), we should consider making it mutable and reusing it.

I created my own Pair (running Java 17, so no javafx), made its elements mutable, and added another benchmark:

@Benchmark
public void wrapReuse(Blackhole bh) {
    // one pre-allocated key, reused for every lookup; set(...) returns this
    Pair mapKey = new Pair(null, null);
    for (int i = 0; i < prefixes.length; ++i) {
        bh.consume(pairMap.get(mapKey.set(prefixes[i], suffixes[i])));
    }
}

I see a further 5% performance improvement and (obviously) a complete elimination of allocations:

Benchmark                                                     (size)  Mode  Cnt    Score     Error   Units
CompositeLookup.concatenate                                     1024  avgt    5  126.628 ±   2.713   ns/op
CompositeLookup.concatenate:·gc.alloc.rate.norm                 1024  avgt    5   96.002 ±   0.001    B/op
CompositeLookup.wrap                                            1024  avgt    5   44.994 ±   0.874   ns/op
CompositeLookup.wrap:·gc.alloc.rate.norm                        1024  avgt    5   24.001 ±   0.001    B/op
CompositeLookup.wrapReuse                                       1024  avgt    5   42.706 ±   0.065   ns/op
CompositeLookup.wrapReuse:·gc.alloc.rate.norm                   1024  avgt    5    0.023 ±   0.001    B/op

Of course, one must make sure never to reuse a "wrapper" object that has been put into the map.
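A reusable, mutable Pair along these lines might look like the following (a sketch with assumed names, not the commenter's actual class):

```java
// Sketch of a mutable Pair whose set(...) returns this, so a map
// lookup can be written inline as pairMap.get(mapKey.set(p, s)).
final class Pair {
    private Object first;
    private Object second;

    Pair(Object first, Object second) {
        this.first = first;
        this.second = second;
    }

    Pair set(Object first, Object second) {
        this.first = first;
        this.second = second;
        return this; // enables the inline lookup style above
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof Pair p
            && java.util.Objects.equals(first, p.first)
            && java.util.Objects.equals(second, p.second);
    }

    @Override
    public int hashCode() {
        // combines the elements' (often cached) hash codes
        return 31 * java.util.Objects.hashCode(first)
             + java.util.Objects.hashCode(second);
    }
}
```

Because equals and hashCode track the current elements, one instance can probe for any key; the danger is only in storing it in the map.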

@richardstartin (Owner)

@danjmarques that saves 2ns and 24 bytes per lookup, but risks a memory leak. I would proceed with caution in that direction, and only if this is your application’s primary function and bottleneck.


@danjmarques @richardstartin The java.util.Map Javadoc warns that mutable keys are dangerous for the same reason. But, to add, the flyweight technique can be helpful, and I have seen it used successfully in other cases (though not with maps) when isolated behind an abstraction, e.g. a custom get method which internally uses a private mutable type with an equals/hashCode compatible with the stored key type.

That said, unless this is a critical bottleneck, why not avoid the risk and the extra work and let the JVM optimize (maybe scalar replacement is even possible in some cases)? When JEP 169 (value types) goes GA, compatible value-type replacements might also supersede this technique for keys composed of primitives.
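The abstraction suggested here could be sketched roughly like this (all names assumed; in OpenJDK's HashMap, get invokes equals on the query key, so a private mutable twin with a compatible hashCode works for lookups):

```java
// Sketch: the mutable probe key never escapes PairLookup, so callers
// cannot store it in the map by accident. Not thread-safe, since one
// probe instance is reused across lookups.
final class PairLookup<V> {
    // immutable key type used for storage
    static final class Key {
        final Object first, second;
        Key(Object first, Object second) {
            this.first = first;
            this.second = second;
        }
        @Override
        public int hashCode() {
            return 31 * java.util.Objects.hashCode(first)
                 + java.util.Objects.hashCode(second);
        }
        @Override
        public boolean equals(Object o) {
            return o instanceof Key k
                && java.util.Objects.equals(first, k.first)
                && java.util.Objects.equals(second, k.second);
        }
    }

    // private mutable twin with the same hash and a cross-type equals;
    // HashMap.get calls probe.equals(storedKey), so this is sufficient
    private static final class Probe {
        Object first, second;
        @Override
        public int hashCode() {
            return 31 * java.util.Objects.hashCode(first)
                 + java.util.Objects.hashCode(second);
        }
        @Override
        public boolean equals(Object o) {
            return o instanceof Key k
                && java.util.Objects.equals(first, k.first)
                && java.util.Objects.equals(second, k.second);
        }
    }

    private final java.util.Map<Key, V> map = new java.util.HashMap<>();
    private final Probe probe = new Probe(); // reused, never stored

    void put(Object first, Object second, V value) {
        map.put(new Key(first, second), value); // always a fresh immutable key
    }

    V get(Object first, Object second) {
        probe.first = first;
        probe.second = second;
        return map.get(probe); // allocation-free lookup
    }
}
```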

@danjmarques

@mikejhill @richardstartin Mutability of keys is dangerous, of course, if the key object is the one stored in the map. For lookups, mutable keys are safe (presuming some other thread isn't changing the key).

It seems that, once it becomes important to make "lookup much faster and reduce allocation rate", you might want to make it even faster and eliminate allocation altogether.

The flyweight-as-map-key is actually a pretty standard trick in low-latency messaging. Say each message you receive has a "topic" and you need to do some lookup based on that topic. If your message is just some bytes received into a buffer, and you know the "topic" sits at some offset N in the buffer with length L, you just wrap the buffer in your pre-allocated flyweight key and do the lookup with that. Of course, you must be careful not to accidentally store that pre-allocated key in the map.
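The buffer-wrapping flyweight described here might be sketched as follows (assumed names, not production code; keys stored in the map must own copies of their bytes):

```java
// Sketch of a flyweight key that views a region of a reused byte
// buffer, so a topic lookup allocates nothing.
final class TopicKey {
    byte[] buffer;
    int offset, length;

    // point the flyweight at topic bytes in a received message buffer
    TopicKey wrap(byte[] buffer, int offset, int length) {
        this.buffer = buffer;
        this.offset = offset;
        this.length = length;
        return this;
    }

    // keys stored in the map copy their bytes, so they stay valid
    // after the message buffer is reused
    static TopicKey copyOf(byte[] buffer, int offset, int length) {
        return new TopicKey().wrap(
            java.util.Arrays.copyOfRange(buffer, offset, offset + length),
            0, length);
    }

    @Override
    public int hashCode() {
        int h = 1;
        for (int i = 0; i < length; i++) {
            h = 31 * h + buffer[offset + i];
        }
        return h;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof TopicKey k
            && k.length == length
            && java.util.Arrays.equals(buffer, offset, offset + length,
                                       k.buffer, k.offset, k.offset + k.length);
    }
}
```

One pre-allocated TopicKey can then be re-pointed at each incoming message's topic bytes for the lookup.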

@richardstartin (Owner)

@danjmarques there are applications where it makes sense to prevent the GC from running at all, so allocation must be kept to an absolute minimum, but the post isn’t aimed at people working on low-latency systems. It’s more about how someone working on the median Java application can arm themselves with simple defence mechanisms against unnecessary inefficiency. You’re right that you can go further than this post demonstrates, but it wouldn’t be appropriate on some teams, where some developer will end up putting a mutable key in the map and mutating it.

@danjmarques

@richardstartin "...about how someone working on the median Java application ... simple defence mechanisms to avoid unnecessary inefficiency".

Yes, understood, and I think your examples do a great job of that. I was just expanding on one of them: I figured that a developer who saw and understood the ideas you presented for the first time might be interested in pushing them a bit further ("can we eliminate the allocations completely? how much performance do we gain?"). And I think that once someone understands why that flyweight key can't be put into the map, they've shown pretty significant maturity as a developer (I love asking a similar question when I interview).


jmzc commented Dec 9, 2021

I don't understand, in tip number 2, why wrapping is better than String concatenation. You say "String instance caches its hash code, and constructing a new String requires calculation of a new hash code and String’s hash code algorithm also isn’t very efficient."

I guess that it's also required to calculate the Pair hash in

pairMap.get(new Pair(prefixes[i], suffixes[i]))

and I don't think that Pair's hashing algorithm is worse than String's.

Thanks

@richardstartin (Owner)

> I don't understand , on tip number 2, the reason why wrapping is better than String concatenation . You say "String instance caches its hash code, and constructing a new String requires calculation of a new hash code and String’s hash code algorithm also isn’t very efficient."
>
> I guess that it's also required to calculate Pair hash in
> pairMap.get(new Pair(prefixes[i], suffixes[i]))
>
> and I don't think that Pair's hashing algorithm be worse than String's

Let's say you have two Strings with n bytes combined; then you have to hash n bytes. If you have a pair, you just have to combine two precomputed hashes, probably by multiplying one of them by a prime and adding the other. So they have different complexities.
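The constant-time combination described here might look like the following (Pair is an assumed name; String.hashCode itself walks every character but caches the result per instance):

```java
// Illustration of the complexity argument: a concatenated String key
// must hash all n characters of the newly built String, while a pair
// key just combines two hash codes the Strings have usually already
// computed and cached.
record Pair(String prefix, String suffix) {
    @Override
    public int hashCode() {
        // O(1): combine the two (cached) String hashes
        return 31 * prefix.hashCode() + suffix.hashCode();
    }
}
```

The record's generated equals still compares components, so the custom hashCode keeps the equals/hashCode contract.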


s1ck commented Jan 6, 2022

This is very interesting. Thanks for sharing!

I reproduced your benchmarks and realized that, for Use enums instead of constant Strings, the length of the enum value names seems to matter. If I use ten single-letter names, e.g. A to J, the HashMap is actually faster than the enum map. For longer value names (~10 chars), I see the behavior you showed, but not as drastically: more like 60us/op vs 80us/op. If I use your AnEnum I don't see any difference.
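For context, a minimal sketch of the two map shapes being compared (assumed names, not the commenter's actual benchmark code): an EnumMap indexes by enum ordinal, while a HashMap keyed by the names must hash the String on every lookup, which is cheap for one-character names and costlier for longer ones.

```java
import java.util.EnumMap;
import java.util.HashMap;
import java.util.Map;

// hypothetical enum standing in for the benchmark's constants
enum Level { LOW, MEDIUM, HIGH }

class Maps {
    // ordinal-indexed, no hashing on lookup
    static final Map<Level, Integer> BY_ENUM = new EnumMap<>(Level.class);
    // hashes the String name on every lookup
    static final Map<String, Integer> BY_NAME = new HashMap<>();
    static {
        for (Level l : Level.values()) {
            BY_ENUM.put(l, l.ordinal());
            BY_NAME.put(l.name(), l.ordinal());
        }
    }
}
```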

These are my results:

Benchmark                                          (seed)  (size)  Mode  Cnt   Score    Error   Units
EnumMapBenchmark.longEnumMap                           42   10000  avgt   10  57.703 ±  3.460   us/op
EnumMapBenchmark.longEnumMap:·gc.alloc.rate            42   10000  avgt   10  ≈ 10⁻⁴           MB/sec
EnumMapBenchmark.longEnumMap:·gc.alloc.rate.norm       42   10000  avgt   10   0.024 ±  0.001    B/op
EnumMapBenchmark.longEnumMap:·gc.count                 42   10000  avgt   10     ≈ 0           counts
EnumMapBenchmark.longHashMap                           42   10000  avgt   10  79.336 ±  0.284   us/op
EnumMapBenchmark.longHashMap:·gc.alloc.rate            42   10000  avgt   10  ≈ 10⁻⁴           MB/sec
EnumMapBenchmark.longHashMap:·gc.alloc.rate.norm       42   10000  avgt   10   0.034 ±  0.004    B/op
EnumMapBenchmark.longHashMap:·gc.count                 42   10000  avgt   10     ≈ 0           counts
EnumMapBenchmark.shortEnumMap                          42   10000  avgt   10  57.722 ±  3.789   us/op
EnumMapBenchmark.shortEnumMap:·gc.alloc.rate           42   10000  avgt   10  ≈ 10⁻⁴           MB/sec
EnumMapBenchmark.shortEnumMap:·gc.alloc.rate.norm      42   10000  avgt   10   0.024 ±  0.002    B/op
EnumMapBenchmark.shortEnumMap:·gc.count                42   10000  avgt   10     ≈ 0           counts
EnumMapBenchmark.shortHashMap                          42   10000  avgt   10  57.004 ±  0.565   us/op
EnumMapBenchmark.shortHashMap:·gc.alloc.rate           42   10000  avgt   10  ≈ 10⁻⁴           MB/sec
EnumMapBenchmark.shortHashMap:·gc.alloc.rate.norm      42   10000  avgt   10   0.024 ±  0.005    B/op
EnumMapBenchmark.shortHashMap:·gc.count                42   10000  avgt   10     ≈ 0           counts

My code is here

Again, thanks for sharing!


8 participants