Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BaseUniqueIDGenerator uses System.currentTimeMillis #3

Open
buko opened this issue Jun 30, 2019 · 13 comments
Open

BaseUniqueIDGenerator uses System.currentTimeMillis #3

buko opened this issue Jun 30, 2019 · 13 comments

Comments

@buko
Copy link

buko commented Jun 30, 2019

BaseUniqueIDGenerator uses System.currentTimeMillis to determine the current millisecond.

This makes it impossible to regenerate a sequence of IDs given a sequence of timestamps and calls to the IDGenerator. So this library can't be used in deterministic systems that need to replay a sequence of events and regenerate the exact same IDs. It also makes it impossible to write deterministic tests.

It would be nice if it was possible to pass a 'Clock' to the interface. The Clock might have a single method, 'currentTimeMillis()'. Custom implementations could provide their own Clock implementation which would provide for determinism. The default 'WallClock' could of course continue to use System.currrentTimeMillis() to get the wall time.

jdhoek added a commit that referenced this issue Jul 1, 2019
@jdhoek
Copy link
Member

jdhoek commented Jul 1, 2019

Sounds sensible; even for testing alone it makes sense to do this. Please have a look at #4 for a possible approach.

I'm not sure if this would create a truly deterministic scenario in normal use though. Wouldn't there be too many factors to control to create a scenario that can be replayed?

@buko
Copy link
Author

buko commented Jul 1, 2019

#4 looks fine. (Though I would suggest that the method really should be called 'currentTimeMillis()' and not 'getCurrentTimeMillis'. This value is not a standard Javabean property. It calculates a value. You could also just call it 'time()' since even 'currentTimeMillis()' is pretty redundant.)

There's a lot to like about this project. The code is high-quality and well-tested and the support for SPREAD and SEQUENTIAL is nice. But we can't use it as is. The use of ByteBuffer.allocate() (instead of shifting bits into a long), the Thread.sleep and the garbage generated by the Blueprint object allocation make it a no-go for use in a high-performance system.

I'm curious if you're open to further refactoring of the code? I understand if you like the code the way it is and heck if it ain't broke don't fix it, but I think with some minor tweaks this could be a useful project.

@buko
Copy link
Author

buko commented Jul 1, 2019

When I say ByteBuffer.allocate what I meant is that it's possible to just use a long (created on the stack) and then shift the information into the long. See https://www.callicoder.com/distributed-unique-id-sequence-number-generator/ .

The irony is that I've written code to do this several times but never open-sourced it. Would be nice to get it into a decent open-source Apache2 project once and for all.

@jdhoek
Copy link
Member

jdhoek commented Jul 1, 2019

Though I would suggest that the method really should be called 'currentTimeMillis()' and not 'getCurrentTimeMillis'.

Fair enough.


Thank you for the input — we're certainly open to pull requests. This library scratches a particular itch we have, but as far as I know we (Lable, a small healthcare SaaS company in the Netherlands) are its only serious user at the moment. (Although, who knows?)

The use of ByteBuffer.allocate() (instead of shifting bits into a long), the Thread.sleep and the garbage generated by the Blueprint object allocation make it a no-go for use in a high-performance system.

The ByteBuffer usage sounds trivial to fix. I probably wrote it that way for clarity at first; I agree that a more performant manner is preferable.

I'd like to know more about the Blueprint object allocation and Thread.sleep issues you point out. Is there a way to avoid using Thread.sleep in this manner?

jdhoek added a commit that referenced this issue Jul 1, 2019
@buko
Copy link
Author

buko commented Jul 1, 2019

We try to avoid object allocation unless it's absolutely necessary. It's why we would want to return a 'long' rather than allocating a new byte[]. The Blueprint object doesn't seem to add much value honestly it's just wrapping four parameters that could be passed directly to IDBuilder.

For the Thread.sleep issue we'd want to raise an exception and fail message processing rather than sleeping the thread. We really don't want to keep sleep in a high performance system, like ever. This could be a configurable policy issue?

This library scratches a particular itch we have, but as far as I know we (Lable, a small healthcare SaaS company in the Netherlands) are its only serious user at the moment. (Although, who knows?)

I hear that. I think everybody just ends up writing their own idgeneration stuff. But I've written such code a dozen times and I was thinking it might be nice to open-source it. You guys already have this nice project which is basically what we want and the maven integration and the zookeeper stuff which looks interesting.

All the changes I'm thinking of are purely about performance. But the change from byte[] to long would be a big, API-breaking change that you and your clients might not like. It might be better for us to do our own thing in that case. Let me know what you think.

@jdhoek
Copy link
Member

jdhoek commented Jul 1, 2019

The Blueprint object doesn't seem to add much value honestly it's just wrapping four parameters that could be passed directly to IDBuilder.

The Blueprint is more useful when deconstructing a generated ID for its constituent parts. I agree that it isn't necessary internally at the generation step.

We really don't want to keep sleep in a high performance system, like ever. This could be a configurable policy issue?

Certainly, making it configurable behaviour is simple enough. But how will you handle the generation of many IDs at roughly the same time? This library limits itself to 8 bytes as a fundamental design choice, which limits the number of IDs that can be generated by a single generator to 64 per millisecond (there can of course be several generators — up to 256). This can be fine depending on the intended use — in our case, a single process may need to assign hundreds of IDs to, e.g., newly created database records, but the processing of the input data tends to take enough time that even the low limit of 64 per millisecond (per generator) is not something we tend to hit.

But the library is designed to handle the rare cases where it does need to generate more IDs than it can fit into a single millisecond. It simply waits for the next clock tick if it does hit that limit — i.e., the Thread.sleep won't happen often. How would a process that foregoes the waiting handle the case where 64 IDs are generated within a single millisecond?

We try to avoid object allocation unless it's absolutely necessary. It's why we would want to return a 'long' rather than allocating a new byte[]. […] But the change from byte[] to long would be a big, API-breaking change that you and your clients might not like.

It's probably not as big a change as it seems, or at least one that can be affected gradually. I think I can keep the original interface methods in place as default methods wrapping the long at first.

@buko
Copy link
Author

buko commented Jul 1, 2019

How would a process that foregoes the waiting handle the case where 64 IDs are generated within a single millisecond?

I think 64 per a millisecond would be enough. That's 64K a second. We're looking to process ~20K messages a second. When a message arrives time freezes and everything happens at once. I don't see us ever generating more than 10-20 ids for each message. This is all happening in a single thread.

Anyways if we did generate more than 64 I think the better thing to do would be to surface the exception, reject the request, and move on to the next message. We wouldn't want the processing of one message to freeze the whole system in such an unpredictable manner and slow down the processing of all subsequent messages.

@jdhoek
Copy link
Member

jdhoek commented Jul 1, 2019

Anyways if we did generate more than 64 I think the better thing to do would be to surface the exception, reject the request, and move on to the next message. We wouldn't want the processing of one message to freeze the whole system in such an unpredictable manner and slow down the processing of all subsequent messages.

I guess that depends on your processing architecture. Anyway, making that a configurable setting is not that involved.

@buko
Copy link
Author

buko commented Jul 1, 2019

I can create the following issues if you're interested:

  1. Zero garbage. Don't allocate any objects on the id generation hotpath.
  2. Make the policy decision to sleep vs throw an exception on sequence overflow configurable via standard Java. (We don't need any properties file or runtime configuration option here).
  3. Add an Automatic Module Name (http://branchandbound.net/blog/java/2017/12/automatic-module-name//#disqus_thread) to make the jars usable in Java9+

I think those are the basic changes we'd be interested in. Let me know what you think.

BTW, one thing I might add is that the documentation is great but a little confusing in some places. I think it's because what you call a 'cluster ID' I've always known as 'instance ID' -- it identifies a unique instance within a cluster. I found this line in particular:

In addition to the generator-ID, the cluster-ID allows for 16 'clusters' of generators to simultaneously generate identifiers.

Kept throwing me off. The word cluster here is being used to refer to the opposite of a computing cluster, a computing instance. This could be reworded to be a little clearer:

In addition to the generator-ID each process also has a cluster-ID. There can be up to 16 processes within a cluster that share the same generator-ID but have different cluster-IDs (0-15). All 16 processes can generate IDs simultaneously and their IDs are guaranteed to always be unique.

@jdhoek
Copy link
Member

jdhoek commented Jul 2, 2019

generator-ID, […] cluster-ID

The purpose of the generator-IDs is to be able to run a number of generators concurrently in multiple processes on multiple servers (mostly running Tomcat instances in our case). For the (automatic) selection of a generator-ID we use a queueing mechanism in the ZooKeeper quorum (that is, claiming a generator-ID is generally instantaneous, but the queueing algorithm is there to handle multiple processes coming in to claim a generator-ID at roughly the same time). The cluster-ID is the same for all processes attempting to claim an available generator-ID on the same ZooKeeper quorum.

With the cluster-ID it is possible to run ID-generators on other server-farms that don't share the same ZooKeeper quorum (in our case a second server-farm in another hosting location), without generating duplicate IDs. This is useful in our case, because we replicate data between our server-farms.

Given that explanation, do you think the terms generator-ID and cluster-ID are chosen wrong?

@buko
Copy link
Author

buko commented Jul 2, 2019

Interesting. It's certainly a different perspective :)

I come from a domain where you have an application and there are multiple instances of the same application running in a cluster at the same time. Each instance of the application is processing a subset of the incoming request stream. Each instance of the application has a primary and backup. I think this is also what Wikipedia calls cluster -- multiple machine nodes all running the same code, doing the same thing, that look like one machine to a user.

What you call 'generator id' I've always known as 'application id' and what you call 'cluster id' I've always known as 'instance id'.

Given that explanation, do you think the terms generator-ID and cluster-ID are chosen wrong?

I don't think it's wrong per se it's just different. For people who come from a world where cluster means one application running simultaneously across multiple machine instances it's not immediately obvious that you're using the word 'cluster' to refer to a kind of Zookeeper namespace -- even nodes that end up with the same generator-id will have generate different ids when they're connected to different zookeepers.

Anyways it confused me for a bit but I eventually figured it out :)

@jdhoek
Copy link
Member

jdhoek commented Jul 2, 2019

I can create the following issues if you're interested:

1. Zero garbage. Don't allocate any objects on the id generation hotpath.

I think the conversion from byte[] to long is probably the most involved part, especially in IDBuilder#build. I'd like to keep Blueprint around as the output of IDBuilder#parse.

2. Make the policy decision to sleep vs throw an exception on sequence overflow configurable via standard Java. (We don't need any properties file or runtime configuration option here).

Passed as a boolean to BaseUniqueIDGenerator's constructor?

3. Add an Automatic Module Name (http://branchandbound.net/blog/java/2017/12/automatic-module-name//#disqus_thread) to make the jars usable in Java9+

We haven't gotten around to progressing to Java 11 yet. If this helps anyone using the library in Java 9+, by all means.

Sounds great. Could you branch of from feature/clock?

@jdhoek
Copy link
Member

jdhoek commented Jul 2, 2019

Anyways it confused me for a bit but I eventually figured it out :)

Thanks for the feedback. I'll improve the documentation on that point with some examples.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants