
Support 'correction-of-prediction' encoding for linestrings #137

Open
Jakobovski opened this issue May 12, 2019 · 2 comments

@Jakobovski

I am suggesting a new encoding type for linestrings that is smaller for objects with minimal curvature, such as roads, which generally continue in their previous direction.

The current approach encodes the first point in full and then a delta for each following point. I tested a slightly different approach that is ~17% smaller (at least on the data I used). The idea is as follows: the first point is encoded in full, the second point is encoded as the delta from the first, and all subsequent points are encoded as a 'correction-of-prediction'.

The first two points are encoded exactly as in the current approach. To calculate the 3rd point (and each subsequent point), a vector is drawn from the first point to the second; that vector is added to the second point (this is the prediction of where the 3rd point should be), and the encoded value is used to correct the prediction. Mathematically:

p1: encoded as (x1, y1), calculated as (x1, y1)
p2: encoded as (dx1, dy1), calculated as (x1 + dx1, y1 + dy1)
p3: encoded as (dx2, dy2), calculated as p2 + (p2 - p1) + (dx2, dy2)
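The scheme above can be sketched in a few lines of Python. This is a minimal illustration of the proposal, not an implementation against the actual vector tile wire format; the function names and tuple representation are made up for clarity.

```python
def encode_cop(points):
    """Encode a list of (x, y) points with correction-of-prediction.

    The first entry is the absolute first point, the second is the
    delta from the first, and every later entry is the correction
    applied to a linear prediction p2 + (p2 - p1).
    """
    if not points:
        return []
    encoded = [points[0]]
    if len(points) > 1:
        encoded.append((points[1][0] - points[0][0],
                        points[1][1] - points[0][1]))
    for i in range(2, len(points)):
        (x1, y1), (x2, y2) = points[i - 2], points[i - 1]
        # predict: continue in the previous direction
        px, py = x2 + (x2 - x1), y2 + (y2 - y1)
        encoded.append((points[i][0] - px, points[i][1] - py))
    return encoded

def decode_cop(encoded):
    """Inverse of encode_cop: recover the original point list."""
    if not encoded:
        return []
    points = [encoded[0]]
    if len(encoded) > 1:
        x, y = points[0]
        dx, dy = encoded[1]
        points.append((x + dx, y + dy))
    for dx, dy in encoded[2:]:
        (x1, y1), (x2, y2) = points[-2], points[-1]
        px, py = x2 + (x2 - x1), y2 + (y2 - y1)
        points.append((px + dx, py + dy))
    return points
```

For a perfectly straight, evenly spaced line the corrections are all (0, 0), which is what makes the encoding attractive for roads.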

This works best for roads because they generally continue in their previous direction.

Next step:
Test this encoding on a larger dataset and compare results to current encoding.
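One way such a comparison could be set up: count the zigzag-varint bytes (the protobuf wire format, which vector tiles use for geometry) taken by plain delta encoding versus correction-of-prediction. The sample "road" below and all function names are invented for illustration.

```python
def zigzag(n):
    # maps 0, -1, 1, -2, ... to 0, 1, 2, 3, ...
    return 2 * n if n >= 0 else -2 * n - 1

def varint_bytes(u):
    # bytes needed for an unsigned varint (7 payload bits per byte)
    b = 1
    while u >= 0x80:
        u >>= 7
        b += 1
    return b

def cost(deltas):
    return sum(varint_bytes(zigzag(dx)) + varint_bytes(zigzag(dy))
               for dx, dy in deltas)

def delta_encode(points):
    out, px, py = [], 0, 0
    for x, y in points:
        out.append((x - px, y - py))
        px, py = x, y
    return out

def cop_encode(points):
    out = delta_encode(points[:2])
    for i in range(2, len(points)):
        (x1, y1), (x2, y2) = points[i - 2], points[i - 1]
        px, py = 2 * x2 - x1, 2 * y2 - y1  # linear prediction
        out.append((points[i][0] - px, points[i][1] - py))
    return out

# Gently curving "road": predictions are close, corrections small.
road = [(0, 0), (100, 10), (200, 22), (300, 36), (400, 52)]
print(cost(delta_encode(road)), cost(cop_encode(road)))  # → 14 11
```

Here the deltas of ~100 need two varint bytes each, while the corrections stay in the one-byte range, which is where the claimed savings would come from.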


joto commented May 14, 2019

It would be interesting to see how this approach works with other types of data. For instance, I would assume that data for buildings (4 corners, right angles) becomes worse, because the prediction will be quite bad. It might also depend on how large objects are to begin with compared to the (typically 4096) extent. After all, the new encoding only matters if the deltas become so much smaller that the varint encoding actually uses fewer bytes.
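The byte boundaries that matter here can be made concrete. Assuming zigzag + varint as in the protobuf wire format, a signed value fits in one byte only in [-64, 63] and in two bytes only in [-8192, 8191], so a smaller correction saves a byte only when it crosses one of these thresholds:

```python
def zigzag(n):
    # maps 0, -1, 1, -2, ... to 0, 1, 2, 3, ...
    return 2 * n if n >= 0 else -2 * n - 1

def varint_bytes(u):
    # bytes needed for an unsigned varint (7 payload bits per byte)
    b = 1
    while u >= 0x80:
        u >>= 7
        b += 1
    return b

for d in (63, 64, 8191, 8192):
    print(d, varint_bytes(zigzag(d)))
# → 63 1
#   64 2
#   8191 2
#   8192 3
```

A prediction that shrinks a delta from 70 to 5 saves a byte per coordinate; one that shrinks it from 50 to 5 saves nothing.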

I would want to see more data from diverse datasets before such a backwards-incompatible change is made.

Of course we can make this optional, which would help a bit with compatibility, but many options can lead to problems with interoperability between different implementations.

If we make it optional so the user can decide which one to use (say, the new one for roads and the old one for buildings), we put the burden on the user to decide which one is best for their data. In that case we would need good guidelines and/or software that does the right thing.

@Jakobovski (Author)

@joto

In my opinion the encoding type should be optional, as there are many cases where it is inferior to the current encoding. I think it should be left to the user to decide which encoding type is best for the given data.
