Add LabelCenterline function #426

Open
wants to merge 14 commits into
base: master

quincylvania

Closes #354. The goal is to replace acalcutt/osm-lakelines with a faster SQL function. I've never written anything this complex in SQL before, so any help with efficiency and best practices is appreciated. I'm also new to this project, so I'm not sure about where to hook this in.

The logic:

  1. Get the exterior polygon (ignoring islands) and use a convex hull if for some reason there's more than one exterior polygon
  2. Use ST_LineInterpolatePoints to evenly distribute vertices around the polygon
  3. Get ST_VoronoiLines from the interpolated vertices and convert to edges
  4. Filter to only edges within the polygon
  5. Recursively clip the shortest branch edges until only a single line is left
  6. Simplify, smooth, and return
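
Steps 1 through 4 can be sketched roughly as below. This is a minimal illustration, not the code in this PR: the function name `sketch_interior_voronoi_edges` and the `n_points` parameter are hypothetical, and it assumes the input is a single Polygon (no convex-hull fallback for multipolygons).

```sql
-- Hypothetical helper, not the PR's code: returns the Voronoi edges that
-- lie inside a polygon's exterior ring (steps 1-4 of the list above).
-- Assumes `poly` is a single Polygon; ST_ExteriorRing rejects MultiPolygons.
CREATE OR REPLACE FUNCTION sketch_interior_voronoi_edges(
    poly geometry,
    n_points integer DEFAULT 100  -- assumed vertex count, not from the PR
)
RETURNS SETOF geometry AS $$
    WITH shell AS (
        -- Step 1: keep only the exterior ring, dropping islands (holes).
        SELECT ST_ExteriorRing(poly) AS ring
    ),
    pts AS (
        -- Step 2: place vertices at even intervals along the ring.
        SELECT ST_LineInterpolatePoints(ring, 1.0::float8 / n_points) AS geom
        FROM shell
    ),
    voronoi AS (
        -- Step 3: build the Voronoi cell boundaries for those vertices.
        SELECT ST_VoronoiLines(geom) AS geom
        FROM pts
    ),
    edges AS (
        -- ST_VoronoiLines returns a MultiLineString; split it into its parts.
        SELECT (ST_Dump(geom)).geom AS geom
        FROM voronoi
    )
    -- Step 4: keep only the edges that fall within the shell polygon.
    SELECT e.geom
    FROM edges e
    CROSS JOIN shell s
    WHERE ST_Within(e.geom, ST_MakePolygon(s.ring));
$$ LANGUAGE sql IMMUTABLE;
```

The remaining edges form a branching skeleton; steps 5 and 6 (pruning short branches, then simplifying and smoothing) are left out of this sketch.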

SFCGAL is not required for this method.

It takes about 40 seconds to calculate lines for the 395 largest lakes of Michigan, which I think is already a big improvement over osm-lakelines. The recursion takes about 75% of this time and can probably be made vastly more efficient.

The results are relatively consistent with osm-lakelines and I think look pretty good, except when there are large islands or bays.

Screenshot 2023-02-27 at 9 47 15 PM

Screenshot 2023-02-27 at 9 47 01 PM

Screenshot 2023-02-27 at 9 45 37 PM

@nvkelso
Collaborator

nvkelso commented Feb 28, 2023

Rad!

@nyurik
Member

nyurik commented Feb 28, 2023

this is really impressive! The sql style is a bit different from other files (e.g. using ', etc), but no biggie. I will be studying it for a bit for sure

@pnorman
Collaborator

pnorman commented Feb 28, 2023

I'll try to give this a detailed review, as it's something I've been interested in.

@nyurik
Member

nyurik commented Feb 28, 2023

Among the above examples, I think only this one seems a bit incorrect:
image

@quincylvania
Author

The sql style is a bit different from other files (e.g. using ', etc), but no biggie.

I converted these to PL/pgSQL functions to avoid the block string, but I see that this also enables some language features we might be able to use to speed up the code. I'll look into it at some point.

Among the above examples, I think only this one seems a bit incorrect:

Yes, the more circular the feature, the harder it is to determine a sensible centerline. We might try baking in a threshold where circular stuff just defaults to a horizontal line.
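
One way such a threshold could be sketched, assuming Polsby-Popper compactness as the roundness measure (the thread does not name one). Both function names below are hypothetical and are not part of this PR.

```sql
-- Hypothetical helper: Polsby-Popper compactness, 4*pi*area / perimeter^2.
-- A perfect circle scores 1; elongated shapes score closer to 0.
CREATE OR REPLACE FUNCTION sketch_roundness(poly geometry)
RETURNS double precision AS $$
    -- NULLIF guards against division by zero for degenerate input.
    SELECT 4 * pi() * ST_Area(poly) / NULLIF(ST_Perimeter(poly) ^ 2, 0);
$$ LANGUAGE sql IMMUTABLE;

-- Hypothetical fallback: a horizontal line spanning the bounding box,
-- drawn at the Y coordinate of the centroid.
CREATE OR REPLACE FUNCTION sketch_horizontal_line(poly geometry)
RETURNS geometry AS $$
    SELECT ST_SetSRID(
        ST_MakeLine(
            ST_MakePoint(ST_XMin(poly), ST_Y(ST_Centroid(poly))),
            ST_MakePoint(ST_XMax(poly), ST_Y(ST_Centroid(poly)))
        ),
        ST_SRID(poly)
    );
$$ LANGUAGE sql IMMUTABLE;

-- A buffered point approximates a circle, so this scores slightly below 1.
SELECT sketch_roundness(ST_Buffer('POINT(0 0)'::geometry, 1));
```

A caller could compare `sketch_roundness` against a cutoff and return `sketch_horizontal_line` when the shape is round enough; the cutoff itself would need tuning against real lakes.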

We could also achieve better results by allowing multiple lines in the output to account for islands and bays, perhaps based on the zoom. Lots of shapes just really aren't modeled sufficiently by a single line. Ex: Manicouagan Reservoir.

Screenshot 2023-02-28 at 11 17 21 AM

Collaborator

@pnorman pnorman left a comment

As all three functions are simply a single SQL statement and returning the result of that statement, they should be SQL, not plpgsql. This has performance advantages by avoiding the time to start up a plpgsql environment every function call, and allowing inlining.

More comments would be good, as it's not clear what's going on in all steps.

Minor code nits:

  • these functions should probably all be STRICT
  • caps on as vs AS is inconsistent
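
As a rough illustration of the conversion, here is the same one-expression function written both ways. The function names are hypothetical and the body is a stand-in, not one of the functions in this PR.

```sql
-- PL/pgSQL form: a BEGIN ... END block wrapped around a single expression.
CREATE OR REPLACE FUNCTION example_npoints_plpgsql(geom geometry)
RETURNS integer AS $$
BEGIN
    RETURN ST_NPoints(geom);
END;
$$ LANGUAGE plpgsql IMMUTABLE STRICT;

-- Equivalent SQL form: the body is just the statement whose result is returned.
-- STRICT makes the function return NULL without running when any argument is NULL.
CREATE OR REPLACE FUNCTION example_npoints_sql(geom geometry)
RETURNS integer AS $$
    SELECT ST_NPoints(geom);
$$ LANGUAGE sql IMMUTABLE STRICT;
```

PostgreSQL's inlining rules are more restrictive for functions declared STRICT, so it may be worth checking with EXPLAIN whether a given function is actually inlined once both changes are made.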

CREATE OR REPLACE FUNCTION CountDisconnectedEndpoints(polyline geometry, testline geometry)
RETURNS integer AS $$
BEGIN
RETURN ST_NPoints(ST_RemoveRepeatedPoints(ST_Points(polyline)))
Collaborator

An explanation of the equation would be good here.

sql/LabelCenterline.sql Outdated Show resolved Hide resolved
@quincylvania
Copy link
Author

Okay, I added a number of enhancements to get nicer results in a reasonably fast timeframe. It now takes just 15 seconds to calculate the basic centerlines for the 395 largest lakes of Michigan, down from 40 seconds. Smoothing is now done proportionally to the size of the feature.

Screenshot 2023-03-10 at 1 42 49 PM

Multiple lines over a given length threshold can be included in the output. This allows more accurate labeling at higher zooms.

Screenshot 2023-03-10 at 1 44 08 PM

Holes over a given size can be included in the calculation as well. This takes a little longer to calculate, 22 seconds for these lakes.

Screenshot 2023-03-10 at 1 44 55 PM

More examples:
Screenshot 2023-03-10 at 1 46 03 PM
Screenshot 2023-03-10 at 1 51 02 PM
Screenshot 2023-03-10 at 1 46 53 PM
Screenshot 2023-03-10 at 1 46 30 PM

There are six parameters to customize these behaviors.

Successfully merging this pull request may close these issues.

Updating lake centerlines
4 participants