[SEDONA-237] Add ST_Dimension #867

Merged
merged 23 commits into from
Jun 25, 2023
Changes from 10 commits
@@ -554,6 +554,9 @@ public static Geometry split(Geometry input, Geometry blade) {
        return new GeometrySplitter(GEOMETRY_FACTORY).split(input, blade);
    }

    public static Integer dimension(Geometry geometry) {
        return Math.max(geometry.getDimension(), 0); // clamp: JTS reports -1 (Dimension.FALSE) for an empty GEOMETRYCOLLECTION
    }

    /**
     * get the coordinates of a geometry and transform to Google s2 cell id
@@ -268,6 +268,21 @@ public void splitHeterogeneousGeometryCollection() {
        assertNull(actualResult);
    }

Contributor:

Please split the different test cases into separate tests, since these are unit tests.

You can have tests like dimensionGeom2D, dimensionGeomCollection, dimensionGeomEmpty, etc.

Also, use the assertEquals(expected, actual) format so that a test failure reports the expected and actual values properly.

Additionally, please add test cases for 3D geometries as well.

    @Test
    public void dimensionSingleGeometry() {
        Geometry point = GEOMETRY_FACTORY.createPoint(new Coordinate(1, 2));
        assert Functions.dimension(point) == 0;

        Geometry lineString = GEOMETRY_FACTORY.createLineString(coordArray(0.0, 0.0, 1.5, 1.5));
        assert Functions.dimension(lineString) == 1;

        Geometry polygon = GEOMETRY_FACTORY.createPolygon(coordArray(0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0));
        assert Functions.dimension(polygon) == 2;

        GeometryCollection geometryCollection = GEOMETRY_FACTORY.createGeometryCollection(new Geometry[]{point, lineString, polygon});
        assert Functions.dimension(geometryCollection) == 2;
    }

    private static boolean intersects(Set<?> s1, Set<?> s2) {
        Set<?> copy = new HashSet<>(s1);
        copy.retainAll(s2);
14 changes: 14 additions & 0 deletions docs/api/sql/Function.md
@@ -400,6 +400,20 @@ Result:
POLYGON ((0 -3, -3 -3, -3 3, 0 3, 0 -3))
```

## ST_Dimension

Introduction: Return the topological dimension of this Geometry object, which must be less than or equal to the coordinate dimension. OGC SPEC s2.1.1.1 - returns 0 for POINT, 1 for LINESTRING, 2 for POLYGON, and the largest dimension of the components of a GEOMETRYCOLLECTION. If the dimension is unknown (e.g. for an empty GEOMETRYCOLLECTION) 0 is returned.

Format: `ST_Dimension (A:geometry), ST_Dimension (C:geometrycolletion), `

Contributor:

Typo in "collection" here.


Since: `v1.0.0`

Member:

Should be v1.5.0

Member:

Please provide docs for Sedona Flink as well.

Contributor Author:

Done.


Spark SQL example:
```sql
SELECT ST_Dimension('GEOMETRYCOLLECTION(LINESTRING(1 1,0 0),POINT(0 0))');
```

Member:

Please provide the output of the example.

In addition, have you tested other possible types, such as MultiPoint, MultiLineString, and MultiPolygon? A GeometryCollection can also contain nested Multi objects.

We are trying to keep behavior consistent with PostGIS. Please install PostGIS on your end and test the output there as well.
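
To make the rule in the ST_Dimension introduction concrete, and to cover the Multi* and nested-collection cases raised in review, here is a minimal pure-Python sketch. It is an illustrative model only, not Sedona or JTS code: the `(type_name, children)` tuple representation and the names `BASE_DIMENSION` and `topological_dimension` are assumptions made up for this example. The empty-collection result of 0 follows the documented behavior above.

```python
# Illustrative model only: geometries are plain (type_name, children) tuples,
# a hypothetical representation that is not part of Sedona or JTS.
BASE_DIMENSION = {
    "POINT": 0, "MULTIPOINT": 0,
    "LINESTRING": 1, "MULTILINESTRING": 1,
    "POLYGON": 2, "MULTIPOLYGON": 2,
}


def topological_dimension(geom):
    """Return 0, 1 or 2 following the rule described in the docs text."""
    type_name, children = geom
    if type_name == "GEOMETRYCOLLECTION":
        # Largest dimension among the components, recursing into nested
        # collections; default=0 covers the empty collection.
        return max((topological_dimension(c) for c in children), default=0)
    return BASE_DIMENSION[type_name]


point = ("POINT", [])
line = ("LINESTRING", [])
nested = ("GEOMETRYCOLLECTION", [point, ("GEOMETRYCOLLECTION", [("MULTIPOLYGON", [])])])

print(topological_dimension(("GEOMETRYCOLLECTION", [line, point])))
print(topological_dimension(nested))
```

The recursion is what makes nested collections work: a collection holding only another collection still reports the dimension of the deepest component.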


## ST_Distance

Introduction: Return the Euclidean distance between A and B
1 change: 1 addition & 0 deletions flink/src/main/java/org/apache/sedona/flink/Catalog.java
@@ -43,6 +43,7 @@ public static UserDefinedFunction[] getFuncs() {
                new Functions.ST_Buffer(),
                new Functions.ST_ConcaveHull(),
                new Functions.ST_Envelope(),
                new Functions.ST_Dimension(),
                new Functions.ST_Distance(),
                new Functions.ST_DistanceSphere(),
                new Functions.ST_DistanceSpheroid(),
@@ -17,6 +17,7 @@
import org.apache.flink.table.annotation.DataTypeHint;
import org.apache.flink.table.functions.ScalarFunction;
import org.locationtech.jts.geom.Geometry;
import org.locationtech.jts.geom.GeometryCollection;
import org.opengis.referencing.FactoryException;
import org.opengis.referencing.operation.TransformException;

@@ -88,6 +89,15 @@ public Geometry eval(@DataTypeHint(value = "RAW", bridgedTo = org.locationtech.j
        }
    }

    public static class ST_Dimension extends ScalarFunction {
        @DataTypeHint("Integer")
        public Integer eval(@DataTypeHint(value = "RAW", bridgedTo = org.locationtech.jts.geom.Geometry.class) Object o) {
            Geometry geom = (Geometry) o;
            return org.apache.sedona.common.Functions.dimension(geom);
        }

    }

    public static class ST_Distance extends ScalarFunction {
        @DataTypeHint("Double")
        public Double eval(@DataTypeHint(value = "RAW", bridgedTo = org.locationtech.jts.geom.Geometry.class) Object o1,
@@ -154,6 +154,11 @@ public void testTransformWKT() throws FactoryException {
    }


    @Test
    public void testDimension(){
        Table pointTable = tableEnv.sqlQuery("SELECT ST_Dimension('GEOMETRYCOLLECTION(LINESTRING(1 1,0 0),POINT(0 0))')");
        assertEquals(1, first(pointTable).getField(0));
    }
    @Test
    public void testDistance() {
        Table pointTable = createPointTable(testDataSize);
11 changes: 11 additions & 0 deletions python/sedona/sql/st_functions.py
@@ -45,6 +45,7 @@
    "ST_ConcaveHull",
    "ST_ConvexHull",
    "ST_Difference",
    "ST_Dimension",
    "ST_Distance",
    "ST_DistanceSphere",
    "ST_DistanceSpheroid",
@@ -384,6 +385,16 @@ def ST_ConvexHull(geometry: ColumnOrName) -> Column:
    """
    return _call_st_function("ST_ConvexHull", geometry)

@validate_argument_types
def ST_Dimension(geometry: ColumnOrName) -> Column:
    """Calculate the inherent dimension of a geometry column.

    :param geometry: Geometry column to calculate the dimension for.
    :type geometry: ColumnOrName
    :return: Dimension of geometry as an integer column.
    :rtype: Column
    """
    return _call_st_function("ST_Dimension", geometry)

@validate_argument_types
def ST_Difference(a: ColumnOrName, b: ColumnOrName) -> Column:
3 changes: 3 additions & 0 deletions python/tests/sql/test_dataframe_api.py
@@ -74,6 +74,7 @@
    (stf.ST_ConcaveHull, ("geom", 1.0, True), "triangle_geom", "", "POLYGON ((1 1, 1 0, 0 0, 1 1))"),
    (stf.ST_ConvexHull, ("geom",), "triangle_geom", "", "POLYGON ((0 0, 1 1, 1 0, 0 0))"),
    (stf.ST_Difference, ("a", "b"), "overlapping_polys", "", "POLYGON ((1 0, 0 0, 0 1, 1 1, 1 0))"),
    (stf.ST_Dimension, ("geom",), "geometry_geom_collection", "", 1),
    (stf.ST_Distance, ("a", "b"), "two_points", "", 3.0),
    (stf.ST_DistanceSpheroid, ("point", "point"), "point_geom", "", 0.0),
    (stf.ST_DistanceSphere, ("point", "point"), "point_geom", "", 0.0),
@@ -390,6 +391,8 @@ def base_df(self, request):
            return TestDataFrameAPI.spark.sql("SELECT ST_GeomFromWKT('LINESTRING (0 0, 2 1)') AS line, ST_GeomFromWKT('POLYGON ((1 0, 2 0, 2 2, 1 2, 1 0))') AS poly")
        elif request.param == "square_geom":
            return TestDataFrameAPI.spark.sql("SELECT ST_GeomFromWKT('POLYGON ((1 0, 1 1, 2 1, 2 0, 1 0))') AS geom")
        elif request.param == "geometry_geom_collection":
            return TestDataFrameAPI.spark.sql("SELECT ST_GeomFromWKT('GEOMETRYCOLLECTION(POINT(1 1), LINESTRING(0 0, 1 1, 2 2))') AS geom")
        raise ValueError(f"Invalid base_df name passed: {request.param}")

    def _id_test_configuration(val):
@@ -54,6 +54,7 @@ object Catalog {
    function[ST_Within](),
    function[ST_Covers](),
    function[ST_CoveredBy](),
    function[ST_Dimension](),
    function[ST_Disjoint](),
    function[ST_Distance](),
    function[ST_3DDistance](),
@@ -1010,3 +1010,15 @@ case class ST_Translate(inputExpressions: Seq[Expression])
  }
}

/**
 * Return the topological dimension of this Geometry object
 *
 * @param inputExpressions
 */
case class ST_Dimension(inputExpressions: Seq[Expression])
  extends InferredUnaryExpression(Functions.dimension) with FoldableExpression {
  protected def withNewChildrenInternal(newChildren: IndexedSeq[Expression]) = {
    copy(inputExpressions = newChildren)
  }
}

@@ -90,6 +90,9 @@ object st_functions extends DataFrameAPI {
  def ST_Difference(a: Column, b: Column): Column = wrapExpression[ST_Difference](a, b)
  def ST_Difference(a: String, b: String): Column = wrapExpression[ST_Difference](a, b)

  def ST_Dimension(geometry: Column): Column = wrapExpression[ST_Dimension](geometry)
  def ST_Dimension(geometry: String): Column = wrapExpression[ST_Dimension](geometry)

  def ST_Distance(a: Column, b: Column): Column = wrapExpression[ST_Distance](a, b)
  def ST_Distance(a: String, b: String): Column = wrapExpression[ST_Distance](a, b)

@@ -231,6 +231,14 @@ class dataFrameAPITestScala extends TestBaseScala {
      assert(actualResult == expectedResult)
    }

    it("Passed ST_Dimension") {
      val polygonDf = sparkSession.sql("SELECT ST_GeomFromWKT('POLYGON ((0 0, 1 0, 1 1, 0 0))') AS geom")
      val df = polygonDf.select(ST_Dimension("geom"))
      val actualResult = df.take(1)(0).get(0).asInstanceOf[Int]
      val expectedResult = 2
      assert(actualResult == expectedResult)
    }

    it("Passed ST_Distance") {
      val pointDf = sparkSession.sql("SELECT ST_Point(0.0, 0.0) AS a, ST_Point(1.0, 0.0) as b")
      val df = pointDf.select(ST_Distance("a", "b"))
@@ -137,6 +137,18 @@ class functionTestScala extends TestBaseScala with Matchers with GeometrySample
      assert(functionDf.count() > 0);
    }

Contributor:

You can specify the test cases and their expected outputs in a map here; it is more readable.
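
As a rough illustration of the map-style layout suggested above, here is a minimal Python sketch that keys each WKT input to its expected dimension and reports mismatches. Everything in it is an assumption for illustration: `dimension_from_wkt` is a hypothetical keyword-based stand-in so the table has something to run against, not Sedona's `ST_Dimension`, and `CASES` and `run_cases` are invented names. The five inputs mirror the ones in the test that follows.

```python
import re

# Hypothetical stand-in for ST_Dimension: it only looks at the geometry type
# keywords in the WKT text and takes the largest associated dimension.
_TYPE_DIMENSION = {
    "POINT": 0, "MULTIPOINT": 0,
    "LINESTRING": 1, "MULTILINESTRING": 1,
    "POLYGON": 2, "MULTIPOLYGON": 2,
}
_TYPE_PATTERN = re.compile(r"\b(" + "|".join(_TYPE_DIMENSION) + r")\b", re.IGNORECASE)


def dimension_from_wkt(wkt):
    found = _TYPE_PATTERN.findall(wkt)
    # default=0 handles inputs with no typed component, e.g. an empty collection
    return max((_TYPE_DIMENSION[name.upper()] for name in found), default=0)


# One entry per test case: WKT input -> expected dimension.
CASES = {
    "POINT(1 2)": 0,
    "LINESTRING(1 2, 3 4)": 1,
    "POLYGON((0 0,0 5,5 0,0 0))": 2,
    "GEOMETRYCOLLECTION EMPTY": 0,
    "GEOMETRYCOLLECTION(LINESTRING(1 1,0 0),POINT(0 0))": 1,
}


def run_cases(cases):
    """Return (wkt, expected, actual) for every case that does not match."""
    failures = []
    for wkt, expected in cases.items():
        actual = dimension_from_wkt(wkt)
        if actual != expected:
            failures.append((wkt, expected, actual))
    return failures


print(run_cases(CASES))
```

Keeping inputs and expectations in one table means a new geometry type is a one-line addition, and a failure report names the exact input that diverged.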

    it("Passed ST_Dimension") {
      val test1 = sparkSession.sql("SELECT ST_Dimension(ST_GeomFromWKT('POINT(1 2)'))")
      assert(test1.take(1)(0).get(0).asInstanceOf[Int] == 0)
      val test2 = sparkSession.sql("SELECT ST_Dimension(ST_GeomFromWKT('LINESTRING(1 2, 3 4)'))")
      assert(test2.take(1)(0).get(0).asInstanceOf[Int] == 1)
      val test3 = sparkSession.sql("SELECT ST_Dimension(ST_GeomFromWKT('POLYGON((0 0,0 5,5 0,0 0))'))")
      assert(test3.take(1)(0).get(0).asInstanceOf[Int] == 2)
      val test4 = sparkSession.sql("SELECT ST_Dimension(ST_GeomFromWKT('GEOMETRYCOLLECTION EMPTY'))")
      assert(test4.take(1)(0).get(0).asInstanceOf[Int] == 0)
      val test5 = sparkSession.sql("SELECT ST_Dimension('GEOMETRYCOLLECTION(LINESTRING(1 1,0 0),POINT(0 0))')")
      assert(test5.take(1)(0).get(0).asInstanceOf[Int] == 1)
    }
    it("Passed ST_Distance") {
      var polygonWktDf = sparkSession.read.format("csv").option("delimiter", "\t").option("header", "false").load(mixedWktGeometryInputLocation)
      polygonWktDf.createOrReplaceTempView("polygontable")