The Hive UDFs are modeled after existing
implementations of ST_Geometry. Some functions exist only in the Hive
implementation; a few behave differently or do not exist.
Overloaded constructors - These constructors differ from other
ST_Geometry implementations in how the caller specifies the spatial-reference
ID (SRID). When no SRID is specified, the default is planar. Hive does not accept an SRID as the second argument; wrap the constructor call with ST_SetSRID or use ST_GeomFromText instead. This applies to ST_Point, ST_LineString, ST_Polygon, ST_MultiPoint, ST_MultiLineString, and ST_MultiPolygon.
ST_PointN - The return value for an out-of-range index varies
between implementations. Hive returns null.
ST_AsText -
The OGC WKT standard dictates that a MultiPoint is represented as
MULTIPOINT ((1 2),(3 4)); however, some
existing WKT parsers accept only MULTIPOINT (1 2, 3 4). ST_AsText outputs the former,
compliant format, with the nested parentheses.
ST_Envelope -
In the case of a point or a vertical or horizontal line, ST_Envelope may either
apply a tolerance or return an empty envelope.
ST_Intersection - In the case where the two
geometries intersect in a lower dimension, ST_Intersection may drop the
lower-dimension intersections, or output a closed linestring.
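The differences listed above can be tried out with a few queries. The following is a minimal sketch, assuming the Esri UDFs used here (ST_Point, ST_SetSRID, ST_SRID, ST_GeomFromText, ST_LineString, ST_PointN, ST_MultiPoint, ST_AsText) are already registered in the session and that the Hive version allows SELECT without a FROM clause. The SRID 4326 and all coordinates are illustrative values, not taken from the notes above.

```sql
-- Sketch only: SRID 4326 and the coordinates are illustrative assumptions.

-- Constructors take no SRID argument, so wrap the call with ST_SetSRID ...
SELECT ST_SRID(ST_SetSRID(ST_Point(1.5, 2.5), 4326));

-- ... or build the geometry from WKT and pass the SRID to ST_GeomFromText.
SELECT ST_SRID(ST_GeomFromText('POINT (1.5 2.5)', 4326));

-- ST_PointN with an index past the last vertex: Hive is described above
-- as returning null, so test for null rather than expecting an error.
SELECT ST_PointN(ST_LineString(0.0, 0.0, 1.0, 1.0), 5) IS NULL;

-- ST_AsText on a MultiPoint: the notes above describe the nested-parentheses
-- form, so downstream WKT parsers may need to accept that variant.
SELECT ST_AsText(ST_MultiPoint(1.0, 2.0, 3.0, 4.0));
```

If a downstream tool only accepts the flat MULTIPOINT form, the text would need to be rewritten after export; the queries above only show where the Hive behavior can be observed.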
To share functions globally across
sessions, create them without the “temporary” option. This has the advantage that you do not need to declare the functions in every session.
create function ST_AsBinary as 'com.esri.hadoop.hive.ST_AsBinary';
You can also include the jar file in the create
function statement, which makes it easier to create a permanent declaration. For example, to define
the ST_Point function you would write the following SQL statement:
create function ST_Point as 'com.esri.hadoop.hive.ST_Point' using jar 'hdfs://YourHDFSClientNode:8020/esri/spatial-sdk-hive-1.1.1-SNAPSHOT.jar';
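Once a permanent function has been created, it can be checked from a fresh session without re-declaring it. The statements below are a small sketch using standard Hive commands; they assume ST_Point was created in the current database and that the Hive version supports permanent functions and SELECT without a FROM clause. The coordinates are illustrative.

```sql
-- Sketch only: run in a new session to confirm the function persists.

-- Show the registered function.
describe function ST_Point;

-- Call it without any prior "create temporary function" in this session.
select ST_Point(1.0, 2.0) is not null;

-- Remove the permanent declaration when it is no longer needed.
drop function if exists ST_Point;
```

Permanent functions are registered in a database, so from another database the function may need to be referenced with its database name as a prefix.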
Final Notes
As discussed with ESRI recently, there are no plans to open source all of the spatial functions currently available for
traditional RDBMSs such as Oracle, SQL Server, or Netezza, as those are commercially
licensed packages. The best option to compensate for the missing 5-10% of
functions is to contribute to ESRI’s open source repository: https://github.com/Esri/spatial-framework-for-hadoop. ESRI does not provide a commercial library
for Hive that includes all spatial functions.