
GeoSpatial data in Hadoop - HDFS or HBase?


New Contributor

Hi,

 

1. When should I use GeoSpatial data in HDFS and when in HBase?

 

2. It seems that for HDFS I should use ESRI Geometry and Hive Spatial, and for HBase I should use GeoMesa. Is that correct?

 

3. Sample query we require - For a defined polygon (GeoJSON column), find all routes (table row = route) that go through it.

 

4. Another sample query we require - For a defined polygon (GeoJSON column), find all other polygons (other rows) that are congruent to it.

 

5. General question - As far as I know, HBase is not suitable for analytics. When is it suitable to use it as an analytics engine with Spark? How would this perform compared to HDFS?

The ESRI presentation below seems to address the HDFS vs HBase question, but its answer is not quite clear to me:
http://proceedings.esri.com/library/userconf/fed15/papers/fed_142.pdf
Thanks!


Re: GeoSpatial data in Hadoop - HDFS or HBase?

Cloudera Employee

Hello @EranK 

 

Here is an example of using a combination of HDFS and HBase to manage geospatial data. You may find the architecture it describes of interest: https://www.slideshare.net/Hadoop_Summit/grailer-hochmuth-june27515pmroom212v3
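
On the first sample query (find all routes that go through a polygon): whichever store is chosen, the core test is an "intersects" check between a route and the polygon. Below is a minimal Python sketch of that check, using only the standard library. It is an illustration under assumptions, not how Hive Spatial or GeoMesa implement it: coordinates are treated as planar (x, y) pairs, the polygon is a single ring with no holes, and the function names are hypothetical. In practice you would rely on the spatial library's own intersects predicate and its index rather than hand-written geometry.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def point_in_polygon(pt: Point, ring: List[Point]) -> bool:
    """Ray-casting test. `ring` is a list of (x, y) vertices, closed or not."""
    x, y = pt
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through pt can count.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def _orient(a: Point, b: Point, c: Point) -> float:
    """Cross product sign: >0 if c is left of a->b, <0 if right, 0 if collinear."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def _on_segment(a: Point, b: Point, c: Point) -> bool:
    """Assuming c is collinear with a-b, is c within the segment's bounding box?"""
    return (min(a[0], b[0]) <= c[0] <= max(a[0], b[0])
            and min(a[1], b[1]) <= c[1] <= max(a[1], b[1]))


def segments_intersect(p1: Point, p2: Point, q1: Point, q2: Point) -> bool:
    d1, d2 = _orient(q1, q2, p1), _orient(q1, q2, p2)
    d3, d4 = _orient(p1, p2, q1), _orient(p1, p2, q2)
    if d1 * d2 < 0 and d3 * d4 < 0:
        return True  # proper crossing
    # Touching or collinear-overlap cases.
    return ((d1 == 0 and _on_segment(q1, q2, p1))
            or (d2 == 0 and _on_segment(q1, q2, p2))
            or (d3 == 0 and _on_segment(p1, p2, q1))
            or (d4 == 0 and _on_segment(p1, p2, q2)))


def route_goes_through(route: List[Point], ring: List[Point]) -> bool:
    """True if any route vertex lies inside the ring or any route segment
    crosses a ring edge."""
    if any(point_in_polygon(p, ring) for p in route):
        return True
    n = len(ring)
    for a, b in zip(route, route[1:]):
        for i in range(n):
            if segments_intersect(a, b, ring[i], ring[(i + 1) % n]):
                return True
    return False


# Hypothetical data: one square polygon and three routes keyed by row id.
square = [(0, 0), (4, 0), (4, 4), (0, 4)]
routes = {
    "crossing": [(-1, 2), (5, 2)],
    "outside": [(5, 5), (6, 6)],
    "contained": [(1, 1), (2, 2)],
}
matches = sorted(rid for rid, r in routes.items() if route_goes_through(r, square))
print(matches)
```

Running this check against every row is a full scan, which is the kind of workload that suits files on HDFS processed by Hive or Spark. A spatial index on the row key is what lets an HBase-backed store narrow the candidate rows before applying the same test.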