GeoMesa is an Apache-licensed, open-source suite of tools that
enables large-scale geospatial analytics on cloud and distributed computing
systems. It lets you manage and analyze the huge spatio-temporal datasets that
IoT, social media, tracking, and mobile phone applications now seek to
exploit.
GeoMesa does this by providing spatio-temporal data persistence
on top of the Accumulo, HBase, and Cassandra distributed column-oriented
databases for massive storage of point, line, and polygon data. It allows rapid
access to this data via queries that take full advantage of geographical
properties to specify distance and area. GeoMesa also supports
near-real-time stream processing of spatio-temporal data by layering spatial
semantics on top of the Apache Kafka messaging system.
GeoMesa features include the ability to:
Store gigabytes to petabytes of spatial data (tens of billions of points or more)
Serve up tens of millions of points in seconds
Ingest data faster than 10,000 records per second per node
Scale horizontally easily (add more servers to add more capacity)
Drive a map through GeoServer or other OGC clients
An existing HBase 1.1.x
installation is helpful but not necessary. This tutorial works
either with an existing HBase server or with the HBase binary
distribution downloaded and run in "standalone" mode (described below).
GeoServer is only
required for visualizing the HBase data. Setting up GeoServer is beyond the
scope of this tutorial.
Download and Build the Tutorial
Clone the geomesa-tutorials distribution from GitHub.
The pom.xml file contains an explicit list of dependent libraries that
will be bundled together into the final tutorial. You should confirm that the
versions of HBase and Hadoop match what you are running; if they do not match,
change the values of the hbase.version and hbase.hadoop.version properties. The version of GeoMesa that this tutorial
targets matches the project version of the pom.xml. (Note that this tutorial has been
tested with GeoMesa 1.2.2 or later.)
The only reason these libraries are bundled into the final JAR is that bundling
is easier for most people than setting the classpath when running the
tutorial. If you would rather not bundle these dependencies, mark them as provided in the POM and update your classpath as appropriate.
GeoMesa's HBaseDataStore searches for a file called hbase-site.xml, which among other things configures the ZooKeeper host(s) and
port. If this file is not present on the classpath, the hbase-default.xml provided by hbase-common sets the default ZooKeeper quorum
to "localhost" and the port to 2181, which is what the
standalone HBase described in "Setting up HBase in standalone mode"
above uses. If you have an existing HBase installation, copy your hbase-site.xml file into geomesa-quickstart-hbase/src/main/resources (or otherwise add it to the
classpath when you run the tutorial).
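To check which ZooKeeper settings the tutorial is likely to pick up, a small diagnostic along the following lines may help. It is a sketch that uses only the JDK: it looks for hbase-site.xml on the classpath and falls back to the localhost and 2181 defaults mentioned above. The class name ZookeeperSettingsCheck is hypothetical, and the two keys are assumed to be the standard HBase names for the quorum and client port. HBase itself reads the file through its own configuration classes, so this only approximates what it will see.

```java
import java.io.InputStream;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of GeoMesa or HBase.
final class ZookeeperSettingsCheck {

    /** Reads Hadoop-style property entries (name/value pairs); returns an empty map if xml is null. */
    static Map<String, String> readProperties(InputStream xml) {
        Map<String, String> result = new HashMap<>();
        if (xml == null) {
            return result; // hbase-site.xml is not on the classpath
        }
        try {
            NodeList props = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().parse(xml).getElementsByTagName("property");
            for (int i = 0; i < props.getLength(); i++) {
                Element p = (Element) props.item(i);
                NodeList names = p.getElementsByTagName("name");
                NodeList values = p.getElementsByTagName("value");
                if (names.getLength() > 0 && values.getLength() > 0) {
                    result.put(names.item(0).getTextContent().trim(),
                               values.item(0).getTextContent().trim());
                }
            }
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse hbase-site.xml", e);
        }
        return result;
    }

    public static void main(String[] args) {
        InputStream xml = ZookeeperSettingsCheck.class.getClassLoader()
                .getResourceAsStream("hbase-site.xml");
        Map<String, String> site = readProperties(xml);
        // Fall back to the defaults described in the text when a key is absent.
        String quorum = site.getOrDefault("hbase.zookeeper.quorum", "localhost");
        String port = site.getOrDefault("hbase.zookeeper.property.clientPort", "2181");
        String source = (xml == null) ? "hbase-site.xml not found, using defaults: "
                                      : "hbase-site.xml found: ";
        System.out.println(source + quorum + ":" + port);
    }
}
```

Run it with the same classpath you intend to use for the tutorial; if it reports the defaults while you expected your cluster's hosts, the file is probably not where the classpath can find it.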
To build the tutorial:
$ cd geomesa-quickstart-hbase
$ mvn clean install
When this is complete,
it should have built a JAR file that contains all of the code you need to run
the tutorial.
Run the Tutorial
First, make sure that the hbase.table.sanity.checks
property is set to false in hbase-site.xml.
The only argument passed
is the name of the HBase table where GeoMesa will store the feature type
information. GeoMesa will also create a table called <tablename>_<featuretype>_z3, which will store the
Z3-indexed features. In this case, that table will be named geomesa_QuickStart_z3.
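The naming pattern can be sketched in a few lines of Java. This only illustrates the <tablename>_<featuretype>_z3 pattern stated above and is not GeoMesa's own code; the class and method names are hypothetical, and the table name geomesa and feature type QuickStart are inferred from the resulting table name.

```java
// Hypothetical illustration of the index table naming pattern described in the text.
final class QuickStartTableNames {

    /** Joins the table name and feature type name into the Z3 index table name. */
    static String z3Table(String tableName, String featureType) {
        return tableName + "_" + featureType + "_z3";
    }

    public static void main(String[] args) {
        // Table name passed to the tutorial, and the feature type it creates (both inferred).
        System.out.println(z3Table("geomesa", "QuickStart")); // prints geomesa_QuickStart_z3
    }
}
```

Passing a different table name to the tutorial would therefore change the prefix of the index table as well.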
You should see output
similar to the following (not including some of Maven's output and log4j's
warnings), which lists the features that match the specified query.
There are many reasons that GeoMesa can provide the best solution to your spatio-temporal database needs:
You have Big Spatial Data sets and are reaching performance limitations of relational database systems. Perhaps you are looking at sharding strategies and wondering if now is the time to look for a new storage solution.
You have very high-velocity data and need high read and write speeds.
Your analytics operate in the cloud, perhaps using Spark, and you want to enable spatial analytics.
You are looking for a supported, open-source alternative to expensive proprietary solutions.
You are looking for a Platform as a Service (PaaS) database where you can store Big Spatial Data.
You want to filter data using the rich Common Query Language (CQL) defined by the OGC.