
Introduction

Geospatial data is generated in huge volumes with the rise of the Internet of Things. IoT sensor networks are pushing the geospatial data rates even higher. There has been an explosion of sensor networks on the ground, mobile devices carried by people or mounted on vehicles, drones flying overhead, tethered aerostats (such as Google’s Project Loon), atmosats at high altitude, and microsats in orbit.

Opportunity

Geospatial analytics can provide us with the tools and methods we need to make sense of all that data and put it to use in solving problems we face at all scales.

Challenges

Geospatial work requires atypical data types (e.g., points, shapefiles, map projections), potentially many layers of detail to process and visualize, and specialized algorithms—not your typical ETL (extract, transform, load) or reporting work.

Apache Spark's Role in Geospatial Development

Spark is not only influencing the evolution of accessory tools; it is also becoming a default in the geospatial analytics industry. For example, consider the development of Azavea's open source geospatial library GeoTrellis. GeoTrellis is written in Scala and designed to handle large-scale raster operations. It adopted Spark as its distributed computation engine and, in combination with Amazon Web Services, scaled its existing raster processing to support even larger datasets. Spark gives GeoTrellis a way to distribute work across a cluster, and GeoTrellis supplies the geospatial capabilities that Spark lacks. This reciprocal relationship is an important contribution to the data engineering ecosystem, particularly to the frameworks being developed to support Big Data.

About GeoTrellis

GeoTrellis is a Scala library and framework that uses Spark to work with raster data. It is released under the Apache 2 License.

GeoTrellis reads, writes, and operates on raster data as fast as possible. It implements many Map Algebra operations as well as vector-to-raster and raster-to-vector operations.
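
Map Algebra treats a raster as a grid of cells and combines grids with local (cell-by-cell), focal (neighborhood), and zonal operations. As a rough illustration of a local operation only, here is a minimal sketch in plain Scala. It does not use GeoTrellis, and the names Grid and localAdd are hypothetical, introduced just for this example.

```scala
// Plain-Scala sketch of a "local" Map Algebra operation (no GeoTrellis required).
// Grid and localAdd are hypothetical names used only for this illustration.
object LocalMapAlgebraSketch {
  // A raster stored row by row in a flat array, as in the IntArrayTile example.
  final case class Grid(cells: Array[Int], cols: Int, rows: Int) {
    require(cells.length == cols * rows, "cell count must equal cols * rows")
    def get(col: Int, row: Int): Int = cells(row * cols + col)
  }

  // A local operation combines two rasters cell by cell.
  def localAdd(a: Grid, b: Grid): Grid = {
    require(a.cols == b.cols && a.rows == b.rows, "grids must have the same dimensions")
    Grid(a.cells.zip(b.cells).map { case (x, y) => x + y }, a.cols, a.rows)
  }

  val base   = Grid(Array(1, 2, 3, 4, 5, 6), cols = 3, rows = 2)
  val offset = Grid(Array(10, 10, 10, 20, 20, 20), cols = 3, rows = 2)
  val sum    = localAdd(base, offset)
}
```

Each output cell depends only on the cells at the same position in the inputs, which is what makes local operations easy to split across tiles and, by extension, across a cluster.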

GeoTrellis also provides tools to render rasters into PNGs or to store metadata about raster files as JSON. It aims to provide raster processing at web speeds (sub-second) with RESTful endpoints, as well as fast batch processing of large raster data sets.

Getting Started

GeoTrellis is currently available for Scala 2.11 and Spark 2.0+.

To get started with SBT, simply add the following to your build.sbt file:

libraryDependencies += "org.locationtech.geotrellis" %% "geotrellis-raster" % "1.0.0"

geotrellis-raster is just one submodule that you can depend on.

To grab the latest snapshot build, add the GeoTrellis snapshot repository:

resolvers += "LocationTech GeoTrellis Snapshots" at "https://repo.locationtech.org/content/repositories/geotrellis-snapshots"

GeoTrellis Modules

  • geotrellis-proj4: Coordinate reference systems and reprojection (Scala wrapper around Proj4j)
  • geotrellis-vector: Vector data types and operations (Scala wrapper around JTS)
  • geotrellis-raster: Raster data types and operations
  • geotrellis-spark: Geospatially enables Spark; read from and save to HDFS
  • geotrellis-s3: S3 backend for geotrellis-spark
  • geotrellis-accumulo: Accumulo backend for geotrellis-spark
  • geotrellis-cassandra: Cassandra backend for geotrellis-spark
  • geotrellis-hbase: HBase backend for geotrellis-spark
  • geotrellis-spark-etl: Utilities for writing ETL (Extract-Transform-Load), or "ingest" applications for geotrellis-spark
  • geotrellis-geotools: Conversions to and from GeoTools Vector and Raster data
  • geotrellis-geomesa: Experimental GeoMesa integration
  • geotrellis-geowave: Experimental GeoWave integration
  • geotrellis-shapefile: Read shapefiles into GeoTrellis data types via GeoTools
  • geotrellis-slick: Read vector data out of PostGIS via Lightbend Slick
  • geotrellis-vectortile: Experimental vector tile support, including reading and writing
  • geotrellis-raster-testkit: Testkit for testing geotrellis-raster types
  • geotrellis-vector-testkit: Testkit for testing geotrellis-vector types
  • geotrellis-spark-testkit: Testkit for testing geotrellis-spark code

A more complete feature list can be found at https://github.com/locationtech/geotrellis, GeoTrellis Features section.
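
Several of the modules listed above can be combined in one build. The following build.sbt is a sketch only: the project name, the Scala and Spark patch versions, and the choice of modules are assumptions made for illustration, and the Spark dependency is marked "provided" on the assumption that the cluster supplies it at runtime.

```scala
// build.sbt sketch: the project name, module selection, and patch versions are illustrative assumptions.
name := "geotrellis-example"

scalaVersion := "2.11.8"

// Keep all GeoTrellis modules on the same version.
val geotrellisVersion = "1.0.0"

libraryDependencies ++= Seq(
  "org.locationtech.geotrellis" %% "geotrellis-raster" % geotrellisVersion,
  "org.locationtech.geotrellis" %% "geotrellis-vector" % geotrellisVersion,
  "org.locationtech.geotrellis" %% "geotrellis-spark"  % geotrellisVersion,
  "org.apache.spark"            %% "spark-core"        % "2.0.2" % "provided"
)
```

Defining the version once in a val avoids accidentally mixing GeoTrellis module versions.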

Hello Raster with GeoTrellis

scala> import geotrellis.raster._
import geotrellis.raster._

scala> import geotrellis.raster.mapalgebra.focal._
import geotrellis.raster.mapalgebra.focal._

scala> val nd = NODATA
nd: Int = -2147483648

scala> val input = Array[Int](
     |         nd, 7, 1, 1, 3, 5, 9, 8, 2,
     |         9, 1, 1, 2, 2, 2, 4, 3, 5,
     |         3, 8, 1, 3, 3, 3, 1, 2, 2,
     |         2, 4, 7, 1, nd, 1, 8, 4, 3)
input: Array[Int] = Array(-2147483648, 7, 1, 1, 3, 5, 9, 8, 2, 9, 1, 1, 2,
2, 2, 4, 3, 5, 3, 8, 1, 3, 3, 3, 1, 2, 2, 2, 4, 7, 1, -2147483648, 1, 8, 4, 3)

scala> val iat = IntArrayTile(input, 9, 4)  // 9 and 4 here specify columns and rows
iat: geotrellis.raster.IntArrayTile = IntArrayTile([I@278434d0,9,4)

// The asciiDraw method is mostly useful when you're working with small tiles
// which can be taken in at a glance
scala> iat.asciiDraw()
res0: String =
"    ND     7     1     1     3     5     9     8     2
     9     1     1     2     2     2     4     3     5
     3     8     1     3     3     3     1     2     2
     2     4     7     1    ND     1     8     4     3
"

scala> val focalNeighborhood = Square(1)  // a 3x3 square neighborhood
focalNeighborhood: geotrellis.raster.mapalgebra.focal.Square =
 O  O  O
 O  O  O
 O  O  O

scala> val meanTile = iat.focalMean(focalNeighborhood)
meanTile: geotrellis.raster.Tile = DoubleArrayTile([D@7e31c125,9,4)

scala> meanTile.getDouble(0, 0)  // Should equal (1 + 7 + 9) / 3
res1: Double = 5.666666666666667
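
To see where 5.666666666666667 comes from, the arithmetic for cell (0, 0) can be reproduced without the library. The sketch below is plain Scala, not GeoTrellis code: FocalMeanSketch and its focalMean helper are hypothetical names, and the rule it applies (average the non-NODATA cells of the 3x3 window, clipped at the raster edge) is an assumption chosen to match the result shown above for that cell.

```scala
// Plain-Scala sketch of a 3x3 focal mean that skips NODATA cells.
// It mirrors the arithmetic behind meanTile.getDouble(0, 0); it is not GeoTrellis code.
object FocalMeanSketch {
  val nd: Int = Int.MinValue // same value as NODATA in the session above
  val cols = 9
  val rows = 4
  val input: Array[Int] = Array(
    nd, 7, 1, 1,  3, 5, 9, 8, 2,
     9, 1, 1, 2,  2, 2, 4, 3, 5,
     3, 8, 1, 3,  3, 3, 1, 2, 2,
     2, 4, 7, 1, nd, 1, 8, 4, 3)

  // Mean of the valid cells in the square window around (col, row).
  // extent = 1 gives a 3x3 window; cells outside the raster are ignored.
  def focalMean(col: Int, row: Int, extent: Int = 1): Double = {
    val values = for {
      r <- (row - extent) to (row + extent)
      if r >= 0 && r < rows
      c <- (col - extent) to (col + extent)
      if c >= 0 && c < cols
      v = input(r * cols + c)
      if v != nd
    } yield v
    if (values.isEmpty) Double.NaN else values.sum.toDouble / values.size
  }

  // Window at (0, 0) holds ND, 7, 9, 1; the valid cells give (7 + 9 + 1) / 3.
  val topLeft: Double = focalMean(0, 0)
}
```

At the corner only four cells fall inside the raster, and one of them is NODATA, so three values contribute to the mean.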

Documentation

  • Further examples and documentation of GeoTrellis use cases can be found in the docs/ folder of the GeoTrellis repository (https://github.com/locationtech/geotrellis)
  • Scaladocs for the latest version of the project can be found here:

http://geotrellis.github.com/scaladocs/latest/#geotrellis.package

References

Geospatial Data and Analysis, by Aurelia Moser, Bill Day, and Jon Bruner. Published by O'Reilly Media, Inc., 2017.

http://geotrellis.io/

Last update: 02-22-2017 12:07 AM