The rise of the Internet of Things is generating geospatial data in huge volumes, and IoT sensor networks are pushing data rates even higher. There has been an explosion of sensor networks on the ground; mobile devices carried by people or mounted on vehicles; drones flying overhead; tethered aerostats; high-altitude balloons (such as Google’s Project Loon) and atmosats in the stratosphere; and microsats in orbit.
Geospatial analytics can provide us with the tools and methods we need to make sense of all that data and put it to use in solving problems we face at all scales.
Geospatial work requires atypical data types (e.g., points, shapefiles, map projections), potentially many layers of detail to process and visualize, and specialized algorithms—not your typical ETL (extract, transform, load) or reporting work.
Spark might seem to be merely influencing the evolution of accessory tools, but it is also becoming a default in the geospatial analytics industry. For example, consider the development of Azavea’s open source geospatial library GeoTrellis. GeoTrellis was written in Scala and designed to handle large-scale raster operations. GeoTrellis recently adopted Spark as its distributed computation engine and, in combination with Amazon Web Services, scaled its existing raster processing to support even larger datasets. Spark brings enormous scale to the GeoTrellis project, and GeoTrellis supplies the geospatial capabilities that Spark lacks. This reciprocal partnership is an important contribution to the data engineering ecosystem, and particularly to the frameworks being developed to support big data.
GeoTrellis is a Scala library and framework that uses Spark to work with raster data. It is released under the Apache 2 License.
GeoTrellis reads, writes, and operates on raster data as fast as possible. It implements many Map Algebra operations as well as vector-to-raster and raster-to-vector operations.
GeoTrellis also provides tools to render rasters into PNGs and to store metadata about raster files as JSON. It aims to provide raster processing at web speed (sub-second responses) through RESTful endpoints, as well as fast batch processing of large raster datasets.
GeoTrellis is currently available for Scala 2.11 and Spark 2.0+.
To get started with SBT, simply add the following to your build.sbt file:
libraryDependencies += "org.locationtech.geotrellis" %% "geotrellis-raster" % "1.0.0"
geotrellis-raster is just one of the submodules you can add as a dependency; the full list of submodules appears below.
To grab the latest snapshot build, add our snapshot repository:
resolvers += "LocationTech GeoTrellis Snapshots" at "https://repo.locationtech.org/content/repositories/geotrellis-snapshots"
geotrellis-proj4: Coordinate reference systems and reprojection (Scala wrapper around Proj4j)
geotrellis-vector: Vector data types and operations (Scala wrapper around JTS)
geotrellis-raster: Raster data types and operations
geotrellis-spark: Geospatially enables Spark; save to and from HDFS
geotrellis-s3: S3 backend for geotrellis-spark
geotrellis-accumulo: Accumulo backend for geotrellis-spark
geotrellis-cassandra: Cassandra backend for geotrellis-spark
geotrellis-hbase: HBase backend for geotrellis-spark
geotrellis-spark-etl: Utilities for writing ETL (Extract-Transform-Load), or "ingest" applications for geotrellis-spark
geotrellis-geotools: Conversions to and from GeoTools Vector and Raster data
geotrellis-geomesa: Experimental GeoMesa integration
geotrellis-geowave: Experimental GeoWave integration
geotrellis-shapefile: Read shapefiles into GeoTrellis data types via GeoTools
geotrellis-slick: Read vector data out of PostGIS via Lightbend Slick
geotrellis-vectortile: Experimental vector tile support, including reading and writing
geotrellis-raster-testkit: Testkit for testing geotrellis-raster types
geotrellis-vector-testkit: Testkit for testing geotrellis-vector types
geotrellis-spark-testkit: Testkit for testing geotrellis-spark code
A more complete feature list can be found in the GeoTrellis Features section at https://github.com/locationtech/geotrellis.
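To illustrate how these submodules combine, a project that processes raster data on Spark with an S3 backend might declare the corresponding dependencies together in build.sbt. This is a sketch, not from the original text; the version number is illustrative, so pin whichever release you actually use:

```scala
// build.sbt fragment: pull in raster, Spark, and S3 support together.
// "1.0.0" is an example version; match all GeoTrellis modules to one version.
libraryDependencies ++= Seq(
  "org.locationtech.geotrellis" %% "geotrellis-raster" % "1.0.0",
  "org.locationtech.geotrellis" %% "geotrellis-spark"  % "1.0.0",
  "org.locationtech.geotrellis" %% "geotrellis-s3"     % "1.0.0"
)
```

Keeping every geotrellis-* module at the same version avoids binary-compatibility mismatches between submodules.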
scala> import geotrellis.raster._
import geotrellis.raster._

scala> import geotrellis.raster.op.focal._
import geotrellis.raster.op.focal._

scala> val nd = NODATA
nd: Int = -2147483648

scala> val input = Array[Int](
     |   nd, 7, 1, 1, 3, 5, 9, 8, 2,
     |    9, 1, 1, 2, 2, 2, 4, 3, 5,
     |    3, 8, 1, 3, 3, 3, 1, 2, 2,
     |    2, 4, 7, 1, nd, 1, 8, 4, 3)
input: Array[Int] = Array(-2147483648, 7, 1, 1, 3, 5, 9, 8, 2, 9, 1, 1, 2, 2, 2, 4, 3, 5, 3, 8, 1, 3, 3, 3, 1, 2, 2, 2, 4, 7, 1, -2147483648, 1, 8, 4, 3)

scala> val iat = IntArrayTile(input, 9, 4) // 9 and 4 here specify columns and rows
iat: geotrellis.raster.IntArrayTile = IntArrayTile([I@278434d0,9,4)

// The asciiDraw method is mostly useful when you're working with small tiles
// which can be taken in at a glance
scala> iat.asciiDraw()
res0: String =
" ND  7  1  1  3  5  9  8  2
   9  1  1  2  2  2  4  3  5
   3  8  1  3  3  3  1  2  2
   2  4  7  1 ND  1  8  4  3
"

scala> val focalNeighborhood = Square(1) // a 3x3 square neighborhood
focalNeighborhood: geotrellis.raster.op.focal.Square =
 O O O
 O O O
 O O O

scala> val meanTile = iat.focalMean(focalNeighborhood)
meanTile: geotrellis.raster.Tile = DoubleArrayTile([D@7e31c125,9,4)

scala> meanTile.getDouble(0, 0) // Should equal (1 + 7 + 9) / 3, since NODATA cells are ignored
res1: Double = 5.666666666666667
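To make explicit what a focal mean does with NODATA, here is a minimal, dependency-free Scala sketch of the same computation. The object and method names are ours for illustration, not GeoTrellis API; only the tile layout and the Int.MinValue NODATA sentinel mirror the example above:

```scala
object FocalMeanSketch {
  val NoData: Int = Int.MinValue // same sentinel value GeoTrellis uses for Int tiles

  // Mean of the 3x3 neighborhood centered on (col, row), in a tile stored
  // row-major in a flat array; out-of-bounds and NODATA cells are skipped.
  def focalMean(tile: Array[Int], cols: Int, rows: Int, col: Int, row: Int): Double = {
    val neighbors = for {
      r <- (row - 1) to (row + 1) if r >= 0 && r < rows
      c <- (col - 1) to (col + 1) if c >= 0 && c < cols
      v = tile(r * cols + c) if v != NoData
    } yield v
    neighbors.sum.toDouble / neighbors.size
  }

  def main(args: Array[String]): Unit = {
    val nd = NoData
    val input = Array[Int](
      nd, 7, 1, 1, 3, 5, 9, 8, 2,
       9, 1, 1, 2, 2, 2, 4, 3, 5,
       3, 8, 1, 3, 3, 3, 1, 2, 2,
       2, 4, 7, 1, nd, 1, 8, 4, 3)
    // Top-left corner: valid neighbors are 7, 9, and 1, so the mean is (7 + 9 + 1) / 3
    println(focalMean(input, 9, 4, 0, 0))
  }
}
```

The corner cell has only three valid neighbors (the fourth is NODATA), which is why the REPL transcript above reports 5.666… rather than a mean over all nine cells.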
Geospatial Data and Analysis by Aurelia Moser, Bill Day, and Jon Bruner. Published by O'Reilly Media, Inc., 2017.