Cloud-Scale Geospatial Data Services

With the explosion of new sensors and service offerings producing geospatial telemetry, there’s an ever-increasing need for tools to gain business insights from this data. One of the premier tools for this in the geospatial domain is GeoServer.

Fully open source and free to use, GeoServer provides Open Geospatial Consortium (OGC) web service interfaces for rendering images or serving complete metadata in most common geospatial interchange formats. In a consulting capacity, Applied Information Sciences (AIS) has leveraged GeoServer with great success, allowing us to deploy a complete software stack in minutes instead of days or weeks. In this post I’ll give an overview of the DevOps practices we’ve applied to enable this capability, as well as a brief overview of the supporting technologies.
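
To make the OGC interface concrete, here is a minimal sketch of how a client might build a Web Map Service (WMS) 1.1.1 GetMap request for a rendered image. The host, port and layer name are assumptions for illustration (a default local install with a sample layer), not values from our deployments.

```python
from urllib.parse import urlencode


def build_getmap_url(base_url, layer, bbox, width=512, height=512,
                     srs="EPSG:4326", image_format="image/png"):
    """Build an OGC WMS 1.1.1 GetMap URL.

    bbox is (minx, miny, maxx, maxy) in the units of the given SRS.
    """
    params = {
        "service": "WMS",
        "version": "1.1.1",
        "request": "GetMap",
        "layers": layer,
        "styles": "",  # empty value requests the layer's default style
        "bbox": ",".join(str(v) for v in bbox),
        "width": width,
        "height": height,
        "srs": srs,
        "format": image_format,
    }
    return f"{base_url}?{urlencode(params)}"


# Assumed endpoint and layer name, for illustration only.
url = build_getmap_url(
    "http://localhost:8080/geoserver/wms",
    "topp:states",
    (-125.0, 24.0, -66.0, 50.0),
)
print(url)
```

Fetching the resulting URL from a running server should return a PNG of the requested extent. Changing the format parameter asks the server for a different output type.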

The GeoServer project provides an exceptionally quick setup process on all platforms, which is well suited to small projects. When the need for additional servers arises (to handle heavy loads or to provide failover), configuration becomes much more complex. AIS has been configuring GeoServer deployments for many years now, and we’ve identified a number of optimizations and deployment best practices that can be applied to ensure consistent performance for the largest datasets. These lessons learned have been distilled into Docker images that AIS has published on Docker Hub.

Docker has been a huge topic in the DevOps conversation, particularly over the last year, and will certainly continue to grow in popularity. Unless you’ve been living under a rock, you’ve probably heard that Docker lets you package and run software in an isolated manner, somewhat akin to virtual machine images. But the real beauty of Docker is the simplicity of its toolchain, which abstracts away the complexity of the underlying container technologies and has increased adoption among rank-and-file developers.

Using minimal Linux distributions within Docker images, AIS is able to package software in a manner that is entirely platform agnostic. For the GeoServer Docker images, we are using Alpine Linux with only Java Runtime Environment 8 and GeoServer 2.10.0. This yields a fully cross-platform GeoServer Docker image, weighing in at only 134 MiB (~140.5 MB).
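
For readers who want to check the size conversion, the arithmetic between binary mebibytes and decimal megabytes can be sketched as follows. The function name is ours, for illustration only.

```python
MIB = 1024 ** 2  # bytes in one mebibyte (binary unit)
MB = 1000 ** 2   # bytes in one megabyte (decimal unit)


def mib_to_mb(mib):
    """Convert a size in MiB to decimal MB."""
    return mib * MIB / MB


print(round(mib_to_mb(134), 1))  # 140.5
```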

Our primary use for these Docker images is deployment from the Datacenter Operating System (DC/OS) Universe. DC/OS exposes heterogeneous infrastructure in a consistent manner, allowing for effortless deployment of complex software architectures through Universe packages. DC/OS builds on Apache Mesos for hardware abstraction and provides execution frameworks such as Metronome for scheduled jobs and Marathon for long-running services. DC/OS can be deployed onto almost any local, on-premises or cloud infrastructure, granting portability for your entire software system. As a result, our GeoServer package can be installed in minutes through the GUI or CLI that DC/OS provides, and it runs within Marathon as a load-balanced, fault-tolerant service.
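
To give a feel for what Marathon runs on your behalf, here is a rough sketch of a Marathon application definition for a containerized GeoServer, built in Python and printed as JSON. This is not the definition shipped in our Universe package: the image name, resource sizes, instance count and health check path are all assumptions chosen for illustration.

```python
import json


def geoserver_app(instances=2, image="example/geoserver:2.10.0"):
    """Return a Marathon app definition as a dict.

    The image name is hypothetical; substitute the image you actually use.
    """
    return {
        "id": "/geoserver",
        "instances": instances,  # more than one instance allows failover
        "cpus": 1.0,             # assumed sizing, tune for your workload
        "mem": 2048,             # MiB, assumed sizing
        "container": {
            "type": "DOCKER",
            "docker": {
                "image": image,
                "network": "BRIDGE",
                # hostPort 0 lets Marathon assign a free port on the agent
                "portMappings": [
                    {"containerPort": 8080, "hostPort": 0, "protocol": "tcp"}
                ],
            },
        },
        # Marathon replaces instances that fail this check repeatedly.
        "healthChecks": [
            {
                "protocol": "HTTP",
                "path": "/geoserver/web/",  # assumed path of the web UI
                "portIndex": 0,
                "gracePeriodSeconds": 300,
                "intervalSeconds": 30,
                "maxConsecutiveFailures": 3,
            }
        ],
    }


print(json.dumps(geoserver_app(), indent=2))
```

Scaling out is then a matter of raising the instance count, while the health check is what lets Marathon restart an unresponsive server without manual intervention.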

The following system diagram provides a high-level view of how the GeoServer package operates within DC/OS:

If this service sounds like a capability that could meet your needs, give it a try for yourself. Complete documentation for deployment can be found here. A deployment of DC/OS 1.8+ will be required and instructions for this can be found for your preferred deployment environment here.

About Jonathan Meyer

Jonathan Meyer is a veteran Software Engineer with over 10 years of development experience. This includes multiple geospatially enabled web applications, ETL solutions and development on the Scale cluster processing framework. He is dedicated to improving development practices, fostering open source development and executing sensible DevOps in the field of government contracting. In his spare time, Jonathan likes hanging out with his two girls, exploring new and exciting places and dabbling in amateur photography.