It’s time to introduce DevOps to your GIS data management workflow. If your organization manages large volumes of data, commonly referred to as Big Data, your workflows likely exhibit these common scenarios: multiuser distributed data editing, frequent check-out and check-in of spatial data, comparison of data versions over short intervals, resolution of edit conflicts, and maintenance of the different versions against the production version throughout the lifecycle of data collection, cleansing, development and production.
DevOps is a relatively new term that has gained ground in the geo-IT industry, and GIS is not left out of its applicability. It presents a set of principles that, in GIS, describe how to bring agility to the administration and operation of your organization’s data development lifecycle. It increases the value of collaboration between your data collection, cleaning, development and production operations teams.
Managing multiple versions of geospatial data coming from multiple teams of editors in a distributed editing environment can be complex, especially with regard to version synchronization, which can result in conflicts at both the attribute and the geometry level. These bottlenecks occur mostly with centralized databases, especially for open crowd-sourced data. It is therefore necessary for editors to track changes while editing data in disconnected environments, using tools that work like source code version control.
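To make the idea of attribute-level and geometry-level conflicts concrete, here is a minimal Python sketch of a three-way merge on a single feature. It is an illustration only, not how any of the tools discussed in this article are implemented: the function name, the field names and the use of a WKT string for the geometry are all assumptions made for the example.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical feature model: attribute name -> value, with "geom" holding WKT text.
Feature = Dict[str, Any]


def three_way_merge(base: Feature, ours: Feature, theirs: Feature) -> Tuple[Feature, List[str]]:
    """Merge two edited copies of one feature against their common ancestor."""
    merged: Feature = {}
    conflicts: List[str] = []
    for field in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(field), ours.get(field), theirs.get(field)
        if o == t:
            merged[field] = o          # both sides agree (or neither changed it)
        elif o == b:
            merged[field] = t          # only "theirs" changed this field
        elif t == b:
            merged[field] = o          # only "ours" changed this field
        else:
            conflicts.append(field)    # both changed it differently
            merged[field] = b          # keep the ancestor value until resolved
    return merged, conflicts


base = {"name": "Main St", "lanes": 2, "geom": "LINESTRING(0 0, 1 1)"}
ours = {"name": "Main Street", "lanes": 2, "geom": "LINESTRING(0 0, 1 2)"}
theirs = {"name": "Main St", "lanes": 4, "geom": "LINESTRING(0 0, 2 2)"}

merged, conflicts = three_way_merge(base, ours, theirs)
print(merged)
print(conflicts)
```

In this example the two editors changed different attributes (name and lanes), so those edits merge cleanly, while both moved the geometry, so "geom" is reported as a conflict for a person to resolve.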
Finding a tool that adapts seamlessly to your organization’s data development workflow and capability has been a challenge for many organizations. This has led many of them to jump into Esri’s world to enjoy ArcGIS Geodatabase Versioning, Editor Tracking and Archiving capabilities. The consequence is forking out huge sums for multiple Advanced Desktop and Server licenses.
Out in the streets of open source, there have been remarkable efforts to develop Git-like tools for managing geospatial data, much like ArcGIS Geodatabase Versioning. Little-known but superb tools have been made available by dedicated developers. Most of these tools are built for PostGIS and delivered as QGIS plugins; there is hardly a better way to deliver them than as plugins to the most popular open source GIS. On the OpenStreetMap platform, Overpass queries can reveal modification history. However, that tool cannot be used for an in-house database.
In this article, we will examine three brilliant geospatial data version control tools: GeoGig, PGVersion and the QGIS versioning plugin.
GeoGig
GeoGig is described as drawing its design concepts from Git, the venerable source code VCS. Git is a free and open source distributed version control system designed to handle everything from small to very large projects with speed and efficiency.
GeoGig makes it possible for users to import raw geospatial data, currently from Shapefiles, PostGIS or SpatiaLite, into a repository where every change to the data is tracked. These changes can be viewed in a history, reverted to older versions, branched into sandboxed areas, merged back in, and pushed to remote repositories. The concept is similar to Git, featuring commands like geogig-init, geogig-status, geogig-add, geogig-branch, geogig-apply, geogig-checkout, geogig-commit and geogig-clone. These perform operations on your spatial data similar to those Git performs on source code. To see all the commands and how to use them see more here.
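As a rough illustration of how those commands chain together, the Python sketch below builds, but deliberately does not run, the command lines for a first import and commit of a Shapefile. The helper name, the file name and the commit message are hypothetical, and the subcommand spelling (geogig init, geogig shp import, geogig add, geogig commit -m) follows the GeoGig tutorial as I understand it, so check it against the version you have installed.

```python
import shlex
from typing import List


def geogig_first_commit(shapefile: str, message: str) -> List[List[str]]:
    """Build (but do not run) the commands for a first import and commit.

    Hypothetical helper; the subcommands are assumed from the GeoGig tutorial.
    """
    return [
        ["geogig", "init"],                      # create a repository in the current folder
        ["geogig", "shp", "import", shapefile],  # load the Shapefile into the working tree
        ["geogig", "add"],                       # stage the imported features
        ["geogig", "commit", "-m", message],     # record them as a new version
    ]


# Print the commands so they can be reviewed before being typed into a terminal.
for cmd in geogig_first_commit("parks.shp", "Initial import of parks"):
    print(shlex.join(cmd))
```

Printing rather than executing keeps the sketch safe to run on a machine without GeoGig; each list could be passed to subprocess.run once you have confirmed the commands.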
GeoGig can be used in QGIS by installing the GeoGig QGIS Plugin (OLD), which allows you to manage your GeoGig repositories right from within QGIS; check out the instructions here. Documentation of the GeoGig usage workflow in QGIS is found here.
PostGIS Versioning System (PGVersion or pgvs)
The essential idea here is to version PostGIS layers when more than one person is editing them at the same time. It exposes a number of functions for achieving this:
- pgvsinit()
- pgvscommit()
- pgvsmerge()
- pgvsdrop()
- pgvsrevert()
- pgvsrevision()
- pgvslogview()
- pgvsrollback()
You can find complete documentation on these functions here.
This tool is developed by Dr. Horst Duester / Sourcepole AG, Zurich, the same developer behind QGIS Cloud.
And as usual, the open source GIS dev community is gracious enough to make your life easier by providing a QGIS plugin for pgvs, allowing you to conduct all your PostGIS layer version control from within QGIS.
QGIS versioning plugin
Up to this point, the desire has been for a spatial data VCS that:
- employs PostGIS as the main repository,
- exposes the ability to edit specific geographic sections of a data layer, and
- allows data to be checked out for offline work and checked in later, without the editor having to write a line of code.
All while reducing licensing costs.
In 2015, Valcea County in Romania and eHealth Africa (eHA) in Nigeria used FOSS4G and funded the development of a new QGIS versioning Plugin.
The goal was to develop a tool to manage history of geodata stored in PostGIS using QGIS, but also enable the creation of geographic filtered working copies in SpatiaLite file for offline edition. This plugin is based mainly on ogr2ogr (GDAL) open source library and SQL command line. ~Source GoGeomatics
Read the full story here
A typical use case met by the plugin involves one or more users checking out a local working copy to edit features in (potentially) offline mode using SpatiaLite and committing changes back to the PostGIS server.
Read more from official documentation.
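The checkout, offline edit and commit cycle described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the plugin's code: the Server class stands in for the PostGIS table, the WorkingCopy class stands in for the SpatiaLite file, features are simple points, and a stale working copy is rejected outright instead of being updated and merged.

```python
from typing import Dict, Tuple

BBox = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


class WorkingCopy:
    """Hypothetical stand-in for the offline SpatiaLite file."""

    def __init__(self, features: Dict[int, dict], base_revision: int) -> None:
        self.features = features
        self.base_revision = base_revision  # server revision at checkout time


class Server:
    """Hypothetical stand-in for the central PostGIS table plus a revision counter."""

    def __init__(self, features: Dict[int, dict]) -> None:
        self.features = features
        self.revision = 1

    def checkout(self, bbox: BBox) -> WorkingCopy:
        """Copy only the features that fall inside the bounding box."""
        min_x, min_y, max_x, max_y = bbox
        subset = {
            fid: dict(f)
            for fid, f in self.features.items()
            if min_x <= f["x"] <= max_x and min_y <= f["y"] <= max_y
        }
        return WorkingCopy(subset, self.revision)

    def commit(self, wc: WorkingCopy) -> int:
        """Write the working copy back, refusing it if the server has moved on."""
        if wc.base_revision != self.revision:
            raise RuntimeError("working copy is out of date; update before committing")
        self.features.update({fid: dict(f) for fid, f in wc.features.items()})
        self.revision += 1
        wc.base_revision = self.revision
        return self.revision


server = Server({
    1: {"x": 1.0, "y": 1.0, "name": "clinic"},
    2: {"x": 9.0, "y": 9.0, "name": "school"},
})
wc = server.checkout((0, 0, 5, 5))         # only feature 1 is inside this area
wc.features[1]["name"] = "health clinic"   # edit while offline
new_revision = server.commit(wc)           # push the edit back
print(new_revision, server.features[1]["name"])
```

The revision check is the important design point: it is what stops a second editor, who checked out before this commit, from silently overwriting it.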
The plugin can be installed via Plugins -> Plugin Manager: Versioning, or head over to the QGIS Versioning Plugin documentation to download the latest source code.
I hope these VCS help inject more agile data management practices into your enterprise. Learn them. Practice them. Be a better spatial data steward.
These tools, like any other open source tool, fit well into an IT infrastructure. Whether you’re building an SDI or a small data repository, exploit these free tools plus the widely available enterprise training and support, and you’ll be great right out of the gate.