Tech Overview

Technologically, this platform is the result of a few layered applications.

  1. GeoNode - an open source geospatial content management system
    Components include: Django, GeoServer, PostgreSQL/PostGIS, OpenLayers
  2. A Georeference app - handles all of the georeferencing operations
    Dependencies added: MapServer, Svelte
  3. A LOC Insurance Maps app - curates the ingestion and storage of the Sanborn maps
    Dependencies added: Svelte

GeoNode

GeoNode is an open source geospatial content management system that can function as a geospatial data portal, allowing organizations to publish and curate their spatial datasets. It has been implemented by non-profit and governmental entities around the world.

Choosing GeoNode as the base for this georeferencing platform provided the following advantages, to name a few:

The Georeference App

This app is designed as a standalone GeoNode extension. In theory, anyone could install it within their own GeoNode installation, independent of the LOC app described below (please open a ticket if you are interested in doing so).

This app facilitates the actual georeferencing process, i.e. it allows users to georeference Documents in the GeoNode CMS, turning them into raster Layers (GeoTIFFs). It consists of three user-accessible tool interfaces, as well as a new summary tab in the Layer and Document detail pages. A quick summary of these tools follows, but more detailed documentation on how to use each one can be found in the Georeferencing Process section.

Georeferencing in Stages

One central strategy is a predetermined set of GeoNode "thesaurus keywords" that are assigned to resources during the georeferencing process to keep track of their status along the way. These keywords will be highlighted in bold below, to illustrate the progression. To begin georeferencing any uploaded Document, one need only assign it the unprepared keyword.
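The keyword progression described in the Split, Georeference, and Trim sections can be read as a small state machine. The sketch below is only an illustration: the keyword names come from this document, but the transition table and the `advance` helper are assumptions inferred from the prose, not code taken from the app.

```python
# Status keywords named in this document. The allowed transitions are an
# assumption inferred from the prose, not copied from the app's source.
TRANSITIONS = {
    "unprepared": {"splitting", "prepared"},  # split it, or record that no split is needed
    "splitting": {"split"},                   # background split has finished
    "split": set(),                           # parent is hidden; its children start as "prepared"
    "prepared": {"georeferencing"},
    "georeferencing": {"georeferenced"},
    "georeferenced": {"georeferencing", "trimmed"},  # re-georeference (assumed) or trim the Layer
    "trimmed": set(),
}


def advance(current: str, new: str) -> str:
    """Return the new status keyword, or raise if the step is not allowed."""
    if new not in TRANSITIONS.get(current, set()):
        raise ValueError(f"cannot move from {current!r} to {new!r}")
    return new
```

For example, `advance("unprepared", "prepared")` models recording that a Document does not need to be split, while `advance("split", "georeferencing")` raises an error because a split parent Document is never georeferenced itself.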

Split

/split/<document_id>        # the splitting interface for a Document

The interface for the splitting tool.

This interface allows users to "split" a document into smaller pieces, which is necessary if the scanned image has two different maps on it (because each must be georeferenced separately). More generally, this process could be called "Preparation."

While the splitting process is running in the background, the keyword splitting is assigned. When the process is complete, the Document that has been split is flagged as metadata_only so it no longer appears in search results, and it is assigned the split keyword. New documents resulting from the split are marked as prepared.

If a document does not need to be split, this evaluation can be recorded and it will be marked as prepared, i.e. ready to be georeferenced.

Georeference

/georeference/<document_id> # the georeferencing interface for a Document

The georeferencing interface.

The "georeference" interface allows users to create ground control points (GCPs), which are then sent to a backend process and used to warp the Document. During the warping process, the Document is assigned the georeferencing keyword. Upon completion, a new Layer (a GeoTIFF) is created in GeoServer and registered in GeoNode. The Document and Layer are both assigned the keyword georeferenced.
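To make the role of the GCPs more concrete, here is a minimal sketch of how a set of control points could be turned into `-gcp` arguments for GDAL's `gdal_translate` command. The GeoJSON layout, in particular the `image` property holding the pixel and line position, is a hypothetical shape chosen for illustration, and the app's backend may well use the GDAL Python bindings rather than the command line.

```python
import json


def gcp_args(geojson_text: str) -> list[str]:
    """Build `-gcp pixel line easting northing` arguments for gdal_translate."""
    args: list[str] = []
    for feature in json.loads(geojson_text)["features"]:
        # "image" is a hypothetical property name for the pixel/line position.
        pixel, line = feature["properties"]["image"]
        easting, northing = feature["geometry"]["coordinates"][:2]
        args += ["-gcp", str(pixel), str(line), str(easting), str(northing)]
    return args


# A single made-up control point, for illustration only.
example = json.dumps({
    "type": "FeatureCollection",
    "features": [{
        "type": "Feature",
        "properties": {"image": [100, 200]},
        "geometry": {"type": "Point", "coordinates": [-90.07, 29.95]},
    }],
})
```

Under these assumptions, the returned list could be placed between `gdal_translate -of VRT` and the input and output paths, and the result passed to `gdalwarp` to produce the warped GeoTIFF.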

Trim

/trim/<layer_alternate>     # the trimming interface for a Layer

The layer trimming interface.

The "trim" interface allows a user to draw a mask around the boundary of the layer in order to remove the margins of scanned maps. This is not necessary, but allows adjacent map sheets from a historical map series to be overlapped, creating a seamless mosaic. Once a layer has been trimmed it is assigned the trimmed keyword.

This process does not alter the file. It stores a mask polygon for the layer, generates a new GeoServer style passing the mask polygon to the CropCoverage SLD filter, and then sets this style as the default.
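As a rough illustration of the styling step, the sketch below formats a mask polygon as WKT and wraps it in an SLD rendering-transformation fragment. The function names are hypothetical, and the fragment's shape (`ras:CropCoverage` with `coverage` and `cropShape` parameters) is an assumption that should be checked against the GeoServer documentation for the version in use. It is not the style the app actually generates.

```python
def polygon_wkt(ring: list[tuple[float, float]]) -> str:
    """Format a single exterior ring as WKT, closing the ring if needed."""
    if ring[0] != ring[-1]:
        ring = ring + [ring[0]]
    coords = ", ".join(f"{x} {y}" for x, y in ring)
    return f"POLYGON(({coords}))"


def crop_transformation(wkt: str) -> str:
    """Wrap the WKT in an SLD transformation fragment (assumed shape)."""
    return (
        "<Transformation>"
        '<ogc:Function name="ras:CropCoverage">'
        '<ogc:Function name="parameter">'
        "<ogc:Literal>coverage</ogc:Literal>"
        "</ogc:Function>"
        '<ogc:Function name="parameter">'
        "<ogc:Literal>cropShape</ogc:Literal>"
        f"<ogc:Literal>{wkt}</ogc:Literal>"
        "</ogc:Function>"
        "</ogc:Function>"
        "</Transformation>"
    )
```

Because the mask lives only in the style, removing or replacing the style restores the untrimmed layer without touching the GeoTIFF.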

Overview tab

In the Document detail and Layer detail pages a new tab is added labeled Georeference. This tab provides a summary of all the georeferencing actions that have been performed on that Document or Layer. You can also access the next step in the georeferencing process for the resource from this tab.

A summary of the georeferencing actions for a document/layer is displayed in this tab.

In the search results pages, a list of links is added to each item, allowing quick access to any of the above pages. Only links to actions that are appropriate for that item's georeferencing progress are active.

Georeference links in a search result item. From left: summary tab, split, georeference, trim, and jump to the corresponding layer. This document has already been georeferenced, so the split link is disabled.
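The link gating described above could be expressed as a lookup from status keyword to the set of active links. Only the georeferenced case (split disabled) is stated in this document. The other rows, the link names, and the `active_links` helper are assumptions made for the sake of the example.

```python
ALL_LINKS = ["summary", "split", "georeference", "trim", "layer"]

# Only the "georeferenced" row is described in the text; the rest are guesses.
ACTIVE_BY_STATUS = {
    "unprepared": {"summary", "split"},
    "prepared": {"summary", "georeference"},
    "georeferenced": {"summary", "georeference", "trim", "layer"},
    "trimmed": {"summary", "georeference", "trim", "layer"},
}


def active_links(status: str) -> dict[str, bool]:
    """Map each link name to True (active) or False (disabled)."""
    # Unknown statuses fall back to the summary tab only (an assumption).
    active = ACTIVE_BY_STATUS.get(status, {"summary"})
    return {name: name in active for name in ALL_LINKS}
```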


All interfaces are written using Svelte. MapServer is used to generate the WMS preview used during georeferencing from a VRT that is dynamically updated with ground control points.
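One way to picture the "dynamically updated" VRT is a function that rewrites the `GCPList` element of a GDAL VRT file each time the user adds or moves a control point. The sketch below uses only the standard library. The `set_gcps` helper and its argument layout are hypothetical, and the app's own implementation may differ.

```python
import xml.etree.ElementTree as ET


def set_gcps(vrt_xml: str, gcps, projection: str = "EPSG:4326") -> str:
    """Replace the GCPList of a VRT document with the given control points.

    `gcps` is an iterable of (pixel, line, x, y) tuples (assumed layout).
    """
    root = ET.fromstring(vrt_xml)
    for old in root.findall("GCPList"):
        root.remove(old)  # drop any previous set of points
    gcp_list = ET.Element("GCPList", Projection=projection)
    for i, (pixel, line, x, y) in enumerate(gcps, start=1):
        ET.SubElement(
            gcp_list, "GCP",
            Id=str(i), Pixel=str(pixel), Line=str(line), X=str(x), Y=str(y),
        )
    root.insert(0, gcp_list)
    return ET.tostring(root, encoding="unicode")
```

Calling the function again with a new list of points replaces the previous `GCPList` rather than appending to it, so the preview always reflects the current set of points.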

Note

Earlier iterations of this app incorporated IIIF with the intention of building from Bert Spaan's work at allmaps.org. Remnants of this approach have been moved into a separate app called iiif_support, and could be reincorporated in the future.

Data Model & Procedure

All georeferencing activity is stored in SessionBase objects, as implemented through the proxy models PrepSession, GeorefSession, and TrimSession. Each proxy model has its own implementation of a run() method which uses the information in the data field to perform the appropriate actions.

Data model for the georeference app.

Narrative Explanation

When a user begins preparing a Document, a new PrepSession is created. If the user creates cutlines to split the document, this information is saved in the session's data field as JSON and then used to run the splitting action that creates new child documents (the original document is not altered).

When a user begins georeferencing a Document, a new GeorefSession is created. When the ground control points have been created and submitted through the interface, they are stored as GeoJSON in the session's data field and then used to warp the Document and create a Layer. Finally, they are saved separately as GCP objects and aggregated into one GCPGroup per Document. This facilitates iterative editing of the Document's "canonical" GCPs, while also allowing for the reversion to a past set of GCPs if necessary.

Similarly, a user creates a TrimSession when they begin trimming a Layer. The mask polygon is stored in the session's data and then pushed to the Layer's canonical LayerMask object, and applied as a cropped style in GeoServer.
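The session pattern described above, one base record with a data field and a `run()` method per session type, can be sketched in plain Python. This is a stand-in for illustration, not the Django proxy models themselves. The keys inside `data`, the returned status strings, and the minimum of three control points are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class SessionBase:
    """Plain-Python stand-in for the model; `data` holds the JSON payload."""
    resource_id: int
    data: dict = field(default_factory=dict)

    def run(self) -> str:
        raise NotImplementedError


class PrepSession(SessionBase):
    def run(self) -> str:
        # "cutlines" is a hypothetical key; no cutlines means no split is needed.
        return "split" if self.data.get("cutlines") else "prepared"


class GeorefSession(SessionBase):
    def run(self) -> str:
        # Assumed rule: at least three control points are needed to warp.
        if len(self.data.get("features", [])) < 3:
            raise ValueError("need at least 3 ground control points")
        return "georeferenced"


class TrimSession(SessionBase):
    def run(self) -> str:
        if "mask" not in self.data:  # hypothetical key for the mask polygon
            raise ValueError("no mask polygon stored in session data")
        return "trimmed"
```

Keeping each session's input in `data` means a past session can be inspected or replayed later, which is what makes reverting to an earlier set of GCPs possible.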

The LOC Insurance Maps App

This application creates database models and scaffolding to support the acquisition and management of content from the LOC Sanborn Map collection. On the front end, it provides the following urls:

/                       # the home page with branding, etc.
/loc/<volume_doi>       # overall progress page for sheets of a volume
/loc/volumes            # access point to load and explore volumes

All interfaces are written using Svelte. A custom GeoNode theme was created as well to manage general color branding, etc.

Icons in this app are by Alex Muravev and Olga from the Noun Project.

A Note About the Project

There were two main motivations for this endeavor:

Neither of these ideas is new, but I was especially inspired by the idea of the archival commons (Anderson and Allen 2009) and wanted to build something with elements of that model: open access, public curation, and extensibility. The wealth of available open source geospatial software offers so many ways to do this. I wanted to make this site to present some polished ideas, but consider them more as starting points than a finished product.

Building around the LOC Sanborn collection was a natural fit, as it is a massive repository of archival content in the public domain that has a good JSON API around it. I have also long been enamored with Sanborn maps, and this project is as much a love letter to them as anything else.

I have presented about the project a few times during the development process:


    • Eveleigh, Alexandra. 2014. “Crowding out the Archivist? Locating Crowdsourcing within the Broader Landscape of Participatory Archives.” In Crowdsourcing Our Cultural Heritage, 211–29. Farnham: Ashgate Publishing.
    • Anderson, Scott, and Robert Allen. 2009. “Envisioning the Archival Commons.” The American Archivist 72 (2): 383–400. https://doi.org/10.17723/aarc.72.2.g54085061q586416.