I should have made a post about this a while ago, but I didn't want a half-complete post, and the scope of my project kept expanding!
Part 1: Scraping
I found two huge repositories of old digitised maps of Australia, many of which are in the public domain: the National Library of Australia, and the Parish Maps from the Department of Lands NSW. Unfortunately neither had a nicely documented RESTful API for the images and metadata. My first step was to extract as much information as I could and convert it into an intermediate format. Most of my code and documentation for doing this is at https://github.com/andrewharvey/govscrape in those two respective folders. Unfortunately it's not as easy as running one command from my repo to download and parse all the data. My goal was to get the data to my machine, not to write a robust system that anyone could run to get a clone of the nla and pmap repositories.
Part 2: Georeferencing
It would be great if I could push out an easy-to-use API for the data I collected from the scrape stage, but I don't have the resources (let me know if you are willing to help out with server resources to host these old public domain maps). Even without a nice interface to the data, I could still play around with it and see what use I could make of it. I dabbled in using these maps as a source of data for OpenStreetMap. I only got through a few of the maps before putting this on hold, as I figured it would be easier (especially for others) to do this if they were georeferenced. I tried out both http://warper.geothings.net/ and QuantumGIS, but both were far too laggy. So I rolled my own solution, which is just a bunch of scripts that use Inkscape and a hacked libchamplain demo as the GUI. The code and documentation for this is at https://github.com/andrewharvey/georeferencing-scripts.
The georeferencing data that I have made so far (it's a big task!) is at https://github.com/andrewharvey/georeferencing-data.
Part 3: Sharing
From the data and code from the last step, I'm able to push out these old maps in several formats. I used gdalwarp to convert the maps into Transverse Mercator (well, actually I don't really know what they are, but this seems to work). From here I can use gdal2tiles.py to push out an OSM slippy map style tile directory, I can push out a KML GroundOverlay, or you could probably use a WMS server to push it out through WMS. I really wanted to leave it open. (I finally understand the difference between OSM slippy map tile names and the OSGeo TMS. Take note that gdal2tiles.py produces TMS format tiles, which differ from OSM style in that the y axis goes from bottom to top; see http://groups.google.com/group/maptiler/browse_thread/thread/aa89fc726b8f7261/8bdc39d7829cc80c.)
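To make the TMS versus OSM difference concrete, here is a minimal sketch of the row conversion. It assumes the usual global tiling scheme where zoom level z has 2^z rows of tiles; the function name tms_to_xyz is just something I made up for illustration, it isn't part of gdal2tiles.py.

```python
def tms_to_xyz(zoom: int, x: int, y_tms: int) -> tuple:
    """Convert a TMS tile address (y counted from the bottom)
    to an OSM slippy map address (y counted from the top).

    Assumes the global scheme with 2**zoom rows per zoom level.
    """
    rows = 2 ** zoom
    if not 0 <= y_tms < rows:
        raise ValueError("y out of range for this zoom level")
    # Zoom and x are unchanged; only the row index is mirrored.
    return zoom, x, rows - 1 - y_tms


if __name__ == "__main__":
    # The bottom row in TMS at zoom 2 is the last row (3) in OSM numbering.
    print(tms_to_xyz(2, 1, 0))
```

The flip is its own inverse, so the same function should also turn OSM style addresses back into TMS ones. In practice this means either renaming the tile files after gdal2tiles.py has run, or telling the client that the layer is TMS.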
[caption id="attachment_1235" align="aligncenter" width="600" caption="Overlay from public domain map, http://nla.gov.au/nla.map-rm2795. Background CC BY-SA 2.0 OpenStreetMap Contributors, http://www.openstreetmap.org/"][/caption]
[caption id="attachment_1236" align="aligncenter" width="600" caption="Parishmap as backdrop in JOSM. Data CC BY-SA 2.0 OpenStreetMap Contributors, http://www.openstreetmap.org/. Background public domain map PMapMN04/14015601."][/caption]
[caption id="attachment_1237" align="aligncenter" width="600" caption="Overlay from public domain map, PMapMN04/14015601. Background CC BY-SA 2.0 OpenStreetMap Contributors, http://www.openstreetmap.org/"][/caption]
I would post a Google Earth one too, but it's too much effort to get a free background in there for the screenshot. I'm not convinced that this display of the data is user friendly. Having control of the transparency of the overlay is a must. Maybe one day someone will crop out all the non-map parts of the parish maps so we can get a single whole-of-NSW parish map slippy map.
I suppose now I need to focus on the infrastructure. It should be really easy for a user to browse the available maps and view them either as KML or as an OpenLayers overlay. I should also plug this into the metadata I scraped and have stored in CSV-like files.
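As a rough idea of what the KML side could look like, here is a minimal sketch that writes a GroundOverlay for one warped image. The function name, the image path and the bounding box values are all placeholders I've assumed for illustration, not what my scripts actually produce. The color element uses KML's aabbggrr order, so lowering the alpha byte gives a semi-transparent overlay.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def ground_overlay_kml(name, image_href, north, south, east, west, alpha=0xB0):
    """Return a KML document (as a string) with one GroundOverlay.

    north/south/east/west are the overlay bounds in decimal degrees.
    alpha is 0 (invisible) to 255 (opaque).
    """
    ET.register_namespace("", KML_NS)

    def tag(local):
        return "{%s}%s" % (KML_NS, local)

    kml = ET.Element(tag("kml"))
    overlay = ET.SubElement(kml, tag("GroundOverlay"))
    ET.SubElement(overlay, tag("name")).text = name
    # aabbggrr: white with reduced alpha keeps the image colours but fades it.
    ET.SubElement(overlay, tag("color")).text = "%02xffffff" % alpha
    icon = ET.SubElement(overlay, tag("Icon"))
    ET.SubElement(icon, tag("href")).text = image_href
    box = ET.SubElement(overlay, tag("LatLonBox"))
    for key, value in (("north", north), ("south", south),
                       ("east", east), ("west", west)):
        ET.SubElement(box, tag(key)).text = str(value)
    return ET.tostring(kml, encoding="unicode")


if __name__ == "__main__":
    # Placeholder values only; a real overlay needs the warped image's bounds.
    print(ground_overlay_kml("demo overlay", "warped.png",
                             -33.0, -34.0, 151.5, 150.5))
```

A LatLonBox only describes an axis-aligned (optionally rotated) rectangle, so this would only suit images that have already been warped.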
The problem I have with distribution right now is that many of the maps need warping, and that means I need to host the warped image somewhere. Some could probably be georeferenced from their source image using just translate, scale and rotate, and hence should be able to use the source image from the government server to serve the georeferenced imagery. But the workflow I've set up so far relies on using gdalwarp, and hence on having access to the warped image.
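One way to find out which maps only need translate, scale and rotate would be to fit exactly that kind of transform to the control points and look at how far off it is. This is a sketch of the idea, not something in my scripts: it assumes the control points are pairs of pixel positions and projected coordinates (in metres, not lat/lon), and the function name is made up. It treats points as complex numbers, so the whole transform is w = a*z + b, where a carries the scale and rotation and b the translation.

```python
import cmath


def fit_similarity(pixels, coords):
    """Least-squares fit of w = a*z + b between pixel and map points.

    pixels: list of (column, row) image positions, row increasing downwards.
    coords: list of (easting, northing) in a projected system.
    Returns (a, b, worst_error), with worst_error in map units.
    """
    if len(pixels) != len(coords) or len(pixels) < 2:
        raise ValueError("need at least two matching control points")
    # Negate the pixel row so both systems have y increasing upwards.
    z = [complex(px, -py) for px, py in pixels]
    w = [complex(e, n) for e, n in coords]
    count = len(z)
    z_mean = sum(z) / count
    w_mean = sum(w) / count
    spread = sum(abs(zi - z_mean) ** 2 for zi in z)
    if spread == 0:
        raise ValueError("control points must not all coincide")
    a = sum((wi - w_mean) * (zi - z_mean).conjugate()
            for zi, wi in zip(z, w)) / spread
    b = w_mean - a * z_mean
    worst_error = max(abs(a * zi + b - wi) for zi, wi in zip(z, w))
    return a, b, worst_error


if __name__ == "__main__":
    # Made-up control points: 2 m per pixel, no rotation.
    pixels = [(0, 0), (100, 0), (0, 100), (100, 100)]
    coords = [(1000, 5000), (1200, 5000), (1000, 4800), (1200, 4800)]
    a, b, worst = fit_similarity(pixels, coords)
    print("scale:", abs(a), "rotation (radians):", cmath.phase(a))
    print("worst error:", worst)
```

If the worst error is small compared to the accuracy I care about, the map is a candidate for being served straight from the source image; if it's large, the map really does need warping.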