Unlock

Unlock text

Unlock the location data implicit in your text documents.

A simple RESTful web service: POST your text content (plain text, HTML pages, or XML containing metadata) and get back a feed of the named places found in the text, with best guesses as to their locations.

First, we extract likely placenames from a piece of text. Next, we look up the placenames in the gazetteer, and match the placenames to locations using the context provided by the text. For example, if "Leith" and "Portobello" are mentioned together, we're more likely to be talking about "Leith, Edinburgh" than "Leith, Ontario".
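To make the idea of context-based matching concrete, here is a minimal sketch in Python. It is not the geoparser's actual algorithm, which is not described here. It assumes a tiny hypothetical gazetteer (CANDIDATES, with approximate coordinates chosen for illustration) and simply picks the combination of candidates that lie closest together.

```python
import math
from itertools import product

# Hypothetical mini-gazetteer: placename -> list of (label, lat, lon).
# Coordinates are approximate and for illustration only.
CANDIDATES = {
    "Leith": [
        ("Leith, Edinburgh", 55.98, -3.17),
        ("Leith, Ontario", 44.62, -80.88),
    ],
    "Portobello": [
        ("Portobello, Edinburgh", 55.95, -3.11),
        ("Portobello, Dublin", 53.33, -6.27),
    ],
}


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def resolve(names, candidates=CANDIDATES):
    """Choose one candidate per name so the summed pairwise distance is smallest."""
    options = [candidates[name] for name in names]

    def spread(combo):
        # Sum of distances over every unordered pair in this combination.
        return sum(
            haversine_km(p[1:], q[1:])
            for i, p in enumerate(combo)
            for q in combo[i + 1:]
        )

    best = min(product(*options), key=spread)
    return {name: choice[0] for name, choice in zip(names, best)}
```

With these candidates, resolve(["Leith", "Portobello"]) selects the two Edinburgh entries, because they are only a few kilometres apart while every other pairing is hundreds or thousands of kilometres apart.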

The text/places service can be used either with GeoNames, the open-data worldwide gazetteer, or with Unlock, the Ordnance Survey-derived UK gazetteer.

Geo-Parser

Note: The Geo-Parser can take a while to process large documents. We recommend geo-parsing one page of text at a time.
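One way to follow this recommendation is to split a long document into page-sized chunks before submitting each one separately. The sketch below is only an illustration: the function name split_into_pages and the 3000-character default are assumptions, not limits documented by the service.

```python
def split_into_pages(text, max_chars=3000):
    """Group paragraphs into chunks of at most max_chars characters.

    Paragraphs are assumed to be separated by blank lines. A single
    paragraph longer than max_chars is kept whole as its own chunk.
    """
    pages = []
    current = ""
    for para in text.split("\n\n"):
        para = para.strip()
        if not para:
            continue
        candidate = f"{current}\n\n{para}" if current else para
        if current and len(candidate) > max_chars:
            # Adding this paragraph would overflow the page: start a new one.
            pages.append(current)
            current = para
        else:
            current = candidate
    if current:
        pages.append(current)
    return pages
```

Each returned chunk can then be sent as the contents of its own request, so that no single call has to process the whole document.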

A valid API key is only required to query the Unlock gazetteer.

Parameter | Description
Type | The type of content in your file to be geo-parsed: Plain Text, HTML or XML.
Gazetteer | Which gazetteer will be used to look up placenames: GeoNames (free access) or Unlock (Digimap key required).
Output Format | The format in which you want to receive results: XML, JSON or KML document.
Upload File | Browse for a local file to upload and geo-parse.

Text/Places API

An API for Geo-parsing is available. Make a POST request to this URL with the parameters shown below: http://unlock.edina.ac.uk/text/places

You can use a command-line client, such as curl, to make the POST request like so:

curl -d type=plain -d "document=Carnock is a small town in Fife near Dunfermline" \
     -d gazetteer=geonames http://unlock.edina.ac.uk/text/places

Alternatively, in form-emulation mode:

curl -F "document=@filename.txt" -F "type=plain" -F "gazetteer=geonames" -F "format=json" http://unlock.edina.ac.uk/text/places

Parameter | Value | Type | Description
document | Text contents to be geoparsed. | String | This may be plain text, HTML or XML, as specified by the "type" parameter.
type | One of ['plain','html','xml'] | String |
format | One of ['json','kml','basic'] | String | The output format for the set of results.
gazetteer | Either 'geonames' or 'os' | String | The gazetteer to be used to resolve the placename locations. If requesting OS data, you must also specify an API key.
key | A registered API key. The presented key is mapped to the IP address of the request. | String | Unlock Web Services authentication. Only needed for OS data.
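
The parameters above can also be assembled programmatically. The following Python sketch builds a form-encoded POST request with the standard library and checks the values against the table before anything is sent. The function name build_request and its argument names are illustrative assumptions. The response is not parsed here, since its structure is not described on this page.

```python
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "http://unlock.edina.ac.uk/text/places"
TYPES = {"plain", "html", "xml"}
FORMATS = {"json", "kml", "basic"}
GAZETTEERS = {"geonames", "os"}


def build_request(document, doc_type="plain", gazetteer="geonames",
                  fmt=None, key=None):
    """Return a urllib Request carrying the parameters listed in the table.

    Hypothetical helper: it validates the values locally and leaves
    sending the request to the caller.
    """
    if doc_type not in TYPES:
        raise ValueError(f"type must be one of {sorted(TYPES)}")
    if gazetteer not in GAZETTEERS:
        raise ValueError(f"gazetteer must be one of {sorted(GAZETTEERS)}")
    if gazetteer == "os" and not key:
        raise ValueError("an API key is required when requesting OS data")

    params = {"document": document, "type": doc_type, "gazetteer": gazetteer}
    if fmt is not None:
        if fmt not in FORMATS:
            raise ValueError(f"format must be one of {sorted(FORMATS)}")
        params["format"] = fmt
    if key:
        params["key"] = key

    # Form-encode the body, as curl -d does.
    body = urlencode(params).encode("utf-8")
    return Request(ENDPOINT, data=body, method="POST")
```

To send the request, pass the returned object to urllib.request.urlopen and read the response body. Doing so requires network access and a reachable service.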

The core geoparser software is a collaboration between EDINA and the Language Technology Group (LTG) at the School of Informatics, University of Edinburgh. LTG have worked with us on enhancing the geoparser for very large bodies of similar text, including the BOPCRIS archive of 19th-century parliamentary proceedings and the HISTPOP archive of historic census and population data.