Geoparsing Maps the Future of Text Documents

As part of a test of what can be built in 24 hours (see Udfordringen 2009), I met this past summer with Søren Johannesen, Microformats, and Sven Boesen, Informi GIS. Over 2 days, working during daytime hours, we arrived at a solution that in form is remarkably similar to what Douglas talks about here ...

During a 2-day code marathon, two other guys and I designed and built a concept not unlike the one Douglas describes here. The neat thing about this is that we only used online services in a mashup approach ...


By Douglas Caldwell

Geoparsing offers the promise of modern geospatial alchemy... the ability to turn text documents into geospatial databases. This "magic" is done in two steps: 1) entity extraction and 2) disambiguation, which is also known as grounding or geotagging. Geospatial entity extraction uses natural language processing to identify place names in text, while disambiguation associates a name with its correct location. The geoparsing results can be inserted into the original document, used to produce a new document, or formatted for output to a geospatial application.
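
To make the two steps concrete, here is a minimal Python sketch. It is not how any of the products mentioned in this article work: the extraction step is a plain gazetteer lookup rather than real natural language processing, and the disambiguation rule (prefer a candidate whose hint word appears in the text, otherwise take the most populous one) is an assumption chosen for illustration. The gazetteer, the `Place` type, and all function names are hypothetical, and the coordinates and populations are rough illustrative values.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Place:
    name: str
    country: str
    lat: float
    lon: float
    population: int
    hints: tuple  # lowercase context words that suggest this candidate


# Hypothetical toy gazetteer; values are approximate and for illustration only.
GAZETTEER = {
    "paris": [
        Place("Paris", "France", 48.86, 2.35, 2_100_000, ("france",)),
        Place("Paris", "United States", 33.66, -95.56, 25_000, ("texas",)),
    ],
    "copenhagen": [
        Place("Copenhagen", "Denmark", 55.68, 12.57, 650_000, ("denmark",)),
    ],
}


def extract_place_names(text):
    """Step 1, entity extraction: return (name, offset) for each word found in the gazetteer."""
    hits = []
    for match in re.finditer(r"[A-Za-z]+", text):
        if match.group().lower() in GAZETTEER:
            hits.append((match.group(), match.start()))
    return hits


def disambiguate(name, text):
    """Step 2, disambiguation: choose one candidate location for an extracted name."""
    candidates = GAZETTEER[name.lower()]
    lowered = text.lower()
    # Assumed rule: a hint word in the surrounding text wins.
    for candidate in candidates:
        if any(hint in lowered for hint in candidate.hints):
            return candidate
    # Assumed fallback: the most populous candidate.
    return max(candidates, key=lambda c: c.population)


def geoparse(text):
    """Run both steps and return (name, offset, Place) tuples."""
    return [
        (name, offset, disambiguate(name, text))
        for name, offset in extract_place_names(text)
    ]


if __name__ == "__main__":
    for name, offset, place in geoparse("We flew from Copenhagen to Paris, Texas."):
        print(name, offset, place.country, place.lat, place.lon)
```

The returned tuples carry the character offset of each name, so the results could be written back into the original document or exported as coordinates for a geospatial application.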

Geoparsing is most frequently used to automatically analyze collections of text documents. There are a number of commercial products with a geoparsing capability. Companies like MetaCarta extract information about place and time, while others like Digital Reasoning (GeoLocator), Lockheed Martin (AeroText), and SRA (NetOwl) extract places along with other entities, such as persons, organizations, time, money, etc. To process the large volumes of data, these systems rely on automated techniques optimized for speed. [...]

