Jacob Fenton developed What Word Where?, a tool that uses hOCR data to create a near-literal document topography, that is, GIS data. We need to modularize it so that it can stand alone, preferably as a command-line tool that takes a PDF and generates GIS data.
More details are available in Jacob's repository.
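
To make the hOCR-to-GIS idea concrete, here is a minimal sketch of the core conversion step. It is not taken from Jacob's code. It assumes the PDF has already been run through an OCR engine that emits hOCR, and that each word is an element with class `ocrx_word` whose `title` attribute carries a `bbox x0 y0 x1 y1` entry, as the hOCR format describes. The names `HocrWordParser` and `hocr_to_geojson` are hypothetical, and GeoJSON is only one possible choice of GIS output format.

```python
import json
import re
from html.parser import HTMLParser

# hOCR stores a bounding box in the title attribute, e.g. "bbox 10 20 50 40; x_wconf 96".
BBOX_RE = re.compile(r"bbox (\d+) (\d+) (\d+) (\d+)")


class HocrWordParser(HTMLParser):
    """Collect (text, bbox) pairs from elements with class 'ocrx_word'.

    Hypothetical helper for illustration. It assumes word elements are not
    nested inside each other.
    """

    def __init__(self):
        super().__init__()
        self.words = []    # list of (text, (x0, y0, x1, y1))
        self._bbox = None  # bbox of the word currently being read, if any
        self._tag = None   # tag name of the word element being read
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if self._bbox is None and "ocrx_word" in classes:
            match = BBOX_RE.search(attrs.get("title") or "")
            if match:
                self._bbox = tuple(int(v) for v in match.groups())
                self._tag = tag
                self._text = []

    def handle_data(self, data):
        if self._bbox is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._bbox is not None and tag == self._tag:
            self.words.append(("".join(self._text).strip(), self._bbox))
            self._bbox = None
            self._tag = None


def hocr_to_geojson(hocr_text):
    """Turn hOCR word boxes into a GeoJSON FeatureCollection of polygons.

    Coordinates are kept as planar page pixels (origin top-left, y pointing
    down); they are not longitude/latitude.
    """
    parser = HocrWordParser()
    parser.feed(hocr_text)
    parser.close()
    features = []
    for text, (x0, y0, x1, y1) in parser.words:
        # A polygon ring must be closed: the first and last points are equal.
        ring = [[x0, y0], [x1, y0], [x1, y1], [x0, y1], [x0, y0]]
        features.append({
            "type": "Feature",
            "geometry": {"type": "Polygon", "coordinates": [ring]},
            "properties": {"text": text},
        })
    return {"type": "FeatureCollection", "features": features}


if __name__ == "__main__":
    # Made-up sample input, for illustration only.
    sample = (
        "<div class='ocr_page' title='bbox 0 0 200 100'>"
        "<span class='ocrx_word' title='bbox 10 20 50 40; x_wconf 96'>Hello</span> "
        "<span class='ocrx_word' title='bbox 60 20 110 40; x_wconf 93'>world</span>"
        "</div>"
    )
    print(json.dumps(hocr_to_geojson(sample), indent=2))
```

A standalone command-line tool could wrap a function like this behind a PDF-to-hOCR step. Whether to flip the y axis or scale pixels to another coordinate system depends on the GIS software that will consume the output, so that choice is left open here.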