From Archival Documents to Dots in the Map

Mila Oiva

Slides of the presentation.

The workshop teaches how to turn photographed archival documents into computer-readable text and gives examples of how that text can be used in historical research. The first step uses tools such as ABBYY FineReader for Optical Character Recognition (OCR), which converts the photographed documents into computer-readable text. The second step is a semi-automated search for place names using the Named Entity Recognition (NER) function of the Recogito web service. Recogito then displays the recognized places on a map. The workshop relies on easy-to-use tools so that attendees can start experimenting with their own sources afterwards. It also offers a preliminary understanding of the many ways in which data preprocessing affects the outcomes of computer-assisted analysis. (The workshop can be held in Finnish or English.)

Before the workshop, please:

  1. Download the test data here.
  2. Register for the web tool Recogito: http://recogito.pelagios.org/

 

We will also look at ABBYY FineReader and Palladio, but you don't need to download them or register for them beforehand.
