The Library of a Billion Words
While the traditional humanities find their source material in books, the digital humanities depend on texts in database-ready formats. Printed texts are easily available (in libraries etc.), whereas digital versions still have to be produced, and there are few standards for producing them. If university libraries want to support the digital humanities, they have to provide full-text versions of works old and new. In May 2013, Leipzig University Library (LUL), together with the department of information science, started an ESF-funded project, 'The Library of a Billion Words', to experimentally produce database text material for various uses. The project will conceptualize and implement infrastructure components to create, organize and provide full-text data and metadata from heterogeneous sources. The LUL and its project partners ASV (Arbeitsgruppe Automatische Sprachverarbeitung) and BSV (Arbeitsgruppe Bild- und Signalverarbeitung) aim to enable researchers in the digital humanities to work with growing text corpora on different levels, e.g. for producing editions. The presentation of the project will highlight the technologies under investigation and outline its most important milestones.