"Culture & Technology" European Summer University in Digital Humanities
University of Leipzig

Historical Text Corpora for the Humanities and Social Sciences. Digitization, Annotation, Quality Assurance and Analysis

The guiding questions of this one-week course (16 teaching hours) are:

  1. How can historical texts be digitized, annotated and registered in a sustainable, interoperable format?
  2. How can these textual resources be integrated into the infrastructure of CLARIN(-D)?
  3. How can these textual resources be further processed using CLARIN(-D) tools?

Abstract
This course gives an overview of methods for creating standards-compliant, interoperable resources of historical texts. We will show how new textual resources can be built, and existing ones further processed, to meet the requirements of CLARIN. In this context, we present several possible workflows for integrating textual resources into the CLARIN-D infrastructure, using the Deutsches Textarchiv (DTA) project as an example. Problem areas discussed include:

  • the acquisition and provision of high quality image sources,
  • guidelines for transcriptions true to the source material,
  • text structuring and metadata recording according to the TEI-P5 based DTA ›Base Format‹ (DTABf), the best practice format for the annotation of historical written corpora in
    CLARIN-D,
  • quality assurance within the digitization process.
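
As a rough illustration of what structured text and recorded metadata look like in a TEI-P5 document, the following minimal sketch parses a small TEI fragment and extracts the title, the page breaks and the paragraph texts. The sample document, its title and the helper name `summarize` are invented for illustration; the fragment uses only generic TEI-P5 elements and is not a valid DTABf document, which imposes further requirements described in the DTABf documentation.

```python
import xml.etree.ElementTree as ET

# Official TEI namespace; the prefix "tei" is chosen only for the queries below.
NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical sample: generic TEI-P5 elements only, not a real DTA text.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Sample title (hypothetical)</title></titleStmt>
      <publicationStmt><p>Illustrative sample only.</p></publicationStmt>
      <sourceDesc><p>Invented source, not a DTA document.</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <pb n="1"/>
      <p>Text on the first page.</p>
      <pb n="2"/>
      <p>Text on the second page.</p>
    </body>
  </text>
</TEI>"""


def summarize(tei_xml: str) -> dict:
    """Return title, page-break numbers and paragraph texts of a TEI string."""
    root = ET.fromstring(tei_xml)

    # Metadata lives in the header: teiHeader/fileDesc/titleStmt/title.
    title = root.findtext(
        "tei:teiHeader/tei:fileDesc/tei:titleStmt/tei:title", namespaces=NS
    )

    # Restrict the remaining queries to <text>, so header paragraphs are skipped.
    text = root.find("tei:text", NS)

    # <pb n="..."/> marks a page break; collect the page numbers in order.
    pages = [pb.get("n") for pb in text.iterfind(".//tei:pb", NS)]

    # Join the text of each paragraph and collapse runs of whitespace.
    paragraphs = [
        " ".join("".join(p.itertext()).split())
        for p in text.iterfind(".//tei:p", NS)
    ]
    return {"title": title, "pages": pages, "paragraphs": paragraphs}


if __name__ == "__main__":
    print(summarize(SAMPLE))
```

Because the transcription and the metadata sit in predictable places in the document tree, the same few queries would work across any collection of texts that follows one shared structure, which is the practical benefit of a common base format.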

We will introduce several tools and services provided by the DTA and by CLARIN that support the different tasks involved in building CLARIN-conformant resources. Finally, we will demonstrate how resources that are structured homogeneously according to the DTABf can be linguistically analyzed with automatic methods and automatically converted into other standardized formats, how they can be made available and preserved in the long term within the CLARIN infrastructure, and how they can be analyzed using CLARIN and DTA tools.

DTAE workflow (www.deutschestextarchiv.de/dtae)

Further Information:

  • Deutsches Textarchiv DTA: www.deutschestextarchiv.de
  • DTA-Extensions DTAE: www.deutschestextarchiv.de/dtae
  • DTA ›Base Format‹ DTABf: www.deutschestextarchiv.de/doku/basisformat
  • DTA Quality Assurance Platform DTAQ: www.deutschestextarchiv.de/dtaq
  • Cascaded Analysis Broker CAB: www.deutschestextarchiv.de/doku/software#cab
    (tool for the automatic modernization of historical orthography)
  • CLARIN-D Service Center at the Zentrum Sprache of the BBAW: http://clarin.bbaw.de/
  • CLARIN-D WP 5: Services and Resources:
    http://clarin-d.de/en/home-en/work-packages/wp-5-services-and-resources.html
  • CLARIN-D Curation Project 1 WG 1: http://www.deutschestextarchiv.de/clarin_kupro