
"Culture & Technology" European Summer University in Digital Humanities
University of Leipzig

Digital Annotation and Analysis of Literary Texts with CATMA 6.0

This workshop introduces students of literature to CATMA 6.0 (Computer Assisted Text Markup and Analysis; www.catma.de), an open source tool that has been developed at and hosted by the University of Hamburg since 2008. CATMA is currently used by over 60 research projects worldwide. Version 6.0 forms part of the DFG-funded forTEXT project (www.fortext.net) and offers a unique combination of three main features:

  • CATMA supports collaborative annotation and analysis – a text or text corpus can be investigated individually, but also jointly by a group of students or researchers.
  • CATMA supports explorative, non-deterministic practices of text annotation – a discursive, debate-oriented approach to text annotation based on the research practices of hermeneutic disciplines is the underlying conceptual model.
  • CATMA integrates text annotation and text analysis in a web-based working environment – which makes it possible to combine the identification of textual phenomena with their investigation in a seamless, iterative fashion.

What sets CATMA apart from other digital annotation methods is its ‘undogmatic’ approach: the system neither prescribes defined annotation schemata or rules nor forces the user to apply rigid yes/no, right/wrong taxonomies to texts (although it allows for more prescriptive schemata as well). In other words, CATMA’s logic invites users to explore the richness and many facets of textual phenomena according to their needs: users can create, expand, and continuously modify their own individual tagsets. If a text passage invites more than one interpretation, nothing in the system prevents users from assigning multiple, or even contradictory, annotations.
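
The idea that a single passage can carry several, even contradictory, annotations can be illustrated with a minimal stand-off annotation sketch. This is not CATMA's internal data model: the `Annotation` class, the `annotations_for` helper, the tag names, and the annotator names are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """A hypothetical stand-off annotation: it points at the text by offsets."""
    start: int      # inclusive character offset into the text
    end: int        # exclusive character offset into the text
    tag: str        # tag name taken from a user-defined tagset
    annotator: str  # who assigned the tag


def annotations_for(annotations, start, end):
    """Return all annotations that overlap the character range [start, end)."""
    return [a for a in annotations if a.start < end and start < a.end]


text = "Call me Ishmael."

# Two readers disagree about the same passage; both readings are stored.
annotations = [
    Annotation(0, 15, "reliable_narrator", "reader_a"),
    Annotation(0, 15, "unreliable_narrator", "reader_b"),
    Annotation(8, 15, "character_name", "reader_a"),
]

# Collect every tag that touches the span covering "Ishmael".
overlapping = annotations_for(annotations, 8, 15)
tags = sorted(a.tag for a in overlapping)
```

Because the annotations refer to the text by offsets rather than being embedded in it, overlapping and mutually exclusive tags can coexist without any conflict in the stored data.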

Despite its flexibility, CATMA does not produce idiosyncratic annotations: all markup data can be exported in TEI/XML format and reused in other contexts. Because CATMA is highly intuitive, it is particularly suitable for humanists with little technical knowledge: the GUI allows for a quick start, and CATMA’s query builder (a step-by-step, dialogue-based widget) helps users retrieve complex information from texts without having to learn a query language. Moreover, CATMA’s easy-to-use automated distant-reading functions are continuously enhanced and extended: the current version 6.0 already features a number of automated annotation routines, among them the identification of basic narrative features in texts.

In our workshop we will introduce the core annotation and analysis functionalities of CATMA and show how they can be combined with the annotations provided automatically.

In week 1, participants will be guided step by step, in hands-on sessions, through the full cycle of a CATMA-based text investigation and can work on their own texts or projects:

  • From text upload to initial text investigations,
  • then to annotation and specification of annotation categories,
  • from there to combined queries that consult the source text together with its annotations,
  • and finally to the visual output of query results.
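
To make the notion of a combined query more concrete, here is a minimal sketch that searches for a word only inside passages carrying a given tag. It does not use CATMA's query builder or query language: the `combined_query` function, the tuple layout of the annotations, and the tag name `character_speech` are assumptions made for this example.

```python
import re


def combined_query(text, annotations, word, tag):
    """Find occurrences of `word` that lie fully inside a span tagged `tag`.

    `annotations` is a list of (start, end, tag) tuples with an inclusive
    start offset and an exclusive end offset (a hypothetical layout).
    """
    hits = []
    pattern = r"\b" + re.escape(word) + r"\b"
    for match in re.finditer(pattern, text):
        for start, end, ann_tag in annotations:
            inside = start <= match.start() and match.end() <= end
            if ann_tag == tag and inside:
                hits.append((match.start(), match.end()))
                break  # count each occurrence once, even if spans overlap
    return hits


text = "The sea was calm. She said the sea was angry."

# Only the second sentence is tagged as character speech.
annotations = [(18, 45, "character_speech")]

hits = combined_query(text, annotations, "sea", "character_speech")
```

Here the first "sea" is ignored because it falls outside the tagged span, so the query result depends on both the text and its annotations.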

Participants will be able to test and apply the tool hands-on: they will annotate their own texts, create their own tagsets, and define their tags in an annotation guideline. We would also like to engage participants in a critique of CATMA’s design and components as well as in a general discussion about requirements for text analysis tools in their fields of interest.

In week 2, we will combine the work in CATMA with other methods and tools for digital text analysis, such as NER, (S)NA, and SA [1], in two steps. First, we will visually investigate and refine the annotations created in week 1. Second, we will focus on the application of CATMA to the individual projects of the workshop participants: what does the CATMA-based annotation and analysis of texts, along with the creation of participants' own tagsets, contribute to these projects? Each participant will give a short presentation on their project, followed by a group discussion.

Target audience of the workshop

The primary users of CATMA are literary scholars, as well as graduate and undergraduate students in Literary Studies. In addition, this workshop is likely to be of interest to

  • humanities scholars in ALL fields concerned with text analysis (with and without experience in digital text analysis),
  • software developers in the humanities interested in non-deterministic text analysis and automated annotation.

Participants need no prior knowledge of digital text annotation and can work with their own laptop computers and their own digital texts. CATMA runs on any laptop or PC (Windows, Unix, or macOS) with a current web browser (Internet Explorer or Edge, Firefox, Chrome, Safari) and a mouse or touchpad. Touchscreen navigation is not yet supported but is planned.

––––––––––––––––––––––––

[1] Named Entity Recognition (NER), (Social) Network Analysis ((S)NA), and Sentiment Analysis (SA).
