XML-TEI document encoding, structuring, rendering and transformation
The vast majority of humanities research data is marked up and stored as XML-TEI documents, and TEI (Text Encoding Initiative) has become the de facto standard within the digital humanities.
This workshop introduces attendees to the practical aspects of encoding XML documents marked up according to the TEI Guidelines, and then to making use of these documents by applying other technologies such as XPath, CSS, XSLT, and XQuery.
This workshop consists of a mix of talks and hands-on sessions, and the teaching approach is problem-based learning by means of exercises of increasing complexity.
It will be divided into two parts, lasting a week each, which can be taken independently.
- Week 1 provides a gentle introduction to the production of digital text documents using XML and the TEI encoding scheme. It also covers designing and validating document structures. This is an introductory-level course that combines theory and practice.
- Week 2 moves beyond encoding and introduces XPath, CSS, XSLT and XQuery. It covers XML-TEI document rendering and transformation: producing readable renderings by means of CSS stylesheets and XSLT transformations, and selecting and querying specific data using XPath and XQuery. This is a hands-on, intermediate-level course.
Since its creation, TEI development has been driven by the needs of a large user community. The introduction to the features of TEI during the workshop will give participants a starting point to develop their own TEI-based projects.
Syllabus of week 1: Introduction to XML-TEI text encoding
INTRODUCTION:
- Introduction
- Ways of representing documents electronically
- Markup basics (SGML/XML)
DOCUMENT ENCODING:
- Text markup using XML and TEI
- High-level tags
- Low-level tags
- Markup of different literary styles
- Special-purpose tagsets
DOCUMENT STRUCTURING:
- Controlling document structure (DTDs and Schemas)
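To give a flavour of the week 1 topics, the following is a minimal sketch of a TEI-encoded document and a basic structural check, written in Python with only the standard library. The sample text and the helper name `check_structure` are illustrative assumptions, not workshop material. The check only tests well-formedness and the presence of `teiHeader` and `text`; it is not a substitute for validation against a DTD or schema.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"  # the TEI namespace URI

# A small, invented TEI document: a header plus one stanza of verse.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Sample poem</title></titleStmt>
      <publicationStmt><p>Unpublished teaching example.</p></publicationStmt>
      <sourceDesc><p>Born digital.</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <lg type="stanza">
        <l n="1">First line of verse</l>
        <l n="2">Second line of verse</l>
      </lg>
    </body>
  </text>
</TEI>"""


def check_structure(xml_text):
    """Return a list of problems; an empty list means the basic checks pass.

    Hypothetical helper: it checks well-formedness and two required
    children of the root, nothing more.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [f"not well-formed: {exc}"]

    problems = []
    if root.tag != f"{{{TEI_NS}}}TEI":
        problems.append("root element is not TEI in the TEI namespace")
    for child in ("teiHeader", "text"):
        if root.find(f"{{{TEI_NS}}}{child}") is None:
            problems.append(f"missing <{child}>")
    return problems


problems = check_structure(SAMPLE)
print(problems or "basic structure looks fine")
```

Full validation of the kind covered under "Controlling document structure" would normally be done with a schema-aware validator rather than hand-written checks like these.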
Syllabus of week 2: XML-TEI document rendering and transformation
INTRODUCTION:
- The XML family of technologies
- Namespaces
DOCUMENT RENDERING:
- CSS stylesheets applied to XML and HTML documents
DOCUMENT TRANSFORMATION:
- Selecting nodes with XPath
- XSLT transformations
- Visual modelling of document structures
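As an illustration of the week 2 topics, the sketch below selects nodes from a small TEI fragment with namespace-aware path expressions and turns the result into an HTML list. It uses Python's standard library, which supports only a limited subset of XPath, and the Python string building merely stands in for what an XSLT stylesheet would do. The sample document and variable names are invented for this example.

```python
import xml.etree.ElementTree as ET
from html import escape

# Prefix-to-URI mapping so that path expressions can use the "tei:" prefix.
NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Invented sample: one stanza and one prose paragraph.
DOC = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <lg type="stanza">
      <l n="1">First line</l>
      <l n="2">Second line</l>
    </lg>
    <p>A prose paragraph.</p>
  </body></text>
</TEI>"""

root = ET.fromstring(DOC)

# Select every verse line inside a stanza, at any depth below the root.
lines = [l.text for l in root.findall(".//tei:lg[@type='stanza']/tei:l", NS)]

# Select a single line by the value of its n attribute.
second = root.find(".//tei:l[@n='2']", NS).text

# A very small "transformation": render the selected lines as an HTML list.
html = "<ol>" + "".join(f"<li>{escape(t)}</li>" for t in lines) + "</ol>"

print(lines)
print(second)
print(html)
```

Without the namespace mapping, the same expressions would match nothing, which is why namespaces appear in the syllabus before selection and transformation.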
OTHER ISSUES (if time allows)