"Culture & Technology" European Summer University in Digital Humanities
University of Leipzig

Distant Reading in R. Analyse the text & visualize the Data

1.0 The Workshop

Distant reading is one of the best-known methodological approaches in the digital humanities and has been in continuous use since Franco Moretti formalised it in the article "Conjectures on World Literature" (2000). Distant reading benefits greatly from computational tools. For this reason, we are proposing a course based on R, one of the most popular programming languages in the scientific community today.

The course is suitable for beginners who want to start digital humanities training with a complete overview of the most common tools used for distant reading.

The philosophy of the course is "analyse the text & visualize the data", and the course is structured around this dichotomy.

The objective of the course is to provide participants with methodological and practical tools that they can use in their own research. By the end of the two weeks, they will be able to use R and RStudio to apply semantic analysis, stylometry, and spatial mapping. Analyses in R produce results that lend themselves to graphical presentation as graphs, trees, or maps. Part of the course is therefore dedicated to open-source programs such as Gephi, GIMP and Inkscape, which are used to rework vector and other graphics files.

2.0 Schedule

The course takes place over two weeks so that participants can choose to attend one or both parts. However, participation in the entire course is strongly advised.

The first week is dedicated to three of the most common methods used for distant reading: sentiment analysis, topic modelling, and stylometry. The objective of this first week is to provide a basic theoretical / methodological understanding of distant reading techniques, together with the practical tools to analyse texts in an R environment.

The second week is dedicated to data visualisation. In this module, participants will focus on mapping, network analysis and graphics. The objective of this week is to give participants the tools to organise the visualisation of data graphically, chronologically, and spatially. Participants who attend only the second week are assumed to have more than a basic knowledge of the R programming language.

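Week 1: Analyse the text

Day 1
  • From the 1st hour: Introduction to the course
  • From the 2nd hour: Introduction to R and RStudio

Day 2
  • From the 1st hour: Sentiment analysis
  • From the 3rd hour: Sentiment analysis

Day 3
  • From the 1st hour: Topic modelling
  • From the 3rd hour: Topic modelling

Day 4
  • From the 1st hour: Stylometry
  • From the 3rd hour: Stylometry

Day 5
  • From the 1st hour: Sentiment analysis; Topic modelling; Stylometry; Hands On
  • From the 3rd hour: Projects

Week 2: Visualize the data

Day 6
  • From the 1st hour: Network analysis (Gephi)
  • From the 3rd hour: Network analysis (Gephi)

Day 7
  • From the 1st hour: Network analysis (Data scraping)
  • From the 3rd hour: Inkscape & Gimp

Day 8
  • From the 1st hour: Named-entity Recognition
  • From the 3rd hour: Mapping (Coordinates)

Day 9
  • From the 1st hour: Mapping
  • From the 3rd hour: Mapping

Day 10
  • From the 1st hour: Mapping
  • From the 3rd hour: Projects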

At the beginning of the course, the workshop leaders will divide the class into two groups according to the participants' research interests. Each group will carry out a piece of research using one of the methodologies introduced during the week and present it on the last day of the workshop.

3.0 Technical Requirements

  • Participants should bring their own computer with at least 5-10 GB of free disk space.
  • Operating system: Windows (preferably 7+), Linux or Mac OS X.
  • Java 8 for your operating system. You may need to create an Oracle account to download Java 8.
  • Zip / unzip programs for managing compressed folders (these are normally installed by default on your computer; examples for Windows are 7-Zip and WinZip).
  • Browser: Mozilla Firefox and Google Chrome.
  • A simple text editor (for txt and csv files), such as Sublime Text 3 for Windows, Linux and Mac.
  • Google account 
  • R version 3.5.1 (2018-07-02) -- "Feather Spray"
  • RStudio and XQuartz (the latter for Mac)
  • OpenOffice
  • Gephi
  • Inkscape
  • GIMP