Collocations from a multilingual perspective: theory, tools, and applications
Collocations such as “French window”, “heavy smoker”, and “to pass a law” are multiword expressions whose constituents show a high degree of statistical association and whose meaning is not fully transparent: one of the constituents carries a special meaning found only in this combination. Collocations have been studied extensively by lexicographers, lexical semanticists, and computational linguists. With the growing availability of digital resources for language studies, collocations can be extracted automatically from large digital text collections and then used for a variety of purposes, such as second language acquisition, dictionary making, and machine translation.
In this course we will take a multilingual perspective on collocations, covering at least English, German, and Russian. We will introduce practical tools for extracting collocations, for measuring the association strength between the members of a collocation, and for modeling collocations semantically. Course participants will be encouraged to apply these tools to their native language. Prerequisites for the course: basic knowledge of lexical semantics, some experience with electronic corpora, and basic knowledge of lexical statistics.
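To give a rough idea of what measuring association strength can look like, here is a minimal Python sketch that scores adjacent word pairs by pointwise mutual information (PMI), one common association measure. It is an illustration only and not necessarily the tooling used in the course: the function name `pmi_scores`, the `min_count` threshold, and the toy corpus are all assumptions made for this example.

```python
import math
from collections import Counter


def pmi_scores(tokens, min_count=2):
    """Return PMI (log base 2) for adjacent word pairs seen at least min_count times.

    Hypothetical helper for illustration; probabilities are simple
    relative frequencies with no smoothing.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)

    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue  # very rare pairs give unreliable PMI values
        p_pair = count / n_bi
        p_w1 = unigrams[w1] / n_uni
        p_w2 = unigrams[w2] / n_uni
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    return scores


# Toy corpus invented for this example; a real study would use a large corpus.
toy_corpus = (
    "he is a heavy smoker . she was a heavy smoker . "
    "a heavy bag . the smoker left . the bag is heavy ."
).split()

scores = pmi_scores(toy_corpus)
for pair, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 2))
```

A positive score means the pair occurs together more often than its individual word frequencies would predict. On a corpus this small the numbers carry little meaning, and PMI is known to favor rare pairs, which is why a frequency threshold is applied here.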