
Invited speakers

Šárka Zikánová (Charles University, Czech Republic): Text Structure and Its Ambiguities: Corpus Annotation as a Helpful Guide

The availability of digital language resources marks an important step forward in linguistic research, in both its theoretical and its applied orientation. The originally collected data prompted their enrichment with increasingly sophisticated annotation systems that cover a wide range of phenomena and add further levels of granularity.
Human data annotation is a process based on the interpretation of observed phenomena. Annotators may disagree in their evaluation of language expressions and structures; this variation may be seen as a negative feature that lowers the quality of the data and that can be resolved by unifying the output. In our talk, we take a different approach and understand variation in annotation as an expression of possible genuine vagueness and ambiguity in language. We concentrate on disagreement in the understanding of textual phenomena such as discourse relations, coreference and information structure. The results of the analysis, i.e. the identification of typically ambiguous points in language structure, can serve as a basis for psycholinguistic experiments and, for example, for later recommendations on how to decrease ambiguity and increase the intelligibility of administrative texts, schoolbooks, legal texts, etc.

Šárka Zikánová is a Czech linguist and a research associate at the Institute of Formal and Applied Linguistics (Charles University, Faculty of Mathematics and Physics, Prague, Czech Republic). She has completed research stays abroad at the universities of Leipzig, Krakow, Philadelphia and Edinburgh. Her research interests have evolved from word order in Older Czech (the position of the clitic se, the position of the verb – a monograph in 2009, information structure, Latin influences) to information structure and discourse relations in contemporary Czech, compared with other languages. She is especially interested in the interplay of syntax, information structure, discourse relations and coreference (cf. a collective monograph of the Prague discourse team in 2015). She took part in the development of the Prague Dependency Treebank, the Prague Discourse Treebank, the Czech RST Discourse Treebank and other corpora. Her specific research area is implicit discourse relations (a monograph in 2021). She is currently turning her attention to psycholinguistic studies of discourse and other coherence phenomena.