Informatika i Ee Primeneniya [Informatics and its Applications]

Inform. Primen., 2017 Volume 11, Issue 4, Pages 118–125 (Mi ia509)

This article is cited in 4 papers

Approaches to annotation of discourse relations in linguistic corpora

M. G. Kruzhkov

Institute of Informatics Problems, Federal Research Center “Computer Science and Control” of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation

Abstract: This paper examines the Supracorpora Database of Connectives (SCDB-Connectives), which is based on data from parallel corpora. The SCDB-Connectives provides structural and semantic annotation of Russian connectives and their translation correspondences in French (and, eventually, in other languages). The SCDB-Connectives annotation approach is compared with the latest developments in the annotation of discourse relations: the Penn Discourse Treebank (PDTB), an annotated corpus of discourse relations, and ISO 24617-8, a proposed standard for the annotation of semantic relations. Some of the important differences are discussed. PDTB and ISO 24617-8 support annotation of both explicit and implicit discourse relations, while the SCDB-Connectives annotates only explicit relations, i.e., those expressed by connectives. Furthermore, PDTB and ISO 24617-8 provide a superior framework for annotating text spans as relation arguments, which allows annotating attribution for these arguments, such as source and type of the linked propositions. In addition, ISO 24617-8 specifies argument roles for asymmetrical discourse relations. On the other hand, the principal advantage of the SCDB-Connectives is that it supports annotation of both connectives and their translation correspondences in parallel corpora, opening up new possibilities for contrastive studies. The SCDB-Connectives is based on a relational database rather than on the XML format, which helps to manage complex cross-linguistic data efficiently. Benefits of semantic annotation of connectives for both theoretical and practical purposes are also discussed.

Keywords: discourse relations; discourse connectives; corpus linguistics; parallel corpora; supracorpora databases.

Received: 07.09.2017

Language: English

DOI: 10.14357/19922264170415




