Sistemy i Sredstva Informatiki [Systems and Means of Informatics], 2015, Volume 25, Issue 2, Pages 140–159 (Mi ssi412)

This article is cited in 15 papers

Information resources for contrastive studies: electronic text corpora

M. G. Kruzhkov

Institute of Informatics Problems, Federal Research Center "Computer Science and Control", Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation

Abstract: This article presents the information resources used in contrastive linguistic studies and their principal features. There are two main types of such resources: typological databases and electronic text corpora. This paper focuses on the latter. Two types of corpora are particularly relevant for contrastive studies: comparable corpora, which are balanced collections of original texts in the languages compared, and parallel (translation) corpora, which are collections of original texts in one of the compared languages aligned with their translations into the other compared language(s). In addition to describing the existing information resources for contrastive linguistic studies, this paper introduces a new type of resource, termed here corpus extension databases. The article outlines the features of such databases in comparison with electronic corpora and justifies the need to create them.

Keywords: linguistic studies; databases; typological databases; comparable corpora; parallel corpora; corpus extension databases.

Received: 05.03.2015

DOI: 10.14357/08696527150209


