ACCURAT corpus of comparable sentences

Taxonomy :

The corpus contains pairs of comparable sentences extracted from comparable news corpora for 12 language pairs.


- Other info -

Language(s) :

Romanian - English
Lithuanian - Romanian
Estonian - English
Croatian - English
Greek - English
Latvian - English
German - Romanian
Lithuanian - English
Lithuanian - Latvian
German - English
Slovenian - English
Romanian - Greek

Types : comparable corpus
Domain : News
Size : Both languages in a pair have the same number of sentences.

Romanian - English (23,820 sentences)
Lithuanian - Romanian (9,470 sentences)
Estonian - English (19,048 sentences)
Croatian - English (36,663 sentences)
Greek - English (6,641 sentences)
Latvian - English (112,398 sentences)
German - Romanian (10,227 sentences)
Lithuanian - English (33,219 sentences)
Lithuanian - Latvian (7,163 sentences)
German - English (13,782 sentences)
Slovenian - English (67,508 sentences)
Romanian - Greek (1,783 sentences)

Developer : Tilde
Availability : Free
Update : 06/30/2012