ACCURAT corpus of Wikipedia texts
Taxonomy : Corpus
The corpus contains comparable texts from Wikipedia for 12 language pairs.
- Other info -
Language(s) :
English - Croatian
English - Greek
English - Estonian
English - Latvian
English - Lithuanian
English - Romanian
English - Slovenian
Greek - Romanian
Latvian - Lithuanian
Romanian - German
Romanian - Lithuanian
German - English
Types : comparable corpus
Domain : Wikipedia
Size : The two languages in each pair have the same number of texts.
English - Croatian (22,137 texts)
English - Greek (4,230 texts)
English - Estonian (20,621 texts)
English - Latvian (6,455 texts)
English - Lithuanian (13,906 texts)
English - Romanian (58,622 texts)
English - Slovenian (28,004 texts)
Greek - Romanian (841 texts)
Latvian - Lithuanian (1,541 texts)
Romanian - German (16,246 texts)
Romanian - Lithuanian (2,209 texts)
German - English (149,891 texts)
Developer : Tilde
Availability : Free
Update : 06/30/2012