DGT-Acquis

Taxonomy :

The DGT-Acquis is a family of several multilingual parallel corpora extracted from the Official Journal of the European Union (OJ) in Formex 4 (XML) format. It consists of documents from mid-2004 to the end of 2011 in up to 23 languages.


- Other info -

Language(s) :

Bulgarian
Spanish
Czech
Danish
German
Estonian
Greek
English
French
Irish
Croatian
Italian
Latvian
Lithuanian
Hungarian
Maltese
Dutch
Polish
Portuguese
Romanian
Slovak
Slovenian
Finnish
Swedish

Types : multilingual corpus
Domain : Official Journal of the European Union
Size : 3,901,048 files (original data); 4,900,254 segments (paragraph level, in column-file format)
Developer : European Commission - Directorate-General for Translation (DGT)
Availability : Free
Update : 2004-2011