Digital Corpus of the European Parliament (DCEP)
Taxonomy : Corpus
The Digital Corpus of the European Parliament (DCEP) contains the majority of the documents published on the European Parliament’s official website. It comprises a variety of document types, from press releases to session and legislative documents related to the European Parliament’s activities and bodies. The current version of the corpus contains documents produced between 2001 and 2012.
- Other info -
Types : multilingual corpus
Domain : European Parliament's documents
Size : Total number of documents: 1.5 million; total number of words: 1.37 billion; total number of English segments: 7.7 million. The best-represented language by word count is English (103,458,996 words); French and Spanish each fall short of that figure by less than 10%.
Developer : Machine Translation team of the European Parliament's Directorate-General for Translation (DGTRAD)
Availability : Free