Books (OPUS)
Taxonomy : Corpus
This is a collection of copyright-free books aligned by Andras Farkas.
- Other info -
Language(s) :
English
Catalan
Dutch
Esperanto
Finnish
French
German
Greek
Hungarian
Italian
Norwegian
Portuguese
Russian
Spanish
Swedish
Types : multilingual corpus
Domain : copyright-free books
Size : total number of files: 158
total number of tokens: 19,500,000
total number of sentence fragments: 910,000
Developer : Jörg Tiedemann (OPUS)
Availability : Free
Update: 2012