Hong Kong Bilingual Corpus of Legal & Documentary Texts

Taxonomy :

Hong Kong Parallel Text was developed by the Linguistic Data Consortium (LDC) and contains data from three sub-corpora, namely Hong Kong Hansards Parallel Text, Hong Kong Laws Parallel Text and Hong Kong News Parallel Text.

- Other info -

Language(s) :

English, Chinese

Types : parallel corpus
Domain : government documents
Size : The Chinese data contain approximately 49M words; the English translation contains approximately 59M words in total and 466K unique words.
Developer : Xiaoyi Ma
Availability : Registration required
Update : September 1, 2004