Open American National Corpus (OANC)

Taxonomy :

The Open American National Corpus (OANC) is a massive electronic collection of American English, including texts of all genres and transcripts of spoken data produced from 1990 onward. All data and annotations are fully open and unrestricted for any use.


- Other info -

Language(s) :

American English

Types : monolingual corpus
Domain : court transcript, debate transcript, email, blog, joke, essay, fiction, non-fiction, government document, face-to-face conversation transcript, telephone conversation transcript, letter, newspaper, journal, movie script, travel guide, Twitter
Size : 15 million words
Developer : Vassar College (PI: Nancy Ide, ANC), Columbia University (Rebecca Passonneau), International Computer Science Institute (ICSI) (Collin Baker, FrameNet), Princeton University (Christiane Fellbaum, WordNet)
Availability : Free
Update : Project started in 2008; contributions are still being accepted