Resource Index: Recent submissions
Now showing items 171-180 of 386
African Wordnet: Tshivenda 1.0
(UNISA, 2017-06-20) Developed using the expand model with Princeton WordNet 2.0 as basis. Each wordnet contains synsets with at least the following fields: Word form (lemma; ...
African Wordnet: Sesotho sa Leboa 1.0
(UNISA, 2017-06-20) Developed using the expand model with Princeton WordNet 2.0 as basis. Each wordnet contains synsets with at least the following fields: Word form (lemma; ...
NCHLT Optical Character Recognition for South African Languages
(North-West University; Centre for Text Technology (CTexT), 2017-02-23) An OCR system is an application that enables one to convert scanned paper documents into editable and searchable texts. The engine analyses the structure ...
Lwazi Setswana TTS corpus
(Meraka Institute, CSIR, 2013-03-27) Orthographic and phonemically aligned transcriptions
Autshumato English-Setswana Parallel Corpora
(North-West University; Centre for Text Technology (CTexT), 2016-10-28) Aligned English-Setswana parallel corpus. This set contains data that was translated by professional translators, data that was sourced as translated ...
NCHLT English Text Corpora
(North-West University; Centre for Text Technology (CTexT), 2016-09-09) Collection consisting of a clean corpus, lexicon, frequency list and named-entity lists developed during the NCHLT Text project.
Bukantswe Sesotho-English Bilingual Dictionary
(North-West University, 2016-07-07) Bilingual English-Sesotho dictionary. This dataset represents a basic Sesotho dictionary compiled in the creation of a Sesotho language resource. The ...
Autshumato Setswana Monolingual Corpora
(North-West University; Centre for Text Technology (CTexT), 2016-10-28) Setswana monolingual corpus as a deliverable of the Autshumato project. The data is given as a UTF-8 text file, with each sentence on a new line.
NCHLT South African Language Identifier
(North-West University; Centre for Text Technology (CTexT), 2016-04-29) A graphical user interface and command line tool to automatically classify a document, paragraph, sentence or phrase as one of the eleven official South ...
NCHLT Sepedi Phrase Chunk Annotated Corpus
(North-West University; Centre for Text Technology (CTexT), 2016-04-29) Phrase chunk annotated data for the NCHLT Text Resource Development: Phase II Project. The phrase chunk annotated data is a subset of the 50,000 tokens ...