NCHLT Tshivenḓa RoBERTa language model
License agreement
By downloading this resource I accept and agree to the terms of use and the associated license conditions under which the resource is distributed.
Download
MD5: 4b563aee86877224ff9654fc4273ff0e
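The MD5 checksum above can be used to confirm that the download completed without corruption. The following is a minimal sketch using only the Python standard library; the function names and the file name in the usage note are hypothetical and not part of the resource.

```python
import hashlib
from pathlib import Path

# Checksum as listed on this record.
EXPECTED_MD5 = "4b563aee86877224ff9654fc4273ff0e"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with Path(path).open("rb") as handle:
        # Read fixed-size chunks so large archives do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest equals the expected checksum."""
    return md5_of_file(path) == expected.lower()
```

For example, `matches_expected("downloaded_model.zip")` would return `True` only if the file saved under that (assumed) name has the listed checksum.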
Collections
- Resource Catalogue [335]
Author(s)
Roald Eiselen
Description
Contextual masked language model based on the RoBERTa architecture (Liu et al., 2019). The model is trained as a masked language model and has not been fine-tuned for any downstream task. It can be used either as a masked LM or as an embedding model that provides real-valued vector representations of words or string sequences for Tshivenḓa text.
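As an illustration of the embedding use case, the sketch below derives one vector per input string by averaging the model's token vectors. It rests on assumptions this record does not confirm: that the downloaded files are in a format the Hugging Face `transformers` library can load, and that they were extracted to a local directory (the `model_dir` argument is hypothetical). Mean pooling is one common choice for sequence embeddings, not necessarily the one the author intended.

```python
def mean_pool(token_vectors, attention_mask):
    """Average the token vectors whose attention-mask entry is non-zero."""
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    if not kept:
        raise ValueError("attention mask selects no tokens")
    # zip(*kept) iterates over the vector dimensions (columns).
    return [sum(column) / len(kept) for column in zip(*kept)]


def embed(text, model_dir):
    """Return a mean-pooled sentence vector for `text`.

    Assumption: `model_dir` is a local directory holding the model in a
    transformers-compatible format; the record does not state the format.
    """
    # Imported lazily so mean_pool stays usable without these packages.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = AutoModel.from_pretrained(model_dir)
    model.eval()

    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # last_hidden_state has shape (batch, tokens, hidden); take the single item.
    token_vectors = outputs.last_hidden_state[0].tolist()
    attention_mask = inputs["attention_mask"][0].tolist()
    return mean_pool(token_vectors, attention_mask)
```

Under the same assumptions, the masked LM use case could be approached with the library's `fill-mask` pipeline pointed at the same directory.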
Contact person
Roald Eiselen
Contact person's e-mail address
Roald.Eiselen@nwu.ac.za
Publisher(s)
North-West University; Centre for Text Technology (CTexT)