Downloads


Compound Noun Compositionality Dataset

Compositionality Dataset described in Reddy, McCarthy and Manandhar (2011, IJCNLP).
Alternate download link from Diana McCarthy


POS Taggers, Corpora, Lemmatizers, Morph Analyzers for Indian Languages

Most of these tools were developed using the methods described in Reddy and Sharoff (2011, CLIA @ IJCNLP). Some of the taggers are built using cross-lingual resources and some using monolingual resources. Please read the README of each tool for additional information.

This work is supported by Sketch Engine and the Intellitext project.

If you need resources for any other Indian languages, please contact me.


Kannada Tools

Download v2.0
Sample Output of the tagger
For the complete corpus described in the paper, please contact me.

Alternate download link from Serge Sharoff


Telugu Tools

Download v2.0
Sample Output of the tagger


Hindi Tools

Download v2.0
Sample Output of the tagger


Indonesian and Malaysian Morphological Analyzer, Part-of-Speech (POS) Tagger, and Machine Translation System

With support from Sketch Engine, I have made a few contributions to the Apertium Indonesian-Malaysian language pair. All the tools can be downloaded from the SVN repository https://apertium.svn.sourceforge.net/svnroot/apertium/incubator/apertium...

To download, use the command:

svn co https://apertium.svn.sourceforge.net/svnroot/apertium/incubator/apertium-id-ms/


Comments

Admin

I would like to hear from you. Users are welcome to add comments on the tools, provide suggestions, and report bugs.

Siva

Hi, do you have any resources for the Tulu language?
Thank you.

Tulu POS Tagger

You may try the Kannada resources for Tulu. To collect a Tulu corpus, you can try BootCaT: http://bootcat.sslmit.unibo.it/
