List of natural language processing toolkits

The following natural language processing (NLP) toolkits are popular collections of natural language processing software. They are suites of libraries, frameworks, and applications for symbolic and statistical natural language and speech processing. NLP tools typically perform sentence detection, tokenization, part-of-speech (POS) tagging, text chunking, lemmatisation, coreference analysis and resolution, and named-entity detection, among other tasks.
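As a rough illustration of the first two tasks named above, sentence detection and tokenization, the following minimal sketch uses only Python's standard `re` module. It is not the API of any toolkit listed here; the function names `split_sentences` and `tokenize` are hypothetical, and the regular expressions are deliberately naive (they would mis-split abbreviations such as "e.g.", which real toolkits handle with trained models or rule sets).

```python
import re


def split_sentences(text):
    """Naive sentence detection: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def tokenize(sentence):
    """Naive tokenization: runs of word characters, or single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", sentence)


# Illustrative input only; any short English text would do.
text = "GATE is written in Java. NLTK is written in Python!"
sentences = split_sentences(text)
tokens = [tokenize(s) for s in sentences]

for sentence, sentence_tokens in zip(sentences, tokens):
    print(sentence, "->", sentence_tokens)
```

Later stages such as POS tagging, chunking, and named-entity detection normally consume the token lists produced by a step like this one.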

| Name | Language | License | Creators | Website |
|---|---|---|---|---|
| AlchemyAPI | C, C++, C#, Java, Python, Perl, Ruby | Free or commercial | Orchestr8 | 1 |
| Cogito | | Commercial | Expert System S.p.A. | 2 |
| Cognitive Computation Group | Java, C++ | Open for researchers | Cognitive Computation Group at the University of Illinois at Urbana-Champaign | 3 |
| Carabao Language Kit | Any COM+ compliant language; customization is via data entry | Commercial with free development tools | Digital Sonata Pty Ltd | 4 |
| Distinguo | C++ | Commercial | Ultralingua Inc. | 5 |
| Ellogon | C, C++ | LGPL | Georgios Petasis | 6 |
| FreeLing | C++ | GPL | Universitat Politècnica de Catalunya | 7 |
| General Architecture for Text Engineering | Java | LGPL | GATE open source community | 8 |
| LingPipe | Java | Royalty free or commercial | Alias-i | 9 |
| LinguaStream | Java | Free for research | University of Caen, France | 10 |
| Mallet | Java | Common Public License | University of Massachusetts Amherst | 11 |
| MII nlp toolkit | Java | LGPL | UCLA Medical Imaging Informatics (MII) Group | 12 |
| Modular Audio Recognition Framework | Java | BSD | The MARF Research and Development Group, Concordia University | 13 |
| MontyLingua | Python, Java | Free for research | MIT | 14 |
| Natural Language Toolkit (NLTK) | Python | Apache 2.0 | | 15 |
| NooJ (based on INTEX) | .NET Framework-based | Free for research | University of Franche-Comté, France | 16 |
| OpenNLP | Java | LGPL | Online community | 17 |
| Rosette | | Commercial | Basis Technology | 18 |
| Stanford NLP | Java | GPL | The Stanford Natural Language Processing Group | 19 |
| UIMA | Java, C++ | Apache 2.0 | Apache | 20 |
| WebLab | Java | EADS open source | OW2 | 21 |
