Adventures in Natural Language Processing – No 1. Setup


This is the first in a series of articles on natural language processing. The series is based on Python 3 and the Natural Language Toolkit (NLTK) 3.0, which is available from the NLTK project website.


The current stable version of NLTK is based on Python 2. As we are using Python 3, we need the alpha version of NLTK 3.0, which is available but requires a bit of a manual install process. First we need to install PyYAML 3.10, as NLTK requires a version greater than 3.9 and 3.10 is the latest version at the time of writing.

PyYAML can be installed at the command line using pip:

> sudo pip install pyyaml

We can then check that it is installed with ‘pip list’:

> pip list
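The same check can also be done from inside Python, which confirms that PyYAML is visible to the interpreter you will actually run NLTK with. The sketch below is only an illustration: the helper names installed_version, parse_version and is_newer_than are made up for this article, and the "greater than 3.9" threshold is taken from the requirement mentioned above. Versions are compared as numbers rather than strings, because as plain text "3.10" would sort before "3.9".

```python
import importlib
import re


def installed_version(module_name):
    """Return the module's __version__ string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    # Not every module exposes __version__, so fall back to a marker value.
    return getattr(module, "__version__", "unknown")


def parse_version(text):
    """Turn a dotted version such as '3.10' into a tuple of ints, e.g. (3, 10)."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)  # keep only the leading digits of each piece
        parts.append(int(match.group()) if match else 0)
    return tuple(parts)


def is_newer_than(version, minimum):
    """True if version is strictly greater than minimum, compared numerically."""
    return parse_version(version) > parse_version(minimum)


found = installed_version("yaml")  # PyYAML is imported under the name "yaml"
if found is None:
    print("PyYAML is not installed for this interpreter")
else:
    print("PyYAML %s found, newer than 3.9: %s" % (found, is_newer_than(found, "3.9")))
```

If this reports that PyYAML is missing even though ‘pip list’ shows it, pip and python are probably pointing at different Python installations.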

Next, head over to the NLTK download page, download one of the distributions, unzip it, and open a command line in the root of the unzipped files. Then run the setup script:

> sudo python setup.py install

This will install NLTK into whichever version of Python the python command runs. You can test the install by starting the Python command line and importing the nltk module:

> python
 Python 2.7.5 (default, Oct 11 2013, 15:52:19)
 [GCC 4.2.1 Compatible Apple LLVM 5.0 (clang-500.2.75)] on darwin
 Type "help", "copyright", "credits" or "license" for more information.
 >>> import nltk

If the import fails or throws an error, your install failed.
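If you would rather run the check as a script than type it at the prompt, something along these lines should work. It is a minimal sketch: check_nltk is a hypothetical helper name, not part of NLTK. It also prints the interpreter version, which is worth a glance because this series targets Python 3 and the plain python command may start a different version.

```python
import sys


def check_nltk():
    """Try to import nltk and return (ok, message) describing the result."""
    try:
        import nltk
    except ImportError as exc:
        return False, "nltk import failed: %s" % exc
    # __version__ is read defensively in case a build does not define it.
    return True, "nltk %s imported successfully" % getattr(nltk, "__version__", "unknown")


# Show which interpreter is running, then the result of the import test.
print("Python %d.%d.%d" % sys.version_info[:3])
ok, message = check_nltk()
print(message)
```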

If there are no errors, download the corpora (this also tests that NLTK is installed), remaining in the Python command line:

>>> import nltk
>>> nltk.download()

A dialog will appear. Select "all" to download all packages, then click Download.

If you hit any errors in the steps above, it is likely that the NLTK installation failed.
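Once the download has finished, you can confirm that the data is actually where NLTK looks for it. The sketch below assumes the Brown corpus as the thing to look for, simply because it is one of the packages included when you download everything. corpus_available is a hypothetical helper written for this article. It relies on nltk.data.find, which raises LookupError when a resource cannot be found.

```python
def corpus_available(resource="corpora/brown"):
    """Return True if NLTK can locate the given resource on its data path."""
    try:
        import nltk
    except ImportError:
        # NLTK itself is missing, so no corpus can be available either.
        return False
    try:
        nltk.data.find(resource)
    except LookupError:
        return False
    return True


if corpus_available():
    print("Brown corpus found, the download worked")
else:
    print("Brown corpus not found, run nltk.download() again")
```

On a machine with no graphical display the dialog cannot open. In that case nltk.download("all") fetches everything without the dialog, and a single package name such as nltk.download("brown") fetches just that package.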

We are ready to begin. Watch out for No 2. NLP Basics.
