· nlp python polyglot

Python: polyglot - ModuleNotFoundError: No module named 'icu'

I wanted to use the polyglot NLP library that my colleague Will Lyon mentioned in his analysis of Russian Twitter Trolls but had installation problems which I thought I’d share in case anyone else experiences the same issues.

I started by trying to install polyglot:

$ pip install polyglot

ImportError: No module named 'icu'

Hmmm I’m not sure what icu is but luckily there’s a GitHub issue covering this problem. That led me to Toby Fleming’s blog post that suggests the following steps:

brew install icu4c
export ICU_VERSION=58
export PYICU_INCLUDES=/usr/local/Cellar/icu4c/58.2/include
export PYICU_LFLAGS=-L/usr/local/Cellar/icu4c/58.2/lib
pip install pyicu

I already had icu4c installed so I just had to make sure that I had the same version of that library as Toby did. I ran the following command to check that:

$ ls -lh /usr/local/Cellar/icu4c/
total 0
drwxr-xr-x  12 markneedham  admin   408B 28 Nov 06:12 58.2

That still wasn’t enough though! I had to install these two libraries as well:

pip install pycld2
pip install morfessor

I was then able to install polyglot, but had to then run the following commands to download the files needed for entity extraction:

polyglot download embeddings2.de
polyglot download ner2.de
polyglot download embeddings2.en
polyglot download ner2.en
  • LinkedIn
  • Tumblr
  • Reddit
  • Google+
  • Pinterest
  • Pocket