Python: polyglot - ModuleNotFoundError: No module named 'icu'
I wanted to use the polyglot NLP library that my colleague Will Lyon mentioned in his analysis of Russian Twitter Trolls, but I ran into installation problems. I thought I’d share them in case anyone else hits the same issues.
I started by trying to install polyglot:
$ pip install polyglot
ImportError: No module named 'icu'
Hmmm, I’m not sure what icu is, but luckily there’s a GitHub issue covering this problem. That led me to Toby Fleming’s blog post, which suggests the following steps:
brew install icu4c
export ICU_VERSION=58
export PYICU_INCLUDES=/usr/local/Cellar/icu4c/58.2/include
export PYICU_LFLAGS=-L/usr/local/Cellar/icu4c/58.2/lib
pip install pyicu
I already had icu4c installed, so I just had to make sure that I had the same version of that library as Toby did, because the include and lib paths above hard-code the 58.2 directory. I ran the following command to check that:
$ ls -lh /usr/local/Cellar/icu4c/
total 0
drwxr-xr-x 12 markneedham admin 408B 28 Nov 06:12 58.2
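If your installed version differs from 58.2, the three environment variables need to point at your directory instead. Here is a minimal Python sketch of that lookup. The function name `pyicu_env` is hypothetical, and it assumes that Homebrew keeps icu4c under `/usr/local/Cellar/icu4c` as on my machine and that `ICU_VERSION` is just the major version number, as in the steps above:

```python
import re
from pathlib import Path


def pyicu_env(cellar="/usr/local/Cellar/icu4c"):
    """Build the PyICU environment variables for the newest icu4c in `cellar`.

    `pyicu_env` is a hypothetical helper; the default path is an assumption
    based on the Homebrew layout shown in this post.
    """
    versions = [p.name for p in Path(cellar).iterdir() if p.is_dir()]
    if not versions:
        raise FileNotFoundError(f"no icu4c versions found under {cellar}")

    # Compare versions numerically so that e.g. 58.10 ranks above 58.2.
    def version_key(name):
        return [int(n) for n in re.findall(r"\d+", name)]

    newest = max(versions, key=version_key)
    root = Path(cellar) / newest
    return {
        "ICU_VERSION": newest.split(".")[0],  # major version only, e.g. "58"
        "PYICU_INCLUDES": str(root / "include"),
        "PYICU_LFLAGS": f"-L{root / 'lib'}",
    }
```

Calling `pyicu_env()` returns a dictionary that you could print as `export KEY=VALUE` lines, or pass as the `env` argument when launching `pip install pyicu` from a script.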
That still wasn’t enough though! I had to install these two libraries as well:
pip install pycld2
pip install morfessor
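To see which of these dependencies are still missing without triggering the import error itself, something like the following sketch can help. `missing_modules` is a hypothetical helper, and the module names are assumed to be the import names of the packages installed above (`icu` for pyicu, as the error message suggests):

```python
from importlib.util import find_spec

# Import names assumed for the packages in this post: pyicu imports as "icu".
REQUIRED = ("icu", "pycld2", "morfessor", "polyglot")


def missing_modules(names=REQUIRED):
    """Return the names from `names` that cannot be found on the import path."""
    return [name for name in names if find_spec(name) is None]
```

An empty list from `missing_modules()` means all four modules can be located by the current Python interpreter. It does not prove they import cleanly, only that they are installed somewhere on the path.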
I was then able to install polyglot, but I still had to run the following commands to download the files needed for entity extraction:
polyglot download embeddings2.de
polyglot download ner2.de
polyglot download embeddings2.en
polyglot download ner2.en
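The four commands above follow a pattern: one `embeddings2` and one `ner2` package per language code. If you need more languages, a small script can generate the same commands. This is only a sketch: `polyglot_packages` and `download` are hypothetical helper names, and it assumes that other languages use the same `embeddings2.<lang>` and `ner2.<lang>` naming as the German and English packages here:

```python
import subprocess


def polyglot_packages(languages, kinds=("embeddings2", "ner2")):
    """List the package names for each language, e.g. 'embeddings2.de'."""
    return [f"{kind}.{lang}" for lang in languages for kind in kinds]


def download(languages, run=subprocess.run):
    """Run `polyglot download <package>` for every package, stopping on failure.

    `run` can be swapped out for testing; by default it is subprocess.run.
    """
    for package in polyglot_packages(languages):
        run(["polyglot", "download", package], check=True)
```

With this in place, `download(["de", "en"])` would issue the same four commands in the same order as above.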
About the author
I'm currently working on short form content at ClickHouse. I publish short 5 minute videos showing how to solve data problems on YouTube @LearnDataWithMark. I previously worked on graph analytics at Neo4j, where I also co-authored the O'Reilly Graph Algorithms Book with Amy Hodler.