
Python streaming is not working


New Contributor

 Hi! I'm working with Cloudera Manager 5.11.2 installed on Oracle VirtualBox (4 nodes). I wrote a simple MapReduce Python streaming example that counts words, and it works fine. But adding the line "import pymorphy2" makes the streaming job fail at the map stage. It looks like the streaming interpreter can't work with this library. What can I do to fix this issue?


The VMs run Ubuntu 14.04 with Python 2.7 installed.


P.S. pymorphy2 works with Russian words; I need it to get the initial (dictionary) forms of words.


Re: Python streaming is not working

Master Guru
Could you share the error you observe on your failing map tasks?

Also, how are you ensuring that the third-party Python module is available across the cluster? Have you pre-installed it cluster-wide, or are you shipping it along as an egg or similar via your job's distributed cache?
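
If the task logs show nothing useful, one way to get at the error is to guard the import inside the mapper and write the details to stderr, which Hadoop streaming stores in the task attempt's logs. The following is only a minimal sketch under assumptions: the original mapper was not posted, so `try_import` and `main` are hypothetical names, and the word-count loop is a stand-in for the real script.

```python
#!/usr/bin/env python
"""Diagnostic mapper sketch: report why an import fails inside a streaming task."""
import sys
import traceback


def try_import(name):
    """Return (module, None) on success or (None, traceback_text) on failure."""
    try:
        return __import__(name), None
    except Exception:
        # Usually an ImportError, but a module can also raise while loading.
        return None, traceback.format_exc()


def main(module_name="pymorphy2", stdin=sys.stdin, stdout=sys.stdout,
         stderr=sys.stderr):
    """Run a plain word-count map step, or log diagnostics if the import fails."""
    module, error = try_import(module_name)
    if error is not None:
        # Show which interpreter and search path this node actually uses.
        stderr.write("interpreter: %s\n" % sys.executable)
        stderr.write("sys.path: %r\n" % (sys.path,))
        stderr.write(error)
        return 1
    for line in stdin:
        for word in line.split():
            stdout.write("%s\t1\n" % word)
    return 0
```

To run it as the mapper, end the script with `sys.exit(main())`. If the log reports that the module cannot be found on some nodes, that points to the library being installed only on the machine where the script was tested, and the printed interpreter path and `sys.path` show where each node is looking.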