Python streaming is not working

New Contributor

Hi! I'm working with Cloudera Manager 5.11.2 installed on Oracle VirtualBox (4 nodes). I wrote a simple MapReduce Python streaming example that counts words, and it works fine. But adding the line "import pymorphy2" makes the streaming job fail at the map stage. It looks like the streaming interpreter can't work with this library. What can I do to fix this issue?

 

The VMs run Ubuntu 14.04 with Python 2.7 installed.

 

P.S. pymorphy2 works with Russian words; I need it to get each word's initial form.

1 REPLY

Master Guru
Could you share the error you observe on your failing map tasks?
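
One way to capture that error is to guard the import inside the mapper and write the traceback to stderr, which streaming tasks record in the task attempt's stderr log. The following is only a sketch of a word-count mapper under that assumption; the helper names `load_optional`, `map_lines`, and `main` are hypothetical and not taken from the original job.

```python
import sys
import traceback


def load_optional(module_name):
    """Try to import module_name; on failure, log details to stderr and return None."""
    try:
        return __import__(module_name)
    except Exception:
        # Everything written to stderr ends up in the map task's stderr log.
        sys.stderr.write("import of %s failed on this node\n" % module_name)
        sys.stderr.write("interpreter: %s\n" % sys.executable)
        traceback.print_exc(file=sys.stderr)
        return None


def map_lines(lines):
    """Emit one 'word<TAB>1' record per whitespace-separated token."""
    for line in lines:
        for word in line.split():
            yield "%s\t1" % word.lower()


def main():
    # Fail loudly (non-zero exit) so the task is marked failed with a readable log.
    if load_optional("pymorphy2") is None:
        sys.exit(1)
    for record in map_lines(sys.stdin):
        sys.stdout.write(record + "\n")
```

In an actual mapper script, `main()` would be called at the bottom of the file. If the log shows an `ImportError`, the module is most likely missing on the node (or for the interpreter) that ran the task.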

Also, how are you ensuring that the third-party Python module is available across the cluster? Have you pre-installed it on every node, or are you shipping it along (as an egg, for example) via your job's distributed cache?
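
If the module is shipped with the job rather than pre-installed, the mapper has to put the shipped location on `sys.path` before importing. A minimal sketch follows, assuming the dependencies arrive in the task's working directory under a name such as `deps` (a hypothetical name, for instance an archive unpacked by the distributed cache). Whether pymorphy2 and its own dependencies and dictionary files work when shipped this way has not been verified here.

```python
import os
import sys


def add_shipped_paths(names, base_dir="."):
    """Prepend shipped directories or zip/egg files that exist to sys.path.

    Returns the list of absolute paths that were actually added.
    """
    added = []
    for name in names:
        path = os.path.abspath(os.path.join(base_dir, name))
        # Skip names that were not shipped to this task, and avoid duplicates.
        if os.path.exists(path) and path not in sys.path:
            sys.path.insert(0, path)
            added.append(path)
    return added
```

A mapper would call something like `add_shipped_paths(["deps"])` before its `import` statements; an empty return value means nothing with that name reached the task's working directory.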