New Contributor
Posts: 1
Registered: 02-21-2017

Cloudera Quickstart VM to run Python MapReduce scripts.



I'm new to the Cloudera world. I have set up the Cloudera Quickstart VM and it looks good. I have written some MapReduce jobs in Python, and I have the scripts and input files ready.

How can I run my Python scripts in the Cloudera Quickstart VM? Is there a tutorial or step-by-step instructions?


Once I have tested this in my Cloudera Quickstart VM, I want to set up a 3- to 5-node Cloudera cluster and run the job across multiple nodes. However, all my scripts are written in Python. I have been looking for material on how to do this on a Cloudera cluster, but so far I have had no luck.


Really appreciate your help.

Posts: 519
Topics: 14
Kudos: 92
Solutions: 45
Registered: 09-02-2016

Re: Cloudera Quickstart VM to run Python MapReduce scripts.



You can package the job as a JAR file, copy it to your cluster with scp, and run it as follows:



$ hadoop jar <jar> [mainClass] args...



$ hadoop jar myJar.jar training.wordcount /user/root/inputfile.txt /user/root/output/
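
Since the scripts in question are written in Python rather than Java, another route worth trying is Hadoop Streaming, which runs any executable that reads records from stdin and writes tab-separated key/value lines to stdout. Below is a minimal word-count sketch, not a definitive setup. The file name wordcount.py, the function names, and the "map"/"reduce" arguments are made up for illustration, and the streaming jar path in the comment is an assumption that may differ on your install.

```python
#!/usr/bin/env python
"""Word-count sketch for Hadoop Streaming (hypothetical file: wordcount.py).

Possible invocation; the jar location is an assumption, check your install:

  hadoop jar /usr/lib/hadoop-mapreduce/hadoop-streaming.jar \
      -files wordcount.py \
      -mapper "python wordcount.py map" \
      -reducer "python wordcount.py reduce" \
      -input /user/root/inputfile.txt -output /user/root/output/
"""
import sys
from itertools import groupby


def map_lines(lines):
    """Mapper: emit one 'word<TAB>1' record per whitespace-separated word."""
    for line in lines:
        for word in line.split():
            yield "%s\t1" % word


def reduce_lines(lines):
    """Reducer: sum the counts for each word.

    Relies on the input being sorted by key, which Hadoop Streaming
    does between the map and reduce phases.
    """
    # Skip any line that is not a 'key<TAB>value' record.
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines if "\t" in line)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield "%s\t%d" % (word, sum(int(count) for _, count in group))


if __name__ == "__main__" and len(sys.argv) > 1:
    # "map" selects the mapper; any other argument selects the reducer.
    step = map_lines if sys.argv[1] == "map" else reduce_lines
    for record in step(sys.stdin):
        print(record)
```

You can dry-run the same script in the VM without Hadoop by piping a file through it: cat inputfile.txt | python wordcount.py map | sort | python wordcount.py reduce. The same streaming command should carry over unchanged to a multi-node cluster, provided Python is installed on every worker node.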


Some links: