
Cloudera Quickstart VM to run python mapreduce scripts.

New Contributor

Hi,

 

I'm new to the Cloudera world. I have set up the Cloudera Quickstart VM and it looks good. I have written some MapReduce jobs in Python, and I have those scripts and their input files.

How can I run my Python scripts in the Cloudera Quickstart VM? Is there a tutorial or a set of step-by-step instructions?

 

Once I have tested this in the Quickstart VM, I want to set up a 3 to 5 node Cloudera cluster and run the job across multiple nodes. However, all my scripts are written in Python. I have been looking for material on how to do this on a Cloudera cluster, but so far I have had no luck.

 

I would really appreciate your help.

1 REPLY

Re: Cloudera Quickstart VM to run python mapreduce scripts.

Champion

@dino11092

 

You can create a JAR file, copy it to your cluster with scp, and run it as follows:

 

Syntax:

$ hadoop jar <jar> [mainClass] args...

 

Example:

$ hadoop jar myJar.jar training.wordcount /user/root/inputfile.txt /user/root/output/
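
Since the scripts in the question are written in Python, it may help to know that the same hadoop jar command can also launch non-Java code: Hadoop Streaming ships as a jar and runs any executable that reads records from stdin and writes tab-separated key/value lines to stdout. Below is a minimal word count sketch under that assumption. The file name wordcount_streaming.py, the function names, and the streaming jar path in the comment are hypothetical placeholders; the real jar location depends on your installation, and the HDFS paths are simply reused from the example above.

```python
#!/usr/bin/env python
# Word count for Hadoop Streaming: one file that acts as mapper or reducer.
#
# Hypothetical invocation (jar path is an assumption, adjust to your install):
#   hadoop jar /path/to/hadoop-streaming.jar \
#       -files wordcount_streaming.py \
#       -mapper "python wordcount_streaming.py map" \
#       -reducer "python wordcount_streaming.py reduce" \
#       -input /user/root/inputfile.txt \
#       -output /user/root/output/
import sys
from itertools import groupby


def map_lines(lines):
    """Emit one 'word<TAB>1' record for every word in the input lines."""
    for line in lines:
        for word in line.split():
            yield "%s\t1" % word


def reduce_lines(lines):
    """Sum the counts per word.

    The input must already be sorted by key, which is how Hadoop
    delivers mapper output to a streaming reducer.
    """
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield "%s\t%d" % (word, sum(int(count) for _, count in group))


# Only touch stdin when a mode ("map" or "reduce") is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    step = map_lines if sys.argv[1] == "map" else reduce_lines
    for record in step(sys.stdin):
        print(record)
```

Before submitting to the cluster, the script can be checked locally by piping a small file through the map step, then sort, then the reduce step, which mimics the shuffle that Hadoop performs between the two phases.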

 

Some links:

https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-common/CommandsManual.html

 

http://stackoverflow.com/questions/13012511/how-to-run-a-jar-file-in-hadoop