
How to run a MapReduce program in a VM from a Java app?

New Contributor

Hi, I'm new to Hadoop and I'm trying to do a few things, but I don't know where to start.

 

I'm trying to build a simple Java app that starts a MapReduce program and sends data to it.

I read about WebHDFS and HttpFS, but I'm not sure if that's what I need.

 

 

Can someone tell me where to start or give me some advice?

 

Thank you so much

1 ACCEPTED SOLUTION

Rising Star

OK, you have to build a MapReduce job and run it.

Basically you need to implement a few interfaces (Mapper, Reducer), create a JAR with that code, and submit it either from the command line or from Java by creating, configuring, and starting a Job.

 

See this tutorial:

https://hadoop.apache.org/docs/r2.6.0/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduce...
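If it helps, here is a rough sketch of such a driver using the `org.apache.hadoop.mapreduce.Job` API, in the style of the linked tutorial. The class names (`WordCountDriver`, `WordCountMapper`, `WordCountReducer`) are placeholders for code you would write yourself, and this assumes a working Hadoop client configuration:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCountDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCountDriver.class);
        job.setMapperClass(WordCountMapper.class);     // your Mapper implementation
        job.setReducerClass(WordCountReducer.class);   // your Reducer implementation
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));   // input dir in HDFS
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // must not exist yet
        // Submit the job and block until it finishes.
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

You can package this into a JAR and run it with `hadoop jar yourapp.jar WordCountDriver /input /output`, or call the same Job-building code directly from your Java app once its configuration points at your cluster.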

 


3 REPLIES

Rising Star

I think you should do a bit of reading to get the basic concepts first. There are many introductory videos on YouTube, for instance.

 

To answer your question: you don't "send data" to a MapReduce job. You store data in HDFS (usually via the command line) and then run a job that uses that data as input.

 

Start with the "hello world" of Hadoop, which consists of storing a text file in HDFS and then running a job to count the words in it.

See https://hadoop.apache.org/docs/r2.6.0/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduce...
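To get a feel for what the mapper and reducer each contribute before touching the Hadoop API, here is a plain-Java sketch of the word-count idea with no Hadoop dependencies (the method names are illustrative, not Hadoop's):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class WordCountSketch {
    // "Map" phase: each input line is tokenized into (word, 1) pairs.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(Map.entry(word, 1));
            }
        }
        return pairs;
    }

    // "Reduce" phase: pairs are grouped by word and their counts summed.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, Integer> p : pairs) {
            counts.merge(p.getKey(), p.getValue(), Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        pairs.addAll(map("hello world"));
        pairs.addAll(map("hello hadoop"));
        System.out.println(reduce(pairs)); // {hadoop=1, hello=2, world=1}
    }
}
```

In real Hadoop, the framework does the grouping between the two phases (the shuffle) and runs many mappers and reducers in parallel; the shape of the two functions is the same.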

 

New Contributor

Thanks for the answer.

 

I didn't explain myself well... sorry.

 

I know that the data is stored in HDFS. My goal is for a Java app to generate some data and send it to my virtual machine with Hadoop installed, where it gets stored in HDFS. In parallel, the Java app needs to tell MapReduce to start computing on the data in HDFS that the app sent.
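For the first half of that goal (writing data from a Java app into HDFS on the VM), one common route is the `org.apache.hadoop.fs.FileSystem` client API. A rough sketch, where the NameNode address and the HDFS path are placeholders for whatever your VM actually uses:

```java
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsWriteSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Placeholder: point at the NameNode running in your VM
        // (host and port depend on your installation).
        conf.set("fs.defaultFS", "hdfs://my-hadoop-vm:8020");
        try (FileSystem fs = FileSystem.get(conf);
             OutputStream out = fs.create(new Path("/user/me/input/data.txt"))) {
            out.write("data generated by the Java app\n"
                    .getBytes(StandardCharsets.UTF_8));
        }
        // After the write, the same app can create, configure, and
        // submit a Job that uses this path as its input.
    }
}
```

WebHDFS, which you mentioned, is an alternative for the same step: it exposes HDFS over HTTP, which can be convenient if you don't want the Hadoop client libraries on the sending side.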

 

 
