Created on 04-20-2015 12:21 AM - edited 09-16-2022 02:26 AM
Hi, I'm new to Hadoop and I'm trying to do some stuff, but I don't know where to start.
I'm trying to build a simple Java app that starts and sends data to a MapReduce program.
I read about WebHDFS and HttpFS, but I'm not sure whether that is what I need.
Can someone tell me where I should start, or give me some advice?
Thank you so much
Created 04-21-2015 07:11 AM
I think you should do a bit of reading first to get the basic concepts. There are many introductory videos on YouTube, for instance.
To answer your question: you don't "send data" to a MapReduce job. You store data in HDFS (usually via the command line) and then run a job that uses that data as input.
Start with the "hello world" of Hadoop: store a text file in HDFS, then run a job that counts the words in it.
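That word-count "hello world" boils down to two small classes. A minimal sketch using the standard `org.apache.hadoop.mapreduce` API (this assumes a Hadoop 2.x client library on the classpath; the class names are illustrative, not from this thread):

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

public class WordCount {

    // Mapper: emits (word, 1) for every whitespace-separated token in each line.
    public static class TokenizerMapper
            extends Mapper<LongWritable, Text, Text, IntWritable> {

        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, ONE);
            }
        }
    }

    // Reducer: sums the per-word counts produced by the mappers.
    public static class IntSumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {

        private final IntWritable result = new IntWritable();

        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }
}
```

Package these classes into a jar together with a small driver class, and you can run them with `hadoop jar` against a text file you uploaded to HDFS.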
Created on 04-23-2015 03:00 AM - edited 04-23-2015 03:02 AM
Thanks for the answer.
I didn't explain myself well... sorry.
I know that the data is stored in HDFS. My goal is for a Java app to generate some data, send it to my virtual machine with Hadoop installed, and have it stored in HDFS. In parallel, the Java app needs to tell MapReduce to start computing on the data in HDFS that the Java app sent.
Created 04-23-2015 03:10 AM
OK, then you have to build a MapReduce job and run it.
Basically, you need to implement a couple of interfaces (Mapper, Reducer), package that code into a jar, and submit it either from the command line or from Java by creating, configuring, and starting a Job.
See this tutorial:
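The "from Java create, configure and start a Job" part can be sketched as a driver class like this. It is a minimal sketch assuming the Hadoop 2.x `mapreduce` API; `TokenizerMapper` and `IntSumReducer` stand for whatever Mapper and Reducer classes you implemented (hypothetical names, not from this thread):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCountDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // Create and configure the job.
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCountDriver.class);
        job.setMapperClass(TokenizerMapper.class);   // your Mapper implementation (hypothetical name)
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);    // your Reducer implementation (hypothetical name)
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));   // e.g. the HDFS dir your app wrote to
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // must not exist yet

        // Submit the job and block until it finishes, printing progress.
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

Your Java app can call a driver like this right after it finishes writing its data to HDFS, which covers the "tell MapReduce to start" part of the question.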