Member since
09-15-2013
6
Posts
0
Kudos Received
1
Solution
My Accepted Solutions
| Title | Views | Posted |
| --- | --- | --- |
|  | 5132 | 09-17-2013 11:08 AM |
09-17-2013
02:30 PM
I got it! But it seems overly complicated to grab one crummy line from a file...

```java
String dateString = "";
FileSystem fileSystem = FileSystem.get(configuration);
FileStatus[] fileStatus = fileSystem.listStatus(new Path("/temp/query2job2temp"));
for (FileStatus status : fileStatus) {
    Path path = status.getPath();
    // Skip _SUCCESS/_logs entries; only read the part-* output files.
    if (path.toString().matches(".*part.*")) {
        BufferedReader bufferedReader =
                new BufferedReader(new InputStreamReader(fileSystem.open(path)));
        dateString = bufferedReader.readLine();
        if (dateString != null) {  // guard against an empty part file
            Pattern pattern = Pattern.compile("([0-9]{2}/[0-9]{2}/[0-9]{2})");
            Matcher matcher = pattern.matcher(dateString);
            if (matcher.find()) {
                dateString = matcher.group(0);
            }
        }
        bufferedReader.close();
    }
}
```
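For what it's worth, the date-matching step can be pulled out and sanity-checked on its own, away from HDFS. This is just a sketch; the sample input line below is made up, not taken from the actual job output:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DateExtract {
    // Extracts the first MM/dd/yy-style token from a line, or "" if none is found.
    static String extractDate(String line) {
        Matcher matcher = Pattern.compile("([0-9]{2}/[0-9]{2}/[0-9]{2})").matcher(line);
        return matcher.find() ? matcher.group(1) : "";
    }

    public static void main(String[] args) {
        System.out.println(extractDate("maxdate\t09/17/13"));  // prints 09/17/13
    }
}
```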
09-17-2013
01:15 PM
Now if only I could read the output of a previous job in the setup of another...
09-17-2013
11:08 AM
I just needed my second job's mapper to output 1 as the key and the file line as the value, and take care of things in the reducer. IT WORKS!!! That was a lot of work. Now onto part 2 of 3 (due tomorrow; I'm hopeful though 🙂).
09-17-2013
10:49 AM
Now I have to output a single top-20 list from the reducer. It seems there are multiple reducers. How can I limit it to just one? I can't change the configuration, since my professor will be running my code on his own Hadoop installation.
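One option worth noting (an assumption on my part, not something confirmed in this thread): with the `org.apache.hadoop.mapreduce.Job` API, the reducer count can be set in the driver code itself, so it travels with the submitted job rather than depending on the cluster's configuration:

```java
// In the driver, before submitting the job:
job.setNumReduceTasks(1);  // force a single reducer, hence a single part-r-00000 output file
```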
09-17-2013
08:32 AM
My professor helped me with this one. My assignment requires multiple jobs, and I had a hard-coded absolute path for FileOutputFormat.setOutputPath(). I suppose an absolute path would work if I got it right, but a relative path works. So now my first job executes! Now I just need to figure out what's wrong with my second, but I think that is a much more straightforward programming problem.
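For reference, the kind of driver wiring described above might be sketched like this. The path names and job variables are hypothetical; a relative path resolves under the submitting user's HDFS home directory, which is why it works on any installation:

```java
// Hypothetical driver snippet: chain two jobs through an intermediate directory.
FileInputFormat.addInputPath(job1, new Path(args[0]));
FileOutputFormat.setOutputPath(job1, new Path("job1out"));  // relative: /user/<name>/job1out
job1.waitForCompletion(true);

FileInputFormat.addInputPath(job2, new Path("job1out"));    // second job reads the first's output
FileOutputFormat.setOutputPath(job2, new Path(args[1]));
```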
09-15-2013
09:25 PM
I apologize if this is not correct. I am taking a Big Data course and am trying to get my first Hadoop program running. I have the file I want to look at uploaded to HDFS; I can see it at http://localhost:50075 when I browse the directory. I set the input path for my mapper like so:

```java
FileInputFormat.addInputPath(job1, new Path(args[0]));
```

With args as either "hdfs::/localhost:9000/path/to/file" or "path/to/file" I get the same result. I get a lot of these:

```
java.lang.IllegalArgumentException: Wrong FS: hdfs://localhost:50075/path/to/file, expected: hdfs://localhost:9000
```

At the end I get this:

```
13/09/15 22:22:32 INFO mapred.JobClient: Job complete: job_201309151109_0016
13/09/15 22:22:32 INFO mapred.JobClient: Counters: 7
13/09/15 22:22:32 INFO mapred.JobClient: Job Counters
13/09/15 22:22:32 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=109261
13/09/15 22:22:32 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
13/09/15 22:22:32 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
13/09/15 22:22:32 INFO mapred.JobClient: Launched map tasks=8
13/09/15 22:22:32 INFO mapred.JobClient: Data-local map tasks=8
13/09/15 22:22:32 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=0
13/09/15 22:22:32 INFO mapred.JobClient: Failed map tasks=1
```

What am I doing wrong?
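A hedged note on the exception above: "Wrong FS ... expected: hdfs://localhost:9000" usually means the input path's scheme and authority don't match the cluster's default filesystem URI (port 50075 is the datanode web UI, not the filesystem port). Assuming that diagnosis, either form below should be consistent with the error message's expected URI:

```java
// Either a plain path, resolved against the default filesystem...
FileInputFormat.addInputPath(job1, new Path("/path/to/file"));
// ...or a fully qualified URI that matches the expected hdfs://localhost:9000 exactly:
FileInputFormat.addInputPath(job1, new Path("hdfs://localhost:9000/path/to/file"));
```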
Labels:
- Apache Hadoop
- HDFS