Member since
09-15-2013
6
Posts
0
Kudos Received
1
Solution
My Accepted Solutions
| Title | Views | Posted |
| --- | --- | --- |
|  | 5132 | 09-17-2013 11:08 AM |
09-17-2013
02:30 PM
I got it! But it seems overly complicated to grab one crummy line from a file...

```java
String dateString = "";
FileSystem fileSystem = FileSystem.get(configuration);
FileStatus[] fileStatus = fileSystem.listStatus(new Path("/temp/query2job2temp"));
for (FileStatus status : fileStatus) {
    Path path = status.getPath();
    // Skip _SUCCESS/_logs entries; only read the part-* output files.
    if (path.toString().matches(".*part.*")) {
        BufferedReader bufferedReader =
                new BufferedReader(new InputStreamReader(fileSystem.open(path)));
        dateString = bufferedReader.readLine();
        if (dateString != null) {  // guard against an empty part file
            Pattern pattern = Pattern.compile("([0-9]{2}/[0-9]{2}/[0-9]{2})");
            Matcher matcher = pattern.matcher(dateString);
            if (matcher.find()) {
                dateString = matcher.group(0);
            }
        }
        bufferedReader.close();
    }
}
```
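For what it's worth, the date-matching step can be pulled out and sanity-checked on its own, away from HDFS. This is just a sketch; the sample input line below is made up, not taken from the actual job output:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DateExtract {
    // Extracts the first MM/dd/yy-style token from a line, or "" if none is found.
    static String extractDate(String line) {
        Matcher matcher = Pattern.compile("([0-9]{2}/[0-9]{2}/[0-9]{2})").matcher(line);
        return matcher.find() ? matcher.group(1) : "";
    }

    public static void main(String[] args) {
        System.out.println(extractDate("maxdate\t09/17/13"));  // prints 09/17/13
    }
}
```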
09-17-2013
01:15 PM
Now if only I could read the output of a previous job in the setup of another...
09-17-2013
11:08 AM
I just needed my second job's mapper to output 1 as the key and the file line as the value, and take care of things in the reducer. IT WORKS!!! That was a lot of work. Now onto part 2 of 3 (due tomorrow; I'm hopeful though 🙂).
09-17-2013
10:49 AM
Now I have to output a single top-20 list from the reducer. It seems there are multiple reducers. How can I limit it to just one? I can't change the configuration, since my professor will be running my code on his own Hadoop installation.
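One option worth noting (an assumption on my part, not something confirmed in this thread): with the `org.apache.hadoop.mapreduce.Job` API, the reducer count can be set in the driver code itself, so it travels with the submitted job rather than depending on the cluster's configuration:

```java
// In the driver, before submitting the job:
job.setNumReduceTasks(1);  // force a single reducer, hence a single part-r-00000 output file
```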
09-17-2013
08:32 AM
My professor helped me with this one. My assignment requires multiple jobs, and I had a hard-coded absolute path for FileOutputFormat.setOutputPath(). I suppose an absolute path would work if I got it right, but a relative path works. So now my first job executes! Now I just need to figure out what's wrong with my second, but I think that is a much more straightforward programming problem.
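For reference, the kind of driver wiring described above might be sketched like this. The path names and job variables are hypothetical; a relative path resolves under the submitting user's HDFS home directory, which is why it works on any installation:

```java
// Hypothetical driver snippet: chain two jobs through an intermediate directory.
FileInputFormat.addInputPath(job1, new Path(args[0]));
FileOutputFormat.setOutputPath(job1, new Path("job1out"));  // relative: /user/<name>/job1out
job1.waitForCompletion(true);

FileInputFormat.addInputPath(job2, new Path("job1out"));    // second job reads the first's output
FileOutputFormat.setOutputPath(job2, new Path(args[1]));
```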
09-15-2013
09:25 PM
I apologize if this is not correct. I am taking a Big Data course and am trying to get my first Hadoop program running. I have the file I want to look at uploaded to HDFS; I can see it at http://localhost:50075 when I browse the directory. I set the input path for my mapper like so:

```java
FileInputFormat.addInputPath(job1, new Path(args[0]));
```

With args as either "hdfs::/localhost:9000/path/to/file" or "path/to/file" I get the same result. I get a lot of these:

```
java.lang.IllegalArgumentException: Wrong FS: hdfs://localhost:50075/path/to/file, expected: hdfs://localhost:9000
```

At the end I get this:

```
13/09/15 22:22:32 INFO mapred.JobClient: Job complete: job_201309151109_0016
13/09/15 22:22:32 INFO mapred.JobClient: Counters: 7
13/09/15 22:22:32 INFO mapred.JobClient: Job Counters
13/09/15 22:22:32 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=109261
13/09/15 22:22:32 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
13/09/15 22:22:32 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
13/09/15 22:22:32 INFO mapred.JobClient: Launched map tasks=8
13/09/15 22:22:32 INFO mapred.JobClient: Data-local map tasks=8
13/09/15 22:22:32 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=0
13/09/15 22:22:32 INFO mapred.JobClient: Failed map tasks=1
```

What am I doing wrong?
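A hedged note on the exception above: "Wrong FS ... expected: hdfs://localhost:9000" usually means the input path's scheme and authority don't match the cluster's default filesystem URI (port 50075 is the datanode web UI, not the filesystem port). Assuming that diagnosis, either form below should be consistent with the error message's expected URI:

```java
// Either a plain path, resolved against the default filesystem...
FileInputFormat.addInputPath(job1, new Path("/path/to/file"));
// ...or a fully qualified URI that matches the expected hdfs://localhost:9000 exactly:
FileInputFormat.addInputPath(job1, new Path("hdfs://localhost:9000/path/to/file"));
```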
Labels:
- Apache Hadoop
- HDFS