Explorer
Posts: 16
Registered: ‎09-10-2015

How do we pass a white space as parameter for mapreduce job?

I have input data that looks something like this:

 

abcda     123     def    32423.432    123213 

 

There are many records of this type.

Using MapReduce, I have to replace a value with a new value in a specific field.

 

I pass three extra parameters when submitting the MR job to do this:

column number, search item, replacement item

e.g.: hadoop jar <jarname> <classname> <input-path> <output-path> 1 a z

which gives the output

zbcdz      123     def    32423.432    123213 

 

This works fine. The problem occurs when I have to pass a whitespace character. Suppose I run the job as below:

e.g.: hadoop jar <jarname> <classname> <input-path> <output-path> 1 a <whitespace>

The output should be

 bcd      123     def    32423.432    123213 

 

How can I pass a whitespace character as an argument?

Cloudera Employee
Posts: 65
Registered: ‎09-11-2015

Re: How do we pass a white space as parameter for mapreduce job?

Double quotes are how you turn two words separated by one or more
spaces into a single argument on a Java command line.

java my.MainClass "first argument" second third "fourth argument"

So if you want a white space to be an argument try

1 a " " <<< should give you 3 args

or

1 "a " <<< should give you 2 args with the second arg having a whitespace
at the end.
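
To check what the JVM actually receives, a tiny echo program can help. This is only a sketch: the class name ArgEcho and the bracket format are made up for illustration and are not part of the original job.

```java
// Hypothetical helper (not from the original job): prints each argument
// wrapped in brackets so that a lone space is visible.
final class ArgEcho {
    // Builds one line per argument, e.g. "arg[2]=[ ]" for a single space.
    static String describe(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < args.length; i++) {
            sb.append("arg[").append(i).append("]=[").append(args[i]).append("]");
            if (i < args.length - 1) {
                sb.append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(args.length + " argument(s)");
        System.out.println(describe(args));
    }
}
```

Run from a POSIX shell as `java ArgEcho 1 a " "`, this should report 3 arguments, with the last one shown as `[ ]`. Quoting rules differ between shells, so it is worth running the same check in the environment where the job is submitted.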

Evan

Explorer
Posts: 16
Registered: ‎09-10-2015

Re: How do we pass a white space as parameter for mapreduce job?

Unfortunately, it didn't work!

I tried it in the Windows command prompt with a standalone Java program that uses the replaceAll(arg1, arg2) function to do the replacement.

 

I tried to replace the letter with a space, so I passed "" as an argument.

The word pepsi changed to peps""

 

When I passed the argument as "" to a Hadoop job in a UNIX environment, I got the following exception:

 

 


15/09/25 03:22:16 INFO mapreduce.Job: Task Id : attempt_1440579785423_62258_m_000000_0, Status : FAILED
Error: java.lang.NullPointerException
at java.util.regex.Matcher.appendReplacement(Matcher.java:758)
at java.util.regex.Matcher.replaceAll(Matcher.java:906)
at java.lang.String.replaceAll(String.java:2162)
at com.nielsen.grfe.replaceValue.ReplaceValueRule$ReplaceRuleMapper.map(ReplaceValueRule.java:69)
at com.nielsen.grfe.replaceValue.ReplaceValueRule$ReplaceRuleMapper.map(ReplaceValueRule.java:56)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

 

15/09/25 03:22:21 INFO mapreduce.Job: Task Id : attempt_1440579785423_62258_m_000000_1, Status : FAILED
Error: java.lang.NullPointerException
at java.util.regex.Matcher.appendReplacement(Matcher.java:758)
at java.util.regex.Matcher.replaceAll(Matcher.java:906)
at java.lang.String.replaceAll(String.java:2162)
at com.nielsen.grfe.replaceValue.ReplaceValueRule$ReplaceRuleMapper.map(ReplaceValueRule.java:69)
at com.nielsen.grfe.replaceValue.ReplaceValueRule$ReplaceRuleMapper.map(ReplaceValueRule.java:56)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

 

15/09/25 03:22:25 INFO mapreduce.Job: Task Id : attempt_1440579785423_62258_m_000000_2, Status : FAILED
Error: java.lang.NullPointerException
at java.util.regex.Matcher.appendReplacement(Matcher.java:758)
at java.util.regex.Matcher.replaceAll(Matcher.java:906)
at java.lang.String.replaceAll(String.java:2162)
at com.nielsen.grfe.replaceValue.ReplaceValueRule$ReplaceRuleMapper.map(ReplaceValueRule.java:69)
at com.nielsen.grfe.replaceValue.ReplaceValueRule$ReplaceRuleMapper.map(ReplaceValueRule.java:56)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

 

15/09/25 03:22:31 INFO mapreduce.Job: map 100% reduce 0%
15/09/25 03:22:31 INFO mapreduce.Job: Job job_1440579785423_62258 failed with state FAILED due to: Task failed task_1440579785423_62258_m_000000
Job failed as tasks failed. failedMaps:1 failedReduces:0
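
The trace above fails inside Matcher.appendReplacement, called from String.replaceAll. That is what happens when the replacement string is null and the pattern matches, so one plausible reading (an assumption, since the mapper source is not shown) is that the replacement value never reached the map task. A minimal sketch that reproduces the exception and shows a defensive fallback; the class and method names are hypothetical:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch, not the original ReplaceValueRule code.
final class ReplaceSketch {
    // Treats a missing (null) replacement as an empty string. This fallback
    // is an assumption for illustration; failing fast may suit a real job better.
    static String safeReplace(String input, String search, String replacement) {
        String repl = (replacement == null) ? "" : replacement;
        // Pattern.quote and Matcher.quoteReplacement make both strings literal,
        // so characters such as '.', '$' or '\' are not treated specially.
        return input.replaceAll(Pattern.quote(search), Matcher.quoteReplacement(repl));
    }

    // Returns true if replaceAll throws a NullPointerException for a null
    // replacement when the pattern matches.
    static boolean nullReplacementThrows() {
        try {
            "abcda".replaceAll("a", null);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(nullReplacementThrows());
        System.out.println("[" + safeReplace("abcda", "a", " ") + "]");
        System.out.println("[" + safeReplace("abcda", "a", null) + "]");
    }
}
```

Logging the length of each received argument in the driver, before the job is submitted, is a quick way to tell whether the shell dropped the value or whether it was lost later on the way to the mapper.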