Implementing WordCount with Cascading on HDP 2.1 Sandbox

Explorer

Hi,

Could you help me resolve this error?

[root@sandbox ~]# yarn jar /tmp/MRJar/WordCount.jar com.denmark.danskeBank.vo.WordCount /tmp/data/hamlet.txt /tmp/output                             
Not a valid JAR: /tmp/MRJar/WordCount.jar

This is what I have done:

1. Created the WordCount.jar file in Eclipse with the Hadoop 1.x jars.
2. Uploaded it to the HDFS directory /tmp/MRJar.
3. Ran the command above and got this error.

Then I tried:

[root@sandbox ~]# hadoop fs -copyToLocal /tmp/MRJar/WordCount.jar /MapReduce                                                                         

16/02/21 05:46:07 WARN hdfs.DFSClient: DFSInputStream has been closed already

I also tried the steps given for building with Gradle.

1. While executing ~/gradle-1.9/bin/gradle clean jar, I got an error:

[cascade@sandbox part2]$ ~/gradle-1.9/bin/gradle clean jar
ERROR: JAVA_HOME is set to an invalid directory: /usr/lib/jvm/java-1.7.0-openjdk-1.7.0.91.x8

I wanted to create my own MapReduce job and try running it for practice. I am not a Java developer.

Could you guide me?

Thanks.

1 ACCEPTED SOLUTION

Re: Implementing WordCount with Cascading on HDP 2.1 Sandbox

@Revathy Mourouguessane You can ignore the warning "WARN hdfs.DFSClient: DFSInputStream has been closed already". There is already a JIRA open to address it.

I am sure you are following this tutorial: http://hortonworks.com/hadoop-tutorial/cascading-hortonworks-data-platform-2-1/

Check your JAVA_HOME setting.

What is the output of echo $JAVA_HOME?
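
A minimal sketch of that check, assuming the Gradle error means there is no executable bin/java under the directory JAVA_HOME points to. The function name check_java_home is hypothetical and only used for illustration:

```shell
# Hypothetical helper: succeed only if the given directory contains
# an executable bin/java, which is what a usable JAVA_HOME needs.
check_java_home() {
    dir="$1"
    if [ -n "$dir" ] && [ -f "$dir/bin/java" ] && [ -x "$dir/bin/java" ]; then
        echo "OK: $dir/bin/java found"
        return 0
    fi
    echo "INVALID: no executable bin/java under '$dir'"
    return 1
}

# Check the current setting; an unset JAVA_HOME is treated as empty.
check_java_home "${JAVA_HOME:-}" || true
```

If the check reports INVALID, point JAVA_HOME at a directory that does contain bin/java and run it again before retrying the Gradle build.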

7 REPLIES 7

Re: Implementing WordCount with Cascading on HDP 2.1 Sandbox

@Revathy Mourouguessane, see this link. It provides guidance for running the "word count" program on the HDP Sandbox.

https://www.youtube.com/watch?v=5MYv8usiMnE

Hope this helps you solve the error.

Re: Implementing WordCount with Cascading on HDP 2.1 Sandbox

Mentor

Your jar needs to be on the local filesystem, and preferably not in /tmp. If you are logged on as root, leave the jar in /root and just run it like so:

yarn jar jarname
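
The advice above can be sketched as a small wrapper that refuses to launch unless the jar is a regular file on the local filesystem. The function name run_local_jar and the YARN_CMD variable are hypothetical and exist only for this illustration. The paths in the usage comments come from the question.

```shell
# Hypothetical helper: run "yarn jar" only if the jar exists locally.
# A path that exists only in HDFS (such as /tmp/MRJar/WordCount.jar in
# the question) fails this check, matching the "Not a valid JAR" symptom.
run_local_jar() {
    jar="$1"
    shift
    if [ ! -f "$jar" ]; then
        echo "Not found on local filesystem: $jar" >&2
        return 1
    fi
    # YARN_CMD is an assumption of this sketch; it defaults to the yarn CLI.
    ${YARN_CMD:-yarn} jar "$jar" "$@"
}

# Usage on the sandbox (not executed here):
#   hadoop fs -copyToLocal /tmp/MRJar/WordCount.jar /root/
#   run_local_jar /root/WordCount.jar com.denmark.danskeBank.vo.WordCount \
#       /tmp/data/hamlet.txt /tmp/output
```

The input and output arguments are still HDFS paths. Only the jar itself has to be local.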

Re: Implementing WordCount with Cascading on HDP 2.1 Sandbox

Mentor

For the JAVA_HOME error, run one of my JDK scripts in the administration folder: https://github.com/dbist/scripts

Re: Implementing WordCount with Cascading on HDP 2.1 Sandbox

[guest@sandbox dataprocessing]$ find / -name java

[guest@sandbox dataprocessing]$ export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.95.x86_64/jre

[guest@sandbox dataprocessing]$ ~/gradle-1.9/bin/gradle clean jar

Build file '/home/guest/examples/dataprocessing/build.gradle': line 9

The RepositoryHandler.mavenRepo() method has been deprecated and is scheduled to be removed in Gradle 2.0. Please use the maven() method instead.

:clean UP-TO-DATE

:compileJava

Re: Implementing WordCount with Cascading on HDP 2.1 Sandbox

BUILD SUCCESSFUL

Re: Implementing WordCount with Cascading on HDP 2.1 Sandbox

Explorer

I was able to run a MapReduce job successfully. Because the call to job.setJarByClass(WordCount.class) was missing, Hadoop was unable to find the Mapper class. Thanks!