Member since: 06-02-2014
Posts: 17
Kudos Received: 1
Solutions: 2
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1807 | 04-28-2015 09:08 AM |
| | 1252 | 06-19-2014 12:59 PM |
07-06-2015 03:36 PM
This error was thrown during the execution of the job controller within the MapReduce job. Here's a similar one with the same root problem:

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hive/ql/io/orc/OrcNewOutputFormat
at com.who.bgt.logloader.schema.OrcFileLoader.run(OrcFileLoader.java:94)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at com.who.bgt.logloader.schema.OrcFileLoader.main(OrcFileLoader.java:45)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.ql.io.orc.OrcNewOutputFormat
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
... 8 more

The specific line it's complaining about is this one:

job.setOutputFormatClass(OrcNewOutputFormat.class);

The obvious problem is that it's failing to find the OrcNewOutputFormat class definition, which lives in hive-exec-0.13.1-cdh5.3.5.jar. I pushed the jar to hdfs://lib/hive-exec..., and within my main function I call the following before running the job:

DistributedCache.addFileToClassPath(new Path("/lib/hive-exec-0.13.1-cdh5.3.5.jar"), lConfig);

Can you be more explicit about how I make sure my distributed-cache configuration actually works? Optimally, I shouldn't have to put this jar in the distributed cache at all, since it sits at /opt/cloudera/parcels/CDH-5.3.5-1.cdh5.3.5.p0.4/jars/hive-exec-0.13.1-cdh5.3.5.jar on all of my slave nodes, but I also can't figure out how to tell MapReduce to look there.
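For anyone comparing notes, here is a minimal driver sketch of the two classpath approaches as I understand them on Hadoop 2.x / CDH5; the class name and job name are invented, and this is an illustration, not a verified fix. One thing worth noting from the trace itself: the failure happens inside OrcFileLoader.run in the client JVM, before any task launches, so the jar also has to be visible to the driver's own classpath (for example via HADOOP_CLASSPATH), which the distributed cache doesn't affect.

```java
// A minimal driver sketch (not the actual OrcFileLoader) showing two ways to
// get hive-exec onto the task classpath on Hadoop 2.x / CDH5. The paths come
// from the post above; treat this as an illustration, not a verified fix.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;

public class OrcJobDriver {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "orc-file-loader");
        Configuration jobConf = job.getConfiguration();

        // Option 1: ship the jar from HDFS via the Job API, the
        // non-deprecated equivalent of DistributedCache.addFileToClassPath.
        job.addFileToClassPath(new Path("/lib/hive-exec-0.13.1-cdh5.3.5.jar"));

        // Option 2: point tasks at the parcel directory that already exists
        // on every node, assuming the path really is identical everywhere.
        String extra = "/opt/cloudera/parcels/CDH-5.3.5-1.cdh5.3.5.p0.4/jars/*";
        String cp = jobConf.get("mapreduce.application.classpath");
        jobConf.set("mapreduce.application.classpath",
                cp == null ? extra : cp + "," + extra);

        // ... mapper, reducer, and output format setup would go here,
        // followed by job.waitForCompletion(true).
    }
}
```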
05-28-2015 04:41 PM
I'm using a Java MapReduce job to write data to a directory that will be interpreted as a Hive table in RCFile format. To do this, I need the org.apache.hadoop.hive.serde2.columnar.BytesRefArrayWritable class, which lives in hive-serde-0.13.1-cdh5.3.3.jar. So far, so good. I've included the jar on my command line like this:

/usr/bin/hadoop jar /path/lib/awesome-mapred-0.9.6.jar com.awesome.HiveLoadController -libjars /path/lib/postgresql-8.4-702.jdbc4.jar,/path/lib/hive-serde-0.13.1-cdh5.3.3.jar

I know for certain that it is loading the postgres library, because it prints correctly retrieved information before it throws the error. I know that it is grabbing and transferring the hive-serde jar, because it throws a fit if I move it out of the /path/lib directory. And I know that the class exists in the jar, because I've unpacked it and looked. Is there something in the rest of the lib path that might be interfering with it finding that class in the jar?
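One background detail that often matters here: -libjars is only honored when the driver hands its arguments to GenericOptionsParser, which is what ToolRunner does. A minimal sketch of that pattern follows; the class name is borrowed from the command line above, but the body is an invented illustration, not the real HiveLoadController.

```java
// Minimal ToolRunner skeleton. -libjars is only honored when the driver's
// arguments pass through GenericOptionsParser, which ToolRunner handles.
// The class name matches the command line above; the body is an invented
// illustration, not the actual HiveLoadController.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class HiveLoadController extends Configured implements Tool {
    @Override
    public int run(String[] args) throws Exception {
        // getConf() already carries the -libjars entries, so a job built
        // from it ships those jars to every task.
        Job job = Job.getInstance(getConf(), "hive-load-controller");
        job.setJarByClass(HiveLoadController.class);
        // ... mapper/reducer/input/output setup goes here ...
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(new Configuration(), new HiveLoadController(), args));
    }
}
```

Note that -libjars ships the jars to the tasks; it does not add them to the client JVM's own classpath, so a class needed during job setup must also be on HADOOP_CLASSPATH.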
Labels:
- Apache Hadoop
- Apache Hive
- MapReduce
05-26-2015 06:43 PM
Thank you, mfox. My problem was that the basic install had set all of HDFS's groups to "superuser" instead of "hadoop". Changing it to "hadoop" allowed MapReduce to write its history logs to the correct location.
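For reference, a hedged sketch of checking what the loaded HDFS configuration thinks the superuser group is, assuming the property involved was dfs.permissions.superusergroup (that property name is an assumption; the post doesn't name it):

```java
// Hypothetical sanity check, assuming the misconfigured property was
// dfs.permissions.superusergroup (an assumption, not stated above).
// Prints the group the loaded HDFS configuration treats as the superuser
// group; the hard-coded fallback "supergroup" is the stock Hadoop default.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.HdfsConfiguration;

public class SupergroupCheck {
    public static void main(String[] args) {
        Configuration conf = new HdfsConfiguration(); // also loads hdfs-site.xml
        System.out.println(conf.get("dfs.permissions.superusergroup", "supergroup"));
    }
}
```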
05-18-2015 12:59 PM
I think the "drilling down into the individual map/reduce tasks" step is where this falls apart for me. When I click on the application (e.g. application_1431658373269_0170), it shows me a list of application masters. From there I can click on the node id (e.g. hadoopslave0011p1mdw1.sendgrid.net:8042). This takes me to a page where I can see all of the containers currently running and the logs for the node itself, which isn't what I need.

I can also click on "logs" for the application master. That takes me to a page that says:

Error getting logs for container_e23_1431658373269_0170_01_000001

which tells me that the cluster is misconfigured in some way and isn't even producing them. Can you recommend a next step?
05-15-2015 03:52 PM
This may be a complete noob question, but we're shifting from CDH4 MR1 to CDH5 MR2. In the previous version I had no problem navigating the menus to find the stdout and stderr output from individual mappers and reducers, but now I can't find them anywhere: not through the interface, not on the YARN nodes' disks, and not on HDFS. Could someone point me in the right direction?
Labels:
- Apache YARN
- HDFS
05-11-2015 10:26 AM
I'm on Cloudera 5.3.3. Here's my command line and output:

[hadoop]$ hdfs dfs -du /
2298676940886   6896030822658   /output
21297905593     63893716779     /tmp
6072184915396   18216555409976  /user
05-08-2015 08:27 AM
1 Kudo
I just switched from Cloudera 4 to Cloudera 5, and the output format of hdfs dfs -du has changed: it now has two numeric columns instead of just one. I'm guessing that the first is the actual content size and the second is the block storage consumption, but I can't find any documentation about this. Can anyone clarify and/or point me in the right direction?
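As a cross-check on that guess, the same two quantities appear to be exposed through the Java filesystem API via ContentSummary; a minimal sketch, with the path as a placeholder:

```java
// Cross-check sketch: ContentSummary exposes what appear to be the same two
// quantities as the new -du output: getLength() for raw content size and
// getSpaceConsumed() for storage used across all replicas. Path is a placeholder.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.ContentSummary;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class DuColumns {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        ContentSummary cs = fs.getContentSummary(new Path("/output"));
        System.out.println("content size:   " + cs.getLength());
        System.out.println("space consumed: " + cs.getSpaceConsumed());
    }
}
```

Consistent with that reading, the /output line in the follow-up post above shows the second column at exactly three times the first (6896030822658 = 3 × 2298676940886), which is what a 3x replication factor would produce.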
Labels:
- HDFS
05-01-2015 09:24 AM
I'm transferring files using distcp on Cloudera 5.3.x, and I can't get it to distribute the transfer using MR2. I don't have MR1 installed at all, and I'd rather not install it because it would hide issues. My command line looks like this, and it runs just fine; it just copies every file in series:

mapred distcp s3n://key:secret@logs.space.com/source/2015/04/28/ hdfs://nameservice/target/dir/2015/04/28

Is there a configuration item that I missed?
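For what it's worth, here is a sketch of driving DistCp through its Hadoop 2.x Java API, mainly to show where the parallelism knob lives (the same thing the -m command-line flag controls); the paths are placeholders, and this is not presented as the fix for the serial-copy behavior:

```java
// A sketch of DistCp via its Hadoop 2.x Java API, shown mainly to point at
// the parallelism knob (DistCpOptions#setMaxMaps, the equivalent of the -m
// flag). Paths are placeholders; this is an illustration, not a confirmed
// fix for the serial-copy behavior described above.
import java.util.Collections;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.tools.DistCp;
import org.apache.hadoop.tools.DistCpOptions;

public class DistCpSketch {
    public static void main(String[] args) throws Exception {
        DistCpOptions options = new DistCpOptions(
                Collections.singletonList(new Path("s3n://bucket/source/2015/04/28/")),
                new Path("hdfs://nameservice/target/dir/2015/04/28"));
        options.setMaxMaps(20); // upper bound on simultaneous copy maps
        new DistCp(new Configuration(), options).execute();
    }
}
```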
Labels:
- HDFS
04-28-2015 09:08 AM
This turned out to be a configuration setting issue. "dfs.namenode.shared.edits.dir" had a directory value in it, and needed to be cleared.