Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

How to execute a MapReduce job with Parquet?

New Member

I'm trying to run a test MapReduce job on Sandbox 2.6:

# export HADOOP_CLASSPATH=/usr/hdp/2.6.0.3-8/hadoop/lib
# yarn jar testmr.jar TestReadParquet /testdir/dir1 out_file
Exception in thread "main" java.lang.NoClassDefFoundError: parquet/Log
  at TestReadParquet.<clinit>(TestReadParquet.java:24)
  at java.lang.Class.forName0(Native Method)
  at java.lang.Class.forName(Class.java:348)
  at org.apache.hadoop.util.RunJar.run(RunJar.java:226)
  at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
Caused by: java.lang.ClassNotFoundException: parquet.Log
  at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
  ... 5 more

Should I put the parquet-common jars in HADOOP_CLASSPATH?

At the moment, find /usr/hdp/2.6.0.3-8/hadoop/lib/ -name "parquet*.jar" returns no files.
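
On the HADOOP_CLASSPATH question: a Java classpath entry that names a bare directory contributes only loose .class files, so jars have to be listed individually or through a dir/* wildcard. The sketch below builds such a list; jar_classpath is a hypothetical helper name, not a Hadoop command, and it assumes a find that supports -maxdepth.

```shell
# Build a colon-separated classpath from the jars in one directory.
# jar_classpath is a hypothetical helper, not part of Hadoop.
jar_classpath() {
    dir=$1                # directory holding the jars
    pattern=${2:-*.jar}   # file-name glob; defaults to every jar
    # List matching files, sort for a stable order, join with ':'.
    find "$dir" -maxdepth 1 -type f -name "$pattern" | sort | paste -sd: -
}
```

It could be used as export HADOOP_CLASSPATH="$(jar_classpath /usr/hdp/2.6.0.3-8/hive/lib 'parquet*.jar')". This only affects what the client JVM can see; whether the listed jars actually contain the class the job asks for is a separate question.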

1 ACCEPTED SOLUTION

Super Guru

@Triffids G if you have Hive installed on the sandbox, you can copy /usr/hdp/2.6.0.3-8/hive/lib/parquet-hadoop-bundle-1.8.1.jar into /usr/hdp/2.6.0.3-8/hadoop/lib. Then try running the job again and let me know how it goes.


3 REPLIES

New Member

The jar is copied, but I still get the same exception:

[root@sandbox target]# export HADOOP_CLASSPATH=/usr/hdp/2.6.0.3-8/hadoop/lib
[root@sandbox target]# find /usr/hdp/2.6.0.3-8/hadoop/lib/ -name "parquet*.jar"
/usr/hdp/2.6.0.3-8/hadoop/lib/parquet-hadoop-bundle-1.8.1.jar
[root@sandbox target]# yarn jar testmr-1.0-SNAPSHOT.jar TestReadParquet /testdir/dir1  /testdir/out_file
Exception in thread "main" java.lang.NoClassDefFoundError: parquet/Log
  at TestReadParquet.<clinit>(TestReadParquet.java:24)
  at java.lang.Class.forName0(Native Method)
  at java.lang.Class.forName(Class.java:348)
  at org.apache.hadoop.util.RunJar.run(RunJar.java:226)
  at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
Caused by: java.lang.ClassNotFoundException: parquet.Log
  at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
  ... 5 more
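
When a jar is in place but the class is still not found, it can help to check whether any jar actually contains the exact entry named in the exception. A minimal sketch, assuming unzip and awk are installed; jars_containing is a hypothetical helper name:

```shell
# Print every jar in a directory that contains an exact zip entry.
# jars_containing is a hypothetical helper; it relies on `unzip -l`.
jars_containing() {
    dir=$1      # directory to scan
    entry=$2    # entry path inside the jar, e.g. parquet/Log.class
    status=1    # stays 1 unless at least one jar matches
    for jar in "$dir"/*.jar; do
        [ -f "$jar" ] || continue   # glob matched nothing
        # The entry name is the last field of each `unzip -l` row.
        if unzip -l "$jar" 2>/dev/null |
            awk -v e="$entry" '$NF == e { hit = 1 } END { exit !hit }'; then
            printf '%s\n' "$jar"
            status=0
        fi
    done
    return "$status"
}
```

For this thread the call would be jars_containing /usr/hdp/2.6.0.3-8/hadoop/lib parquet/Log.class. Repeating it with org/apache/parquet/Log.class shows which package layout a given bundle uses.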

New Member

I've recompiled the sources with parquet.version 1.6 and copied parquet-hadoop-bundle-1.6.0.jar. The job is working now, thanks!
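
A likely explanation, not confirmed in this thread, is a package mismatch: Parquet's Java classes moved from parquet.* to org.apache.parquet.* in release 1.7.0, so a job that references parquet.Log needs a pre-1.7 bundle at run time. The sketch below encodes only that rule; parquet_package is a hypothetical helper name, and it assumes a plain major.minor[.patch] version string.

```shell
# Print the root Java package used by a given Parquet release.
# parquet_package is a hypothetical helper used only for illustration.
parquet_package() {
    version=$1
    major=${version%%.*}   # "1.8.1" -> "1"
    rest=${version#*.}     # "1.8.1" -> "8.1"
    minor=${rest%%.*}      # "8.1"   -> "8"
    # Classes were renamed to org.apache.parquet.* starting with 1.7.0.
    if [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "$minor" -ge 7 ]; }; then
        echo "org.apache.parquet"
    else
        echo "parquet"
    fi
}
```

Comparing parquet_package for the version a job was compiled against with the version of the bundle on the classpath is a quick way to spot this kind of mismatch.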