How to execute an MR job with Parquet?

Contributor

I'm trying to run a test MapReduce job on Sandbox 2.6:

# export HADOOP_CLASSPATH=/usr/hdp/2.6.0.3-8/hadoop/lib
# yarn jar testmr.jar TestReadParquet /testdir/dir1 out_file
Exception in thread "main" java.lang.NoClassDefFoundError: parquet/Log
  at TestReadParquet.<clinit>(TestReadParquet.java:24)
  at java.lang.Class.forName0(Native Method)
  at java.lang.Class.forName(Class.java:348)
  at org.apache.hadoop.util.RunJar.run(RunJar.java:226)
  at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
Caused by: java.lang.ClassNotFoundException: parquet.Log
  at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
  ... 5 more

Should I put the parquet-common jars in HADOOP_CLASSPATH?

Running find /usr/hdp/2.6.0.3-8/hadoop/lib/ -name "parquet*.jar" returns no files at the moment.

1 ACCEPTED SOLUTION

Super Guru

@Triffids G If you have Hive installed on the sandbox, you can copy /usr/hdp/2.6.0.3-8/hive/lib/parquet-hadoop-bundle-1.8.1.jar to /usr/hdp/2.6.0.3-8/hadoop/lib, then try running the job again. Let me know how it goes.

Contributor

I copied the jar, but I still get the same exception:

[root@sandbox target]# export HADOOP_CLASSPATH=/usr/hdp/2.6.0.3-8/hadoop/lib
[root@sandbox target]# find /usr/hdp/2.6.0.3-8/hadoop/lib/ -name "parquet*.jar"
/usr/hdp/2.6.0.3-8/hadoop/lib/parquet-hadoop-bundle-1.8.1.jar
[root@sandbox target]# yarn jar testmr-1.0-SNAPSHOT.jar TestReadParquet /testdir/dir1  /testdir/out_file
Exception in thread "main" java.lang.NoClassDefFoundError: parquet/Log
  at TestReadParquet.<clinit>(TestReadParquet.java:24)
  at java.lang.Class.forName0(Native Method)
  at java.lang.Class.forName(Class.java:348)
  at org.apache.hadoop.util.RunJar.run(RunJar.java:226)
  at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
Caused by: java.lang.ClassNotFoundException: parquet.Log
  at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
  ... 5 more

Contributor

I've recompiled the sources with parquet.version 1.6 and copied parquet-hadoop-bundle-1.6.0.jar. The job is working now, thanks!
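
A likely explanation for why the 1.8.1 bundle did not help: the exception asks for parquet.Log, and Parquet moved its classes from the parquet package to org.apache.parquet in release 1.7.0, so a 1.8.1 jar would not be expected to contain that class under the old name. The sketch below is one way to check which package a given jar provides. The function name parquet_log_package is hypothetical (it is not part of Hadoop or Parquet), and it assumes the jar's entry names are piped in one per line, for example from jar tf.

```shell
# Hypothetical helper (not part of Hadoop or Parquet): reads jar entry names
# on stdin, one per line, and prints the package that provides the Log class.
parquet_log_package() {
  entries=$(cat)
  if printf '%s\n' "$entries" | grep -qx 'parquet/Log\.class'; then
    echo "parquet"              # old package name, the one the failing job asks for
  elif printf '%s\n' "$entries" | grep -qx 'org/apache/parquet/Log\.class'; then
    echo "org.apache.parquet"   # package name used after the rename
  else
    echo "none"                 # no Log class found under either name
  fi
}

# Usage, assuming a JDK is on the PATH so that "jar tf" can list the entries:
#   jar tf /usr/hdp/2.6.0.3-8/hadoop/lib/parquet-hadoop-bundle-1.8.1.jar | parquet_log_package
```

If the result does not match the package in the stack trace, the job and the jar were built against different Parquet lines, and either the job has to be recompiled or a matching jar has to be supplied.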