Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

hive complains about hadoop version if spark is configured in yarn cluster mode

Highlighted

hive complains about hadoop version if spark is configured in yarn cluster mode

New Contributor

Dear hortonworks community,


I submit jobs to spark using zeppelin and I would like to use the cluster nodes as a spark drivers so I don't need a fat machine to run zeppelin and the user code.


I went to zeppelin documentation and did the changes required. However I get the error below upon spark submission.

java.lang.IllegalArgumentException: Unrecognized Hadoop major version number: 3.1.1.3.1.0.0-78
    at org.apache.hadoop.hive.shims.ShimLoader.getMajorVersion(ShimLoader.java:174)
    at org.apache.hadoop.hive.shims.ShimLoader.loadShims(ShimLoader.java:139)
    at org.apache.hadoop.hive.shims.ShimLoader.getHadoopShims(ShimLoader.java:100)
    at org.apache.hadoop.hive.conf.HiveConf$ConfVars.<clinit>(HiveConf.java:368)
    at org.apache.hadoop.hive.conf.HiveConf.<clinit>(HiveConf.java:105)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:348)
    at org.apache.spark.util.Utils$.classForName(Utils.scala:230)
    at org.apache.spark.sql.SparkSession$.hiveClassesArePresent(SparkSession.scala:1063)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.zeppelin.spark.BaseSparkScalaInterpreter.spark2CreateContext(BaseSparkScalaInterpreter.scala:230)
    at org.apache.zeppelin.spark.BaseSparkScalaInterpreter.createSparkContext(BaseSparkScalaInterpreter.scala:165)
    at org.apache.zeppelin.spark.SparkScala211Interpreter.open(SparkScala211Interpreter.scala:87)
    at org.apache.zeppelin.spark.NewSparkInterpreter.open(NewSparkInterpreter.java:102)
    at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:62)
    at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
    at org.apache.zeppelin.spark.PySparkInterpreter.getSparkInterpreter(PySparkInterpreter.java:664)
    at org.apache.zeppelin.spark.PySparkInterpreter.createGatewayServerAndStartScript(PySparkInterpreter.java:260)
    at org.apache.zeppelin.spark.PySparkInterpreter.open(PySparkInterpreter.java:194)
    at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:617)
    at org.apache.zeppelin.scheduler.Job.run(Job.java:188)
    at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:140)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)


Any advice in regards how can I make the changes requests?


thank you very much