Community Articles
Find and share helpful community-sourced technical articles
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.
Labels (1)
Super Guru

Due to conflict in Jackson jar versions, Oozie job with spark2 action(spark action with spark2 sharelib) may get failed with below error:

  2018-06-05 16:53:04,567 [Thread-20] INFO  org.apache.spark.SparkContext  - Created broadcast 0 from showString at NativeMethodAccessorImpl.java:0
  Traceback (most recent call last):
    File "/grid/9/hadoop/yarn/local/usercache/XXXX/appcache/application_1528131553123_0280/container_e81_1528131553123_0280_01_000002/stg_gl_account_classification_master.py", line 9, in <module>
      gacm.show()
    File "/grid/9/hadoop/yarn/local/usercache/XXXX/appcache/application_1528131553123_0280/container_e81_1528131553123_0280_01_000002/python/lib/pyspark.zip/pyspark/sql/dataframe.py", line 318, in show
    File "/grid/9/hadoop/yarn/local/usercache/XXXX/appcache/application_1528131553123_0280/container_e81_1528131553123_0280_01_000002/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 1133, in __call__
    File "/grid/9/hadoop/yarn/local/usercache/XXXX/appcache/application_1528131553123_0280/container_e81_1528131553123_0280_01_000002/python/lib/pyspark.zip/pyspark/sql/utils.py", line 63, in deco
    File "/grid/9/hadoop/yarn/local/usercache/XXXX/appcache/application_1528131553123_0280/container_e81_1528131553123_0280_01_000002/python/lib/py4j-0.10.4-src.zip/py4j/protocol.py", line 319, in get_return_value
  py4j.protocol.Py4JJavaError: An error occurred while calling o35.showString.
  : java.lang.ExceptionInInitializerError
  at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:132)
  at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:113)
  at org.apache.spark.sql.execution.SparkPlan.getByteArrayRdd(SparkPlan.scala:225)
  at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:308)
  at org.apache.spark.sql.execution.CollectLimitExec.executeCollect(limit.scala:38)
  at org.apache.spark.sql.Dataset$anonfun$org$apache$spark$sql$Dataset$execute$1$1.apply(Dataset.scala:2386)
  at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:57)
  at org.apache.spark.sql.Dataset.withNewExecutionId(Dataset.scala:2788)
  at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$execute$1(Dataset.scala:2385)
  at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$collect(Dataset.scala:2392)
  at org.apache.spark.sql.Dataset$anonfun$head$1.apply(Dataset.scala:2128)
  at org.apache.spark.sql.Dataset$anonfun$head$1.apply(Dataset.scala:2127)
  at org.apache.spark.sql.Dataset.withTypedCallback(Dataset.scala:2818)
  at org.apache.spark.sql.Dataset.head(Dataset.scala:2127)
  at org.apache.spark.sql.Dataset.take(Dataset.scala:2342)
  at org.apache.spark.sql.Dataset.showString(Dataset.scala:248)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:498)
  at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
  at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
  at py4j.Gateway.invoke(Gateway.java:280)
  at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
  at py4j.commands.CallCommand.execute(CallCommand.java:79)
  at py4j.GatewayConnection.run(GatewayConnection.java:214)
  at java.lang.Thread.run(Thread.java:745)
  Caused by: com.fasterxml.jackson.databind.JsonMappingException: Jackson version is too old 2.4.4
  at com.fasterxml.jackson.module.scala.JacksonModule$class.setupModule(JacksonModule.scala:56)
  at com.fasterxml.jackson.module.scala.DefaultScalaModule.setupModule(DefaultScalaModule.scala:19)
  at com.fasterxml.jackson.databind.ObjectMapper.registerModule(ObjectMapper.java:549)
  at org.apache.spark.rdd.RDDOperationScope$.<init>(RDDOperationScope.scala:82)
  at org.apache.spark.rdd.RDDOperationScope$.<clinit>(RDDOperationScope.scala)
  ... 27 more

.

Why this error?

By default, 'oozie' directory in Oozie sharelib has jackson jars with 2.4.4 version and spark2 sharelib has latest versions of jackson jars.

.

To fix this error, please follow below steps:

Step 1: Move older jackson jars from default oozie sharelib to other directory:

hadoop fs -mv /user/oozie/share/lib/lib_<ts>/oozie/jackson*/user/oozie/share/lib/lib_<ts>/oozie.old

.

Step 2: Update oozie sharelib:

oozie admin -oozie http://<oozie-server-hostname>:11000/oozie -sharelibupdate

.

Please check this article for more details about oozie spark2 action.

.

Please comment if you have any feedback/questions/suggestions. Happy Hadooping!! :)

1,057 Views
0 Kudos
Don't have an account?
Coming from Hortonworks? Activate your account here
Version history
Revision #:
1 of 1
Last update:
‎06-11-2018 11:58 PM
Updated by:
 
Contributors
Top Kudoed Authors