Community Articles
Find and share helpful community-sourced technical articles.
Announcements
Check out our newest addition to the community, the Cloudera Innovation Accelerator group hub.
Labels (1)
Super Guru

Due to conflict in Jackson jar versions, Oozie job with spark2 action(spark action with spark2 sharelib) may get failed with below error:

  2018-06-05 16:53:04,567 [Thread-20] INFO  org.apache.spark.SparkContext  - Created broadcast 0 from showString at NativeMethodAccessorImpl.java:0
  Traceback (most recent call last):
    File "/grid/9/hadoop/yarn/local/usercache/XXXX/appcache/application_1528131553123_0280/container_e81_1528131553123_0280_01_000002/stg_gl_account_classification_master.py", line 9, in <module>
      gacm.show()
    File "/grid/9/hadoop/yarn/local/usercache/XXXX/appcache/application_1528131553123_0280/container_e81_1528131553123_0280_01_000002/python/lib/pyspark.zip/pyspark/sql/dataframe.py", line 318, in show
    File "/grid/9/hadoop/yarn/local/usercache/XXXX/appcache/application_1528131553123_0280/container_e81_1528131553123_0280_01_000002/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 1133, in __call__
    File "/grid/9/hadoop/yarn/local/usercache/XXXX/appcache/application_1528131553123_0280/container_e81_1528131553123_0280_01_000002/python/lib/pyspark.zip/pyspark/sql/utils.py", line 63, in deco
    File "/grid/9/hadoop/yarn/local/usercache/XXXX/appcache/application_1528131553123_0280/container_e81_1528131553123_0280_01_000002/python/lib/py4j-0.10.4-src.zip/py4j/protocol.py", line 319, in get_return_value
  py4j.protocol.Py4JJavaError: An error occurred while calling o35.showString.
  : java.lang.ExceptionInInitializerError
  at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:132)
  at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:113)
  at org.apache.spark.sql.execution.SparkPlan.getByteArrayRdd(SparkPlan.scala:225)
  at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:308)
  at org.apache.spark.sql.execution.CollectLimitExec.executeCollect(limit.scala:38)
  at org.apache.spark.sql.Dataset$anonfun$org$apache$spark$sql$Dataset$execute$1$1.apply(Dataset.scala:2386)
  at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:57)
  at org.apache.spark.sql.Dataset.withNewExecutionId(Dataset.scala:2788)
  at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$execute$1(Dataset.scala:2385)
  at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$collect(Dataset.scala:2392)
  at org.apache.spark.sql.Dataset$anonfun$head$1.apply(Dataset.scala:2128)
  at org.apache.spark.sql.Dataset$anonfun$head$1.apply(Dataset.scala:2127)
  at org.apache.spark.sql.Dataset.withTypedCallback(Dataset.scala:2818)
  at org.apache.spark.sql.Dataset.head(Dataset.scala:2127)
  at org.apache.spark.sql.Dataset.take(Dataset.scala:2342)
  at org.apache.spark.sql.Dataset.showString(Dataset.scala:248)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:498)
  at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
  at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
  at py4j.Gateway.invoke(Gateway.java:280)
  at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
  at py4j.commands.CallCommand.execute(CallCommand.java:79)
  at py4j.GatewayConnection.run(GatewayConnection.java:214)
  at java.lang.Thread.run(Thread.java:745)
  Caused by: com.fasterxml.jackson.databind.JsonMappingException: Jackson version is too old 2.4.4
  at com.fasterxml.jackson.module.scala.JacksonModule$class.setupModule(JacksonModule.scala:56)
  at com.fasterxml.jackson.module.scala.DefaultScalaModule.setupModule(DefaultScalaModule.scala:19)
  at com.fasterxml.jackson.databind.ObjectMapper.registerModule(ObjectMapper.java:549)
  at org.apache.spark.rdd.RDDOperationScope$.<init>(RDDOperationScope.scala:82)
  at org.apache.spark.rdd.RDDOperationScope$.<clinit>(RDDOperationScope.scala)
  ... 27 more

.

Why this error?

By default, 'oozie' directory in Oozie sharelib has jackson jars with 2.4.4 version and spark2 sharelib has latest versions of jackson jars.

.

To fix this error, please follow below steps:

Step 1: Move older jackson jars from default oozie sharelib to other directory:

hadoop fs -mv /user/oozie/share/lib/lib_<ts>/oozie/jackson*/user/oozie/share/lib/lib_<ts>/oozie.old

.

Step 2: Update oozie sharelib:

oozie admin -oozie http://<oozie-server-hostname>:11000/oozie -sharelibupdate

.

Please check this article for more details about oozie spark2 action.

.

Please comment if you have any feedback/questions/suggestions. Happy Hadooping!! :)

2,484 Views
0 Kudos
Don't have an account?
Version history
Last update:
‎06-11-2018 11:58 PM
Updated by:
Contributors
Top Kudoed Authors