08-23-2021 06:07 PM
I am using Spark 2.4.0 on CDH 6.3.4 and ran into the following java.lang.ClassCastException:

Caused by: java.lang.ClassCastException: cannot assign instance of org.apache.commons.lang3.time.FastDateFormat to field org.apache.spark.sql.catalyst.csv.CSVOptions.dateFormat of type org.apache.commons.lang3.time.FastDateFormat in instance of org.apache.spark.sql.catalyst.csv.CSVOptions
    at java.io.ObjectStreamClass$FieldReflector.setObjFieldValues(ObjectStreamClass.java:2301)
    at java.io.ObjectStreamClass.setObjFieldValues(ObjectStreamClass.java:1431)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2371)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2289)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2147)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1646)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2365)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2289)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2147)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1646)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2365)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2289)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2147)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1646)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2365)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2289)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2147)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1646)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2365)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2289)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2147)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1646)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:482)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:440)
    at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:75)
    at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:114)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:83)
    at org.apache.spark.scheduler.Task.run(Task.scala:121)
    at org.apache.spark.executor.Executor$TaskRunner$$anonfun$11.apply(Executor.scala:407)
    at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1408)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:413)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

Finally I was able to resolve the issue. I was depending on org.apache.spark:spark-core_2.11:jar:2.4.0-cdh6.3.4 with scope provided. Even though spark-core itself is declared as provided, some of its transitive dependencies still resolve with compile scope, and org.apache.commons:commons-lang3:jar:3.7 is one of them. If the platform supplies commons-lang3 but a second copy also gets packaged inside your fat jar, the same class ends up loaded by two different classloaders, which is exactly what this "cannot assign instance of X to field of type X" ClassCastException indicates.
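A quick way to confirm which scope a transitive dependency resolves to in your build is Maven's standard dependency tree goal; the include filter below is just scoped to this case's artifact:

    mvn dependency:tree -Dincludes=org.apache.commons:commons-lang3

If commons-lang3 shows up under spark-core at compile scope, it will be bundled into the fat jar alongside the platform's copy.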
Therefore I explicitly forced the scope of a few jars to provided, as listed below (see the POM snippet that follows):
org.apache.commons:commons-lang3:3.7
org.apache.zookeeper:zookeeper:3.4.5-cdh6.3.4
io.dropwizard.metrics:metrics-core:3.1.5
com.fasterxml.jackson.core:jackson-databind:2.9.10.6
org.apache.commons:commons-crypto:1.0.0

By doing this, the application is forced to use the commons-lang3 jar provided by the platform. POM snippet to solve the issue:

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_${scala.binary.version}</artifactId>
    <version>${spark.core.version}</version>
    <scope>provided</scope>
</dependency>
<!-- Declaring the following dependencies explicitly as provided, since they are not declared as provided by spark-core -->
<!-- Start -->
<dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-lang3</artifactId>
    <version>3.7</version>
    <scope>provided</scope>
</dependency>
<dependency>
    <groupId>org.apache.zookeeper</groupId>
    <artifactId>zookeeper</artifactId>
    <version>3.4.5-cdh6.3.4</version>
    <scope>provided</scope>
</dependency>
<dependency>
    <groupId>io.dropwizard.metrics</groupId>
    <artifactId>metrics-core</artifactId>
    <version>3.1.5</version>
    <scope>provided</scope>
</dependency>
<dependency>
    <groupId>com.fasterxml.jackson.core</groupId>
    <artifactId>jackson-databind</artifactId>
    <version>2.9.10.6</version>
    <scope>provided</scope>
</dependency>
<dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-crypto</artifactId>
    <version>1.0.0</version>
    <scope>provided</scope>
</dependency>
<!-- End -->
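As an alternative to re-declaring each dependency, if your fat jar is built with maven-shade-plugin you could instead exclude the artifact from the shaded jar. This is an untested sketch, not part of my original fix:

<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-shade-plugin</artifactId>
    <configuration>
        <artifactSet>
            <!-- Untested sketch: keep the platform-provided commons-lang3 out of the shaded jar -->
            <excludes>
                <exclude>org.apache.commons:commons-lang3</exclude>
            </excludes>
        </artifactSet>
    </configuration>
</plugin>

Either way, the goal is the same: only one copy of commons-lang3, the one on the cluster classpath, is visible at runtime.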