Latest HDP 2.6.5.0-292: DataFrame show() throws an error

Hi,

I'm using the latest HDP, version 2.6.5.0-292, with Spark 2.3.0.

Whenever I call show() on any DataFrame, it throws this error:

scala> spark.read.csv("/user/a.txt").show()

java.lang.NoSuchMethodError: net.jpountz.lz4.LZ4BlockInputStream.<init>(Ljava/io/InputStream;Z)V
  at org.apache.spark.io.LZ4CompressionCodec.compressedInputStream(CompressionCodec.scala:122)
  at org.apache.spark.sql.execution.SparkPlan.org$apache$spark$sql$execution$SparkPlan$decodeUnsafeRows(SparkPlan.scala:274)
  at org.apache.spark.sql.execution.SparkPlan$anonfun$executeTake$1.apply(SparkPlan.scala:366)
  at org.apache.spark.sql.execution.SparkPlan$anonfun$executeTake$1.apply(SparkPlan.scala:366)
  at scala.collection.TraversableLike$anonfun$flatMap$1.apply(TraversableLike.scala:241)
  at scala.collection.TraversableLike$anonfun$flatMap$1.apply(TraversableLike.scala:241)
  at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
  at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
  at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
  at scala.collection.mutable.ArrayOps$ofRef.flatMap(ArrayOps.scala:186)
  at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:366)
  at org.apache.spark.sql.execution.CollectLimitExec.executeCollect(limit.scala:38)
  at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$collectFromPlan(Dataset.scala:3272)
  at org.apache.spark.sql.Dataset$anonfun$head$1.apply(Dataset.scala:2484)
  at org.apache.spark.sql.Dataset$anonfun$head$1.apply(Dataset.scala:2484)
  at org.apache.spark.sql.Dataset$anonfun$52.apply(Dataset.scala:3253)
  at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77)
  at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3252)
  at org.apache.spark.sql.Dataset.head(Dataset.scala:2484)
  at org.apache.spark.sql.Dataset.take(Dataset.scala:2698)
  at org.apache.spark.sql.execution.datasources.csv.TextInputCSVDataSource$.infer(CSVDataSource.scala:148)
  at org.apache.spark.sql.execution.datasources.csv.CSVDataSource.inferSchema(CSVDataSource.scala:63)
  at org.apache.spark.sql.execution.datasources.csv.CSVFileFormat.inferSchema(CSVFileFormat.scala:57)
  at org.apache.spark.sql.execution.datasources.DataSource$anonfun$8.apply(DataSource.scala:202)
  at org.apache.spark.sql.execution.datasources.DataSource$anonfun$8.apply(DataSource.scala:202)
  at scala.Option.orElse(Option.scala:289)
  at org.apache.spark.sql.execution.datasources.DataSource.getOrInferFileFormatSchema(DataSource.scala:201)
  at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:392)
  at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:239)
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:227)
  at org.apache.spark.sql.DataFrameReader.csv(DataFrameReader.scala:596)
  at org.apache.spark.sql.DataFrameReader.csv(DataFrameReader.scala:473)


I've tried both pyspark and spark-shell on three freshly installed HDP 2.6.5.0-292 clusters.

Writing a DataFrame works fine; only show() throws the error.

Has anyone else run into this issue? How can I fix it?

1 ACCEPTED SOLUTION

Re: Latest HDP 2.6.5.0-292: DataFrame show() throws an error

@dalin qin This type of error is usually caused by multiple versions of the same jar on the classpath. Could you run

lsof -P -p <pid> | grep lz4

This should show where the lz4 jar is being loaded from; most likely an incorrect version is being picked up. Note: <pid> is the process ID of the spark-shell process.

HTH
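
As a complement to the lsof check above, here is a minimal sketch that scans jar directories for lz4 jars, so that conflicting copies can be spotted before the shell is even started. The function names list_matching_jars and distinct_jar_names are hypothetical helpers written for this illustration, and the directories in the usage note are assumptions that need to be adapted to the actual installation.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of HDP or Spark) for spotting duplicate jars.

# List every jar under the given directories whose file name contains a pattern.
# Usage: list_matching_jars <pattern> <dir> [<dir> ...]
list_matching_jars() {
  local pattern="$1"
  shift
  # -iname matches case-insensitively; sort keeps the output stable.
  find "$@" -type f -iname "*${pattern}*.jar" 2>/dev/null | sort
}

# Print the distinct jar file names (directories stripped) for the same search.
# More than one name in the output suggests that several versions are present.
distinct_jar_names() {
  list_matching_jars "$@" | sed 's|.*/||' | sort -u
}
```

For example, `list_matching_jars lz4 /usr/hdp/current/spark2-client/jars /path/to/extra/jars` would list the candidates, assuming those are the directories that end up on the classpath. If distinct_jar_names prints more than one lz4 jar name, removing the extra one from the classpath is the likely fix.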

Re: Latest HDP 2.6.5.0-292: DataFrame show() throws an error

Thank you very much, that's my bad. I had added some other jars to my classpath, which led to this error.
