Spark Weird Error

Master Guru

import org.apache.spark.SparkContext

import org.apache.spark.sql.SQLContext

val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)

val df = sqlContext.table("tablename")

df.select("location").show(5)

java.lang.AssertionError: assertion failed
at scala.Predef$.assert(Predef.scala:165)
at org.apache.spark.sql.execution.datasources.LogicalRelation$anonfun$1.apply(LogicalRelation.scala:39)
at org.apache.spark.sql.execution.datasources.LogicalRelation$anonfun$1.apply(LogicalRelation.scala:38)
at scala.Option.map(Option.scala:145)
at org.apache.spark.sql.execution.datasources.LogicalRelation.<init>(LogicalRelation.scala:38)
at org.apache.spark.sql.execution.datasources.LogicalRelation.copy(LogicalRelation.scala:31)
at org.apache.spark.sql.hive.HiveMetastoreCatalog.org$apache$spark$sql$hive$HiveMetastoreCatalog$convertToOrcRelation(HiveMetastoreCatalog.scala:588)
at org.apache.spark.sql.hive.HiveMetastoreCatalog$OrcConversions$anonfun$apply$2.applyOrElse(HiveMetastoreCatalog.scala:647)
at org.apache.spark.sql.hive.HiveMetastoreCatalog$OrcConversions$anonfun$apply$2.applyOrElse(HiveMetastoreCatalog.scala:643)
at org.apache.spark.sql.catalyst.trees.TreeNode$anonfun$transformUp$1.apply(TreeNode.scala:335)
at org.apache.spark.sql.catalyst.trees.TreeNode$anonfun$transformUp$1.apply(TreeNode.scala:335)
at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:69)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:334)
at org.apache.spark.sql.hive.HiveMetastoreCatalog$OrcConversions$.apply(HiveMetastoreCatalog.scala:643)
at org.apache.spark.sql.hive.HiveMetastoreCatalog$OrcConversions$.apply(HiveMetastoreCatalog.scala:637)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$anonfun$execute$1$anonfun$apply$1.apply(RuleExecutor.scala:83)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$anonfun$execute$1$anonfun$apply$1.apply(RuleExecutor.scala:80)
at scala.collection.LinearSeqOptimized$class.foldLeft(LinearSeqOptimized.scala:111)
at scala.collection.immutable.List.foldLeft(List.scala:84)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$anonfun$execute$1.apply(RuleExecutor.scala:80)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$anonfun$execute$1.apply(RuleExecutor.scala:72)
at scala.collection.immutable.List.foreach(List.scala:318)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:72)
at org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:36)
at org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:36)
at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:34)
at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:133)
at org.apache.spark.sql.DataFrame$.apply(DataFrame.scala:52)
at org.apache.spark.sql.SQLContext.table(SQLContext.scala:831)
at org.apache.spark.sql.SQLContext.table(SQLContext.scala:827)

1 ACCEPTED SOLUTION

This is pretty low level. Looking at the source, the failing assertion appears to be

assert(expectedAttrs.length == attrs.length)

What does that mean? I'm not entirely sure. A Google search turns up:

1. Stack Overflow: http://stackoverflow.com/questions/38740862/not-able-to-fetch-result-from-hive-transaction-enabled-t...

2. SPARK-18355: Spark SQL fails to read data from a ORC hive table that has a new column added to it

If #2 is the cause, there's no obvious workaround right now. #1 has some details on how you might avoid the problem.
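
For what it's worth, here is a toy sketch of what that assertion seems to be checking. It is plain Scala, not Spark code: `Column`, `SchemaCheck` and the two schemas are made up for illustration. The reading that `expectedAttrs` comes from the Hive metastore definition and `attrs` from the ORC relation is my assumption, based on the stack trace and on the SPARK-18355 title.

```scala
// Hypothetical stand-in for a column; this is not a Spark class.
final case class Column(name: String, dataType: String)

object SchemaCheck {
  // Same shape as the failing check: both sides must have the same number of columns.
  def sameColumnCount(expectedAttrs: Seq[Column], attrs: Seq[Column]): Boolean =
    expectedAttrs.length == attrs.length

  // Column names that appear in the expected schema but not in the file schema.
  def missingInFiles(expectedAttrs: Seq[Column], attrs: Seq[Column]): Seq[String] = {
    val fileNames = attrs.map(_.name).toSet
    expectedAttrs.map(_.name).filterNot(fileNames.contains)
  }
}

object SchemaCheckDemo extends App {
  // Made-up schemas: the table definition gained a column after the files were written.
  val metastoreSchema = Seq(
    Column("id", "int"),
    Column("location", "string"),
    Column("added_later", "string"))
  val orcFileSchema = Seq(
    Column("id", "int"),
    Column("location", "string"))

  // A false result here corresponds to the situation where the assert would fail.
  println(SchemaCheck.sameColumnCount(metastoreSchema, orcFileSchema))
  println(SchemaCheck.missingInFiles(metastoreSchema, orcFileSchema))
}
```

If that reading is right, any table whose metastore column count differs from what Spark infers from the ORC files would hit the same assertion, which would fit both links above.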

6 REPLIES

@Timothy Spann Can you confirm whether #2 is your case? If yes, I've got a workaround.

Master Guru

It's number 1.

New Contributor

Sandeep, we're facing a similar issue while using Zeppelin and believe it's because of #2:

%sql select col1, col2 from db.orc_table where col2 = "val" and col1 = 12345

java.lang.AssertionError: assertion failed
at scala.Predef$.assert(Predef.scala:165)
at org.apache.spark.sql.execution.datasources.LogicalRelation$anonfun$1.apply(LogicalRelation.scala:39)
at org.apache.spark.sql.execution.datasources.LogicalRelation$anonfun$1.apply(LogicalRelation.scala:38)
at scala.Option.map(Option.scala:145)
at org.apache.spark.sql.execution.datasources.LogicalRelation.<init>(LogicalRelation.scala:38)
at org.apache.spark.sql.execution.datasources.LogicalRelation.copy(LogicalRelation.scala:31)
at org.apache.spark.sql.hive.HiveMetastoreCatalog.org$apache$spark$sql$hive$HiveMetastoreCatalog$convertToOrcRelation(HiveMetastoreCatalog.scala:588)
at org.apache.spark.sql.hive.HiveMetastoreCatalog$OrcConversions$anonfun$apply$2.applyOrElse(HiveMetastoreCatalog.scala:647)
at org.apache.spark.sql.hive.HiveMetastoreCatalog$OrcConversions$anonfun$apply$2.applyOrElse(HiveMetastoreCatalog.scala:643)

Can you please share your workaround?

@James Barney

Just realised that the workaround for 1 and 2 is the same: set "spark.sql.hive.convertMetastoreOrc" to "false".
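
A minimal sketch of how that could look in spark-shell, reusing the snippet from the question. It assumes `sc` is the SparkContext the shell provides and that "tablename" and "location" are the placeholders from the original post. The setting is applied before the table is looked up, since the stack trace shows the failure happening inside `sqlContext.table`.

```scala
import org.apache.spark.sql.hive.HiveContext

// `sc` is assumed to be the SparkContext created by spark-shell.
val sqlContext = new HiveContext(sc)

// Disable the conversion of metastore ORC tables to Spark's built-in ORC relation,
// so the table is read through the Hive SerDe instead.
sqlContext.setConf("spark.sql.hive.convertMetastoreOrc", "false")

// Same calls as in the question, now issued after the setting is in place.
val df = sqlContext.table("tablename")
df.select("location").show(5)
```

Reading through the Hive SerDe may be slower than the built-in ORC path, so it is probably worth scoping this setting to the jobs that need it.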

New Contributor

Thank you! For future reference, in Zeppelin you set this property in the interpreter configuration, not in the paragraph where the SQL is executed.