
Spark Weird Error

Master Guru

import org.apache.spark.SparkContext
import org.apache.spark.sql.SQLContext

val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
val df = sqlContext.table("tablename")
df.select("location").show(5)

java.lang.AssertionError: assertion failed
	at scala.Predef$.assert(Predef.scala:165)
	at org.apache.spark.sql.execution.datasources.LogicalRelation$anonfun$1.apply(LogicalRelation.scala:39)
	at org.apache.spark.sql.execution.datasources.LogicalRelation$anonfun$1.apply(LogicalRelation.scala:38)
	at scala.Option.map(Option.scala:145)
	at org.apache.spark.sql.execution.datasources.LogicalRelation.<init>(LogicalRelation.scala:38)
	at org.apache.spark.sql.execution.datasources.LogicalRelation.copy(LogicalRelation.scala:31)
	at org.apache.spark.sql.hive.HiveMetastoreCatalog.org$apache$spark$sql$hive$HiveMetastoreCatalog$convertToOrcRelation(HiveMetastoreCatalog.scala:588)
	at org.apache.spark.sql.hive.HiveMetastoreCatalog$OrcConversions$anonfun$apply$2.applyOrElse(HiveMetastoreCatalog.scala:647)
	at org.apache.spark.sql.hive.HiveMetastoreCatalog$OrcConversions$anonfun$apply$2.applyOrElse(HiveMetastoreCatalog.scala:643)
	at org.apache.spark.sql.catalyst.trees.TreeNode$anonfun$transformUp$1.apply(TreeNode.scala:335)
	at org.apache.spark.sql.catalyst.trees.TreeNode$anonfun$transformUp$1.apply(TreeNode.scala:335)
	at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:69)
	at org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:334)
	at org.apache.spark.sql.hive.HiveMetastoreCatalog$OrcConversions$.apply(HiveMetastoreCatalog.scala:643)
	at org.apache.spark.sql.hive.HiveMetastoreCatalog$OrcConversions$.apply(HiveMetastoreCatalog.scala:637)
	at org.apache.spark.sql.catalyst.rules.RuleExecutor$anonfun$execute$1$anonfun$apply$1.apply(RuleExecutor.scala:83)
	at org.apache.spark.sql.catalyst.rules.RuleExecutor$anonfun$execute$1$anonfun$apply$1.apply(RuleExecutor.scala:80)
	at scala.collection.LinearSeqOptimized$class.foldLeft(LinearSeqOptimized.scala:111)
	at scala.collection.immutable.List.foldLeft(List.scala:84)
	at org.apache.spark.sql.catalyst.rules.RuleExecutor$anonfun$execute$1.apply(RuleExecutor.scala:80)
	at org.apache.spark.sql.catalyst.rules.RuleExecutor$anonfun$execute$1.apply(RuleExecutor.scala:72)
	at scala.collection.immutable.List.foreach(List.scala:318)
	at org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:72)
	at org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:36)
	at org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:36)
	at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:34)
	at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:133)
	at org.apache.spark.sql.DataFrame$.apply(DataFrame.scala:52)
	at org.apache.spark.sql.SQLContext.table(SQLContext.scala:831)
	at org.apache.spark.sql.SQLContext.table(SQLContext.scala:827)

1 ACCEPTED SOLUTION


This error is pretty low level. Looking into the source, it looks like the failing assertion is:

assert(expectedAttrs.length == attrs.length)

What does that mean? I'm not entirely sure. A quick Google search turns up two likely matches:

1. Stack Overflow: http://stackoverflow.com/questions/38740862/not-able-to-fetch-result-from-hive-transaction-enabled-t...

2. SPARK-18355: Spark SQL fails to read data from an ORC Hive table that has a new column added to it

If #2 is the cause, there's no obvious workaround right now. The Stack Overflow thread in #1 has some details on how the problem might be avoided.
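If it helps to check whether #2 applies, here is a minimal sketch of the scenario SPARK-18355 describes: an ORC Hive table whose metastore schema gains a column after data has already been written. The table name orc_demo is hypothetical, purely for illustration; run it in a spark-shell where sqlContext is a HiveContext.

// Create an ORC table and write a row while it still has the original two-column schema
sqlContext.sql("CREATE TABLE orc_demo (id INT, location STRING) STORED AS ORC")
import sqlContext.implicits._
Seq((1, "somewhere")).toDF("id", "location").write.insertInto("orc_demo")

// Add a column: the metastore schema now has three columns, the existing ORC files only two
sqlContext.sql("ALTER TABLE orc_demo ADD COLUMNS (extra STRING)")

// On affected Spark versions this read can fail with the same AssertionError during ORC conversion
sqlContext.table("orc_demo").select("location").show()

If that read fails with the assertion above but reading a table whose schema was never altered works, #2 is the more likely cause.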


6 REPLIES



@Timothy Spann Can you confirm whether #2 is your case? If yes, I've got a workaround.

Master Guru

It's number 1.

New Contributor

Sandeep, we're facing similar issues while using Zeppelin and believe it's because of #2:

%sql
select col1, col2 from db.orc_table where col2 = "val" and col1 = 12345

java.lang.AssertionError: assertion failed
	at scala.Predef$.assert(Predef.scala:165)
	at org.apache.spark.sql.execution.datasources.LogicalRelation$anonfun$1.apply(LogicalRelation.scala:39)
	at org.apache.spark.sql.execution.datasources.LogicalRelation$anonfun$1.apply(LogicalRelation.scala:38)
	at scala.Option.map(Option.scala:145)
	at org.apache.spark.sql.execution.datasources.LogicalRelation.<init>(LogicalRelation.scala:38)
	at org.apache.spark.sql.execution.datasources.LogicalRelation.copy(LogicalRelation.scala:31)
	at org.apache.spark.sql.hive.HiveMetastoreCatalog.org$apache$spark$sql$hive$HiveMetastoreCatalog$convertToOrcRelation(HiveMetastoreCatalog.scala:588)
	at org.apache.spark.sql.hive.HiveMetastoreCatalog$OrcConversions$anonfun$apply$2.applyOrElse(HiveMetastoreCatalog.scala:647)
	at org.apache.spark.sql.hive.HiveMetastoreCatalog$OrcConversions$anonfun$apply$2.applyOrElse(HiveMetastoreCatalog.scala:643)

Can you please share your workaround?

@James Barney

Just realised that the workaround for #1 and #2 is the same: set "spark.sql.hive.convertMetastoreOrc" to "false".
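For example, a minimal sketch of applying that setting in spark-shell (reusing the table and column names from the original question; adjust them to your own table):

// Disable Spark's conversion of metastore ORC tables to its native ORC relation
sqlContext.setConf("spark.sql.hive.convertMetastoreOrc", "false")

// Re-run the original query; the table is now read through the Hive path instead
val df = sqlContext.table("tablename")
df.select("location").show(5)

The same property can also be passed when the shell or job is launched, e.g. spark-submit --conf spark.sql.hive.convertMetastoreOrc=false, so the session starts with it already set.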

New Contributor

Thank you! For future reference, in Zeppelin you set this property in the interpreter configuration, not in the paragraph where the SQL is executed.
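Concretely (assuming the default Spark interpreter is the one running the %sql paragraphs), the property is added on the Interpreter page of the Zeppelin UI under the spark interpreter, roughly like this, followed by an interpreter restart so it takes effect:

spark.sql.hive.convertMetastoreOrc    false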