Spark SQL failure handling

Explorer

I am running Spark SQL on Spark 1.6 in Scala, invoked from a shell script.

When any step fails while creating a DataFrame or inserting data into a Hive table, the subsequent steps still execute.

Below are the errors:

org.apache.spark.sql.AnalysisException: Partition column batchdate not found in existing columns

org.apache.spark.sql.AnalysisException: cannot resolve 'batchdate' given input columns:

error: not found: value DF1

org.apache.spark.sql.AnalysisException: Table not found: locationtable;

How can I make my Spark SQL job fail when it hits an error and stop executing the subsequent queries, so that control returns to the calling shell script?

Thanks!!

2 REPLIES

Super Collaborator

Hi @Vijay Kumar,

You can wrap the code in a try/catch, catch the exception, and control the execution flow depending on the error.

For instance:

try {
  // code to be executed, e.g. create the DataFrame and insert into the Hive table
} catch {
  case e: Exception =>
    println(s"Error occurred while processing the data frame: ${e.getMessage}")
    System.exit(1)   // non-zero exit status is visible to the calling shell script
}
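
As a variation on the same idea, scala.util.Try lets you chain the steps so that a later step never runs once an earlier one fails. The sketch below is only an illustration: sqlContext is assumed to be in scope, the table name locationtable and partition column batchdate are taken from the errors above, and locationtable_parted is a hypothetical target table.

import scala.util.{Try, Success, Failure}

// Each step is wrapped in Try; the for-comprehension stops at the first Failure,
// so the insert never runs if the DataFrame cannot be created.
val result = for {
  df <- Try(sqlContext.table("locationtable"))      // fails if the table does not exist
  _  <- Try(df.write
              .partitionBy("batchdate")             // fails if batchdate is not a column
              .mode("append")
              .saveAsTable("locationtable_parted")) // hypothetical target table
} yield ()

result match {
  case Success(_) => println("All steps completed")
  case Failure(e) =>
    println(s"Step failed: ${e.getMessage}")
    sys.exit(1)   // non-zero exit code so the calling shell script can detect the failure
}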

Hope this helps!!

Explorer

Hey @bkosaraju, thanks for sharing your thoughts.

I had this as one of the alternatives, but I am looking for whether Spark SQL offers any other way, such as transaction-level control (commit or rollback), when a DataFrame is not created or any exception occurs.

Thank you.