Spark SQL failure handling

Explorer

I am running Spark SQL on Spark 1.6 in Scala, invoked from a shell script.

When any step fails while creating a DataFrame or inserting data into a Hive table, the subsequent steps still execute.

Below are the errors:

org.apache.spark.sql.AnalysisException: Partition column batchdate not found in existing columns

org.apache.spark.sql.AnalysisException: cannot resolve 'batchdate' given input columns:

error: not found: value DF1

org.apache.spark.sql.AnalysisException: Table not found: locationtable;

How can I make my Spark SQL job fail when it hits an error and stop executing the subsequent queries, so that control returns to the calling shell script?

Thanks!!

2 REPLIES

Super Collaborator

Hi @Vijay Kumar,

You can wrap the code in a try/catch, catch the exception, and control the execution flow depending on the error.

For instance:

try {
  // code to be executed, e.g. create the DataFrame and insert into the Hive table
} catch {
  case e: Exception =>
    println(s"Error occurred while processing the data frame: ${e.getMessage}")
    System.exit(1)   // non-zero exit status is visible to the calling shell script
}
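
As a variation on the same idea, scala.util.Try lets you chain the steps so that a later step never runs once an earlier one fails. The sketch below is only an illustration: sqlContext is assumed to be in scope, the table name locationtable and partition column batchdate are taken from the errors above, and locationtable_parted is a hypothetical target table.

import scala.util.{Try, Success, Failure}

// Each step is wrapped in Try; the for-comprehension stops at the first Failure,
// so the insert never runs if the DataFrame cannot be created.
val result = for {
  df <- Try(sqlContext.table("locationtable"))      // fails if the table does not exist
  _  <- Try(df.write
              .partitionBy("batchdate")             // fails if batchdate is not a column
              .mode("append")
              .saveAsTable("locationtable_parted")) // hypothetical target table
} yield ()

result match {
  case Success(_) => println("All steps completed")
  case Failure(e) =>
    println(s"Step failed: ${e.getMessage}")
    sys.exit(1)   // non-zero exit code so the calling shell script can detect the failure
}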

Hope this helps!!

Explorer

Hey @bkosaraju, thanks for sharing your thoughts.

I had this as one of the alternatives, but I am looking for whether Spark SQL offers any other way, such as transaction-level control (commit or rollback), when a DataFrame is not created or any exception occurs.

Thank you.