Created 03-19-2024 03:04 AM
I am running a Spark job on Oozie. The job processes some data on S3 and then loads it into a Snowflake DWH.
At the end of the code I call spark.stop().
import org.apache.log4j.LogManager
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

import scala.util.control.NonFatal

object SnowflakeDriver {

  @transient lazy val logger = LogManager.getLogger(SnowflakeDriver.getClass)

  // starting point of the application
  def main(args: Array[String]): Unit = {
    val sparkConf = new SparkConf()
    val runtimeEnvironment = sparkConf.get("spark.eigi.dap.runtime.environment")
    val spark = SparkSession.builder()
      .config(sparkConf)
      .appName(s"${JavaUtils.getConfigProps(runtimeEnvironment).getProperty("appName")}-SnowflakeSitelockDriver")
      .enableHiveSupport()
      .getOrCreate()
    val jdbcConnection = JavaUtils.getSFJDBCConnection(runtimeEnvironment, args(0), 0)
    try {
      // here is the code to process data and put this data into snowflake
    } catch {
      case NonFatal(ex) =>
        jdbcConnection.rollback()
        logger.error(s"Failed to run SF Operation due to ${ex.getMessage}", ex)
        JavaUtils.awsSNSOut(
          runtimeEnvironment,
          JavaUtils.getConfigProps(runtimeEnvironment).getProperty("aws.sns.fatal.topic"),
          s"${JavaUtils.getConfigProps(runtimeEnvironment).getProperty("appName")} - on $runtimeEnvironment, Failed to SF Operation due to ${ex.getMessage}")
        throw ex
    } finally {
      jdbcConnection.close()
      logger.info("Stopping Spark...")
      spark.stop()
    }
  }

  private def printUsage(): Unit = {
    System.err.println(s"Usage: ${getClass.getSimpleName} sfPassword [sfVDWSize]")
    System.exit(-1)
  }
}
Here are the logs:
2024-03-19 07:06:31,968 [main] INFO com.cc.bigdata.dailyreports.sitelock.SnowflakeDriver$ - Stopping Spark...
2024-03-19 07:06:31,984 [main] INFO org.sparkproject.jetty.server.AbstractConnector - Stopped Spark@23d5d9fc{HTTP/1.1, (http/1.1)}{0.0.0.0:4040}
2024-03-19 07:06:31,986 [main] INFO org.apache.spark.ui.SparkUI - Stopped Spark web UI at http://ip-10-13-25-16.ec2.internal:4040
2024-03-19 07:06:31,991 [YARN application state monitor] INFO org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend - Interrupting monitor thread
2024-03-19 07:06:32,016 [main] INFO org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend - Shutting down all executors
2024-03-19 07:06:32,016 [dispatcher-CoarseGrainedScheduler] INFO org.apache.spark.scheduler.cluster.YarnSchedulerBackend$YarnDriverEndpoint - Asking each executor to shut down
2024-03-19 07:06:32,022 [main] INFO org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend - YARN client scheduler backend Stopped
2024-03-19 07:06:32,043 [dispatcher-event-loop-9] INFO org.apache.spark.MapOutputTrackerMasterEndpoint - MapOutputTrackerMasterEndpoint stopped!
2024-03-19 07:06:32,063 [main] INFO org.apache.spark.storage.memory.MemoryStore - MemoryStore cleared
2024-03-19 07:06:32,064 [main] INFO org.apache.spark.storage.BlockManager - BlockManager stopped
2024-03-19 07:06:32,077 [main] INFO org.apache.spark.storage.BlockManagerMaster - BlockManagerMaster stopped
2024-03-19 07:06:32,083 [dispatcher-event-loop-15] INFO org.apache.spark.scheduler.OutputCommitCoordinator$OutputCommitCoordinatorEndpoint - OutputCommitCoordinator stopped!
2024-03-19 07:06:32,093 [main] INFO org.apache.spark.SparkContext - Successfully stopped SparkContext
<<< Invocation of Spark command completed <<<
Hadoop Job IDs executed by Spark: job_1710272981670_0506
<<< Invocation of Main class completed <<<
Oozie Launcher, uploading action data to HDFS sequence file: hdfs://ip-<ip>:8020/user/hadoop/oozie-oozi/0000204-240312195311040-oozie-oozi-W/SnowflakeIntegration--spark/action-data.seq
2024-03-19 07:06:32,152 [main] INFO org.apache.hadoop.io.compress.CodecPool - Got brand-new compressor [.deflate]
Stopping AM
2024-03-19 07:06:32,188 [main] INFO org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl - Waiting for application to be successfully unregistered.
Callback notification attempts left 0
Callback notification trying http://ip-<ip>.ec2.internal:11000/oozie/callback?id=0000204-240312195311040-oozie-oozi-W@SnowflakeIntegration&status=SUCCEEDED
Callback notification to http://ip-<ip>.ec2.internal:11000/oozie/callback?id=0000204-240312195311040-oozie-oozi-W@SnowflakeIntegration&status=SUCCEEDED succeeded
Callback notification succeeded
2024-03-19 07:06:32,972 [shutdown-hook-0] INFO org.apache.spark.util.ShutdownHookManager - Shutdown hook called
2024-03-19 07:06:32,973 [shutdown-hook-0] INFO org.apache.spark.util.ShutdownHookManager - Deleting directory /mnt/yarn/usercache/hadoop/appcache/application_1710272981670_0505/spark-0072b5a6-b8f5-4ed2-9fbf-b295bd878711
2024-03-19 07:06:32,977 [shutdown-hook-0] INFO org.apache.spark.util.ShutdownHookManager - Deleting directory /mnt/tmp/spark-a12c27e8-ec49-4e75-a8f9-2355693611a2
End of LogType:stdout
***********************************************************************
Container: container_1710272981670_0505_01_000001 on ip-10-13-25-16.ec2.internal_8041
LogAggregationType: AGGREGATED
=====================================================================================
LogType:syslog
LogLastModifiedTime:Tue Mar 19 07:06:33 +0000 2024
LogLength:700
LogContents:
2024-03-19 06:51:08,166 WARN [main] org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2024-03-19 06:51:08,487 INFO [main] org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at ip-<ip>.ec2.internal/<ip>:8030
2024-03-19 06:51:08,734 INFO [main] org.apache.hadoop.conf.Configuration: resource-types.xml not found
2024-03-19 06:51:08,735 INFO [main] org.apache.hadoop.yarn.util.resource.ResourceUtils: Unable to find 'resource-types.xml'.
2024-03-19 06:51:09,578 INFO [main] org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at ip-<ip>/10.13.25.58:8032
End of LogType:syslog
***********************************************************************
You can see in the logs that "Stopping Spark..." was printed, which means the job executed all the way to the last step. The logs also show that the SparkContext was successfully stopped, and they end there. Yet the Oozie workflow is still in the RUNNING state.
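One thing worth noting: even after the SparkContext is stopped, the launcher JVM will not exit while any non-daemon thread is still alive (a JDBC driver's keepalive or pool thread, for instance, could behave this way). As a minimal, self-contained sketch, unrelated to this job's actual code, this is how the live non-daemon threads in a JVM can be listed:

```scala
// Minimal sketch: enumerate live non-daemon threads. Any thread in this
// list (other than the one currently exiting) can keep the JVM, and hence
// an Oozie launcher, alive after spark.stop() returns.
object NonDaemonThreadCheck {
  import scala.jdk.CollectionConverters._

  def nonDaemonThreads(): List[String] =
    Thread.getAllStackTraces.keySet.asScala.toList
      .filter(t => t.isAlive && !t.isDaemon)
      .map(_.getName)

  def main(args: Array[String]): Unit =
    // Print each lingering non-daemon thread by name
    nonDaemonThreads().foreach(name => println(s"non-daemon thread: $name"))
}
```

Running something like this right before the end of main (or inspecting a jstack dump of the stuck launcher) would show whether anything is still holding the JVM open.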
Why is this happening? How can I fix this?
Created 03-19-2024 09:16 AM
Hi @MrBeasr
Review the Oozie logs for this workflow for anything suspicious, and paste them here:
oozie job -oozie http://<oozie-server-host>:11000/oozie -log <workflow-id>
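For reference, the related `-info` subcommand shows the workflow's current status and per-action state, which can help pinpoint which action is stuck (assuming the default Oozie endpoint at port 11000; substitute your actual host and workflow ID):

```shell
# Show overall workflow status and the state of each action
oozie job -oozie http://<oozie-server-host>:11000/oozie -info <workflow-id>

# Dump the workflow's log
oozie job -oozie http://<oozie-server-host>:11000/oozie -log <workflow-id>
```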
Regards,
Chethan YM