Spark produces no logs

Explorer

Hello,

I am running the CDH 5.12 QuickStart VM with a package installation (no parcels, no CM).

I can't get Spark to produce application logs in the designated HDFS directory, and consequently nothing is displayed by the Spark History Server. My Spark jobs run as part of an Oozie workflow, but no Spark event logs are produced.

My /etc/spark/conf/spark-defaults.conf contains:

 

spark.eventLog.enabled            true
spark.eventLog.dir                hdfs:///user/spark/applicationHistory
spark.history.fs.logDirectory     hdfs:///user/spark/applicationHistory
spark.yarn.historyServer.address  http://quickstart.cloudera:18088

 

The HDFS log directory has the following permissions:

$ sudo -u hdfs hadoop fs -ls /user/spark
Found 1 items
drwxrwxrwt   - spark spark          0 2017-09-06 13:31 /user/spark/applicationHistory
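
(For reference, the Cloudera docs linked below create this directory with the same ownership and sticky-bit permissions, roughly like this:)

sudo -u hdfs hadoop fs -mkdir -p /user/spark/applicationHistory
sudo -u hdfs hadoop fs -chown -R spark:spark /user/spark
sudo -u hdfs hadoop fs -chmod 1777 /user/spark/applicationHistory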

The Oozie Spark action runs on YARN, and it is defined as:

<spark xmlns="uri:oozie:spark-action:0.1">
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <master>yarn</master>
    <mode>cluster</mode>
    ....
</spark>

The Oozie workflow runs correctly, and I can see the logs in the YARN History Server and in Hue's Oozie dashboard. However, the Spark History Server shows this:

 

History Server

    Event log directory: hdfs:///user/spark/applicationHistory

No completed applications found!

Did you specify the correct logging directory? Please verify your setting of spark.history.fs.logDirectory and whether you have the permissions to access it.
It is also possible that your application did not run to completion or did not stop the SparkContext. 

The HDFS directory /user/spark/applicationHistory is empty.

 

I have looked everywhere in the documentation, specifically here: https://www.cloudera.com/documentation/enterprise/5-11-x/topics/admin_spark_history_server.html, but I have not been able to find a solution. Please help.

 

Thanks in advance,

Alex Soto


1 ACCEPTED SOLUTION

Explorer

In case it helps others:

 

The file /etc/spark/conf/spark-defaults.conf is not used by Oozie Spark actions by default. To tell the Oozie Spark action to use this file, I had to add the following to /etc/oozie/conf/oozie-site.xml:

 

<property>
   <name>oozie.service.SparkConfigurationService.spark.configurations</name>
   <value>*=/etc/spark/conf/</value>
</property>
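
(Since this is a package install without CM, the Oozie server has to be restarted for the oozie-site.xml change to take effect:)

sudo service oozie restart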

Now I can see the logs in the Spark History Server. I wonder why this isn't the default.
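
If you would rather not touch oozie-site.xml, it might also work to pass the settings inline in each action via spark-opts; an untested sketch:

<spark xmlns="uri:oozie:spark-action:0.1">
    ....
    <spark-opts>--conf spark.eventLog.enabled=true --conf spark.eventLog.dir=hdfs:///user/spark/applicationHistory</spark-opts>
</spark>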




3 REPLIES

Explorer

I'm not sure this is the correct solution. I am not able to see my tasks' logs; I only see the Spark logs (driver and tasks), but not my application logs. Anything I log from within a closure is not showing. I tried configuring /etc/spark/conf/log4j.properties, but it doesn't seem to make a difference. The only success I have had so far is getting the History Server to show something.
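
If I understand correctly, anything logged inside a closure runs on the executors, so it ends up in the aggregated YARN container logs rather than in the event log the History Server reads; presumably it has to be pulled with something like this (the application ID is a placeholder):

yarn logs -applicationId <application_id>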

Cloudera Employee

Hi Alex,

 

Did you check the Oozie configuration, or the Oozie logs, to see whether the event logs are being written to some path other than the one configured in CM?
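
For example, something along these lines could show whether the event logs ended up elsewhere in HDFS, and which event-log settings the launcher actually used (the application ID is a placeholder):

sudo -u hdfs hadoop fs -ls -R /user/spark
yarn logs -applicationId <application_id> | grep -i spark.eventLog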

 

Thanks

AKR