Member since
10-18-2016
6
Posts
1
Kudos Received
1
Solution
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 14262 | 09-08-2017 08:59 AM |
09-25-2017
01:29 PM
I am running Cloudera Express 5.12.1 with CDH-5.12.1. I need to read and write HBase data from my Spark jobs. I set the following settings:

spark.driver.extraLibraryPath=/etc/hbase/conf/hbase-site.xml
spark.executor.extraLibraryPath=/etc/hbase/conf/hbase-site.xml

in CM, under "Spark Client Advanced Configuration Snippet (Safety Valve) for spark-conf/spark-defaults.conf" in the Spark service. I can see my values written correctly to /etc/spark/conf/spark-defaults.conf.

Now I submit a Spark job (part of an Oozie workflow). I expect the HBase config to be picked up from the classpath, but it is not. I can see the correct setting in the Spark History Server for the job:

spark.driver.extraLibraryPath /etc/hbase/conf/hbase-site.xml:/var/cloudera/parcels/CDH-5.12.1-1.cdh5.12.1.p0.3/lib/hadoop/lib/native

but the ZooKeeper connection does not honor the host specified in the hbase.zookeeper.quorum setting in hbase-site.xml; it tries to connect to localhost instead.

I would like to specify hbase-site.xml globally rather than for each Spark job, as I have many. What am I doing wrong? What is the best practice to follow in this case? Thanks
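One way to narrow this down is to confirm what hbase-site.xml itself says before looking at how Spark loads it. The sketch below reads hbase.zookeeper.quorum from a file in the standard Hadoop configuration XML layout. The sample content, the host name zk1.example.com, and the helper name read_property are assumptions for illustration, not values from the post.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the Hadoop configuration XML layout.
# The host name is made up; substitute the contents of the real hbase-site.xml.
SAMPLE_HBASE_SITE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>zk1.example.com</value>
  </property>
</configuration>
"""


def read_property(xml_text, name):
    """Return the value of the named property, or None if it is absent."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    return None


if __name__ == "__main__":
    # Illustrative helper, not part of any Cloudera or HBase tooling.
    print(read_property(SAMPLE_HBASE_SITE, "hbase.zookeeper.quorum"))
```

If the file holds the expected host, the remaining question is whether the job's JVM ever sees the file. Spark documents spark.driver.extraLibraryPath and spark.executor.extraLibraryPath as library path settings for the launched JVM, while classpath entries go through spark.driver.extraClassPath and spark.executor.extraClassPath. Whether that difference explains the localhost connection here is not confirmed by the post.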
Labels: Apache HBase, Apache Spark, Apache YARN
09-08-2017
11:57 AM
Not sure if this is the correct solution. I am not able to see my task logs; I only see the Spark logs (driver and tasks), but not my application logs. Anything I log from within a closure is not showing. I tried configuring log4j.properties in /etc/spark/conf/log4j.properties, but it doesn't seem to make a difference. The only success I have had so far was getting the History Server to show something.
09-08-2017
08:59 AM
1 Kudo
In case it helps others: the file /etc/spark/conf/spark-defaults.conf is not used by Oozie Spark actions by default. In order to tell the Oozie Spark action to use this file, I had to add this to /etc/oozie/conf/oozie-site.xml:

<property>
<name>oozie.service.SparkConfigurationService.spark.configurations</name>
<value>*=/etc/spark/conf/</value>
</property>

Now I can see the logs in the Spark History Server. I wonder why this should be the default.
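For readers unsure what the value *=/etc/spark/conf/ means: Oozie's default configuration describes this property as a comma-separated list of AUTHORITY=SPARK_CONF_DIR pairs, where the authority is the HOST:PORT of a YARN ResourceManager and * is the fallback when no exact match exists. The sketch below mimics that lookup. The function names and the rm.example.com authority are hypothetical, and this is not Oozie's own code.

```python
def parse_spark_configurations(value):
    """Split 'AUTHORITY=DIR,AUTHORITY=DIR' into a dict of authority -> conf dir."""
    mapping = {}
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        authority, _, conf_dir = entry.partition("=")
        mapping[authority.strip()] = conf_dir.strip()
    return mapping


def resolve_conf_dir(mapping, authority):
    """Prefer an exact authority match, then the '*' wildcard, else None."""
    return mapping.get(authority, mapping.get("*"))


if __name__ == "__main__":
    # The value from the post; the authority below is a made-up example.
    mapping = parse_spark_configurations("*=/etc/spark/conf/")
    print(resolve_conf_dir(mapping, "rm.example.com:8032"))
```

With only the wildcard entry, every ResourceManager authority resolves to /etc/spark/conf/, which matches the intent of applying one spark-defaults.conf to all Spark actions.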
09-06-2017
10:47 AM
Hello, I am running the CDH 5.12 QuickStart VM with package installation (no parcels, and no CM). I can't get Spark to produce application logs in the designated HDFS directory, and consequently nothing is displayed by the Spark History Server. My Spark jobs run as part of an Oozie workflow, but no Spark logs are produced. My /etc/spark/conf/spark-defaults.conf contains:

spark.eventLog.enabled true
spark.eventLog.dir hdfs:///user/spark/applicationHistory
spark.history.fs.logDirectory hdfs:///user/spark/applicationHistory
spark.yarn.historyServer.address http://quickstart.cloudera:18088

The HDFS log directory has the following permissions:

$ sudo -u hdfs hadoop fs -ls /user/spark
Found 1 items
drwxrwxrwt - spark spark 0 2017-09-06 13:31 /user/spark/applicationHistory

The Oozie Spark task runs on YARN, and it is defined as:

<spark xmlns="uri:oozie:spark-action:0.1">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<master>yarn</master>
<mode>cluster</mode>
....
</spark>

The Oozie workflow runs correctly, and I can see the logs in the YARN History Server and in Hue's Oozie Dashboard. However, the Spark History Server shows this:

History Server
Event log directory: hdfs:///user/spark/applicationHistory
No completed applications found!
Did you specify the correct logging directory? Please verify your setting of spark.history.fs.logDirectory and whether you have the permissions to access it.
It is also possible that your application did not run to completion or did not stop the SparkContext.

The HDFS directory /user/spark/applicationHistory is empty. I have looked everywhere in the documentation, specifically here: https://www.cloudera.com/documentation/enterprise/5-11-x/topics/admin_spark_history_server.html, but I have not been able to find a solution. Please help.

Thanks in advance,
Alex Soto
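A quick sanity check on the settings quoted in this post is to confirm that event logging is enabled and that the directory applications write to is the same one the History Server reads. The sketch below does that for whitespace-separated spark-defaults.conf entries only. The function names are hypothetical, and the sample text is copied from the post.

```python
# Settings copied from the post; replace with the contents of the real file.
SAMPLE_SPARK_DEFAULTS = """\
spark.eventLog.enabled true
spark.eventLog.dir hdfs:///user/spark/applicationHistory
spark.history.fs.logDirectory hdfs:///user/spark/applicationHistory
spark.yarn.historyServer.address http://quickstart.cloudera:18088
"""


def parse_spark_defaults(text):
    """Parse 'key value' lines, skipping blank lines and '#' comments."""
    conf = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        conf[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return conf


def check_event_log(conf):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    if conf.get("spark.eventLog.enabled", "false").lower() != "true":
        problems.append("spark.eventLog.enabled is not true")
    if conf.get("spark.eventLog.dir") != conf.get("spark.history.fs.logDirectory"):
        problems.append("spark.eventLog.dir and spark.history.fs.logDirectory differ")
    return problems


if __name__ == "__main__":
    # Illustrative check only; it inspects the file, not what a running job received.
    print(check_event_log(parse_spark_defaults(SAMPLE_SPARK_DEFAULTS)))
```

The settings in this post pass both checks, so the file itself looks consistent. That leaves the question of whether the job actually received these settings, which is what the 09-08-2017 post on this page about oozie-site.xml addresses.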
Labels: