Member since: 03-01-2016
Posts: 609
Kudos Received: 12
Solutions: 7
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1490 | 02-20-2024 10:42 PM |
| | 1928 | 10-26-2023 05:40 PM |
| | 1258 | 06-13-2023 07:55 PM |
| | 2049 | 04-28-2019 12:21 AM |
| | 1372 | 04-28-2019 12:12 AM |
08-14-2018
11:11 PM
Okay, since the process owner is cloudera-scm, one way to fix the issue is to add the cloudera-scm user to the 'spark' group on all nodes.
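As a sketch, the group change could look like this on each node (this assumes the 'spark' group already exists; run as root):

```shell
# Add cloudera-scm to the supplementary group 'spark' on every node.
usermod -aG spark cloudera-scm

# Verify the membership took effect.
id cloudera-scm
```

After the change, restart the Spark2 history server so the running process picks up the new group membership.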
08-14-2018
01:51 AM
You may need to make sure the process owner of the Spark2 history server (by default it is the spark user as well) belongs to the group "spark", so that the Spark2 history server process is able to read all the Spark2 event log files. You can check the process owner with "ps -ef | grep java | grep SPARK2" on the node where the Spark2 history server runs.
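For example, the two checks could be combined like this (the owner name below is only an assumption; use whatever the ps output actually reports):

```shell
# Find the Spark2 history server process; its owner is the first column.
ps -ef | grep java | grep SPARK2

# Confirm that owner is a member of the 'spark' group, e.g.:
id cloudera-scm | grep spark
```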
08-14-2018
01:47 AM
As the error message states, the object, either a DataFrame or a List, does not have a saveAsTextFile() method. result.write.save() or result.toJavaRDD.saveAsTextFile() should do the work, or you can refer to the DataFrameWriter or RDD API: https://spark.apache.org/docs/2.1.0/api/scala/index.html#org.apache.spark.sql.DataFrameWriter https://spark.apache.org/docs/2.1.0/api/scala/index.html#org.apache.spark.rdd.RDD
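A minimal sketch of both options in spark2-shell, assuming result is a DataFrame and the output paths are placeholders:

```scala
// Option 1: DataFrameWriter. save() writes Parquet by default;
// use csv() (or text() for single-column frames) for plain text output.
result.write.save("/tmp/result_parquet")
result.write.csv("/tmp/result_csv")

// Option 2: drop down to the RDD API, where saveAsTextFile() is available.
result.toJavaRDD.saveAsTextFile("/tmp/result_text")
```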
07-12-2018
11:29 PM
This is usually caused by not having the proper HADOOP or SPARK conf on the node. You need to assign the Spark2 gateway role to this node, deploy the Spark2 client configuration, and then re-launch spark2-shell.
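A quick sanity check after deploying from CM (the paths below assume CM/CDH defaults and are an assumption):

```shell
# These should exist once the Spark2 gateway role is assigned and
# the client configuration is deployed from CM:
ls /etc/spark2/conf/spark-defaults.conf
ls /etc/spark2/conf/yarn-conf/

# Then re-launch the shell:
spark2-shell
```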
07-12-2018
11:20 PM
There is a complete sample in the Cloudera engineering blog [1]; note the requirements mentioned there. You will need to provide the JAAS file with Java options, see [2]; notice the options used in spark2-submit: --driver-java-options "-Djava.security.auth.login.config=./spark_jaas.conf" ... --conf "spark.executor.extraJavaOptions=-Djava.security.auth.login.config=./spark_jaas.conf", along with the files distributed via "--files". You will also need to set the SSL parameters when initializing the Kafka client, see [3]. [1] https://blog.cloudera.com/blog/2017/05/reading-data-securely-from-apache-kafka-to-apache-spark/ [2] https://github.com/markgrover/spark-secure-kafka-app [3] https://github.com/markgrover/spark-secure-kafka-app/blob/master/src/main/java/com/cloudera/spark/examples/DirectKafkaWordCount.scala#L60
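Putting the options together, a hedged sketch of the submit command (the jar name, class, and application arguments below are placeholders, not taken from the sample app):

```shell
spark2-submit \
  --master yarn \
  --deploy-mode client \
  --files ./spark_jaas.conf \
  --driver-java-options "-Djava.security.auth.login.config=./spark_jaas.conf" \
  --conf "spark.executor.extraJavaOptions=-Djava.security.auth.login.config=./spark_jaas.conf" \
  --class com.example.MyKafkaApp \
  my-kafka-app.jar
```

Note that in cluster mode the driver option would instead be passed as --conf spark.driver.extraJavaOptions=... since the driver runs on a YARN node.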
12-14-2017
11:39 AM
It is usually caused by a wrong Solr URL configured somewhere in HUE; you can try to correct it: 1) Check whether anyone has a safety valve configured in HUE -> Configuration -> "Hue Service Advanced Configuration Snippet (Safety Valve) for hue_safety_valve.ini". 2) Make sure Solr is selected as a dependency service in HUE -> Configuration -> Solr. 3) If the above still does not resolve your issue, try to override the Solr URL with the correct (protocol, host, port) as follows: a) Verify the Solr URL: Solr -> Instance -> any one of the Solr instances -> Solr Server -> Solr Server Web UI. Note the URL; we will use it in the next step. b) Use a safety valve for the Solr URL: in HUE -> Configuration, search for "Hue Service Advanced Configuration Snippet (Safety Valve) for hue_safety_valve.ini" and add: [search]
## URL of the Solr Server
solr_url=http://hostname:port/solr
Then restart your HUE server.
09-16-2017
10:43 PM
Regarding how to make Spark work with Kerberos-enabled Kafka, please refer to the Cloudera engineering blog: https://blog.cloudera.com/blog/2017/05/reading-data-securely-from-apache-kafka-to-apache-spark/ It explains the prerequisites, the solution, and sample code.
09-16-2017
10:27 PM
2 Kudos
It's a Spark-side configuration, so you can always specify it via the "--conf" option with spark-submit, or you can set the property globally in CM via "Spark Client Advanced Configuration Snippet (Safety Valve) for spark-conf/spark-defaults.conf", so that CM will include the setting for you in the Spark gateway client configuration.
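For illustration, with an arbitrary Spark property (spark.serializer is just an example; substitute the property you actually need):

```shell
# Per job, on the command line:
spark-submit --conf spark.serializer=org.apache.spark.serializer.KryoSerializer ...

# Or globally: put the same key=value line into the safety valve, and CM will
# render it into spark-defaults.conf on every gateway host:
#   spark.serializer=org.apache.spark.serializer.KryoSerializer
```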
08-28-2017
12:03 AM
#1) Do you have the Spark gateway installed and the client configuration deployed on this host? #2) Do you have Spark selected as one of the dependent services on the CM Oozie configuration page? Usually the client should be able to source the log4j file from /etc/spark/conf/log4j.properties, instead of the NodeManager process dir.
08-13-2017
03:54 AM
1 Kudo
Two points: 1) in cluster mode, you should use "--conf spark.driver.extraJavaOptions=" instead of "--driver-java-options"; 2) you only provide application.conf in the --files list; there is no log4.properties. So either you have this log4.properties distributed on each YARN node, or you should add the log4.properties file to the --files list and reference it with "-Dlog4j.configuration=./log4.properties". For cluster mode, the full command should look like the following: spark-submit \
--master yarn \
--deploy-mode cluster \
--class myCLASS \
--properties-file /home/abhig/spark.conf \
--files /home/abhig/application.conf,/home/abhig/log4.properties \
--conf "spark.executor.extraJavaOptions=-Dconfig.resource=application.conf -Dlog4j.configuration=./log4.properties" \
--conf spark.driver.extraJavaOptions="-Dconfig.file=./application.conf -Dlog4j.configuration=./log4.properties" \
/loca/project/gateway/mypgm.jar