Member since
09-29-2015
122
Posts
159
Kudos Received
26
Solutions
02-14-2022
11:59 PM
Hi @vshukla, thank you for the article. Could you share the reasoning behind the Deployment Choices recommendations? In particular, my customer wants to know how to justify deploying nodes with a minimum of 64 GB of memory and a minimum of 8 cores. Thank you!
09-30-2019
10:55 AM
I am also looking for something like this. I need to convert all the date columns in a DataFrame to varchar, and the DataFrame has more than 300 columns. Any suggestions?
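One possible approach, sketched here under assumptions: if this is a Spark DataFrame, PySpark exposes the schema as `df.dtypes` (a list of `(column_name, type_name)` pairs), so the cast expressions can be generated rather than written by hand for 300+ columns. The helper name `cast_dates_to_string`, the sample schema, and the choice of `STRING` as the target type (Spark's usual stand-in for varchar) are all illustrative and not from the original post. The sketch also assumes column names contain no backticks.

```python
from typing import List, Sequence, Tuple


def cast_dates_to_string(
    dtypes: Sequence[Tuple[str, str]],
    date_types: Sequence[str] = ("date",),
) -> List[str]:
    """Build one SQL select expression per column.

    Columns whose type name is in `date_types` are cast to STRING and keep
    their original name; every other column is passed through unchanged.
    """
    exprs = []
    for name, dtype in dtypes:
        quoted = f"`{name}`"  # assumes the name itself contains no backtick
        if dtype in date_types:
            exprs.append(f"CAST({quoted} AS STRING) AS {quoted}")
        else:
            exprs.append(quoted)
    return exprs


# Hypothetical schema, in the same shape that PySpark's df.dtypes returns.
sample_dtypes = [("id", "int"), ("order_date", "date"), ("name", "string")]
exprs = cast_dates_to_string(sample_dtypes)
print(exprs)

# With a real Spark DataFrame `df` (assumed to exist), the usage would be:
# converted = df.selectExpr(*cast_dates_to_string(df.dtypes))
```

Passing `date_types=("date", "timestamp")` would convert timestamp columns as well. Generating the expressions from the schema keeps the column order intact and needs only a single `selectExpr` call.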
05-25-2017
09:46 PM
This applies to a Kerberos-enabled HDP 2.5.x cluster with Zeppelin, Livy & Spark.

After a successful Kerberos setup, log in to Zeppelin and run a Spark note: the note runs fine. But running a simple sc.version from the Livy interpreter gives "Cannot start spark" in the Zeppelin UI.

In the Livy log at /var/log/livy/livy-livy-server.out you may find a message similar to the following:

INFO: 17/05/25 21:24:12 INFO metastore: Trying to connect to metastore with URI thrift://vinay-hdp25-2.field.hortonworks.com:9083
May 25, 2017 9:24:12 PM org.apache.spark.launcher.OutputRedirector redirect
INFO: 17/05/25 21:24:12 ERROR TSaslTransport: SASL negotiation failure
May 25, 2017 9:24:12 PM org.apache.spark.launcher.OutputRedirector redirect
INFO: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]

This happens when Livy tries to connect to the Hive Metastore and fails with the above message.

The fix is to configure Zeppelin's Livy interpreter to run in yarn-cluster mode instead of the default yarn-client mode. After you change any interpreter configuration, you need to restart the interpreter. The following setting works:

livy.spark.master yarn-cluster

Starting with HDP 2.6.x, this setting defaults to yarn-cluster out of the box.
09-09-2016
05:33 PM
This was very helpful, but Livy did not pick up /etc/livy/conf/livy-defaults.conf!
I changed the name to /etc/livy/conf/livy.conf and impersonation worked.
12-03-2015
07:59 PM
Also see:

Practical Data Science with Apache Spark & Apache Zeppelin
https://hadoopsummit.uservoice.com/forums/332055-data-science-applications-for-hadoop/suggestions/10847007-practical-data-science-with-apache-spark-apache

Running Spark in Production
https://hadoopsummit.uservoice.com/forums/332061-hadoop-governance-security-deployment-and-operat/suggestions/10848240-running-spark-in-production
Covers Spark performance tuning, security & Spark on YARN.

Please consider voting if you want to hear more on these topics.