Member since 02-24-2016

- 175 Posts
- 56 Kudos Received
- 3 Solutions
        My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1928 | 06-16-2017 10:40 AM |
| | 16485 | 05-27-2016 04:06 PM |
| | 1632 | 03-17-2016 01:29 PM |
			
    
	
		
		
05-19-2016 12:28 PM

Using Ambari, ...
			
    
	
		
		
05-19-2016 11:58 AM (1 Kudo)

Hi guys, I am following the document https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.0/bk_installing_manually_book/content/starting_sts.html, which covers setting up the Spark Thrift Server (STS) and starting the service. In our Kerberized cluster, the service was added successfully, but starting it fails:

```
16/05/19 10:27:00 INFO AbstractService: Service:HiveServer2 is started.
16/05/19 10:27:00 INFO HiveThriftServer2: HiveThriftServer2 started
16/05/19 10:27:00 WARN SparkConf: The configuration key 'spark.yarn.applicationMaster.waitTries' has been deprecated as of Spark 1.3 and and may be removed in the future. Please use the new key 'spark.yarn.am.waitTime' instead.
16/05/19 10:27:00 INFO Server: jetty-8.y.z-SNAPSHOT
16/05/19 10:27:00 WARN AbstractLifeCycle: FAILED SelectChannelConnector@0.0.0.0:10001: java.net.BindException: Address already in use
java.net.BindException: Address already in use
        at sun.nio.ch.Net.bind0(Native Method)
```

The log shows a BindException: port 10001 is already in use. In HiveServer2, 10001 is the bound port for HTTP transport mode. When I manually stopped HS2 on the same host and then started STS, it worked fine.

A few questions:

1) The log says HiveServer2 is started. Is STS trying to start HS2?
2) Why is it still on port 10001 and not 10015? We configured STS to use port 10015, but connecting with beeline (after obtaining a valid ticket) on 10015 fails:

```
beeline -u "jdbc:hive2://STSHOST:10015/default;httpPath=cliservice;transportMode=http;principal=hive/_HOST@Realm"
```

while connecting on port 10001 works (the same command I previously used to reach HS2), and through it I can submit SQL via STS:

```
beeline -u "jdbc:hive2://STSHOST:10001/default;httpPath=cliservice;transportMode=http;principal=hive/_HOST@Realm"
```

Could anyone please explain this behavior? Can HS2 and STS run on the same node?

Thanks. Tagging experts: @vshukla, @Timothy Spann, @Jitendra Yadav, @Yuta Imai, @Simon Elliston Ball

- Labels:
- Apache Spark
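The `Address already in use` failure above is the generic symptom of two servers trying to bind the same port. A minimal Python sketch (nothing HDP-specific, just to illustrate the failure mode STS hits while HS2 still owns 10001):

```python
import socket

# First listener takes the port, standing in for the running HiveServer2.
first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
first.bind(("127.0.0.1", 0))        # 0 = let the OS pick a free port
port = first.getsockname()[1]
first.listen(1)

# A second bind to the same port fails the way the STS startup log shows.
second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    second.bind(("127.0.0.1", port))
    bind_failed = False
except OSError:                      # java.net.BindException equivalent
    bind_failed = True

second.close()
first.close()
print("Address already in use:", bind_failed)
```

Stopping HS2 frees the port, which matches the behavior described above; it suggests STS was picking up the same thrift-port setting HS2 uses rather than the 10015 value that was configured.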
			
    
	
		
		
05-19-2016 11:37 AM (1 Kudo)

Thanks @vshukla, @Timothy Spann, @Jitendra Yadav, @Yuta Imai
    
	
		
		
05-18-2016 06:44 PM

@Yuta Imai, @Simon Elliston Ball, @Neeraj Sabharwal
    
	
		
		
05-18-2016 06:42 PM

Hi guys, we have successfully configured Spark on YARN using Ambari on HDP 2.4 with the default parameters. I would like to know which parameters we can tune for best performance. Should we have separate queues for Spark jobs? The use cases are yet to be decided, but the primary goals are to replace old MapReduce jobs, experiment with Spark Streaming, and probably also use DataFrames. How many Spark Thrift Server instances are recommended? The cluster has 20 nodes, each with 256 GB RAM and 36 cores; load from other jobs is generally around 5%. Many thanks.

- Labels:
- Apache Spark
- Apache YARN
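Not an HDP recommendation, but a common starting point for hardware like this is the usual executor-sizing arithmetic. The reserved core/RAM figures, the 5-cores-per-executor rule of thumb, and the overhead factor below are all assumptions to validate against the actual workload:

```python
# Rough executor-sizing arithmetic for a 20-node, 36-core, 256 GB cluster.
# All reserve/overhead numbers are rule-of-thumb assumptions, not HDP advice.
nodes, cores_per_node, ram_gb_per_node = 20, 36, 256

usable_cores = cores_per_node - 1      # leave a core for OS/NodeManager daemons
usable_ram = ram_gb_per_node - 16      # leave RAM for OS and Hadoop daemons

cores_per_executor = 5                 # common HDFS-client-throughput sweet spot
executors_per_node = usable_cores // cores_per_executor
mem_per_executor = usable_ram // executors_per_node
# YARN adds memory overhead per container, so request somewhat less heap:
executor_memory_gb = int(mem_per_executor * 0.93)

total_executors = nodes * executors_per_node - 1   # minus one for the YARN AM
print(executors_per_node, executor_memory_gb, total_executors)
# -> 7 executors/node, 31 GB each, 139 executors total
```

These would translate into `--executor-cores`, `--executor-memory`, and `--num-executors` arguments to spark-submit, then get refined from observed GC and shuffle behavior.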
			
    
	
		
		
05-18-2016 11:40 AM

Guys, I am reading the document http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.2/bk_spark-guide/content/spark-kerb-access-hive.html and got a bit confused. I wanted to check with experts who have already configured the Spark Thrift Server in a Kerberized environment. The document lists these requirements for installing STS on a Kerberos-secured cluster:

1. "The Spark Thrift Server must run in the same host as HiveServer2, so that it can access the hiveserver2 keytab." OK: install and run STS on the same host as HS2, using Ambari.

2. "Edit permissions in /var/run/spark and /var/log/spark to specify read/write permissions to the Hive service account." This is not very clear to me. Our cluster has a spark user, and when I list /var/run and /var/run/spark as the spark user and as the hive user (after su), I can see the directory contents in both cases. Is that sufficient, or am I supposed to do something else? I did not edit any permissions. Which permissions need to be edited?

```
ll /var/run
drwxrwxr-x 3 spark     hadoop    4096 May 17 10:47 spark

ll /var/run/spark
-rw-r--r-- 1 root  root     6 May 17 11:18 spark-root-org.apache.spark.deploy.history.HistoryServer-1.pid

ll /var/log/
drwxr-xr-x 2 spark     spark             4096 Mar  9 10:06 spark

ll /var/log/spark
```

3. "Use the Hive service account to start the thriftserver process." Does this mean I should kinit with the hive keytab, or su to hive and then start the Thrift Server?

Thanks.

- Labels:
- Apache Spark
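On the unclear permissions step: one reading is that the hive account needs group read/write access on those directories. A safe-to-run Python sketch of that kind of change, using a scratch directory instead of the real `/var/run/spark` (the group-read/write interpretation, and the shared `hadoop` group, are my assumptions, not the doc's exact words):

```python
import os
import stat
import tempfile

# Scratch stand-in for /var/run/spark so this is safe to run anywhere;
# on a real node you would target the actual directory (as root).
spark_run = os.path.join(tempfile.mkdtemp(), "spark")
os.mkdir(spark_run)

# Grant group read/write/traverse -- the kind of access that would let the
# hive service account (via a shared group such as hadoop) use the directory.
mode = os.stat(spark_run).st_mode
os.chmod(spark_run, mode | stat.S_IRGRP | stat.S_IWGRP | stat.S_IXGRP)

group_rw = bool(os.stat(spark_run).st_mode & stat.S_IWGRP)
print("group can write:", group_rw)
```

Note that in the listing above, /var/run/spark is already spark:hadoop with group rwx, while /var/log/spark is spark:spark with group r-x only, so /var/log/spark looks like the directory that would actually need a change if hive relies on group access.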
			
    
	
		
		
05-18-2016 09:55 AM

Thanks @Simon Elliston Ball, I will try that. By the way, I see that the documentation at http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.2/bk_spark-guide/content/installing-kerb-spark.html talks about creating a separate spark user and a keytab for it, with the spark user submitting the jobs. (Personally, I don't like the idea of submitting all jobs as a single user.)
    
	
		
		
05-17-2016 05:12 PM

Hi guys, on our Kerberized HDP I verified that a valid AD user, once granted a TGT via kinit, can submit Spark jobs (using spark-shell and also spark-submit). However, I would like to restrict certain groups and users from submitting jobs to the cluster. Is there a way to do this? The documentation at http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.2/bk_spark-guide/content/installing-kerb-spark.html talks about creating a separate spark user and a keytab for it, with the spark user submitting the jobs. (Personally, I don't like the idea of submitting all jobs as a single user.) Thanks.

- Labels:
- Apache Spark
- Apache YARN
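One standard mechanism for this kind of restriction (an assumption that it fits this case, not something the Spark guide prescribes) is YARN Capacity Scheduler queue ACLs: only listed users and groups may submit applications to a queue, so Spark-on-YARN jobs can be confined to approved identities. A sketch of the relevant properties, where the queue name `spark` and group `spark_users` are made-up placeholders:

```
# Only members of the spark_users group may submit to the hypothetical
# "spark" queue; the ACL value format is "user1,user2 group1,group2"
# (a leading space means no individual users are listed).
yarn.scheduler.capacity.root.spark.acl_submit_applications= spark_users
yarn.scheduler.capacity.root.spark.acl_administer_queue=yarn
```

Two caveats: queue ACLs are only enforced when `yarn.acl.enable` is true in yarn-site, and ACLs are combined down the queue hierarchy, so the root queue's `acl_submit_applications` (which defaults to `*`, everyone) must also be tightened or the child restriction has no effect.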
			
    
	
		
		
05-17-2016 12:47 PM (2 Kudos)

Hi guys, sorry to sound dumb, but what is the Spark Thrift Server for? We have a Kerberized HDP 2.4.0 cluster and recently installed the Spark component. The setup document mentions an option to add a Spark Thrift Server component. I googled a bit and found that it provides JDBC access via Thrift, though I have not clearly understood it. I would like to understand more before making any changes to our Kerberized HDP 2.4. Many thanks.
    
	
		
		
05-13-2016 03:20 PM (2 Kudos)

Guys, I would like to export all the configurations of our HDP 2.3 cluster for reference (not the blueprint). Is there a command or utility that exports all the *-site*.xml files and configurations? Thanks in advance.

- Labels:
- Hortonworks Data Platform (HDP)
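For the file-system side of such an export (collect every `*-site*.xml` under a config root), a small Python sketch follows. It runs against a scratch tree it creates itself; on a real node you would point `conf_root` at the host's actual config location (something like `/etc` — that target path is an assumption, and Ambari-managed configs may be better pulled from Ambari itself than from files on disk):

```python
import fnmatch
import os
import shutil
import tempfile

# Scratch config tree standing in for a real node's /etc; swap conf_root
# for the actual path when running on a cluster host.
conf_root = tempfile.mkdtemp()
export_dir = tempfile.mkdtemp()
os.makedirs(os.path.join(conf_root, "hadoop", "conf"))
with open(os.path.join(conf_root, "hadoop", "conf", "core-site.xml"), "w") as f:
    f.write("<configuration/>")
with open(os.path.join(conf_root, "hadoop", "conf", "log4j.properties"), "w") as f:
    f.write("# not a *-site file, should be skipped")

# Walk the tree and copy every *-site*.xml into one flat export directory.
exported = []
for dirpath, _dirs, files in os.walk(conf_root):
    for name in fnmatch.filter(files, "*-site*.xml"):
        shutil.copy(os.path.join(dirpath, name), export_dir)
        exported.append(name)

print(sorted(exported))
```

A flat copy loses the per-service directory layout; preserving the relative path (or prefixing each file with its service name) avoids collisions when two services ship a file with the same name.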
 
         
					
				













