Member since 02-24-2016

- 175 Posts
- 56 Kudos Received
- 3 Solutions
        My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1928 | 06-16-2017 10:40 AM |
| | 16485 | 05-27-2016 04:06 PM |
| | 1632 | 03-17-2016 01:29 PM |
			
    
	
		
		
05-19-2016 12:28 PM

Using Ambari, ...
			
    
	
		
		
05-19-2016 11:58 AM (1 Kudo)

Hi guys, I am following the document https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.0/bk_installing_manually_book/content/starting_sts.html, which covers setting up the Spark Thrift Server (STS) and starting the service. In our Kerberized cluster, the service was added successfully, but starting it fails:

```
16/05/19 10:27:00 INFO AbstractService: Service:HiveServer2 is started.
16/05/19 10:27:00 INFO HiveThriftServer2: HiveThriftServer2 started
16/05/19 10:27:00 WARN SparkConf: The configuration key 'spark.yarn.applicationMaster.waitTries' has been deprecated as of Spark 1.3 and and may be removed in the future. Please use the new key 'spark.yarn.am.waitTime' instead.
16/05/19 10:27:00 INFO Server: jetty-8.y.z-SNAPSHOT
16/05/19 10:27:00 WARN AbstractLifeCycle: FAILED SelectChannelConnector@0.0.0.0:10001: java.net.BindException: Address already in use
java.net.BindException: Address already in use
        at sun.nio.ch.Net.bind0(Native Method)
```

The log shows a BindException: port 10001 is already in use. In HiveServer2, 10001 is the bound port for HTTP transport mode. When I manually stopped HS2 on the same host and then started STS, it worked fine.

A few questions:

1) The log says HiveServer2 is started. Is STS trying to start HS2?
2) Why is it still on port 10001 and not 10015? We configured STS to use port 10015, but connecting with beeline (after obtaining a valid ticket) on 10015 fails:

```
beeline -u "jdbc:hive2://STSHOST:10015/default;httpPath=cliservice;transportMode=http;principal=hive/_HOST@Realm"
```

while connecting on port 10001 works (the same command I previously used to reach HS2), and through it I can submit SQL via STS:

```
beeline -u "jdbc:hive2://STSHOST:10001/default;httpPath=cliservice;transportMode=http;principal=hive/_HOST@Realm"
```

Could anyone please explain this behavior? Can HS2 and STS run on the same node?

Thanks. Tagging experts: @vshukla, @Timothy Spann, @Jitendra Yadav, @Yuta Imai, @Simon Elliston Ball

- Labels:
- Apache Spark
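The `Address already in use` failure above is the generic symptom of two servers trying to bind the same port. A minimal Python sketch (nothing HDP-specific, just to illustrate the failure mode STS hits while HS2 still owns 10001):

```python
import socket

# First listener takes the port, standing in for the running HiveServer2.
first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
first.bind(("127.0.0.1", 0))        # 0 = let the OS pick a free port
port = first.getsockname()[1]
first.listen(1)

# A second bind to the same port fails the way the STS startup log shows.
second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    second.bind(("127.0.0.1", port))
    bind_failed = False
except OSError:                      # java.net.BindException equivalent
    bind_failed = True

second.close()
first.close()
print("Address already in use:", bind_failed)
```

Stopping HS2 frees the port, which matches the behavior described above; it suggests STS was picking up the same thrift-port setting HS2 uses rather than the 10015 value that was configured.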
			
    
	
		
		
05-19-2016 11:37 AM (1 Kudo)

Thanks @vshukla, @Timothy Spann, @Jitendra Yadav, @Yuta Imai
    
	
		
		
05-18-2016 06:44 PM

@Yuta Imai, @Simon Elliston Ball, @Neeraj Sabharwal
    
	
		
		
05-18-2016 06:42 PM

Hi guys, we have successfully configured Spark on YARN using Ambari on HDP 2.4 with the default parameters. I would like to know which parameters we can tune for best performance. Should we have separate queues for Spark jobs? The use cases are yet to be decided, but the primary goals are to replace old MapReduce jobs, experiment with Spark Streaming, and probably also use DataFrames. How many Spark Thrift Server instances are recommended? The cluster has 20 nodes, each with 256 GB RAM and 36 cores; load from other jobs is generally around 5%. Many thanks.

- Labels:
- Apache Spark
- Apache YARN
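Not an HDP recommendation, but a common starting point for hardware like this is the usual executor-sizing arithmetic. The reserved core/RAM figures, the 5-cores-per-executor rule of thumb, and the overhead factor below are all assumptions to validate against the actual workload:

```python
# Rough executor-sizing arithmetic for a 20-node, 36-core, 256 GB cluster.
# All reserve/overhead numbers are rule-of-thumb assumptions, not HDP advice.
nodes, cores_per_node, ram_gb_per_node = 20, 36, 256

usable_cores = cores_per_node - 1      # leave a core for OS/NodeManager daemons
usable_ram = ram_gb_per_node - 16      # leave RAM for OS and Hadoop daemons

cores_per_executor = 5                 # common HDFS-client-throughput sweet spot
executors_per_node = usable_cores // cores_per_executor
mem_per_executor = usable_ram // executors_per_node
# YARN adds memory overhead per container, so request somewhat less heap:
executor_memory_gb = int(mem_per_executor * 0.93)

total_executors = nodes * executors_per_node - 1   # minus one for the YARN AM
print(executors_per_node, executor_memory_gb, total_executors)
# -> 7 executors/node, 31 GB each, 139 executors total
```

These would translate into `--executor-cores`, `--executor-memory`, and `--num-executors` arguments to spark-submit, then get refined from observed GC and shuffle behavior.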
			
    
	
		
		
05-18-2016 11:40 AM

Guys, I am reading the document http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.2/bk_spark-guide/content/spark-kerb-access-hive.html and got a bit confused. I wanted to check with experts who have already configured the Spark Thrift Server in a Kerberized environment. The document lists these requirements for installing STS on a Kerberos-secured cluster:

1. "The Spark Thrift Server must run in the same host as HiveServer2, so that it can access the hiveserver2 keytab." OK: install and run STS on the same host as HS2, using Ambari.

2. "Edit permissions in /var/run/spark and /var/log/spark to specify read/write permissions to the Hive service account." This is not very clear to me. Our cluster has a spark user, and when I list /var/run and /var/run/spark as the spark user and as the hive user (after su), I can see the directory contents in both cases. Is that sufficient, or am I supposed to do something else? I did not edit any permissions. Which permissions need to be edited?

```
ll /var/run
drwxrwxr-x 3 spark     hadoop    4096 May 17 10:47 spark

ll /var/run/spark
-rw-r--r-- 1 root  root     6 May 17 11:18 spark-root-org.apache.spark.deploy.history.HistoryServer-1.pid

ll /var/log/
drwxr-xr-x 2 spark     spark             4096 Mar  9 10:06 spark

ll /var/log/spark
```

3. "Use the Hive service account to start the thriftserver process." Does this mean I should kinit with the hive keytab, or su to hive and then start the Thrift Server?

Thanks.

- Labels:
- Apache Spark
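On the unclear permissions step: one reading is that the hive account needs group read/write access on those directories. A safe-to-run Python sketch of that kind of change, using a scratch directory instead of the real `/var/run/spark` (the group-read/write interpretation, and the shared `hadoop` group, are my assumptions, not the doc's exact words):

```python
import os
import stat
import tempfile

# Scratch stand-in for /var/run/spark so this is safe to run anywhere;
# on a real node you would target the actual directory (as root).
spark_run = os.path.join(tempfile.mkdtemp(), "spark")
os.mkdir(spark_run)

# Grant group read/write/traverse -- the kind of access that would let the
# hive service account (via a shared group such as hadoop) use the directory.
mode = os.stat(spark_run).st_mode
os.chmod(spark_run, mode | stat.S_IRGRP | stat.S_IWGRP | stat.S_IXGRP)

group_rw = bool(os.stat(spark_run).st_mode & stat.S_IWGRP)
print("group can write:", group_rw)
```

Note that in the listing above, /var/run/spark is already spark:hadoop with group rwx, while /var/log/spark is spark:spark with group r-x only, so /var/log/spark looks like the directory that would actually need a change if hive relies on group access.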
			
    
	
		
		
05-18-2016 09:55 AM

Thanks @Simon Elliston Ball, I will try that. By the way, I see that the documentation at http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.2/bk_spark-guide/content/installing-kerb-spark.html talks about creating a separate spark user and a keytab for it, with the spark user submitting the jobs. (Personally, I don't like the idea of submitting all jobs as a single user.)
    
	
		
		
05-17-2016 05:12 PM

Hi guys, on our Kerberized HDP I verified that a valid AD user, once granted a TGT via kinit, can submit Spark jobs (using spark-shell and also spark-submit). However, I would like to restrict certain groups and users from submitting jobs to the cluster. Is there a way to do this? The documentation at http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.2/bk_spark-guide/content/installing-kerb-spark.html talks about creating a separate spark user and a keytab for it, with the spark user submitting the jobs. (Personally, I don't like the idea of submitting all jobs as a single user.) Thanks.

- Labels:
- Apache Spark
- Apache YARN
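One standard mechanism for this kind of restriction (an assumption that it fits this case, not something the Spark guide prescribes) is YARN Capacity Scheduler queue ACLs: only listed users and groups may submit applications to a queue, so Spark-on-YARN jobs can be confined to approved identities. A sketch of the relevant properties, where the queue name `spark` and group `spark_users` are made-up placeholders:

```
# Only members of the spark_users group may submit to the hypothetical
# "spark" queue; the ACL value format is "user1,user2 group1,group2"
# (a leading space means no individual users are listed).
yarn.scheduler.capacity.root.spark.acl_submit_applications= spark_users
yarn.scheduler.capacity.root.spark.acl_administer_queue=yarn
```

Two caveats: queue ACLs are only enforced when `yarn.acl.enable` is true in yarn-site, and ACLs are combined down the queue hierarchy, so the root queue's `acl_submit_applications` (which defaults to `*`, everyone) must also be tightened or the child restriction has no effect.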
			
    
	
		
		
05-17-2016 12:47 PM (2 Kudos)

Hi guys, sorry to sound dumb, but what is the Spark Thrift Server for? We have a Kerberized HDP 2.4.0 cluster and recently installed the Spark component. The setup document mentions an option to add a Spark Thrift Server component. I googled a bit and found that it provides JDBC access via Thrift, though I have not clearly understood it. I would like to understand more before making any changes to our Kerberized HDP 2.4. Many thanks.
    
	
		
		
05-13-2016 03:20 PM (2 Kudos)

Guys, I would like to export all the configurations of our HDP 2.3 cluster for reference (not the blueprint). Is there a command or utility that exports all the *-site*.xml files and configurations? Thanks in advance.

- Labels:
- Hortonworks Data Platform (HDP)
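For the file-system side of such an export (collect every `*-site*.xml` under a config root), a small Python sketch follows. It runs against a scratch tree it creates itself; on a real node you would point `conf_root` at the host's actual config location (something like `/etc` — that target path is an assumption, and Ambari-managed configs may be better pulled from Ambari itself than from files on disk):

```python
import fnmatch
import os
import shutil
import tempfile

# Scratch config tree standing in for a real node's /etc; swap conf_root
# for the actual path when running on a cluster host.
conf_root = tempfile.mkdtemp()
export_dir = tempfile.mkdtemp()
os.makedirs(os.path.join(conf_root, "hadoop", "conf"))
with open(os.path.join(conf_root, "hadoop", "conf", "core-site.xml"), "w") as f:
    f.write("<configuration/>")
with open(os.path.join(conf_root, "hadoop", "conf", "log4j.properties"), "w") as f:
    f.write("# not a *-site file, should be skipped")

# Walk the tree and copy every *-site*.xml into one flat export directory.
exported = []
for dirpath, _dirs, files in os.walk(conf_root):
    for name in fnmatch.filter(files, "*-site*.xml"):
        shutil.copy(os.path.join(dirpath, name), export_dir)
        exported.append(name)

print(sorted(exported))
```

A flat copy loses the per-service directory layout; preserving the relative path (or prefixing each file with its service name) avoids collisions when two services ship a file with the same name.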
 
         
					
				













