Member since 12-28-2015

- 47 Posts
- 2 Kudos Received
- 4 Solutions

        My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 8045 | 05-24-2017 02:14 PM |
|  | 3396 | 05-01-2017 06:53 AM |
|  | 6422 | 05-02-2016 01:11 PM |
|  | 7716 | 02-09-2016 01:40 PM |

03-08-2019 06:29 AM

We are getting the errors below when 15 or 16 Spark jobs are running in parallel. We have a 21-node cluster and run Spark on YARN. Regardless of the number of nodes in the cluster, does the whole cluster get to use only 17 ports, or is it 17 ports per node? How do we avoid this when we run 50 or 100 Spark jobs in parallel?

WARN util.Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
:::::
WARN util.Utils: Service 'SparkUI' could not bind on port 4055. Attempting port 4056.
Address already in use: Service 'SparkUI' failed after 16 retries! Consider explicitly setting the appropriate port for the service 'SparkUI'.
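The "16 retries" in the log comes from spark.port.maxRetries (default 16): each driver's UI binds the first free port starting at spark.ui.port (default 4040), and ports are bound per host, so the limit applies per machine where drivers run, not per cluster. A sketch of raising the retry budget for one job (my_job.py and the submit options are placeholders, not from the thread):

```shell
# Let the driver probe up to 100 ports above 4040 before giving up,
# enough headroom for ~100 concurrent drivers sharing one host.
spark-submit \
  --master yarn \
  --conf spark.port.maxRetries=100 \
  my_job.py
```

For batch jobs that never need the web UI, setting spark.ui.enabled=false sidesteps the port hunt entirely.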
- Labels: Apache Spark, Apache YARN


05-24-2017 02:31 PM

After recommissioning you can just add the DataNode back, and the NameNode will identify all the blocks that were previously present on that DataNode. Once the NameNode identifies this information, it will wipe out the extra replica it created during the decommission.

You may have to run hdfs balancer if you format the disks and then recommission the node to the cluster, which is not a best practice.
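The recommission itself is driven by the standard HDFS admin commands; a sketch, assuming the node has already been removed from the dfs.hosts.exclude file:

```shell
# Tell the NameNode to re-read its include/exclude files so the
# recommissioned DataNode rejoins the cluster:
hdfs dfsadmin -refreshNodes

# Only if the node came back with freshly formatted disks: rebalance so
# the empty node receives blocks. -threshold is the allowed percentage
# deviation from the cluster's average utilization.
hdfs balancer -threshold 10
```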

05-24-2017 02:17 PM

Thanks for your reply. Any links or docs on storage pools would be helpful.

05-24-2017 02:14 PM (1 Kudo)

Hello Niranjan,

drwxr-xr-x - striim1 striim1

The permissions above will not let Joy write a file inside the HDFS directory unless Joy is an HDFS superuser. Have a look at HDFS ACLs to solve your problem. If, apart from striim1, Joy is the only user who creates files in /user/striim1, then run the command below:

hdfs dfs -setfacl -m user:joy:rwx /user/striim1

HDFS ACLs: https://www.cloudera.com/documentation/enterprise/5-5-x/topics/cdh_sg_hdfs_ext_acls.html
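A quick way to confirm the ACL took effect (same path as above):

```shell
# List the ACL entries; the new user:joy:rwx line should appear,
# and the directory listing gains a '+' after the permission bits.
hdfs dfs -getfacl /user/striim1
hdfs dfs -ls -d /user/striim1
```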

05-24-2017 02:03 PM

A NameNode with 5 GB of heap should handle upwards of 5 million blocks, which is actually 15 million replicas in total. A 10-node cluster should set the DataNode block threshold to 1.5 million.

-- Does this hold good for a heterogeneous cluster where some DataNodes have 40 TB of space and others have 80 TB? I am sure a flat DataNode block threshold of 500,000 is not a good practice: it will cause the smaller DataNodes to fill up faster than the larger ones and fire alerts at an early stage.
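One way to avoid those early alerts is a per-node threshold proportional to capacity rather than a flat number. A minimal arithmetic sketch with made-up node counts (six 40 TB and four 80 TB DataNodes are assumptions, not figures from the thread):

```shell
# Spread 15M replicas (5M blocks x 3) by each node's share of raw
# capacity instead of dividing evenly per node.
total_tb=$((6 * 40 + 4 * 80))                       # 560 TB raw capacity
total_replicas=15000000

# Threshold proportional to each node's share of total capacity:
threshold_40tb=$((total_replicas * 40 / total_tb))  # ~1.07M blocks
threshold_80tb=$((total_replicas * 80 / total_tb))  # ~2.14M blocks
echo "40TB node: $threshold_40tb  80TB node: $threshold_80tb"
```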

05-01-2017 06:53 AM

I had to check the grant in hr_role instead of emp_role. That is the solution to this question.

04-28-2017 01:15 PM

I have an employee_database, and under it I have the tables salary_table and bonus_table. Right now emp_role has full access on employee_database. I would also like to give SELECT access on bonus_table to hr_role. How can I achieve this in Sentry?

SHOW GRANT ROLE emp_role;
1 hdfs://localns/emp emp_role ROLE * false
2 employee_database emp_role ROLE * false

GRANT SELECT ON TABLE emp_database.bonus_table TO ROLE hr_role;

SHOW GRANT ROLE emp_role;
1 hdfs://localns/emp emp_role ROLE * false
2 employee_database emp_role ROLE * false

I don't get an error when I run the grant above, but I don't see the grant in the list.
- Labels: Apache Hive, Apache Impala, Apache Sentry, HDFS


10-06-2016 09:13 AM

							Ben, you gave me the same answer in my previous WebEx with you 🙂 Thank you!

10-06-2016 07:45 AM

Sometimes these are false alerts. Can you check your system load at the time you received the alerts (CLOCK_OFFSET, DNS_HOST_RESOLUTION, WEB_METRIC, etc.)?

sar -q -f /var/log/sa/sa10

Use the command above, replacing sa10 with the day of the month on which you received the alerts. Track down the load and check what unusual thing happened on that host during that window.

If you see a bump in the load, check your disk I/O utilization to see whether any of the spindles are reaching 100%. If they are, the system load is the culprit here. You may need to increase the thresholds on that particular host: in CM go to Hosts > All Hosts > select the host name > Configuration, and look for Host Clock Offset Thresholds or Host DNS Resolution Duration Thresholds.

Under the present thresholds, when the host experiences high load it pauses for a while or sends a delayed response (the response includes the host's health-check reports, sent while the scm-agent is running) to the scm-server. When the scm-server fails to receive these health-check reports within the expected duration because the host is busy, the alerts flood your inbox.
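The disk check mentioned above can read the same sysstat archive; sa10 is just the file for day 10 of the month:

```shell
# Per-device I/O stats for the day the alerts fired; a %util column
# pinned near 100 on any spindle confirms disk saturation drove the load.
sar -d -f /var/log/sa/sa10
```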

05-02-2016 01:11 PM

I found the answer myself. Using the command below I can get ls -lt style output in HDFS:

hdfs dfs -ls /test | sort -k6,7
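Note that sort -k6,7 orders by the date and time fields oldest-first, while a local `ls -lt` lists newest first; adding `-r` reverses to match. A sketch against made-up lines in `hdfs dfs -ls` output layout (paths, sizes, and dates are invented):

```shell
# Three fabricated listing lines in hdfs -ls layout:
# perms, replication, owner, group, size, date (field 6), time (field 7), path
listing='-rw-r--r--   3 hdfs hadoop 10 2016-05-01 10:00 /test/b
-rw-r--r--   3 hdfs hadoop 20 2016-04-30 09:13 /test/a
-rw-r--r--   3 hdfs hadoop 30 2016-05-02 08:00 /test/c'

# ISO dates sort correctly as plain text, so -k6,7 gives oldest first:
oldest_first=$(printf '%s\n' "$listing" | sort -k6,7)
# Add -r for newest first, the same ordering as a local `ls -lt`:
newest_first=$(printf '%s\n' "$listing" | sort -k6,7 -r)
printf '%s\n' "$newest_first"
```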