Member since 12-21-2016

83 Posts · 5 Kudos Received · 2 Solutions

My Accepted Solutions

| Title | Views | Posted |
|---|---|---|
|  | 44296 | 02-08-2017 05:56 AM |
|  | 7292 | 01-02-2017 11:05 PM |
12-20-2020 04:38 PM

How do I add a new column to an existing Parquet table, and how do I update it?

Labels:
- Apache Hive
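A common pattern for the question above (a sketch, assuming a non-partitioned Hive Parquet table; `my_table` and the column names are hypothetical) is to add the column to the schema first and then backfill it with a rewrite, since existing Parquet files simply return NULL for the new column:

```sql
-- Schema change only: existing data files are not touched,
-- so old rows read the new column as NULL.
ALTER TABLE my_table ADD COLUMNS (new_col STRING);

-- Backfill: rewrite the data with values for the new column
-- (a constant here, purely as an illustration).
INSERT OVERWRITE TABLE my_table
SELECT existing_col, 'default_value' AS new_col
FROM my_table;
```

For a partitioned table, `ALTER TABLE my_table ADD COLUMNS (new_col STRING) CASCADE` should propagate the change to existing partition metadata as well.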
04-24-2020 09:32 AM

```
Traceback (most recent call last):
  File "consumer.py", line 8, in <module>
    consumer = KafkaConsumer('test',
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/kafka/consumer/group.py", line 355, in __init__
    self._client = KafkaClient(metrics=self._metrics, **self.config)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/kafka/client_async.py", line 242, in __init__
    self.config['api_version'] = self.check_version(timeout=check_timeout)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/kafka/client_async.py", line 907, in check_version
    version = conn.check_version(timeout=remaining, strict=strict, topics=list(self.config['bootstrap_topics_filter']))
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/kafka/conn.py", line 1228, in check_version
    if not self.connect_blocking(timeout_at - time.time()):
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/kafka/conn.py", line 337, in connect_blocking
    self.connect()
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/kafka/conn.py", line 426, in connect
    if self._try_handshake():
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/kafka/conn.py", line 505, in _try_handshake
    self._sock.do_handshake()
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/ssl.py", line 1309, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1108)
```

I am getting the above error after running the program. Any inputs?
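The handshake in the traceback above fails because Python cannot verify the broker's self-signed certificate. One way to address this (a sketch; the CA bundle path is an assumption for your environment) is to build an `ssl.SSLContext` and hand it to the consumer through kafka-python's `ssl_context` parameter:

```python
import ssl

# Preferred: trust the CA that actually signed the broker certificates, e.g.
#   ctx = ssl.create_default_context(cafile="/path/to/ca-chain.pem")
# where the path is a placeholder for your cluster's CA bundle.

# Last resort, for testing only: skip certificate verification entirely.
insecure_ctx = ssl.create_default_context()
insecure_ctx.check_hostname = False  # must be disabled before setting CERT_NONE
insecure_ctx.verify_mode = ssl.CERT_NONE

# The chosen context is then passed to the consumer:
#   KafkaConsumer('test', ..., security_protocol='SASL_SSL', ssl_context=insecure_ctx)
```

Obtaining the cluster's CA certificate and using `cafile` keeps verification on; disabling verification should only ever be a temporary diagnostic step.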
04-23-2020 10:35 AM

WebHDFS is disabled for our cluster. Are there any other options?
04-21-2020 04:41 PM

Hi,

I am trying to connect from my local machine to a kerberized Kafka cluster using Python, but I am unable to connect with the credentials below. Could anyone guide me? Your help is appreciated.

```python
consumer = KafkaConsumer(
    'test',
    bootstrap_servers='XXX:1234',
    # client_id='kafka-python-' + __version__,
    request_timeout_ms=30000,
    connections_max_idle_ms=9 * 60 * 1000,
    reconnect_backoff_ms=50,
    reconnect_backoff_max_ms=1000,
    max_in_flight_requests_per_connection=5,
    receive_buffer_bytes=None,
    send_buffer_bytes=None,
    # socket_options=[(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)],
    sock_chunk_bytes=4096,         # undocumented experimental option
    sock_chunk_buffer_count=1000,  # undocumented experimental option
    retry_backoff_ms=100,
    metadata_max_age_ms=300000,
    security_protocol='SASL_SSL',
    ssl_context=None,
    ssl_check_hostname=True,
    ssl_cafile=None,
    ssl_certfile=None,
    ssl_keyfile=None,
    ssl_password=None,
    ssl_crlfile=None,
    api_version=None,
    api_version_auto_timeout_ms=2000,
    # selector=selectors.DefaultSelector,
    sasl_mechanism='GSSAPI',
    # sasl_plain_username=None,
    # sasl_plain_password='XXX',
    sasl_kerberos_service_name='XXX',
    # metrics configs
    metric_reporters=[],
    metrics_num_samples=2,
    metrics_sample_window_ms=30000,
)

for msg in consumer:
    print(msg)
```

Thanks

Labels:
- Apache Kafka
- Kerberos
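Most of the parameters in the call above are just kafka-python defaults; the security-related subset is what actually drives the Kerberos connection. A minimal sketch of that subset (the service name and CA path are assumptions to adapt for your cluster):

```python
# Minimal security settings for a SASL_SSL / GSSAPI (Kerberos) connection.
# Values marked "assumption" are placeholders, not verified for any cluster.
security_config = {
    "security_protocol": "SASL_SSL",              # Kerberos auth over TLS
    "sasl_mechanism": "GSSAPI",                   # Kerberos via GSSAPI
    "sasl_kerberos_service_name": "kafka",        # assumption: brokers commonly register as "kafka"
    "ssl_cafile": "/etc/security/ca-chain.pem",   # assumption: CA that signed the broker certs
    "ssl_check_hostname": True,
}

# These would be passed straight through to the consumer, e.g.:
#   consumer = KafkaConsumer('test', bootstrap_servers='broker:9093', **security_config)
```

Everything else in the original call can usually be left at its default; a valid Kerberos ticket (from `kinit`) must exist before the consumer is created.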
04-20-2020 04:37 PM

Hi,

I am trying to authenticate against a kerberized cluster from a Python program and read HDFS files. Could anyone help me achieve this?

Your help is appreciated.

Thanks
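One route worth noting (a sketch; the realm, user, and file path are hypothetical, and it assumes the Hadoop client tools are installed locally) is to obtain a ticket with `kinit` and then drive the Kerberos-aware `hdfs` CLI from Python:

```python
import subprocess

# After obtaining a ticket, e.g. `kinit user@EXAMPLE.COM` (realm is a placeholder),
# the `hdfs dfs` CLI authenticates with that ticket automatically.
cmd = ["hdfs", "dfs", "-cat", "/data/sample.txt"]  # path is hypothetical

# Running it would stream the file contents; requires a Hadoop client
# and a valid Kerberos ticket, so it is left commented here:
#   data = subprocess.run(cmd, capture_output=True, check=True).stdout
```

Pure-Python HDFS clients exist as well, but they need either WebHDFS or native `libhdfs` bindings available on the machine, so the CLI route is often the quickest thing to try.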
04-20-2020 03:17 PM

I am trying to connect from my local machine to a kerberized Kafka cluster using Python as the client. Could you please let me know which properties to include along with the bootstrap server?

```python
consumer = KafkaConsumer(
    'test',
    bootstrap_servers='XXX.ORG:XXXX',
    # client_id='kafka-python-' + __version__,
    request_timeout_ms=30000,
    connections_max_idle_ms=9 * 60 * 1000,
    reconnect_backoff_ms=50,
    reconnect_backoff_max_ms=1000,
    max_in_flight_requests_per_connection=5,
    receive_buffer_bytes=None,
    send_buffer_bytes=None,
    # socket_options=[(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)],
    sock_chunk_bytes=4096,         # undocumented experimental option
    sock_chunk_buffer_count=1000,  # undocumented experimental option
    retry_backoff_ms=100,
    metadata_max_age_ms=300000,
    security_protocol='SASL_SSL',
    ssl_context=None,
    ssl_check_hostname=True,
    ssl_cafile=None,
    ssl_certfile=None,
    ssl_keyfile=None,
    ssl_password=None,
    ssl_crlfile=None,
    api_version=None,
    api_version_auto_timeout_ms=2000,
    # selector=selectors.DefaultSelector,
    sasl_mechanism='GSSAPI',
    # sasl_plain_username=None,
    # sasl_plain_password='XXXX',
    sasl_kerberos_service_name='XXXX',
    # metrics configs
    metric_reporters=[],
    metrics_num_samples=2,
    metrics_sample_window_ms=30000,
)
```

Your help is appreciated.

Thanks
04-20-2020 03:07 PM

Hi All,

I am trying to connect from my local machine to a kerberized Kafka cluster through Python. Can anyone help with what to specify in the krb5.conf file and which other properties are needed?

Your help is appreciated.
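For the krb5.conf question above, a minimal client-side file (a sketch; `EXAMPLE.COM` and the KDC hostname are placeholders for the cluster's actual realm and KDC) typically looks like:

```
[libdefaults]
    default_realm = EXAMPLE.COM
    dns_lookup_kdc = false
    ticket_lifetime = 24h
    forwardable = true

[realms]
    EXAMPLE.COM = {
        kdc = kdc.example.com
        admin_server = kdc.example.com
    }

[domain_realm]
    .example.com = EXAMPLE.COM
    example.com = EXAMPLE.COM
```

After a `kinit user@EXAMPLE.COM`, kafka-python's GSSAPI mechanism should pick up the ticket from the default credential cache; no credentials go into the consumer itself.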
12-12-2019 04:10 PM

We even ran MSCK REPAIR TABLE, but still no luck. Any other options?
12-12-2019 02:17 PM

I am unable to re-create an external Hive table after manually deleting the table's underlying HDFS files.

When a DESCRIBE statement is issued, it returns the table description, but when a SELECT is run against the table, we get "table doesn't exist". So we issued a DROP statement.

After issuing the DROP statement, we tried to create the table again, but we are getting "table already exists". Do we need to manually delete the entry from the Hive metastore, or is there a way to forcefully re-create the table? Please let me know.

Labels:
- Apache Hive
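Before editing the metastore by hand, one sequence worth trying (a sketch; the table name, schema, and location are hypothetical stand-ins for the real ones) is an explicit guarded drop followed by re-creating the external table at its original location:

```sql
-- Drop any lingering metastore entry; for an EXTERNAL table this
-- removes only metadata, never data files.
DROP TABLE IF EXISTS my_ext_table;

-- Re-create the table pointing at the (now empty) HDFS location.
CREATE EXTERNAL TABLE my_ext_table (
    id   INT,
    name STRING
)
STORED AS PARQUET
LOCATION '/data/my_ext_table';

-- If partition directories exist on disk, re-register them.
MSCK REPAIR TABLE my_ext_table;
```

If `DROP TABLE IF EXISTS` itself fails, the metastore entry may be corrupt, and that is the point at which inspecting the metastore database becomes necessary.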
07-21-2017 07:31 PM

Hive: I would like to calculate the percentage of each value in a column and, based on that percentage, load the data into another table (only if the percentage of 'n' is less than 20%); otherwise, do not load it.

```
colA
y
y
y
n
```

Output (this is what I am expecting):

```
y 80%
n 20%
```

Labels:
- Apache Hive
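The per-value percentage can be sketched in Hive with a windowed aggregate (assuming a source table `src` holding the single column `colA`; both names are hypothetical), dividing each group's count by the overall total:

```sql
SELECT colA,
       CONCAT(CAST(ROUND(100.0 * COUNT(*) / SUM(COUNT(*)) OVER ()) AS INT), '%') AS pct
FROM src
GROUP BY colA;
```

The conditional load could then reuse the same expression, inserting into the target table only when the computed share of 'n' comes out below 20%.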