Member since 05-02-2017

360 Posts | 65 Kudos Received | 22 Solutions

        My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 15719 | 02-20-2018 12:33 PM |
| | 2049 | 02-19-2018 05:12 AM |
| | 2382 | 12-28-2017 06:13 AM |
| | 7925 | 09-28-2017 09:25 AM |
| | 13515 | 09-25-2017 11:19 AM |

05-12-2017 05:24 PM

The number of mappers involved in a job equals the number of input splits, and the number of input splits depends on your block size and file size. If the file size is 256 MB and the block size is 128 MB, the job will involve 2 mappers. @Bala Vignesh N V
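As a quick sanity check on that arithmetic (a sketch of my own; the sizes are just the figures from the example above), the mapper count is a ceiling division of file size by block size:

-- 256 MB file with 128 MB blocks -> ceil(256/128) = 2 splits -> 2 mappers
hive> select ceil(256 / 128.0);
2
-- a 300 MB file would need 3 mappers, since the final partial block gets its own split
hive> select ceil(300 / 128.0);
3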
						
					
05-11-2017 12:23 PM

							 @Ashnee Sharma  Good article. Thanks for sharing! 
						
					
05-05-2017 07:06 AM

@Vinay R Glad it helped you. If you think it solves your problem, please accept the answer.
						
					
04-20-2017 05:56 PM

Hi @Simran Kaur,

There is no way the first column can be treated as column names. But if the structure changes, it is better to load the data into Hive as an Avro or Parquet file: even when the structure changes there is no need to modify the old data, and new data can be inserted into the same Hive table.

Points to note:
1. An external table has to be used.
2. You may need a stage table before loading into the external Hive table, which should be in Avro/Parquet format.

Steps (a sketch follows below):
1. Create an external table, stored as Avro/Parquet, with the columns you have.
2. Load the CSV into the stage table, then load the stage data into the external Hive table.
3. If the columns change, drop the external table and re-create it with the additional fields.
4. Insert the new file by following steps 1-2.

This way no manual work is needed to modify the existing data, since Avro by default shows NULL for columns that exist in the table but not in the file. The only manual work is dropping and re-creating the table DDL. Let me know if you need any details, and if you feel this answers your question then please accept the answer.
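A minimal HiveQL sketch of those steps (all table names, columns, and paths here are hypothetical placeholders, not from the original thread):

-- 1. External table stored as Avro (hypothetical schema and location);
--    dropping it later leaves the data files in place.
CREATE EXTERNAL TABLE events (id INT, country STRING)
STORED AS AVRO
LOCATION '/data/events';

-- Stage table matching the raw CSV layout.
CREATE TABLE stage_events (id INT, country STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

-- 2. Load the CSV into the stage table, then into the external table.
LOAD DATA INPATH '/landing/events.csv' INTO TABLE stage_events;
INSERT INTO TABLE events SELECT * FROM stage_events;

-- 3. When a new column appears, drop and re-create the external table with the extra field;
--    the existing Avro files simply return NULL for the column they do not contain.
DROP TABLE events;
CREATE EXTERNAL TABLE events (id INT, country STRING, city STRING)
STORED AS AVRO
LOCATION '/data/events';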
						
					
04-16-2017 04:52 PM
1 Kudo

You can use one of the following:

regexp_replace(s, "\\[\\d*\\]", "");
regexp_replace(s, "\\[.*\\]", "");

The former works only on digits inside the brackets, the latter on any text. The escapes are required because square brackets are special characters in regular expressions. For example:

hive> select regexp_replace("7 September 2015[456]", "\\[\\d*\\]", "");
7 September 2015
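One caveat worth flagging (my own illustration, not part of the original answer): .* is greedy, so if a value contains more than one bracketed group, the second pattern strips everything from the first [ to the last ], while a lazy .*? removes each group separately:

hive> select regexp_replace("a[1] and b[note]", "\\[.*\\]", "");
a
hive> select regexp_replace("a[1] and b[note]", "\\[.*?\\]", "");
a and b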
						
					
04-13-2017 03:05 PM

@Michael Young Thanks! That worked like a charm. I still have no idea why it doesn't let me upload using the HDFS UI, so if you know why, I would love to know.
						
					
04-18-2017 11:30 AM

This is a longer regex; it assumes the log_entry has two IP addresses displayed.
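The pattern itself is not shown in this excerpt, so purely as a stand-in illustration (the logs table and log_entry column are hypothetical), a Hive expression that pulls two dotted-quad IPs out of one entry could look like:

-- Stand-in sketch only; groups 1 and 2 capture the first and second IP address.
select regexp_extract(log_entry, '(\\d{1,3}(?:\\.\\d{1,3}){3}).*?(\\d{1,3}(?:\\.\\d{1,3}){3})', 1) as first_ip,
       regexp_extract(log_entry, '(\\d{1,3}(?:\\.\\d{1,3}){3}).*?(\\d{1,3}(?:\\.\\d{1,3}){3})', 2) as second_ip
from logs;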
						
					
04-02-2017 01:56 PM

Thanks @Scott Shaw. Does that mean I have to update the metadata each time after I truncate the partition? Even if the metadata exists, it should not display wrong results; in my case, select distinct country from mytable should display only India.
						
					
04-03-2017 06:18 PM

Are these tables external tables? In the case of external tables, you would have to manually clean the folders by removing the files and directories referenced by the table (using the hadoop fs -rm command).
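For instance (the path and table name below are hypothetical), using Hive's dfs shorthand for hadoop fs:

-- Find where the external table's data lives: check the Location: field.
hive> describe formatted mytable;
-- Remove the referenced files and folders; same as hadoop fs -rm -r from the shell.
hive> dfs -rm -r /data/mytable;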
						
					