Created on 11-02-2017 08:23 PM - last edited on 09-16-2022 05:29 AM by kh-asen
Hi,
I use CDH.
I have a partitioned table in Hive, partitioned by date. It holds one month of data, so there will be 30 to 31 partitions.
Can I move this same table to HBase?
I know how to create an external table in Hive that points to HBase, and I know how to create a partitioned table in Hive. How do I combine the two? Must I use static partitioning for this use case? Any other suggestions?
I have 100 million records for one month of data. I want to move them to HBase and write an Impala query that retrieves them with good performance.
What I do now is create a staging table in Hive and move the data to HBase.
Now I need to have partitions in Hive as well. How should I proceed in this case?
For my use case I only want to move the data and select it for display. I am not going to do any processing.
Since I have millions of records per month, I want to use daily partitions and move the records into date partitions, so that my select queries respond quickly.
Thanks
Created 11-03-2017 06:18 AM
Hi,
The concept of Hive partitions does not map to HBase tables.
So if you want HBase as the storage, you will need to work around this in your use case.
You could use a single HBase table whose row key is constructed from the partition value. That way you should be able to query the HBase table by row key and avoid a full scan of the table.
Or you could have one HBase table per "partition" (this also means one Hive table per partition).
Or you could decide that HBase does not meet your needs and stay in Hive.
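To illustrate the first option, here is a minimal Python sketch of a row key that starts with the partition value, plus the start/stop rows that would bound a scan to a single day. The key layout `yyyyMMdd|record_id`, the `|` separator, and the helper names `make_row_key` and `day_scan_range` are assumptions made for illustration; they are not part of Hive, HBase, or Impala.

```python
from datetime import date
from typing import Tuple

SEP = "|"  # hypothetical separator between the date prefix and the record id


def make_row_key(day: date, record_id: str) -> bytes:
    """Build a row key whose leading bytes are the former partition value."""
    return f"{day:%Y%m%d}{SEP}{record_id}".encode("utf-8")


def day_scan_range(day: date) -> Tuple[bytes, bytes]:
    """Return (start_row, stop_row) covering every key of one day.

    HBase orders row keys as raw bytes and treats the stop row as exclusive,
    so the stop row is the day prefix with its last byte incremented.
    """
    start = f"{day:%Y%m%d}{SEP}".encode("utf-8")
    stop = start[:-1] + bytes([start[-1] + 1])
    return start, stop


key = make_row_key(date(2017, 11, 2), "rec42")
start, stop = day_scan_range(date(2017, 11, 2))
print(key, start, stop)
```

With a layout like this, a filter on the key column for one day becomes a range condition (`key >= start AND key < stop`) rather than a full scan. Whether the query engine actually pushes that range down to HBase depends on the engine and on the key column type, so it is worth checking the query plan.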
regards,
Mathieu