
HBase design vs data warehouse. Please suggest


New Contributor

I am new to HBase and am trying to understand how data is stored in HBase compared to a data warehouse, which I already know. Can someone please suggest the right HBase design or data structure for the data warehouse design below?

 

d_EmpTable:

Empkey

EmpNo

EmpName

EmpSalary

EmpDesc

EmpLocation  

 

d_Dept:

Dept_key

DeptNo

DeptDesc

DeptLocation

DeptBudget

 

d_Location:

Location_key

Location_id

LocationName

Location_address

Location_zip

 

f_fact_tb:

f_key

Emp_key

Dept_key

Location_key

 

Please suggest the best HBase design to support the following:

  • Generate report of Employees vs Dept
  • Report of Emp vs Location
  • Report of Dept Budget vs Emp Salary

Re: HBase design vs data warehouse. Please suggest

Master Guru
Use HBase only for data whose dominant access pattern is random access to
individual records or to small ranges of sequentially related records.
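
To make that access pattern concrete, here is a minimal sketch in plain Python (no HBase client involved). It assumes a hypothetical composite row key of the form "<DeptNo>#<EmpNo>" and made-up sample rows; neither comes from the question. It only imitates how a key-sorted store serves a point lookup and a short prefix scan.

```python
from bisect import bisect_left

# Hypothetical rows keyed by "<DeptNo>#<EmpNo>" (an assumed key layout,
# not something taken from the original schema). Values are sample data.
rows = {
    "D10#E001": {"EmpName": "Asha", "EmpSalary": 5000},
    "D10#E002": {"EmpName": "Ben", "EmpSalary": 4000},
    "D20#E003": {"EmpName": "Chen", "EmpSalary": 6000},
}

# A key-sorted store keeps rows ordered by row key; a sorted list stands in for that.
sorted_keys = sorted(rows)


def get(row_key):
    """Point lookup: fetch one record by its full row key."""
    return rows.get(row_key)


def prefix_scan(prefix):
    """Short range scan: return the keys that start with the given prefix."""
    start = bisect_left(sorted_keys, prefix)  # jump straight to the first candidate
    matched = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break  # keys are sorted, so nothing further can match
        matched.append(key)
    return matched


one_employee = get("D10#E002")
dept_10_keys = prefix_scan("D10#")
```

Both calls touch only the rows they need. A report that has to read every row gets no benefit from the key order, which is why the row key has to be chosen around the lookups you actually run.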

For everything else, including tables you mainly build regular reports
on, you're better off using Kudu+Impala with appropriate partitioning.
Check out https://kudu.apache.org/docs/schema_design.html for a good
reference on this.
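
As a rough illustration of why the reports in the question suit a SQL engine, the sketch below runs the "Dept Budget vs Emp Salary" report as a join and aggregate over the star schema. It uses Python's built-in sqlite3 module purely as a stand-in for a SQL engine such as Impala; the sample rows are invented, the column types are assumptions, and only the columns needed for this one report are created.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in SQL engine (an assumption;
# the reply recommends Kudu+Impala, which this does not emulate).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE d_EmpTable (Empkey INTEGER PRIMARY KEY, EmpNo TEXT,
                         EmpName TEXT, EmpSalary REAL);
CREATE TABLE d_Dept (Dept_key INTEGER PRIMARY KEY, DeptNo TEXT,
                     DeptDesc TEXT, DeptBudget REAL);
CREATE TABLE f_fact_tb (f_key INTEGER PRIMARY KEY, Emp_key INTEGER,
                        Dept_key INTEGER, Location_key INTEGER);

-- Invented sample rows, for illustration only.
INSERT INTO d_EmpTable VALUES (1, 'E001', 'Asha', 5000),
                              (2, 'E002', 'Ben', 4000),
                              (3, 'E003', 'Chen', 6000);
INSERT INTO d_Dept VALUES (10, 'D10', 'Sales', 12000),
                          (20, 'D20', 'Ops', 5000);
INSERT INTO f_fact_tb VALUES (1, 1, 10, 100), (2, 2, 10, 100), (3, 3, 20, 200);
""")

# Dept Budget vs Emp Salary: total salary per department next to its budget.
report = conn.execute("""
SELECT d.DeptNo, d.DeptBudget, SUM(e.EmpSalary) AS total_salary
FROM f_fact_tb AS f
JOIN d_EmpTable AS e ON e.Empkey = f.Emp_key
JOIN d_Dept AS d ON d.Dept_key = f.Dept_key
GROUP BY d.DeptNo, d.DeptBudget
ORDER BY d.DeptNo
""").fetchall()

for dept_no, budget, total_salary in report:
    print(dept_no, budget, total_salary)
```

The query reads every fact row and joins it to two dimensions, which is the scan-heavy shape that a columnar store with a SQL layer handles well.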

Queries that perform large scans (hundreds of thousands of rows or more,
for example), including full table scans, are not the type of workload
HBase is designed for.