Member since: 08-14-2014
Posts: 2
Kudos Received: 0
Solutions: 0
08-26-2015 12:05 AM
Thanks. Enabling block encoding and compression together helped improve storage utilization.
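For reference, a minimal sketch of how both settings can be applied to the table from the question below, using the standard HBase shell alter and major_compact commands. FAST_DIFF and SNAPPY are assumed choices here; any data block encoding or compression codec supported by the cluster works the same way:

    alter 'TEST_HB', {NAME => 'cf1', DATA_BLOCK_ENCODING => 'FAST_DIFF', COMPRESSION => 'SNAPPY'}
    major_compact 'TEST_HB'   # rewrite existing HFiles so data already on disk picks up the new settings

Both settings apply per column family, so a table with several families can mix encodings and codecs as needed.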
07-28-2015 09:27 PM
Hi, I need your expertise in understanding HBase storage space utilization. I am loading data from an Oracle table directly into HBase using Sqoop with the command below. Source table size: 20 GB.

    sqoop import \
      --connect 'jdbc:oracle:thin:@(description=(address=(protocol=tcp)(host=)(port=))(connect_data=(sid=ORA10G)))' \
      --username --password \
      --query "SELECT /*+ parallel(a,8) full(a) */ * FROM TEST a WHERE \$CONDITIONS" \
      -m 10 \
      --hbase-create-table \
      --hbase-table TEST_HB \
      --column-family cf1 \
      --hbase-row-key IDNUM \
      --hive-drop-import-delims \
      --split-by PARTITION_ID

I am facing two issues:

1) The job starts 10 mappers, but only 3 show Running status while the rest stay Scheduled, so effectively only 3 run at a time. Is there a parameter that limits the number of concurrent mappers when loading into HBase?

2) HDFS usage grows to more than 180 GB for 20 GB of Oracle data, where it should be no more than about 60 GB (with a replication factor of 3). When I inspected the physical block files in HDFS, every row is stored together with its column names. How can I avoid this column-name overhead, or am I missing something in the Sqoop command above?
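One detail worth noting on issue 2: HBase stores the row key, column family, and column qualifier alongside every single cell, so long Oracle column names get repeated for every value in every row. A hedged sketch of one workaround, in addition to the block encoding and compression discussed above, is to alias columns to short names in the free-form query, since Sqoop uses the result-set column names as HBase qualifiers. CUSTOMER_NAME and ACCOUNT_BALANCE below are hypothetical column names for illustration:

    --query "SELECT /*+ parallel(a,8) full(a) */ IDNUM, CUSTOMER_NAME AS n, ACCOUNT_BALANCE AS b FROM TEST a WHERE \$CONDITIONS"

With aliases like these, each cell is stored under cf1:n or cf1:b instead of cf1:CUSTOMER_NAME or cf1:ACCOUNT_BALANCE, which can noticeably shrink the per-cell overhead on disk.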
Labels:
Apache HBase