Support Questions
Find answers, ask questions, and share your expertise

HIVE query takes very long time to fetch 20 gb records

HIVE query takes very long time to fetch 20 gb records

Rising Star

Hi I have a hive table on HBASE that has 200gb of records . I am running simple hive query to fetch 20 gb records . But this takes around 4 hours of time . I can not create partition on HIVE table cause it is integrated on HBASE. Please suggest any idea to improve performance This is my HIVE query INSERT OVERWRITE LOCAL DIRECTORY '/hadoop/user/m6034690/FSDI/FundamentalAnalytic/FundamentalAnalytic_2014.txt' ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS TEXTFILE select * from hbase_table_FundamentalAnalytic where FilePartition='ThirdPartyPrivate' and FilePartitionDate='2014';

3 REPLIES 3

Re: HIVE query takes very long time to fetch 20 gb records

Use Phoenix instead of Hive. Phoenix will give better performance. If your HBase table is already there, you can create a view in Phoenix on the HBase table. You could also use Phoenix to create HBase tables.

Here is an example in which I'm creating a view on an HBase table for the column family named pageview:

create view "pageviews_hbase"(event_id varchar not null primary key,"pageview"."event_time" varchar,"pageview"."url" varchar, "pageview"."country" varchar,"pageview"."session_id" varchar,"pageview"."user_cookie" varchar,"pageview"."os" varchar,"pageview"."browser" varchar)

Re: HIVE query takes very long time to fetch 20 gb records

Rising Star

Phoenix is not installed on my cluster as it is not bundeled in CDH 5 distribution ,Any other idea

Re: HIVE query takes very long time to fetch 20 gb records

Spark is another option; however, this will require you to write either Java or Scala code using the Spark DataFrames API.

IMO, and I have tested this many time with high volumes of data, Hive on HBase is not a good solution for SQL analytics of your HBase data.

Phoenix does work with CDH. Here is a link on how to install:

http://blog.cloudera.com/blog/2015/05/apache-phoenix-joins-cloudera-labs/