
Insert query taking a very long time


Contributor

I have this table (copying DDL from Hive2 view):

CREATE TABLE `number_generator`(
  `last_number` int)
ROW FORMAT SERDE 
  'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' 
STORED AS INPUTFORMAT 
  'org.apache.hadoop.mapred.TextInputFormat' 
OUTPUTFORMAT 
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
  'hdfs://hdfs_host:8020/apps/hive/warehouse/sandbox.db/number_generator'
TBLPROPERTIES (
  'COLUMN_STATS_ACCURATE'='{\"BASIC_STATS\":\"true\"}', 
  'last_modified_by'='admin', 
  'last_modified_time'='1508308017', 
  'numFiles'='1', 
  'numRows'='1', 
  'rawDataSize'='2', 
  'totalSize'='3', 
  'transient_lastDdlTime'='1508308020')


Each time I want to use this table for number generation I do the following:

TRUNCATE TABLE sandbox.number_generator

This finishes quite quickly.

After this I run:

INSERT INTO sandbox.number_generator (last_number) VALUES (4)

This takes over 10 minutes to complete.

Any ideas why?


Re: Insert query taking a very long time

Super Collaborator

Hi @Yair Ogen,

Can you please look at the ResourceManager UI and the Tez UI (from the Ambari Tez view)?

This will give an idea of where the query gets blocked (either from a lack of resources in the cluster or from accessing the file system).

You can also look at the log /var/log/hive/hiveserver2.log, which will show any exceptions raised while executing the query.

Re: Insert query taking a very long time

Contributor

@bkosaraju

I tried running this in the Hive UI. The only error in the log is:

2017-10-18 11:55:34,252 ERROR [Atlas Logger 3]: metadata.Hive (Hive.java:getTable(1215)) - Table values__tmp__table__1 not found: default.values__tmp__table__1 table not found
2017-10-18 11:55:34,252 ERROR [Atlas Logger 3]: hook.HiveHook (HiveHook.java:run(205)) - Atlas hook failed due to error
java.lang.reflect.UndeclaredThrowableException
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1884)
        at org.apache.atlas.hive.hook.HiveHook$2.run(HiveHook.java:195)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.atlas.hook.AtlasHookException: HiveHook.registerProcess() failed.
        at org.apache.atlas.hive.hook.HiveHook.registerProcess(HiveHook.java:701)
        at org.apache.atlas.hive.hook.HiveHook.collect(HiveHook.java:268)
        at org.apache.atlas.hive.hook.HiveHook.access$200(HiveHook.java:83)
        at org.apache.atlas.hive.hook.HiveHook$2$1.run(HiveHook.java:198)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
        ... 6 more
Caused by: org.apache.atlas.hook.AtlasHookException: HiveHook.processHiveEntity() failed.
        at org.apache.atlas.hive.hook.HiveHook.processHiveEntity(HiveHook.java:731)
        at org.apache.atlas.hive.hook.HiveHook.registerProcess(HiveHook.java:668)
        ... 12 more
Caused by: org.apache.atlas.hook.AtlasHookException: HiveHook.createOrUpdateEntities() failed.
        at org.apache.atlas.hive.hook.HiveHook.createOrUpdateEntities(HiveHook.java:597)
        at org.apache.atlas.hive.hook.HiveHook.processHiveEntity(HiveHook.java:711)
        ... 13 more
Caused by: org.apache.atlas.hook.AtlasHookException: HiveHook.createOrUpdateEntities() failed.
        at org.apache.atlas.hive.hook.HiveHook.createOrUpdateEntities(HiveHook.java:589)
        at org.apache.atlas.hive.hook.HiveHook.createOrUpdateEntities(HiveHook.java:595)



I'm not sure it's related. In the Hive UI it runs much faster, so maybe this is an issue with how hive-jdbc interacts with the Hive server?

I don't see anything wrong in the ResourceManager UI.

In the Tez UI, for example, I see this query:

SELECT COUNT(*) FROM sandbox.t_14

It took more than 4 minutes, and the table has only 1000 rows.


Re: Insert query taking a very long time

Contributor

It appears you're timing out because the Atlas hook is not working properly. If you remove the Atlas hook, this should run faster.

@bkosaraju Look at this config, try removing the Atlas hook "org.apache.atlas.hive.hook.HiveHook" from it, then restart all services and try again:

<property>
      <name>hive.exec.post.hooks</name>
      <value>org.apache.hadoop.hive.ql.hooks.ATSHook, org.apache.atlas.hive.hook.HiveHook</value>
</property> 
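
As a rough sketch, the property might look like this once the Atlas hook is removed. This assumes you still want to keep the ATSHook entry and that the setting lives in hive-site.xml; on an Ambari-managed cluster you would probably make the change through the Hive configs in Ambari rather than editing the file by hand.

```xml
<!-- Sketch only: assumes ATSHook should stay enabled and only the Atlas hook is dropped -->
<property>
      <name>hive.exec.post.hooks</name>
      <value>org.apache.hadoop.hive.ql.hooks.ATSHook</value>
</property>
```

Note that with the hook removed, Hive would no longer report these queries to Atlas, so treat this as a way to confirm the cause rather than necessarily a permanent fix.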

https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.6/bk_command-line-installation/content/config...
