Created 02-03-2018 03:49 AM
Simple INSERT taking a long time
INSERT INTO test values ('2018-02-01',123,1);
INFO : Tez session hasn't been created yet. Opening session
The first single-row insert takes 19.598 seconds, and subsequent inserts take 2 to 3 seconds each.
Could you please suggest how to speed this up?
Created 02-05-2018 12:26 AM
Are you using Kerberos? Please post the SHOW CREATE TABLE output for the table in question.
Have you checked the execution details in the Tez UI? What version of HDP are you running?
Regarding the INSERT, is there a specific need to append single rows at a time?
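If single-row appends are not a hard requirement, one option is to batch several rows into one statement so that the per-query overhead is paid once. This is only a sketch: it assumes a Hive version that accepts multi-row VALUES lists (0.14 or later), reuses the `test` table from the question, and the extra rows are made-up illustration values.

```sql
-- Hypothetical batch: three rows in one statement instead of three separate INSERTs.
-- Only the first row comes from the original post; the others are placeholders.
INSERT INTO test VALUES
  ('2018-02-01', 123, 1),
  ('2018-02-01', 124, 2),
  ('2018-02-02', 125, 3);
```

Each INSERT still launches its own Tez job, so batching reduces the number of launches rather than the cost of a single one.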
Created 02-06-2018 10:17 AM
I am not using Kerberos. Our setup runs on VMs backed by an HP SAN. Yes, I checked the Tez UI: resource allocation is taking most of the time, even though the server had no other jobs running during that period.
Sample: a single-row insert took 7 seconds.
hive> INSERT INTO fact values ('2018-02-01',123,1);
Query ID = hdfs_20180206050752_0b80fc68-b3a3-4804-8c1a-349d2f3674f3
Total jobs = 1
Launching Job 1 out of 1
Status: Running (Executing on YARN cluster with App id application_1517831625144_0018)
--------------------------------------------------------------------------------
VERTICES STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED
--------------------------------------------------------------------------------
Map 1 .......... SUCCEEDED 1 1 0 0 0 0
Reducer 2 ...... SUCCEEDED 3 3 0 0 0 0
--------------------------------------------------------------------------------
VERTICES: 02/02 [==========================>>] 100% ELAPSED TIME: 3.70 s
--------------------------------------------------------------------------------
Loading data to table default.fact
Table default.fact stats: [numFiles=660, numRows=0, totalSize=469636, rawDataSize=0]
OK
Time taken: 7.51 seconds
hive>
Created 02-07-2018 05:18 AM
Can you post the SHOW CREATE TABLE output for the table in question? What is the execution time if you use Hive View to add a single row to the table?
Are you using Hive CLI or Beeline to run your queries?
Created 02-07-2018 12:47 PM
hive> SHOW CREATE TABLE fact;
OK
CREATE TABLE `fact`(
`ldate` date,
`lid` int,
`afid` int)
CLUSTERED BY (
lid)
INTO 3 BUCKETS
ROW FORMAT SERDE
'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
LOCATION
'hdfs://test1.com:8020/apps/hive/warehouse/fact'
TBLPROPERTIES (
'numFiles'='914',
'numRows'='0',
'rawDataSize'='0',
'totalSize'='653965',
'transactional'='true',
'transient_lastDdlTime'='1518006954')
Time taken: 0.233 seconds, Fetched: 22 row(s)
Hive View takes more time than the hive and beeline CLIs.
I am using both the hive and beeline CLIs.
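The table properties above report numFiles=914 for about 650 KB of data, which would be consistent with each single-row insert into a transactional table writing its own small files. As a hedged sketch (assuming the compactor is enabled on the metastore, which this thread does not confirm), the compaction queue could be inspected and a major compaction requested like this:

```sql
-- List pending, running, and recently completed compactions.
SHOW COMPACTIONS;

-- Request a major compaction of the fact table; the request is queued
-- and runs in the background, not as part of this statement.
ALTER TABLE fact COMPACT 'major';
```

Whether this changes the insert latency seen here is an assumption to verify, since the Tez UI pointed at resource allocation as the slow step.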