
Slow inserts in Impala - DML Metastore update


Master Collaborator

Hi,

we have a very slow DML Metastore update on simple insert queries into an HDFS table (constant-select INSERT statements). I am not sure whether we are hitting https://issues.apache.org/jira/browse/IMPALA-1480, because the table is not partitioned; however, I suspect that the number of files under the table (100k+) may be causing the issue.
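
For reference, each statement is a trivial single-row constant insert roughly like the one below; the column values are made up here, since the real schema is not part of this thread.

-- Illustrative only: a constant-select insert into the affected table;
-- each such statement writes one row and at least one new HDFS data file.
INSERT INTO base.import_processed SELECT 'batch_42', now(), 0;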

 

After rebuilding the table, the DML queries run fine. Is this a known limitation of Impala, or a bug?
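
For context, a rebuild of this kind is essentially a compaction of the many small files into a few larger ones. A minimal sketch of that approach follows; the _compacted/_old table names and the PARQUET format are placeholders, not details from this thread.

-- Sketch only: compact into a new table, then swap names once row counts are verified.
CREATE TABLE base.import_processed_compacted STORED AS PARQUET
AS SELECT * FROM base.import_processed;
ALTER TABLE base.import_processed RENAME TO base.import_processed_old;
ALTER TABLE base.import_processed_compacted RENAME TO base.import_processed;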

Thanks

 

 

----------------
Max Per-Host Resource Reservation: Memory=0B
Per-Host Resource Estimates: Memory=10.00MB
Codegen disabled by planner

F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
|  Per-Host Resources: mem-estimate=48B mem-reservation=0B
WRITE TO HDFS [base.import_processed, OVERWRITE=false]
|  partitions=1
|  mem-estimate=48B mem-reservation=0B
|
00:UNION
   constant-operands=1
   mem-estimate=0B mem-reservation=0B
   tuple-ids=0 row-size=48B cardinality=1
----------------

 

 

Query Timeline

  1. Query submitted: 0ns (0ns)
  2. Planning finished: 1ms (1ms)
  3. Submit for admission: 2ms (1ms)
  4. Queued: 2ms (0ns)
  5. Completed admission: 28.64s (28.64s)
  6. Ready to start on 1 backends: 28.64s (1ms)
  7. All 1 execution backends (1 fragment instances) started: 28.66s (12ms)
  8. DML data written: 28.88s (221ms)
  9. DML Metastore update finished: 3.8m (3.3m)
  10. Request finished: 3.8m (0ns)
  11. Unregister query: 3.8m (28ms)
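
For completeness, the file count behind the 100k+ suspicion can be confirmed with standard Impala statements (the table name is taken from the plan above):

-- The #Files column reports how many data files back the table.
SHOW TABLE STATS base.import_processed;
-- Lists every data file with its size; expect 100k+ rows here.
SHOW FILES IN base.import_processed;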

Re: Slow inserts in Impala - DML Metastore update

Champion

@Tomas79

 

Please increase the value of the parameter below as needed and try again; it may help:

 

Java Heap Size of Catalog Server in Bytes

Re: Slow inserts in Impala - DML Metastore update

Master Collaborator
The Java heap size of the Catalog Server was not the issue, so I am not sure about marking this as the solution. Unfortunately, I don't have time to reproduce it right now, because that would require creating a table with 100k+ files, each containing just one row. (This is a logging table in our environment.)
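
For anyone who wants to reproduce the state later: each non-OVERWRITE INSERT in Impala writes at least one new data file, so repeating a single-row insert on the order of 100k times recreates the many-tiny-files layout. A hypothetical sketch (table and column names invented):

-- Run the INSERT ~100,000 times from a driver script so the table
-- ends up with one tiny file per row.
CREATE TABLE IF NOT EXISTS base.repro_small_files (id BIGINT, msg STRING);
INSERT INTO base.repro_small_files VALUES (1, 'one row, one new file');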