Slow inserts in Impala - DML Metastore update




We have a very slow DML Metastore update on simple insert queries into an HDFS table (the inserts select constants). I am not sure what we are hitting: the table is not partitioned, but I suspect that the number of files under the table (100k+) is causing the issue.


After rebuilding the table, the DML queries run fine. Is this a known limitation of Impala or a bug?
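One way to check the file-count suspicion before rebuilding is to look at the output of `hdfs dfs -count <table directory>`, which prints the directory count, file count, content size and path. The sketch below only parses such a line. The sample line, its numbers and the path are invented for illustration and are not taken from this cluster.

```python
def parse_hdfs_count(line):
    """Parse one output line of `hdfs dfs -count <path>`.

    Expected columns: DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME.
    """
    dir_count, file_count, content_size, path = line.strip().split(None, 3)
    return {
        "dirs": int(dir_count),
        "files": int(file_count),
        "bytes": int(content_size),
        "path": path,
    }


# Invented sample line (hypothetical path and numbers, not from the thread).
sample = "           1       100000      4800000 /hypothetical/warehouse/base.db/import_processed"
info = parse_hdfs_count(sample)
print(f"{info['files']} files, {info['bytes'] / info['files']:.0f} bytes per file on average")
```

A very high file count combined with a tiny average file size would be consistent with the many-small-files explanation, although it does not prove it.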




Max Per-Host Resource Reservation: Memory=0B
Per-Host Resource Estimates: Memory=10.00MB
Codegen disabled by planner

F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
|  Per-Host Resources: mem-estimate=48B mem-reservation=0B
WRITE TO HDFS [base.import_processed, OVERWRITE=false]
|  partitions=1
|  mem-estimate=48B mem-reservation=0B
   mem-estimate=0B mem-reservation=0B
   tuple-ids=0 row-size=48B cardinality=1



Query Timeline

  1. Query submitted: 0ns (0ns)
  2. Planning finished: 1ms (1ms)
  3. Submit for admission: 2ms (1ms)
  4. Queued: 2ms (0ns)
  5. Completed admission: 28.64s (28.64s)
  6. Ready to start on 1 backends: 28.64s (1ms)
  7. All 1 execution backends (1 fragment instances) started: 28.66s (12ms)
  8. DML data written: 28.88s (221ms)
  9. DML Metastore update finished: 3.8m (3.3m)
  10. Request finished: 3.8m (0ns)
  11. Unregister query: 3.8m (28ms)
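To see at a glance where the time goes, the timeline above can be parsed and sorted by the per-step delta (the value in parentheses). This is a minimal sketch that assumes every duration uses a single unit suffix, as in the timeline pasted here; other profile formats may need a different parser. The function names are my own.

```python
import re

# Seconds per unit suffix seen in the timeline.
UNIT_SECONDS = {"ns": 1e-9, "us": 1e-6, "ms": 1e-3, "s": 1.0, "m": 60.0, "h": 3600.0}

# "<n>. <label>: <elapsed> (<delta>)"; the leading step number is optional.
LINE = re.compile(r"^\s*(?:\d+\.\s*)?(.+?):\s*(\S+)\s*\((\S+)\)\s*$")


def to_seconds(text):
    """Convert a single-unit duration such as '28.64s' or '3.3m' to seconds."""
    match = re.fullmatch(r"([\d.]+)(ns|us|ms|s|m|h)", text.strip())
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    return float(match.group(1)) * UNIT_SECONDS[match.group(2)]


def parse_timeline(text):
    """Return a list of (label, elapsed_seconds, delta_seconds) tuples."""
    events = []
    for line in text.splitlines():
        match = LINE.match(line)
        if match:
            label, elapsed, delta = match.groups()
            events.append((label, to_seconds(elapsed), to_seconds(delta)))
    return events


SAMPLE_TIMELINE = """
  1. Query submitted: 0ns (0ns)
  2. Planning finished: 1ms (1ms)
  3. Submit for admission: 2ms (1ms)
  4. Queued: 2ms (0ns)
  5. Completed admission: 28.64s (28.64s)
  6. Ready to start on 1 backends: 28.64s (1ms)
  7. All 1 execution backends (1 fragment instances) started: 28.66s (12ms)
  8. DML data written: 28.88s (221ms)
  9. DML Metastore update finished: 3.8m (3.3m)
  10. Request finished: 3.8m (0ns)
  11. Unregister query: 3.8m (28ms)
"""

events = parse_timeline(SAMPLE_TIMELINE)
# Print the two slowest steps, largest delta first.
for label, _, delta in sorted(events, key=lambda e: e[2], reverse=True)[:2]:
    print(f"{label}: {delta:.1f}s")
```

For this query the two largest deltas are the Metastore update (3.3m) and admission (28.64s); writing the data itself took only 221ms.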

Re: Slow inserts in Impala - DML Metastore update




Please try increasing the parameter below as needed and run the insert again; it may help.


Java Heap Size of Catalog Server in Bytes

Re: Slow inserts in Impala - DML Metastore update

The Java heap size of the Catalog Server was not the issue, so I am not sure about marking this as the solution. Unfortunately I don't have time right now to reproduce it, because that would require creating a table with 100k+ files, each file holding just one row. (This is a logging table in our environment.)
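If someone wants to attempt the reproduction, a rough sketch is to generate a script of single-row inserts, since each `INSERT ... VALUES` statement in Impala writes its own small data file. The table name, columns and output file name below are hypothetical, the database is assumed to exist already, and I have not run this against a cluster.

```python
def build_repro_statements(table="repro_db.small_files_log", inserts=100_000):
    """Return SQL that creates an unpartitioned table and adds one row per INSERT.

    The table name and schema are placeholders, not the real logging table.
    """
    statements = [f"CREATE TABLE IF NOT EXISTS {table} (id BIGINT, msg STRING)"]
    for i in range(inserts):
        statements.append(f"INSERT INTO {table} VALUES ({i}, 'row {i}')")
    return statements


def write_script(path, statements):
    """Write the statements to a file, one per line, each terminated by ';'."""
    with open(path, "w", encoding="utf-8") as handle:
        for statement in statements:
            handle.write(statement + ";\n")


if __name__ == "__main__":
    # Small count for a dry run; raise it towards 100k+ for a real attempt.
    write_script("repro.sql", build_repro_statements(inserts=10))
```

The generated file could then be run with something like `impala-shell -f repro.sql`. Expect the full 100k+ run to take a long time, and compare the "DML Metastore update finished" delta in the profile as the file count grows.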