Slow inserts in Impala - DML Metastore update

Master Collaborator

Hi,

We are seeing a very slow "DML Metastore update" step on simple INSERT queries (selecting constants) into an HDFS table. I am not sure whether we are hitting https://issues.apache.org/jira/browse/IMPALA-1480, because the table is not partitioned. However, I suspect that the number of files under the table (100k+) may be causing the issue.

 

After rebuilding the table, the DML queries run fine. Is this a known limitation of Impala or a bug?

Thanks

----------------
Max Per-Host Resource Reservation: Memory=0B
Per-Host Resource Estimates: Memory=10.00MB
Codegen disabled by planner

F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
|  Per-Host Resources: mem-estimate=48B mem-reservation=0B
WRITE TO HDFS [base.import_processed, OVERWRITE=false]
|  partitions=1
|  mem-estimate=48B mem-reservation=0B
|
00:UNION
   constant-operands=1
   mem-estimate=0B mem-reservation=0B
   tuple-ids=0 row-size=48B cardinality=1
----------------

Query Timeline

  1. Query submitted: 0ns (0ns)
  2. Planning finished: 1ms (1ms)
  3. Submit for admission: 2ms (1ms)
  4. Queued: 2ms (0ns)
  5. Completed admission: 28.64s (28.64s)
  6. Ready to start on 1 backends: 28.64s (1ms)
  7. All 1 execution backends (1 fragment instances) started: 28.66s (12ms)
  8. DML data written: 28.88s (221ms)
  9. DML Metastore update finished: 3.8m (3.3m)
  10. Request finished: 3.8m (0ns)
  11. Unregister query: 3.8m (28ms)
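
In the timeline above, the first value on each line is the elapsed time since submission and the value in parentheses is the time spent in that step alone. As a rough sketch (not part of the original post), the steps can be ranked by that per-step time with a few lines of Python. The function names are made up for illustration, and the parser assumes single-unit durations exactly as printed above (for example `28.64s` or `3.3m`), not compound ones.

```python
import re

# Units as they appear in the timeline above, converted to seconds.
_UNIT_SECONDS = {"ns": 1e-9, "us": 1e-6, "ms": 1e-3, "s": 1.0, "m": 60.0, "h": 3600.0}

# A single number followed by a single unit, e.g. "221ms" or "3.8m".
_DURATION = re.compile(r"(\d+(?:\.\d+)?)(ns|us|ms|s|m|h)")

# "<optional list marker> <event name>: <elapsed> (<step time>)"
_EVENT = re.compile(
    r"^\s*(?:\d+\.\s*|-\s*)?(?P<name>.+?):\s*(?P<total>\S+)\s*\((?P<delta>\S+)\)\s*$"
)


def to_seconds(text):
    """Convert a single-unit duration string such as '28.64s' to seconds."""
    match = _DURATION.fullmatch(text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    value, unit = match.groups()
    return float(value) * _UNIT_SECONDS[unit]


def rank_steps(timeline_text):
    """Return (event name, step seconds) pairs, slowest step first."""
    steps = []
    for line in timeline_text.splitlines():
        match = _EVENT.match(line)
        if match:  # header and blank lines simply do not match
            steps.append((match["name"], to_seconds(match["delta"])))
    return sorted(steps, key=lambda step: step[1], reverse=True)


if __name__ == "__main__":
    sample = """
      7. All 1 execution backends (1 fragment instances) started: 28.66s (12ms)
      8. DML data written: 28.88s (221ms)
      9. DML Metastore update finished: 3.8m (3.3m)
    """
    for name, seconds in rank_steps(sample):
        print(f"{seconds:10.3f}s  {name}")
```

Run against the full timeline, this puts "DML Metastore update finished" first at roughly 198 seconds (3.3m), followed by "Completed admission" at 28.64s. Writing the data itself took only 221ms.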

Champion

@Tomas79

 

Please try increasing the value of the parameter below as needed and run the query again; it may help.

 

Java Heap Size of Catalog Server in Bytes

Master Collaborator
The Java heap size of the Catalog Server was not the issue, so I am not sure about marking this as the solution. Unfortunately I don't have time right now to reproduce it, because that would require creating a table with 100k+ files, each holding just one row. (This is a logging table in our environment.)
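
For anyone who does want to attempt a reproduction, one possible approach is to generate a script of single-row INSERT statements, since Impala writes a separate small data file for each `INSERT ... VALUES` statement. The sketch below is only an illustration, not something tested against this cluster. The table name `tmp_db.small_files_repro` and its `(id INT, msg STRING)` layout are hypothetical placeholders, not the schema of `base.import_processed`.

```python
def build_insert_script(table, row_count):
    """Return SQL text containing one single-row INSERT per line.

    Assumes a hypothetical scratch table with columns (id INT, msg STRING).
    """
    statements = [
        f"INSERT INTO {table} VALUES ({i}, 'row {i}');" for i in range(row_count)
    ]
    return "\n".join(statements) + "\n"


def write_insert_script(path, table, row_count):
    """Write the generated statements to `path` for use as a query file."""
    with open(path, "w", encoding="utf-8") as out:
        out.write(build_insert_script(table, row_count))
```

Calling `write_insert_script("many_small_files.sql", "tmp_db.small_files_repro", 100_000)` and then running the file with `impala-shell -f many_small_files.sql` should leave on the order of 100k tiny files under the table directory. Expect the run itself to take a long time, since every statement goes through planning, admission, and its own metastore update.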