compute incremental stats fails after creating partition - error "no such file or directory"


TL;DR

I have a script that I'm running against Impala (Impala Shell v2.3.0-cdh5.5.1) which sometimes (not always) fails. The script has the form:

set sync_ddl = true;
insert overwrite table <tablename> partition <partition-spec> <select-statement>;
compute incremental stats <tablename> partition <partition-spec>;
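For context, the job is driven through impyla's `impala.dbapi` interface (as the log output below shows). A minimal sketch of the kind of runner involved is below; the statement-splitting helper and the host/port in the usage comment are my own illustrative choices, not the actual job code:

```python
# Rough sketch of a script runner: split a SQL script into statements and
# execute each one in order over a DB-API cursor (e.g. one obtained from
# impyla's impala.dbapi). Helper names are illustrative, not the real job.

def split_statements(script):
    """Split a SQL script on semicolons, dropping empty fragments.
    (Simplification: breaks if a string literal contains ';'.)"""
    return [s.strip() for s in script.split(";") if s.strip()]

def run_script(cursor, script):
    """Execute each statement in order. Because the script issues
    'set sync_ddl = true' first, DDL/DML should not return until
    metadata has propagated to all nodes."""
    for stmt in split_statements(script):
        cursor.execute(stmt)

# Usage against a real cluster (host/port are assumptions):
#   from impala.dbapi import connect
#   conn = connect(host="impala-host", port=21050)
#   run_script(conn.cursor(), open("job.sql").read())
```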

It fails on the compute incremental stats statement with an error (that I get from Cloudera Manager) of the form: 

Failed to open HDFS file hdfs://path/to/partition/folder/214425b85ad0c995-31e7091cc454319d_930819282_data.0.parq Error(2): No such file or directory

The script runs for less than 20 seconds before failing; most of that time is spent on the insertion.

 

I find this bizarre. Impala cannot find files for a partition that it has just created. Why could this be?

 


Details:

Here is the output from running the script:

INFO - Impala engine specified, using implala.dbapi interface
INFO - Connection successful
INFO - > set sync_ddl = True
INFO - > set request_pool = "root.sse"
INFO - > -- ensure queries do not complete until metadata is propagated across all nodes
       > set sync_ddl = true
INFO - option,value
INFO - > insert overwrite table 
       >    met_ssesu.trans_prod_period
       > (
       >    prod_code
       >  , prodper_live_1w_flg
       >  , weight_uom_code
       >  , volume_uom_code
       >  , bsk_prodperpurch_1w_cnt  
       >  , tot_prodperspend_1w_amt 
       >  , prodper_sold_1w_cnt 
       >  , max_prodper_sell_price_1w_amt
       >  , min_prodper_sell_price_1w_amt
       >  , avg_prodper_sell_price_1w_amt
       >  , created_dttm
       >  , tot_prodperspendperitem_amt
       >  , tot_prodperdiscount_1w_amt
       >  , prodper_pctdiscount_1w
       >  , max_prodperpurch_1w_dttm
       > )   
       > partition
       > (
       >    period_code_type = 'fis_week_id'
       >  , period_code      = '201541'
       >  , prod_code_type   = 'prod_group_code'
       > ) 
       > select     
       >    t.prod_group_code        as prod_code
       >  , 1                             as prodper_live_1w_flg
       >  , max(t.weight_uom_code)        as weight_uom_code
       >  , max(t.volume_uom_code)        as volume_uom_code
       >  , min(cnt.bsk_prodpurch_1w_cnt) as bsk_prodperpurch_1w_cnt  
       >  , sum(t.spend_amt)              as tot_prodperspend_1w_amt 
       >  , sum(t.item_qty)               as prodper_sold_1w_cnt 
       >   -- find the sell price for a single item or measure of weight or volume
       >  , round( max(cast(t.spend_amt as decimal(38,10)) / cast(coalesce(t.weight_uom_qty, t.volume_uom_qty, t.item_qty) as decimal(38,10))), 2) as max_prodper_sell_price_1w_amt
       >  , round( min(cast(t.spend_amt as decimal(38,10)) / cast(coalesce(t.weight_uom_qty, t.volume_uom_qty, t.item_qty) as decimal(38,10))), 2) as min_prodper_sell_price_1w_amt
       >  , round(cast(avg(cast(t.spend_amt as decimal(38,10)) / cast(coalesce(t.weight_uom_qty, t.volume_uom_qty, t.item_qty) as decimal(38,10))) as decimal(38,10)), 2) as avg_prodper_sell_price_1w_amt
       >  , "2016-06-17 13:40:28"    as created_dttm 
       >  , round(cast(sum(cast(t.spend_amt as decimal(38,10)) / cast(coalesce(t.weight_uom_qty, t.volume_uom_qty, t.item_qty) as decimal(38,10))) as decimal(38,10)), 2) as tot_prodperspendperitem_amt
       >  , sum(t.item_discount_amt)      as tot_prodperdiscount_1w_amt
       >  , cast(cast(sum(t.item_discount_amt) as decimal(38,10)) / cast(sum(t.spend_amt) as decimal(38,10)) as decimal(38,10)) as prodper_pctdiscount_1w
       >  , max(t.transaction_dttm)             as max_prodperpurch_1w_dttm
       > from   
       >    met_ssepr.trans as t
       > left join 
       >    (
       >       select 
       >          prod_group_code
       >         ,count(distinct transaction_fid) as bsk_prodpurch_1w_cnt 
       >       from 
       >          met_ssepr.trans 
       >       where 
       >          cast(fis_week_id as string) = '201541'
       >       group by 
       >          prod_group_code
       >    ) as cnt
       >     on t.prod_group_code = cnt.prod_group_code
       > where   
       >    cast(t.fis_week_id as string) = '201541'
       > group by 
       >    t.prod_group_code
INFO - Ok
INFO - > compute incremental stats met_ssesu.trans_prod_period
       > partition 
       > (
       >    period_code_type = 'fis_week_id'
       >  , period_code      = '201541'
       >  , prod_code_type   = 'prod_group_code'
       > )
ERROR - Statement failed: Operation is in ERROR_STATE

Notice that I am computing incremental stats for the same partition that I just created.

 

Things to note:

  • This does not fail every time it runs. I'd say it fails ~20% of the time. 
  • If I restart the job, it succeeds. Note that a restart results in rerunning the insertion as well, not just the compute incremental stats operation.
  • When it does fail, it always fails with this same error.
  • We have numerous other similar scripts (i.e. that create a partition and then compute incremental stats on it) but this is the only one that fails in this fashion.
  • We have a multi-tenant setup where we run this same code on the same infrastructure for multiple clients. Each client has their own set of databases. This script runs without any issues for other clients, including those with significantly bigger data volumes.
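Since a plain restart reliably succeeds, my interim workaround is to retry the whole unit (the insert overwrite plus the compute incremental stats) when it fails. A rough sketch of that wrapper follows; the retry count, delay, and function names are arbitrary choices of mine, not anything Impala-specific:

```python
import time

def run_with_retry(job, attempts=3, delay_s=5):
    """Run `job` (a zero-arg callable performing the insert overwrite
    followed by compute incremental stats) and retry the *whole* unit
    on failure. Rerunning the insertion as well, not just the stats
    statement, is what makes the restart succeed in practice."""
    for attempt in range(1, attempts + 1):
        try:
            return job()
        except Exception:
            if attempt == attempts:
                raise  # out of attempts; surface the original error
            time.sleep(delay_s)
```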

Re: compute incremental stats fails after creating partition - error "no such file or directory"


OK, turns out someone (actually one of my colleagues) has already hit this: https://community.cloudera.com/t5/Interactive-Short-cycle-SQL/Partition-Spec-Does-Not-Exist-error/m-...

 

Could I ask if there's been any progress on fixing this?

 

Thx
