Created on 05-14-2015 11:47 AM - edited 09-16-2022 02:28 AM
Hi,
We recently upgraded our environment to CDH 5.4 and Cloudera Manager 5.4, but after the upgrade we get this error when an INSERT OVERWRITE writes 0 records from the source table into the target table via impala-shell.
The error & warning we get are as follows:
Query: insert overwrite tgt select * from src limit 0
WARNINGS: Could not list directory: hdfs://nameservice1/data/BI/SVUEDHD1_S/tgt
Error(11): Resource temporarily unavailable
Has anyone else faced a similar issue?
Thanks,
Ajay
Created 05-14-2015 01:30 PM
Hi Ajay,
we haven't seen this one, but it could be related to a recent fix:
https://issues.cloudera.org//browse/IMPALA-1438
That error typically means that HDFS is running out of file descriptors, e.g., for opening a socket. Do you think it's possible you are running out of file descriptors?
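If you want a quick way to check, something along these lines (a rough sketch assuming a Linux host and that the Impala daemon process is named impalad; the same check works for the DataNode process) shows the configured limit and the number of descriptors currently open:
pid=$(pgrep -x impalad | head -1)     # pick one impalad process
grep 'open files' /proc/$pid/limits   # configured soft/hard limit
ls /proc/$pid/fd | wc -l              # descriptors currently in use
If the second number is close to the limit, running out of file descriptors is a plausible explanation.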
I wasn't able to reproduce the issue locally, so I'm probably missing some steps.
Can you provide more details on what your table looks like and how it was created? Ideally a series of steps to reproduce the issue in your setup?
Thanks!
Alex
Created 05-14-2015 02:19 PM
Hi Alex,
I first created these 2 tables via Impala Shell (v2.2.0).
[eplnx070:21000] > CREATE TABLE svuedhd1_s.src (
> col1 INT,
> col2 STRING
> )
> STORED AS TEXTFILE
> LOCATION '/data/BI/SVUEDHD1_S/src';
Query: create TABLE svuedhd1_s.src (
col1 INT,
col2 STRING
)
STORED AS TEXTFILE
LOCATION '/data/BI/SVUEDHD1_S/src'
Fetched 0 row(s) in 0.44s
[eplnx070:21000] > CREATE TABLE svuedhd1_s.tgt (
> col1 INT,
> col2 STRING
> )
> STORED AS TEXTFILE
> LOCATION '/data/BI/SVUEDHD1_S/tgt';
Query: create TABLE svuedhd1_s.tgt (
col1 INT,
col2 STRING
)
STORED AS TEXTFILE
LOCATION '/data/BI/SVUEDHD1_S/tgt'
Fetched 0 row(s) in 0.48s.
Once that was done, here is the content of these directories in HDFS. Both are empty.
bash-4.1$ hadoop fs -ls /data/BI/SVUEDHD1_S/src
bash-4.1$ hadoop fs -ls /data/BI/SVUEDHD1_S/tgt
bash-4.1$
Next, I tried a plain INSERT into tgt from src, and that works fine in Impala.
[eplnx070:21000] > use svuedhd1_s;
Query: use svuedhd1_s
[eplnx070:21000] > select count(*) from tgt;
Query: select count(*) from tgt
+----------+
| count(*) |
+----------+
| 0 |
+----------+
Fetched 1 row(s) in 4.15s
[eplnx070:21000] > select count(*) from src;
Query: select count(*) from src
+----------+
| count(*) |
+----------+
| 0 |
+----------+
Fetched 1 row(s) in 5.11s
[eplnx070:21000] > insert into tgt select * from src;
Query: insert into tgt select * from src
Inserted 0 row(s) in 0.53s
[eplnx070:21000] >
But when I do an INSERT OVERWRITE, I get the error and the warning.
[eplnx070:21000] > insert overwrite tgt select * from src;
Query: insert overwrite tgt select * from src
WARNINGS: Could not list directory: hdfs://nameservice1/data/BI/SVUEDHD1_S/tgt
Error(11): Resource temporarily unavailable
[eplnx070:21000] >
On the flipside, if I create these tables in Impala and then do an INSERT OVERWRITE using Hive, it works perfectly fine without any issue.
I will also send the JIRA link and the file descriptor question to our admin to see if that could be the problem.
Thanks,
Ajay
Created 05-14-2015 05:49 PM
Hi Ajay,
Thanks for the update. I was able to reproduce the WARNING in my setup, but not the ERROR.
After some digging in our code, I've found a few interesting things that lead me to believe that this warning/error is simply incorrectly displayed.
If you are curious, I believe the core issue lies in the hdfsListDirectory() API call of libhdfs. Impala assumes that when it returns NULL there was an error (because that's what the API doc says). But when reading the libhdfs code I noticed that NULL is also returned if a directory is empty, so Impala will print that warning incorrectly.
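To illustrate what I mean (a minimal sketch against the libhdfs C API, not the actual Impala code), a caller has to clear and re-check errno to tell an empty directory apart from a real failure, because hdfsListDirectory() returns NULL in both cases:
#include <errno.h>
#include <stdio.h>
#include "hdfs.h"  /* libhdfs C API header */

/* Lists 'path' on the given HDFS connection; returns -1 only on a real error. */
int list_dir(hdfsFS fs, const char *path) {
    int num_entries = 0;
    errno = 0;  /* clear errno so we can tell error from "empty" below */
    hdfsFileInfo *entries = hdfsListDirectory(fs, path, &num_entries);
    if (entries == NULL) {
        if (errno != 0) {
            fprintf(stderr, "Could not list directory: %s\n", path);
            return -1;  /* genuine failure */
        }
        /* NULL with errno == 0: the directory is simply empty */
        return 0;
    }
    for (int i = 0; i < num_entries; ++i) {
        printf("%s\n", entries[i].mName);
    }
    hdfsFreeFileInfo(entries, num_entries);
    return 0;
}
Presumably the fix is to distinguish these two cases rather than treating NULL as an error unconditionally.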
We'll keep digging and file JIRAs as appropriate. I will update this thread then.
For now, I think it's safe to say that this message is annoying and wrong, but not dangerous.
Alex
Created 05-14-2015 10:20 PM
Thanks for the analysis, Alex. I filed a JIRA for the HDFS bug https://issues.apache.org/jira/browse/HDFS-8407
and an Impala JIRA https://issues.cloudera.org/browse/IMPALA-2008, whose resolution will depend on how the HDFS fix turns out.
Created 05-15-2015 12:01 AM
Hi Ajay,
yes, I can see how that would be a problem. I cannot really promise any concrete release date, but since this is a behavioral regression, I'd say we should treat it with a high priority.
I'd appreciate it if you could comment on the Impala JIRA repeating what you told me about the return code being the actual issue, and that Impala has regressed in that sense.
Btw, as a workaround, you might try the "--ignore_query_failure" option in the shell. Admittedly not ideal, but maybe there's a way to make it work in your workflow.
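For example, if you run the statement from a script file, an invocation along these lines (the file name here is just a placeholder) lets the shell continue past the spurious failure:
impala-shell -i eplnx070:21000 --ignore_query_failure -f load_tgt.sql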
Alex
Created 05-15-2015 06:37 AM
Thanks a lot Alex. I shall go ahead and comment on the Impala JIRA to explain the same situation.
Thanks once again for all the help.
Regards,
Ajay
Created 02-05-2019 02:14 PM
Hi Alex, Ajay,
Can you please help me here? I'm facing a similar issue. I have attached the query and error as well.
Please help me out.
Thanks
Yasmin