Created on 05-14-2015 11:47 AM - edited 09-16-2022 02:28 AM
Hi,
We recently upgraded our environment to CDH 5.4 and Cloudera Manager 5.4, but after the upgrade we get this error when an INSERT OVERWRITE writes 0 records from the source table into the target table via impala-shell.
The error & warning we get are as follows:
Query: insert overwrite tgt select * from src limit 0
WARNINGS: Could not list directory: hdfs://nameservice1/data/BI/SVUEDHD1_S/tgt
Error(11): Resource temporarily unavailable
Has anyone else faced a similar issue?
Thanks,
Ajay
Created 05-14-2015 01:30 PM
Hi Ajay,
we haven't seen this one, but it could be related to a recent fix:
https://issues.cloudera.org//browse/IMPALA-1438
That error typically means that HDFS is running out of file descriptors, e.g., for opening a socket. Do you think it's possible you are running out of file descriptors?
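If you want a quick way to check, something along these lines (a rough sketch assuming a Linux host and that the Impala daemon process is named impalad; the same check works for the DataNode process) shows the configured limit and the number of descriptors currently open:
pid=$(pgrep -x impalad | head -1)     # pick one impalad process
grep 'open files' /proc/$pid/limits   # configured soft/hard limit
ls /proc/$pid/fd | wc -l              # descriptors currently in use
If the second number is close to the limit, running out of file descriptors is a plausible explanation.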
I wasn't able to reproduce the issue locally, so I'm probably missing some steps.
Can you provide more details on what your table looks like and how it was created? Ideally a series of steps to reproduce the issue in your setup?
Thanks!
Alex
Created 05-14-2015 02:19 PM
Hi Alex,
I first created these 2 tables via Impala Shell (v2.2.0).
[eplnx070:21000] > CREATE TABLE svuedhd1_s.src (
> col1 INT,
> col2 STRING
> )
> STORED AS TEXTFILE
> LOCATION '/data/BI/SVUEDHD1_S/src';
Query: create TABLE svuedhd1_s.src (
col1 INT,
col2 STRING
)
STORED AS TEXTFILE
LOCATION '/data/BI/SVUEDHD1_S/src'
Fetched 0 row(s) in 0.44s
[eplnx070:21000] > CREATE TABLE svuedhd1_s.tgt (
> col1 INT,
> col2 STRING
> )
> STORED AS TEXTFILE
> LOCATION '/data/BI/SVUEDHD1_S/tgt';
Query: create TABLE svuedhd1_s.tgt (
col1 INT,
col2 STRING
)
STORED AS TEXTFILE
LOCATION '/data/BI/SVUEDHD1_S/tgt'
Fetched 0 row(s) in 0.48s.
Once that was done, here is the content of these directories in HDFS. Both are empty.
bash-4.1$ hadoop fs -ls /data/BI/SVUEDHD1_S/src
bash-4.1$ hadoop fs -ls /data/BI/SVUEDHD1_S/tgt
bash-4.1$
Next, I tried a plain INSERT into tgt from src, and that works fine in Impala.
[eplnx070:21000] > use svuedhd1_s;
Query: use svuedhd1_s
[eplnx070:21000] > select count(*) from tgt;
Query: select count(*) from tgt
+----------+
| count(*) |
+----------+
| 0 |
+----------+
Fetched 1 row(s) in 4.15s
[eplnx070:21000] > select count(*) from src;
Query: select count(*) from src
+----------+
| count(*) |
+----------+
| 0 |
+----------+
Fetched 1 row(s) in 5.11s
[eplnx070:21000] > insert into tgt select * from src;
Query: insert into tgt select * from src
Inserted 0 row(s) in 0.53s
[eplnx070:21000] >
But when I do an INSERT OVERWRITE, I get the error and the warning.
[eplnx070:21000] > insert overwrite tgt select * from src;
Query: insert overwrite tgt select * from src
WARNINGS: Could not list directory: hdfs://nameservice1/data/BI/SVUEDHD1_S/tgt
Error(11): Resource temporarily unavailable
[eplnx070:21000] >
On the flipside, if I create these tables in Impala and then do an INSERT OVERWRITE using Hive, it works perfectly fine without any issue.
I will also send the JIRA link and the file descriptor question to our admin to see if that could be the problem.
Thanks,
Ajay
Created 05-14-2015 05:49 PM
Hi Ajay,
Thanks for the update. I was able to reproduce the WARNING in my setup, but not the ERROR.
After some digging in our code, I've found a few interesting things that lead me to believe that this warning/error is simply incorrectly displayed.
If you are curious, I believe the core issue lies in the hdfsListDirectory() API call of libhdfs. Impala assumes that when it returns NULL there was an error (because that's what the API doc says). But when reading the libhdfs code I noticed that NULL is also returned if a directory is empty, so Impala will print that warning incorrectly.
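To illustrate what I mean (a minimal sketch against the libhdfs C API, not the actual Impala code), a caller has to clear and re-check errno to tell an empty directory apart from a real failure, because hdfsListDirectory() returns NULL in both cases:
#include <errno.h>
#include <stdio.h>
#include "hdfs.h"  /* libhdfs C API header */

/* Lists 'path' on the given HDFS connection; returns -1 only on a real error. */
int list_dir(hdfsFS fs, const char *path) {
    int num_entries = 0;
    errno = 0;  /* clear errno so we can tell error from "empty" below */
    hdfsFileInfo *entries = hdfsListDirectory(fs, path, &num_entries);
    if (entries == NULL) {
        if (errno != 0) {
            fprintf(stderr, "Could not list directory: %s\n", path);
            return -1;  /* genuine failure */
        }
        /* NULL with errno == 0: the directory is simply empty */
        return 0;
    }
    for (int i = 0; i < num_entries; ++i) {
        printf("%s\n", entries[i].mName);
    }
    hdfsFreeFileInfo(entries, num_entries);
    return 0;
}
Presumably the fix is to distinguish these two cases rather than treating NULL as an error unconditionally.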
We'll keep digging and file JIRAs as appropriate. I will update this thread then.
For now, I think it's safe to say that this message is annoying and wrong, but not dangerous.
Alex
Created 05-14-2015 10:20 PM
Thanks for the analysis, Alex. I filed a JIRA for the HDFS bug https://issues.apache.org/jira/browse/HDFS-8407
and an Impala JIRA https://issues.cloudera.org/browse/IMPALA-2008, whose resolution will depend on how the HDFS fix turns out.
Created 05-15-2015 12:01 AM
Hi Ajay,
yes, I can see how that would be a problem. I cannot really promise any concrete release date, but since this is a behavioral regression, I'd say we should treat it with a high priority.
I'd appreciate it if you could comment on the Impala JIRA repeating what you told me about the return code being the actual issue, and that Impala has regressed in that sense.
Btw, as a workaround, you might try the "--ignore_query_failure" option in the shell. Admittedly not ideal, but maybe there's a way to make it work in your workflow.
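For example, if you run the statement from a script file, an invocation along these lines (the file name here is just a placeholder) lets the shell continue past the spurious failure:
impala-shell -i eplnx070:21000 --ignore_query_failure -f load_tgt.sql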
Alex
Created 05-15-2015 06:37 AM
Thanks a lot Alex. I shall go ahead and comment on the Impala JIRA to explain the same situation.
Thanks once again for all the help.
Regards,
Ajay
Created 02-05-2019 02:14 PM
Hi Alex, Ajay,
Can you please help me here? I'm facing a similar issue. I have attached the query and error as well.
Please help me out.
Thanks
Yasmin