Weird HDFS Shell Issues When Testing For Existing Files


Directory does not exist but can be deleted
1. Exception in thread "main" org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfs://host01-ns/etl/ABC/XYZ/2014-08-13/mytestdirectory already exists
2. hadoop fs -ls /etl/ABC/XYZ/2014-08-13/mytestdirectory ( nothing returned)
3. hadoop fs -test -e /etl/ABC/XYZ/2014-08-13/mytestdirectory ( nothing returned)
4. hadoop fs -rm -r /etl/ABC/XYZ/2014-08-13/mytestdirectory
14/08/14 23:07:49 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 1440 minutes, Emptier interval = 0 minutes.
Moved: 'hdfs://host01-ns/etl/ABC/XYZ/2014-08-13/mytestdirectory' to trash at: hdfs://host01-ns/user/testuser/.Trash/Current
(How did it delete a directory that supposedly did not exist?)

This is causing all of our scripts that check for existing files to fail.

1. hadoop fs -test -e /etl/ABC/XYZ/2014-08-13/mytestfile.BIN ( nothing returned )
2. hadoop fs -ls /etl/ABC/XYZ/2014-08-13/mytestfile.BIN
Found 1 items
-rw-r--r-- 3 testuser supergroup 2087844001 2014-08-13 23:52 /etl/ABC/XYZ/2014-08-13/mytestfile.BIN


Re: Weird HDFS Shell Issues When Testing For Existing Files

Empty directories show exactly the behaviour you saw:

$ hadoop fs -mkdir /tmp/empty


# ls returns nothing since directory is empty
$ hadoop fs -ls /tmp/empty



Both tests pass: they print nothing, and echo $? reports an exit status of 0.

[hdfs@host-10-16-8-143 ~]$ hadoop fs -test -e /tmp/empty
[hdfs@host-10-16-8-143 ~]$ echo $?
[hdfs@host-10-16-8-143 ~]$ hadoop fs -test -d /tmp/empty
[hdfs@host-10-16-8-143 ~]$ echo $?


Whereas /tmp/empty1 does not exist, so ls reports an error and the test exits with a non-zero status:

$ hadoop fs -ls /tmp/empty1
ls: `/tmp/empty1': No such file or directory

$ hadoop fs -test -e /tmp/empty1
$ echo $?

So in your case, it appears that "/etl/ABC/XYZ/2014-08-13/mytestdirectory" did exist, which is why you were able to delete it. "hadoop fs -test" prints nothing whether or not the path exists, so when you check for existence, don't look for output from it. Check the return code with $? instead.
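
As a rough illustration of that advice, a script could wrap the check in a small function and branch on its exit status. This is only a sketch: the names hdfs_path_exists and describe_path are made up for this example, it assumes the hadoop command is on the PATH, and a non-zero status may also mean the command itself failed (for example, the cluster was unreachable), not only that the path is missing.

```shell
#!/bin/sh
# Sketch: branch on the exit status of "hadoop fs -test", not on its output.
# hdfs_path_exists and describe_path are hypothetical helper names.

# Exits with 0 if the path exists in HDFS, non-zero otherwise.
hdfs_path_exists() {
    hadoop fs -test -e "$1"
}

# Prints one line saying whether the given path exists.
describe_path() {
    if hdfs_path_exists "$1"; then
        echo "exists: $1"
    else
        echo "missing: $1"
    fi
}
```

A script would then call, for example, describe_path /etl/ABC/XYZ/2014-08-13/mytestdirectory, or use hdfs_path_exists directly in an if statement before launching a job that must not overwrite its output directory.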

Gautam Gopalakrishnan