Weird HDFS Shell Issues When Testing For Existing Files

Directory does not exist but can be deleted
1. Exception in thread "main" org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfs://host01-ns/etl/ABC/XYZ/2014-08-13/mytestdirectory already exists
2. hadoop fs -ls /etl/ABC/XYZ/2014-08-13/mytestdirectory (nothing returned)
3. hadoop fs -test -e /etl/ABC/XYZ/2014-08-13/mytestdirectory (nothing returned)
4. hadoop fs -rm -r /etl/ABC/XYZ/2014-08-13/mytestdirectory
14/08/14 23:07:49 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 1440 minutes, Emptier interval = 0 minutes.
Moved: 'hdfs://host01-ns/etl/ABC/XYZ/2014-08-13/mytestdirectory' to trash at: hdfs://host01-ns/user/testuser/.Trash/Current
(How did it delete a directory that supposedly did not exist?)

This is causing all of our scripts that check for existing files to fail.

1. hadoop fs -test -e /etl/ABC/XYZ/2014-08-13/mytestfile.BIN (nothing returned)
2. hadoop fs -ls /etl/ABC/XYZ/2014-08-13/mytestfile.BIN
Found 1 items
-rw-r--r-- 3 testuser supergroup 2087844001 2014-08-13 23:52 /etl/ABC/XYZ/2014-08-13/mytestfile.BIN


Empty directories show the same behaviour you saw:

$ hadoop fs -mkdir /tmp/empty


# ls returns nothing since directory is empty
$ hadoop fs -ls /tmp/empty



Both the -e and -d tests pass, i.e. they print nothing and exit with status 0:

[hdfs@host-10-16-8-143 ~]$ hadoop fs -test -e /tmp/empty
[hdfs@host-10-16-8-143 ~]$ echo $?
[hdfs@host-10-16-8-143 ~]$ hadoop fs -test -d /tmp/empty
[hdfs@host-10-16-8-143 ~]$ echo $?


Whereas /tmp/empty1 does not exist, so ls reports an error and the test exits with a non-zero status:

$ hadoop fs -ls /tmp/empty1
ls: `/tmp/empty1': No such file or directory

$ hadoop fs -test -e /tmp/empty1
$ echo $?

So in your case, it appears "/etl/ABC/XYZ/2014-08-13/mytestdirectory" did exist (most likely as an empty directory), which is why you were able to delete it. When you check for existence, don't look for output from "hadoop fs -test", since it prints nothing whether or not the path exists. Check the return code with $? instead.
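
A script could wrap that check roughly as follows. This is only a minimal sketch, assuming a POSIX shell with the hadoop client on the PATH; the function names hdfs_path_exists and check_output_dir are hypothetical and chosen for illustration:

```shell
# Hypothetical helper: succeeds (exit status 0) if the HDFS path exists.
# "hadoop fs -test -e" prints nothing, so only its exit status is used.
hdfs_path_exists() {
    hadoop fs -test -e "$1"
}

# Hypothetical helper: reports whether a path exists, based on the
# exit status of the test rather than on any output.
check_output_dir() {
    if hdfs_path_exists "$1"; then
        echo "exists: $1"
    else
        echo "missing: $1"
    fi
}
```

Calling check_output_dir /etl/ABC/XYZ/2014-08-13/mytestdirectory would then print one of the two messages depending on the exit status, so an empty directory is no longer mistaken for a missing one.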

Gautam Gopalakrishnan
