<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Duplicate Directories in HDFS in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Duplicate-Directories-in-HDFS/m-p/37319#M19086</link>
    <description>&lt;P&gt;Thanks, Denloe, for your response. I have one more doubt.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;What if there is some other text following that ^M? Should we use&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;hdfs dfs -rmdir "/a/b/c/d//20160205^Msomepart", or do we need to use some escape sequence for this?&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Moreover, when I press ctrl+v or ctrl+m the command executes immediately (it does not let me type the second word).&lt;/SPAN&gt;&lt;/P&gt;</description>
    <pubDate>Thu, 11 Feb 2016 17:23:48 GMT</pubDate>
    <dc:creator>sathishkumar</dc:creator>
    <dc:date>2016-02-11T17:23:48Z</dc:date>
    <item>
      <title>Duplicate Directories in HDFS</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Duplicate-Directories-in-HDFS/m-p/37308#M19083</link>
      <description>&lt;P&gt;Hi All,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Our application team created HDFS directories with the script below.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;hadoop fs -mkdir /a/b/c/d/20160208&lt;BR /&gt;hadoop fs -mkdir /a/b/c/d/20160208/s&lt;BR /&gt;hadoop fs -mkdir /a/b/c/d/20160208/s/inputmap&lt;BR /&gt;hadoop fs -mkdir /a/b/c/d/20160208/s/temp&lt;BR /&gt;hadoop fs -mkdir /a/b/c/d/20160208/s/map&lt;BR /&gt;hadoop fs -mkdir /a/b/c/d/20160208/s/input&lt;BR /&gt;hadoop fs -copyFromLocal /x/y/z/20160208.dat /a/b/c/d/20160208/s/inputmap&lt;/P&gt;&lt;P&gt;echo "Setup Complete"&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The directories got created, but accessing them throws an error.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;hdfs@hostname$ hadoop fs -ls /a/b/c/d/&lt;BR /&gt;Found 20 items&lt;BR /&gt;drwxr-xr-x - user group 0 2016-01-27 09:10 /a/b/c/d/20141211&lt;BR /&gt;drwxr-xr-x - user group 0 2016-01-06 01:03 /a/b/c/d/20141212&lt;BR /&gt;drwxr-xr-x - user group 0 2016-01-06 01:09 /a/b/c/d/20141213&lt;BR /&gt;drwxr-xr-x - user group 0 2015-11-12 08:53 /a/b/c/d/20151106&lt;BR /&gt;drwxr-xr-x - user group 0 2016-01-12 01:48 /a/b/c/d/20151118&lt;BR /&gt;drwxr-xr-x - user group 0 2015-12-04 04:21 /a/b/c/d/20151130&lt;BR /&gt;drwxrwxr-x - user group 0 2016-01-12 10:48 /a/b/c/d/20151221&lt;BR /&gt;drwxr-xr-x - user group 0 2016-01-19 11:23 /a/b/c/d/20160111&lt;BR /&gt;drwxr-xr-x - user group 0 2016-01-27 14:56 /a/b/c/d/20160112&lt;BR /&gt;drwxr-xr-x - user group 0 2016-02-02 16:12 /a/b/c/d/20160125&lt;BR /&gt;drwxr-xr-x - user group 0 2016-02-08 12:41 /a/b/c/d/20160126&lt;BR /&gt;drwxr-xr-x - user group 0 2016-02-08 10:26 /a/b/c/d/20160127&lt;BR /&gt;drwxr-xr-x - user group 0 2016-01-29 10:48 /a/b/c/d/20160129&lt;BR /&gt;drwxr-xr-x - user group 0 2016-02-09 02:43 /a/b/c/d/20160203&lt;BR /&gt;drwxr-xr-x - user group 0 2016-02-09 02:42 /a/b/c/d/20160204&lt;BR /&gt;drwxr-xr-x - user group 0 2016-02-08 15:38 /a/b/c/d/20160205&lt;BR /&gt;drwxr-xr-x - user group 0 2016-02-08 09:02 /a/b/c/d/20160205&lt;BR /&gt;drwxr-xr-x - user group 0 2016-02-08 07:00 /a/b/c/d/20160206&lt;BR /&gt;drwxr-xr-x - user group 0 2016-02-09 17:11 /a/b/c/d/20160208&lt;BR /&gt;drwxr-xr-x - user group 0 2016-02-08 11:07 /a/b/c/d/20160208&lt;BR /&gt;&lt;FONT color="#FF0000"&gt;hdfs@hostname$ hadoop fs -ls /a/b/c/d/20160206&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#FF0000"&gt;ls: `/a/b/c/d/20160206': No such file or directory&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;When we piped the ls output through cat -v, we found that a special character had been inserted into some of the directory names, as shown below.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;hdfs@hostname$ hadoop fs -ls /a/b/c/d/ | cat -v&lt;BR /&gt;Found 20 items&lt;BR /&gt;drwxr-xr-x - user group 0 2016-01-27 09:10 /a/b/c/d//20141211&lt;BR /&gt;drwxr-xr-x - user group 0 2016-01-06 01:03 /a/b/c/d//20141212&lt;BR /&gt;drwxr-xr-x - user group 0 2016-01-06 01:09 /a/b/c/d//20141213&lt;BR /&gt;drwxr-xr-x - user group 0 2015-11-12 08:53 /a/b/c/d//20151106&lt;BR /&gt;drwxr-xr-x - user group 0 2016-01-12 01:48 /a/b/c/d//20151118&lt;BR /&gt;drwxr-xr-x - user group 0 2015-12-04 04:21 /a/b/c/d//20151130&lt;BR /&gt;drwxrwxr-x - user group 0 2016-01-12 10:48 /a/b/c/d//20151221&lt;BR /&gt;drwxr-xr-x - user group 0 2016-01-19 11:23 /a/b/c/d//20160111&lt;BR /&gt;drwxr-xr-x - user group 0 2016-01-27 14:56 /a/b/c/d//20160112&lt;BR /&gt;drwxr-xr-x - user group 0 2016-02-02 16:12 /a/b/c/d//20160125&lt;BR /&gt;drwxr-xr-x - user group 0 2016-02-08 12:41 /a/b/c/d//20160126&lt;BR /&gt;drwxr-xr-x - user group 0 2016-02-08 10:26 /a/b/c/d//20160127&lt;BR /&gt;drwxr-xr-x - user group 0 2016-01-29 10:48 /a/b/c/d//20160129&lt;BR /&gt;drwxr-xr-x - user group 0 2016-02-09 02:43 /a/b/c/d//20160203&lt;BR /&gt;drwxr-xr-x - user group 0 2016-02-09 02:42 /a/b/c/d//20160204&lt;BR /&gt;drwxr-xr-x - user group 0 2016-02-08 15:38 /a/b/c/d//20160205&lt;BR /&gt;&lt;FONT color="#FF0000"&gt;drwxr-xr-x - user group 0 2016-02-08 09:02 /a/b/c/d//20160205^M&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#FF0000"&gt;drwxr-xr-x - user group 0 2016-02-08 07:00 /a/b/c/d//20160206^M&lt;/FONT&gt;&lt;BR /&gt;drwxr-xr-x - user group 0 2016-02-09 17:11 /a/b/c/d//20160208&lt;BR /&gt;&lt;FONT color="#FF0000"&gt;drwxr-xr-x - user group 0 2016-02-08 11:07 /a/b/c/d//20160208^M&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Now I want to delete these duplicate entries. Can anyone help me with this?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;&lt;P&gt;Srini&lt;/P&gt;</description>
      <pubDate>Fri, 16 Sep 2022 10:03:36 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Duplicate-Directories-in-HDFS/m-p/37308#M19083</guid>
      <dc:creator>Srini</dc:creator>
      <dc:date>2022-09-16T10:03:36Z</dc:date>
    </item>
    <item>
      <title>Re: Duplicate Directories in HDFS</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Duplicate-Directories-in-HDFS/m-p/37313#M19084</link>
      <description>&lt;P&gt;You would handle this in the same way as if the issue occurred on a Linux filesystem. &amp;nbsp; Use quotes around the filename and &lt;STRONG&gt;ctrl-v&lt;/STRONG&gt; to insert the special characters.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;In this case, I type &lt;STRONG&gt;ctrl-v&lt;/STRONG&gt; then &lt;STRONG&gt;ctrl-m&lt;/STRONG&gt;&amp;nbsp;to insert ^M into my strings.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;$ hdfs dfs -put /etc/group "/tmp/abc^M"

$ hdfs dfs -ls /tmp
Found 4 items
drwxrwxrwx   - hdfs   supergroup          0 2016-02-11 11:29 /tmp/.cloudera_health_monitoring_canary_files
-rw-r--r--   3 hdfs   supergroup        954 2016-02-11 11:30 /tmp/abc
drwx-wx-wx   - hive   supergroup          0 2016-01-11 12:10 /tmp/hive
drwxrwxrwt   - mapred hadoop              0 2016-01-11 12:08 /tmp/logs

$ hdfs dfs -ls /tmp | cat -v
Found 4 items
drwxrwxrwx   - hdfs   supergroup          0 2016-02-11 11:30 /tmp/.cloudera_health_monitoring_canary_files
-rw-r--r--   3 hdfs   supergroup        954 2016-02-11 11:30 /tmp/abc^M
drwx-wx-wx   - hive   supergroup          0 2016-01-11 12:10 /tmp/hive
drwxrwxrwt   - mapred hadoop              0 2016-01-11 12:08 /tmp/logs

$ hdfs dfs -mv "/tmp/abc^M" /tmp/abc

$ hdfs dfs -ls /tmp | cat -v
Found 4 items
drwxrwxrwx   - hdfs   supergroup          0 2016-02-11 11:31 /tmp/.cloudera_health_monitoring_canary_files
-rw-r--r--   3 hdfs   supergroup        954 2016-02-11 11:30 /tmp/abc
drwx-wx-wx   - hive   supergroup          0 2016-01-11 12:10 /tmp/hive
drwxrwxrwt   - mapred hadoop              0 2016-01-11 12:08 /tmp/logs

&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 11 Feb 2016 16:40:34 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Duplicate-Directories-in-HDFS/m-p/37313#M19084</guid>
      <dc:creator>denloe</dc:creator>
      <dc:date>2016-02-11T16:40:34Z</dc:date>
    </item>
    <item>
      <title>Re: Duplicate Directories in HDFS</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Duplicate-Directories-in-HDFS/m-p/37316#M19085</link>
      <description>&lt;P&gt;In my example I used -mv. &amp;nbsp;You would use -rmdir.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;FONT face="courier new,courier"&gt;hdfs dfs -rmdir "/a/b/c/d//20160205^M"&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Remember, to get "^M" type &lt;STRONG&gt;ctrl-v&lt;/STRONG&gt; &lt;STRONG&gt;ctrl-m&lt;/STRONG&gt;.&lt;/P&gt;</description>
      <pubDate>Thu, 11 Feb 2016 16:48:51 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Duplicate-Directories-in-HDFS/m-p/37316#M19085</guid>
      <dc:creator>denloe</dc:creator>
      <dc:date>2016-02-11T16:48:51Z</dc:date>
    </item>
    <item>
      <title>Re: Duplicate Directories in HDFS</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Duplicate-Directories-in-HDFS/m-p/37319#M19086</link>
      <description>&lt;P&gt;Thanks, Denloe, for your response. I have one more doubt.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;What if there is some other text following that ^M? Should we use&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;hdfs dfs -rmdir "/a/b/c/d//20160205^Msomepart", or do we need to use some escape sequence for this?&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Moreover, when I press ctrl+v or ctrl+m the command executes immediately (it does not let me type the second word).&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 11 Feb 2016 17:23:48 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Duplicate-Directories-in-HDFS/m-p/37319#M19086</guid>
      <dc:creator>sathishkumar</dc:creator>
      <dc:date>2016-02-11T17:23:48Z</dc:date>
    </item>
    <item>
      <title>Re: Duplicate Directories in HDFS</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Duplicate-Directories-in-HDFS/m-p/37334#M19087</link>
      <description>&lt;P&gt;The non-printable character may be located anywhere in the filename. &amp;nbsp;You just need to insert it in the appropriate location when quoting the filename.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Using ctrl-v to insert special characters is the default for the bash shell, but your terminal emulator (especially if you are coming in from Windows) may be catching it instead.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Try using shift-insert instead of ctrl-v. &amp;nbsp;If that fails, you may need to find an alternate method to embed control characters, such as using vi to create a bash script and inserting them there.&lt;/P&gt;</description>
      <pubDate>Thu, 11 Feb 2016 19:21:11 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Duplicate-Directories-in-HDFS/m-p/37334#M19087</guid>
      <dc:creator>denloe</dc:creator>
      <dc:date>2016-02-11T19:21:11Z</dc:date>
    </item>
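If the emulator still intercepts the keystrokes, the control character can also be generated by the shell itself rather than typed. A minimal bash sketch (the path is the thread's example; the hadoop command is commented because it assumes a live cluster):

```shell
# printf interprets the \r escape itself, so the carriage return never
# has to be typed at the keyboard or pasted through the emulator.
printf -v name '/a/b/c/d/20160206\r'   # bash-only: -v stores into $name

# Verify: cat -v renders the embedded carriage return as ^M.
printf '%s' "$name" | cat -v

# Then pass the variable, quoted, to the HDFS command (cluster assumed):
#   hadoop fs -rm -r "$name"
```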
    <item>
      <title>Re: Duplicate Directories in HDFS</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Duplicate-Directories-in-HDFS/m-p/37358#M19088</link>
      <description>I prefer using the simpler bash syntax of special escaped characters,&lt;BR /&gt;if it helps:&lt;BR /&gt;&lt;BR /&gt;We know that ^M is the same as \r, which makes sense if you used Windows&lt;BR /&gt;Notepad to write the commands but forgot to convert the file via dos2unix:&lt;BR /&gt;&lt;BR /&gt;~&amp;gt; echo $'\x0d' | cat -v&lt;BR /&gt;^M&lt;BR /&gt;~&amp;gt; echo -n $'\x0d' | od -c&lt;BR /&gt;0000000 \r&lt;BR /&gt;0000002&lt;BR /&gt;&lt;BR /&gt;(The \x0D or \x0d is the hex equivalent of \r, per&lt;BR /&gt;&lt;A href="http://www.asciitable.com/" target="_blank"&gt;http://www.asciitable.com/&lt;/A&gt; (carriage return))&lt;BR /&gt;&lt;BR /&gt;Therefore, you can use the $'' syntax to write a string that includes the&lt;BR /&gt;escape:&lt;BR /&gt;&lt;BR /&gt;~&amp;gt; hadoop fs -ls $'/a/b/c/d/20160206\r'&lt;BR /&gt;Or,&lt;BR /&gt;~&amp;gt; hadoop fs -ls $'/a/b/c/d/20160206\x0d'&lt;BR /&gt;&lt;BR /&gt;This works well regardless of the terminal emulator you are using, because&lt;BR /&gt;we're escaping based on representation rather than relying on the emulator&lt;BR /&gt;to understand the characters as they are typed.&lt;BR /&gt;</description>
      <pubDate>Fri, 12 Feb 2016 05:29:14 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Duplicate-Directories-in-HDFS/m-p/37358#M19088</guid>
      <dc:creator>Harsh J</dc:creator>
      <dc:date>2016-02-12T05:29:14Z</dc:date>
    </item>
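The $'' quoting above can be sanity-checked locally before touching the cluster, since the expansion happens entirely in bash. A small sketch using the thread's example paths (the HDFS commands are commented; they assume a live cluster):

```shell
# bash expands \r (or its hex form \x0d) inside $'...' before the
# command runs, so the argument carries a literal carriage return.
name=$'/a/b/c/d/20160206\r'

# cat -v renders the carriage return as ^M, confirming the expansion.
printf '%s' "$name" | cat -v

# The same strings work directly in the HDFS commands (cluster assumed):
#   hadoop fs -ls   $'/a/b/c/d/20160206\r'
#   hdfs dfs -rmdir $'/a/b/c/d//20160205\r'
```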
    <item>
      <title>Re: Duplicate Directories in HDFS</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Duplicate-Directories-in-HDFS/m-p/61198#M19089</link>
      <description>&lt;P&gt;There is a simple method to remove those.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;1. List those directories in a text file, like below:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;hadoop fs -ls /path &amp;gt; list&lt;/P&gt;&lt;P&gt;2. cat -t list will show you the positions of the duplicates with the junk character&lt;/P&gt;&lt;P&gt;3. open another shell and comment lines out with # to identify the exact ones&lt;/P&gt;&lt;P&gt;4. cat -t the file again to confirm you commented out the culprits&lt;/P&gt;&lt;P&gt;5. remove the original folders from the list&lt;/P&gt;&lt;P&gt;6. for i in `cat list`;&lt;/P&gt;&lt;P&gt;do hadoop fs -rmr $i;&lt;/P&gt;&lt;P&gt;done&lt;/P&gt;</description>
      <pubDate>Tue, 24 Oct 2017 07:12:55 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Duplicate-Directories-in-HDFS/m-p/61198#M19089</guid>
      <dc:creator>ianeeshps</dc:creator>
      <dc:date>2017-10-24T07:12:55Z</dc:date>
    </item>
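The list-and-loop steps above can be made less manual by filtering for names that actually contain a carriage return instead of commenting lines out by hand. A bash sketch, simulated here on a plain text listing (the hadoop loop is commented; it assumes a live cluster):

```shell
# Simulate the saved listing: two of the three names are broken
# (they end in a literal carriage return).
printf '/a/b/c/d/20160208\n/a/b/c/d/20160205\r\n/a/b/c/d/20160208\r\n' > list

# Keep only the entries containing a carriage return; these are the
# phantom duplicates that a plain path cannot address.
grep $'\r' list > broken
wc -l < broken     # count of broken entries

# On a real cluster, remove each broken path (hadoop CLI assumed):
#   hadoop fs -ls /a/b/c/d/ | awk '{print $NF}' | grep $'\r' | \
#     while IFS= read -r dir; do hadoop fs -rm -r "$dir"; done
rm -f list broken
```

Looping with `while read -r` instead of `for i in \`cat list\`` keeps each full line intact, so the trailing carriage return is passed to hadoop rather than being split away by word splitting.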
  </channel>
</rss>

