I have a 3-node cluster running CDH 5.9 on CentOS 6.7.
I am trying to load the RSS feed of a news website into HDFS. For this I wrote Java code to read the RSS feed, and used a Flume Exec source with an Avro sink on one node and an Avro source with an HDFS sink on another. Everything was running fine on Friday evening, and data was streaming.
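For readers following along, a two-agent pipeline of the shape described above could be configured roughly as follows. This is only a sketch, not the poster's actual configuration: the agent names (agentA, agentB), the host node2.example.com, port 4545, the jar path, and the HDFS path are all hypothetical placeholders.

```properties
# Node A: run the RSS reader and forward its output over Avro.
# All names, hosts, ports, and paths here are hypothetical.
agentA.sources = rss
agentA.channels = mem
agentA.sinks = fwd

# Exec source: each line the command prints becomes one Flume event.
agentA.sources.rss.type = exec
agentA.sources.rss.command = java -jar /opt/rss/rss-reader.jar
agentA.sources.rss.channels = mem

agentA.channels.mem.type = memory
agentA.channels.mem.capacity = 10000

# Avro sink: sends events to the Avro source on the second node.
agentA.sinks.fwd.type = avro
agentA.sinks.fwd.hostname = node2.example.com
agentA.sinks.fwd.port = 4545
agentA.sinks.fwd.channel = mem

# Node B: receive Avro events and write them to HDFS.
agentB.sources = in
agentB.channels = mem
agentB.sinks = out

# Avro source: listens for events sent by agentA's Avro sink.
agentB.sources.in.type = avro
agentB.sources.in.bind = node2.example.com
agentB.sources.in.port = 4545
agentB.sources.in.channels = mem

agentB.channels.mem.type = memory

# HDFS sink: writes plain event bodies into one directory per day.
agentB.sinks.out.type = hdfs
agentB.sinks.out.hdfs.path = /user/flume/rss/%Y-%m-%d
agentB.sinks.out.hdfs.fileType = DataStream
agentB.sinks.out.hdfs.useLocalTimeStamp = true
agentB.sinks.out.channel = mem
```

The date escapes in hdfs.path need a timestamp on each event, which is why useLocalTimeStamp is set in this sketch.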
I also installed the R package from the EPEL repo and installed the sparklyr libraries. My cluster is not Kerberized.
But when I came in on Monday, I saw that all of my HDFS directories had been deleted, and there was a weird directory named /NODATA4U_SECUREYOUR**bleep**
-bash-4.1$ hadoop fs -ls /
Found 3 items
drwxr-xr-x - hdfs supergroup 0 2017-01-06 22:04 /NODATA4U_SECUREYOUR**bleep**
drwxrwxrwx - hdfs supergroup 0 2017-01-06 22:05 /tmp
drwxrwxr-x - hdfs supergroup 0 2017-01-06 22:08 /user
I am not sure what to do. Even if I manage to restore my original state, how do I prevent this in the future?
I have no clue. Please help.
Thank you for reporting this situation. We take these issues seriously and are looking into it. We will report back when we have more information.
Please see the latest Cloudera advisory on the topic for more details on the issue.
We have just published a new Engineering blog post, "How to secure ‘Internet exposed’ Apache Hadoop", that may be of interest.