11-30-2015
01:50 PM
Hi Vikas,

In general, we recommend storing data on instance storage drives for EC2, since EBS volumes tend to be slower and some EBS volume types bill per I/O request. Instance storage is ephemeral, which means that whether the dir is named "/swap" or something else, its contents will be lost when the instance is stopped or terminated (a plain reboot keeps them). You should back up your data to a safe location before powering down your EC2 machine, as discussed here: http://www.cloudera.com/content/www/en-us/documentation/other/reference-architecture/PDF/cloudera_ref_arch_aws.pdf

The difference between dfs.data.dir and dfs.datanode.data.dir is that the first is a very old name used in CDH3 and perhaps earlier, while the second is the preferred current config name as of CDH4. They are both logically the same thing.

Thanks,
Darren
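
P.S. For reference, here is a rough sketch of how the current property name might look in hdfs-site.xml. The two mount points are hypothetical placeholders for wherever your instance storage drives are mounted, so adjust them to your own layout. The value is a comma-separated list of local directories.

```xml
<!-- hdfs-site.xml fragment (sketch). The paths are hypothetical examples,
     not taken from any real cluster. -->
<property>
  <!-- Preferred name as of CDH4; dfs.data.dir is the older CDH3-era name. -->
  <name>dfs.datanode.data.dir</name>
  <value>/data0/dfs/dn,/data1/dfs/dn</value>
</property>
```

If you manage the cluster through Cloudera Manager, you would normally set this in the HDFS service configuration rather than editing the file by hand.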