04-25-2014 09:03 AM - edited 04-25-2014 02:40 PM
It's possible to specify a default replication factor for all future files created in the cluster using the dfs.replication property in hdfs-site.xml.
It's possible to specify the replication factor for a single file at creation time by overriding the dfs.replication property on the command line (e.g. $ hdfs dfs -D dfs.replication=1 -put /local/path/myfile.txt /path/on/hdfs).
It's possible to change the replication factor for all files in a directory after they've been written using the setrep command.
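As a rough sketch of the setrep approach, the wrapper below applies one replication factor to everything already stored under a directory. The function name set_dir_replication is made up for illustration, and the factor and path in the usage comment are placeholders.

```shell
#!/usr/bin/env bash
# set_dir_replication is a hypothetical helper name, not an HDFS command.
# It changes the replication factor of every file already under an HDFS directory.
set_dir_replication() {
  local factor="$1" dir="$2"
  # -R requests recursion on older releases; newer ones recurse into
  # directories by default and accept the flag for compatibility.
  # -w waits until the target replication is reached, which can take a while.
  hdfs dfs -setrep -R -w "$factor" "$dir"
}

# Example (placeholder values): set_dir_replication 2 /path/on/hdfs
```

This only affects files that exist when the command runs; anything written afterwards still gets the factor in effect at creation time.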
It's unfortunately not possible to specify that all files written into a directory should have a specific replication factor. This idea was proposed many years ago in HDFS-199 but has still not been implemented. You'll have to enforce this behavior in the application as files are created or by watching a directory and using the setrep command after the files are written.
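One way the "watch a directory" workaround might look is a simple polling loop that re-applies setrep at a fixed interval. This is only a sketch under assumptions: the function name enforce_dir_replication, the interval, and the optional pass limit are all invented for illustration, and new files will briefly exist at their creation-time replication factor until the next pass.

```shell
#!/usr/bin/env bash
# enforce_dir_replication is a hypothetical helper, not part of HDFS.
# It re-applies a replication factor to an HDFS directory at a fixed
# interval so that files written since the last pass are picked up.
enforce_dir_replication() {
  local factor="$1" dir="$2" interval="${3:-60}" passes="${4:-0}"
  local done_passes=0
  while true; do
    # No -w here: the loop should not block waiting for re-replication.
    hdfs dfs -setrep -R "$factor" "$dir" || echo "setrep failed for $dir" >&2
    done_passes=$((done_passes + 1))
    # passes=0 means run until interrupted; a positive value bounds the loop.
    if [ "$passes" -gt 0 ] && [ "$done_passes" -ge "$passes" ]; then
      break
    fi
    sleep "$interval"
  done
}

# Example (placeholder values): enforce_dir_replication 1 /path/on/hdfs 300
```

Setting the factor in the writing application (for example with -D dfs.replication=... as shown earlier) avoids that window entirely, so polling is probably best kept as a fallback for writers you can't change.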