I copied a file from the local file system to HDFS, and it ended up in /user/hduser/in:
hduser@vagrant:/usr/local/hadoop/hadoop-1.2.1$ bin/hadoop fs -copyFromLocal /home/hduser/afile in
Question: How does Hadoop decide by default to copy the file to the directory /user/hduser/in? Where is this mapping specified in the conf files?
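The command syntax is:
hadoop fs -copyFromLocal <localsrc> URI
If "URI" is a relative path, it is resolved against your HDFS home directory, /user/$(whoami), where in your case `$(whoami) == "hduser"`. Note that this is /user on HDFS, not the Linux /home directory.
In other words, running this command as the hduser Linux account:
hadoop fs -copyFromLocal afile in
will copy "afile" to hdfs:///user/hduser/in
As far as I know, Hadoop 1.x hard-codes the /user prefix, so there is no conf entry for it; Hadoop 2.x and later expose it as dfs.user.home.dir.prefix in hdfs-site.xml (default /user).
If you want to copy to a different location on HDFS, give the full (absolute) path to the destination.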
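To make the resolution rule concrete, here is a minimal sketch in plain shell. The function resolve_hdfs_path is a hypothetical helper written for illustration (it is not part of Hadoop), and it assumes the default /user home prefix:

```shell
#!/bin/sh
# Hypothetical helper, NOT a Hadoop command: mimics how a destination
# path is resolved, assuming the default HDFS home prefix /user.
resolve_hdfs_path() {
    dest=$1
    user=${2:-$(whoami)}   # defaults to the current Linux account
    case "$dest" in
        /*) printf '%s\n' "$dest" ;;              # absolute: used as-is
        *)  printf '%s\n' "/user/$user/$dest" ;;  # relative: under the HDFS home
    esac
}

resolve_hdfs_path in hduser         # prints /user/hduser/in
resolve_hdfs_path /data/in hduser   # prints /data/in
```

So "in" lands under /user/hduser, while a destination starting with "/" (here the made-up path /data/in) is left untouched.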
Navigate to your "/install/hadoop/datanode/bin" folder, or to whatever path lets you execute your hadoop commands.
To place files in HDFS, the format is:
hadoop fs -put "Local system path"/filename.csv "HDFS destination path"
e.g.
./hadoop fs -put /opt/csv/load.csv /user/load
Here /opt/csv/load.csv is the source file path on my local Linux system.
/user/load is the destination path on the HDFS cluster, i.e. "hdfs://hacluster/user/load".
To get files from HDFS to the local system, the format is:
hadoop fs -get "/HDFSsourcefilepath" "/localpath"
e.g.
hadoop fs -get /user/load/a.csv /opt/csv/
After executing the above command, a.csv is downloaded from HDFS to the /opt/csv folder on the local Linux system.
The uploaded files can also be seen through the HDFS NameNode web UI.
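The put and get commands above can be strung together as a small script. This is only a sketch using the example paths from this answer; the run and roundtrip functions and the DRY_RUN variable are my own additions for illustration. By default it only prints the commands, so it is safe to try on a machine without Hadoop:

```shell
#!/bin/sh
# Sketch of a put/get round trip with the example paths from this answer.
# DRY_RUN is a made-up variable: 1 (default) only prints the commands,
# DRY_RUN=0 actually executes them (requires `hadoop` on PATH).
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

roundtrip() {
    run hadoop fs -put /opt/csv/load.csv /user/load   # upload the local file
    run hadoop fs -ls /user/load                      # list the destination to confirm
    run hadoop fs -get /user/load/a.csv /opt/csv/     # download a file from HDFS
}

roundtrip
```

Run it with DRY_RUN=0 once the paths match your own cluster.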