Created on 08-14-2018 04:14 PM - edited 09-16-2022 06:35 AM
I am having a problem creating an AWS EC2 instance because of the account limitation, and Amazon is still processing my request.
So I am using the Hortonworks Sandbox + VirtualBox... I tried to upload the "flightdelays" data into local storage and then use "hdfs dfs -copyFromLocal".
But actually I am confused about "local"... does local mean after connecting as maria_dev@localhost? And how can I upload files to /home/maria_dev/datasets? Even when I tried to create the datasets folder using Ambari, I can't find the files under /home/maria_dev... I am very confused about how to complete Task 1 using the sandbox...
Created 08-14-2018 04:56 PM
@seninus The Copy To and Copy From HDFS syntax is as follows:
hdfs dfs -copyFromLocal /local/folder/file.txt /hdfs/folder/
hdfs dfs -copyToLocal /hdfs/folder/file2.txt /local/folder/
where /local/folder/ is a path on the local file system and /hdfs/folder/ is a path in HDFS.
It is also important to note that you want to execute those commands as the hdfs user, so:
sudo su - hdfs
or
sudo su - hdfs -c "hdfs dfs -copyFromLocal /local/folder/file.txt /hdfs/folder/"
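For example, a full sequence for the flightdelays data might look like this (a sketch only; the /tmp staging path, the CSV filename, and the /datasets/flightdelays target are assumptions based on the question, so adjust them to your setup):
sudo su - hdfs -c "hdfs dfs -mkdir -p /datasets/flightdelays"
sudo su - hdfs -c "hdfs dfs -copyFromLocal /tmp/flightdelays.csv /datasets/flightdelays/"
sudo su - hdfs -c "hdfs dfs -ls /datasets/flightdelays"
Staging the file in /tmp first avoids permission issues, since the hdfs user usually cannot read files under /home/maria_dev.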
If this answer is helpful please choose ACCEPT.
Created 08-14-2018 05:25 PM
You mean I have to install local Hadoop (HDFS) on my Mac first, and then use sudo su - hdfs in my Mac terminal? I think I installed Hadoop (haven't configured it, it seems to be a long process), and when I tried "su - hdfs", the Mac terminal gave back "su: unknown login: hdfs"...
Created 08-14-2018 05:38 PM
@seninus those hdfs commands are meant to be executed at a terminal prompt on the sandbox node, not on your Mac... copyFromLocal copies from the sandbox node's local file system to HDFS, NOT from your local Mac.
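As a sketch of getting the file onto the sandbox first (the 2222 SSH port forward, the filename, and the paths are assumptions about a typical VirtualBox sandbox setup, so adjust as needed), run this on your Mac:
scp -P 2222 flightdelays.csv maria_dev@localhost:/home/maria_dev/datasets/
Then, in a terminal on the sandbox:
hdfs dfs -copyFromLocal /home/maria_dev/datasets/flightdelays.csv /datasets/flightdelays/
If that last command fails with a permission error, run it as the hdfs user as shown above.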
Created on 08-14-2018 05:48 PM - edited 08-17-2019 08:38 PM
Yes, this is what I understood and was trying to do. I uploaded the flightdelays*.csv files to /datasets/flightdelays/ through Ambari, but I can't see the files locally when I log in as maria_dev in the terminal...
Created 08-14-2018 06:35 PM
Now you just need to do the copy from HDFS to the local filesystem:
sudo su - hdfs -c "hdfs dfs -copyToLocal /datasets /tmp"
mv /tmp/datasets /home/maria_dev
The last command is needed because the hdfs user can't write to /home/maria_dev. So write from HDFS to /tmp, then move from /tmp to /home/maria_dev.
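A quick sanity check afterwards (just a sketch; the paths follow the commands above), plus an optional ownership fix since the files were written out by the hdfs user (the chown is an assumption about what you want, run with sudo):
ls -l /home/maria_dev/datasets
sudo chown -R maria_dev:maria_dev /home/maria_dev/datasets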
Created 08-14-2018 05:39 PM
Additionally, that sandbox probably has a local (to the Mac) folder path mounted on the sandbox file system. You would need to use that path to get files from your Mac to the sandbox, then from the sandbox to HDFS.
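As a sketch, assuming a VirtualBox shared folder named "shared" has been configured in the VM settings and the Guest Additions are installed in the sandbox (both of these are assumptions, not part of the default image):
sudo mkdir -p /mnt/shared
sudo mount -t vboxsf shared /mnt/shared
hdfs dfs -copyFromLocal /mnt/shared/flightdelays.csv /datasets/flightdelays/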
Created 08-14-2018 08:37 PM
Thanks for the tips. Yeah, I found similar questions and got it solved. Thanks!
Created 08-15-2018 02:39 PM
@seninus glad you got it working. Please click Accept on the main answer; it helps close the question and gives me some reputation points. ;O)