Copying files from remote server to HDFS
Labels: Apache Hadoop, Apache Spark
Created 07-09-2018 09:26 AM
Hi,
I have a remote server and a Kerberos-authenticated Hadoop environment.
I want to copy files from the remote server to HDFS for processing with Spark. Please advise an efficient approach or HDFS command for copying files from the remote server to HDFS; any example would be helpful. We are not allowed to use Flume or NiFi.
Please note that Kerberos is installed on the remote server.
Created 07-10-2018 05:31 PM
I suggest that on your HDFS (edge) server you start a process (a shell script, for example) that runs kinit and then pulls the remote files using sftp or scp, for example:
# scp user@remoteserver:/remotepath/files localpath/
and
# hdfs dfs -put localpath/files /hdfspath
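Putting those steps together, a minimal script sketch could look like the one below. The keytab path, principal, host name and directories are placeholders you would replace with your own values.

#!/bin/bash
# copy_to_hdfs.sh -- illustrative sketch; principal, keytab and paths are assumptions
set -euo pipefail

REMOTE=user@remoteserver
REMOTE_PATH=/remotepath/files
LOCAL_DIR=/tmp/staging
HDFS_DIR=/hdfspath

# Obtain a Kerberos ticket non-interactively (requires a keytab for the service user)
kinit -kt /etc/security/keytabs/etluser.keytab etluser@EXAMPLE.COM

# Pull the files from the remote server into a local staging directory
mkdir -p "$LOCAL_DIR"
scp "$REMOTE:$REMOTE_PATH" "$LOCAL_DIR/"

# Push the staged files into HDFS (-f overwrites files that already exist)
hdfs dfs -mkdir -p "$HDFS_DIR"
hdfs dfs -put -f "$LOCAL_DIR"/* "$HDFS_DIR/"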
Note: To automate this process, you can set up a private/public SSH key pair between these servers and add a crontab entry, as sketched below.
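For the passwordless SSH and cron part, one possible setup (the user name, script path and schedule are assumptions) would be:

# On the HDFS/edge server, generate a key pair and copy the public key to the remote server
ssh-keygen -t rsa -b 4096 -f ~/.ssh/id_rsa -N ""
ssh-copy-id user@remoteserver

# Then schedule the copy script, e.g. every hour at minute 0 (add via crontab -e)
0 * * * * /opt/scripts/copy_to_hdfs.sh >> /var/log/copy_to_hdfs.log 2>&1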
