Member since: 03-09-2016
Posts: 6
Kudos Received: 1
Solutions: 0
04-13-2017
07:08 AM
The requirement is to offload data from SAP HANA to Hadoop on a daily basis. There are two levels of firewall between my Hadoop cluster (1300+ nodes) and the remote SAP HANA database. I am exploring options like Sqoop, Spark, and Apache NiFi. Can anyone suggest the best and most reliable option? I believe all DataNodes in the cluster would need access to SAP HANA to fetch the data through Sqoop, so it looks like the firewall needs to be opened from every DataNode to SAP HANA. Is there a better way to fetch the data without giving access to all DataNodes, such as routing the SAP HANA DB connection through only a few DataNodes (node labelling)?
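For illustration only, a Sqoop import restricted to a few mappers might look like the sketch below. The JDBC URL, schema, table, credentials, queue name, and paths are placeholder assumptions, and the SAP HANA JDBC driver jar would need to be on the Sqoop classpath. Limiting the mapper count caps the number of concurrent connections to HANA, and submitting the job to a queue mapped to a YARN node label could pin those map tasks to the handful of DataNodes that are whitelisted in the firewall.

# Assumed placeholders: host, schema, table, queue name, and paths
sqoop import \
  -Dmapreduce.job.queuename=hana_offload \
  --driver com.sap.db.jdbc.Driver \
  --connect "jdbc:sap://hana-host.example.com:30015/?currentschema=SALES" \
  --username HADOOP_OFFLOAD \
  --password-file /user/etl/hana.password \
  --table DAILY_ORDERS \
  --split-by ORDER_ID \
  --num-mappers 4 \
  --target-dir /data/raw/sap_hana/daily_orders/$(date +%F)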
Labels:
- Apache Hadoop
- Apache Sqoop
04-29-2016
01:59 PM
@Bryan King Regarding the username conflict: an alternative approach could be to customize the usernames per cluster (e.g. <cluster-name>-hdfs, <cluster-name>-yarn) and maintain all of them in a single KDC.
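As a minimal sketch (assuming an MIT KDC and a made-up realm EXAMPLE.COM), per-cluster service principals could be created and exported like this, so each cluster gets its own keytabs while everything lives in one KDC:

# Principals are prefixed with the cluster name to avoid collisions across clusters
kadmin.local -q "addprinc -randkey clusterA-hdfs/nn01.clustera.example.com@EXAMPLE.COM"
kadmin.local -q "addprinc -randkey clusterA-yarn/rm01.clustera.example.com@EXAMPLE.COM"
kadmin.local -q "addprinc -randkey clusterB-hdfs/nn01.clusterb.example.com@EXAMPLE.COM"
# Export a keytab per cluster so credentials are never shared between clusters
kadmin.local -q "xst -k /etc/security/keytabs/clusterA-hdfs.keytab clusterA-hdfs/nn01.clustera.example.com@EXAMPLE.COM"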
04-18-2016
09:09 AM
1 Kudo
@shimrit hoori You can download PuTTY and use the SSH option; the host is 127.0.0.1 and the port is 2222. Or, from the command line, ssh root@127.0.0.1 -p 2222 should prompt for a username and password; the default credentials are root/hadoop. Then you are in! Hope this helps.
04-13-2016
04:43 AM
@Paul Tamburello You can manage these instances with a provisioning tool (e.g. Chef or Puppet). Set the node names as you wish, for example master, slave, edge, and use a recipe like the one below to keep the hostname in sync. That way, whenever you reboot the AWS instances, the recipes automatically kick in if there is any discrepancy. https://supermarket.chef.io/cookbooks/hostnames/versions/0.3.1 Also make sure both reverse and forward DNS lookups work. Hope this helps.
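For reference, a minimal sketch of what such a recipe boils down to on a systemd-based instance; the hostname and IP below are made-up examples:

# Set the hostname persistently so it survives reboots
sudo hostnamectl set-hostname master01.example.internal
# Verify forward and reverse DNS lookups both resolve
dig +short master01.example.internal
dig +short -x 10.0.0.11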