Created 07-18-2016 01:52 PM
We are trying to connect an instance of NiFi which is living on an EC2 instance to an HBase database which is living on an EMR cluster. We are having issues figuring out how to point the HBase_1_1_2_ClientService to point to the HBase config file which lives on a different machine (The EMR Cluster)
Created 07-18-2016 10:09 PM
You could also modify the local /etc/hosts file on your ec2 instances so that the hostname "ip-10-40-197.ec2.internal" resolves to the proper external IP addresses for those zk nodes if they have them.
Created 07-18-2016 02:10 PM
Usually if you have NiFi running on a node that is not part of the HDFS/HBase cluster, you copy the appropriate config files (hbase-site.xml and core-site.xml) to the NiFi node.
Created on 07-18-2016 02:30 PM - edited 08-19-2019 02:10 AM
I did try that, but I am still getting this error on the HBase_1_1_2_Client Service. It is also stuck in the Enabling status
Created 07-18-2016 03:39 PM
Are you sure that NiFi can reach all the services in EMR (HBase, ZK, etc)?
Also, can you look in nifi_home/logs/nifi-app.log and see if there is a full strack-trace that goes with that error. If so it would be helpful to see that, thanks.
Created 07-18-2016 04:00 PM
That's actually what I am starting to think the issue is. How would I make sure that NiFi can reach the EMR services?
nifi-app.zipAlso Log is attached
Created 07-18-2016 06:32 PM
2016-07-18 15:00:01,784 WARN [StandardProcessScheduler Thread-3] o.a.h.h.zookeeper.RecoverableZooKeeper Unable to create ZooKeeper Connection java.net.UnknownHostException: ip-10-40-20-197.ec2.internal: unknown error at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method) ~[na:1.8.0_77] at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:928) ~[na:1.8.0_77] at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1323) ~[na:1.8.0_77] at java.net.InetAddress.getAllByName0(InetAddress.java:1276) ~[na:1.8.0_77] at java.net.InetAddress.getAllByName(InetAddress.java:1192) ~[na:1.8.0_77] at java.net.InetAddress.getAllByName(InetAddress.java:1126) ~[na:1.8.0_77] at org.apache.zookeeper.client.StaticHostProvider.<init>(StaticHostProvider.java:61) ~[zookeeper-3.4.6.jar:3.4.6-1569965] at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:445) ~[zookeeper-3.4.6.jar:3.4.6-1569965] at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:380) ~[zookeeper-3.4.6.jar:3.4.6-1569965] at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.checkZk(RecoverableZooKeeper.java:141) [hbase-client-1.1.2.jar:1.1.2] at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:221) [hbase-client-1.1.2.jar:1.1.2] at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:541) [hbase-client-1.1.2.jar:1.1.2] at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.checkIfBaseNodeAvailable(ConnectionManager.java:895) [hbase-client-1.1.2.jar:1.1.2] at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.access$400(ConnectionManager.java:545) [hbase-client-1.1.2.jar:1.1.2] at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$StubMaker.makeStubNoRetries(ConnectionManager.java:1483) [hbase-client-1.1.2.jar:1.1.2] at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$StubMaker.makeStub(ConnectionManager.java:1524) [hbase-client-1.1.2.jar:1.1.2] at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$MasterServiceStubMaker.makeStub(ConnectionManager.java:1553) [hbase-client-1.1.2.jar:1.1.2] at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.getKeepAliveMasterService(ConnectionManager.java:1704) [hbase-client-1.1.2.jar:1.1.2] at org.apache.hadoop.hbase.client.MasterCallable.prepare(MasterCallable.java:38) [hbase-client-1.1.2.jar:1.1.2] at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:124) [hbase-client-1.1.2.jar:1.1.2] at org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:3917) [hbase-client-1.1.2.jar:1.1.2] at org.apache.hadoop.hbase.client.HBaseAdmin.listTableNames(HBaseAdmin.java:413) [hbase-client-1.1.2.jar:1.1.2] at org.apache.hadoop.hbase.client.HBaseAdmin.listTableNames(HBaseAdmin.java:397) [hbase-client-1.1.2.jar:1.1.2] at org.apache.nifi.hbase.HBase_1_1_2_ClientService.onEnabled(HBase_1_1_2_ClientService.java:181) [nifi-hbase_1_1_2-client-service-0.6.1.jar:0.6.1] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_77] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_77] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_77] at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_77] at org.apache.nifi.util.ReflectionUtils.invokeMethodsWithAnnotations(ReflectionUtils.java:137) [nifi-framework-core-0.6.1.jar:0.6.1] at org.apache.nifi.util.ReflectionUtils.invokeMethodsWithAnnotations(ReflectionUtils.java:125) [nifi-framework-core-0.6.1.jar:0.6.1] at org.apache.nifi.util.ReflectionUtils.invokeMethodsWithAnnotations(ReflectionUtils.java:70) [nifi-framework-core-0.6.1.jar:0.6.1] at org.apache.nifi.util.ReflectionUtils.invokeMethodsWithAnnotation(ReflectionUtils.java:47) [nifi-framework-core-0.6.1.jar:0.6.1] at org.apache.nifi.controller.service.StandardControllerServiceNode$1.run(StandardControllerServiceNode.java:285) [nifi-framework-core-0.6.1.jar:0.6.1] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_77] at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_77] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_77] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [na:1.8.0_77] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_77] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_77] at java.lang.Thread.run(Thread.java:745) [na:1.8.0_77]
This looks like a DNS issue. The hostname "ip-10-40-20-197.ec2.internal" is not routable from the node which you are running NiFi on. Since this is on EC2, you likely need to configure HBase to use the external hostnames, not the internal names, if you intend to communicate with it from outside those hosts. I don't know what fancy things you can do with AWS to make this possible otherwise (maybe you can set up some private network between your EMR cluster and your EC2 nodes?).
This document on how to set up multi-homing for Hadoop and HBase might be helpful for you (as that's essentially the environment that EC2 sets up for you by default): https://community.hortonworks.com/articles/24277/parameters-for-multi-homing.html
Created 07-18-2016 10:09 PM
You could also modify the local /etc/hosts file on your ec2 instances so that the hostname "ip-10-40-197.ec2.internal" resolves to the proper external IP addresses for those zk nodes if they have them.
Created 07-19-2016 02:23 AM
@Michael Sobelman That DNS is not detectable by the node you are trying to access from. You can be fancy on aws and configure through routing tables by setting up a proper vpn between the EMR and NiFi nodes. Another option I used is route53 which will give you DNS publicly available. Lastly you can put a ELB infront of your EMR HBase master node. You may have to script it up (via boot scripts) to configure your ELB to point to new internal IP.