Connect a NiFi service running on an EC2 instance to HBase running on a separate EMR cluster

New Contributor

We are trying to connect an instance of NiFi running on an EC2 instance to an HBase database running on an EMR cluster. We are having trouble figuring out how to point the HBase_1_1_2_ClientService at the HBase config file, which lives on a different machine (the EMR cluster).

Accepted Solution

Master Guru

You could also modify the local /etc/hosts file on your EC2 instances so that the hostname "ip-10-40-20-197.ec2.internal" resolves to the proper external IP address of each ZooKeeper node, if the nodes have one.
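As a rough illustration of the /etc/hosts approach, the Python sketch below checks whether a hosts file already maps the hostname reported in the log and, if not, produces the line that would need to be added. The helper names are invented for this example, and 203.0.113.10 is a documentation-only placeholder address, not the real external IP of any node.

```python
def parse_hosts(text):
    """Map each hostname or alias in hosts-file text to its first listed IP."""
    table = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            table.setdefault(name.lower(), ip)
    return table


def missing_entries(hosts_text, wanted):
    """Return hosts-file lines for the wanted names that have no entry yet."""
    known = parse_hosts(hosts_text)
    return [f"{ip} {name}" for name, ip in wanted.items()
            if name.lower() not in known]


# Placeholder mapping: replace the IP with the node's actual external address.
WANTED = {"ip-10-40-20-197.ec2.internal": "203.0.113.10"}
CURRENT_HOSTS = "127.0.0.1 localhost\n"

for line in missing_entries(CURRENT_HOSTS, WANTED):
    print(line)
```

On a real NiFi node you would pass in the contents of /etc/hosts instead of CURRENT_HOSTS and append the printed lines by hand, with root privileges.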

7 Replies

Usually if you have NiFi running on a node that is not part of the HDFS/HBase cluster, you copy the appropriate config files (hbase-site.xml and core-site.xml) to the NiFi node.
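Once hbase-site.xml has been copied over, it can help to see which ZooKeeper hosts it will make the NiFi node contact. The Python sketch below is only an illustration: hbase.zookeeper.quorum and hbase.zookeeper.property.clientPort are standard HBase property names, but the helper names and the sample XML are made up for this example, and it assumes the quorum entries are plain hostnames without ports.

```python
import xml.etree.ElementTree as ET


def read_hadoop_properties(xml_text):
    """Return {name: value} for every <property> in a Hadoop-style config."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


def zookeeper_endpoints(props):
    """List the (host, port) pairs named by the ZooKeeper quorum settings."""
    # 2181 is the default ZooKeeper client port when the property is absent.
    port = int(props.get("hbase.zookeeper.property.clientPort", "2181"))
    quorum = props.get("hbase.zookeeper.quorum", "")
    return [(host.strip(), port) for host in quorum.split(",") if host.strip()]


# Hypothetical sample; a real file would be read from the copied hbase-site.xml.
SAMPLE = """<configuration>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>ip-10-40-20-197.ec2.internal</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2181</value>
  </property>
</configuration>"""

print(zookeeper_endpoints(read_hadoop_properties(SAMPLE)))
```

Every hostname this lists has to be resolvable and reachable from the NiFi node, not just from inside the cluster.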

New Contributor

I did try that, but I am still getting this error on the HBase_1_1_2_ClientService. It is also stuck in the Enabling state.

[Screenshot: 5834-capture.png]

Are you sure that NiFi can reach all the services in EMR (HBase, ZK, etc)?

Also, can you look in nifi_home/logs/nifi-app.log and see if there is a full stack trace that goes with that error? If so, it would be helpful to see it. Thanks.

New Contributor

That's actually what I am starting to think the issue is. How would I make sure that NiFi can reach the EMR services?

Also, the log is attached: nifi-app.zip

2016-07-18 15:00:01,784 WARN [StandardProcessScheduler Thread-3] o.a.h.h.zookeeper.RecoverableZooKeeper Unable to create ZooKeeper Connection
java.net.UnknownHostException: ip-10-40-20-197.ec2.internal: unknown error
    at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method) ~[na:1.8.0_77]
    at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:928) ~[na:1.8.0_77]
    at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1323) ~[na:1.8.0_77]
    at java.net.InetAddress.getAllByName0(InetAddress.java:1276) ~[na:1.8.0_77]
    at java.net.InetAddress.getAllByName(InetAddress.java:1192) ~[na:1.8.0_77]
    at java.net.InetAddress.getAllByName(InetAddress.java:1126) ~[na:1.8.0_77]
    at org.apache.zookeeper.client.StaticHostProvider.<init>(StaticHostProvider.java:61) ~[zookeeper-3.4.6.jar:3.4.6-1569965]
    at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:445) ~[zookeeper-3.4.6.jar:3.4.6-1569965]
    at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:380) ~[zookeeper-3.4.6.jar:3.4.6-1569965]
    at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.checkZk(RecoverableZooKeeper.java:141) [hbase-client-1.1.2.jar:1.1.2]
    at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:221) [hbase-client-1.1.2.jar:1.1.2]
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:541) [hbase-client-1.1.2.jar:1.1.2]
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.checkIfBaseNodeAvailable(ConnectionManager.java:895) [hbase-client-1.1.2.jar:1.1.2]
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.access$400(ConnectionManager.java:545) [hbase-client-1.1.2.jar:1.1.2]
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$StubMaker.makeStubNoRetries(ConnectionManager.java:1483) [hbase-client-1.1.2.jar:1.1.2]
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$StubMaker.makeStub(ConnectionManager.java:1524) [hbase-client-1.1.2.jar:1.1.2]
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$MasterServiceStubMaker.makeStub(ConnectionManager.java:1553) [hbase-client-1.1.2.jar:1.1.2]
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.getKeepAliveMasterService(ConnectionManager.java:1704) [hbase-client-1.1.2.jar:1.1.2]
    at org.apache.hadoop.hbase.client.MasterCallable.prepare(MasterCallable.java:38) [hbase-client-1.1.2.jar:1.1.2]
    at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:124) [hbase-client-1.1.2.jar:1.1.2]
    at org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:3917) [hbase-client-1.1.2.jar:1.1.2]
    at org.apache.hadoop.hbase.client.HBaseAdmin.listTableNames(HBaseAdmin.java:413) [hbase-client-1.1.2.jar:1.1.2]
    at org.apache.hadoop.hbase.client.HBaseAdmin.listTableNames(HBaseAdmin.java:397) [hbase-client-1.1.2.jar:1.1.2]
    at org.apache.nifi.hbase.HBase_1_1_2_ClientService.onEnabled(HBase_1_1_2_ClientService.java:181) [nifi-hbase_1_1_2-client-service-0.6.1.jar:0.6.1]
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_77]
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_77]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_77]
    at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_77]
    at org.apache.nifi.util.ReflectionUtils.invokeMethodsWithAnnotations(ReflectionUtils.java:137) [nifi-framework-core-0.6.1.jar:0.6.1]
    at org.apache.nifi.util.ReflectionUtils.invokeMethodsWithAnnotations(ReflectionUtils.java:125) [nifi-framework-core-0.6.1.jar:0.6.1]
    at org.apache.nifi.util.ReflectionUtils.invokeMethodsWithAnnotations(ReflectionUtils.java:70) [nifi-framework-core-0.6.1.jar:0.6.1]
    at org.apache.nifi.util.ReflectionUtils.invokeMethodsWithAnnotation(ReflectionUtils.java:47) [nifi-framework-core-0.6.1.jar:0.6.1]
    at org.apache.nifi.controller.service.StandardControllerServiceNode$1.run(StandardControllerServiceNode.java:285) [nifi-framework-core-0.6.1.jar:0.6.1]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_77]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_77]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_77]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [na:1.8.0_77]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_77]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_77]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_77]

This looks like a DNS issue. The hostname "ip-10-40-20-197.ec2.internal" cannot be resolved from the node on which NiFi is running. Since this is on EC2, you likely need to configure HBase to use the external hostnames, not the internal ones, if you intend to communicate with it from outside those hosts. I don't know what else AWS offers to make this possible (maybe you can set up a private network between your EMR cluster and your EC2 nodes?).

This document on how to set up multi-homing for Hadoop and HBase might be helpful for you (as that's essentially the environment that EC2 sets up for you by default): https://community.hortonworks.com/articles/24277/parameters-for-multi-homing.html
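To tell a name-resolution failure apart from a blocked port, a small check run from the NiFi node can help. The Python sketch below is a minimal example, not a definitive diagnostic: check_endpoint is a name invented here, and it only tests whether the hostname resolves and whether a TCP connection to the given port succeeds.

```python
import socket


def check_endpoint(host, port, timeout=3.0):
    """Return 'unresolvable', 'unreachable', or 'ok' for host:port."""
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        # The system resolver (DNS, /etc/hosts) does not know this name.
        return "unresolvable"
    for family, socktype, proto, _canonname, sockaddr in infos:
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
                return "ok"
        except OSError:
            # Refused, timed out, or no route: try the next address, if any.
            continue
    return "unreachable"
```

For the situation in the log above, calling check_endpoint("ip-10-40-20-197.ec2.internal", 2181) on the NiFi node would be expected to report "unresolvable"; 2181 is the default ZooKeeper client port and is an assumption about this cluster. A result of "unreachable" would instead point at security groups or routing rather than DNS.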

Super Guru

@Michael Sobelman That hostname cannot be resolved by the node you are connecting from. On AWS you can handle this through routing tables by setting up a proper VPN between the EMR and NiFi nodes. Another option I have used is Route 53, which gives you publicly available DNS. Lastly, you can put an ELB in front of your EMR HBase master node. You may have to script it (via boot scripts) so that the ELB points to the new internal IP.