HBase replication to Amazon AWS: "Unknown Host"

We are currently trying to set up HBase replication from a cluster on our own hardware to a new one on our VPC on Amazon AWS (for disaster recovery).

 

The Amazon nodes have Private DNS like: ip-10-3-1-61.us-west-2.compute.internal and Private IPs like 10.3.1.61

 

All nodes on our hardware can ping, ssh, telnet, etc. to the Amazon nodes (using the IP or FQDN), and vice versa. There are entries in the /etc/hosts files on all servers on both sides to ensure resolution. Our servers are running Ubuntu 12.04; the AWS ones are Red Hat 6.

 

Both clusters are running HBase 0.98.6, and both have replication set to true. The tables we want replicated have replication_scope set to 1. I've added the peer entry in hbase shell.

 

So everything seems good. However, when I start replication I see this in the HBase logs on the master side:

 

2015-06-16 12:37:15,643 WARN org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Can't replicate because of a local or network error:
java.net.UnknownHostException: unknown host: ip-10-3-1-62.us-west-2.compute.internal
at org.apache.hadoop.hbase.ipc.RpcClient$Connection.<init>(RpcClient.java:385)
at org.apache.hadoop.hbase.ipc.RpcClient.createConnection(RpcClient.java:351)
at org.apache.hadoop.hbase.ipc.RpcClient.getConnection(RpcClient.java:1530)
at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1442)
at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1661)
at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1719)
at org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$BlockingStub.replicateWALEntry(AdminProtos.java:21036)
at org.apache.hadoop.hbase.protobuf.ReplicationProtbufUtil.replicateWALEntry(ReplicationProtbufUtil.java:65)
at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.shipEdits(ReplicationSource.java:730)
at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:388)

 

I understand the error; I just don't understand why it occurs. Which process is unable to resolve the hostname?

 

Any ideas?

 

Thanks in advance,

Paul

1 ACCEPTED SOLUTION

I realised the problem. My /etc/hosts file on the AWS cluster nodes needed to contain references to each other.

 

I had added references to the source cluster nodes, but the AWS nodes needed to be in there too.
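
For anyone checking their own setup, here is a minimal Python sketch of that check. It parses hosts-file text and lists which expected hostnames have no entry. The sample contents are illustrative only: the 10.3.1.62 address is inferred from the hostname in the log above, and source-node-1.example.com with 192.168.0.11 are hypothetical placeholders for a source cluster node, not values from the original post.

```python
def parse_hosts(text):
    """Return the set of hostnames and aliases found in /etc/hosts-style text."""
    names = set()
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        fields = line.split()
        # fields[0] is the IP address; everything after it is a name or alias
        names.update(name.lower() for name in fields[1:])
    return names


def missing_hosts(hosts_text, expected):
    """Return the expected hostnames that have no entry in hosts_text."""
    known = parse_hosts(hosts_text)
    return [host for host in expected if host.lower() not in known]


# Hypothetical hosts file on one AWS node: it knows itself and a source
# cluster node, but not the other AWS node (the situation described above).
SAMPLE_HOSTS = """\
127.0.0.1     localhost
10.3.1.61     ip-10-3-1-61.us-west-2.compute.internal
# source cluster entry (hypothetical name and address)
192.168.0.11  source-node-1.example.com source-node-1
"""

# Every node in BOTH clusters should be listed here.
EXPECTED = [
    "ip-10-3-1-61.us-west-2.compute.internal",
    "ip-10-3-1-62.us-west-2.compute.internal",
    "source-node-1.example.com",
]

if __name__ == "__main__":
    for host in missing_hosts(SAMPLE_HOSTS, EXPECTED):
        print("missing from hosts file:", host)
```

To run it against a real node, one could replace SAMPLE_HOSTS with the contents of /etc/hosts read from disk and EXPECTED with the full node list of both clusters. An empty result on every node, on both sides, would be consistent with the fix described here.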

4 REPLIES

Check your VPC configuration and ensure "DNS resolution" is turned on.
There's no need to restart the instances.

Regards,
Gautam Gopalakrishnan

Thanks Gautam,

 

Yes, both 'DNS Resolution' and 'DNS Hostnames' are set to YES for the VPC.

 

Paul

Please could you post an example of your hosts file?
What did you put there?

 

 

Thanks a lot