I have a MapReduce job connecting to HBase. I have tested it on Cloudera VM 5.0.0 (so psuedo-distributed mode). Now I try to run it on a CDH5 cluster deployed on Amazon EC2.
Following the advices found in Google, I updated the dependency in my JAR to match the one on the cluster: 0.96.1.1-cdh5.0.2 for HBase and 2.3.0-cdh5.0.2 for Hadoop. Also, I added hbase.zookeeper.quorum property to my HBaseConfiguration (it was not necessary in pseudo-distributed mode...):
Right after the ZK connection is successfully established, the client would attempt to talk to the RegionServer hosting the META region. I suspect this is what it is getting failures at (and is silently retrying).
If you give it a while (say, 10m), or attempt a jstack on it, do you see any forms of exception or places where it is stuck trying to grab a connection to your RSes? That information may help you proceed with troubleshooting the issue.