Member since: 07-18-2017
Posts: 7
Kudos Received: 1
Solutions: 2
My Accepted Solutions
| Title | Views | Posted |
| --- | --- | --- |
|  | 2313 | 08-02-2017 09:53 AM |
|  | 5616 | 07-28-2017 08:02 AM |
08-02-2017
09:53 AM
1 Kudo
Cloudera said I had a new ranking, "idiot". My apologies for not realizing this sooner. I was running hdfs dfs -du -h -s hdfs://server:8020/hbase/data/default/* and expected to see the results of the "restored" snapshot, when apparently that directory only contains information pointing to /hbase/archive/data/default. So for anyone who is interested: if you are using snapshots, to find out the REAL table size you need to look in BOTH directories and add them together. Cloudera didn't really call me that 🙂
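A minimal sketch of the two checks described above, assuming the default HBase root of /hbase; the namenode address and the table name mytable are placeholders:

```bash
# Live table data (after a snapshot restore this may be mostly references).
hdfs dfs -du -h -s hdfs://server:8020/hbase/data/default/mytable

# Archived HFiles still referenced by snapshots/restored tables.
hdfs dfs -du -h -s hdfs://server:8020/hbase/archive/data/default/mytable

# The real footprint of the table is the sum of the two numbers above.
```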
08-02-2017
06:58 AM
We noticed a ridiculous improvement in a table's size after its snapshot was exported to another cluster:

old cluster - table is 145.8 GB
new cluster - table is 54.8 MB

The two clusters are configured similarly: same number of region servers, etc. The numbers above reflect table size not including replication. One of the reasons we are moving to the new cluster is for additional storage, because our data is growing so quickly and we have found that HBase requires 50% free space to do any compaction. I assume some, if not all, of the difference in table size is due to compaction, but I'm surprised at the huge difference in the sizes and am wondering if there is something we could do on the working cluster to avoid such wasted space. Note that we load data in bulk once a day; it is not updated in real time. I realize this post is a little obtuse. My apologies.
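One way to test whether compaction accounts for the gap is to force a major compaction on the source table and re-measure; this is just a sketch, with 'mytable' and the namenode host as placeholders:

```bash
# Ask HBase to major-compact the table (runs asynchronously on the region servers).
echo "major_compact 'mytable'" | hbase shell

# Once compaction finishes, re-check the on-disk size of the table data.
hdfs dfs -du -h -s hdfs://old-cluster-namenode:8020/hbase/data/default/mytable
```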
Labels: Apache HBase
07-28-2017
08:02 AM
This problem was finally resolved. For anyone else having similar, seemingly quirky, problems with ExportSnapshot, here is how we resolved it. FYI - finding the source code for the version of ExportSnapshot we were running helped to pinpoint exactly where the error was occurring and what had executed successfully up to that point. I also ran it different ways from each cluster and found an unusual alias being used when trying the hftp://server:50070 port. So the bottom line was that on each cluster, every namenode and datanode had to be able to resolve (have added to /etc/hosts) EVERY alias being used, including all internal-IP aliases mapped to the external IPs, whether or not you thought it was explicitly being used by Hadoop or HBase somewhere. Thanks to all. Better error messages and/or the ability to debug the code would have been helpful.
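For illustration, a sketch of the kind of /etc/hosts additions described above, repeated on every namenode and datanode in both clusters; the hostnames, aliases, and IPs are made-up examples, not values from this setup:

```bash
# Map every alias the remote cluster knows itself by to its externally reachable IP.
cat >> /etc/hosts <<'EOF'
203.0.113.10  namenode2.example.com    namenode2
203.0.113.11  datanode2-1.example.com  datanode2-1
203.0.113.12  datanode2-2.example.com  datanode2-2
EOF
```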
07-26-2017
11:25 AM
This is related to my earlier distcp post. We are trying to export a snapshot from one cluster to another cluster using the command below:

hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot mysnapshot -copy-from hdfs://namenode1:8020/hbase -copy-to hdfs://namenode2:8020/hbase

We are running hbase 1.0.0 cdh5.5.1+274-1.cdh5.5.1.p0.15.el7. Ports 8020 and 50010 are open between the 2 clusters, i.e., I can telnet to those ports. When the command runs, an empty file /hbase/.hbase-snapshot/.tmp/mysnapshot/.snapshotinfo is created on namenode2. The error received is:

INFO [main] snapshot.ExportSnapshot: Copy Snapshot Manifest
WARN [Thread-6] hdfs.DFSClient: DataStreamer Exception java.nio.channels.UnresolvedAddressException
at sun.nio.ch.Net.checkAddress(Net.java:101)
.....
.....
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:668)
Exception in thread "main" org.apache.hadoop.hbase.snapshot.ExportSnapshotException: Failed to copy the snapshot directory: from=hdfs://namenode1:8020/hbase/.hbase-snapshot/contentSnapshot to=hdfs://namenode2:8020/hbase/.hbase-snapshot/.tmp/contentSnapshot
at org.apache.hadoop.hbase.snapshot.ExportSnapshot.run(ExportSnapshot.java:932)
.....
and more and more
... So, anyway, I am trying to debug this and I cannot find the ExportSnapshot source code for the specific version of the software we are running. Some connection between the 2 servers is happening, because the empty file is created. It seems to fail when copying the data.manifest (maybe?). I guess one question is: can the source code be located for HBase 1.0.0 on CDH 5.5.1?
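While waiting to locate the source, one way to see which address the HDFS client fails to resolve is to re-run the export with DEBUG logging turned up; a sketch using the same hostnames as above, with the mapper count purely illustrative:

```bash
# HBASE_ROOT_LOGGER raises the launcher's log level; the export flags mirror the
# command in the post, and -mappers 4 is just an example value.
HBASE_ROOT_LOGGER=DEBUG,console \
hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot \
  -snapshot mysnapshot \
  -copy-from hdfs://namenode1:8020/hbase \
  -copy-to hdfs://namenode2:8020/hbase \
  -mappers 4
```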
Labels: Apache HBase
07-26-2017
10:41 AM
I am going to repost this as an HBASE SNAPSHOT question.
07-20-2017
07:37 AM
Thanks for the response. I think 50070 is the hftp port, and both clusters are on the same version of CDH5. I was able to make progress in one direction only. First I added the remote namenode to the hadoop-conf/masters file and opened up port 8020 to 0.0.0.0/0 (wide open, for some reason). After that, the following command worked from cluster1:

distcp hdfs://namenode2:8020/file hdfs://namenode1:8020/

So basically it works where the source is the remote cluster. I was able to do the same sort of thing from cluster2. I still cannot do it where the source is the cluster I am running from. The bottom line is that we are trying to do an HBase snapshot export between the 2 clusters and keep receiving the UnresolvedAddressException. Since the export does the distcp under the covers, I was trying to debug that first, but I still get the same results doing the snapshot export:

hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot mySnapshot -copy-to hdfs://namenode2/hbase
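To make the asymmetry concrete, here is a sketch of the two directions, run from cluster1 (namenode1 local, namenode2 remote); the path /file is a stand-in and the full `hadoop distcp` command form is used:

```bash
# Works: the source is the REMOTE cluster (pull).
hadoop distcp hdfs://namenode2:8020/file hdfs://namenode1:8020/

# Fails with UnresolvedAddressException: the source is the LOCAL cluster (push).
hadoop distcp hdfs://namenode1:8020/file hdfs://namenode2:8020/
```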
07-19-2017
10:58 AM
Our "catch22" situation: we been continually getting the java.nio.channels.UnresolvedAddressException trying to do a distcp between our 2 clusters. Our setup is the following: running CDH5, 2 separate clusters with their own internal ips and external ips do they can communicate. The 8020 and 50010 ports are open between all of them. I think the internal ips are the problem. I run the following command from one of the datanodes on the namenode1 cluster distcp hdfs://namenode1:8020/somefile hdfs://namenode2:8020/ each of the namenode and datanodes on each cluster have the external ips for the other one in their /etc/hosts The above commands fails with the following doctored error message (my system has no access to internet) .....Caused by: java.nio.channels.UnresolvedAddressException
at sun.nio.ch.Net.checkAddress(Net.java:101)
...... The unresolved address appears to be one on the namenode2 cluster. After reviewing many posts on similar issues, it was suggested to set "dfs.client.use.datanode.hostname=false". When I do this it gets past the above error and attempts to create a file on the destination, but then fails with the following error:

Connection timed out
at sun.nio.ch.SocketChannelImpl.checkConnect......
.... Abandoning BP-xxxxxxx- <the internal ip address>xxxxxxxx
...Excluding datanode DatanodeInfoWithStorage[<the internal ip>:50010, xxxxxx
...... Note that an empty file is created in the destination "tmp" directory. Reviewing more posts on this error, they say you NEED to have "dfs.client.use.datanode.hostname=true", which kind of makes sense BUT puts me back at the first error. Thanks for any assistance on where else to look. I can provide more detail if necessary.
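If it helps narrow things down, the client setting can be flipped per run instead of cluster-wide; a sketch using the same command shape as above, with hosts and paths as placeholders:

```bash
# Pass the HDFS client option only for this job; try it both ways to compare
# which addresses the remote datanodes end up being contacted on.
hadoop distcp \
  -Ddfs.client.use.datanode.hostname=true \
  hdfs://namenode1:8020/somefile hdfs://namenode2:8020/
```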
Labels: HDFS