Member since 01-24-2014
101 Posts
32 Kudos Received
18 Solutions
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 28051 | 02-06-2017 11:23 AM
 | 6972 | 11-30-2016 12:56 AM
 | 7871 | 11-29-2016 11:57 PM
 | 3692 | 08-16-2016 11:45 AM
 | 3711 | 05-10-2016 01:55 PM
04-11-2016
09:33 PM
From the context, I'm assuming you have set up a single-node test cluster. HDFS replicates data between different nodes; /data/1, /data/2, and /data/3 are just different drives on one node. HDFS uses each of those drives to store blocks and replicates those blocks to other nodes in the cluster, not to the other drives on the same node. Deleting /data/1 deleted the blocks on that drive, and /data/2 and /data/3 won't have copies of them.

If you have more than one node, HDFS will already have replicated the blocks stored on /data/1 to other nodes, likely spread across all the available drives on those nodes. When /data/1 is deleted in that case, HDFS detects that those blocks went missing the next time the DataNode checks in and automatically starts repairing the under-replicated blocks.

"Missing blocks" implies that the only copy of a block is gone, so the only way to recover it would have been drive recovery operations on that drive. That is what happens on single-node test clusters, hence the assumption above.
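The difference between a missing block and an under-replicated one can be sketched as a toy model. This is an illustration only, not an HDFS API: the function name `block_state` and its arguments are hypothetical.

```shell
# Toy model of the distinction drawn above; not an HDFS API.
# block_state TARGET LIVE
#   TARGET = configured replication factor
#   LIVE   = replicas that are still readable somewhere in the cluster
block_state() {
  target=$1
  live=$2
  if [ "$live" -le 0 ]; then
    echo "missing"            # no copy left, so HDFS has nothing to repair from
  elif [ "$live" -lt "$target" ]; then
    echo "under-replicated"   # HDFS can re-copy from a surviving replica
  else
    echo "healthy"
  fi
}

# Single-node test cluster: replication 1, the only copy was on /data/1.
block_state 1 0
# Multi-node cluster: replication 3, one replica was on the deleted drive.
block_state 3 2
```

On a real cluster, `hdfs fsck /` reports the actual counts of missing and under-replicated blocks.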
03-24-2016
03:39 PM
For the original poster: your issue appears to be related to Kerberos for ZooKeeper. This guide [1] might help.
[1] http://www.cloudera.com/documentation/enterprise/5-4-x/topics/cm_sg_sec_troubleshooting.html
01-18-2016
02:10 PM
You won't save HDFS filesystem space by "archiving" or "combining" small files. In many scenarios, though, you will get a performance boost from combining them, and you will also reduce the metadata overhead on the NameNode.
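To get a feel for the metadata overhead, here is a rough back-of-the-envelope sketch. It assumes the commonly quoted rule of thumb of about 150 bytes of NameNode heap per namespace object (one object per file plus one per block); that figure, the function name `nn_heap_bytes`, and the file counts are illustrative assumptions, not measurements.

```shell
# Rough NameNode heap estimate.
# ASSUMPTION: ~150 bytes per namespace object (one per file, one per block),
# a commonly quoted rule of thumb rather than an exact figure.
BYTES_PER_OBJECT=150

# nn_heap_bytes FILES BLOCKS_PER_FILE
nn_heap_bytes() {
  echo $(( $1 * (1 + $2) * BYTES_PER_OBJECT ))
}

# 1,000,000 small files, one block each
nn_heap_bytes 1000000 1
# the same data combined into 1,000 larger files of 8 blocks each
nn_heap_bytes 1000 8
```

Under these assumptions the disk space used is the same in both cases, while the estimated NameNode overhead drops from roughly 300 MB to under 2 MB.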
07-17-2015
01:04 PM
I'd think you could use an orchestrator service like Oozie to do this for you. http://oozie.apache.org/
06-16-2015
03:04 PM
1 Kudo
Usually this is due to the MapReduce job driving the RegionServer out of memory (OOM) with a massive number of requests or many very large requests. The region of interest then moves to the next RegionServer, which also runs out of memory, and so on. To fix this, you generally either have to give the MapReduce job fewer resources or increase RegionServer memory; sometimes you can get away with just tuning down the HBase scan cache setting in your MR job. You can confirm my suspicion by checking your garbage collection log for Out of Memory errors.
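A minimal sketch of the log check suggested above. The helper name `count_oom` is hypothetical, and the log location in the usage comment is an assumption; point it at wherever your RegionServer logs actually live.

```shell
# count_oom FILE...: for each file, print how many lines mention an
# OutOfMemoryError. Helper name and log paths are illustrative assumptions.
count_oom() {
  for f in "$@"; do
    # grep -c exits non-zero when there are no matches, hence "|| true"
    n=$(grep -c 'OutOfMemoryError' "$f" || true)
    echo "$f: $n"
  done
}

# Usage (path is an assumption, adjust to your installation):
#   count_oom /var/log/hbase/*regionserver*
```

The scan cache setting mentioned above is typically controlled with `Scan.setCaching(...)` in the job's code or the `hbase.client.scanner.caching` property.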
03-13-2015
11:40 PM
Hello Yibin,
This reply comes 16 days late, so hopefully you've already solved this. If not, from your logs it looks like the META region is not online on any of your RegionServers. META is crucial to HBase, and without it being online almost no operations will succeed. Are your RegionServers reporting in to the HMaster? The HMaster should be aggressively trying to assign the META region. It could also be that HBase can't assign META because HDFS is not actually up. I'd advise checking over your HBase configuration and following the advice here [1].
[1] http://hbase.apache.org/book.html#confirm
Hope this helps!
-Ben
03-13-2015
11:27 PM
1 Kudo
Hello Thai,
Going from CDH 4.5 (HBase 0.94.x) to CDH 5.3 (HBase 0.98.x) is actually two major version jumps. The big one is that 0.96 introduced "the singularity", which in short means HBase is not wire compatible between the two versions. [1]
If you absolutely can't upgrade both clusters to the same version at the same time, then you will have to disable replication and create your own way of replicating the data. I know of two methods to move the data between clusters that would still work:
1: The REST client (slow, so it may not be able to keep up, depending on your use case)
2: Export to HDFS -> DistCp -> Import to the target cluster (batch, so there will be a large lag in synchronization)
[1] http://hbase.apache.org/book.html#upgrade0.96
Hope this helps!
-Ben
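Method 2 above could look roughly like the following dry-run sketch, which only prints the commands unless `DRY_RUN=0` is set. The table name, export path, and NameNode addresses are hypothetical placeholders, and the `run` and `migrate_table` helpers are made up for illustration.

```shell
# Dry-run sketch: prints the commands instead of running them unless DRY_RUN=0.
# Table name, paths, and NameNode addresses are hypothetical placeholders.
DRY_RUN=${DRY_RUN:-1}
run() {
  if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi
}

TABLE=my_table
EXPORT_DIR=/tmp/export/$TABLE
SRC_NN=hdfs://source-nn:8020
DST_NN=hdfs://target-nn:8020

migrate_table() {
  # 1. On the source cluster: dump the table to HDFS.
  run hbase org.apache.hadoop.hbase.mapreduce.Export "$TABLE" "$EXPORT_DIR"
  # 2. Copy the dump from the source cluster to the target cluster.
  run hadoop distcp "$SRC_NN$EXPORT_DIR" "$DST_NN$EXPORT_DIR"
  # 3. On the target cluster: load the dump into an already created table.
  run hbase org.apache.hadoop.hbase.mapreduce.Import "$TABLE" "$EXPORT_DIR"
}

migrate_table
```

Two things to verify for your versions before running this for real: copying between different major Hadoop versions may need a different source URI scheme for DistCp than plain `hdfs://`, and the HBase reference guide describes an `hbase.import.version` property for importing data exported from 0.94.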
08-04-2014
01:12 PM
6 Kudos
The error indicates that MapReduce wants to be able to write to /. You have the owner as hdfs with rwx, the group with r-x, and others set to r-x. Since you added mapred to the group membership earlier by adding it to supergroup and making supergroup the group for /, it is the group-level permissions that we need to modify. To get it working you can do the following:

sudo -u hdfs hdfs dfs -chmod 775 /

This will change the permissions on / to drwxrwxr-x.

As for why MapReduce is trying to write to /, it may be that it's trying to create the /user and /tmp directories that you have defined as the user space and the temporary space. If you don't have those directories, you could instead do the following:

sudo -u hdfs hdfs dfs -mkdir /user
sudo -u hdfs hdfs dfs -chown mapred:mapred /user
sudo -u hdfs hdfs dfs -mkdir /tmp
sudo -u hdfs hdfs dfs -chown mapred:mapred /tmp
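To double-check how an octal mode such as 775 maps to the rwx string, here is a small helper in plain shell. The function name `octal_to_symbolic` is made up for this illustration and is not part of any Hadoop tooling.

```shell
# octal_to_symbolic MODE: print the rwx string for a three-digit octal mode.
# Illustrative helper, not part of Hadoop.
octal_to_symbolic() {
  mode=$1
  out=""
  # Split the mode into single digits and translate each one.
  for digit in $(echo "$mode" | sed 's/./& /g'); do
    case $digit in
      0) out="${out}---" ;;
      1) out="${out}--x" ;;
      2) out="${out}-w-" ;;
      3) out="${out}-wx" ;;
      4) out="${out}r--" ;;
      5) out="${out}r-x" ;;
      6) out="${out}rw-" ;;
      7) out="${out}rwx" ;;
    esac
  done
  printf '%s\n' "$out"
}

octal_to_symbolic 755   # owner rwx, group r-x, others r-x (before the change)
octal_to_symbolic 775   # group gains write (after the change)
```

The only difference between the two modes is the group write bit, which is what members of supergroup such as mapred need in order to write to /.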
07-24-2014
12:06 PM
Hi Mike, yes, I believe you are on the right track. You would need to add the remote host to the cluster and then make it part of the gateway role group for HDFS and the gateway role group for MapReduce for that cluster. You could create a new role group if the remote host needs a different configuration from the other gateway nodes in your cluster for whatever reason.
07-22-2014
12:34 PM
1 Kudo
You are right, I don't see this in the Hadoop metrics or in the JMX stats of the active NameNode on my clusters either. Here is a way to get what you are after with HDFS commands:

hdfs dfs -du /user/*/.Trash
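If a single total is more useful than a per-directory listing, the output can be summed. This is a sketch that assumes the first column of the `hdfs dfs -du` output is the size in bytes; the helper name `sum_du` is hypothetical.

```shell
# sum_du: read `hdfs dfs -du` style lines on stdin and print the total of the
# first column. ASSUMPTION: the first column is the size in bytes.
sum_du() {
  awk '{ total += $1 } END { printf "%.0f\n", total }'
}

# Usage on a cluster (quoted so the local shell does not expand the glob):
#   hdfs dfs -du -s '/user/*/.Trash' | sum_du
```

The `-s` flag asks for one summary line per matched .Trash directory, so the sum covers each user's trash once.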