Member since: 07-08-2013
Posts: 35
Kudos Received: 11
Solutions: 6
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 5774 | 08-05-2015 08:18 AM
 | 2355 | 06-18-2015 04:33 PM
 | 16121 | 05-10-2015 08:10 PM
 | 51781 | 05-10-2015 07:34 PM
 | 5146 | 05-08-2015 09:09 AM
05-10-2015
08:10 PM
1 Kudo
Hi TS, are you still facing this issue too? Have you changed back to 3 replicas, or are you still configured with 2?

1) Should I use "hadoop fs -setrep" to change the replication factor of certain files?
JMS: No. Keep it the way it is for now.

2) What's the manual way to 'force' the affected blocks to replicate themselves?
JMS: It depends... If they are configured to replicate 100 times, you might not have enough nodes, and you cannot force that. How many nodes do you have in your cluster? Can you paste here part of the fsck output?

3) Should I permanently remove certain types of files? For instance, in the fsck report I am seeing a lot of files of this type:
/user/hue/.Trash/150507010000/user/hue/.cloudera_manager_hive_metastore_canary/hive0_hms/cm_test_table1430446320640/p1=p1/p2=421 <dir>
/user/hue/.Trash/150507010000/user/hue/.cloudera_manager_hive_metastore_canary/hive0_hms/cm_test_table1430446620772 <dir>
/user/hue/.Trash/150507010000/user/hue/.cloudera_manager_hive_metastore_canary/hive0_hms/cm_test_table1430446620772/p1=p0 <dir>
JMS: This is the trash. If you don't need those files, clean the trash.

4) How about the /tmp/logs/ files? Do I reset their setrep setting or periodically remove them?
JMS: Same thing. Temporary files. Can you list them to make sure? You might be able to delete them.

5) I am also having quite a few Accumulo tables reporting under-replicated blocks!
JMS: Here again, please paste the logs here. This one is the most concerning. They should have the default replication, unless Accumulo set it to more than the factor of 2 you have.

JMS
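For reference, a minimal sketch of the commands behind points 1 and 2, should they be needed later (the file path is a placeholder; as said above, keep the replication factor as it is for now):

hdfs fsck / | grep -i 'under replicated'   # locate under-replicated files/blocks
hadoop fs -setrep -w 3 /path/to/file       # set replication to 3 and wait for it to finish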
05-10-2015
08:05 PM
You're very welcome! Glad it now works for you. Enjoy your cluster! Best, JMS
05-10-2015
07:34 PM
6 Kudos
So ZK is working fine. Perfect. Make sure HBase is down and clear everything:

hadoop fs -rm -r /hbase/*
echo "rmr /hbase" | zookeeper-client

Then try to start HBase. JMS
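To double-check that everything is cleared before restarting (a quick sketch; the exact "node does not exist" wording may vary with your ZK version):

hadoop fs -ls /hbase                  # should come back empty
echo "ls /hbase" | zookeeper-client   # should report that the /hbase node no longer exists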
05-10-2015
02:10 PM
When you type "zookeeper-client", what's the output? Does it end with something like:

2015-05-10 17:08:53,831 [myid:] - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@975] - Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2015-05-10 17:08:53,832 [myid:] - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@852] - Socket connection established to localhost/127.0.0.1:2181, initiating session
2015-05-10 17:08:53,842 [myid:] - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@1235] - Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x14d3c2b942f0793, negotiated timeout = 30000

WATCHER::

WatchedEvent state:SyncConnected type:None path:null

Can you try to type "help"? And "ls /"? And "ls /hbase"? I don't see any error in your logs. Only INFO. Thanks.
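If the client isn't connecting at all, you can also try pointing it at a specific server explicitly (a sketch; I'm assuming the standard zkCli options apply here, so replace localhost with one of your ZK hosts):

zookeeper-client -server localhost:2181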
05-10-2015
07:13 AM
Hi, I think there is no need to restart everything for now. We need to focus on your ZK issue. You can stop HBase until it's fixed. Can you share the ZK logs (/var/log/zookeeper/)? How is ZK looking in CM? All green? How many ZK servers? JMS
05-09-2015
10:55 AM
Hi TS, sorry, I was in transit and was not able to reply yesterday. So there are two issues here:
1) You cannot access ZK.
2) HBase cannot create the namespace table.
They might be related. If we cannot access ZK, maybe HBase can't either. So let's focus on the ZK issue first. Please stop all HBase services and make sure ZK is running. When you try to run the zkClient command, what do you get? What do you have in the ZK logs? Have you tried to restart the ZK service? JM
05-08-2015
11:27 AM
OK. The namespace table is not there. So your HBase is empty, right? Can you provide hadoop fs -ls -R /hbase/data/ then? Have you also cleared the ZK nodes? If you want to clear HBase correctly you need to (see the sketch below):
1) Clear /hbase/* and make sure /hbase belongs to the HBase user.
2) Clear all HBase ZK nodes.
Thanks, JM
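A minimal sketch of those two steps, assuming the HBase service user is "hbase" (adjust to your setup):

hadoop fs -rm -r /hbase/*             # 1) clear the HBase data directory
hadoop fs -chown hbase:hbase /hbase   #    make sure /hbase belongs to the HBase user
echo "rmr /hbase" | zookeeper-client  # 2) clear all HBase znodes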
05-08-2015
09:51 AM
Hi, Can you please describe how you arrived at this? Have you tried to remove some files from HDFS or some data from ZK? From the logs, it sounds like the master is not able to find the namespace table, so it tries to create it and fails because it exists. Can you paste the result of the following command?

hadoop fs -ls -R /hbase/data/hbase/

Thanks, JMS
05-08-2015
09:09 AM
1 Kudo
OK, so here is the complete situation. When you run a MR job on top of a snapshot, the MR framework will look at all the inputs and create all the tasks for that. However, those tasks might have to wait for some time to be executed, depending on the number of slots available on the cluster vs. the number of tasks. The issue is, if one of the inputs is moved/deleted/split/merged, etc. while the tasks are pending, then the splits no longer point to a valid input and the MR job will fail. To avoid that, we have to create links to all the inputs so that HBase keeps a reference to those files even if they have to be moved, the same way a snapshot does. The issue is, those links have to be in the /hbase folder, and this is why you need the rights for that. So to be able to run a MR job on top of a snapshot, you need a user with read/write access to the /hbase folder. This should be fixed in HBase 1.2 (but it's just planned for now, and you will need to double-check when we get closer to 1.2). Also, please keep in mind that doing MR on top of snapshots bypasses all the HBase layers. Therefore, if there are any ACLs or cell-level security activated on the initial table, they will all be bypassed by the MR job. Everything will be readable by the job. Let me know if you have any other questions or if I can help with anything. HTH. JMS
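For context, here is a minimal sketch of the kind of job setup this applies to, built around TableMapReduceUtil.initTableSnapshotMapperJob. The snapshot name, restore directory, and mapper are hypothetical placeholders:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

public class SnapshotScanJob {

  // Hypothetical mapper: just counts the rows read from the snapshot.
  static class RowCounterMapper extends TableMapper<NullWritable, NullWritable> {
    @Override
    protected void map(ImmutableBytesWritable key, Result value, Context context) {
      context.getCounter("snapshot", "rows").increment(1);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Job job = Job.getInstance(conf, "scan-snapshot");
    job.setJarByClass(SnapshotScanJob.class);

    Scan scan = new Scan();
    scan.setCacheBlocks(false); // recommended for MR jobs

    // "mySnapshot" and /tmp/snapshot-restore are placeholders. The restore
    // dir is where the links to the snapshot's files are materialized,
    // which is why the submitting user needs the write access to /hbase
    // discussed above.
    TableMapReduceUtil.initTableSnapshotMapperJob(
        "mySnapshot", scan, RowCounterMapper.class,
        NullWritable.class, NullWritable.class, job,
        true, new Path("/tmp/snapshot-restore"));

    job.setOutputFormatClass(NullOutputFormat.class);
    job.setNumReduceTasks(0);
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}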
05-08-2015
05:51 AM
FYI, I'm able to reproduce the issue. Steps:
1) Download the 5.3.0 VM
2) Change hadoop-site.xml to manage permissions
3) Create and fill a table
4) Create a snapshot
5) Try to MR over it
I'm now debugging step by step to see where it's coming from. Can you please send me your TableMapReduceUtil.initTableSnapshotMapperJob line? Thanks, JM