Member since: 02-18-2016
Posts: 141
Kudos Received: 19
Solutions: 18
My Accepted Solutions
Views | Posted
---|---
5125 | 12-18-2019 07:44 PM
5155 | 12-15-2019 07:40 PM
1817 | 12-03-2019 06:29 AM
1836 | 12-02-2019 06:47 AM
5849 | 11-28-2019 02:06 AM
11-18-2019
02:24 AM
Hi @Manoj690, it seems your AMS HBase master is not able to start. Please try the steps below:

1. In the Ambari Dashboard, go to the 'Ambari Metrics' section and under the 'Service Actions' dropdown click 'Stop'. Check and confirm from the backend that the AMS process is stopped. If the process is still running, use the command below to stop it:
# ambari-metrics-collector stop
2. Delete all AMS HBase data. In the Ambari Dashboard, under the Ambari Metrics section, search for the configuration value "hbase.rootdir", then back up and remove everything under that path. Eg.
# hdfs dfs -cp /user/ams/hbase/* /tmp/
# hdfs dfs -rm -r -skipTrash /user/ams/hbase/*
3. In the Ambari Dashboard, under the Ambari Metrics section, search for the configuration value "hbase.tmp.dir". Back up that directory and remove the data. Eg.
# cp -r /var/lib/ambari-metrics-collector/hbase-tmp/* /tmp/
# rm -rf /var/lib/ambari-metrics-collector/hbase-tmp/*
4. Remove the znode for HBase in the ZooKeeper CLI. Log in to the Ambari UI -> Ambari Metrics -> Configs -> Advanced ams-hbase-site and search for the property "zookeeper.znode.parent", then remove that znode:
# /usr/hdp/current/zookeeper-client/bin/zkCli.sh
# rmr /ams-hbase-secure
5. Start AMS.

Let me know if you still have issues.
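Once AMS is started again, here is a minimal sketch for confirming the collector and its embedded HBase master actually came up, assuming the default log locations and the default collector port 6188 (adjust both to your environment):

```bash
# Collector process is running
ps -ef | grep -i ambari-metrics-collector | grep -v grep
# Collector is listening on its (default) port
netstat -tlnp | grep 6188

# If the HBase master still fails, check the collector and embedded HBase master logs
tail -100 /var/log/ambari-metrics-collector/ambari-metrics-collector.log
tail -100 /var/log/ambari-metrics-collector/hbase-ams-master-*.log
```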
11-17-2019
11:10 PM
1 Kudo
Hi @Manoj690,
1. First check which directories are configured for DFS data storage. Log in to Ambari UI -> Services -> HDFS -> Configs -> [search for dfs.datanode.data.dir] and capture the list of directories defined there. Eg. my list is: /data01/hadoop/hdfs/data,/data02/hadoop/hdfs/data
2. Log in to the datanodes and go to the mount. In my case:
$ cd /data01/hadoop/hdfs/
3. Check whether there is any other directory/data inside "/data01/hadoop/hdfs/" or "/data01/".
4. Anything other than the "**/data" directories is considered non-DFS and is what shows up as non-DFS used in dfsadmin -report.
5. Getting rid of that data will reduce your non-DFS used.
You can share your output if you have any confusion or need help.
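As a rough illustration of steps 3 and 4, this sketch (assuming the example mounts /data01 and /data02 above; substitute your own dfs.datanode.data.dir values) compares total mount usage with the size of the DataNode data directory, so whatever is left over is candidate non-DFS usage:

```bash
#!/bin/bash
# Rough non-DFS check per mount; paths are the example values from this thread.
for mount in /data01 /data02; do
    datadir="$mount/hadoop/hdfs/data"
    echo "== $mount =="
    df -h "$mount" | tail -1               # total used on the mount
    du -sh "$datadir"                      # space actually used by HDFS block data
    du -sh "$mount"/* | grep -v hadoop     # candidates for non-DFS usage
done

# Cluster-wide view of DFS vs non-DFS used
hdfs dfsadmin -report | grep -E "DFS Used|Non DFS Used"
```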
11-17-2019
10:29 PM
The chronyd commands are broadly similar to NTP's; you can refer to this for debugging: https://www.thegeekdiary.com/centos-rhel-7-tips-on-troubleshooting-ntp-chrony-issues/
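For a quick start, these are the standard chronyc equivalents of the ntpq checks mentioned elsewhere in this thread (a minimal sketch; run as root on each host):

```bash
# Verify chronyd is running and in sync
systemctl status chronyd
chronyc tracking          # current offset, stratum and sync status
chronyc sources -v        # configured time sources and their reachability
chronyc sourcestats       # drift/jitter statistics per source
```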
11-17-2019
10:27 PM
Please check this once:
Try running "ntpdate ipsap01.ecb.de" on all hosts and check whether any issue is reported while running this command. Make sure chronyd/ntp.conf is the same on all nodes, then run:
hwclock --systohc
systemctl restart cloudera-scm-agent
Furthermore, if the above does not help, you need to debug the NTP server side. Execute the commands below:
ntpq -c pe
If the refid column indicates ".INIT.", it can suggest a communication issue.
ntpq -c as
If the reach column indicates "no", it suggests that the client cannot reach peer hosts.
You probably also need to check the stratum of your NTP servers. The "assID" from ntpq -c as can be used with ntpq -c "rv <assID>" to determine the stratum. The lower the stratum the better; the upper limit for stratum is 15, and stratum 16 indicates that a device is unsynchronized.
ntpq -c "rv <association_id_from_above_command_output>"
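If you want to automate the stratum check across all associations, here is a small sketch (assuming the classic ntpq client from the ntp package is installed; the awk parsing of the ntpq -c as output is an assumption about its usual column layout):

```bash
#!/bin/bash
# Print stratum, refid and reach for every NTP association on this host.
for assid in $(ntpq -c as | awk '/^ *[0-9]+/ {print $2}'); do
    echo "== association $assid =="
    ntpq -c "rv $assid stratum,refid,reach"
done
```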
11-15-2019
06:10 AM
@mike_bronson7 You just need to back up /hadoop/hdfs/namenode/current from the active namenode. Also, if you take the backup a week before the activity and your first cluster keeps serving client requests, you will lose the data written after the backup. So it is best to run saveNamespace and take the backup at the time of the activity, with clients frozen out of the cluster.
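A minimal sketch of that saveNamespace-plus-backup step, run as the hdfs user on the active namenode (the metadata path is the example from this thread; adjust it to your dfs.namenode.name.dir):

```bash
# Freeze writes and flush the in-memory namespace to a new fsimage
hdfs dfsadmin -safemode enter
hdfs dfsadmin -saveNamespace

# Back up the freshly written metadata
tar czf /tmp/nn-current-$(date +%F).tar.gz /hadoop/hdfs/namenode/current

# Leave safe mode once the backup is taken
hdfs dfsadmin -safemode leave
```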
11-15-2019
04:53 AM
By backup I mean copying the namenode current directory only. First turn safe mode on and then save the namespace. Once both commands have been executed, take a backup of the namenode current directory from the active node. You can copy it to the destination/new cluster using any command (like scp) or tool; scp will be the simplest option.
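For the copy itself, a hedged example (the hostname is a placeholder, and the directory layout on the new cluster is assumed to match the source):

```bash
# Copy the backed-up metadata to the new cluster's namenode host
scp -r /hadoop/hdfs/namenode/current new-nn-host:/hadoop/hdfs/namenode/

# On the new namenode, make sure the hdfs user owns the copied files
ssh new-nn-host "chown -R hdfs:hadoop /hadoop/hdfs/namenode/current"
```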
11-15-2019
03:13 AM
1 Kudo
1. If you can back up the metadata from the original cluster (where the datanodes existed at first) and copy that metadata to the new cluster, that is the best option.
2. If you are not able to go with point 1, then you can probably try the "hadoop namenode -recover" option (a rough sketch follows below). The links below might be useful:
https://blog.cloudera.com/understanding-hdfs-recovery-processes-part-1/
https://clouderatemp.wpengine.com/blog/2015/03/understanding-hdfs-recovery-processes-part-2/
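A minimal sketch of the recovery run, as an illustration rather than a guaranteed procedure (stop the namenode first, keep a copy of whatever metadata exists, and expect interactive prompts; the metadata path is an assumed example):

```bash
# Back up whatever metadata exists before touching it
cp -r /hadoop/hdfs/namenode /hadoop/hdfs/namenode.bak

# Run the interactive metadata recovery and answer the prompts carefully;
# on newer releases the equivalent form is: hdfs namenode -recover
hadoop namenode -recover

# If recovery completes, start the NameNode again and check what it reports
hdfs dfsadmin -report
```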
11-15-2019
02:06 AM
@mike_bronson7 What I got from your scenario is that in the second from-scratch installation your master nodes [i.e. active/standby namenodes] are freshly installed and you are only adding datanodes that have pre-existing data [from the other cluster], right? In this case it is not possible to bring the cluster up with the old data from the restored HDDs, since the namenode will not have any information about the blocks lying in block storage on the datanode disks. If you have a support subscription with Cloudera, you can approach them for a DR scenario where they can help you get the existing data on the datanodes added back into the cluster [not sure if it can be recovered/added back 100%]. The same applies to Kafka.
11-14-2019
10:52 PM
@BaoHo You can reach them on the numbers below, or submit your details here https://www.cloudera.com/contact-sales.html and a sales person will reach out to you. US: (888) 789-1488 International: +1 (650) 362-0488
11-14-2019
08:53 PM
@bdelpizzo Can you see any errors in the Kafka logs/MirrorMaker logs? It is possible that MirrorMaker is not able to process messages because of message size. If the size of any message is higher than the configured/default value, it might get stuck in the queue. Check the message.max.bytes property.
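As a hedged illustration (the topic name, sizes, ZooKeeper host and HDP install path are placeholders), this is one way to inspect and raise the related size limits with the stock Kafka tooling; the broker-side message.max.bytes and per-topic max.message.bytes must cover your largest message, and MirrorMaker's producer max.request.size should be at least as large:

```bash
# Check the current per-topic override, if any
/usr/hdp/current/kafka-broker/bin/kafka-configs.sh --zookeeper zk-host:2181 \
    --entity-type topics --entity-name my-topic --describe

# Raise the per-topic limit to e.g. 10 MB
/usr/hdp/current/kafka-broker/bin/kafka-configs.sh --zookeeper zk-host:2181 \
    --entity-type topics --entity-name my-topic --alter \
    --add-config max.message.bytes=10485760

# MirrorMaker's producer.properties should allow requests of the same size
echo "max.request.size=10485760" >> /path/to/mm-producer.properties
```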