Member since: 01-20-2017 | Posts: 17 | Kudos Received: 1 | Solutions: 0
12-12-2024
09:40 AM
Though you can manually intervene to fix under-replicated blocks, HDFS has matured a lot, and the NameNode will take care of fixing under-replicated blocks on its own. The drawback of the manual step is that it adds load to NameNode operations and may degrade the performance of running jobs. So if you plan to do it manually, do it outside business hours or over the weekend.
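For reference, when you do intervene manually, the usual pattern is an fsck scan to list affected files, then setrep to trigger re-replication. The hdfs/hadoop commands are shown as comments because they need a node with an HDFS client configured; the sample path and replication factor are illustrative, not from the original post:

```shell
# Manual repair sketch (run on a node with an HDFS client configured):
#   hdfs fsck / | grep -i 'Under replicated' | awk -F: '{print $1}' > /tmp/under_replicated
#   while read -r f; do hadoop fs -setrep 3 "$f"; done < /tmp/under_replicated
#
# The grep/awk filter pulls the file path out of fsck lines such as:
sample='/data/app/part-0001:  Under replicated BP-1:blk_42. Target Replicas is 3 but found 2 replica(s).'
echo "$sample" | grep -i 'Under replicated' | awk -F: '{print $1}'
```

Because setrep queues re-replication work on the NameNode, this is exactly the load the paragraph above warns about.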
10-10-2018
06:20 PM
Sorry to come to this party so late, but the script as presented at https://community.hortonworks.com/articles/38149/how-to-create-and-register-custom-ambari-alerts.html doesn't work on CentOS 7 + Python 2.7 + Ambari 2.6.2.2. I can write a mean bash script, but I'm not a Python coder. In spite of my deficiencies, I got things working.
As Dmitro implies, the script by default tries to assess utilization of all mounts, not just mounted block devices - and when you're looking at shared-memory or proc objects and the like, that quickly becomes problematic. The solution posted here - a custom list of mount points - works, but isn't flexible. Without extensively rewriting the script, it's better to just strip out things like '/sys', '/proc', '/dev', and '/run'. We also need to strip out the net_prio and cpuacct cgroup entries.
So, with the understanding that there's almost certainly a better way to do this, I changed:

print "mountPoints = " + mountPoints
mountPointsList = mountPoints.split(",")
print mountPointsList
for l in mountPointsList:

to:

print "mountPoints = " + mountPoints
mountPointsList = mountPoints.split(",")
mountPointsList = [ x for x in mountPointsList if not x.startswith('net_pri')]
mountPointsList = [ x for x in mountPointsList if not x.startswith('cpuacc')]
mountPointsList = [ x for x in mountPointsList if not x.startswith('/sys')]
mountPointsList = [ x for x in mountPointsList if not x.startswith('/proc')]
mountPointsList = [ x for x in mountPointsList if not x.startswith('/run')]
mountPointsList = [ x for x in mountPointsList if not x.startswith('/dev')]
print mountPointsList
for l in mountPointsList:

And it works. It's perhaps also worth noting that to get the script to run from the command line, you'll need to link several library directory structures, similar to:

ln -s /usr/lib/ambari-server/lib/resource_management /usr/lib/python2.7/site-packages/
ln -s /usr/lib/ambari-server/lib/ambari_commons /usr/lib/python2.7/site-packages/
ln -s /usr/lib/ambari-server/lib/ambari_simplejson /usr/lib/python2.7/site-packages/

After that, you can do like so:

# python test_alert_disk_space.py
mountPoints = ,/sys,/proc,/dev,/sys/kernel/security,/dev/shm,/dev/pts,/run,/sys/fs/cgroup,/sys/fs/cgroup/systemd,/sys/fs/pstore,/sys/fs/cgroup/cpu,cpuacct,/sys/fs/cgroup/net_cls,net_prio,/sys/fs/cgroup/hugetlb,/sys/fs/cgroup/blkio,/sys/fs/cgroup/devices,/sys/fs/cgroup/perf_event,/sys/fs/cgroup/freezer,/sys/fs/cgroup/cpuset,/sys/fs/cgroup/memory,/sys/fs/cgroup/pids,/sys/kernel/config,/,/sys/fs/selinux,/proc/sys/fs/binfmt_misc,/dev/mqueue,/sys/kernel/debug,/dev/hugepages,/data,/boot,/proc/sys/fs/binfmt_misc,/run/user/1000,/run/user/0
['', '/', '/data', '/boot']
---------- l :
FINAL finalResultCode CODE .....
---------- l : /
/
disk_usage.total
93365735424
=>OK
FINAL finalResultCode CODE .....OK
---------- l : /data
/data
disk_usage.total
1063256064
=>OK
FINAL finalResultCode CODE .....OK
---------- l : /boot
/boot
disk_usage.total
1063256064
=>OK
FINAL finalResultCode CODE .....OK
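For what it's worth, the six repeated filters above could be collapsed into a single comprehension, since startswith() accepts a tuple of prefixes in both Python 2 and 3. This is only a sketch; filter_mount_points and EXCLUDED_PREFIXES are my names, not part of the Ambari script:

```python
# Prefixes to drop: pseudo-filesystems plus the cgroup controller names
# (net_prio, cpuacct) that leak into the comma-joined mount list.
EXCLUDED_PREFIXES = ('net_pri', 'cpuacc', '/sys', '/proc', '/run', '/dev')

def filter_mount_points(mount_points):
    """Split the comma-joined mount string; drop empty and excluded entries."""
    return [m for m in mount_points.split(",")
            if m and not m.startswith(EXCLUDED_PREFIXES)]
```

Unlike the version above, this also drops the empty leading entry ('') that shows up in the output when mountPoints begins with a comma.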
08-02-2018
07:29 PM
Bless you!!
04-16-2019
01:08 PM
Referring to the PUT command used for triggering the alert manually: is it possible to pass parameters/custom information to the script that gets triggered by the alert? If yes, is it via the headers or the body of the PUT request? Thanks.
08-27-2018
08:30 PM
The article doesn't indicate this, so for reference: the listed HDFS settings do not exist by default. These settings, shown below, need to go into hdfs-site.xml, which is done in Ambari by adding fields under "Custom hdfs-site".

dfs.namenode.rpc-bind-host=0.0.0.0
dfs.namenode.servicerpc-bind-host=0.0.0.0
dfs.namenode.http-bind-host=0.0.0.0
dfs.namenode.https-bind-host=0.0.0.0

Additionally, I found that after making this change, both NameNodes under HA came up as standby; the article at https://community.hortonworks.com/articles/2307/adding-a-service-rpc-port-to-an-existing-ha-cluste.html got me the missing step of running a ZK format. I have not tested the steps below against a Production cluster, and if you foolishly choose to follow these steps, you do so at a very large degree of risk (you could lose all of the data in your cluster). That said, this worked for me in a non-Prod environment:

01) Note the Active NameNode.
02) In Ambari, stop ALL services except for ZooKeeper.
03) In Ambari, make the indicated changes to HDFS.
04) Get to the command line on the Active NameNode (see Step 1 above).
05) At the command line you opened in Step 4, run: sudo -u hdfs hdfs zkfc -formatZK
06) Start the JournalNodes.
07) Start the ZKFCs.
08) Start the NameNodes, which should come up as Active and Standby. If they don't, you're on your own (see the "high risk" caveat above).
09) Start the DataNodes.
10) Restart / refresh any remaining HDFS components that have stale configs.
11) Start the remaining cluster services.

It would be great if HWX could vet my procedure and update the article accordingly (hint, hint).
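In raw hdfs-site.xml terms (this is what Ambari writes out from those Custom hdfs-site fields; the fragment below is a sketch, not copied from a cluster), the properties take the standard Hadoop property form:

```xml
<property>
  <name>dfs.namenode.rpc-bind-host</name>
  <value>0.0.0.0</value>
</property>
<property>
  <name>dfs.namenode.servicerpc-bind-host</name>
  <value>0.0.0.0</value>
</property>
<!-- dfs.namenode.http-bind-host and dfs.namenode.https-bind-host
     follow the same pattern, also set to 0.0.0.0 -->
```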
10-04-2018
06:01 PM
Go to https://support.hortonworks.com/CustomerPortalLoginPageV2?ec=302&startURL=%2Fs%2F then Login >> Tools >> Generate SmartSense ID. There you will be able to see your Account Name and Customer ID. Now go to Ambari >> SmartSense >> Config, fill in your details, and restart the SmartSense Analyzer. You will then be able to generate bundles.