Member since: 04-13-2016
Posts: 422
Kudos Received: 150
Solutions: 55
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 1862 | 05-23-2018 05:29 AM |
 | 4870 | 05-08-2018 03:06 AM |
 | 1629 | 02-09-2018 02:22 AM |
 | 2637 | 01-24-2018 08:37 PM |
 | 6058 | 01-24-2018 05:43 PM |
04-19-2018
08:50 PM
Until HIVE-13670, we had to remember the complete Hive connection string, whether using the direct port 10000 URL or the ZooKeeper connection string. With that JIRA, we can simplify this by setting an environment variable (for example in /etc/profile) on the edge nodes:

export BEELINE_URL_HIVE="<jdbc url>"

Example:

export BEELINE_URL_HIVE="jdbc:hive2://<ZOOKEEPER QUORUM>/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2"

Now just type:

beeline -u HIVE

We can even set up multiple connection strings by defining differently named variables, such as BEELINE_URL_BATCH and BEELINE_URL_LLAP. Hope this helps you.
03-18-2018
08:02 PM
The problem was caused by iptables. I turned it off and it now works with NAT (port forwarding).
08-20-2018
11:45 AM
@Dhiraj Refer to the article below: https://community.hortonworks.com/storage/attachments/5493-bench-marking-and-stress-testing-ilovepdf-compress.pdf
01-25-2018
09:44 PM
1 Kudo
@Sridhar Reddy Since the Spark2 interpreter is in globally shared mode, there is only one Spark2 session (i.e. one Spark2 context) shared between all users and all notebooks in Zeppelin. A variable defined in one paragraph of one notebook may be accessed freely in other paragraphs of the same notebook and, for that matter, in paragraphs of other notebooks as well. Attaching screenshots: screen-shot-2018-01-25-at-14317-pm.png, screen-shot-2018-01-25-at-14344-pm.png
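A quick way to see this for yourself (a minimal sketch; the paragraph contents are made up for illustration, and spark is the session object Zeppelin injects into the %spark2.pyspark interpreter):

%spark2.pyspark
# Notebook A, paragraph 1: define a name in the shared interpreter session
row_count = spark.range(1000).count()

%spark2.pyspark
# Notebook B, any paragraph: it is served by the same interpreter process,
# so the name defined above is still in scope
print(row_count)  # 1000

If users need isolation from each other, the interpreter would have to be switched to a scoped or isolated binding mode instead of globally shared.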
10-10-2018
06:20 PM
Sorry to come to this party so late, but the script as presented at
https://community.hortonworks.com/articles/38149/how-to-create-and-register-custom-ambari-alerts.html doesn't work on CentOS7 + Python 2.7 + Ambari 2.6.2.2. I can write a mean bash script, but I'm not a Python coder. In spite of my deficiencies, I got things working.
As Dmitro implies, the script by default tries to assess utilization of all mounts, not just mounted block devices - and when you're looking at shared memory or proc objects and the like, that quickly becomes problematic. The solution posted here - a custom list of mount points - works, but isn't flexible. Without extensively rewriting the script, it is better to just strip out things like '/sys', '/proc', '/dev', and '/run'. We also need to strip out net_prio and cpuacct.
So, with the understanding that there's almost certainly a better way to do this, I changed:

print "mountPoints = " + mountPoints
mountPointsList = mountPoints.split(",")
print mountPointsList
for l in mountPointsList:

to:

print "mountPoints = " + mountPoints
mountPointsList = mountPoints.split(",")
mountPointsList = [ x for x in mountPointsList if not x.startswith('net_pri')]
mountPointsList = [ x for x in mountPointsList if not x.startswith('cpuacc')]
mountPointsList = [ x for x in mountPointsList if not x.startswith('/sys')]
mountPointsList = [ x for x in mountPointsList if not x.startswith('/proc')]
mountPointsList = [ x for x in mountPointsList if not x.startswith('/run')]
mountPointsList = [ x for x in mountPointsList if not x.startswith('/dev')]
print mountPointsList
for l in mountPointsList:

And it works. It's perhaps also worth noting that to get the script to run from the command line, you'll need to link several library directory structures, similar to:

ln -s /usr/lib/ambari-server/lib/resource_management /usr/lib/python2.7/site-packages/
ln -s /usr/lib/ambari-server/lib/ambari_commons /usr/lib/python2.7/site-packages/
ln -s /usr/lib/ambari-server/lib/ambari_simplejson /usr/lib/python2.7/site-packages/

After that, you can do like so:

# python test_alert_disk_space.py
mountPoints = ,/sys,/proc,/dev,/sys/kernel/security,/dev/shm,/dev/pts,/run,/sys/fs/cgroup,/sys/fs/cgroup/systemd,/sys/fs/pstore,/sys/fs/cgroup/cpu,cpuacct,/sys/fs/cgroup/net_cls,net_prio,/sys/fs/cgroup/hugetlb,/sys/fs/cgroup/blkio,/sys/fs/cgroup/devices,/sys/fs/cgroup/perf_event,/sys/fs/cgroup/freezer,/sys/fs/cgroup/cpuset,/sys/fs/cgroup/memory,/sys/fs/cgroup/pids,/sys/kernel/config,/,/sys/fs/selinux,/proc/sys/fs/binfmt_misc,/dev/mqueue,/sys/kernel/debug,/dev/hugepages,/data,/boot,/proc/sys/fs/binfmt_misc,/run/user/1000,/run/user/0
['', '/', '/data', '/boot']
---------- l :
FINAL finalResultCode CODE .....
---------- l : /
/
disk_usage.total
93365735424
=>OK
FINAL finalResultCode CODE .....OK
---------- l : /data
/data
disk_usage.total
1063256064
=>OK
FINAL finalResultCode CODE .....OK
---------- l : /boot
/boot
disk_usage.total
1063256064
=>OK
FINAL finalResultCode CODE .....OK
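For what it's worth, the same filtering could be written a bit more compactly, since str.startswith accepts a tuple of prefixes (just a sketch in the same spirit, not part of the change above):

excluded_prefixes = ('net_pri', 'cpuacc', '/sys', '/proc', '/run', '/dev')
# keep only real mount points, skipping empty entries and the excluded prefixes
mountPointsList = [x for x in mountPoints.split(",")
                   if x and not x.startswith(excluded_prefixes)]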
01-09-2018
04:35 AM
@prarthana basgod As the official HBase book states: "You may need to find a sweet spot between a low number of RPCs and the memory used on the client and server. Setting the scanner caching higher will improve scanning performance most of the time, but setting it too high can have adverse effects as well: each call to next() will take longer as more data is fetched and needs to be transported to the client, and once you exceed the maximum heap the client process has available it may terminate with an OutOfMemoryException. When the time taken to transfer the rows to the client, or to process the data on the client, exceeds the configured scanner lease threshold, you will end up receiving a lease expired error, in the form of a ScannerTimeoutException being thrown."

So rather than avoiding the exception through the configuration above, it would be better to set the caching on your Map side lower, enabling your mappers to process the required load within the pre-specified time interval. Alternatively, you can increase the lease period:

<property>
  <name>hbase.regionserver.lease.period</name>
  <value>300000</value>
</property>

Hope this helps you.
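As an illustration of the "lower caching" side of that trade-off (a sketch using the happybase Python client rather than a MapReduce job; the host and table names are placeholders, and in Java the equivalent knob is the caching value set on the Scan object):

import happybase

# a smaller batch_size means more RPCs, but less data per call, so each
# round-trip returns quickly and the scanner lease is less likely to expire
connection = happybase.Connection('hbase-thrift-host')  # placeholder host
table = connection.table('my_table')                    # placeholder table

for row_key, columns in table.scan(batch_size=100):
    print(row_key)  # per-row work goes here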
09-30-2017
01:22 AM
@frank policano May I know what version of HDP you are using? This was addressed in HDFS-6621 and officially released as part of Apache Hadoop 2.6.0. Since it is a bug in the Balancer itself, it is possible to run an updated version of the Balancer without upgrading your cluster. DataNodes limit the number of threads used for balancing so as not to eat up all the resources of the cluster/DataNode; that is what causes the WARN statement you're seeing. By default the number of threads is 5, and this was not configurable prior to Apache Hadoop 2.5.0. HDFS-6595 added the property dfs.datanode.balance.max.concurrent.moves to allow you to control the number of threads used for balancing. Since this is a DataNode-side property, it requires an upgrade to your cluster if you want to use this setting. https://stackoverflow.com/questions/25222633/hadoop-balancer-command-warn-messages-threads-quota-is-exceeded This article should also help in resolving the balancer issue by running it from the command line: https://community.hortonworks.com/questions/19694/help-with-exception-from-hdfs-balancer.html Hope this helps you.
09-19-2017
04:31 AM
The reason Ambari is unable to start the NameNode smoothly is a bug, and below is the workaround. The issue was fixed permanently in Ambari 2.5.x.

A few lines of the error message from the Ambari Ops logs:

File "/usr/lib/python2.6/site-packages/resource_management/libraries/functions/decorator.py", line 55, in wrapper
    return function(*args, **kwargs)
File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_namenode.py", line 562, in is_this_namenode_active
    raise Fail(format("The NameNode {namenode_id} is not listed as Active or Standby, waiting..."))
resource_management.core.exceptions.Fail: The NameNode nn2 is not listed as Active or Standby, waiting...

ROOT CAUSE: https://issues.apache.org/jira/browse/AMBARI-18786

RESOLUTION: Increase the timeout in /var/lib/ambari-server/resources/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_namenode.py from this:

@retry(times=5, sleep_time=5, backoff_factor=2, err_class=Fail)

to this:

@retry(times=25, sleep_time=25, backoff_factor=2, err_class=Fail)
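To see why bumping those numbers extends how long Ambari waits for the NameNode, here is roughly how a retry-with-backoff decorator of this shape behaves (a generic sketch, not the actual resource_management implementation):

import time
from functools import wraps

def retry(times=5, sleep_time=5, backoff_factor=2, err_class=Exception):
    # call the wrapped function up to `times` times; after each failure,
    # sleep and grow the delay by `backoff_factor`, so larger times/sleep_time
    # values translate into a much longer overall wait before giving up
    def decorator(function):
        @wraps(function)
        def wrapper(*args, **kwargs):
            delay = sleep_time
            for attempt in range(times):
                try:
                    return function(*args, **kwargs)
                except err_class:
                    if attempt == times - 1:
                        raise
                    time.sleep(delay)
                    delay *= backoff_factor
        return wrapper
    return decorator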