Member since: 09-18-2015
Posts: 3274
Kudos Received: 1159
Solutions: 426

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2555 | 11-01-2016 05:43 PM |
| | 8469 | 11-01-2016 05:36 PM |
| | 4848 | 07-01-2016 03:20 PM |
| | 8162 | 05-25-2016 11:36 AM |
| | 4305 | 05-24-2016 05:27 PM |
10-30-2015
09:37 PM
1 Kudo
@Andrew Grande Thanks for sharing the HBase approach. Nice!!!
10-30-2015
09:35 PM
1 Kudo
If you are asking about iptables: either iptables stays on with port exceptions for the services you expose, or Knox works its charm as the single entry point.
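As a rough sketch of the "iptables stays on with port exceptions" option: only the ports you actually expose (here Knox's default 8443, plus SSH) are opened, and everything else is dropped. The ports and policy below are assumptions, not a recommendation - adjust them to whatever your gateway actually listens on.

```bash
# Sketch: keep iptables on, punch holes only for what must be reachable
iptables -A INPUT -i lo -j ACCEPT
iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
iptables -A INPUT -p tcp --dport 22 -j ACCEPT     # SSH for admins
iptables -A INPUT -p tcp --dport 8443 -j ACCEPT   # Knox gateway (default port)
iptables -A INPUT -j DROP                         # everything else
```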
10-30-2015
06:43 PM
1 Kudo
@Mats Johansson Yes, see the link. With Alert Groups and Notifications, you can create groups of alerts and set up notification targets for each group. This way, you can notify different parties interested in certain sets of alerts via different methods. For example, you might want your Hadoop Operations team to receive all alerts via EMAIL, regardless of status, and at the same time have your System Administration team receive only Critical RPC- and CPU-related alerts via SNMP. To achieve this, you would have one Alert Notification that handles EMAIL for all alert groups at all severity levels, and a different Alert Notification that handles SNMP on Critical severity for an Alert Group containing the RPC and CPU alerts.
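If you prefer to script this instead of clicking through the Ambari web UI, the same targets can be created through Ambari's REST API. The sketch below is only illustrative: the endpoint and property keys should be verified against the Ambari API documentation for your version, and the host names, recipients, credentials, and group IDs are placeholders.

```bash
# Hypothetical sketch: create an EMAIL target that fires for all alert groups
# and all severities (verify the endpoint and property keys for your version).
curl -u admin:admin -H "X-Requested-By: ambari" -X POST \
  http://ambari.example.com:8080/api/v1/alert_targets \
  -d '{
    "AlertTarget": {
      "name": "ops_email",
      "description": "Email Hadoop Operations on every alert",
      "notification_type": "EMAIL",
      "global": true,
      "alert_states": ["OK", "WARNING", "CRITICAL", "UNKNOWN"],
      "properties": {
        "ambari.dispatch.recipients": "[\"hadoop-ops@example.com\"]",
        "mail.smtp.host": "smtp.example.com",
        "mail.smtp.port": "25",
        "mail.smtp.from": "ambari@example.com"
      }
    }
  }'
# A second, SNMP target for the scenario above would use
# "notification_type": "SNMP", "alert_states": ["CRITICAL"], and a
# "groups" list with the id of the RPC/CPU alert group instead of "global".
```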
10-30-2015
05:53 PM
What database are you using for Metastore? @Wes Floyd
10-30-2015
09:33 AM
7 Kudos
@Alex Miller Generally, it's OK to deploy ZooKeeper alongside other components (a dedicated server is not required). As you know, an odd number of ZooKeeper nodes is the best practice, and I see no issue deploying on an HA node, but I would deploy on a non-HA node too to keep the balance. See the link. Here are some common problems you can avoid by configuring ZooKeeper correctly:
Gotchas: Common Problems and Troubleshooting
- If you are using watches, you must look for the connected watch event. When a ZooKeeper client disconnects from a server, you will not receive notification of changes until reconnected. If you are watching for a znode to come into existence, you will miss the event if the znode is created and deleted while you are disconnected.
- You must test ZooKeeper server failures. The ZooKeeper service can survive failures as long as a majority of servers are active. The question to ask is: can your application handle it? In the real world a client's connection to ZooKeeper can break. (ZooKeeper server failures and network partitions are common reasons for connection loss.) The ZooKeeper client library takes care of recovering your connection and letting you know what happened, but you must make sure that you recover your state and any outstanding requests that failed. Find out if you got it right in the test lab, not in production - test with a ZooKeeper service made up of several servers and subject them to reboots.
- The list of ZooKeeper servers used by the client must match the list of ZooKeeper servers that each ZooKeeper server has. Things can work, although not optimally, if the client list is a subset of the real list of ZooKeeper servers, but not if the client lists ZooKeeper servers not in the ZooKeeper cluster.
- Be careful where you put the transaction log. The most performance-critical part of ZooKeeper is the transaction log. ZooKeeper must sync transactions to media before it returns a response. A dedicated transaction log device is key to consistently good performance. Putting the log on a busy device will adversely affect performance. If you only have one storage device, put trace files on NFS and increase the snapshotCount; it doesn't eliminate the problem, but it can mitigate it.
- Set your Java max heap size correctly. It is very important to avoid swapping. Going to disk unnecessarily will almost certainly degrade your performance unacceptably. Remember, in ZooKeeper, everything is ordered, so if one request hits the disk, all other queued requests hit the disk. To avoid swapping, try to set the heap size to the amount of physical memory you have, minus the amount needed by the OS and cache. The best way to determine an optimal heap size for your configuration is to run load tests. If for some reason you can't, be conservative in your estimates and choose a number well below the limit that would cause your machine to swap. For example, on a 4G machine, a 3G heap is a conservative estimate to start with. (See the sketch after this list for where these settings live.)
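To make the last two points concrete, here is a minimal sketch of the relevant server-side settings. The paths, host names, and heap value are placeholder assumptions for a three-node ensemble, not recommendations for any specific cluster.

```ini
# zoo.cfg (sketch - paths and hosts are placeholders)
tickTime=2000
initLimit=10
syncLimit=5
clientPort=2181
dataDir=/hadoop/zookeeper            # snapshots
dataLogDir=/hadoop/zookeeper-txlog   # transaction log on its own, quiet device
# every server and every client should use this same ensemble list
server.1=zk1.example.com:2888:3888
server.2=zk2.example.com:2888:3888
server.3=zk3.example.com:2888:3888
```

The heap cap usually goes in conf/java.env, which zkEnv.sh sources at startup; keep it below physical RAM minus what the OS and cache need, so the JVM never swaps.

```bash
# conf/java.env (sketch) - e.g. roughly a 3 GB heap on a 4 GB machine
export JVMFLAGS="-Xmx3g"
```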
10-30-2015
01:03 AM
@pardeep In 2.6.0 - no. The fixed version is in the link that you posted. My cluster (HDP 2.3.2) has not reported the issue.
10-30-2015
01:00 AM
@ayusuf@hortonworks.com This is a good explanation: the NameNode will know about a stale DataNode via dfs.namenode.stale.datanode.interval:
Default time interval for marking a DataNode as "stale", i.e., if the NameNode has not received a heartbeat message from a DataNode for more than this interval, the DataNode will be marked and treated as "stale" by default. The stale interval cannot be too small, since otherwise it may cause too-frequent changes of stale state. We thus set a minimum stale interval value (the default is 3 times the heartbeat interval) and guarantee that the stale interval cannot be less than that minimum. A stale DataNode is avoided during lease/block recovery. It can be conditionally avoided for reads (see dfs.namenode.avoid.read.stale.datanode) and for writes (see dfs.namenode.avoid.write.stale.datanode).
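For reference, these are the properties named above as they would appear in hdfs-site.xml. The values are only illustrative (30000 ms is the usual default for the stale interval); check hdfs-default.xml for your HDP version before changing anything.

```xml
<!-- hdfs-site.xml (sketch): stale-DataNode handling; values are illustrative -->
<property>
  <name>dfs.namenode.stale.datanode.interval</name>
  <value>30000</value>   <!-- ms without a heartbeat before a DN is marked stale -->
</property>
<property>
  <name>dfs.namenode.avoid.read.stale.datanode</name>
  <value>true</value>    <!-- prefer non-stale DNs when serving reads -->
</property>
<property>
  <name>dfs.namenode.avoid.write.stale.datanode</name>
  <value>true</value>    <!-- avoid stale DNs when choosing write targets -->
</property>
```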
10-30-2015
12:41 AM
Are you trying to configure a local repo? Or what is the customization in the repo file?