Member since: 09-18-2015
Posts: 3274
Kudos Received: 1159
Solutions: 426

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2555 | 11-01-2016 05:43 PM |
| | 8469 | 11-01-2016 05:36 PM |
| | 4848 | 07-01-2016 03:20 PM |
| | 8162 | 05-25-2016 11:36 AM |
| | 4305 | 05-24-2016 05:27 PM |
10-30-2015
09:37 PM
1 Kudo
@Andrew Grande Thanks for sharing the HBase approach. Nice!!!
10-30-2015
09:35 PM
1 Kudo
If you are asking about iptables: either iptables stays on with port exceptions for the services you expose, or Knox works its charm as the single entry point.
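As a rough sketch of the "iptables stays on with port exceptions" option: only the ports you actually expose (here Knox's default 8443, plus SSH) are opened, and everything else is dropped. The ports and policy below are assumptions, not a recommendation - adjust them to whatever your gateway actually listens on.

```bash
# Sketch: keep iptables on, punch holes only for what must be reachable
iptables -A INPUT -i lo -j ACCEPT
iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
iptables -A INPUT -p tcp --dport 22 -j ACCEPT     # SSH for admins
iptables -A INPUT -p tcp --dport 8443 -j ACCEPT   # Knox gateway (default port)
iptables -A INPUT -j DROP                         # everything else
```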
10-30-2015
06:43 PM
1 Kudo
@Mats Johansson Yes, see the link. With Alert Groups and Notifications, you can create groups of alerts and set up notification targets for each group. This way, you can notify different parties interested in certain sets of alerts via different methods. For example, you might want your Hadoop Operations team to receive all alerts via EMAIL, regardless of status, and at the same time have your System Administration team receive only Critical RPC- and CPU-related alerts via SNMP. To achieve this, you would have one Alert Notification that handles EMAIL for all alert groups at all severity levels, and a different Alert Notification that handles SNMP on Critical severity for an Alert Group containing the RPC and CPU alerts.
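If you prefer to script this instead of clicking through the Ambari web UI, the same targets can be created through Ambari's REST API. The sketch below is only illustrative: the endpoint and property keys should be verified against the Ambari API documentation for your version, and the host names, recipients, credentials, and group IDs are placeholders.

```bash
# Hypothetical sketch: create an EMAIL target that fires for all alert groups
# and all severities (verify the endpoint and property keys for your version).
curl -u admin:admin -H "X-Requested-By: ambari" -X POST \
  http://ambari.example.com:8080/api/v1/alert_targets \
  -d '{
    "AlertTarget": {
      "name": "ops_email",
      "description": "Email Hadoop Operations on every alert",
      "notification_type": "EMAIL",
      "global": true,
      "alert_states": ["OK", "WARNING", "CRITICAL", "UNKNOWN"],
      "properties": {
        "ambari.dispatch.recipients": "[\"hadoop-ops@example.com\"]",
        "mail.smtp.host": "smtp.example.com",
        "mail.smtp.port": "25",
        "mail.smtp.from": "ambari@example.com"
      }
    }
  }'
# A second, SNMP target for the scenario above would use
# "notification_type": "SNMP", "alert_states": ["CRITICAL"], and a
# "groups" list with the id of the RPC/CPU alert group instead of "global".
```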
10-30-2015
05:53 PM
What database are you using for Metastore? @Wes Floyd
10-30-2015
09:33 AM
7 Kudos
@Alex Miller Generally, it's OK to deploy ZooKeeper alongside other components (a dedicated server is not required). As you know, an odd number of ZooKeeper nodes is the best practice, and I see no issue deploying on an HA node, but I would deploy on a non-HA node too to keep the balance. See the link. Here are some common problems you can avoid by configuring ZooKeeper correctly:
Gotchas: Common Problems and Troubleshooting
- If you are using watches, you must look for the connected watch event. When a ZooKeeper client disconnects from a server, you will not receive notification of changes until reconnected. If you are watching for a znode to come into existence, you will miss the event if the znode is created and deleted while you are disconnected.
- You must test ZooKeeper server failures. The ZooKeeper service can survive failures as long as a majority of servers are active. The question to ask is: can your application handle it? In the real world a client's connection to ZooKeeper can break. (ZooKeeper server failures and network partitions are common reasons for connection loss.) The ZooKeeper client library takes care of recovering your connection and letting you know what happened, but you must make sure that you recover your state and any outstanding requests that failed. Find out if you got it right in the test lab, not in production - test with a ZooKeeper service made up of several servers and subject them to reboots.
- The list of ZooKeeper servers used by the client must match the list of ZooKeeper servers that each ZooKeeper server has. Things can work, although not optimally, if the client list is a subset of the real list of ZooKeeper servers, but not if the client lists ZooKeeper servers not in the ZooKeeper cluster.
- Be careful where you put the transaction log. The most performance-critical part of ZooKeeper is the transaction log. ZooKeeper must sync transactions to media before it returns a response. A dedicated transaction log device is key to consistently good performance. Putting the log on a busy device will adversely affect performance. If you only have one storage device, put trace files on NFS and increase the snapshotCount; it doesn't eliminate the problem, but it can mitigate it.
- Set your Java max heap size correctly. It is very important to avoid swapping. Going to disk unnecessarily will almost certainly degrade your performance unacceptably. Remember, in ZooKeeper, everything is ordered, so if one request hits the disk, all other queued requests hit the disk. To avoid swapping, try to set the heap size to the amount of physical memory you have, minus the amount needed by the OS and cache. The best way to determine an optimal heap size for your configuration is to run load tests. If for some reason you can't, be conservative in your estimates and choose a number well below the limit that would cause your machine to swap. For example, on a 4G machine, a 3G heap is a conservative estimate to start with. (See the sketch after this list for where these settings live.)
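To make the last two points concrete, here is a minimal sketch of the relevant server-side settings. The paths, host names, and heap value are placeholder assumptions for a three-node ensemble, not recommendations for any specific cluster.

```ini
# zoo.cfg (sketch - paths and hosts are placeholders)
tickTime=2000
initLimit=10
syncLimit=5
clientPort=2181
dataDir=/hadoop/zookeeper            # snapshots
dataLogDir=/hadoop/zookeeper-txlog   # transaction log on its own, quiet device
# every server and every client should use this same ensemble list
server.1=zk1.example.com:2888:3888
server.2=zk2.example.com:2888:3888
server.3=zk3.example.com:2888:3888
```

The heap cap usually goes in conf/java.env, which zkEnv.sh sources at startup; keep it below physical RAM minus what the OS and cache need, so the JVM never swaps.

```bash
# conf/java.env (sketch) - e.g. roughly a 3 GB heap on a 4 GB machine
export JVMFLAGS="-Xmx3g"
```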
10-30-2015
01:03 AM
@pardeep In 2.6.0 - no. The fixed version is in the link that you posted. My cluster (HDP 2.3.2) has not reported the issue.
10-30-2015
01:00 AM
@ayusuf@hortonworks.com This is a good explanation: the NameNode will know about a stale DataNode via dfs.namenode.stale.datanode.interval:
Default time interval for marking a DataNode as "stale", i.e., if the NameNode has not received a heartbeat message from a DataNode for more than this interval, the DataNode will be marked and treated as "stale" by default. The stale interval cannot be too small, since otherwise it may cause too-frequent changes of stale state. We thus set a minimum stale interval value (the default is 3 times the heartbeat interval) and guarantee that the stale interval cannot be less than that minimum. A stale DataNode is avoided during lease/block recovery. It can be conditionally avoided for reads (see dfs.namenode.avoid.read.stale.datanode) and for writes (see dfs.namenode.avoid.write.stale.datanode).
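For reference, these are the properties named above as they would appear in hdfs-site.xml. The values are only illustrative (30000 ms is the usual default for the stale interval); check hdfs-default.xml for your HDP version before changing anything.

```xml
<!-- hdfs-site.xml (sketch): stale-DataNode handling; values are illustrative -->
<property>
  <name>dfs.namenode.stale.datanode.interval</name>
  <value>30000</value>   <!-- ms without a heartbeat before a DN is marked stale -->
</property>
<property>
  <name>dfs.namenode.avoid.read.stale.datanode</name>
  <value>true</value>    <!-- prefer non-stale DNs when serving reads -->
</property>
<property>
  <name>dfs.namenode.avoid.write.stale.datanode</name>
  <value>true</value>    <!-- avoid stale DNs when choosing write targets -->
</property>
```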
10-30-2015
12:41 AM
Are you trying to configure a local repo? Or what is the customization in the repo file?