Member since
03-16-2016
707
Posts
1753
Kudos Received
203
Solutions
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 5119 | 09-21-2018 09:54 PM |
 | 6487 | 03-31-2018 03:59 AM |
 | 1966 | 03-31-2018 03:55 AM |
 | 2174 | 03-31-2018 03:31 AM |
 | 4805 | 03-27-2018 03:46 PM |
06-06-2016
09:17 PM
1 Kudo
@Micheal Kubbo Is /demo/data/test a correct path? Have you tried "./demo/data/test" or "demo/data/test" or the absolute path?
06-01-2016
01:47 AM
17 Kudos
Introduction

Ambari 2.2.2 provides a better understanding of cluster health and performance metrics through advanced visualizations and pre-built dashboards. It isolates critical metrics for core cluster services such as Kafka, reducing the time needed to troubleshoot problems and improving the level of service for cluster tenants. Ambari Metrics System (AMS) also has a new API to discover metrics. Ambari 2.2.2 comes with built-in Grafana integration, but only the HDFS, YARN and HBase default dashboards are included, along with System metrics. The default Kafka Grafana dashboard is targeted for Ambari 2.4. Until then, you can build a custom dashboard to see over 30 Kafka BrokerTopicMetrics using Grafana.

Tutorial

1. Install Ambari 2.2.2 or upgrade to Ambari 2.2.2. On upgrade, to enable Grafana, follow the instructions from "Upgrade to Ambari 2.2.2".
2. Run the automated install of HDP 2.4 with Ambari 2.2.2, including Kafka 0.9.0.2.4. Grafana is added automatically on a new install.
3. Explore the Ambari Kafka metrics, which can be accessed at: http://<ams-host>:6188/ws/v1/timeline/metrics/metadata/
4. Add Grafana to Ambari Metrics.
5. Access Grafana to see the out-of-the-box dashboards. Port 3000 is the default for the Grafana UI. This is view-only: the built-in Grafana does not allow creation of new dashboards.
6. A few Kafka metrics are displayed by default. To see more, click the Kafka link in the left nav, then click the big "+" sign to add a new widget for one or many metrics. In the Widget Browser window, click "Create Widget", select a widget type (say, "Graph"), click "Add Metric", choose Kafka / All Kafka Brokers, add a metric, for example kafka.server.BrokerTopicMetrics.BytesInPerSec.1MinuteRate, and select the aggregation type. Follow the on-screen instructions and save. The new widget will be added to the Kafka metrics dashboard. Repeat the steps for all Kafka BrokerTopicMetrics, as necessary. Unfortunately, you can't add a widget for a specific topic or a specific broker, at least not yet.
7. To create a custom dashboard with the desired Kafka topic and broker metrics, a separate Grafana installation is needed. This could be on your development machine. Follow the instructions to build a custom dashboard. Once complete, deploy it on the server.

Conclusion

The Ambari 2.2.2 metrics dashboard can provide reasonable insight into all broker metrics. Combined with Burrow and the promise of more Kafka metrics in the next versions of Ambari, the near future seems promising for Kafka monitoring.
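Step 3 of the tutorial can also be scripted. Below is a minimal sketch in Python, not an official client. It assumes the metadata endpoint returns JSON shaped as a mapping from application id to a list of objects that carry a "metricname" field; that shape is an assumption here, so check the actual response of your AMS collector first. The function names and the host in the usage comment are hypothetical.

```python
import json
from urllib.request import urlopen


def metadata_url(ams_host, port=6188):
    """Build the AMS metadata URL quoted in step 3 of the tutorial."""
    return f"http://{ams_host}:{port}/ws/v1/timeline/metrics/metadata/"


def fetch_metadata(ams_host, port=6188, timeout=10):
    """Download and parse the metadata document (needs network access)."""
    with urlopen(metadata_url(ams_host, port), timeout=timeout) as response:
        return json.load(response)


def metric_names(metadata, needle="BrokerTopicMetrics"):
    """Return the sorted, de-duplicated metric names containing `needle`.

    Assumed shape: {"<appId>": [{"metricname": "..."}, ...], ...}
    """
    names = set()
    for entries in metadata.values():
        for entry in entries:
            name = entry.get("metricname", "")
            if needle in name:
                names.add(name)
    return sorted(names)


# Usage (hypothetical host name):
#   for name in metric_names(fetch_metadata("ams.example.com")):
#       print(name)
```

Filtering on "BrokerTopicMetrics" gives a quick list of candidate metrics to add as widgets in step 6.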
05-31-2016
03:30 PM
2 Kudos
@sankar rao If you have not done so already, enable Ambari metrics for the services of interest and see what they show. Also, check the logs of the services that are running slow. Quite often they show errors related to JVM settings, etc.
05-31-2016
03:27 PM
1 Kudo
Sure. Post any interesting findings that may require feedback. Please vote for the response above if it was useful.
05-31-2016
01:12 PM
6 Kudos
@Ahmad Debbas As always, start by checking the most recent files under /logs for ERROR or even WARN entries, e.g.:
grep ERROR nifi-app*.log
The log files are: nifi-app*.log, nifi-bootstrap*.log, nifi-user*.log
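If you want to scan all three log families for both levels in one pass, something like the Python sketch below could work. It is only an illustration: the function name is made up, the log directory is whatever your install uses, and it assumes the level appears in each line as the plain word ERROR or WARN.

```python
import glob
import os
import re

# Matches the level as a whole word, so "WARNING" or "ERRORS" do not count.
LEVEL_PATTERN = re.compile(r"\b(ERROR|WARN)\b")

# The three log families named in the post.
DEFAULT_PATTERNS = ("nifi-app*.log", "nifi-bootstrap*.log", "nifi-user*.log")


def find_problem_lines(log_dir, patterns=DEFAULT_PATTERNS):
    """Return (file name, line number, level, text) for each ERROR/WARN line."""
    hits = []
    for pattern in patterns:
        for path in sorted(glob.glob(os.path.join(log_dir, pattern))):
            with open(path, encoding="utf-8", errors="replace") as handle:
                for line_no, line in enumerate(handle, start=1):
                    match = LEVEL_PATTERN.search(line)
                    if match:
                        hits.append(
                            (os.path.basename(path), line_no, match.group(1), line.rstrip("\n"))
                        )
    return hits


# Usage (the directory is an example, adjust to your install):
#   for name, line_no, level, text in find_problem_lines("/path/to/nifi/logs"):
#       print(f"{name}:{line_no} [{level}] {text}")
```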
05-27-2016
09:34 PM
@Vladimir Zlatkin Updated section "NUMA Optimization" to include a link to OS CPU optimizations for RHEL. "Spark application performance could be improved not only by configuring various Spark parameters and JVM options, but also by using operating system side optimizations, e.g. CPU affinity, NUMA policy, hardware performance policy etc., to take advantage of the most recent NUMA-capable hardware." The referenced section is vast. We could narrow some of the settings down to the best choices. This could be a follow-up article if part 1 attracts real interest.
05-24-2016
06:49 PM
Section 2.7. You probably meant /lib instead of /bin. That's where nar files are deployed.
05-24-2016
03:19 PM
In production, for remote access, you would have to deal with firewall issues. However, for special cases when high-severity issues need troubleshooting, Ops folks may agree to perform the needed changes. Additionally, to truly access the JVM remotely (from a different host), you need to start it with something like this: -Djava.rmi.server.hostname=<host-ip> (no spaces around the equals sign), which forces the RMI service to use the host IP instead of 127.0.0.1. By the nature of the Hadoop beast, most of the tools in the ecosystem have multiple JVMs, and some of them are volatile, started just to perform a task. Getting a lot of value out of jvisualvm could be quite difficult, but it might prove useful in some boundary scenarios.
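To make the flag handling concrete, here is a small Python sketch that assembles a typical set of JVM options for remote JMX access. Only -Djava.rmi.server.hostname comes from the post above; the com.sun.management.jmxremote.* properties are the standard JMX agent settings and are added here as an assumption about what "something like this" would include. The function name, IP and port are hypothetical.

```python
def jmx_remote_flags(host_ip, port, authenticate=True, ssl=True):
    """Return JVM options for remote JMX access as a list of strings.

    Each option is written as -Dname=value with no spaces around '=',
    otherwise the JVM would not parse it as a single property.
    """

    def as_flag(value):
        return "true" if value else "false"

    return [
        "-Dcom.sun.management.jmxremote",
        f"-Dcom.sun.management.jmxremote.port={port}",
        f"-Dcom.sun.management.jmxremote.authenticate={as_flag(authenticate)}",
        f"-Dcom.sun.management.jmxremote.ssl={as_flag(ssl)}",
        # From the post: make RMI advertise the host IP instead of 127.0.0.1.
        f"-Djava.rmi.server.hostname={host_ip}",
    ]


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(" ".join(jmx_remote_flags("10.0.0.5", 9010)))
```

Turning authentication or SSL off makes the JVM reachable by anyone who can get to the port, so that is only reasonable for a short troubleshooting window agreed with Ops.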
05-21-2016
06:17 PM
@Dominika B Let me know if the above instructions helped. If so, please vote for the response.
05-21-2016
01:33 AM
13 Kudos
Modern CPU and NUMA

Processor clock speed has increased dramatically in recent years. However, the CPU needs to be supplied with a large amount of memory bandwidth to use its processing power effectively. Even a single CPU running a memory-intensive workload, such as a scientific computing application, can be constrained by memory bandwidth. This problem is amplified on symmetric multiprocessing (SMP) systems, where many processors must compete for bandwidth on the same system bus.

Non-uniform memory access (NUMA) is a computer memory design used in multiprocessing, where the memory access time depends on the memory location relative to the processor. NUMA systems have more than one system bus and can harness large numbers of processors in a single system, linking several small, cost-effective nodes using a high-performance connection. Each node contains processors and memory, much like a small SMP system. However, an advanced memory controller allows a node to use memory on all other nodes, creating a single system image. When a processor accesses memory that does not lie within its own node (remote memory), the data must be transferred over the NUMA connection, which is slower than accessing local memory. As the technology's name implies, memory access times are not uniform and depend on the location of the memory and the node from which it is accessed.

Memory Access Overhead
Spark relies heavily on in-memory processing, so CPU time spent stalled on distant memory accesses is wasted.

NUMA Optimization

Spark application performance could be improved not only by configuring various Spark parameters and JVM options, but also by using operating system side optimizations, e.g. CPU affinity, NUMA policy, hardware performance policy etc., to take advantage of the most recent NUMA-capable hardware.

Spark launches executor JVMs with many task worker threads in a system. The operating system then tries to schedule these worker threads across multiple cores. However, the scheduler does not always bind worker threads to the same NUMA node, so several worker threads are often scheduled to distant cores on remote NUMA nodes. Once a thread moves to a core on another NUMA node, it incurs an overhead to access data in remote memory. Since Spark's compute-intensive workloads, such as machine learning, repeatedly compute over the same RDD dataset in memory, the total remote memory access overhead is not negligible.

Setting NUMA-aware affinity for tasks is one well-known approach. This means applying NUMA-aware process affinity to the executor JVMs on the same system and comparing the computation performance across multiple Spark workloads. Setting NUMA-aware locality for executor JVMs, by enabling core bindings when the executor JVMs are launched, achieves better performance in many Spark applications.
Simultaneous multi-threading and hardware prefetching are effective ways to hide data access latencies, and the additional latency overhead due to accesses to remote memory can be removed by co-locating computations with the data they access on the same socket.

For example, if you start Spark with executor-cores = 32 (8 virtual cores x 4 sockets), the wasteful stall is still there, because the threads of a single executor are spread across all sockets. A good trick is to start 4 workers per machine instead, each with executor-cores = 8, that is, a separate JVM for each NUMA node, with the JVMs talking to each other using Akka. Then you could pin these executors to the nodes. This setup will incur more communication overhead, but will likely be a good trade-off. Spark tries to minimize communication between executors, since they are on different machines in the typical case.

Performance Gain

How much performance gain is achievable by co-locating the data and computations on NUMA nodes for in-memory data analytics with Spark? The following benchmark tests show a 10-15% performance increase:
http://domino.watson.ibm.com/library/CyberDig.nsf/papers/9AF3F3F4DE3E84D785257EE300249572/$File/RT0968.pdf
https://arxiv.org/pdf/1604.08484.pdf
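
The "4 workers per machine, one per NUMA node" layout described under NUMA Optimization can be sketched as follows. This is a Python illustration under stated assumptions, not a tested deployment recipe: it assumes the numactl utility is installed and that node ids run from 0 to 3, and the worker command in the usage part is a hypothetical placeholder for however you actually start a Spark worker.

```python
import shlex


def pinned_worker_commands(worker_cmd, numa_nodes=4):
    """Build one command line per NUMA node, each wrapped in numactl.

    --cpunodebind keeps the JVM's threads on the cores of one node and
    --membind keeps its allocations in that node's local memory.
    """
    commands = []
    for node in range(numa_nodes):
        commands.append(
            ["numactl", f"--cpunodebind={node}", f"--membind={node}", *worker_cmd]
        )
    return commands


if __name__ == "__main__":
    # Hypothetical launcher script and flag, standing in for your own
    # worker start command with 8 cores per worker.
    worker = ["./start-worker.sh", "--cores", "8"]
    for command in pinned_worker_commands(worker, numa_nodes=4):
        print(shlex.join(command))
```

The sketch only prints the command lines, so you can review them before running anything on a real node.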