Member since: 11-07-2016
Posts: 58
Kudos Received: 26
Solutions: 6
My Accepted Solutions
Views | Posted
---|---
1572 | 05-17-2017 04:57 PM
4572 | 03-17-2017 06:51 PM
2087 | 01-14-2017 07:03 PM
3267 | 01-14-2017 06:59 PM
1734 | 12-29-2016 06:45 PM
01-02-2018
09:27 PM
1 Kudo
For certain large environments, it's very easy for the Spark History Server (SHS) to get overwhelmed by the large number of applications being posted and the number of users / developers viewing history data. Spark jobs create an artifact called the history file, which is what the SHS parses and serves via its UI. The size of this file has a huge impact on the load placed on the SHS; also note that the size of the history file is driven by the number of events the application generates (a small executor heartbeat interval, for example, produces many more events).

Workaround: If you are still interested in analyzing performance issues with these large history files, one option is to download them and browse them from a locally hosted SHS instance. To run this:

1. Download Spark 1.6 from https://spark.apache.org/downloads.html
2. Unpack it.
3. Create a directory to hold the logs called spark-logs.
4. Create a properties file called test.properties.
5. Inside test.properties add: spark.history.fs.logDirectory=<path to the spark-logs directory>
6. Run <spark download>/sbin/start-history-server.sh --properties-file <path to test.properties>
7. Open a web browser and visit http://localhost:18080

Once done, you can download Spark History files from HDFS and copy them into the spark-logs directory. The running Spark History Server will dynamically load the files as they become available there. A rough end-to-end sketch of these steps follows below.
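The following is only a sketch of the steps above; the exact Spark 1.6.x package name, the local paths (~/spark-logs, ~/test.properties), and the HDFS source path are examples and depend on your environment (the HDFS location is whatever spark.history.fs.logDirectory points to on your cluster).

# Unpack the Spark 1.6.x download and prepare a local log directory
tar xzf spark-1.6.3-bin-hadoop2.6.tgz
mkdir -p ~/spark-logs
# Point a standalone SHS at the local directory
echo "spark.history.fs.logDirectory=file:///home/$USER/spark-logs" > ~/test.properties
./spark-1.6.3-bin-hadoop2.6/sbin/start-history-server.sh --properties-file ~/test.properties
# Copy history files out of HDFS into the watched directory, then browse http://localhost:18080
hdfs dfs -copyToLocal /spark-history/<application_id> ~/spark-logs/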
11-01-2017
12:15 AM
Abstract:
Nimbus metrics are critical to operations as well as development teams for monitoring the performance and stability of Storm applications / topologies. Most production environments already have a metrics / operations monitoring system such as Solr, Elasticsearch, a TSDB, etc. This post shows you how you can use collectd to forward these metrics over to your desired metrics environment and alert on them.
Solution:
Collectd is a standard metrics collection tool that runs natively on Linux operating systems. It's capable of capturing a wide variety of metrics; you can find more information on collectd here: https://collectd.org/
To capture Storm Nimbus metrics, here's a collectd plugin that needs to be compiled and built with Maven: https://github.com/srotya/storm-collectd. Simply run:
mvn clean package assembly:single
In addition, you will need to install collectd and ensure that it has Java plugin capability. Here's a great post on how to do that: http://blog.asquareb.com/blog/2014/06/09/enabling-java-plugin-for-collectd/ (note that the JAR="/path/to/jar" and JAVAC="/path/to/javac" variables need to be set to valid paths before you can run it). A rough build sketch follows below.
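Here's a rough sketch, assuming you are compiling collectd from source with the Java plugin enabled; the install prefix and JDK location below are illustrative, and the JAR / JAVAC details for your particular collectd version are covered in the blog post above.

# Build collectd with Java plugin support (adjust JAVA_HOME for your system)
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk
./configure --prefix=/opt/collectd --with-java="$JAVA_HOME"
make && sudo make install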
Once installed, you will need to configure collectd using the following (don't forget to also configure an output plugin so the metrics actually get forwarded somewhere):
LoadPlugin java
<Plugin "java">
  # The required JVM argument is the classpath
  # JVMArg "-Djava.class.path=/installpath/collectd/share/collectd/java"
  # Since version 4.8.4 (commit c983405) the API and GenericJMX plugin are
  # provided as .jar files.
  JVMArg "-Djava.class.path=<ABSOLUTE PATH>/lib/collectd-api.jar:<ABSOLUTE PATH>/target/storm-collectd-0.0.1-SNAPSHOT-jar-with-dependencies.jar"
  LoadPlugin "com.srotya.collectd.storm.StormNimbusMetrics"
  <Plugin "storm">
    address "http://localhost:8084/"
    kerberos false
    jaas "<PATH TO JAAS CONF>"
  </Plugin>
</Plugin>
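After saving the configuration, restart collectd and watch its logs to confirm the Java plugin starts and the StormNimbusMetrics class loads; the service name and log location below are typical but vary by distribution.

# Restart collectd and follow its log output
sudo systemctl restart collectd      # or: sudo service collectd restart
sudo journalctl -u collectd -f       # look for JVM classpath or class-loading errors

If the classpath in the JVMArg line is wrong, collectd will generally log the failure at startup rather than failing silently, so the log is the first place to look.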
10-31-2017
11:59 PM
2 Kudos
Problem: If you have an AD/LDAP environment and are using HDP with Ranger, it's critical to review the case in which usernames and group IDs are stored in your directory services environment. Ranger authorization is case sensitive, so if the username / group ID doesn't match the one returned from the directory (AD/LDAP), authorization will be denied.

Solution: To solve this problem, Ranger offers two parameters that can be set via Ambari. This should ideally be done at install time to avoid the need to re-sync all users. The Ranger usersync properties for case conversion are:

ranger.usersync.ldap.username.caseconversion
ranger.usersync.ldap.groupname.caseconversion

You can set these properties to lower or upper; this makes sure that Ranger stores usernames and groups in the specified format in its local database, so when users log in their authorization lookups will match correctly.
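As an example, here is a minimal sketch of the two usersync properties set for a directory that should be normalized to lowercase (set them through the Ranger usersync configuration in Ambari; lower and upper are the supported values):

ranger.usersync.ldap.username.caseconversion=lower
ranger.usersync.ldap.groupname.caseconversion=lower

Since the conversion only applies as users and groups are synced, setting these before the first sync (ideally at install time, as noted above) avoids having to re-sync everything later.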
03-29-2017
03:07 AM
@Ambud Sharma we are testing this change and will accept once we are done. I am still not 100% convinced that this solves the problem, since the Storm documentation says BasicBolt does the acking and anchoring: http://storm.apache.org/releases/1.0.1/Guaranteeing-message-processing.html. Search for BasicBolt in that link and you will find "Storm has an interface called BasicBolt that encapsulates this pattern for you."
01-23-2017
01:36 AM
1 Kudo
Good write-up from @Ambud Sharma; you can also visit http://storm.apache.org/releases/1.0.2/Guaranteeing-message-processing.html for info straight from the source. Additionally, take a peek at the picture below, which I just exported from our http://hortonworks.com/training/class/hdp-developer-storm-and-trident-fundamentals/ course; it might help visualize all of this information. Good luck and happy Storming!
10-22-2018
07:16 AM
Hi @Ambud Sharma, I am new to HCP and Storm, and I am running through the squid use case. From the Kafka console consumer I can see that Kafka is receiving data, and from the Metron UI my squid sensor is running, but throughput is 0 KB/s. From the Storm UI the squid topology is active with 1 worker and 5 executors, but data is not coming into the topology. In the logs under storm/workers-artifacts, worker.log.err is empty, and worker.log stops at the point shown in the image below.
11-24-2016
04:00 AM
@ambud.sharma Voted up :). Before, it was counter-intuitive.
11-14-2016
09:55 PM
So the repo needs to be on shared storage (like NFS) between the NiFi nodes?