This article provides step-by-step instructions for installing jVisualVM to monitor JVMs inside your Hadoop environment. Consult your operations team before making any production changes.

jVisualVM is a tool for visually monitoring JVMs in Hadoop. HBase is a great example of a service whose JVM health you want to analyze visually. You may already have the tool on your workstation: simply go to the command line and type jvisualvm. If it comes up, you're in business. Otherwise, download it:

4385-2016-05-20-12-24-16.jpg
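The "is it already installed?" check can also be scripted. This is a minimal sketch, assuming a POSIX shell. The helper name have_tool is hypothetical, and the launcher is assumed to be on PATH under the lowercase name jvisualvm, as in JDKs that bundle the tool:

```shell
# have_tool is a hypothetical helper: succeeds if the named command is on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

if have_tool jvisualvm; then
    echo "jvisualvm found: $(command -v jvisualvm)"
else
    echo "jvisualvm not found on PATH; download it first"
fi
```

The same helper works for the other JDK tools used in this article, for example have_tool jstatd on the node you want to monitor.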

Once the application is up and running, install the Visual GC plugin: Tools -> Plugins -> Available Plugins -> Visual GC.

Next, go to the node you want to monitor. To see the JVMs on that node, jVisualVM needs jstatd running there. Once jstatd is running, it automatically provides jVisualVM with updates about the Java applications running on the node.

Run jstatd from the command line.

If you get an error like the following (typically a java.security.AccessControlException reporting that access was denied):

4386-2016-05-20-12-31-22.jpg

you need one additional, simple configuration step. Do the following:

Run which jstatd, then cd into the bin directory it reports:

4387-2016-05-20-12-34-17.jpg

Create a policy file named jstatd.all.policy in the bin directory (the same location as jstatd). This policy grants jstatd permission to everything.

Inside the file add the following:

grant codebase "file:${java.home}/../lib/tools.jar" {
    permission java.security.AllPermission;
};
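The policy file can also be created straight from the command line. This is a sketch, assuming a POSIX shell and that you are already in the directory where the file should live. The one detail worth noting is the quoted heredoc delimiter: without the quotes, the shell would try to expand ${java.home} itself, whereas that placeholder is meant to be resolved by the JVM when it reads the policy file.

```shell
# Quote the heredoc delimiter ('EOF') so the shell leaves ${java.home} untouched;
# the JVM's policy parser expands it, not the shell.
cat > jstatd.all.policy <<'EOF'
grant codebase "file:${java.home}/../lib/tools.jar" {
    permission java.security.AllPermission;
};
EOF

# Show the result to confirm the placeholder survived literally.
cat jstatd.all.policy
```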

Now run the following command to start jstatd with your policy:

./jstatd -J-Djava.security.policy=jstatd.all.policy &

4388-2016-05-20-12-39-44.jpg

This ends the steps for the jstatd error.

Test if jstatd is running by issuing this command:

jps -l -m -v rmi://localhost

You will see a lot of output, such as:

4390-2016-05-20-12-42-53.jpg

If you see output like this, jstatd is working.

Open jVisualVM (type jvisualvm at the command line).

Right-click Remote and select Add Remote Host:
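Because jstatd is started in the background, it may take a moment before it answers. The jps check can be wrapped in a small retry loop. This is a sketch only: retry is a hypothetical helper, and it assumes jps exits with a non-zero status when it cannot reach jstatd.

```shell
# retry is a hypothetical helper: run a command up to $1 times,
# sleeping $2 seconds between attempts; succeed as soon as the command does.
retry() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        # Only pause if another attempt is still to come.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}

# Example (not run here): give jstatd up to 10 attempts, 1 second apart.
# retry 10 1 jps -l rmi://localhost && echo "jstatd is reachable"
```

The helper discards the command's output, so run the plain jps command afterwards if you want to see the JVM list itself.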

4401-2016-05-20-12-44-03.jpg

Enter the hostname or IP address of the node you want to analyze:

4403-2016-05-20-12-46-08.jpg

Now you are connected. Click the arrow next to your remote host to display all JVMs running on the node:

4404-2016-05-20-12-48-34.jpg

Now you can start analyzing. Let's click the HBase data node and then the Monitor tab:

4405-2016-05-20-12-51-24.jpg

Click the Visual GC tab to analyze the different generations inside the JVM.

I hope this article helps you start analyzing JVMs on Hadoop at a deeper level, especially with HBase.


Comments
Super Guru

In production, remote access means dealing with firewall issues. However, in special cases where a high-severity issue needs troubleshooting, Ops folks may agree to make the needed changes. Additionally, you need to start the JVM with something like this in order to be able to truly access the JVM remotely (from a different host): -Djava.rmi.server.hostname=<host ip>, which forces the RMI service to use the host IP instead of 127.0.0.1. By the nature of the Hadoop beast, most tools in the ecosystem run multiple JVMs, and some of those are short-lived, started just to perform a task. Getting a lot of value out of jvisualvm could be quite difficult, but it might prove useful in some boundary scenarios.

Master Guru

@Constantin Stanca This is standard for HBase development. If you can't see what your JVM is doing, you're driving blind. Tuning the memstore flushes and the block cache is vital for performance, and so is testing G1 vs. CMS garbage collection on the NameNode or HBase. For remote access in production, yes, you always need clearance. Monitoring the JVM during development is highly useful for the NameNode and HBase.

Master Guru

@Constantin Stanca What do you mean by "Additionally, you need to start the JVM with something like this in order to be able to truly access the JVM remotely"? The JVMs start as they normally do when you use this tool.