Understanding Zeppelin interpreter architecture

Hi,

Just after a restart of the Zeppelin Server I ran a simple %spark note. At the time I was the only user on the Zeppelin Server.

When I look at the footprint of this interaction in the local process list, I see 3 additional processes (the one on top, 12268, is the Zeppelin server itself):

#> ps -u zeppelin -f --forest

UID PID PPID C STIME TTY TIME CMD
zeppelin 12268 1 0 11:33 ? 00:00:35 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.111-1.b15.el7_2.x86_64/bin/java -Dhdp.version=2.6.2.0-205 -Dspark.executor.memor (cut off ...)
zeppelin 26818 12268 0 12:53 ? 00:00:00 \_ /bin/bash /usr/hdp/current/zeppelin-server/bin/interpreter.sh -d /usr/hdp/current/zeppelin-server/interpreter/spar (cut off ...)
zeppelin 26830 26818 0 12:53 ? 00:00:00 \_ /bin/bash /usr/hdp/current/zeppelin-server/bin/interpreter.sh -d /usr/hdp/current/zeppelin-server/interpreter/ (cut off ...)
zeppelin 26831 26830 6 12:53 ? 00:01:09 \_ /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.111-1.b15.el7_2.x86_64/bin/java -Dhdp.version=2.6.2.0-205 -cp /etc/z (cut off ...)

Although it is not very clear in the output above, the 4 processes form a parent > child chain, as the following shows:

#> ps -u zeppelin --forest

PID TTY TIME CMD
12268 ? 00:00:35 java
26818 ? 00:00:00 \_ interpreter.sh
26830 ? 00:00:00   \_ interpreter.sh
26831 ? 00:01:09     \_ java

Pid 12268 is the Zeppelin server itself. The last pid, 26831, is the local Spark instance launched on YARN.

My actual question is about pids 26818 and 26830, which seem to be identical:

#> ps aux | grep 26818

zeppelin 26818 0.0 0.0 113128 1568 ? S 12:53 0:00 /bin/bash /usr/hdp/current/zeppelin-server/bin/interpreter.sh -d /usr/hdp/current/zeppelin-server/interpreter/spark -p 33433 -l /usr/hdp/current/zeppelin-server/local-repo/2CKX8WPU1 -g spark


#> ps aux | grep 26830

zeppelin 26830 0.0 0.0 113124 636 ? S 12:53 0:00 /bin/bash /usr/hdp/current/zeppelin-server/bin/interpreter.sh -d /usr/hdp/current/zeppelin-server/interpreter/spark -p 33433 -l /usr/hdp/current/zeppelin-server/local-repo/2CKX8WPU1 -g spark
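Because `ps aux` does not print the parent PID, the two entries above look the same apart from the PID and memory columns. For anyone reproducing this, here is a minimal sketch for telling them apart, assuming a POSIX-style `ps` that accepts `-o` and `-p`. The helper name `ancestors` is made up for illustration and is not part of Zeppelin. It walks from a given PID up the process tree and prints PID, PPID and command line at each step.

```shell
# Print the ancestor chain of a PID, one "pid ppid command" line per process.
# Hypothetical helper for illustration; not part of Zeppelin.
ancestors() {
  pid=$1
  # Stop once the parent PID is 0 (reached above pid 1) or lookup fails.
  while [ "${pid:-0}" -gt 0 ]; do
    # Show the current process; give up if it no longer exists.
    ps -o pid= -o ppid= -o args= -p "$pid" || break
    # Move one level up: the next PID to inspect is the parent.
    pid=$(ps -o ppid= -p "$pid" | tr -d ' ')
  done
}
```

Going by the PPID column in the first listing, `ancestors 26831` on this host would be expected to step through 26830, 26818 and 12268 before reaching pid 1.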

Pid 26830 corresponds to the pid registered in /var/run/zeppelin/zeppelin-interpreter-spark-zeppelin-<hostname>.pid. So I understand that an interpreter instance is launched when the %spark note is run from the UI.
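To check which process a pid file points at, something like the following sketch could be used. It assumes the file contains a single PID and that `ps` accepts `-o` and `-p`; the function name `check_pidfile` is hypothetical, and the `<hostname>` part of the real file name has to be filled in by hand.

```shell
# Report whether the PID stored in a pid file belongs to a running process.
# Hypothetical helper for illustration; not part of Zeppelin.
# Usage: check_pidfile /path/to/file.pid
check_pidfile() {
  file=$1
  [ -r "$file" ] || { echo "cannot read $file" >&2; return 2; }
  # Strip whitespace and newlines so only the number remains.
  pid=$(tr -d '[:space:]' < "$file")
  # Print pid, parent pid and command line if the process exists.
  if ps -o pid= -o ppid= -o args= -p "$pid"; then
    return 0
  fi
  echo "no running process with pid $pid" >&2
  return 1
}
```

The PPID in the printed line shows which of the two interpreter.sh entries is the parent of the registered one.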

But what is the meaning and function of the identical, intermediate pid 26818?

Is it something specific to my environment, or is this by design?
