Created 01-27-2019 04:16 PM
Hello Everyone,
I have my application configured to use the YARN Resource Manager, the Hadoop NameNode, and Samza containers. All the data is written under /var/lib (/var is a mounted drive). I want to validate my application against a disk-latency fault. To do that, I want to mount /var/lib as a FUSE file system, which will let me inject delays.
Here is what I have done to mount the FUSE file system (mounting it caused no issues).
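The post doesn't show the exact mount commands, but a latency-injecting FUSE passthrough of the kind described can be sketched with fusepy. The class name, delay value, and paths below are illustrative assumptions, not the poster's actual setup:

```python
import os
import sys
import time

try:
    from fuse import FUSE, Operations  # fusepy: pip install fusepy
except ImportError:                    # allow the class to load (and its methods
    FUSE = None                        # to be called directly) without fusepy
    class Operations:
        pass

class DelayFS(Operations):
    """Passthrough file system that sleeps before every read/write
    to simulate disk latency (hypothetical sketch)."""

    def __init__(self, root, delay=0.05):
        self.root = os.path.realpath(root)
        self.delay = delay

    def _full(self, path):
        return os.path.join(self.root, path.lstrip('/'))

    def getattr(self, path, fh=None):
        st = os.lstat(self._full(path))
        keys = ('st_mode', 'st_size', 'st_uid', 'st_gid',
                'st_atime', 'st_mtime', 'st_ctime', 'st_nlink')
        return {k: getattr(st, k) for k in keys}

    def readdir(self, path, fh):
        return ['.', '..'] + os.listdir(self._full(path))

    def open(self, path, flags):
        return os.open(self._full(path), flags)

    def read(self, path, size, offset, fh):
        time.sleep(self.delay)          # injected latency
        os.lseek(fh, offset, os.SEEK_SET)
        return os.read(fh, size)

    def write(self, path, data, offset, fh):
        time.sleep(self.delay)          # injected latency
        os.lseek(fh, offset, os.SEEK_SET)
        return os.write(fh, data)

    def release(self, path, fh):
        return os.close(fh)

if __name__ == '__main__' and FUSE and len(sys.argv) == 3:
    # e.g.: python delayfs.py /var/myapp/lib /var/lib
    FUSE(DelayFS(sys.argv[1]), sys.argv[2], foreground=True, allow_other=True)
```

This only implements the handful of operations needed for the sketch; a real passthrough would also forward mkdir, unlink, truncate, and the other calls.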
After the above steps, I started all the services. Everything came up successfully except the Node Manager and the Samza jobs. My observation is that the Node Manager crashes (it starts and stops immediately). The Samza jobs are expected to fail because the Node Manager fails.
Question:
What's the issue here?
Following is the message from logs:
# Problematic frame:
# C[libc.so.6+0x15556b]
#
+ date +%h %d %Y %H:%M:%S-%3N
+ echo [Jan 27 2019 06:53:18-872] Exiting yarn node manager...
INFO org.apache.hadoop.yarn.server.nodemanager.NodeManager: registered UNIX signal handlers for [TERM, HUP, INT]
INFO org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService: Using state database at /var/lib/hadoop-yarn/yarn-nm-recovery/yarn-nm-state for recovery
INFO org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService$LeveldbLogger: Recovering log #36038
Created 01-27-2019 09:29 PM
Did you update the entries in /etc/fstab?
Created 01-28-2019 04:18 AM
Thanks @Geoffrey Shelton Okot for the response,
I did NOT update /etc/fstab because I didn't want the mount point to persist across reboots. Is there any other reason why I should update /etc/fstab?
Thanks,
DB
Created 01-28-2019 06:28 AM
Then you will need to mount it manually for the session ONLY, for example with a FUSE mount such as sshfs:
sshfs username@hostname:/remote/directory/path /local/mount/point
If you don't mount it, how will the OS know about it?
Created 01-28-2019 08:10 PM
As I pointed out in my first post, I have already mounted /var/lib as a FUSE mount. I can see /var/lib listed as a FUSE mount in the output of "df -lh", and "ls /var/lib" also works after the mount.
I don't think the mount itself is the issue here, because the YARN Resource Manager service runs successfully after the mount.
Please note: /var/lib is the parent directory under which the YARN RM, NM, Samza, and my applications all write. Since I mounted /var/lib as the FUSE mount, all the other applications run with no issues.
The procedure I followed before mounting /var/lib as a FUSE mount point: I created a directory "/var/myapp/lib/", stopped all the processes (YARN RM, Samza jobs, Node Manager), copied all the files from /var/lib to "/var/myapp/lib/" (cp -rfp, to retain file ownership and permissions as-is), and then mounted /var/lib pointing to the data directory "/var/myapp/lib/".
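The copy step above (cp -rfp, i.e. a recursive copy that preserves permissions and timestamps) can be sketched in Python. The function name is hypothetical and the paths are the ones named in the post; note that preserving ownership, as cp -p does, additionally requires running as root:

```python
import shutil

def migrate(src: str, dst: str) -> None:
    """Copy the data tree from src to dst, preserving file modes and
    timestamps -- roughly the equivalent of `cp -rfp src/. dst/`.
    shutil.copytree uses copy2 by default, which keeps mode and mtime;
    ownership (uid/gid) is only retained by cp -p when run as root."""
    shutil.copytree(src, dst, symlinks=True, dirs_exist_ok=True)

# Usage (paths from the post; run with all services stopped, then
# mount /var/lib as the FUSE passthrough over /var/myapp/lib):
#   migrate('/var/lib', '/var/myapp/lib')
```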
After that operation, all the processes start with no issues except the Node Manager and the Samza jobs.
The errors observed in the logs are quoted in my first post in this question.
I want to understand whether the procedure I'm following is correct, since the Node Manager process is crashing while in the middle of "Recovering log". Am I missing any steps?