Created on 03-07-2017 09:20 PM - edited 08-19-2019 12:55 AM
I'm running Storm on a Kerberized HDP 2.5.3 cluster and am able to run "storm list" to see that no topologies are running; the Storm UI reports that my two Nimbus servers are up, as well as my five supervisors. But I get the following error output when I try to run the simple WordCount topology described in Step 4 of http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.3/bk_command-line-installation/content/validat....
[student2@ip-172-xxx-xxx-42 ~]$ storm jar /usr/hdp/current/storm-client/contrib/storm-starter/storm-starter-topologies-*.jar org.apache.storm.starter.WordCountTopology wordcount
OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x000000076e980000, 85983232, 0) failed; error='Cannot allocate memory' (errno=12)
OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x000000076e980000, 85983232, 0) failed; error='Cannot allocate memory' (errno=12)
OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x000000076e980000, 85983232, 0) failed; error='Cannot allocate memory' (errno=12)
Running: /usr/java/default/bin/java -server -Ddaemon.name= -Dstorm.options= -Dstorm.home=/usr/hdp/2.5.3.0-37/storm -Dstorm.log.dir= -Djava.library.path= -Dstorm.conf.file= -cp /usr/hdp/2.5.3.0-37/storm/lib/asm-5.0.3.jar:RM'D_LONG_LIST_OF_JARS_FOR_READABILITY org.apache.storm.daemon.ClientJarTransformerRunner /usr/hdp/current/storm-client/contrib/storm-starter/storm-starter-topologies-1.0.1.2.5.3.0-37.jar /tmp/40f0348a037911e7b88f02ad93562fff.jar
OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x000000076e980000, 85983232, 0) failed; error='Cannot allocate memory' (errno=12)
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (mmap) failed to map 85983232 bytes for committing reserved memory.
# An error report file with more information is saved as:
# /home/LAB.HORTONWORKS.NET/student2/hs_err_pid11132.log
OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x000000076e980000, 85983232, 0) failed; error='Cannot allocate memory' (errno=12)
OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x000000076e980000, 85983232, 0) failed; error='Cannot allocate memory' (errno=12)
Running: /usr/java/default/bin/java -client -Ddaemon.name= -Dstorm.options= -Dstorm.home=/usr/hdp/2.5.3.0-37/storm -Dstorm.log.dir= -Djava.library.path= -Dstorm.conf.file= -cp /usr/hdp/2.5.3.0-37/storm/lib/asm-5.0.3.jar:RM'D_LONG_LIST_OF_JARS_FOR_READABILITY -Dstorm.jar=/tmp/40f0348a037911e7b88f02ad93562fff.jar org.apache.storm.starter.WordCountTopology wordcount
OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x000000076e980000, 85983232, 0) failed; error='Cannot allocate memory' (errno=12)
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (mmap) failed to map 85983232 bytes for committing reserved memory.
# An error report file with more information is saved as:
# /home/LAB.HORTONWORKS.NET/student2/hs_err_pid11138.log
Traceback (most recent call last):
  File "/usr/hdp/2.5.3.0-37/storm/bin/storm.py", line 774, in <module>
    main()
  File "/usr/hdp/2.5.3.0-37/storm/bin/storm.py", line 771, in main
    (COMMANDS.get(COMMAND, unknown_command))(*ARGS)
  File "/usr/hdp/2.5.3.0-37/storm/bin/storm.py", line 248, in jar
    os.remove(tmpjar)
OSError: [Errno 2] No such file or directory: '/tmp/40f0348a037911e7b88f02ad93562fff.jar'
[student2@ip-172-xxx-xxx-42 ~]$
The referenced error report file is attached as hs-err-pid11132log.txt (I added a .txt extension so HCC would allow it as an attachment).
I tried to increase the memory footprint of Nimbus from 1GB to 2GB and of the Supervisors from 768MB to 2GB with the changes highlighted below, but I get the same error before and after this change.
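For reference, those changes correspond roughly to the storm.yaml entries below (a sketch; nimbus.childopts and supervisor.childopts are the standard Storm keys, but the actual HDP values carry additional flags beyond -Xmx):

# Confirm what the daemons actually picked up (path per the HDP layout):
grep -E "(nimbus|supervisor)\.childopts" /etc/storm/conf/storm.yaml

# Expected entries after the change, roughly:
#   nimbus.childopts: "-Xmx2048m ..."
#   supervisor.childopts: "-Xmx2048m ..."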
Did I make the right changes to give these services more memory?
Is there something else I should be trying?
Created 03-08-2017 04:46 AM
You are not getting an "OutOfMemory in Java Heap" error; it is actually a native out-of-memory:
Out of Memory Error (os_linux.cpp:2638)
.
So increasing Xmx (heap memory) will make the issue worse here. I suggest you reduce the heap (Xmx) instead. Also, please check the output of `free -m` before starting the Storm process, to find out whether the OS really has enough memory to allocate and whether swap was already being used:
free -m
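To read that output (a sketch, assuming the RHEL 6-era procps layout): the "-/+ buffers/cache" free value is what the kernel can really still hand out, and a non-zero "used" figure on the Swap row means the box is already swapping. Cross-checking the kernel's own commit accounting can also help:

# All four fields are standard /proc/meminfo counters maintained by the kernel.
grep -E 'MemFree|SwapFree|CommitLimit|Committed_AS' /proc/meminfo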
.
Created 03-08-2017 03:51 PM
Thanks for all the great responses here and below. Yes, indeed, the worker nodes I have all of this running on are overloaded, taking into account the distribution of services and each box's underlying resource footprint. Thanks again!
Created 03-08-2017 04:56 AM
If the Storm process is crashing even though you have enough memory (swap/free) available, then you should also check "/proc/sys/vm/overcommit_memory".
- This switch has 3 different settings:
=> 0: The Linux kernel is free to overcommit memory (this is the default); a heuristic algorithm is applied to figure out whether enough memory is available.
=> 1: The Linux kernel will always overcommit memory and never check whether enough memory is available. This increases the risk of out-of-memory situations, but also improves memory-intensive workloads.
=> 2: The Linux kernel will not overcommit memory, and will only allocate as much memory as defined in overcommit_ratio.
The OS sometimes kills or crashes a process because of this setting; in one such case the memory overcommit setting was 2 when it should have been set to 0 - https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Performance_Tuning_Gui...
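A minimal sketch of how to inspect and change the setting (needs root; 0 restores the default heuristic):

# Current policy: 0 = heuristic overcommit, 1 = always overcommit, 2 = strict
cat /proc/sys/vm/overcommit_memory

# Set it back to the heuristic default for the running kernel
sysctl -w vm.overcommit_memory=0

# Persist the setting across reboots
echo "vm.overcommit_memory = 0" >> /etc/sysctl.conf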
Created 03-08-2017 05:08 AM
Also, as per the attached "hs_err_pid" file, we can see that you have only 161 MB of memory left out of 16 GB:
Memory: 4k page, physical 16004820k(161604k free), swap 0k(0k free)
.
So either increase the RAM, or stop unwanted components/processes that are consuming memory on the OS.
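For instance, to spot the biggest memory consumers (standard procps options):

# Processes sorted by resident memory usage: header row plus the top 10
ps aux --sort=-%mem | head -n 11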