
What are the prerequisites to increase the heap size of a DataNode?

New Contributor

Hi Team,

I got the error below in the GC log.

Full GC (Allocation Failure CMS-concurrent-mark-start CMS-concurrent-abortable-preclean-start)

The current heap utilization is shown below.

Heap Configuration:
   MinHeapFreeRatio         = 40
   MaxHeapFreeRatio         = 70
   MaxHeapSize              = 12884901888 (12288.0MB)
   NewSize                  = 1610612736 (1536.0MB)
   MaxNewSize               = 1610612736 (1536.0MB)
   OldSize                  = 11274289152 (10752.0MB)
   NewRatio                 = 2
   SurvivorRatio            = 8
   MetaspaceSize            = 21807104 (20.796875MB)
   CompressedClassSpaceSize = 1073741824 (1024.0MB)
   MaxMetaspaceSize         = 17592186044415 MB
   G1HeapRegionSize         = 0 (0.0MB)

Heap Usage:
New Generation (Eden + 1 Survivor Space):
   capacity = 1449590784 (1382.4375MB)
   used     = 1449590784 (1382.4375MB)
   free     = 0 (0.0MB)
   100.0% used
Eden Space (new object space):
   capacity = 1288568832 (1228.875MB)
   used     = 1288568832 (1228.875MB)
   free     = 0 (0.0MB)
   100.0% used
From Space:
   capacity = 161021952 (153.5625MB)
   used     = 161021952 (153.5625MB)
   free     = 0 (0.0MB)
   100.0% used
To Space:
   capacity = 161021952 (153.5625MB)
   used     = 0 (0.0MB)
   free     = 161021952 (153.5625MB)
   0.0% used
concurrent mark-sweep generation:
   capacity = 11274289152 (10752.0MB)
   used     = 11274289152 (10752.0MB)
   free     = 0 (0.0MB)
   100.0% used
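(For reference, a heap summary like the one above can be captured from the running DataNode with the JDK's jmap tool; the PID lookup below is just one common way to find the process and is an assumption about your environment.)

    # Run as the user that owns the DataNode process (often hdfs)
    DN_PID=$(pgrep -f org.apache.hadoop.hdfs.server.datanode.DataNode)
    jmap -heap "${DN_PID}"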

My questions are below:

1. We need to increase the DataNode heap size from the command line (we don't have a GUI). Our cluster is running hadoop-common 2.4.0.2.1.2.0-402. How should we do this?

2. If we change the parameter in hadoop-env.sh on the NameNode, will it be reflected on all nodes, or do we have to change it manually on every DataNode?

3. Do we need downtime, or can we change it without stopping any services? Is it enough to just run sh hadoop-env.sh?

4. If we do need to stop services, please tell us which services need to be stopped.

Kindly give me the proper steps to do this from the command line, and also how to verify the change afterwards.

Regards,

Satya Gaurav



3 Replies

Re: What are the prerequisites to increase the heap size of a DataNode?

Rising Star

@satya gaurav Do you have Ambari or a similar tool that lets you manage the cluster? If you do, the UI has settings to make this change and to do a rolling restart of your DataNodes.

If that is not the case and you really want to do this from the command line, it is possible. I just wanted to make sure I am not giving you the wrong instructions, hence the question.

Re: What are the prerequisites to increase the heap size of a DataNode?

New Contributor

@aengineer Hi,

Thanks for your response, but I don't have a GUI (Ambari). This is an HA cluster, however.

So let me know the proper steps to do it.

Can we do it as a rolling restart by logging in to each DataNode one by one?

Regards,

Satya Gaurav

Re: What are the prerequisites to increase the heap size of a DataNode?

Rising Star

@satya gaurav You will need to change a parameter inside hadoop-env.sh called HADOOP_DATANODE_OPTS; you can pass standard JVM options via this parameter.

Now, to answer your questions specifically:

1. We need to increase the DataNode heap size from the command line (we don't have a GUI). Our cluster is running hadoop-common 2.4.0.2.1.2.0-402. How should we do this?

Change the value of HADOOP_DATANODE_OPTS inside hadoop-env.sh.
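A minimal sketch of what that edit might look like; the file path and the 16 GB heap value are assumptions, so size the heap for your own hardware:

    # In hadoop-env.sh (commonly /etc/hadoop/conf/hadoop-env.sh on HDP-style layouts)
    # Prepend the heap flags so they apply alongside any existing DataNode options.
    export HADOOP_DATANODE_OPTS="-Xms16g -Xmx16g ${HADOOP_DATANODE_OPTS}"

Note that hadoop-env.sh is only sourced by the daemon start scripts, so simply running sh hadoop-env.sh by itself changes nothing; the DataNode has to be restarted to pick up the new value.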

2. If we change the parameter in hadoop-env.sh on the NameNode, will it be reflected on all nodes, or do we have to change it manually on every DataNode?

No, if you change it on the NameNode, it will not automatically be reflected on all DataNodes. That is where a tool like Ambari is useful; in this case you will have to copy hadoop-env.sh to every DataNode yourself.
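A rough sketch of pushing the file out; the datanodes.txt host list and passwordless SSH from the NameNode are assumptions:

    # Copy the edited hadoop-env.sh to every DataNode (one hostname per line in datanodes.txt)
    while read -r host; do
      scp /etc/hadoop/conf/hadoop-env.sh "${host}:/etc/hadoop/conf/hadoop-env.sh"
    done < datanodes.txt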

3. Do we need downtime, or can we change it without stopping any services?

You can achieve this with a rolling restart. Since this is a manual operation, I suggest you make sure all DataNodes are up and running and restart two nodes at a time. That will ensure maximum availability of your cluster.
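A rough sketch of that loop, restarting one node per iteration to stay conservative (you can batch two at a time as suggested above); the paths, the hdfs user, the datanodes.txt host list, and the pause length are assumptions about your environment:

    # Restart DataNodes one at a time, waiting for each to re-register before moving on
    while read -r host; do
      ssh "${host}" 'sudo -u hdfs /usr/lib/hadoop/sbin/hadoop-daemon.sh --config /etc/hadoop/conf stop datanode'
      ssh "${host}" 'sudo -u hdfs /usr/lib/hadoop/sbin/hadoop-daemon.sh --config /etc/hadoop/conf start datanode'
      sleep 120   # give the DataNode time to re-register with the NameNode
    done < datanodes.txt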

4. If we do need to stop services, please tell us which services need to be stopped.

You need to restart the DataNode service on all DataNodes.
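After each restart, a quick way to cross-check that the new heap size took effect; the process match below is an assumption about how the DataNode appears in ps on your nodes:

    # The -Xmx value of the running DataNode process should now show the new size
    DN_PID=$(pgrep -f org.apache.hadoop.hdfs.server.datanode.DataNode)
    ps -o args= -p "${DN_PID}" | tr ' ' '\n' | grep -- '-Xmx'

You can also run jmap -heap against the DataNode PID and confirm that MaxHeapSize reflects the new value.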

Hope this helps. Please let me know if you have any more questions.
