About Nagamalleswara

Nagamalleswara · ‎08-15-2020

Thanks @Prakashcit. If I am not wrong, to open a support case we need a license with Cloudera. This is currently our test cluster, and we don't have a license/subscription with Cloudera. Thanks for your help!

Nagamalleswara · ‎08-13-2020

Thanks @Prakashcit for pointing me to the Hive JIRA related to this bug. I see that the Fixed version is 4.0.0. Does that mean the fix is not available in CDH 6.2 (Hive 2.1)? Could you provide any guidance on how to apply this patch to the version of Hive I am running? Thanks in advance!

Nagamalleswara · ‎08-09-2020

Hi All,

We've recently started using CDH 6.2 and hive-2.1.1. In our existing jobs (in the old cluster) we have a set of properties that we set by default. One of those properties is `hive.exec.parallel=true`.

In the new cluster (hive-2.1.1 / CDH 6.2.1), when we run jobs with `hive.exec.parallel=true`, beeline does not write any console logs showing the status of the job, the application URL, and other info like the following:

INFO : Running with YARN Application = application_1596901153465_0484
INFO : Kill Command = /opt/cloudera/parcels/CDH-6.2.1-1.cdh6.2.1.p0.1425774/lib/hadoop/bin/yarn application -kill application_1596901153465_0484
INFO : Hive on Spark Session Web UI URL: http://us-east-1a-test-east-cdh-tasknode670.throtle-test.internal:45956
INFO : Query Hive on Spark job[0] stages: [0, 1, 2]
INFO : Spark job[0] status = RUNNING
INFO : Job Progress Format CurrentTime StageId_StageAttemptId: SucceededTasksCount(+RunningTasksCount-FailedTasksCount)/TotalTasksCount
INFO : 2020-08-09 04:28:23,096 Stage-0_0: 0(+1)/3 Stage-1_0: 0/64 Stage-2_0: 0/64
INFO : 2020-08-09 04:28:26,106 Stage-0_0: 0(+1)/3 Stage-1_0: 0/64 Stage-2_0: 0/64
INFO : 2020-08-09 04:28:27,110 Stage-0_0: 0(+3)/3 Stage-1_0: 0/64 Stage-2_0: 0/64

I am able to get these logs **only if I disable** `hive.exec.parallel=true`. Can anyone help me get console logs even with Hive parallel execution?
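For anyone who wants to consume those progress lines from a script once they do appear, here is a minimal sketch of a parser for the "Job Progress Format" quoted in the log above. The names `StageProgress` and `parse_progress` are hypothetical and not part of Hive or beeline. The handling of a failed-task count is inferred only from the format string in the log, since the sample output never shows one.

```python
import re
from typing import Dict, NamedTuple


class StageProgress(NamedTuple):
    succeeded: int
    running: int
    failed: int
    total: int


# One "Stage-<id>_<attempt>: <succeeded>(+<running>-<failed>)/<total>" entry.
# The parenthesised part is optional, as in "Stage-1_0: 0/64" from the sample.
# An optional comma before the failed count is tolerated (assumption).
_STAGE_RE = re.compile(
    r"Stage-(\d+)_(\d+):\s*(\d+)(?:\((?:\+(\d+))?,?(?:-(\d+))?\))?/(\d+)"
)


def parse_progress(line: str) -> Dict[str, StageProgress]:
    """Map "<stageId>_<attemptId>" to its task counts for one progress line."""
    stages = {}
    for match in _STAGE_RE.finditer(line):
        stage_id, attempt, succeeded, running, failed, total = match.groups()
        stages[f"{stage_id}_{attempt}"] = StageProgress(
            int(succeeded), int(running or 0), int(failed or 0), int(total)
        )
    return stages
```

Feeding it one of the `INFO : 2020-08-09 ...` lines above yields one entry per stage, which makes it easy to tell a stalled job from one that is simply not logging.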

Nagamalleswara · ‎08-06-2020

All, I was able to fix this issue by changing the permissions to 755 on the directories /usr/lib64/python2.7/site-packages and /usr/lib/python2.7/site-packages on the server that's running the Hue service.
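In case it helps someone hitting the same error, below is a minimal sketch for spotting which directory on the way to a path is stricter than 755 for "other" users. It only inspects mode bits, so it assumes the service user is neither the owner nor in the group of those directories, and it ignores ACLs and SELinux. The function name `dirs_blocking_others` is hypothetical.

```python
import os
import stat
from typing import List


def dirs_blocking_others(path: str) -> List[str]:
    """Return `path` and any ancestors whose mode bits would stop an
    unrelated ("other") user from listing `path`.

    The target itself needs o+r and o+x to be listed; each ancestor
    only needs o+x to be traversed.
    """
    blocked = []
    current = os.path.abspath(path)
    needed = stat.S_IROTH | stat.S_IXOTH  # requirement for the target itself
    while True:
        mode = os.stat(current).st_mode
        if (mode & needed) != needed:
            blocked.append(current)
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            break
        current = parent
        needed = stat.S_IXOTH  # ancestors only need to be traversable
    return blocked
```

Calling it on a path such as the `simplejson-3.17.2.dist-info` directory from the traceback would list every level that needs attention, not just the leaf directory.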

Nagamalleswara · ‎08-05-2020

Hi All,

We're trying to install a new cluster with Cloudera Manager 6.2.0 (CDH 6.2.1). While adding the Hue service, I am getting the error "Unexpected error. Unable to verify database connection." in Cloudera Manager.

I checked the Cloudera Manager server log, /var/log/cloudera-scm-server/cloudera-scm-server.log. Here is what it recorded at the time of the Hue test connection failure:

2020-08-05 11:49:55,499 INFO scm-web-436:com.cloudera.enterprise.JavaMelodyFacade: Entering HTTP Operation: Method:POST, Path:/dbTestConn/checkConnectionResult
2020-08-05 11:49:55,506 INFO scm-web-436:com.cloudera.enterprise.JavaMelodyFacade: Exiting HTTP Operation: Method:POST, Path:/dbTestConn/checkConnectionResult, Status:200
2020-08-05 11:49:57,541 INFO scm-web-370:com.cloudera.enterprise.JavaMelodyFacade: Entering HTTP Operation: Method:POST, Path:/dbTestConn/checkConnectionResult
2020-08-05 11:49:57,548 INFO scm-web-370:com.cloudera.enterprise.JavaMelodyFacade: Exiting HTTP Operation: Method:POST, Path:/dbTestConn/checkConnectionResult, Status:200
2020-08-05 11:49:59,588 INFO scm-web-436:com.cloudera.enterprise.JavaMelodyFacade: Entering HTTP Operation: Method:POST, Path:/dbTestConn/checkConnectionResult
2020-08-05 11:49:59,596 INFO scm-web-436:com.cloudera.enterprise.JavaMelodyFacade: Exiting HTTP Operation: Method:POST, Path:/dbTestConn/checkConnectionResult, Status:200
2020-08-05 11:50:01,634 INFO scm-web-370:com.cloudera.enterprise.JavaMelodyFacade: Entering HTTP Operation: Method:POST, Path:/dbTestConn/checkConnectionResult
2020-08-05 11:50:01,642 INFO scm-web-370:com.cloudera.enterprise.JavaMelodyFacade: Exiting HTTP Operation: Method:POST, Path:/dbTestConn/checkConnectionResult, Status:200
2020-08-05 11:50:02,044 INFO CommandPusher:com.cloudera.cmf.service.AbstractOneOffHostCommand: Unsuccessful 'HueTestDatabaseConnection'
2020-08-05 11:50:02,047 INFO CommandPusher:com.cloudera.cmf.service.AbstractDbConnectionTestCommand: Command exited with code: 1
2020-08-05 11:50:02,047 INFO CommandPusher:com.cloudera.cmf.service.AbstractDbConnectionTestCommand: + '[' syncdb = is_db_alive ']'
+ '[' ldaptest = is_db_alive ']'
+ exec /opt/cloudera/parcels/CDH-6.2.1-1.cdh6.2.1.p0.1425774/lib/hue/build/env/bin/hue is_db_alive
Traceback (most recent call last):
  File "/opt/cloudera/parcels/CDH-6.2.1-1.cdh6.2.1.p0.1425774/lib/hue/build/env/bin/hue", line 9, in <module>
    from pkg_resources import load_entry_point
  File "/opt/cloudera/parcels/CDH-6.2.1-1.cdh6.2.1.p0.1425774/lib/hue/build/env/lib/python2.7/site-packages/pkg_resources/__init__.py", line 3250, in <module>
    @_call_aside
  File "/opt/cloudera/parcels/CDH-6.2.1-1.cdh6.2.1.p0.1425774/lib/hue/build/env/lib/python2.7/site-packages/pkg_resources/__init__.py", line 3234, in _call_aside
    f(*args, **kwargs)
  File "/opt/cloudera/parcels/CDH-6.2.1-1.cdh6.2.1.p0.1425774/lib/hue/build/env/lib/python2.7/site-packages/pkg_resources/__init__.py", line 3263, in _initialize_master_working_set
    working_set = WorkingSet._build_master()
  File "/opt/cloudera/parcels/CDH-6.2.1-1.cdh6.2.1.p0.1425774/lib/hue/build/env/lib/python2.7/site-packages/pkg_resources/__init__.py", line 574, in _build_master
    ws = cls()
  File "/opt/cloudera/parcels/CDH-6.2.1-1.cdh6.2.1.p0.1425774/lib/hue/build/env/lib/python2.7/site-packages/pkg_resources/__init__.py", line 567, in __init__
    self.add_entry(entry)
  File "/opt/cloudera/parcels/CDH-6.2.1-1.cdh6.2.1.p0.1425774/lib/hue/build/env/lib/python2.7/site-packages/pkg_resources/__init__.py", line 623, in add_entry
    for dist in find_distributions(entry, True):
  File "/opt/cloudera/parcels/CDH-6.2.1-1.cdh6.2.1.p0.1425774/lib/hue/build/env/lib/python2.7/site-packages/pkg_resources/__init__.py", line 2065, in find_on_path
    for dist in factory(fullpath):
  File "/opt/cloudera/parcels/CDH-6.2.1-1.cdh6.2.1.p0.1425774/lib/hue/build/env/lib/python2.7/site-packages/pkg_resources/__init__.py", line 2127, in distributions_from_metadata
    if len(os.listdir(path)) == 0:
OSError: [Errno 13] Permission denied: '/usr/lib64/python2.7/site-packages/simplejson-3.17.2.dist-info'
OSError: [Errno 13] Permission denied: '/usr/lib64/python2.7/site-packages/simplejson-3.17.2.dist-info'

Just to check whether it's really a permission issue, I changed the permissions on the above directory to 777, and it still failed with the same error.

Please note that the test DB connection works fine for Hive and Oozie; only Hue fails.

OS: CentOS 7.8; Python: 2.7.5

Nagamalleswara · ‎07-24-2020

Hi @Bender,

As described in the link, I tried to produce thread dump and heap dump files from the running process, but as I mentioned earlier, those processes were getting killed or throwing errors. Here is the output I am getting when I run jmap as per the doc/link provided:

[yarn@us-east-1a-test-east-cdh-tasknode5152 process]$ ps -fe | grep nodemanager
yarn 5235 12503 0 13:07 ? 00:00:00 /usr/lib/jvm/java-openjdk/bin/java -Dproc_nodemanager -Xmx1000m -Djava.net.preferIPv4Stack=true -server -Xms1073741824 -Xmx1073741824 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSParallelRemarkEnabled -Dlibrary.leveldbjni.path=/run/cloudera-scm-agent/process/109-yarn-NODEMANAGER -Dhadoop.event.appender=,EventCatcher -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/CD-YARN-QafZaOEK_CD-YARN-QafZaOEK-NODEMANAGER-4756e03a64cd1a4e535550d4cd740b08_pid5235.hprof -XX:OnOutOfMemoryError=/usr/lib64/cmf/service/common/killparent.sh -Dhadoop.log.dir=/var/log/hadoop-yarn -Dyarn.log.dir=/var/log/hadoop-yarn -Dhadoop.log.file=hadoop-cmf-CD-YARN-QafZaOEK-NODEMANAGER-us-east-1a-test-east-cdh-tasknode5152.throtle-test.internal.log.out -Dyarn.log.file=hadoop-cmf-CD-YARN-QafZaOEK-NODEMANAGER-us-east-1a-test-east-cdh-tasknode5152.throtle-test.internal.log.out -Dyarn.home.dir=/opt/cloudera/parcels/CDH-5.16.2-1.cdh5.16.2.p0.8/lib/hadoop-yarn -Dhadoop.home.dir=/opt/cloudera/parcels/CDH-5.16.2-1.cdh5.16.2.p0.8/lib/hadoop-yarn -Dhadoop.root.logger=INFO,RFA -Dyarn.root.logger=INFO,RFA -Djava.library.path=/opt/cloudera/parcels/CDH-5.16.2-1.cdh5.16.2.p0.8/lib/hadoop/lib/native -classpath 
/run/cloudera-scm-agent/process/109-yarn-NODEMANAGER:/run/cloudera-scm-agent/process/109-yarn-NODEMANAGER:/run/cloudera-scm-agent/process/109-yarn-NODEMANAGER:/opt/cloudera/parcels/CDH-5.16.2-1.cdh5.16.2.p0.8/lib/hadoop/lib/*:/opt/cloudera/parcels/CDH-5.16.2-1.cdh5.16.2.p0.8/lib/hadoop/.//*:/opt/cloudera/parcels/CDH-5.16.2-1.cdh5.16.2.p0.8/lib/hadoop-hdfs/./:/opt/cloudera/parcels/CDH-5.16.2-1.cdh5.16.2.p0.8/lib/hadoop-hdfs/lib/*:/opt/cloudera/parcels/CDH-5.16.2-1.cdh5.16.2.p0.8/lib/hadoop-hdfs/.//*:/opt/cloudera/parcels/CDH-5.16.2-1.cdh5.16.2.p0.8/lib/hadoop-yarn/lib/*:/opt/cloudera/parcels/CDH-5.16.2-1.cdh5.16.2.p0.8/lib/hadoop-yarn/.//*:/opt/cloudera/parcels/CDH-5.16.2-1.cdh5.16.2.p0.8/lib/hadoop-mapreduce/lib/*:/opt/cloudera/parcels/CDH-5.16.2-1.cdh5.16.2.p0.8/lib/hadoop-mapreduce/.//*:/usr/share/cmf/lib/plugins/event-publish-5.16.2-shaded.jar:/usr/share/cmf/lib/plugins/tt-instrumentation-5.16.2.jar:/opt/cloudera/parcels/CDH-5.16.2-1.cdh5.16.2.p0.8/lib/hadoop-yarn/.//*:/opt/cloudera/parcels/CDH-5.16.2-1.cdh5.16.2.p0.8/lib/hadoop-yarn/lib/*:/run/cloudera-scm-agent/process/109-yarn-NODEMANAGER/nm-config/log4j.properties org.apache.hadoop.yarn.server.nodemanager.NodeManager yarn 5240 5235 0 13:07 ? 
00:00:00 python2.7 /usr/lib64/cmf/agent/build/env/bin/cmf-redactor /usr/lib64/cmf/service/yarn/yarn.sh nodemanager
yarn 5487 31141 0 13:07 pts/0 00:00:00 grep --color=auto nodemanager
[yarn@us-east-1a-test-east-cdh-tasknode5152 process]$ /usr/lib/jvm/java-openjdk/bin/jmap -heap 5235 > /tmp/jmap_5235_heap.out
Exception in thread "main" java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at sun.tools.jmap.JMap.runTool(JMap.java:201)
    at sun.tools.jmap.JMap.main(JMap.java:130)
Caused by: java.lang.NullPointerException
    at sun.jvm.hotspot.tools.HeapSummary.run(HeapSummary.java:157)
    at sun.jvm.hotspot.tools.Tool.startInternal(Tool.java:260)
    at sun.jvm.hotspot.tools.Tool.start(Tool.java:223)
    at sun.jvm.hotspot.tools.Tool.execute(Tool.java:118)
    at sun.jvm.hotspot.tools.HeapSummary.main(HeapSummary.java:50)
    ... 6 more

Here is the error message I am getting when I run jstack with the -l option:

[yarn@us-east-1a-test-east-cdh-tasknode5152 process]$ ps -fe | grep nodemanager
yarn 4518 12503 0 13:04 ? 
00:00:00 /usr/lib/jvm/java-openjdk/bin/java -Dproc_nodemanager -Xmx1000m -Djava.net.preferIPv4Stack=true -server -Xms1073741824 -Xmx1073741824 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSParallelRemarkEnabled -Dlibrary.leveldbjni.path=/run/cloudera-scm-agent/process/109-yarn-NODEMANAGER -Dhadoop.event.appender=,EventCatcher -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/CD-YARN-QafZaOEK_CD-YARN-QafZaOEK-NODEMANAGER-4756e03a64cd1a4e535550d4cd740b08_pid4518.hprof -XX:OnOutOfMemoryError=/usr/lib64/cmf/service/common/killparent.sh -Dhadoop.log.dir=/var/log/hadoop-yarn -Dyarn.log.dir=/var/log/hadoop-yarn -Dhadoop.log.file=hadoop-cmf-CD-YARN-QafZaOEK-NODEMANAGER-us-east-1a-test-east-cdh-tasknode5152.throtle-test.internal.log.out -Dyarn.log.file=hadoop-cmf-CD-YARN-QafZaOEK-NODEMANAGER-us-east-1a-test-east-cdh-tasknode5152.throtle-test.internal.log.out -Dyarn.home.dir=/opt/cloudera/parcels/CDH-5.16.2-1.cdh5.16.2.p0.8/lib/hadoop-yarn -Dhadoop.home.dir=/opt/cloudera/parcels/CDH-5.16.2-1.cdh5.16.2.p0.8/lib/hadoop-yarn -Dhadoop.root.logger=INFO,RFA -Dyarn.root.logger=INFO,RFA -Djava.library.path=/opt/cloudera/parcels/CDH-5.16.2-1.cdh5.16.2.p0.8/lib/hadoop/lib/native -classpath 
/run/cloudera-scm-agent/process/109-yarn-NODEMANAGER:/run/cloudera-scm-agent/process/109-yarn-NODEMANAGER:/run/cloudera-scm-agent/process/109-yarn-NODEMANAGER:/opt/cloudera/parcels/CDH-5.16.2-1.cdh5.16.2.p0.8/lib/hadoop/lib/*:/opt/cloudera/parcels/CDH-5.16.2-1.cdh5.16.2.p0.8/lib/hadoop/.//*:/opt/cloudera/parcels/CDH-5.16.2-1.cdh5.16.2.p0.8/lib/hadoop-hdfs/./:/opt/cloudera/parcels/CDH-5.16.2-1.cdh5.16.2.p0.8/lib/hadoop-hdfs/lib/*:/opt/cloudera/parcels/CDH-5.16.2-1.cdh5.16.2.p0.8/lib/hadoop-hdfs/.//*:/opt/cloudera/parcels/CDH-5.16.2-1.cdh5.16.2.p0.8/lib/hadoop-yarn/lib/*:/opt/cloudera/parcels/CDH-5.16.2-1.cdh5.16.2.p0.8/lib/hadoop-yarn/.//*:/opt/cloudera/parcels/CDH-5.16.2-1.cdh5.16.2.p0.8/lib/hadoop-mapreduce/lib/*:/opt/cloudera/parcels/CDH-5.16.2-1.cdh5.16.2.p0.8/lib/hadoop-mapreduce/.//*:/usr/share/cmf/lib/plugins/event-publish-5.16.2-shaded.jar:/usr/share/cmf/lib/plugins/tt-instrumentation-5.16.2.jar:/opt/cloudera/parcels/CDH-5.16.2-1.cdh5.16.2.p0.8/lib/hadoop-yarn/.//*:/opt/cloudera/parcels/CDH-5.16.2-1.cdh5.16.2.p0.8/lib/hadoop-yarn/lib/*:/run/cloudera-scm-agent/process/109-yarn-NODEMANAGER/nm-config/log4j.properties org.apache.hadoop.yarn.server.nodemanager.NodeManager yarn 4523 4518 0 13:04 ? 
00:00:00 python2.7 /usr/lib64/cmf/agent/build/env/bin/cmf-redactor /usr/lib64/cmf/service/yarn/yarn.sh nodemanager
yarn 4717 31141 0 13:05 pts/0 00:00:00 grep --color=auto nodemanager
[yarn@us-east-1a-test-east-cdh-tasknode5152 process]$ /usr/lib/jvm/java-openjdk/bin/jstack -F 4518 > /tmp/jstack_4518_f.out
[yarn@us-east-1a-test-east-cdh-tasknode5152 process]$ less /tmp/jstack_4518_f.out
[yarn@us-east-1a-test-east-cdh-tasknode5152 process]$ /usr/lib/jvm/java-openjdk/bin/jstack -l 4518 > /tmp/jstack_4518_f.out
4518: Unable to open socket file: target process not responding or HotSpot VM not loaded
The -F option can be used when the target process is not responding

For process 4518, I ran jstack with the -F option, and here is the output:

Debugger attached successfully.
Server compiler detected.
JVM version is 25.181-b13
Deadlock Detection:
No deadlocks found.

Nagamalleswara · ‎07-23-2020

Thanks for providing the link for JVM analysis. It's really helpful. Yes, in my case the process was not responding, so I should have used the `kill -3` option. I will try that next time and share the results.
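For reference, `kill -3` sends SIGQUIT, which a HotSpot JVM answers by printing a thread dump to its standard output instead of exiting. Here is a minimal sketch of doing the same from Python. The helper name `request_thread_dump` and its `send` parameter (there only so the call can be swapped out when trying it without a real JVM) are hypothetical. Where the dump ends up depends on how the process's stdout is captured, so that is left out.

```python
import os
import signal


def request_thread_dump(pid: int, send=os.kill) -> None:
    """Send SIGQUIT (the signal behind `kill -3`) to the process `pid`.

    A HotSpot JVM reacts by writing a thread dump to its stdout and keeps
    running. The caller must be the process owner or root.
    """
    if pid <= 0:
        # os.kill treats 0 and negative values as process groups; refuse them.
        raise ValueError("pid must be a positive process ID")
    send(pid, signal.SIGQUIT)
```

Unlike jstack or jcmd, this does not rely on the JVM's attach mechanism, which is why it can still work when the attach socket is not answering.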

Nagamalleswara · ‎07-23-2020

Hi @Bender,

Yes, we checked the ports and verified that they're accessible, and yes, it's very strange to see an issue like this. I was not able to do anything because the process (YARN NodeManager) keeps running but does nothing; it does not even write logs. In some cases a reboot of the server works, and in some cases a service restart works, but in some cases even a server reboot doesn't work.

I have tried the following options:

I tried changing the log level to DEBUG to see if that would produce any logs, but nothing was written.

Since the Java process is running (the supervisor process thinks the NM is running because the process stayed up > 20 sec), I thought of getting a thread dump to analyze. To get the thread dump I used jcmd, but running it kills the process (the one that was not producing logs) and a new process spins up. (Even if I do the same on the new process, it gets killed again.)

jcmd <pid> Thread.print >> /path/to/file

I tried to see if there were any deadlocks with jstack -F, and the result showed that there were no deadlocks.

Please let me know if I can check anything else to resolve the issue.

Nagamalleswara · ‎07-21-2020

Thanks @Bender for the detailed instructions. Yes, I am still facing this issue and have been trying to chase it down for the last few days. I have tried the commands you provided, and here is the output:

[root@us-east-1a-test-east-cdh-corenode4163 cloudera-scm-agent]# export SUPER_CONF=/var/run/cloudera-scm-agent/supervisor/supervisord.conf
[root@us-east-1a-test-east-cdh-corenode4163 cloudera-scm-agent]# /usr/lib64/cmf/agent/build/env/bin/supervisorctl -c $SUPER_CONF status
138-cluster-host-inspector EXITED Jul 21 03:00 PM
201-hdfs-DATANODE RUNNING pid 96936, uptime 0:11:58
209-hbase-REGIONSERVER RUNNING pid 97227, uptime 0:11:31
214-yarn-NODEMANAGER RUNNING pid 97307, uptime 0:11:30
cmflistener RUNNING pid 7412, uptime 4:53:45
flood RUNNING pid 7449, uptime 4:53:44
[root@us-east-1a-test-east-cdh-corenode4163 cloudera-scm-agent]#

So in my case I don't have the problem of two NM instances running on a single node. Also, here is the process output from the ps command:

yarn 97307 0.0 0.0 2858224 23704 ? Sl 15:22 0:00 /usr/lib/jvm/java-openjdk/bin/java -Dproc_nodemanager -Xmx1000m -Djava.net.preferIPv4Stack=true -server -Xms1073741824 -Xmx1073741824 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSParallelRemarkEnabled -Dlibrary.leveldbjni.path=/run/cloudera-scm-agent/process/214-yarn-NODEMANAGER -Dhadoop.event.appender=,EventCatcher -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/CD-YARN-LGzbJezU_CD-YARN-LGzbJezU-NODEMANAGER-f953f0c79fd5345f10fb347aa90e7500_pid97307.hprof -XX:OnOutOfMemoryError=/usr/lib64/cmf/service/common/killparent.sh -Dhadoop.log.dir=/var/log/hadoop-yarn -Dyarn.log.dir=/var/log/hadoop-yarn -Dhadoop.log.file=hadoop-cmf-CD-YARN-LGzbJezU-NODEMANAGER-us-east-1a-test-east-cdh-corenode4163.throtle-test.internal.log.out -Dyarn.log.file=hadoop-cmf-CD-YARN-LGzbJezU-NODEMANAGER-us-east-1a-test-east-cdh-corenode4163.throtle-test.internal.log.out -Dyarn.home.dir=/opt/cloudera/parcels/CDH-5.16.2-1.cdh5.16.2.p0.8/lib/hadoop-yarn 
-Dhadoop.home.dir=/opt/cloudera/parcels/CDH-5.16.2-1.cdh5.16.2.p0.8/lib/hadoop-yarn -Dhadoop.root.logger=INFO,RFA -Dyarn.root.logger=INFO,RFA -Djava.library.path=/opt/cloudera/parcels/CDH-5.16.2-1.cdh5.16.2.p0.8/lib/hadoop/lib/native -classpath /run/cloudera-scm-agent/process/214-yarn-NODEMANAGER:/run/cloudera-scm-agent/process/214-yarn-NODEMANAGER:/run/cloudera-scm-agent/process/214-yarn-NODEMANAGER:/opt/cloudera/parcels/CDH-5.16.2-1.cdh5.16.2.p0.8/lib/hadoop/lib/*:/opt/cloudera/parcels/CDH-5.16.2-1.cdh5.16.2.p0.8/lib/hadoop/.//*:/opt/cloudera/parcels/CDH-5.16.2-1.cdh5.16.2.p0.8/lib/hadoop-hdfs/./:/opt/cloudera/parcels/CDH-5.16.2-1.cdh5.16.2.p0.8/lib/hadoop-hdfs/lib/*:/opt/cloudera/parcels/CDH-5.16.2-1.cdh5.16.2.p0.8/lib/hadoop-hdfs/.//*:/opt/cloudera/parcels/CDH-5.16.2-1.cdh5.16.2.p0.8/lib/hadoop-yarn/lib/*:/opt/cloudera/parcels/CDH-5.16.2-1.cdh5.16.2.p0.8/lib/hadoop-yarn/.//*:/opt/cloudera/parcels/CDH-5.16.2-1.cdh5.16.2.p0.8/lib/hadoop-mapreduce/lib/*:/opt/cloudera/parcels/CDH-5.16.2-1.cdh5.16.2.p0.8/lib/hadoop-mapreduce/.//*:/usr/share/cmf/lib/plugins/event-publish-5.16.2-shaded.jar:/usr/share/cmf/lib/plugins/tt-instrumentation-5.16.2.jar:/opt/cloudera/parcels/CDH-5.16.2-1.cdh5.16.2.p0.8/lib/hadoop-yarn/.//*:/opt/cloudera/parcels/CDH-5.16.2-1.cdh5.16.2.p0.8/lib/hadoop-yarn/lib/*:/run/cloudera-scm-agent/process/214-yarn-NODEMANAGER/nm-config/log4j.properties org.apache.hadoop.yarn.server.nodemanager.NodeManager

Yes, the NodeManager is given 1 GB by default, and I even tried giving it 2 GB and 4 GB, but the issue persists.

Please note that I am seeing this issue on random servers on every restart of the cluster/YARN service. (As mentioned earlier, the process is running but no logs are written; cloudera-scm-agent just complains that it's not able to connect to port 8042.)

Could you suggest any other ways to debug this issue?

Note: Not sure if it helps, but when we tested the cluster with small nodes (r4.xlarge) we did not see this issue. We started seeing it when we increased the node size to r4.8xlarge.
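The agent's complaint about port 8042 can be cross-checked independently of cloudera-scm-agent with a plain TCP connect. This is a minimal sketch; the function name `port_is_open` and the default timeout are assumptions for illustration only.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False
```

For example, `port_is_open("localhost", 8042)` run on the affected node. A successful connect only shows that something is listening on the port; it does not prove the NodeManager is actually serving requests.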

Nagamalleswara · ‎07-20-2020

A quick update: even though I was not able to find the root cause or a fix for it, I tried rebooting the server running the NodeManager service, and afterwards the NodeManager was up and running. However, after every service restart I see the same issue of the NodeManager not running, and I have to reboot the server to get it up and running again.

Member Since	‎06-25-2017 07:50 AM
Last Visited	‎11-26-2024 05:43 PM
Posts	29
Kudos received	2

Cloudera Community

Re: Hue test db connection failing in CDH 6.2

Re: spark2-submit throwing error with multiple pa...

Re: Changing rack awareness in a running Hadoop cl...

Re: beeline ( HiveServer2 ) not writing console lo...

Re: beeline ( HiveServer2 ) not writing console lo...

beeline ( HiveServer2 ) not writing console logs w...

Re: Hue test db connection failing in CDH 6.2

Hue test db connection failing in CDH 6.2

Re: unable to start nodemanager in CDH 5.16.2

Re: unable to start nodemanager in CDH 5.16.2

Re: unable to start nodemanager in CDH 5.16.2

Re: unable to start nodemanager in CDH 5.16.2

Re: unable to start nodemanager in CDH 5.16.2