Member since
10-02-2014
13
Posts
0
Kudos Received
3
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 8717 | 05-18-2015 11:27 AM |
| | 8509 | 05-15-2015 08:28 PM |
| | 1635 | 10-08-2014 03:46 PM |
06-03-2015
09:46 AM
Thanks for your note & question. This was several weeks ago and I didn't take extensive notes... I seem to remember abandoning the session at that point and circling back a day or two later, only to find that Cloudera Manager had successfully installed. At that point, I was able to add nodes and services individually, which resulted in a working cluster for our lab... Apologies for the lack of detail here, but I believe you'll have enough direction to hack your way through this. mit Freundlichen Grüßen (with Friendly Greetings), Jan
05-18-2015
11:27 AM
Hi dlo, Thank you for your help; you actually led me in another direction and I determined there were 2 "linked" (no pun intended) issues. It seems that the /etc/hadoop/conf symlink was pointing at another symlink, /etc/alternatives/hadoop-conf, which was itself broken: it pointed to a non-existent directory on the 2 nodes where the NodeManager was failing.
I corrected the /etc/alternatives/hadoop-conf symlink from /etc/hadoop/conf.cloudera.mapred (which doesn't exist) to /etc/hadoop/conf.cloudera.yarn. Then I deployed client configuration yet again and restarted the cluster... and voilà! Problem solved.
When I checked back through everything, I was able to see that the timestamps had updated on the /etc/hadoop/conf.cloudera.yarn/topology.map & /etc/hadoop/conf.cloudera.yarn/topology.py files, which was (at some level) a confirmation that the configs had been successfully re-deployed. Hope this helps and thank you again for your help. mit Freundlichen Grüßen (with Friendly Greetings), Jan
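A chain of symlinks like the one described here can be traced one hop at a time to see exactly where it dangles. The following is only a minimal sketch, not the procedure used in the post: the function name `trace_symlink` is hypothetical, and the 40-hop loop guard is an arbitrary assumption.

```bash
# Hypothetical helper: walk a symlink chain hop by hop and report where it ends.
# Prints each hop; returns 0 if the chain ends at an existing path, 1 otherwise.
trace_symlink() {
    local path target hops
    path="$1"
    hops=0
    while [ -L "$path" ]; do
        target=$(readlink "$path")
        # Resolve a relative target against the directory that holds the link.
        case "$target" in
            /*) ;;
            *) target="$(dirname "$path")/$target" ;;
        esac
        echo "$path -> $target"
        path="$target"
        hops=$((hops + 1))
        # Arbitrary guard (an assumption) so a symlink loop cannot spin forever.
        if [ "$hops" -gt 40 ]; then
            echo "LOOP: gave up after $hops hops"
            return 1
        fi
    done
    if [ -e "$path" ]; then
        echo "OK: $path exists"
    else
        echo "BROKEN: $path does not exist"
        return 1
    fi
}
```

Calling `trace_symlink /etc/hadoop/conf` on an affected node should list each link in turn and finish with a `BROKEN:` line naming the missing directory. A link can then be repointed by hand with `ln -sfn <new-target> <link>`; whether to do that or to go through the `alternatives` tool is a choice the post does not settle.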
05-18-2015
10:40 AM
Hi dlo, Thank you for getting back to me so quickly. I believe I'd done that Friday night, but that said, I'll retry it. Before doing that, I have a question about the Deploy Client Configuration and Deploy Kerberos Client Configuration commands. First, I assume that I run one of these against the whole cluster and not just the NodeManagers, correct? Second, do I run both, or just Deploy Kerberos Client Configuration? Thank you in advance for your advice on this; I look forward to your reply. mit Freundlichen Grüßen (with Friendly Greetings), Jan
05-18-2015
10:25 AM
Hi All, I've hit the wall again and need to reach out for community wisdom on this issue. I had a fully functioning CDH5.3.2 3-node (YARN) cluster... and then... I configured it for Kerberos. I've done this successfully before using CDH4.2.1 and CDH5.1.2 MRv1... so it's not like I've never done this before. What I'm getting now is the following log output:
9:31:21.867 AM INFO org.apache.hadoop.yarn.server.nodemanager.NodeManager registered UNIX signal handlers for [TERM, HUP, INT]
9:31:24.736 AM INFO org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService Using state database at /var/lib/hadoop-yarn/yarn-nm-recovery/yarn-nm-state for recovery
9:31:24.871 AM INFO org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService$LeveldbLogger Recovering log #18
9:31:24.894 AM INFO org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService$LeveldbLogger Delete type=0 #18
9:31:24.894 AM INFO org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService$LeveldbLogger Delete type=3 #17
9:31:24.910 AM INFO org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService Loaded NM state version info 1.0
9:31:25.357 AM WARN org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor Exit code from container executor initialization is : 24
ExitCodeException exitCode=24: Configuration file ../etc/hadoop/container-executor.cfg not found.
at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
at org.apache.hadoop.util.Shell.run(Shell.java:455)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702)
at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:180)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:209)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:462)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:509)
9:31:25.372 AM INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor
9:31:25.373 AM INFO org.apache.hadoop.service.AbstractService Service NodeManager failed in state INITED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to initialize container executor
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to initialize container executor
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:211)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:462)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:509)
Caused by: java.io.IOException: Linux container executor not configured properly (error=24)
at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:186)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:209)
... 3 more
Caused by: ExitCodeException exitCode=24: Configuration file ../etc/hadoop/container-executor.cfg not found.
at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
at org.apache.hadoop.util.Shell.run(Shell.java:455)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702)
at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:180)
... 4 more
9:31:25.391 AM WARN org.apache.hadoop.service.AbstractService When stopping the service NodeManager : java.lang.NullPointerException
java.lang.NullPointerException
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.stopRecoveryStore(NodeManager.java:161)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceStop(NodeManager.java:273)
at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
at org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52)
at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:171)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:462)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:509)
9:31:25.392 AM FATAL org.apache.hadoop.yarn.server.nodemanager.NodeManager Error starting NodeManager
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to initialize container executor
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:211)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:462)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:509)
Caused by: java.io.IOException: Linux container executor not configured properly (error=24)
at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:186)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:209)
... 3 more
Caused by: ExitCodeException exitCode=24: Configuration file ../etc/hadoop/container-executor.cfg not found.
at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
at org.apache.hadoop.util.Shell.run(Shell.java:455)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702)
at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:180)
... 4 more
It appears there have been other postings on this; please see 1. "enable kerberos,nodemanager can not start Exit code from container executor initialization is : 24" & 2. "Issue with starting Yarn after deploying kerberos on Cloudera Quickstart VM (CDH 5.3.0)". I've tried 2 solutions: 1. ensuring /etc & /etc/hadoop have 755 permissions and 2. stopping the cluster, deleting the /var/lib/hadoop-yarn/yarn-nm-recovery directory, and restarting the cluster. Nothing seems to resolve this issue. This appears to be some sort of permissions issue around the Kerberization, so I need help asap. Thanks in advance for your wisdom and advice. mit Freundlichen Grüßen (with Friendly Greetings), Jan
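The first attempted fix, confirming that /etc and /etc/hadoop carry 755 permissions, can be checked rather than assumed. This is a rough sketch only: `check_mode` is a hypothetical helper, and it relies on GNU `stat -c`, which is an assumption about the platform (it holds on RHEL/CentOS).

```bash
# Hypothetical helper: compare the octal mode of each path against an expected mode.
# Usage: check_mode <expected-mode> <path>...
# Returns 0 only if every path exists and matches. Requires GNU stat.
check_mode() {
    local expected rc p mode
    expected="$1"
    shift
    rc=0
    for p in "$@"; do
        if ! mode=$(stat -c '%a' "$p" 2>/dev/null); then
            echo "MISSING $p"
            rc=1
        elif [ "$mode" = "$expected" ]; then
            echo "OK $mode $p"
        else
            echo "MISMATCH $mode $p (expected $expected)"
            rc=1
        fi
    done
    return $rc
}
```

For the case in the post the call would be `check_mode 755 /etc /etc/hadoop`. Because the error message reports a file as not found, a `MISSING` line for any path along the configuration directory is at least as interesting as a mode mismatch.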
05-15-2015
08:28 PM
Hi Darren,
Thank you for getting back to me; seems I managed to work around this issue and get the cluster going so I'm good for now on this.
mit Freundlichen Grüßen (with Friendly Greetings),
Jan
Jan Peters
05-15-2015
03:38 PM
OK, so I've nearly completed a CDH5.3.2 Cloudera Manager installation, and when navigating to the next-to-last installation screen, I get a lightbox error message "Cannot have empty version string segment" with only a Close button...
So I click the Close button and can't seem to move forward, and I'm not interested in uninstalling...
So what can I do to fix this in the background? Thanks in advance for your collective wisdom.
mit Freundlichen Grüßen (with Friendly Greetings),
Jan
Jan Peters
Labels: Cloudera Manager
10-08-2014
03:46 PM
Okay,
I'm today's cheap entertainment... problem solved! 😄
Someone disabled the krb5kdc service on our KDC server (AAARRRGGGHHH) 😮
so a simple service krb5kdc start there allowed the rest of the Kerberos set up to complete and we're back up using Kerberos.
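Since the root cause was a disabled service, it may be worth registering the KDC to start at boot as well as starting it once. The sketch below assumes SysV-style tooling (`service`, `chkconfig`) as found on RHEL/CentOS 6. The `chkconfig` and `status` steps are additions not mentioned in the post, and the `run` and `start_kdc` names and the `DRY_RUN` variable are hypothetical.

```bash
# Dry-run wrapper: prints each command by default.
# Set DRY_RUN=0 to actually execute (as root, on the KDC host).
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

# Start the KDC, register it to start at boot, then report its status.
start_kdc() {
    run service krb5kdc start
    run chkconfig krb5kdc on
    run service krb5kdc status
}
```

Running `start_kdc` as-is only prints the three commands, which makes it easy to review them first; `DRY_RUN=0 start_kdc` executes them.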
mit Freundlichen Grüßen (with Friendly Greetings),
Jan
Jan Peters
10-08-2014
11:55 AM
Hi All,
The good news is that I was able to get a CDH5.1.2 3-node cluster up and running using the Cloudera Manager installer; thanks for the help there.
Now for my next adventure: I've created all the necessary Kerberos principals and keytab/krb5.conf files and scp'd them over to the nodes on the cluster... and then started the Kerberos set-up using Cloudera Manager. Everything seemed to go very well... until after all configuration changes were made and the cluster was restarted. That's when disaster struck. :-0
It seemed to choke on the restart of Cloudera Management Services, starting with Activity manager onwards, as seen below.
So to provide more, here's a screenshot of the failed commands;
So now what I have is a cluster that will not start and there doesn't appear to be a clear way of rolling back the kerberos set up.
I have 2 sets of questions;
First, what keytab file does CM think is missing? If I create and add this keytab, how do I restart the process?
Second, how do I recover from here? Do I have to restore a back-up and start all over again or is there a way for Cloudera Manager to roll back the Kerberos configurations?
Thank you in advance for your collective wisdom on this.
mit Freundlichen Grüßen (with Friendly Greetings),
Jan
Jan Peters
Labels: Cloudera Manager, Kerberos
10-03-2014
09:23 AM
Hi Mark,
Well, it seems I'm wearing egg on my face this morning... I came back to my desk yesterday and found that all 3 nodes nuked on the install... and when I checked there was a mismatch between 5.1.3 and 5.1.2 elements.
Seems the bin installer I pulled from http://archive-primary.cloudera.com/cm5/installer/5.1.2/cloudera-manager-installer.bin created a cloudera-manager.repo pointing at http://archive-primary.cloudera.com/cm5/redhat/6/x86_64/cm/5/RPMS/x86_64/ and even though I edited the cloudera-manager.repo to point at http://archive-primary.cloudera.com/cm5/redhat/6/x86_64/cm/5.1.2/RPMS/x86_64/ it appears it had already installed the cloudera-manager-agent-5.1.3 before I got to that point, so when it installed the rest of the manifest (the 5.1.2 elements) the entire install nuked on all 3 nodes.
So now what I've done is to yum remove everything and start over by hand... I've now done the following;
yum install http://archive-primary.cloudera.com/cm5/redhat/6/x86_64/cm/5.1.2/RPMS/x86_64/cloudera-manager-agent-5.1.2-1.cm512.p0.116.el6.x86_64.rpm
yum install http://archive-primary.cloudera.com/cm5/redhat/6/x86_64/cm/5.1.2/RPMS/x86_64/cloudera-manager-daemons-5.1.2-1.cm512.p0.116.el6.x86_64.rpm
yum install http://archive-primary.cloudera.com/cm5/redhat/6/x86_64/cm/5.1.2/RPMS/x86_64/cloudera-manager-server-5.1.2-1.cm512.p0.116.el6.x86_64.rpm
yum install http://archive-primary.cloudera.com/cm5/redhat/6/x86_64/cm/5.1.2/RPMS/x86_64/cloudera-manager-server-db-2-5.1.2-1.cm512.p0.116.el6.x86_64.rpm
yum install http://archive-primary.cloudera.com/cm5/redhat/6/x86_64/cm/5.1.2/RPMS/x86_64/enterprise-debuginfo-5.1.2-1.cm512.p0.116.el6.x86_64.rpm
yum install http://archive-primary.cloudera.com/cm5/redhat/6/x86_64/cm/5.1.2/RPMS/x86_64/jdk-6u31-linux-amd64.rpm
yum install http://archive-primary.cloudera.com/cm5/redhat/6/x86_64/cm/5.1.2/RPMS/x86_64/oracle-j2sdk1.7-1.7.0+update55-2.x86_64.rpm
So I will need to figure out how to start Cloudera Manager by hand to see if I can complete a very basic install with the MRv1 framework and then Kerberize the cluster. Any advice you have is appreciated, and thanks for your patience.
mit Freundlichen Grüßen (with Friendly Greetings),
Jan
Jan Peters
10-02-2014
04:08 PM
Hi Mark,
It took a bit of mental gymnastics but I finally figured it out... first issue for me was finding/removing artifacts from the aborted 5.1.3 install. The install choked early on, on several artifacts and so I had to use yum --setopt=tsflags=noscripts remove <> along with yum history sync to clean up the repos.
The other big learning for me was that if I halted the install, edited the /etc/yum.repos.d/cloudera-manager.repo file and re-ran the install, CM would save out my edited version and replace it with the original one pointing to the symlink... SO... what I ended up doing was waiting until it was ready to install, and then editing the /etc/yum.repos.d/cloudera-manager.repo file before stepping into the install. Additionally, the other learning was to use packages rather than parcels, as that allowed me to specifically select 5.1.2 among many other versions.
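The repo edit described here, swapping the floating `/cm/5/` path for a pinned version, can be scripted so it is quick to apply in the short window before the install proceeds. This is a minimal sketch under assumptions: `pin_cm_repo` is a hypothetical name, and it presumes the repo file's URLs contain the literal segment `/cm/5/`, as the URLs quoted in these posts do.

```bash
# Hypothetical helper: pin a yum repo file to a specific Cloudera Manager version
# by rewriting the floating "/cm/5/" URL segment.
# Usage: pin_cm_repo <repo-file> <version>
# The untouched original is kept alongside as <repo-file>.bak.
pin_cm_repo() {
    local repo version
    repo="$1"
    version="$2"
    sed -i.bak "s#/cm/5/#/cm/${version}/#g" "$repo"
}
```

For the situation in the post the call would be `pin_cm_repo /etc/yum.repos.d/cloudera-manager.repo 5.1.2`. Following it with `yum clean all` is probably sensible so that metadata cached from the old URL is not reused. Re-running the function is harmless, since an already pinned URL no longer contains `/cm/5/`.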
Thank you again for your advice and wisdom; I'll be back when I start Kerberizing this 3-node cluster. 🙂
mit Freundlichen Grüßen (with Friendly Greetings),
Jan
Jan Peters