Member since: 02-16-2015
Posts: 9 | Kudos received: 6 | Solutions: 1

My Accepted Solutions

Title | Views | Posted
---|---|---
 | 51848 | 03-01-2015 05:32 PM
03-01-2015
05:32 PM
6 Kudos
The fix was to remove (or move aside) the urika cache directory on all the nodes (computes, in my case). These directories get re-created during a run. Looks like a bug when you go from simple auth to Kerberos auth: cache directories created under simple auth will not work.
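Something like this is what I mean — the cache path here is a placeholder, since the exact urika cache location will vary; substitute your own and run it on every compute node (e.g. via ssh or pdsh) while the service is stopped:

```shell
#!/bin/sh
# Move the stale cache directory aside (rather than deleting it) so the
# next run re-creates it under Kerberos auth. The default path below is
# hypothetical; pass the real urika cache directory as the first argument.
CACHE_DIR="${1:-/tmp/urika-cache-demo}"
if [ -d "$CACHE_DIR" ]; then
  mv "$CACHE_DIR" "${CACHE_DIR}.simple-auth.bak"
  echo "moved $CACHE_DIR -> ${CACHE_DIR}.simple-auth.bak"
else
  echo "no cache directory at $CACHE_DIR; nothing to do"
fi
```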
02-21-2015
01:16 AM
Try to run a simple test and get permission denied errors; tried as both root and the urika user. Just enabled Kerberos...

[root@skipper4 cloudera-scm-server]# hadoop jar /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar pi 10 100
Number of Maps = 10
Samples per Map = 100
Wrote input for Map #0
Wrote input for Map #1
Wrote input for Map #2
Wrote input for Map #3
Wrote input for Map #4
Wrote input for Map #5
Wrote input for Map #6
Wrote input for Map #7
Wrote input for Map #8
Wrote input for Map #9
Starting Job
15/02/21 03:11:40 INFO client.RMProxy: Connecting to ResourceManager at skipper4/10.0.1.4:8032
15/02/21 03:11:40 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 4 for urika on 10.0.1.4:8020
15/02/21 03:11:40 INFO security.TokenCache: Got dt for hdfs://skipper4:8020; Kind: HDFS_DELEGATION_TOKEN, Service: 10.0.1.4:8020, Ident: (HDFS_DELEGATION_TOKEN token 4 for urika)
15/02/21 03:11:41 INFO input.FileInputFormat: Total input paths to process : 10
15/02/21 03:11:41 INFO mapreduce.JobSubmitter: number of splits:10
15/02/21 03:11:41 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1424508393097_0004
15/02/21 03:11:41 INFO mapreduce.JobSubmitter: Kind: HDFS_DELEGATION_TOKEN, Service: 10.0.1.4:8020, Ident: (HDFS_DELEGATION_TOKEN token 4 for urika)
15/02/21 03:11:41 INFO impl.YarnClientImpl: Submitted application application_1424508393097_0004
15/02/21 03:11:41 INFO mapreduce.Job: The url to track the job: http://skipper4:8088/proxy/application_1424508393097_0004/
15/02/21 03:11:41 INFO mapreduce.Job: Running job: job_1424508393097_0004
15/02/21 03:11:56 INFO mapreduce.Job: Job job_1424508393097_0004 running in uber mode : false
15/02/21 03:11:56 INFO mapreduce.Job: map 0% reduce 0%
15/02/21 03:11:56 INFO mapreduce.Job: Job job_1424508393097_0004 failed with state FAILED due to: Application application_1424508393097_0004 failed 2 times due to AM Container for appattempt_1424508393097_0004_000002 exited with exitCode: -1000
due to: Application application_1424508393097_0004 initialization failed (exitCode=255) with output: main : command provided 0
main : user is urika
main : requested yarn user is urika
Can't create directory /mnt/ssd/yarn/nm/usercache/urika/appcache/application_1424508393097_0004 - Permission denied
Did not create any app directories
.Failing this attempt.. Failing the application.
15/02/21 03:11:56 INFO mapreduce.Job: Counters: 0
Job Finished in 15.543 seconds
java.io.FileNotFoundException: File does not exist: hdfs://skipper4:8020/user/urika/QuasiMonteCarlo_1424509895729_44418568/out/reduce-out
    at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1093)
    at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1085)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1085)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1749)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1773)
    at org.apache.hadoop.examples.QuasiMonteCarlo.estimatePi(QuasiMonteCarlo.java:314)
    at org.apache.hadoop.examples.QuasiMonteCarlo.run(QuasiMonteCarlo.java:354)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.examples.QuasiMonteCarlo.main(QuasiMonteCarlo.java:363)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
    at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:145)
    at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
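For anyone hitting this: the "Can't create directory ... Permission denied" under usercache is the classic symptom of a NodeManager local dir populated before Kerberos was enabled — the container-executor can no longer write into it. A hedged sketch of clearing it (the path is taken from the error message above; adjust to your yarn.nodemanager.local-dirs and run on each NodeManager with YARN stopped):

```shell
#!/bin/sh
# Remove per-user app caches created under simple auth so the kerberized
# container-executor can re-create them with the correct ownership.
# The default path below comes from the error message; override via $1.
LOCAL_DIR="${1:-/mnt/ssd/yarn/nm}"
for d in "$LOCAL_DIR"/usercache/*; do
  [ -e "$d" ] || continue   # glob may match nothing; skip in that case
  echo "removing $d"
  rm -rf "$d"
done
```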
02-17-2015
06:40 PM
Hey guys, any updates? Tried re-installing today and followed the instructions; same exact issue... Worst part is the 48 compute nodes become unreachable if I try to back out the changes.

Changed mgmt hosts to FQDN:

[root@skipper4 ~]# nslookup skipper3
Server: 172.30.84.40
Address: 172.30.84.40#53
Name: skipper3.us.com
Address: 172.30.64.116
[root@skipper4 ~]# host skipper3
skipper3.us.com has address 172.30.64.116
[root@skipper4 ~]# nslookup 172.30.64.116
Server: 172.30.84.40
Address: 172.30.84.40#53
116.64.30.172.in-addr.arpa name = skipper3.us.com
[root@skipper4 ~]#

at org.apache.zookeeper.server.NIOServerCnxnFactory.configure(NIOServerCnxnFactory.java:87)
at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:135)
at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:116)
at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:79)
2015-02-17 20:31:55,699 INFO org.apache.zookeeper.server.quorum.QuorumPeerConfig: Reading configuration from: /var/run/cloudera-scm-agent/process/3346-zookeeper-server/zoo.cfg
2015-02-17 20:31:55,709 INFO org.apache.zookeeper.server.quorum.QuorumPeerConfig: Defaulting to majority quorums
2015-02-17 20:31:55,713 INFO org.apache.zookeeper.server.DatadirCleanupManager: autopurge.snapRetainCount set to 5
2015-02-17 20:31:55,713 INFO org.apache.zookeeper.server.DatadirCleanupManager: autopurge.purgeInterval set to 24
2015-02-17 20:31:55,714 INFO org.apache.zookeeper.server.DatadirCleanupManager: Purge task started.
2015-02-17 20:31:55,723 INFO org.apache.zookeeper.server.quorum.QuorumPeerMain: Starting quorum peer
2015-02-17 20:31:55,730 INFO org.apache.zookeeper.server.DatadirCleanupManager: Purge task completed.
2015-02-17 20:31:55,792 ERROR org.apache.zookeeper.server.quorum.QuorumPeerMain: Unexpected exception, exiting abnormally
java.io.IOException: Could not configure server because SASL configuration did not allow the ZooKeeper server to authenticate itself properly: javax.security.auth.login.LoginException: Connection refused
at org.apache.zookeeper.server.ServerCnxnFactory.configureSaslLogin(ServerCnxnFactory.java:207)
at org.apache.zookeeper.server.NIOServerCnxnFactory.configure(NIOServerCnxnFactory.java:87)
at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:135)
at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:116)
at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:79)

The KDC is on mgmt3; mgmt4 = CM. Quick test:

[root@skipper4 ~]# kadmin -k -t /etc/cloudera-scm-server/cmf.keytab -p cloudera-scm/admin@URIKA-XA.COM -r URIKA-XA.COM
Authenticating as principal cloudera-scm/admin@URIKA-XA.COM with keytab /etc/cloudera-scm-server/cmf.keytab.
kadmin:

From a compute node:

[root@urika-xa1 ~]# kinit hdfs
Password for hdfs@URIKA-XA.COM:
[root@urika-xa1 ~]# klist
Ticket cache: FILE:/tmp/krb5cc_0
Default principal: hdfs@URIKA-XA.COM
Valid starting     Expires            Service principal
02/17/15 20:35:25  02/18/15 20:35:25  krbtgt/URIKA-XA.COM@URIKA-XA.COM
        renew until 02/24/15 20:35:25
[root@urika-xa1 ~]# cat /etc/krb5.conf
[libdefaults]
 default_realm = URIKA-XA.COM
 dns_lookup_kdc = false
 dns_lookup_realm = false
 ticket_lifetime = 86400
 renew_lifetime = 604800
 forwardable = true
 default_tgs_enctypes = rc4-hmac
 default_tkt_enctypes = rc4-hmac
 permitted_enctypes = rc4-hmac
 udp_preference_limit = 1
[realms]
 URIKA-XA.COM = {
  kdc = skipper3
  admin_server = skipper3
 }
[root@urika-xa1 ~]#
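Since the LoginException now says "Connection refused", the question is less about credentials and more about whether each host can actually reach the KDC on port 88. A small sketch of my own (not from the thread) that prints the kdc the local krb5.conf points at, so the value can be compared across hosts and then probed (e.g. with `nc -z <kdc> 88`):

```shell
#!/bin/sh
# Print the kdc host named in a krb5.conf-style file. Every node should
# print the same value, and that host must be reachable on port 88.
conf="${1:-/etc/krb5.conf}"
[ -r "$conf" ] || { echo "no readable $conf on this host"; exit 0; }
awk -F'=' '/^[[:space:]]*kdc[[:space:]]*=/ { gsub(/[[:space:]]/, "", $2); print $2; exit }' "$conf"
```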
02-17-2015
10:23 AM
Updated the encryption types to include the ones listed, then re-pushed krb5.conf from Cloudera. Same issue.

[root@mgmt2-ib ~]# cat /etc/krb5.conf
[libdefaults]
 default_realm = URIKA-XA.COM
 dns_lookup_kdc = false
 dns_lookup_realm = false
 ticket_lifetime = 86400
 renew_lifetime = 604800
 forwardable = true
 default_tgs_enctypes = rc4-hmac aes256-cts aes128-cts des3-hmac-sha1 arcfour-hmac des-hmac-sha1 des-cbc-md5 des-cbc-crc
 default_tkt_enctypes = rc4-hmac aes256-cts aes128-cts des3-hmac-sha1 arcfour-hmac des-hmac-sha1 des-cbc-md5 des-cbc-crc
 permitted_enctypes = rc4-hmac aes256-cts aes128-cts des3-hmac-sha1 arcfour-hmac des-hmac-sha1 des-cbc-md5 des-cbc-crc
 udp_preference_limit = 1
[realms]
 URIKA-XA.COM = {
  kdc = mgmt4-ib.urika-xa.com
  admin_server = mgmt4-ib.urika-xa.com
 }

Same issue:

at org.apache.zookeeper.server.NIOServerCnxnFactory.configure(NIOServerCnxnFactory.java:87)
at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:135)
at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:116)
at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:79)
2015-02-17 12:20:40,767 INFO org.apache.zookeeper.server.quorum.QuorumPeerConfig: Reading configuration from: /var/run/cloudera-scm-agent/process/2344-zookeeper-server/zoo.cfg
2015-02-17 12:20:40,777 INFO org.apache.zookeeper.server.quorum.QuorumPeerConfig: Defaulting to majority quorums
2015-02-17 12:20:40,780 INFO org.apache.zookeeper.server.DatadirCleanupManager: autopurge.snapRetainCount set to 5
2015-02-17 12:20:40,780 INFO org.apache.zookeeper.server.DatadirCleanupManager: autopurge.purgeInterval set to 24
2015-02-17 12:20:40,782 INFO org.apache.zookeeper.server.DatadirCleanupManager: Purge task started.
2015-02-17 12:20:40,791 INFO org.apache.zookeeper.server.quorum.QuorumPeerMain: Starting quorum peer
2015-02-17 12:20:40,794 INFO org.apache.zookeeper.server.DatadirCleanupManager: Purge task completed.
2015-02-17 12:20:40,859 ERROR org.apache.zookeeper.server.quorum.QuorumPeerMain: Unexpected exception, exiting abnormally
java.io.IOException: Could not configure server because SASL configuration did not allow the ZooKeeper server to authenticate itself properly: javax.security.auth.login.LoginException: Connection refused
at org.apache.zookeeper.server.ServerCnxnFactory.configureSaslLogin(ServerCnxnFactory.java:207)
at org.apache.zookeeper.server.NIOServerCnxnFactory.configure(NIOServerCnxnFactory.java:87)
at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:135)
at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:116)
at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:79)
02-17-2015
02:57 AM
That was due to trying the shortname; can change back (had it with FQDN before). Internal network, so no DNS. KDC and CM are on the same system (the mgmt4 node). Tried from mgmt2, another system; everything seems to work:

[root@mgmt2-ib ~]# kadmin -p crayadm/admin
Authenticating as principal crayadm/admin with password.
Password for crayadm/admin@URIKA-XA.COM:
kadmin: listprincs
HTTP/mgmt1-ib@URIKA-XA.COM
HTTP/mgmt2-ib@URIKA-XA.COM
HTTP/mgmt4-ib@URIKA-XA.COM
HTTP/urika-xa10@URIKA-XA.COM

From a compute node:

[root@urika-xa37 ~]# hostname -f
urika-xa37.urika-xa.com
[root@urika-xa37 ~]# hostname
urika-xa37

[root@mgmt2-ib ~]# kinit hdfs
Password for hdfs@URIKA-XA.COM:
[root@mgmt2-ib ~]# klist
Ticket cache: FILE:/tmp/krb5cc_0
Default principal: hdfs@URIKA-XA.COM
Valid starting     Expires            Service principal
02/17/15 04:52:56  02/18/15 04:52:56  krbtgt/URIKA-XA.COM@URIKA-XA.COM
        renew until 02/24/15 04:52:56
[root@mgmt2-ib ~]#

Have the hostnames set up in /etc/hosts as:

# Management Nodes
192.168.1.1 mgmt1.urika-xa.com mgmt1
192.168.1.2 mgmt2.urika-xa.com mgmt2
192.168.1.3 mgmt3.urika-xa.com mgmt3
192.168.1.4 mgmt4.urika-xa.com mgmt4
10.0.1.1 mgmt1-ib.urika-xa.com mgmt1-ib
10.0.1.2 mgmt2-ib.urika-xa.com mgmt2-ib
10.0.1.3 mgmt3-ib.urika-xa.com mgmt3-ib
10.0.1.4 mgmt4-ib.urika-xa.com mgmt4-ib
# Compute Blades
192.168.100.101 urika-xa1-eth.urika-xa.com urika-xa1-eth
192.168.100.102 urika-xa2-eth.urika-xa.com urika-xa2-eth
192.168.100.103 urika-xa3-eth.urika-xa.com urika-xa3-eth
192.168.100.104 urika-xa4-eth.urika-xa.com urika-xa4-eth
192.168.100.105 urika-xa5-eth.urika-xa.com urika-xa5-eth
192.168.100.106 urika-xa6-eth.urika-xa.com urika-xa6-eth
192.168.100.107 urika-xa7-eth.urika-xa.com urika-xa7-eth
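One thing worth checking with a hosts-file-only setup like this: Hadoop and ZooKeeper build service principals from the host's canonical name, which is the first name after the address on each /etc/hosts line. A small sketch (the check is my own, not from this thread) that flags lines where a short name comes before the FQDN:

```shell
#!/bin/sh
# Flag /etc/hosts entries whose first name after the address is not fully
# qualified; the canonical (first) name is what Java reverse-resolution
# will hand back when building Kerberos service principals.
hosts_file="${1:-/etc/hosts}"
awk '!/^[[:space:]]*#/ && NF >= 2 && $2 !~ /\./ {
  print "line " NR ": first name is not an FQDN: " $0
}' "$hosts_file"
```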
02-17-2015
01:29 AM
Hello,

Been tinkering all weekend with Kerberos; still stuck on the following during zookeeper start:

at org.apache.zookeeper.server.NIOServerCnxnFactory.configure(NIOServerCnxnFactory.java:87)
at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:135)
at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:116)
at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:79)
2015-02-17 03:17:26,942 INFO org.apache.zookeeper.server.quorum.QuorumPeerConfig: Reading configuration from: /var/run/cloudera-scm-agent/process/2275-zookeeper-server/zoo.cfg
2015-02-17 03:17:26,952 INFO org.apache.zookeeper.server.quorum.QuorumPeerConfig: Defaulting to majority quorums
2015-02-17 03:17:26,955 INFO org.apache.zookeeper.server.DatadirCleanupManager: autopurge.snapRetainCount set to 5
2015-02-17 03:17:26,955 INFO org.apache.zookeeper.server.DatadirCleanupManager: autopurge.purgeInterval set to 24
2015-02-17 03:17:26,957 INFO org.apache.zookeeper.server.DatadirCleanupManager: Purge task started.
2015-02-17 03:17:26,965 INFO org.apache.zookeeper.server.quorum.QuorumPeerMain: Starting quorum peer
2015-02-17 03:17:26,969 INFO org.apache.zookeeper.server.DatadirCleanupManager: Purge task completed.
2015-02-17 03:17:27,037 ERROR org.apache.zookeeper.server.quorum.QuorumPeerMain: Unexpected exception, exiting abnormally
java.io.IOException: Could not configure server because SASL configuration did not allow the ZooKeeper server to authenticate itself properly: javax.security.auth.login.LoginException: mgmt4-ib.urika-xa.com
at org.apache.zookeeper.server.ServerCnxnFactory.configureSaslLogin(ServerCnxnFactory.java:207)
at org.apache.zookeeper.server.NIOServerCnxnFactory.configure(NIOServerCnxnFactory.java:87)
at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:135)
at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:116)
at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:79)

Everything through the wizard seems to work until it starts the cluster.

kadmin:
yarn/urika-xa42@URIKA-XA.COM
yarn/urika-xa43@URIKA-XA.COM
yarn/urika-xa44@URIKA-XA.COM
yarn/urika-xa45@URIKA-XA.COM
yarn/urika-xa46@URIKA-XA.COM
yarn/urika-xa47@URIKA-XA.COM
yarn/urika-xa48@URIKA-XA.COM
yarn/urika-xa4@URIKA-XA.COM
yarn/urika-xa5@URIKA-XA.COM
yarn/urika-xa6@URIKA-XA.COM
yarn/urika-xa7@URIKA-XA.COM
yarn/urika-xa8@URIKA-XA.COM
yarn/urika-xa9@URIKA-XA.COM
zookeeper/mgmt1-ib@URIKA-XA.COM
zookeeper/mgmt2-ib@URIKA-XA.COM
zookeeper/mgmt3-ib@URIKA-XA.COM
kadmin:

krb5.conf:

[libdefaults]
 default_realm = URIKA-XA.COM
 dns_lookup_realm = false
 dns_lookup_kdc = false
 ticket_lifetime = 86400
 renew_lifetime = 604800
 forwardable = true
 default_tgs_enctypes = rc4-hmac
 default_tkt_enctypes = rc4-hmac
 permitted_enctypes = rc4-hmac
 udp_preference_limit = 1
[realms]
 URIKA-XA.COM = {
  kdc = mgmt4-ib
  admin_server = mgmt4-ib
 }
[root@mgmt4-ib cloudera-scm-server]#

Tested kadmin with the Cloudera keytab:

kadmin -k -t /etc/cloudera-scm-server/cmf.keytab -p cloudera-scm/admin@URIKA-XA.COM -r URIKA-XA.COM
Authenticating as principal cloudera-scm/admin@URIKA-XA.COM with keytab /etc/cloudera-scm-server/cmf.keytab.
kadmin: default security realm: URIKA-XA.COM

[root@mgmt4-ib cloudera-scm-server]# cat /var/kerberos/krb5kdc/kdc.conf
[realms]
 URIKA-XA.COM = {
  #master_key_type = aes256-cts
  acl_file = /var/kerberos/krb5kdc/kadm5.acl
  dict_file = /usr/share/dict/words
  admin_keytab = /var/kerberos/krb5kdc/kadm5.keytab
  supported_enctypes = aes256-cts:normal aes128-cts:normal des3-hmac-sha1:normal arcfour-hmac:normal des-hmac-sha1:normal des-cbc-md5:normal des-cbc-crc:normal
  max_life = 24h 0m 0s
  max_renewable_life = 7d 0h 0m 0s
 }
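Note that in this first stack trace the LoginException message is just a hostname (mgmt4-ib.urika-xa.com), which in my experience usually wraps an UnknownHostException: the JVM could not resolve the KDC/admin-server name at all. A quick hedged check that each name resolves the way the JVM will (getent walks the same NSS path, /etc/hosts then DNS):

```shell
#!/bin/sh
# Verify that each Kerberos-related hostname passed as an argument
# resolves via NSS, which is what the JVM's resolver consults.
for h in "$@"; do
  if getent hosts "$h" > /dev/null; then
    echo "$h: resolves"
  else
    echo "$h: DOES NOT resolve"
  fi
done
```

Run it from every ZooKeeper host, e.g. `sh check-resolve.sh mgmt4-ib mgmt4-ib.urika-xa.com` (the script name is just for illustration).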
Labels:
- Apache YARN
- Apache Zookeeper
- Kerberos
- Security