Member since: 01-16-2017
Posts: 12
Kudos Received: 1
Solutions: 1

My Accepted Solutions

Title | Views | Posted
---|---|---
 | 4004 | 01-16-2017 02:53 AM
03-06-2017
03:29 AM
@venkatsambath This is great information, many thanks!
03-01-2017
06:36 AM
Hi all, I'm mainly looking for advice here around Kafka and disaster-recovery failover. Is there any way to use Kafka through CNAMEs/a load balancer when using Kerberos? When I try it, I get the SPN error below, which makes sense and is behaviour I would fully expect. The only way I can picture this working would be to include a CNAME resolver in the Java client code before establishing a connection.

Using the New Consumer API, on any new connection:
1) Provide the CNAME hostname in the config
2) Resolve the CNAME to the list of A records for the broker hosts
3) Pass these into the New Consumer as the bootstrap servers

This should work, however it would involve custom code (a rough sketch of the idea follows the broker log below). Are there any ideas that might work without having to resort to this?

---------------

Consumer log:

17/03/01 14:12:06 DEBUG consumer.KafkaConsumer: Subscribed to topic(s): build_smoke_test
17/03/01 14:12:06 DEBUG clients.NetworkClient: Initiating connection to node -1 at lb.cdh-poc-cluster.internal.cdhnetwork:9093.
17/03/01 14:12:06 DEBUG authenticator.SaslClientAuthenticator: Set SASL client state to SEND_HANDSHAKE_REQUEST
17/03/01 14:12:06 DEBUG authenticator.SaslClientAuthenticator: Creating SaslClient: client=alex@CDH-POC-CLUSTER.INTERNAL.CDHNETWORK;service=kafka;serviceHostname=lb.cdh-poc-cluster.internal.cdhnetwork;mechs=[GSSAPI]
17/03/01 14:12:06 DEBUG network.Selector: Connection with lb.cdh-poc-cluster.internal.cdhnetwork/172.3.1.10 disconnected
java.io.EOFException
at org.apache.kafka.common.network.SslTransportLayer.read(SslTransportLayer.java:488)
at org.apache.kafka.common.network.NetworkReceive.readFromReadableChannel(NetworkReceive.java:81)

Broker log:

2017-03-01 14:12:08,330 DEBUG org.apache.kafka.common.security.authenticator.SaslServerAuthenticator: Set SASL server state to HANDSHAKE_REQUEST
2017-03-01 14:12:08,330 DEBUG org.apache.kafka.common.security.authenticator.SaslServerAuthenticator: Handle Kafka request SASL_HANDSHAKE
2017-03-01 14:12:08,330 DEBUG org.apache.kafka.common.security.authenticator.SaslServerAuthenticator: Using SASL mechanism 'GSSAPI' provided by client
2017-03-01 14:12:08,331 DEBUG org.apache.kafka.common.security.authenticator.SaslServerAuthenticator: Creating SaslServer for kafka/kf0.cdh-poc-cluster.internal.cdhnetwork@CDH-POC-CLUSTER.INTERNAL.CDHNETWORK with mechanism GSSAPI
2017-03-01 14:12:08,331 DEBUG org.apache.kafka.common.security.authenticator.SaslServerAuthenticator: Set SASL server state to AUTHENTICATE
2017-03-01 14:12:08,334 DEBUG org.apache.kafka.common.security.authenticator.SaslServerAuthenticator: Set SASL server state to FAILED
2017-03-01 14:12:08,334 DEBUG org.apache.kafka.common.network.Selector: Connection with lb.cdh-poc-cluster.internal.cdhnetwork/172.3.1.10 disconnected
java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: Failure unspecified at GSS-API level (Mechanism level: Checksum failed)]
at org.apache.kafka.common.security.authenticator.SaslServerAuthenticator.authenticate(SaslServerAuthenticator.java:243)
at org.apache.kafka.common.network.KafkaChannel.prepare(KafkaChannel.java:64)
at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:318)
at org.apache.kafka.common.network.Selector.poll(Selector.java:283)
at kafka.network.Processor.poll(SocketServer.scala:472)
at kafka.network.Processor.run(SocketServer.scala:412)
at java.lang.Thread.run(Thread.java:745)
Caused by: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: Failure unspecified at GSS-API level (Mechanism level: Checksum failed)]
at com.sun.security.sasl.gsskerb.GssKrb5Server.evaluateResponse(GssKrb5Server.java:199)
at org.apache.kafka.common.security.authenticator.SaslServerAuthenticator.authenticate(SaslServerAuthenticator.java:228)
... 6 more
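
For anyone curious what the custom-code route would look like, here is a rough shell sketch of the three steps above (in the actual Java client the same resolution would be done with InetAddress.getAllByName() before building bootstrap.servers). The hostname, port and topic come from the logs in this post; the console-consumer flags and the contents of client.properties (SASL_SSL and Kerberos settings) are assumptions rather than a tested recipe.

#!/usr/bin/env bash
# Sketch only: resolve the load-balancer CNAME to the broker records and
# bootstrap against those directly, so Kerberos sees the real broker SPNs.

CNAME=lb.cdh-poc-cluster.internal.cdhnetwork   # 1) the CNAME handed to clients in config
PORT=9093

# 2) Resolve the CNAME to the underlying records for the broker hosts
BROKERS=$(dig +short "$CNAME" | sed 's/\.$//')

# 3) Build a bootstrap.servers list from the resolved entries
BOOTSTRAP=$(echo "$BROKERS" | sed "s/$/:$PORT/" | paste -sd, -)
echo "bootstrap.servers=$BOOTSTRAP"

# Hand the resolved list to a consumer instead of the CNAME itself
kafka-console-consumer --bootstrap-server "$BOOTSTRAP" \
  --topic build_smoke_test --consumer.config client.properties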
Labels:
- Apache Kafka
02-20-2017
07:32 AM
1 Kudo
@venkatsambath Thanks for the confirmation. I'm just thinking in terms of high availability: have we now introduced a single point of failure for Impala? How would you make the load balancer itself highly available? You could run multiple load balancers and balance across them with DNS CNAMEs, but since we are using Kerberos the name we point requests at must be an A record, so that won't work. Is any of my thinking above incorrect? How would we make the service highly available when using load balancers and Kerberos for Impala?
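
(Thinking out loud, purely as an illustration of what might square that circle: a floating VIP, e.g. keepalived/VRRP, shared by two load-balancer nodes keeps a single A record for the lb name while removing the single host. The sketch below is an assumption, not something I have tested here — the interface name and priorities are made up, and the VIP is simply the address the lb record already resolves to in this cluster.)

# Illustrative only: keep lb.cdh-poc-cluster.internal.cdhnetwork as a single A
# record pointing at a VIP that can fail over between two load-balancer nodes.
cat > /etc/keepalived/keepalived.conf <<'EOF'
vrrp_instance impala_lb {
    state MASTER              # BACKUP (with a lower priority) on the second node
    interface eth0            # assumed interface name
    virtual_router_id 51
    priority 100
    virtual_ipaddress {
        172.3.1.10/24         # the address the lb A record resolves to
    }
}
EOF
systemctl restart keepalived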
02-17-2017
05:35 AM
Hi, I am configuring Impala to use a load balancer and Kerberos. I have this setup working, however I am unable to query each daemon directly. Is this normal behavior? Below are a successful and an unsuccessful query:

[centos@kf0 ~]$ klist
Ticket cache: FILE:/tmp/krb5cc_1000
Default principal: alex@CDH-POC-CLUSTER.INTERNAL.CDHNETWORK
Valid starting Expires Service principal
02/17/17 13:08:30 02/18/17 13:08:30 krbtgt/CDH-POC-CLUSTER.INTERNAL.CDHNETWORK@CDH-POC-CLUSTER.INTERNAL.CDHNETWORK
renew until 02/24/17 13:08:30
02/17/17 13:08:51 02/18/17 13:08:30 impala/dn1.cdh-poc-cluster.internal.cdhnetwork@CDH-POC-CLUSTER.INTERNAL.CDHNETWORK
renew until 02/24/17 13:08:30
02/17/17 13:14:00 02/18/17 13:08:30 impala/dn2.cdh-poc-cluster.internal.cdhnetwork@CDH-POC-CLUSTER.INTERNAL.CDHNETWORK
renew until 02/24/17 13:08:30
02/17/17 13:27:16 02/18/17 13:08:30 impala/lb.cdh-poc-cluster.internal.cdhnetwork@CDH-POC-CLUSTER.INTERNAL.CDHNETWORK
renew until 02/24/17 13:08:30
[centos@kf0 ~]$ impala-shell --ssl --impalad=lb.cdh-poc-cluster.internal.cdhnetwork:21000 -q "show tables" --ca_cert "/etc/ipa/ca.crt" -k -V
Starting Impala Shell using Kerberos authentication
Using service name 'impala'
SSL is enabled
Connected to lb.cdh-poc-cluster.internal.cdhnetwork:21000
Server version: impalad version 2.7.0-cdh5.10.0 RELEASE (build 785a073cd07e2540d521ecebb8b38161ccbd2aa2)
Query: show tables
Fetched 0 row(s) in 0.43s
[centos@kf0 ~]$ impala-shell --ssl --impalad=dn1.cdh-poc-cluster.internal.cdhnetwork:21000 -q "show tables" --ca_cert "/etc/ipa/ca.crt" -k -V
Starting Impala Shell using Kerberos authentication
Using service name 'impala'
SSL is enabled
Error connecting: TTransportException, TSocket read 0 bytes
Not connected to Impala, could not execute queries.

In the logs I see:

E0217 13:27:36.607559 6262 authentication.cc:160] SASL message (Kerberos (external)): GSSAPI Error: Unspecified GSS failure. Minor code may provide more information (Request ticket server impala/dn1.cdh-poc-cluster.internal.cdhnetwork@CDH-POC-CLUSTER.INTERNAL.CDHNETWORK found in keytab but does not match server principal impala/lb.cdh-poc-cluster.internal.cdhnetwork@)
I0217 13:27:36.625763 6262 thrift-util.cc:111] SSL_shutdown: error code: 0
I0217 13:27:36.625901 6262 thrift-util.cc:111] TThreadPoolServer: TServerTransport died on accept: SASL(-13): authentication failure: GSSAPI Failure: gss_accept_sec_context

However, in the keytab file I can see the dn1 principal is there:

[root@dn1 impalad]# klist -kt /run/cloudera-scm-agent/process/64-impala-IMPALAD/impala.keytab
Keytab name: FILE:/run/cloudera-scm-agent/process/64-impala-IMPALAD/impala.keytab
KVNO Timestamp Principal
---- ------------------- ------------------------------------------------------
1 02/17/2017 12:03:52 impala/lb.cdh-poc-cluster.internal.cdhnetwork@CDH-POC-CLUSTER.INTERNAL.CDHNETWORK
1 02/17/2017 12:03:52 impala/lb.cdh-poc-cluster.internal.cdhnetwork@CDH-POC-CLUSTER.INTERNAL.CDHNETWORK
1 02/17/2017 12:03:52 impala/dn1.cdh-poc-cluster.internal.cdhnetwork@CDH-POC-CLUSTER.INTERNAL.CDHNETWORK
1 02/17/2017 12:03:52 impala/dn1.cdh-poc-cluster.internal.cdhnetwork@CDH-POC-CLUSTER.INTERNAL.CDHNETWORK
[root@dn1 impalad]#

And the daemon principals are set correctly:

[root@dn1 impalad]# cat /run/cloudera-scm-agent/process/64-impala-IMPALAD/impala-conf/impalad_flags | grep princ
-principal=impala/lb.cdh-poc-cluster.internal.cdhnetwork@CDH-POC-CLUSTER.INTERNAL.CDHNETWORK
-be_principal=impala/dn1.cdh-poc-cluster.internal.cdhnetwork@CDH-POC-CLUSTER.INTERNAL.CDHNETWORK
[root@dn1 impalad]#

So is it normal behaviour that the daemons can no longer be queried directly once Kerberos has been enabled with a load balancer, or am I doing something wrong? Thanks
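
For what it's worth, here is a quick way to see the mismatch from the client side. These commands only use hostnames and paths already shown above; the interpretation — that the shell requests the SPN built from the host passed to --impalad, while the daemon only accepts the principal it was started with via -principal — is my reading of the error above, not something confirmed by the docs.

# kvno asks the KDC for a service ticket for the named SPN, which (judging by the
# klist output above) is the same SPN the shell requests for a given --impalad host.
kvno impala/lb.cdh-poc-cluster.internal.cdhnetwork@CDH-POC-CLUSTER.INTERNAL.CDHNETWORK   # LB connection
kvno impala/dn1.cdh-poc-cluster.internal.cdhnetwork@CDH-POC-CLUSTER.INTERNAL.CDHNETWORK  # direct dn1 connection

# The daemon's externally-facing principal is whatever -principal it was started with:
grep '^-principal' /run/cloudera-scm-agent/process/64-impala-IMPALAD/impala-conf/impalad_flags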
Labels:
- Apache Impala
- Cloudera Manager
- Kerberos
02-15-2017
08:07 AM
Okay, I have it. I was using the parcel_provisioner.sh script to preload the parcels into Docker images; when doing the pre-extraction, the permissions on container-executor weren't being set properly. For now, turning off the pre-extraction works. I'll test by manually setting the permissions, although I'm wondering how many other permissions aren't set properly. FYI, the root:hadoop 400 permissions on container-executor.cfg work because of the setuid flag on the container-executor binary. Now everything makes sense. Thanks for the help!
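
In case it helps anyone else preloading parcels into images, this is roughly what I mean by setting the permissions manually. The expected bits come from the Hadoop secure-container documentation (a setuid/setgid binary owned by root, group-owned by the group configured in yarn.nodemanager.linux-container-executor.group, i.e. yarn here); the parcel path is this cluster's CDH 5.10 parcel and is an assumption for anything else.

# Sketch: verify/restore the setuid container-executor after parcel pre-extraction.
CE=/opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.41/lib/hadoop-yarn/bin/container-executor

ls -l "$CE"              # expect something like ---Sr-s--- root yarn
chown root:yarn "$CE"    # group must match yarn.nodemanager.linux-container-executor.group
chmod 6050 "$CE"         # setuid + setgid, no access for "other"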
02-15-2017
01:34 AM
Hi, looking at that document, I see the following under conf/container-executor.cfg:

"The executable requires a configuration file called container-executor.cfg to be present in the configuration directory passed to the mvn target mentioned above. The configuration file must be owned by the user running NodeManager (user yarn in the above example), group-owned by anyone and should have the permissions 0400 or r--------."

This makes sense, because if the container-executor runs as yarn, how else could it read the configuration? Does anyone have a running Kerberos cluster to confirm the permissions?
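
For anyone willing to check on a live cluster, a one-liner that would confirm it (path taken from the output earlier in this thread):

# Print owner, group and octal mode of the generated container-executor.cfg
stat -c '%U:%G %a %n' /run/cloudera-scm-agent/process/*-yarn-NODEMANAGER/container-executor.cfg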
02-14-2017
09:05 AM
Actually, digging into this a bit more, I think it is the permissions on container-executor.cfg that are causing the issue. The NodeManager is launched as the yarn user:

yarn 17040 17035 0 16:53 ? 00:00:00 python2.7 /usr/lib64/cmf/agent/build/env/bin/cmf-redactor /usr/lib64/cmf/service/yarn/yarn.sh nodemanager

And from http://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/SecureContainer.html:

"On Linux environment the secure container executor is the LinuxContainerExecutor. It uses an external program called the container-executor to launch the container. This program has the setuid access right flag set which allows it to launch the container with the permissions of the YARN application user."

This would explain why this is only happening on secure clusters built by Cloudera Director. It seems that container-executor.cfg is created and populated at NodeManager restart time, so I cannot change permissions on the cfg file to test. Is there a reason why these cfg files are created with 400 and not 444? Should they be 444 on secure clusters? Can this be changed, and where? Thanks
02-14-2017
07:29 AM
I used Cloudera Director to build a cluster without Kerberos. YARN came up okay and the permissions were the following:

[root@dn2 ~]# find / -name container-executor.cfg -exec ls -l {} \;
-rw-r--r--. 13 cloudera-scm cloudera-scm 318 Jan 20 21:38 /opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.41/etc/hadoop/conf.empty/container-executor.cfg
-r--------. 1 root hadoop 0 Feb 14 15:16 /run/cloudera-scm-agent/process/44-yarn-NODEMANAGER/container-executor.cfg
-r--------. 1 root hadoop 0 Feb 14 15:16 /etc/hadoop/conf.cloudera.CD-YARN-uMqvpvqg/container-executor.cfg

They are the same permissions, so it seems the permissions are not the issue. Perhaps it is the contents. Any clues?
02-14-2017
05:34 AM
Hi, I am trying to build Kerberos-enabled clusters using Cloudera Director. During FirstRun pretty much all services come online except YARN; HDFS, Hue, ZooKeeper and Kafka are all fine. When bringing up the NodeManager I see the following in the role logs:

Error starting NodeManager
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to initialize container executor
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:251)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:544)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:591)
Caused by: java.io.IOException: Linux container executor not configured properly (error=24)
at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:198)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:249)
... 3 more
Caused by: ExitCodeException exitCode=24: Invalid conf file provided : /etc/hadoop/conf.cloudera.CD-YARN-VAJUGMaj/container-executor.cfg
at org.apache.hadoop.util.Shell.runCommand(Shell.java:601)
at org.apache.hadoop.util.Shell.run(Shell.java:504)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:786)
at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:192)
... 4 more

Looking on this node I see:

[root@dn0 ~]# find / -name container-executor.cfg -exec ls -l {} \;
-rw-r--r--. 13 cloudera-scm cloudera-scm 318 Jan 20 21:38 /opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.41/etc/hadoop/conf.empty/container-executor.cfg
-r--------. 1 root hadoop 156 Feb 14 12:13 /run/cloudera-scm-agent/process/52-yarn-NODEMANAGER/container-executor.cfg
-r--------. 1 root hadoop 156 Feb 14 12:13 /etc/hadoop/conf.cloudera.CD-YARN-VAJUGMaj/container-executor.cfg
find: ‘/proc/17426’: No such file or directory
[root@dn0 ~]# ll /etc/hadoop/conf.cloudera.CD-YARN-VAJUGMaj/
total 52
-rw-r--r--. 1 root root 20 Feb 14 12:10 __cloudera_generation__
-rw-r--r--. 1 root root 67 Feb 14 12:10 __cloudera_metadata__
-r--------. 1 root hadoop 156 Feb 14 12:13 container-executor.cfg
-rw-r--r--. 1 root root 3895 Feb 14 12:10 core-site.xml
-rw-r--r--. 1 root root 617 Feb 14 12:10 hadoop-env.sh
-rw-r--r--. 1 root root 2684 Feb 14 12:10 hdfs-site.xml
-rw-r--r--. 1 root root 314 Feb 14 12:10 log4j.properties
-rw-r--r--. 1 root root 5011 Feb 14 12:10 mapred-site.xml
-rw-r--r--. 1 root root 315 Feb 14 12:10 ssl-client.xml
-rw-r--r--. 1 root hadoop 684 Feb 14 12:13 topology.map
-rwxr-xr-x. 1 root hadoop 1594 Feb 14 12:13 topology.py
-rw-r--r--. 1 root root 3872 Feb 14 12:10 yarn-site.xml

And /etc/hadoop looks like:

[root@dn0 ~]# ll /etc/hadoop
total 8
lrwxrwxrwx. 1 root root 29 Feb 14 12:10 conf -> /etc/alternatives/hadoop-conf
drwxr-xr-x. 2 root root 4096 Feb 14 12:10 conf.cloudera.CD-HDFS-gbUrTxBt
drwxr-xr-x. 2 root root 4096 Feb 14 12:13 conf.cloudera.CD-YARN-VAJUGMaj
[root@dn0 ~]# ll /etc/alternatives/hadoop-conf
lrwxrwxrwx. 1 root root 42 Feb 14 12:10 /etc/alternatives/hadoop-conf -> /etc/hadoop/conf.cloudera.CD-YARN-VAJUGMaj

The YARN process runs as the yarn user, I presume, so for some reason the wrong permissions are being given to container-executor.cfg. Just out of interest, the contents are:

[root@dn0 ~]# cat /etc/hadoop/conf.cloudera.CD-YARN-VAJUGMaj/container-executor.cfg
yarn.nodemanager.linux-container-executor.group=yarn
min.user.id=1000
allowed.system.users=nobody,impala,hive,llama,hbase
banned.users=hdfs,yarn,mapred,bin

When I look at our other cluster, which uses neither Kerberos nor Cloudera Director, I see the following permissions:

[root@????? ~]# find / -name container-executor.cfg -exec ls -l {} \;
-rw-r--r-- 1 root root 318 Jun 1 2016 /log/cloudera/parcels/CDH-5.7.1-1.cdh5.7.1.p0.11/etc/hadoop/conf.empty/container-executor.cfg
-r--r--r-- 1 root hadoop 0 Jan 23 05:37 /etc/hadoop/conf.cloudera.yarn/container-executor.cfg
-r-------- 1 root hadoop 0 Jan 23 05:37 /var/run/cloudera-scm-agent/process/ccdeploy_hadoop-conf_etchadoopconf.cloudera.yarn_593636475737667066/yarn-conf/container-executor.cfg
-r-------- 1 root hadoop 0 Jan 23 05:06 /var/run/cloudera-scm-agent/process/ccdeploy_hadoop-conf_etchadoopconf.cloudera.yarn_-6875379618642481202/yarn-conf/container-executor.cfg
-r-------- 1 root hadoop 0 Jan 23 05:37 /var/run/cloudera-scm-agent/process/1056-yarn-NODEMANAGER/container-executor.cfg
[root@????? ~]# ll /etc/hadoop
total 8
lrwxrwxrwx 1 root root 29 Jan 31 08:29 conf -> /etc/alternatives/hadoop-conf
drwxr-xr-x 2 root root 4096 Jan 23 05:37 conf.cloudera.hdfs
drwxr-xr-x 2 root root 4096 Jan 31 08:29 conf.cloudera.yarn
[root@????? ~]#

These look more reasonable. Can anybody give me a clue how these permissions are getting (or not getting) set? Since this is Cloudera Director, it's out of my control how they are being set.
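
For anyone else who hits the same "Invalid conf file provided" failure (container-executor exit code 24), a quick triage sketch along the lines of the checks above — the config path is the one from this node, and the parcel path for the binary is an assumption based on the usual CDH parcel layout:

# Check the generated config and the setuid helper that has to read it
CFG=/etc/hadoop/conf.cloudera.CD-YARN-VAJUGMaj/container-executor.cfg
CE=/opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.41/lib/hadoop-yarn/bin/container-executor

stat -c '%U:%G %a %n' "$CFG"   # owner, group and octal mode of the config
ls -l "$CE"                    # is the setuid bit there? (expect ---Sr-s--- root yarn)
sudo cat "$CFG"                # is the file actually populated?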
Labels:
- Apache YARN
- Kerberos
01-16-2017
02:53 AM
I have the answer: it is blacklisted on both the server and the client.

Server:

[centos@***** ~]$ cat /etc/cloudera-director-server/application.properties | grep blacklist
lp.plugin.configuration.blacklist: sandbox

Client:

/usr/bin/cloudera-director bootstrap-remote "/etc/cloudera-director-server/deployments/*******.conf" --lp.remote.username=****** --lp.remote.password=****** --lp.remote.hostAndPort=*********:7189 --lp.plugin.configuration.blacklist=sandbox

The important part is passing --lp.plugin.configuration.blacklist as a command-line option. This now lets me use the BYON plugin out of the box. Thanks