Member since: 01-16-2017
Posts: 12
Kudos Received: 1
Solutions: 1

My Accepted Solutions

Title | Views | Posted
---|---|---
 | 4004 | 01-16-2017 02:53 AM
03-06-2017
03:29 AM
@venkatsambath This is great information, many thanks!
03-01-2017
06:36 AM
Hi all, I'm mainly looking for advice here around Kafka and disaster-recovery failover. Is there any way to use Kafka through CNAMEs/a load balancer when using Kerberos? When I try it, I get the SPN error below, which makes sense and is behaviour I would fully expect. The only way I can picture this working would be to include a CNAME resolver in the Java client code before establishing a connection.

Using the New Consumer API, on any new connection:
1) Provide the CNAME hostname in the config
2) Resolve the CNAME to the list of A records for the broker hosts
3) Pass these into the New Consumer as the bootstrap servers

This should work, however it would involve custom code (a rough sketch of the idea follows the broker log below). Are there any ideas that might work without having to resort to this?

---------------

Consumer log:

17/03/01 14:12:06 DEBUG consumer.KafkaConsumer: Subscribed to topic(s): build_smoke_test
17/03/01 14:12:06 DEBUG clients.NetworkClient: Initiating connection to node -1 at lb.cdh-poc-cluster.internal.cdhnetwork:9093.
17/03/01 14:12:06 DEBUG authenticator.SaslClientAuthenticator: Set SASL client state to SEND_HANDSHAKE_REQUEST
17/03/01 14:12:06 DEBUG authenticator.SaslClientAuthenticator: Creating SaslClient: client=alex@CDH-POC-CLUSTER.INTERNAL.CDHNETWORK;service=kafka;serviceHostname=lb.cdh-poc-cluster.internal.cdhnetwork;mechs=[GSSAPI]
17/03/01 14:12:06 DEBUG network.Selector: Connection with lb.cdh-poc-cluster.internal.cdhnetwork/172.3.1.10 disconnected
java.io.EOFException
at org.apache.kafka.common.network.SslTransportLayer.read(SslTransportLayer.java:488)
at org.apache.kafka.common.network.NetworkReceive.readFromReadableChannel(NetworkReceive.java:81)

Broker log:

2017-03-01 14:12:08,330 DEBUG org.apache.kafka.common.security.authenticator.SaslServerAuthenticator: Set SASL server state to HANDSHAKE_REQUEST
2017-03-01 14:12:08,330 DEBUG org.apache.kafka.common.security.authenticator.SaslServerAuthenticator: Handle Kafka request SASL_HANDSHAKE
2017-03-01 14:12:08,330 DEBUG org.apache.kafka.common.security.authenticator.SaslServerAuthenticator: Using SASL mechanism 'GSSAPI' provided by client
2017-03-01 14:12:08,331 DEBUG org.apache.kafka.common.security.authenticator.SaslServerAuthenticator: Creating SaslServer for kafka/kf0.cdh-poc-cluster.internal.cdhnetwork@CDH-POC-CLUSTER.INTERNAL.CDHNETWORK with mechanism GSSAPI
2017-03-01 14:12:08,331 DEBUG org.apache.kafka.common.security.authenticator.SaslServerAuthenticator: Set SASL server state to AUTHENTICATE
2017-03-01 14:12:08,334 DEBUG org.apache.kafka.common.security.authenticator.SaslServerAuthenticator: Set SASL server state to FAILED
2017-03-01 14:12:08,334 DEBUG org.apache.kafka.common.network.Selector: Connection with lb.cdh-poc-cluster.internal.cdhnetwork/172.3.1.10 disconnected
java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: Failure unspecified at GSS-API level (Mechanism level: Checksum failed)]
at org.apache.kafka.common.security.authenticator.SaslServerAuthenticator.authenticate(SaslServerAuthenticator.java:243)
at org.apache.kafka.common.network.KafkaChannel.prepare(KafkaChannel.java:64)
at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:318)
at org.apache.kafka.common.network.Selector.poll(Selector.java:283)
at kafka.network.Processor.poll(SocketServer.scala:472)
at kafka.network.Processor.run(SocketServer.scala:412)
at java.lang.Thread.run(Thread.java:745)
Caused by: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: Failure unspecified at GSS-API level (Mechanism level: Checksum failed)]
at com.sun.security.sasl.gsskerb.GssKrb5Server.evaluateResponse(GssKrb5Server.java:199)
at org.apache.kafka.common.security.authenticator.SaslServerAuthenticator.authenticate(SaslServerAuthenticator.java:228)
... 6 more
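
For anyone curious what the custom-code route would look like, here is a rough shell sketch of the three steps above (in the actual Java client the same resolution would be done with InetAddress.getAllByName() before building bootstrap.servers). The hostname, port and topic come from the logs in this post; the console-consumer flags and the contents of client.properties (SASL_SSL and Kerberos settings) are assumptions rather than a tested recipe.

#!/usr/bin/env bash
# Sketch only: resolve the load-balancer CNAME to the broker records and
# bootstrap against those directly, so Kerberos sees the real broker SPNs.

CNAME=lb.cdh-poc-cluster.internal.cdhnetwork   # 1) the CNAME handed to clients in config
PORT=9093

# 2) Resolve the CNAME to the underlying records for the broker hosts
BROKERS=$(dig +short "$CNAME" | sed 's/\.$//')

# 3) Build a bootstrap.servers list from the resolved entries
BOOTSTRAP=$(echo "$BROKERS" | sed "s/$/:$PORT/" | paste -sd, -)
echo "bootstrap.servers=$BOOTSTRAP"

# Hand the resolved list to a consumer instead of the CNAME itself
kafka-console-consumer --bootstrap-server "$BOOTSTRAP" \
  --topic build_smoke_test --consumer.config client.properties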
Labels:
- Apache Kafka
02-20-2017
07:32 AM
1 Kudo
@venkatsambath Thanks for the confirmation. I'm just thinking in terms of high availability: have we now introduced a single point of failure for Impala? How would you make the load balancer itself highly available? You could run multiple load balancers and balance across them with DNS CNAMEs, but since we are using Kerberos the name we point requests at must be an A record, so that won't work. Is any of my thinking above incorrect? How would we make the service highly available when using load balancers and Kerberos for Impala?
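
(Thinking out loud, purely as an illustration of what might square that circle: a floating VIP, e.g. keepalived/VRRP, shared by two load-balancer nodes keeps a single A record for the lb name while removing the single host. The sketch below is an assumption, not something I have tested here — the interface name and priorities are made up, and the VIP is simply the address the lb record already resolves to in this cluster.)

# Illustrative only: keep lb.cdh-poc-cluster.internal.cdhnetwork as a single A
# record pointing at a VIP that can fail over between two load-balancer nodes.
cat > /etc/keepalived/keepalived.conf <<'EOF'
vrrp_instance impala_lb {
    state MASTER              # BACKUP (with a lower priority) on the second node
    interface eth0            # assumed interface name
    virtual_router_id 51
    priority 100
    virtual_ipaddress {
        172.3.1.10/24         # the address the lb A record resolves to
    }
}
EOF
systemctl restart keepalived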
02-17-2017
05:35 AM
Hi, I am configuring Impala to use a load balancer and Kerberos. I have this setup working, however I am unable to query each daemon directly. Is this normal behavior? Below are a successful and an unsuccessful query:

[centos@kf0 ~]$ klist
Ticket cache: FILE:/tmp/krb5cc_1000
Default principal: alex@CDH-POC-CLUSTER.INTERNAL.CDHNETWORK
Valid starting Expires Service principal
02/17/17 13:08:30 02/18/17 13:08:30 krbtgt/CDH-POC-CLUSTER.INTERNAL.CDHNETWORK@CDH-POC-CLUSTER.INTERNAL.CDHNETWORK
renew until 02/24/17 13:08:30
02/17/17 13:08:51 02/18/17 13:08:30 impala/dn1.cdh-poc-cluster.internal.cdhnetwork@CDH-POC-CLUSTER.INTERNAL.CDHNETWORK
renew until 02/24/17 13:08:30
02/17/17 13:14:00 02/18/17 13:08:30 impala/dn2.cdh-poc-cluster.internal.cdhnetwork@CDH-POC-CLUSTER.INTERNAL.CDHNETWORK
renew until 02/24/17 13:08:30
02/17/17 13:27:16 02/18/17 13:08:30 impala/lb.cdh-poc-cluster.internal.cdhnetwork@CDH-POC-CLUSTER.INTERNAL.CDHNETWORK
renew until 02/24/17 13:08:30
[centos@kf0 ~]$ impala-shell --ssl --impalad=lb.cdh-poc-cluster.internal.cdhnetwork:21000 -q "show tables" --ca_cert "/etc/ipa/ca.crt" -k -V
Starting Impala Shell using Kerberos authentication
Using service name 'impala'
SSL is enabled
Connected to lb.cdh-poc-cluster.internal.cdhnetwork:21000
Server version: impalad version 2.7.0-cdh5.10.0 RELEASE (build 785a073cd07e2540d521ecebb8b38161ccbd2aa2)
Query: show tables
Fetched 0 row(s) in 0.43s
[centos@kf0 ~]$ impala-shell --ssl --impalad=dn1.cdh-poc-cluster.internal.cdhnetwork:21000 -q "show tables" --ca_cert "/etc/ipa/ca.crt" -k -V
Starting Impala Shell using Kerberos authentication
Using service name 'impala'
SSL is enabled
Error connecting: TTransportException, TSocket read 0 bytes
Not connected to Impala, could not execute queries.

In the logs I see:

E0217 13:27:36.607559 6262 authentication.cc:160] SASL message (Kerberos (external)): GSSAPI Error: Unspecified GSS failure. Minor code may provide more information (Request ticket server impala/dn1.cdh-poc-cluster.internal.cdhnetwork@CDH-POC-CLUSTER.INTERNAL.CDHNETWORK found in keytab but does not match server principal impala/lb.cdh-poc-cluster.internal.cdhnetwork@)
I0217 13:27:36.625763 6262 thrift-util.cc:111] SSL_shutdown: error code: 0
I0217 13:27:36.625901 6262 thrift-util.cc:111] TThreadPoolServer: TServerTransport died on accept: SASL(-13): authentication failure: GSSAPI Failure: gss_accept_sec_context

However, in the keytab file I can see the dn1 principal is there:

[root@dn1 impalad]# klist -kt /run/cloudera-scm-agent/process/64-impala-IMPALAD/impala.keytab
Keytab name: FILE:/run/cloudera-scm-agent/process/64-impala-IMPALAD/impala.keytab
KVNO Timestamp Principal
---- ------------------- ------------------------------------------------------
1 02/17/2017 12:03:52 impala/lb.cdh-poc-cluster.internal.cdhnetwork@CDH-POC-CLUSTER.INTERNAL.CDHNETWORK
1 02/17/2017 12:03:52 impala/lb.cdh-poc-cluster.internal.cdhnetwork@CDH-POC-CLUSTER.INTERNAL.CDHNETWORK
1 02/17/2017 12:03:52 impala/dn1.cdh-poc-cluster.internal.cdhnetwork@CDH-POC-CLUSTER.INTERNAL.CDHNETWORK
1 02/17/2017 12:03:52 impala/dn1.cdh-poc-cluster.internal.cdhnetwork@CDH-POC-CLUSTER.INTERNAL.CDHNETWORK
[root@dn1 impalad]#

And the daemon principals are set correctly:

[root@dn1 impalad]# cat /run/cloudera-scm-agent/process/64-impala-IMPALAD/impala-conf/impalad_flags | grep princ
-principal=impala/lb.cdh-poc-cluster.internal.cdhnetwork@CDH-POC-CLUSTER.INTERNAL.CDHNETWORK
-be_principal=impala/dn1.cdh-poc-cluster.internal.cdhnetwork@CDH-POC-CLUSTER.INTERNAL.CDHNETWORK
[root@dn1 impalad]#

So is it normal behaviour that the daemons can no longer be queried directly once Kerberos has been enabled with a load balancer, or am I doing something wrong? Thanks
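
For what it's worth, here is a quick way to see the mismatch from the client side. These commands only use hostnames and paths already shown above; the interpretation — that the shell requests the SPN built from the host passed to --impalad, while the daemon only accepts the principal it was started with via -principal — is my reading of the error above, not something confirmed by the docs.

# kvno asks the KDC for a service ticket for the named SPN, which (judging by the
# klist output above) is the same SPN the shell requests for a given --impalad host.
kvno impala/lb.cdh-poc-cluster.internal.cdhnetwork@CDH-POC-CLUSTER.INTERNAL.CDHNETWORK   # LB connection
kvno impala/dn1.cdh-poc-cluster.internal.cdhnetwork@CDH-POC-CLUSTER.INTERNAL.CDHNETWORK  # direct dn1 connection

# The daemon's externally-facing principal is whatever -principal it was started with:
grep '^-principal' /run/cloudera-scm-agent/process/64-impala-IMPALAD/impala-conf/impalad_flags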
Labels:
- Apache Impala
- Cloudera Manager
- Kerberos
02-15-2017
08:07 AM
Okay, I have it. I was using the parcel_provisioner.sh script to preload the parcels into Docker images; when doing the pre-extraction, the permissions on container-executor weren't being set properly. For now, turning off the pre-extraction works. I'll test by manually setting the permissions, although I'm wondering how many other permissions aren't set properly. FYI, the root:hadoop 400 permissions on container-executor.cfg work because of the setuid flag on the container-executor binary. Now everything makes sense. Thanks for the help!
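
In case it helps anyone else preloading parcels into images, this is roughly what I mean by setting the permissions manually. The expected bits come from the Hadoop secure-container documentation (a setuid/setgid binary owned by root, group-owned by the group configured in yarn.nodemanager.linux-container-executor.group, i.e. yarn here); the parcel path is this cluster's CDH 5.10 parcel and is an assumption for anything else.

# Sketch: verify/restore the setuid container-executor after parcel pre-extraction.
CE=/opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.41/lib/hadoop-yarn/bin/container-executor

ls -l "$CE"              # expect something like ---Sr-s--- root yarn
chown root:yarn "$CE"    # group must match yarn.nodemanager.linux-container-executor.group
chmod 6050 "$CE"         # setuid + setgid, no access for "other"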
02-15-2017
01:34 AM
Hi, looking at that document, I see the following under conf/container-executor.cfg:

"The executable requires a configuration file called container-executor.cfg to be present in the configuration directory passed to the mvn target mentioned above. The configuration file must be owned by the user running NodeManager (user yarn in the above example), group-owned by anyone and should have the permissions 0400 or r--------."

This makes sense, because if the container-executor runs as yarn, how else could it read the configuration? Does anyone have a running Kerberos cluster to confirm the permissions?
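
For anyone willing to check on a live cluster, a one-liner that would confirm it (path taken from the output earlier in this thread):

# Print owner, group and octal mode of the generated container-executor.cfg
stat -c '%U:%G %a %n' /run/cloudera-scm-agent/process/*-yarn-NODEMANAGER/container-executor.cfg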
02-14-2017
09:05 AM
Actually, digging into this a bit more, I think it is the permissions on container-executor.cfg that are causing the issue. The NodeManager is launched as the yarn user:

yarn 17040 17035 0 16:53 ? 00:00:00 python2.7 /usr/lib64/cmf/agent/build/env/bin/cmf-redactor /usr/lib64/cmf/service/yarn/yarn.sh nodemanager

And from http://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/SecureContainer.html:

"On Linux environment the secure container executor is the LinuxContainerExecutor. It uses an external program called the container-executor to launch the container. This program has the setuid access right flag set which allows it to launch the container with the permissions of the YARN application user."

This would explain why this is only happening on secure clusters built by Cloudera Director. It seems that container-executor.cfg is created and populated at NodeManager restart time, so I cannot change permissions on the cfg file to test. Is there a reason why these cfg files are created with 400 and not 444? Should they be 444 on secure clusters? Can this be changed, and where? Thanks
02-14-2017
07:29 AM
I used Cloudera Director to build a cluster without Kerberos. YARN came up okay and the permissions were the following:

[root@dn2 ~]# find / -name container-executor.cfg -exec ls -l {} \;
-rw-r--r--. 13 cloudera-scm cloudera-scm 318 Jan 20 21:38 /opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.41/etc/hadoop/conf.empty/container-executor.cfg
-r--------. 1 root hadoop 0 Feb 14 15:16 /run/cloudera-scm-agent/process/44-yarn-NODEMANAGER/container-executor.cfg
-r--------. 1 root hadoop 0 Feb 14 15:16 /etc/hadoop/conf.cloudera.CD-YARN-uMqvpvqg/container-executor.cfg

They are the same permissions, so it seems the permissions are not the issue. Perhaps it is the contents. Any clues?
02-14-2017
05:34 AM
Hi, I am trying to build Kerberos-enabled clusters using Cloudera Director. During FirstRun pretty much all services come online except YARN; HDFS, Hue, ZooKeeper and Kafka are all fine. When bringing up the NodeManager I see the following in the role logs:

Error starting NodeManager
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to initialize container executor
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:251)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:544)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:591)
Caused by: java.io.IOException: Linux container executor not configured properly (error=24)
at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:198)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:249)
... 3 more
Caused by: ExitCodeException exitCode=24: Invalid conf file provided : /etc/hadoop/conf.cloudera.CD-YARN-VAJUGMaj/container-executor.cfg
at org.apache.hadoop.util.Shell.runCommand(Shell.java:601)
at org.apache.hadoop.util.Shell.run(Shell.java:504)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:786)
at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:192)
... 4 more

Looking on this node I see:

[root@dn0 ~]# find / -name container-executor.cfg -exec ls -l {} \;
-rw-r--r--. 13 cloudera-scm cloudera-scm 318 Jan 20 21:38 /opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.41/etc/hadoop/conf.empty/container-executor.cfg
-r--------. 1 root hadoop 156 Feb 14 12:13 /run/cloudera-scm-agent/process/52-yarn-NODEMANAGER/container-executor.cfg
-r--------. 1 root hadoop 156 Feb 14 12:13 /etc/hadoop/conf.cloudera.CD-YARN-VAJUGMaj/container-executor.cfg
find: ‘/proc/17426’: No such file or directory
[root@dn0 ~]# ll /etc/hadoop/conf.cloudera.CD-YARN-VAJUGMaj/
total 52
-rw-r--r--. 1 root root 20 Feb 14 12:10 __cloudera_generation__
-rw-r--r--. 1 root root 67 Feb 14 12:10 __cloudera_metadata__
-r--------. 1 root hadoop 156 Feb 14 12:13 container-executor.cfg
-rw-r--r--. 1 root root 3895 Feb 14 12:10 core-site.xml
-rw-r--r--. 1 root root 617 Feb 14 12:10 hadoop-env.sh
-rw-r--r--. 1 root root 2684 Feb 14 12:10 hdfs-site.xml
-rw-r--r--. 1 root root 314 Feb 14 12:10 log4j.properties
-rw-r--r--. 1 root root 5011 Feb 14 12:10 mapred-site.xml
-rw-r--r--. 1 root root 315 Feb 14 12:10 ssl-client.xml
-rw-r--r--. 1 root hadoop 684 Feb 14 12:13 topology.map
-rwxr-xr-x. 1 root hadoop 1594 Feb 14 12:13 topology.py
-rw-r--r--. 1 root root 3872 Feb 14 12:10 yarn-site.xml

And /etc/hadoop looks like:

[root@dn0 ~]# ll /etc/hadoop
total 8
lrwxrwxrwx. 1 root root 29 Feb 14 12:10 conf -> /etc/alternatives/hadoop-conf
drwxr-xr-x. 2 root root 4096 Feb 14 12:10 conf.cloudera.CD-HDFS-gbUrTxBt
drwxr-xr-x. 2 root root 4096 Feb 14 12:13 conf.cloudera.CD-YARN-VAJUGMaj
[root@dn0 ~]# ll /etc/alternatives/hadoop-conf
lrwxrwxrwx. 1 root root 42 Feb 14 12:10 /etc/alternatives/hadoop-conf -> /etc/hadoop/conf.cloudera.CD-YARN-VAJUGMaj

The YARN process runs as the yarn user, I presume, so for some reason the wrong permissions are being given to container-executor.cfg. Just out of interest, the contents are:

[root@dn0 ~]# cat /etc/hadoop/conf.cloudera.CD-YARN-VAJUGMaj/container-executor.cfg
yarn.nodemanager.linux-container-executor.group=yarn
min.user.id=1000
allowed.system.users=nobody,impala,hive,llama,hbase
banned.users=hdfs,yarn,mapred,bin

When I look at our other cluster, which uses neither Kerberos nor Cloudera Director, I see the following permissions:

[root@????? ~]# find / -name container-executor.cfg -exec ls -l {} \;
-rw-r--r-- 1 root root 318 Jun 1 2016 /log/cloudera/parcels/CDH-5.7.1-1.cdh5.7.1.p0.11/etc/hadoop/conf.empty/container-executor.cfg
-r--r--r-- 1 root hadoop 0 Jan 23 05:37 /etc/hadoop/conf.cloudera.yarn/container-executor.cfg
-r-------- 1 root hadoop 0 Jan 23 05:37 /var/run/cloudera-scm-agent/process/ccdeploy_hadoop-conf_etchadoopconf.cloudera.yarn_593636475737667066/yarn-conf/container-executor.cfg
-r-------- 1 root hadoop 0 Jan 23 05:06 /var/run/cloudera-scm-agent/process/ccdeploy_hadoop-conf_etchadoopconf.cloudera.yarn_-6875379618642481202/yarn-conf/container-executor.cfg
-r-------- 1 root hadoop 0 Jan 23 05:37 /var/run/cloudera-scm-agent/process/1056-yarn-NODEMANAGER/container-executor.cfg
[root@????? ~]# ll /etc/hadoop
total 8
lrwxrwxrwx 1 root root 29 Jan 31 08:29 conf -> /etc/alternatives/hadoop-conf
drwxr-xr-x 2 root root 4096 Jan 23 05:37 conf.cloudera.hdfs
drwxr-xr-x 2 root root 4096 Jan 31 08:29 conf.cloudera.yarn
[root@????? ~]#

These look more reasonable. Can anybody give me a clue how these permissions are getting (or not getting) set? Since this is Cloudera Director, it's out of my control how they are being set.
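
For anyone else who hits the same "Invalid conf file provided" failure (container-executor exit code 24), a quick triage sketch along the lines of the checks above — the config path is the one from this node, and the parcel path for the binary is an assumption based on the usual CDH parcel layout:

# Check the generated config and the setuid helper that has to read it
CFG=/etc/hadoop/conf.cloudera.CD-YARN-VAJUGMaj/container-executor.cfg
CE=/opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.41/lib/hadoop-yarn/bin/container-executor

stat -c '%U:%G %a %n' "$CFG"   # owner, group and octal mode of the config
ls -l "$CE"                    # is the setuid bit there? (expect ---Sr-s--- root yarn)
sudo cat "$CFG"                # is the file actually populated?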
Labels:
- Apache YARN
- Kerberos
01-16-2017
02:53 AM
I have the answer: it is blacklisted on both the server and the client.

Server:

[centos@***** ~]$ cat /etc/cloudera-director-server/application.properties | grep blacklist
lp.plugin.configuration.blacklist: sandbox

Client:

/usr/bin/cloudera-director bootstrap-remote "/etc/cloudera-director-server/deployments/*******.conf" --lp.remote.username=****** --lp.remote.password=****** --lp.remote.hostAndPort=*********:7189 --lp.plugin.configuration.blacklist=sandbox

The important part is passing --lp.plugin.configuration.blacklist as a command-line option. This now lets me use the BYON plugin out of the box. Thanks