02-01-2019
12:28 PM
@David_Schwab it was my understanding that when submitting a job with a keytab, the Spark Application Master would periodically renew the ticket using the principal and keytab, as per: https://www.cloudera.com/documentation/enterprise/5-15-x/topics/cm_sg_yarn_long_jobs.html Could it be that the ticket refresh rate is longer than the maximum ticket life?
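If it helps, the two lifetimes can be compared directly; a sketch, where the admin and user principals are placeholders:

# Client-side defaults (ticket_lifetime / renew_lifetime) in /etc/krb5.conf
grep -E "ticket_lifetime|renew_lifetime" /etc/krb5.conf

# KDC-side maximums for the principal (requires kadmin access);
# check the "Maximum ticket life" and "Maximum renewable life" fields
kadmin -p admin/admin@COMPANY.LOCAL -q "getprinc USERNAME@COMPANY.LOCAL"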
02-01-2019
10:17 AM
I am submitting a Spark job and setting both the spark.yarn.keytab and spark.yarn.principal values. The logs indicate that these variables are being set correctly:

2019-01-28 16:48:45 +0000 [INFO] from org.apache.spark.launcher.app.MAINCLASS in launcher-proc-1 - 19/01/28 16:48:45 INFO Client: Attempting to login to the Kerberos using principal: USERNAME and keytab: /home/USERNAME/USERNAME.keytab
2019-01-28 16:48:58 +0000 [INFO] from org.apache.spark.launcher.app.MAINCLASS in launcher-proc-1 - 19/01/28 16:48:58 INFO HadoopFSCredentialProvider: getting token for: hdfs://nameservice1/user/USERNAME

However, after about 8 hours of running, I receive the below exception indicating there is no longer a valid Kerberos ticket:

Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
 at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147)
 at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:122)
 at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187)
 at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:224)
 at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
 at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
 at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:192)
 ... 27 more

The job is composed of three stages: NLP, indexing the data, and writing to Parquet. Both the NLP and indexing stages complete; the exception occurs during the Parquet write. I was under the impression that when using a keytab, the ticket should remain valid for the duration of the job. Is this not the case?

(The job is being submitted using SparkLauncher and pointing to a jar. This is essentially the same as using spark-submit.)
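For context, a minimal sketch of the submission path (the main class, jar path, and keytab path are placeholders; this assumes the Spark 2.x org.apache.spark.launcher API):

import org.apache.spark.launcher.SparkLauncher

// Submit to YARN with a principal/keytab so YARN can re-obtain tokens.
val launcher = new SparkLauncher()
  .setSparkHome("/path/to/spark")
  .setAppResource("/path/to/job.jar")   // placeholder jar
  .setMainClass("MAINCLASS")            // placeholder main class
  .setMaster("yarn")
  .setDeployMode("cluster")
  .setConf("spark.yarn.principal", "USERNAME@COMPANY.LOCAL")
  .setConf("spark.yarn.keytab", "/home/USERNAME/USERNAME.keytab")

// Returns a SparkAppHandle for monitoring the submitted application.
val handle = launcher.startApplication()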
Labels:
- Apache Spark
- Kerberos
- Security
01-29-2019
09:13 AM
Note: The below process is easier if the node is a gateway node, since the correct Spark version and configuration directories will be readily available for mounting into the docker container.

The quick and dirty way is to have an installation of Spark matching your cluster's major version installed or mounted in the docker container. You will also need to mount the YARN and Hadoop configuration directories in the docker container. Mounting these will prevent you from needing to set a ton of config on submission, e.g. "spark.hadoop.yarn.resourcemanager.hostname" -> "XXX". Often these both can be set to the same value: /opt/cloudera/parcels/SPARK2/lib/spark2/conf/yarn-conf.

The SPARK_CONF_DIR, HADOOP_CONF_DIR and YARN_CONF_DIR environment variables need to be set if using spark-submit. If using SparkLauncher, they can be passed to the constructor like so:

import scala.collection.JavaConverters._
import org.apache.spark.launcher.SparkLauncher

val env = Map(
  "HADOOP_CONF_DIR" -> "/example/hadoop/path",
  "YARN_CONF_DIR" -> "/example/yarn/path"
)
val launcher = new SparkLauncher(env.asJava).setSparkHome("/path/to/mounted/spark")

If submitting to a kerberized cluster, the easiest way is to mount a keytab file and the /etc/krb5.conf file in the docker container, then set the principal and keytab using spark.yarn.principal and spark.yarn.keytab, respectively.

For ports, 8032 on the Spark master (the YARN ResourceManager address) definitely needs to be open to traffic from the docker node. I am not sure if this is the complete list of ports - could another user verify?
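To make the mounts concrete, a sketch of what the container launch might look like (the image name, keytab path, and in-container paths are illustrative, not from my actual setup):

docker run -it \
  -v /opt/cloudera/parcels/SPARK2/lib/spark2:/opt/spark2:ro \
  -v /opt/cloudera/parcels/SPARK2/lib/spark2/conf/yarn-conf:/opt/yarn-conf:ro \
  -v /etc/krb5.conf:/etc/krb5.conf:ro \
  -v /home/USERNAME/USERNAME.keytab:/keytabs/USERNAME.keytab:ro \
  -e SPARK_CONF_DIR=/opt/spark2/conf \
  -e HADOOP_CONF_DIR=/opt/yarn-conf \
  -e YARN_CONF_DIR=/opt/yarn-conf \
  my-spark-client-image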
10-06-2017
11:07 AM
1 Kudo
Setting them to lowercase didn't work immediately - what did work was going back and setting each HDFS file name to lowercase and refreshing the partitioning. Lesson learned: always use lowercase partition column names when you need to build an external table on them.
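To illustrate the fix, a sketch with placeholder names (the layout and DDL mirror the question below, with the partition keys lowercased):

-- HDFS layout after renaming, e.g.:
--   /location/year=2017/month=8/day=2/file.parquet

create external table db.table (
  field1 string,
  field2 string
)
partitioned by (year int, month int, day int)
stored as parquet
location '/location';

alter table db.table recover partitions;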
10-06-2017
09:29 AM
Hi, I currently have data sitting in an HDFS location at, say, /location. The data is partitioned by YEAR/MONTH/DAY, and the subfolder structure looks like YEAR=2017/MONTH=8/DAY=2. I am attempting to create an external table on this data, but the partitioning is not being recognized. The two commands I've tried are:

drop table if exists db.table;
create external table db.table like parquet '/location/file.parquet'
partitioned by (YEAR int, MONTH int, DAY int)
stored as parquet
location '/location';
alter table db.table recover partitions;
compute incremental stats db.table;

And...

drop table if exists db.table;
create external table db.table(
  field1 string,
  field2 string,
  ...
)
partitioned by (YEAR int, MONTH int, DAY int)
stored as parquet
location '/location/';
alter table db.table recover partitions;
compute incremental stats db.table;

In both cases, I end up with an empty table that is correctly partitioned. Calling invalidate metadata; after the fact did not resolve the issue. I've verified that the impala user is on the FACL lists for these areas. Does anyone know why it would not be finding the data? I should point out that if I ignore partitioning and instead just build a table on top of data from one day (i.e. YEAR=2017/MONTH=8/DAY=2), the data shows.
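In case it helps anyone diagnosing the same thing, these standard Impala statements show which partitions and files the table currently sees:

show partitions db.table;
show files in db.table;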
Labels:
- Apache Impala
08-07-2017
09:36 AM
Found the issue. There was a typo in our DC, which was set up as our DNS.
08-01-2017
04:18 PM
Following the Cloudera doc at https://www.cloudera.com/documentation/enterprise/5-11-x/topics/impala_proxy.html, one potential issue I see is:

"Choose the host you will use for the proxy server. Based on the Kerberos setup procedure, it should already have an entry impala/proxy_host@realm in its keytab. If not, go back over the initial Kerberos configuration steps for the keytab on each host running the impalad daemon."

After modifying the Impala Daemons Load Balancer field, the keytab files of all the workers running Impala have the haproxy principal present. Calling klist on a worker's keytab file shows:

1 08/01/2017 15:25:11 impala/worker1.company.local@COMPANY.LOCAL
1 08/01/2017 15:25:11 impala/worker2.company.local@COMPANY.LOCAL
1 08/01/2017 15:25:11 impala/worker3.company.local@COMPANY.LOCAL
1 08/01/2017 15:25:11 impala/haproxy1.company.local@COMPANY.LOCAL
1 08/01/2017 15:25:11 impala/haproxy1.company.local@COMPANY.LOCAL
1 08/01/2017 15:25:11 impala/haproxy1.company.local@COMPANY.LOCAL

It looks like the impala principal for haproxy is correctly present. However, I don't believe there is a keytab present on the haproxy node itself. Does there need to be?
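(For reference, the listing above is the output of klist -kt against the worker's keytab; the path here is a placeholder:)

klist -kt /path/to/impala.keytab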
08-01-2017
03:34 PM
Using the FQDN in the impala-shell statement results in the same error. SSL was configured following: https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Load_Balancer_Administration/install_haproxy_example1.html

I've verified the changes outlined there were made to haproxy-https.xml and that the SELinux settings are correct. As well, a self-signed cert was used to construct the PEM file in /etc/ssl/private.

As for the last part - how do I ensure that Impala is configured to use a particular PEM file? Is there a relevant config setting?
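For context, the settings I'd expect to matter are the impalad TLS startup flags, something like the following (paths are placeholders; I believe these correspond to the TLS/SSL certificate and private key fields for Impala in Cloudera Manager):

--ssl_server_certificate=/path/to/server.pem
--ssl_private_key=/path/to/server.key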
07-31-2017
04:44 PM
Yes - the FQDN is of the format: haproxy.company.local

As well, the principal looks like: impala/haproxy.company.local@COMPANY.LOCAL

This mirrors the other principals (for example: impala/master-123.company.local@COMPANY.LOCAL).
07-31-2017
03:47 PM
I generated the missing credentials in CM and restarted the cluster's services, which led me to the above error. I believe the issue is tied to the fact that the haproxy node was added onto the cluster and isn't managed by CDH.