Member since: 04-20-2019 | Posts: 9 | Kudos Received: 0 | Solutions: 0
04-25-2019 12:43 AM
Let's try to rule out the different classes of problems:

1. Are you able to read from and write to Kerberos-enabled HDFS with PySpark? Is Kudu the only Kerberos-enabled service that fails from within PySpark?

2. Have you confirmed that the Spark driver is running on the host and in the shell where you ran kinit, rather than being started inside a YARN container? If the driver runs inside YARN (cluster mode), it does not see your local ticket cache, so you would need to supply a keytab (for example via spark-submit's --principal and --keytab options) for the job to authenticate.

3. Have you tried connecting to Kudu from the regular Spark shell, and does that work? For examples, see https://kudu.apache.org/docs/developing.html#_kudu_integration_with_spark (a minimal sketch follows below).
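For reference, here is a rough sketch of what step 3 might look like from spark-shell, following the Kudu Spark integration docs linked above. The connector version, Kudu master address, and table name are placeholders you would replace for your cluster:

```scala
// Launch the shell with the Kudu connector on the classpath, e.g.:
//   spark-shell --packages org.apache.kudu:kudu-spark2_2.11:1.9.0
// (adjust the artifact and version to match your Spark and Kudu releases).
// `spark` below is the SparkSession that spark-shell provides.

import org.apache.kudu.spark.kudu._

// Read a Kudu table into a DataFrame. The master address and table name
// are placeholders for your environment.
val df = spark.read
  .options(Map(
    "kudu.master" -> "kudu-master.example.com:7051",
    "kudu.table"  -> "impala::default.my_table"))
  .format("org.apache.kudu.spark.kudu")
  .load()

// If this works from spark-shell but not from PySpark, the problem is more
// likely on the PySpark/Kerberos side than in Kudu itself.
df.show(10)
```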
04-22-2019 04:07 PM
Thanks, I will try that. Do you have any suggestions on the best way to implement a Type 2 slowly changing dimension (SCD2) in Hadoop? Our dimension table is fed from several sources, and all of them need to be able to load/update the dimension table concurrently.
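Purely to make the question concrete (not a recommendation), here is a rough sketch of the SCD2 pattern being described, expressed with Spark DataFrames. All table and column names (dim_customer, stg_customer_updates, customer_id, effective_date, end_date, is_current) are hypothetical, and the concurrent-load concern is not solved here; it usually needs a transactional table format or a serialized merge step.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

val spark = SparkSession.builder().appName("scd2-sketch").enableHiveSupport().getOrCreate()
import spark.implicits._

// Existing SCD2 dimension and one source's staged updates (hypothetical tables).
val dim     = spark.table("dim_customer")
val updates = spark.table("stg_customer_updates")

// Business keys that have a newer version arriving in this load.
val changedKeys = updates.select($"customer_id").distinct()

// Close out the current version of every changed key.
val closedOut = dim
  .filter($"is_current" === true)
  .join(changedKeys, Seq("customer_id"), "left_semi")
  .withColumn("is_current", lit(false))
  .withColumn("end_date", current_date())

// Keep history rows and current rows not touched by this load.
val untouched = dim.filter($"is_current" === false)
  .unionByName(
    dim.filter($"is_current" === true)
       .join(changedKeys, Seq("customer_id"), "left_anti"))

// Open a new current version for each incoming row (assumes the staging table
// carries the same business columns as the dimension, minus the SCD columns).
val newVersions = updates
  .withColumn("effective_date", current_date())
  .withColumn("end_date", lit(null).cast("date"))
  .withColumn("is_current", lit(true))

// Rewrite the dimension. On plain HDFS/Hive tables this is a full overwrite,
// which is exactly why concurrent loads from several sources are hard without
// a transactional storage layer (Hive ACID, Kudu, etc.) or a serialized merge.
untouched
  .unionByName(closedOut)
  .unionByName(newVersions)
  .write.mode("overwrite").saveAsTable("dim_customer_scd2_new")
```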