Member since: 11-04-2015
Posts: 260
Kudos Received: 44
Solutions: 33
My Accepted Solutions
Title | Views | Posted
--- | --- | ---
  | 2664 | 05-16-2024 03:10 AM
  | 1566 | 01-17-2024 01:07 AM
  | 1573 | 12-11-2023 02:10 AM
  | 2309 | 10-11-2023 08:42 AM
  | 1609 | 09-07-2023 01:08 AM
05-02-2022
08:36 AM
Hi @gfragkos, thanks for checking. Let's step back then. Is the Impala service TLS/SSL enabled at all? Can you verify that with openssl tools, like: echo | openssl s_client -connect cdp-tdh-de3-master0.cdp-tdh.u5te-1stu.cloudera.site:21050 -CAfile /var/lib/cloudera-scm-agent/agent-cert/cm-auto-global_cacerts.pem
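For reference, a minimal sketch of how to read the result (host and CA file are the ones from the command above; the grep targets are standard openssl s_client output):

```bash
# If TLS is enabled on the port, the handshake completes and openssl
# prints the certificate chain plus a verification result:
echo | openssl s_client \
    -connect cdp-tdh-de3-master0.cdp-tdh.u5te-1stu.cloudera.site:21050 \
    -CAfile /var/lib/cloudera-scm-agent/agent-cert/cm-auto-global_cacerts.pem \
    2>/dev/null | grep -E 'subject=|issuer=|Verify return code'

# "Verify return code: 0 (ok)" means the chain validates against the CA file.
# If the port is not TLS-enabled, the handshake fails early instead
# (typically with an error such as "wrong version number").
```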
04-28-2022
08:24 AM
Hello Gozde @gfragkos , Have you checked whether the connectivity works with the given sslTrustStore file from a Java-based client (for example beeline)? From what I can see, your application uses unixODBC to connect to a CDP / Impala service. However, from the shared connection details, the truststore is a Java keystore (JKS) file, and since "nanodbc.cpp" is not a Java-based application, it probably cannot recognize that as a valid truststore. Please try a "pem" format truststore file instead. Please also review the Impala ODBC Driver documentation: https://downloads.cloudera.com/connectors/impala_odbc_2.6.14.1016/Cloudera-ODBC-Connector-for-Impala-Install-Guide.pdf Thanks, Miklos
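If the JKS file is the only truststore at hand, a minimal sketch for exporting its CA certificate into a PEM file with keytool (the file names, alias, and store password here are placeholders; check the `keytool -list` output for the real alias):

```bash
# List the aliases in the JKS truststore (placeholder file/password):
keytool -list -keystore truststore.jks -storepass changeit

# Export one CA certificate in PEM ("-rfc") format; "ca_alias" is a
# placeholder for an alias shown by the previous command:
keytool -exportcert -rfc -alias ca_alias \
    -keystore truststore.jks -storepass changeit -file ca.pem

# Then point the ODBC DSN's trusted-certificates setting at ca.pem.
```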
04-27-2022
12:54 AM
Hi @jarededrake , that's a good lead. The issue now seems to be that the cluster has Kerberos enabled, and that needs some extra configuration. In the workflow editor, in the upper right corner of the Spark action, you will find a cogwheel icon for advanced settings. There, on the Credentials tab, enable the "hcat" and "hbase" credentials to let the Spark client obtain delegation tokens for the Hive (Hive Metastore) and HBase services, in case the Spark application wants to use those services (Spark does not know this in advance, so it obtains those delegation tokens regardless). If you are sure that the Spark application will not connect to Hive (using Spark SQL) or HBase, you can also disable this behavior by adding the three options shown below to the Spark action's option list, but it's easier to just enable these credentials on the settings page. For similar Kerberos-related issues in other actions, please see the following guide: https://gethue.com/hadoop-tutorial-oozie-workflow-credentials-with-a-hive-action-with-kerberos/
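The options, one per line as they go into the Spark action's option list:

```bash
--conf spark.security.credentials.hadoopfs.enabled=false
--conf spark.security.credentials.hbase.enabled=false
--conf spark.security.credentials.hive.enabled=false
```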
04-26-2022
05:09 AM
Hi @jarededrake , sorry for the delay, I was away for a couple of days. You should use your thin jar (the application only, without the dependencies) from the target directory ("SparkTutorial-1.0-SNAPSHOT.jar"). The NoClassDefFoundError for SparkConf suggests that you've tried a Java action. It is highly recommended to use a Spark action in the Oozie workflow editor when running a Spark application, to make sure the environment is set up properly for the application.
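As a sketch, assuming a standard Maven project (which the target directory and the -SNAPSHOT name suggest), the thin jar is simply what the default package phase produces:

```bash
# Without shade/assembly plugins, "package" builds the thin jar
# (application classes only) into target/:
mvn clean package
ls target/SparkTutorial-1.0-SNAPSHOT.jar
```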
04-14-2022
09:16 AM
So is it "/tmp/kbr5cc_dffe" or "krb5cc_cldr"? And where do you see "KRB5CCNAME=/tmp/kbr5cc_dffe"? The "krb5cc_cldr" cache is used for all services (at least all the ones I've quickly verified), so we can say it's effectively hardcoded. It is in any case "private" to the process itself: it holds the Kerberos ticket cache that only that process uses (and renews if needed).
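If you want to inspect one of these caches, a sketch with the standard MIT Kerberos tool (the process directory name is a placeholder):

```bash
# Show the principal and tickets held in a specific credential cache:
klist -c /var/run/cloudera-scm-agent/process/<NNN-service-ROLE>/krb5cc_cldr
```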
04-14-2022
09:12 AM
I see. Have you verified that the built jar contains this package structure and these class names? Can you also show where the jar is uploaded and how it is referenced in the Oozie workflow? Thanks, Miklos
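A quick way to check both, sketched below (the jar name is from the earlier posts; the HDFS path is a placeholder for wherever the workflow workspace lives):

```bash
# Verify the package structure / class names inside the built jar:
jar tf SparkTutorial-1.0-SNAPSHOT.jar | grep -i 'Main.class'

# Verify where the jar was uploaded, e.g. the workflow's lib directory
# (placeholder path):
hdfs dfs -ls /user/<username>/<workspace>/lib/
```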
04-14-2022
07:42 AM
Hi, I'm doing well, thank you, hope you're good too. That property usually points to a relative path, which exists in the process directory: KRB5CCNAME='krb5cc_cldr'. If that's not the case, I would check whether the root user's (or perhaps the "cloudera-scm" user's) .bashrc file has overridden the KRB5CCNAME environment variable by any chance.
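Two quick checks, as a sketch (the PID below is a placeholder for the service process in question):

```bash
# Find the cloudera-scm user's home directory, then check both users'
# .bashrc files for a KRB5CCNAME override:
getent passwd cloudera-scm
grep -H KRB5CCNAME /root/.bashrc \
    "$(getent passwd cloudera-scm | cut -d: -f6)/.bashrc" 2>/dev/null

# Inspect the actual environment of a running process (placeholder PID):
tr '\0' '\n' < /proc/<pid>/environ | grep KRB5CCNAME
```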
04-14-2022
01:45 AM
Hi @yagoaparecidoti , in general, the "supervisor.conf" in the process directory (in fact the whole process directory) is prepared by Cloudera Manager (the server) before starting a process: the CM server sends the whole package of information, including config files, to the CM agent, which extracts it into a new process directory. The supervisor.conf file contains all the environment- and command-related information the Supervisor daemon needs to start the process. Some default values may be taken from the cluster or from the service type. Do you have a specific question about it?
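To look at a concrete example, a sketch (the process directory path is the CM agent default; the role directory name is a placeholder):

```bash
# Process directories created by the CM agent (newest first):
ls -t /var/run/cloudera-scm-agent/process/ | head

# The generated supervisor.conf for one role instance (placeholder name):
cat /var/run/cloudera-scm-agent/process/<NNN-service-ROLE>/supervisor.conf
```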
04-13-2022
02:36 AM
Hi @Seaport , the "RegexSerDe" is in the contrib package, which is not officially supported; as such, you can use it in some parts of the platform, but the different components may not give you full support for it. I would recommend preprocessing the data files into a commonly consumable format (such as CSV) before ingesting them into the cluster. Alternatively, you can ingest the data into a table that has only a single (string) column, and then do the processing/validation/formatting/transformation while inserting it into a proper final table with the columns you need. During the insert you can still use "regex" or "substring" type functions / UDFs to extract the fields you need from the fixed-width data files (i.e. from the single-column table). I hope this helps, Best regards, Miklos
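A minimal sketch of the second approach via beeline (table names, column widths, and the input path are hypothetical; substr() and trim() are standard Hive built-ins):

```bash
# Stage the fixed-width file into a one-column table, then cut the
# fields out with substr() while inserting into the final table:
beeline -u "$JDBC_URL" -e "
CREATE TABLE staging_raw (raw_line STRING);
LOAD DATA INPATH '/tmp/fixed_width.txt' INTO TABLE staging_raw;

CREATE TABLE final_table AS
SELECT trim(substr(raw_line, 1, 10))  AS customer_id,
       trim(substr(raw_line, 11, 20)) AS customer_name
FROM staging_raw;
"
```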
04-13-2022
02:03 AM
Hi @jarededrake , The "ClassNotFoundException: Class Hortonwork.SparkTutorial.Main not found" suggests that the Java program's main class package name has a typo in your workflow definition: "Hortonwork" should be "Hortonworks". Can you check that?
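One quick way to confirm, sketched below (the path to the exported workflow.xml is a placeholder):

```bash
# Look for the misspelled package name in the workflow definition;
# "Hortonwork.SparkTutorial.Main" should read "Hortonworks.SparkTutorial.Main":
grep -n 'Hortonwork' workflow.xml
```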
... View more