About Jeremy Beard

Jeremy Beard · ‎04-09-2020

That it was working on CDH 5.16.1 but not on CDH 6.3.3 tells me that there's some kind of classpath conflict that didn't exist before. I'd suggest modifying Envelope's top-level pom.xml file to point to the Cloudera Maven repository and to use the Spark version "2.4.0-cdh6.3.3".

Jeremy Beard · ‎04-08-2020

Hi,

A few questions to see if we can narrow it down:
- Which version of Spark are you using? Is this on CDH?
- Are you modifying Envelope before compiling?
- Are you using any of your own Envelope plugins?

Jeremy

Jeremy Beard · ‎04-06-2020

You don't need to declare them in the conf file. However, environment variables can't go inside the quoted SQL string, because the file format does not resolve substitutions inside quoted strings. Concatenating quoted strings with variables is reasonably easy though, for example:

"SELECT * FROM tableA A INNER JOIN tableB B ON A."${primaryKey}" = B."${primaryKey}
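As a rough sketch of how that concatenation might be put into a conf file from the shell: the file name `join-example.conf`, the key name `query`, and the value `id` are hypothetical and only for illustration. The point is that the heredoc delimiter is quoted, so the shell leaves `${primaryKey}` alone and the substitution is resolved later, when the conf file is parsed.

```shell
# Hypothetical value; in practice this would be exported before submitting the job.
export primaryKey=id

# Quoting the heredoc delimiter ('EOF') stops the shell from expanding
# ${primaryKey}, so the substitution reaches the conf file untouched.
# The key name "query" is a placeholder, not a documented Envelope setting.
cat > join-example.conf <<'EOF'
query = "SELECT * FROM tableA A INNER JOIN tableB B ON A."${primaryKey}" = B."${primaryKey}
EOF

# Show what was written.
cat join-example.conf
```

Note that each `${primaryKey}` sits outside the double quotes, between two quoted fragments, which is what lets it be substituted.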

Jeremy Beard · ‎04-06-2020

You just need to use local environment variables since you are running in client mode. For example:

export tableA=dbA.tableA
export tableB=dbB.tableB
spark2-submit \
  --master yarn \
  --deploy-mode client \
  envelope-0.7.2.jar comparison.conf

For sudo you would need to use -E to pass the variables through, but it is not good practice to run jobs as the HDFS superuser instead of your own user.
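To see why exporting matters here, this is a small sketch that uses a child shell as a stand-in for the driver process that the submit command launches locally in client mode. The variable `tableC` and the helper variable names are hypothetical; nothing in it is specific to Spark or Envelope.

```shell
# Exported variables are inherited by child processes started from this shell.
export tableA=dbA.tableA
export tableB=dbB.tableB

# Stand-in for the driver: a child shell reading its own environment.
child_view=$(sh -c 'printf "%s %s" "$tableA" "$tableB"')
echo "$child_view"

# A plain shell variable (no export) is not passed to child processes.
tableC=dbC.tableC
unexported_view=$(sh -c 'printf "%s" "${tableC:-unset}"')
echo "$unexported_view"
```

The same reasoning applies to sudo: by default it starts the command with a reduced environment, which is why -E is needed to carry the exported variables through (subject to the local sudoers policy).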

Last Visited	‎04-29-2020 02:17 PM

Member Since	‎08-26-2015 02:32 PM
Posts	54
Kudos received	6

Cloudera Community

Re: Envelope error while writing into hive table

Re: How do i pass variables to spark job using Env...
