Created on 11-08-2019 02:22 PM - last edited on 11-09-2019 08:27 AM by ask_bill_brooks
I created the PySpark code and ran it on my local Windows machine, and everything seemed to run fine. After moving to the Hadoop server, the spark2-submit command is not able to pick up the driver for MS SQL Server. The following steps were taken to try to make it work:

1) Moved the MSSQL JDBC jar file (mssql-jdbc-7.4.1.jre8.jar) to the jars folder under the Spark home.
2) Added --jars "pathtojdbcjarfile", using both a local file system path (file://path to the jar file) and the jars folder from step 1.
3) Used the extraClassPath configuration to set the path to the jar file as below:
--conf "spark.driver.extraClassPath=pathtojdbcjarfile" --conf "spark.executor.extraClassPath=pathtojdbcjarfile"
4) Final spark-submit command:
spark2-submit --jars "pathtojdbcjarfile" --conf "spark.driver.extraClassPath=pathtojdbcjarfile" --conf "spark.executor.extraClassPath=pathtojdbcjarfile" --master yarn --deploy-mode client --executor-memory 1g /home/aab9010/SOMEECRIPT.py
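
For context, the JDBC read in the script is essentially the simplified sketch below; the server, database, table name, and credentials shown here are placeholders rather than the real values:

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("mssql-read").getOrCreate()

# Placeholder connection details; the real server, database, table, and
# credentials are different.
jdbc_url = "jdbc:sqlserver://someserver:1433;databaseName=somedb"

# The MSSQL driver class (com.microsoft.sqlserver.jdbc.SQLServerDriver) is
# expected to be picked up from the jar supplied via --jars / extraClassPath.
df = (
    spark.read.format("jdbc")
    .option("url", jdbc_url)
    .option("dbtable", "dbo.sometable")
    .option("user", "someuser")
    .option("password", "somepassword")
    .load()
)

df.show(5)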
All of these attempts lead to the same errors: either "Class not found" or "No suitable driver".
Any help is appreciated.