Submitting a PySpark job with multiple Python files and one configuration file

I have four Python scripts and one .txt configuration file. One of the four Python files is the entry point for the Spark application, and it also imports functions from the other Python files. The configuration file, however, is read by a different Python file, not by the entry point. I want to write a spark-submit command, but I am not sure how to pass multiple Python files together with the configuration file when that file is not a Python file but a text or ini file.

For demonstration, the four Python files are: file1.py, file2.py, file3.py, file4.py

One configuration file: conf.txt

file1.py: this file creates the Spark session and calls into the other Python files. file3.py: this file reads conf.txt.

I want to provide all these files with spark-submit, but I am not sure about the exact command. One solution I have identified is:

spark-submit --master local --driver-memory 2g --executor-memory 2g --py-files s3_path/file2.py,s3_path/file3.py,s3_path/file4.py s3_path/file1.py

but with the above spark-submit command I am not sure how to pass conf.txt.
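For reference, spark-submit also has a --files flag that ships arbitrary non-Python files to the working directory of the driver and executors, so conf.txt can be passed alongside the --py-files list:

spark-submit --master local --driver-memory 2g --executor-memory 2g --py-files s3_path/file2.py,s3_path/file3.py,s3_path/file4.py --files s3_path/conf.txt s3_path/file1.py

Inside the job, file3.py can then open the file by its bare name or resolve its location with PySpark's SparkFiles.get. A minimal sketch of that read, assuming conf.txt is in ini format with a hypothetical [paths] section (the section and key names are illustrative, not from the original post):

```python
# file3.py (sketch): locate and parse conf.txt shipped via --files
import configparser


def load_conf(name="conf.txt"):
    """Resolve and parse a configuration file distributed with --files.

    Spark copies every --files argument into the working directory of
    the driver and executors; SparkFiles.get() returns that location
    when a SparkContext is active.
    """
    try:
        from pyspark import SparkFiles  # only resolvable inside a Spark job
        path = SparkFiles.get(name)
    except Exception:  # pyspark missing, or no active SparkContext
        path = name  # fall back to the current working directory
    parser = configparser.ConfigParser()
    parser.read(path)
    return parser


# hypothetical usage, assuming conf.txt contains a [paths] section:
# conf = load_conf()
# input_dir = conf["paths"]["input"]
```

The try/except fallback is only a convenience so the same module also works when run locally without Spark; on a cluster, SparkFiles.get does the resolution.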
