I have installed the Spark client on my workstation to access HDFS and Spark on the cluster, using the following command:
sudo yum install spark2_3_1_0_0_78*
I copied all the configuration files from the cluster nodes to the workstation. I am able to connect to Spark and retrieve data from the HDFS cluster.
The following is the code I am using to connect to MongoDB with PySpark: