<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question distributed processing operation of dataframe with Pyspark in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/distributed-processing-operation-of-dataframe-with-Pyspark/m-p/146929#M109482</link>
    <description>&lt;P&gt;Hello, &lt;/P&gt;&lt;P&gt;I would like to know, please, by what method (or line of code) I can confirm that processing is executed on all my cluster nodes with Pyspark?&lt;/P&gt;&lt;DIV&gt;thank you kindly for helping me&lt;/DIV&gt;&lt;P&gt;here is my code:&lt;/P&gt;&lt;PRE&gt;from pyspark.sql.types import *
from pyspark.sql import Row
           		   
rdd = sc.textFile('hdfs:../personne.txt') 
rdd_split = rdd.map(lambda x: x.split(','))
rdd_people = rdd_split.map(lambda x: Row(name=x[0],age=int(x[1]),ca=int(x[2])))
df_people = sqlContext.createDataFrame(rdd_people)
df_people.registerTempTable("people")
df_people.collect()


&lt;/PRE&gt;</description>
    <pubDate>Fri, 13 May 2016 21:45:24 GMT</pubDate>
    <dc:creator>nanyim_alain</dc:creator>
    <dc:date>2016-05-13T21:45:24Z</dc:date>
    <item>
      <title>distributed processing operation of dataframe with Pyspark</title>
      <link>https://community.cloudera.com/t5/Support-Questions/distributed-processing-operation-of-dataframe-with-Pyspark/m-p/146929#M109482</link>
      <description>&lt;P&gt;Hello, &lt;/P&gt;&lt;P&gt;I would like to know, please, by what method (or line of code) I can confirm that processing is executed on all my cluster nodes with Pyspark?&lt;/P&gt;&lt;DIV&gt;thank you kindly for helping me&lt;/DIV&gt;&lt;P&gt;here is my code:&lt;/P&gt;&lt;PRE&gt;from pyspark.sql.types import *
from pyspark.sql import Row
           		   
rdd = sc.textFile('hdfs:../personne.txt') 
rdd_split = rdd.map(lambda x: x.split(','))
rdd_people = rdd_split.map(lambda x: Row(name=x[0],age=int(x[1]),ca=int(x[2])))
df_people = sqlContext.createDataFrame(rdd_people)
df_people.registerTempTable("people")
df_people.collect()


&lt;/PRE&gt;</description>
      <pubDate>Fri, 13 May 2016 21:45:24 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/distributed-processing-operation-of-dataframe-with-Pyspark/m-p/146929#M109482</guid>
      <dc:creator>nanyim_alain</dc:creator>
      <dc:date>2016-05-13T21:45:24Z</dc:date>
    </item>
    <item>
      <title>Re: distributed processing operation of dataframe with Pyspark</title>
      <link>https://community.cloudera.com/t5/Support-Questions/distributed-processing-operation-of-dataframe-with-Pyspark/m-p/146930#M109483</link>
      <description>&lt;P&gt;If you are looking for a way to monitor the job and determine which nodes it ran on, how many executors, etc, you can see this in the Spark Web UI located at &amp;lt;sparkhost&amp;gt;:4040&lt;/P&gt;&lt;P&gt;&lt;A href="http://spark.apache.org/docs/latest/monitoring.html"&gt;http://spark.apache.org/docs/latest/monitoring.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;A href="http://spark.apache.org/docs/latest/monitoring.html"&gt;&lt;/A&gt;&lt;A href="http://stackoverflow.com/questions/35059608/pyspark-on-cluster-make-sure-all-nodes-are-used"&gt;http://stackoverflow.com/questions/35059608/pyspark-on-cluster-make-sure-all-nodes-are-used&lt;/A&gt;&lt;/P&gt;&lt;P&gt;cheers,&lt;/P&gt;&lt;P&gt;Andrew&lt;/P&gt;</description>
      <pubDate>Sat, 14 May 2016 02:27:02 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/distributed-processing-operation-of-dataframe-with-Pyspark/m-p/146930#M109483</guid>
      <dc:creator>andrew_sears</dc:creator>
      <dc:date>2016-05-14T02:27:02Z</dc:date>
    </item>
    <item>
      <title>Re: distributed processing operation of dataframe with Pyspark</title>
      <link>https://community.cloudera.com/t5/Support-Questions/distributed-processing-operation-of-dataframe-with-Pyspark/m-p/146931#M109484</link>
      <description>&lt;P style="margin-left: 20px;"&gt;&lt;A rel="user" href="https://community.cloudera.com/users/1456/andrewsears.html" nodeid="1456" target="_blank"&gt;@Andrew Sears&lt;/A&gt; answer is correct, and once you bring up the Spark History Server URL (http://{driver-node}:4040), you can navigate to the Executors tab, which will show you lots of statistics about the driver and each executor, as shown below. Note that when running Hortonworks Data Platform (HDP), you can get here from the Spark services page, clicking on "Quick Links", and then clicking on the "Spark History Server UI" button. Following that, you will need to find your specific job under "App ID".&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="4217-sparkhistoryserver-executors.png" style="width: 1196px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/21399i2C50E0A293ECBA6C/image-size/medium?v=v2&amp;amp;px=400" role="button" title="4217-sparkhistoryserver-executors.png" alt="4217-sparkhistoryserver-executors.png" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Sun, 18 Aug 2019 13:21:59 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/distributed-processing-operation-of-dataframe-with-Pyspark/m-p/146931#M109484</guid>
      <dc:creator>phargis</dc:creator>
      <dc:date>2019-08-18T13:21:59Z</dc:date>
    </item>
    <item>
      <title>Re: distributed processing operation of dataframe with Pyspark</title>
      <link>https://community.cloudera.com/t5/Support-Questions/distributed-processing-operation-of-dataframe-with-Pyspark/m-p/146932#M109485</link>
      <description>&lt;P&gt;Thank you very much&lt;/P&gt;</description>
      <pubDate>Thu, 19 May 2016 15:04:27 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/distributed-processing-operation-of-dataframe-with-Pyspark/m-p/146932#M109485</guid>
      <dc:creator>nanyim_alain</dc:creator>
      <dc:date>2016-05-19T15:04:27Z</dc:date>
    </item>
    <item>
      <title>Re: distributed processing operation of dataframe with Pyspark</title>
      <link>https://community.cloudera.com/t5/Support-Questions/distributed-processing-operation-of-dataframe-with-Pyspark/m-p/146933#M109486</link>
      <description>&lt;P&gt;Very big thank you&lt;/P&gt;</description>
      <pubDate>Thu, 19 May 2016 15:05:48 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/distributed-processing-operation-of-dataframe-with-Pyspark/m-p/146933#M109486</guid>
      <dc:creator>nanyim_alain</dc:creator>
      <dc:date>2016-05-19T15:05:48Z</dc:date>
    </item>
  </channel>
</rss>

