
How many times does the script used in Spark pipes get executed?

I tried the Spark Scala code below and got the output shown further down. I tried to pass the input to the script, but the script did not receive it, and when I called collect, the print statement from the script appeared twice.

First, my very basic Perl script:

#!/usr/bin/perl
print("arguments $ARGV[0] \n"); # Just print the arguments.

My Spark code:

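import org.apache.spark.{SparkConf, SparkContext, SparkFiles}

object PipesExample {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
    val sc = new SparkContext(conf)

    // Ship the script to every node so the tasks can find it.
    val distScript = "/home/srinivas/test.pl"
    sc.addFile(distScript)

    // A single-element RDD; the number of partitions is left at the default.
    val rdd = sc.parallelize(Array("srini"))

    // Pipe each partition through the script.
    val piped = rdd.pipe(Seq(SparkFiles.get("test.pl")))

    println(" output " + piped.collect().mkString(" "))
  }
}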

The output looked like this:

 output: arguments arguments

1) What mistake did I make that keeps the script from receiving the arguments? 2) Why was the script executed twice?

If this looks too basic, I apologize. I am trying to understand it as well as I can and want to clear up my doubts.

Accepted Solutions

Re: How many times does the script used in Spark pipes get executed?


How many executors do you have when you run this?

I see the same when I run it, because the script gets sent to each executor (2 in my case).

 

Wilfred
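
For anyone who wants to check both points: `RDD.pipe` starts the external command once for every partition of the RDD, empty partitions included, and writes that partition's elements to the command's standard input, one per line. Nothing is passed as a command-line argument, which is why `$ARGV[0]` prints as empty. Below is a minimal sketch, not a definitive test. It assumes a local master (`local[2]`, set in code only for illustration) and a Unix-like system where `wc` is on the PATH; the names `PipePerPartition` and `linesSeenPerProcess` are hypothetical.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object PipePerPartition {
  // Hypothetical helper: pipes a one-element RDD through `wc -l`, which prints
  // how many lines it read on stdin. Each forked process contributes one count.
  def linesSeenPerProcess(sc: SparkContext, numSlices: Int): Array[String] =
    sc.parallelize(Seq("srini"), numSlices)
      .pipe("wc -l")
      .collect()
      .map(_.trim) // some `wc` implementations pad the count with spaces

  def main(args: Array[String]): Unit = {
    // Assumption: local master with two threads, chosen only for illustration.
    val conf = new SparkConf().setAppName("PipePerPartition").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      // Two partitions: two processes, so two counts (one partition is empty).
      println("2 partitions: " + linesSeenPerProcess(sc, 2).mkString(" "))
      // One partition: a single process that receives the element on stdin.
      println("1 partition:  " + linesSeenPerProcess(sc, 1).mkString(" "))
    } finally {
      sc.stop()
    }
  }
}
```

If the script should run only once for this input, create the RDD with a single partition, for example `sc.parallelize(Array("srini"), 1)`, or call `coalesce(1)` before `pipe`. To see the value in the Perl script, read a line from `STDIN` instead of `$ARGV[0]`.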
