
RDD take action doesn't work in Zeppelin


I am trying to load a CSV file into an RDD using the textFile function in Zeppelin and then call take(10). However, take produces no output in Zeppelin, while the same commands print rows in the shell over SSH.

I have attached the file, my Zeppelin notebook, and some screenshots. Can you please suggest how to resolve this error? (Hortonworks Sandbox HDP 2.5 on Microsoft Azure)







One more piece of information: I could do the same thing with a txt file, just not with CSVs.

Any pointers?

This could be due to insufficient memory.

How did you launch the spark-shell? Is it in YARN mode or in standalone?

Also how is Zeppelin's Spark interpreter configured? YARN or Standalone?
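One way to check the mode on both sides is to look at the master URL of the running SparkContext. The snippet below is a minimal sketch: `describe_master` is a hypothetical helper (not part of Spark) that only interprets the string, and it assumes the usual master URL prefixes (`yarn...`, `spark://...`, `local...`).

```python
def describe_master(master):
    """Map a Spark master URL string to a rough deployment mode."""
    if master.startswith("yarn"):       # e.g. "yarn", "yarn-client", "yarn-cluster"
        return "YARN"
    if master.startswith("spark://"):   # standalone cluster manager
        return "standalone"
    if master.startswith("local"):      # e.g. "local", "local[4]", "local[*]"
        return "local"
    return "other"

# In the pyspark shell or a Zeppelin %pyspark paragraph, where `sc` is predefined:
# print(sc.master, "->", describe_master(sc.master))
```

Running the commented line in both the shell and a Zeppelin paragraph should show whether the two environments use the same master.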



This is the sandbox from Hortonworks, so I suppose it is standalone mode, but how do I verify that?

Spark shell: I launched PuTTY, SSH'd in as root, ran pyspark to start Spark, and issued the command (rdd = sc.textFile(csv)). Works like a charm.

Zeppelin: exactly the same, using the %pyspark interpreter. Doesn't work.
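For reference, the commands described above amount to something like the following sketch. `show_first_rows` is a hypothetical helper, and the file path is a placeholder rather than the file from this thread. It prints each row explicitly, since a notebook paragraph may not echo the value of a bare expression the way an interactive shell does; whether that is the cause here is only an assumption.

```python
def show_first_rows(rdd, n=10):
    """Fetch the first n elements of an RDD and print them one per line."""
    rows = rdd.take(n)   # take() returns a plain Python list
    for row in rows:
        print(row)
    return rows

# Usage where `sc` is predefined (pyspark shell or a %pyspark paragraph);
# the path is a placeholder:
# rdd = sc.textFile("/tmp/example.csv")
# show_first_rows(rdd)
```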

My Azure VM (D12 v2) has 4 cores, 28 GB RAM, and a 200 GB HDD; my local VMware sandbox has 16 GB RAM, 8 cores, and 1 TB of HDD space. Is this not enough for Zeppelin?


If it helps: screenshot from Ambari of the Zeppelin config



@vshukla, I did some more digging and found that whenever I run the commands, YARN Memory in the Ambari dashboard goes up to 90%. What could be wrong?


Hi.. a gentle bump to the thread..

It is likely due to insufficient memory. You can try increasing the memory allocated to the Sandbox and shutting down unneeded services in the Sandbox.

Another option is to try it with Spark 2.1 in HDC.


@vshukla, here are the configs for my machines:

Azure VM (D12 v2): 4 cores, 28 GB RAM, 200 GB HDD.

VMware sandbox: 16 GB RAM, 8 cores, 1 TB of HDD space.

Is this not enough for the Sandbox?

The file I am trying to load is 1 MB. I am able to load a 5 MB txt file just fine. It only fails for CSVs.

I will try out the HDC. Thanks for the link!