Cloudera VM Free to use Apache Hadoop with Spark

Explorer

Hi,

Is there a free virtual machine for using Apache Hadoop and Spark? I need to do some tasks with HDFS and Hive, and then some analysis with Spark.

Thanks!

Re: Cloudera VM Free to use Apache Hadoop with Spark

Contributor

Re: Cloudera VM Free to use Apache Hadoop with Spark

Explorer

Sean, many thanks for your response. Does this machine have PySpark?

Re: Cloudera VM Free to use Apache Hadoop with Spark

Contributor

Yes it does.

Re: Cloudera VM Free to use Apache Hadoop with Spark

Explorer

Actually, I have a problem with Spark. I don't see Spark listed among the installed applications. I can see a Spark folder, but when I click it I get a message that the server is too busy and it can't connect to ....:18080

I wonder if it's just my experience. Everything else works fine, and I can perform the training tasks. What could cause this? I thought Spark was not part of the Cloudera VM.

Re: Cloudera VM Free to use Apache Hadoop with Spark

Explorer

OK, I missed that I have 5.5. I have to download 5.7.

Re: Cloudera VM Free to use Apache Hadoop with Spark

Explorer

I downloaded the Cloudera QuickStart VM 5.7, and it doesn't have Spark. Do you have any recommendations on where to find instructions for installing Spark on Hadoop?

[cloudera@quickstart ~]$ hadoop fs -ls /user/
Found 9 items
drwxr-xr-x - cloudera cloudera 0 2016-05-21 16:05 /user/cloudera
drwxr-xr-x - hdfs supergroup 0 2016-05-21 16:05 /user/hdfs
drwxr-xr-x - mapred hadoop 0 2016-04-06 01:25 /user/history
drwxrwxrwx - hive supergroup 0 2016-04-06 01:27 /user/hive
drwxrwxrwx - hue supergroup 0 2016-05-21 16:07 /user/hue
drwxrwxrwx - jenkins supergroup 0 2016-04-06 01:25 /user/jenkins
drwxrwxrwx - oozie supergroup 0 2016-04-06 01:26 /user/oozie
drwxrwxrwx - root supergroup 0 2016-04-06 01:25 /user/root
drwxr-xr-x - hdfs supergroup 0 2016-04-06 01:27 /user/spark
[cloudera@quickstart ~]$ hadoop fs -ls /user/spark/
Found 1 items
drwxr-xr-x - spark supergroup 0 2016-05-21 16:17 /user/spark/applicationHistory
[cloudera@quickstart ~]$ hadoop fs -ls /user/spark/applicationHistory/
[cloudera@quickstart ~]$

Re: Cloudera VM Free to use Apache Hadoop with Spark

Contributor
Spark is installed in the 5.7 VM. We support Spark-on-YARN, and spark-shell
and pyspark are both on the PATH.

Re: Cloudera VM Free to use Apache Hadoop with Spark

Explorer

I downloaded it on 05-20-2016. As you can see, Spark isn't listed as installed; there is only an empty directory. Can anybody check or verify this? Or where can I find instructions for installing it on the VM? Thank you.

Re: Cloudera VM Free to use Apache Hadoop with Spark

Contributor

You're looking in HDFS directories. I expect those to be empty unless you've loaded some data into them or run some jobs. On the 5.7 VM I just successfully ran some Spark code by typing `pyspark` on the command line, or `spark-shell --master yarn-client` for the Scala shell. I confirmed that the spark-submit and spark-executor commands are also on the PATH.
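
If it helps, here is a rough sketch of how one might check this on the VM's local filesystem instead of in HDFS. The helper name `check_on_path` is made up for illustration, and the paths it prints (if any) will depend on your installation; it only relies on the standard `command -v` shell builtin.

```shell
# Hypothetical helper: report where each given command resolves on the
# local PATH. This inspects the local filesystem, not HDFS.
check_on_path() {
    for cmd in "$@"; do
        if location=$(command -v "$cmd"); then
            echo "$cmd: $location"
        else
            echo "$cmd: not found on PATH"
        fi
    done
}

# The launcher commands mentioned in this thread.
check_on_path pyspark spark-shell spark-submit spark-executor
```

By contrast, `hadoop fs -ls /user/spark` lists a directory inside HDFS, so an empty listing there says nothing about whether the Spark binaries are installed.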