Explorer
Posts: 7
Registered: ‎04-27-2016
Accepted Solution

Cloudera VM Free to use Apache Hadoop with Spark

Hi,

 

Is there a free virtual machine for using Apache Hadoop and Spark? I need to do some tasks with HDFS and Hive, and then some analysis with Spark.

 

Thanks!

Cloudera Employee
Posts: 28
Registered: ‎11-24-2015

Re: Cloudera VM Free to use Apache Hadoop with Spark

Explorer
Posts: 7
Registered: ‎04-27-2016

Re: Cloudera VM Free to use Apache Hadoop with Spark

Sean, many thanks for your response. Does this machine have PySpark?

Cloudera Employee
Posts: 28
Registered: ‎11-24-2015

Re: Cloudera VM Free to use Apache Hadoop with Spark

Yes it does.

Explorer
Posts: 9
Registered: ‎04-06-2016

Re: Cloudera VM Free to use Apache Hadoop with Spark

Actually, I have a problem with Spark. I don't see Spark listed among the installed applications. I can see a Spark folder, but when I click it I get a message that the server is too busy and it can't connect to ....:18080

I wonder whether it's just my experience. Everything else works fine, and I can perform the training tasks. What could cause this? I thought Spark was not part of the Cloudera VM.

Explorer
Posts: 9
Registered: ‎04-06-2016

Re: Cloudera VM Free to use Apache Hadoop with Spark

OK, I missed that I have 5.5. I have to download 5.7.

Explorer
Posts: 9
Registered: ‎04-06-2016

Re: Cloudera VM Free to use Apache Hadoop with Spark

I downloaded the Cloudera QuickStart VM 5.7, and it doesn't have Spark. Do you have any recommendations on where to find instructions for installing Spark on Hadoop?

 

 

[cloudera@quickstart ~]$ hadoop fs -ls /user/
Found 9 items
drwxr-xr-x - cloudera cloudera 0 2016-05-21 16:05 /user/cloudera
drwxr-xr-x - hdfs supergroup 0 2016-05-21 16:05 /user/hdfs
drwxr-xr-x - mapred hadoop 0 2016-04-06 01:25 /user/history
drwxrwxrwx - hive supergroup 0 2016-04-06 01:27 /user/hive
drwxrwxrwx - hue supergroup 0 2016-05-21 16:07 /user/hue
drwxrwxrwx - jenkins supergroup 0 2016-04-06 01:25 /user/jenkins
drwxrwxrwx - oozie supergroup 0 2016-04-06 01:26 /user/oozie
drwxrwxrwx - root supergroup 0 2016-04-06 01:25 /user/root
drwxr-xr-x - hdfs supergroup 0 2016-04-06 01:27 /user/spark
[cloudera@quickstart ~]$ hadoop fs -ls /user/spark/
Found 1 items
drwxr-xr-x - spark supergroup 0 2016-05-21 16:17 /user/spark/applicationHistory
[cloudera@quickstart ~]$ hadoop fs -ls /user/spark/applicationHistory/
[cloudera@quickstart ~]$

Cloudera Employee
Posts: 28
Registered: ‎11-24-2015

Re: Cloudera VM Free to use Apache Hadoop with Spark

Spark is installed in the 5.7 VM. We support Spark-on-YARN, and spark-shell and pyspark are both on the PATH.

Explorer
Posts: 9
Registered: ‎04-06-2016

Re: Cloudera VM Free to use Apache Hadoop with Spark

I downloaded it on 05-20-2016. As you can see, Spark is not listed as installed; there is only an empty directory. Can anybody check or verify this? Or where can I find instructions to install it on the VM? Thank you.

Cloudera Employee
Posts: 28
Registered: ‎11-24-2015

Re: Cloudera VM Free to use Apache Hadoop with Spark

You're looking in HDFS directories. I expect those to be empty unless you've loaded some data into them or run some jobs. On the 5.7 VM I just successfully ran some Spark code by typing `pyspark` on the command line, or `spark-shell --master yarn-client` for the Scala shell. I confirmed that the spark-submit and spark-executor commands are also on the PATH.
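
For anyone hitting the same confusion: an HDFS listing such as `hadoop fs -ls /user/spark` only shows data directories, not whether the Spark launchers are installed on the local machine. Below is a minimal sketch of a PATH check for the commands named in this thread. It assumes a Python 3 interpreter is available (the VM's default Python may be older), and the helper name `find_spark_commands` is hypothetical, not part of Spark or CDH.

```python
import shutil

# Launcher names mentioned in this thread; adjust the list for your install.
SPARK_COMMANDS = ["pyspark", "spark-shell", "spark-submit", "spark-executor"]


def find_spark_commands(names=SPARK_COMMANDS):
    """Map each command name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_spark_commands().items():
        print(f"{name}: {path or 'not found on PATH'}")
```

If every command resolves to a path, Spark is installed locally even though `/user/spark/applicationHistory` in HDFS is still empty; that directory would only fill up after jobs have run.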