Created on 05-11-2016 12:09 PM - edited 09-16-2022 03:18 AM
Hi,
Is there a free virtual machine for using Apache Hadoop and Spark? I need to do some tasks with HDFS and Hive, and then some analysis with Spark.
Thanks!
Created 05-11-2016 12:24 PM
Yep, right here: http://www.cloudera.com/downloads/quickstart_vms/5-7.html
Created 05-11-2016 12:26 PM
Sean, many thanks for your response. Does this machine have PySpark?
Created 05-11-2016 12:29 PM
Yes it does.
Created 05-18-2016 06:16 AM
Actually, I have a problem with Spark. Spark isn't listed among the installed applications. I can see a Spark folder, but when I click it I get a message that the server is too busy and it can't connect to ....:18080
I wonder whether it's just my experience. Everything else works fine, and I can perform the training tasks. What could cause this? I thought Spark was not part of the Cloudera VM.
Created 05-18-2016 06:43 AM
OK, my mistake: I have 5.5. I need to download 5.7.
Created 05-21-2016 04:23 PM
I downloaded Cloudera Quickstart VM 5-7, and it doesn't have Spark. Do you have any recommendation on where to find instructions for installing Spark on Hadoop?
[cloudera@quickstart ~]$ hadoop fs -ls /user/
Found 9 items
drwxr-xr-x - cloudera cloudera 0 2016-05-21 16:05 /user/cloudera
drwxr-xr-x - hdfs supergroup 0 2016-05-21 16:05 /user/hdfs
drwxr-xr-x - mapred hadoop 0 2016-04-06 01:25 /user/history
drwxrwxrwx - hive supergroup 0 2016-04-06 01:27 /user/hive
drwxrwxrwx - hue supergroup 0 2016-05-21 16:07 /user/hue
drwxrwxrwx - jenkins supergroup 0 2016-04-06 01:25 /user/jenkins
drwxrwxrwx - oozie supergroup 0 2016-04-06 01:26 /user/oozie
drwxrwxrwx - root supergroup 0 2016-04-06 01:25 /user/root
drwxr-xr-x - hdfs supergroup 0 2016-04-06 01:27 /user/spark
[cloudera@quickstart ~]$ hadoop fs -ls /user/spark/
Found 1 items
drwxr-xr-x - spark supergroup 0 2016-05-21 16:17 /user/spark/applicationHistory
[cloudera@quickstart ~]$ hadoop fs -ls /user/spark/applicationHistory/
[cloudera@quickstart ~]$
Created 05-23-2016 07:44 AM
Created 05-23-2016 09:47 AM
I downloaded it on 05-20-2016. As you can see, Spark is not listed as installed; there is only an empty directory. Can anybody check/verify? Or where can I find instructions for installing it on the VM? Thank you.
Created 05-23-2016 10:53 AM
You're looking at HDFS directories, which I'd expect to be empty unless you've loaded data into them or run some jobs. On the 5.7 VM I just successfully ran some Spark code by typing `pyspark` on the command line, or `spark-shell --master yarn-client` for the Scala shell. I confirmed that the spark-submit and spark-executor commands are also on the PATH.
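If you want to repeat that PATH check yourself, something like the sketch below should work in a terminal on the VM. It looks at the local filesystem rather than HDFS, which is where the launcher scripts live. The helper name `check_cmd` is just made up for this example, and the list of commands is limited to ones mentioned in this thread that I'm sure exist; add others as you see fit.

```shell
# Hypothetical helper: report where a command resolves on the local PATH.
check_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: $(command -v "$1")"
  else
    echo "$1: not found on PATH"
  fi
}

# Spark launchers mentioned in this thread.
for cmd in pyspark spark-shell spark-submit; do
  check_cmd "$cmd"
done
```

If all three print a path, Spark is installed locally even though `/user/spark/applicationHistory` in HDFS is still empty; that directory only fills up after jobs have run.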