
bhupendra · ‎12-21-2015

awatson · ‎12-21-2015

Depending on your hardware availability for the POC, I would also look at just doing the POC in the cloud (e.g. Microsoft Azure, AWS, GCP). You can use Cloudbreak to quickly deploy a fully fledged distributed cluster running Spark, YARN, the whole nine yards, in the cloud in a matter of minutes.

Here is the documentation on how to do so:

Cloudbreak Overview - http://hortonworks.com/hadoop/cloudbreak/

Cloudbreak Docs - http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-...


nsabharwal · ‎12-21-2015

@Bhupendra Mishra

1) Get your cluster up on HDP 2.3.2 (the latest version). The Sandbox is a good start.

2) Get Zeppelin http://hortonworks.com/hadoop/zeppelin/#section_1

Step 2 will help you configure Spark and access data from Hive tables.

If you don't have data yet, stick with this tutorial: http://hortonworks.com/hadoop-tutorial/interacting...

bhupendra · ‎12-21-2015

I want to proceed with a distributed cluster, not standalone or the sandbox.

A full-fledged, production-grade server.

nsabharwal · ‎12-21-2015

@Bhupendra Mishra You are on the right track. http://docs.hortonworks.com/HDPDocuments/Ambari-2.1.2.1/bk_Installing_HDP_AMB/content/ch_Getting_Rea...

My email is [email protected]

Please feel free to email me and we can discuss.

awatson · ‎12-21-2015