Created on 03-22-2020 10:41 AM - last edited on 03-22-2020 01:08 PM by ask_bill_brooks
I would like to know what the difference is between the HDP Sandbox VM and CDH.
I am looking to set up CDH on GCP. Can anyone let me know whether I would still be able to use the services after the 60-day trial phase? If not, what is the cost like? This is for personal use.
Is it also possible with the HDP Sandbox on GCP? Again, this is for my own use and not for any business account.
Created on 03-22-2020 09:13 PM - edited 03-22-2020 09:14 PM
Hi @SwasBigData
HDP was the fully open source distribution of the Apache Hadoop data analysis platform offered commercially by Hortonworks prior to its merger with Cloudera in January 2019. The "Sandbox" is just a version of the HDP binaries configured for VMware or VirtualBox machines (and Docker containers, as well).
Why is it called the sandbox? From Learning the Ropes of the HDP Sandbox:
The Sandbox is a straightforward, pre-configured learning environment that contains the latest developments from Apache Hadoop, specifically the Hortonworks Data Platform (HDP). The Sandbox comes packaged in a virtual environment that can run in the cloud or on your personal machine.
The intent of the HDP Sandbox is to facilitate learning by developers and operators new to Hadoop by drastically reducing the effort involved in getting a Hadoop cluster up and running.
The current product is CDP (Cloudera Data Platform); note that the last letter is P, not H. Previously, Cloudera marketed a distribution of Hadoop called CDH; as a completely open-source software distribution, CDH happens to still be available. The free trial you mentioned is probably for CDP, not CDH. I am not aware of a CDH free trial.
There are currently two "form factors" of CDP, one for Public Cloud Services using AWS or Azure, and another for Data Center deployment. The version of CDP Public Cloud for GCP has not yet been released.
If you have a lot of understanding of build and deployment tools, you can, with some effort, install and configure CDP Data Center (or HDP or CDH, for that matter) on GCE VMs. Once the 60-day trial is over, you would need to acquire a subscription to continue using CDP Data Center. To find out how much this would cost, reach out to a Cloudera Sales Representative.
Created 03-23-2020 02:59 AM
Thank you, Bill. I was searching for how to deploy HDP into GCP, but the link below
https://www.cloudera.com/tutorials/sandbox-deployment-and-install-guide.html
only covers VirtualBox, VMware and Docker. Is there any resource that can guide me on how to deploy the HDP Sandbox into GCP?
Created 03-23-2020 01:13 PM
That is correct; there is no current documentation on Cloudera's site about deploying HDP into GCP GCE VMs. When I said above that:
If you have a lot of understanding of build and deployment tools, you can, with some effort, install and configure CDP Data Center (or HDP or CDH, for that matter) on GCE VMs.
…that meant that if you (the person deploying) have a deep, thorough and richly contextualized understanding of GCP's tools, as well as of the tools used to build and deploy the various open source components of CDP Data Center (or HDP or CDH), then you can get any of those platforms to work on GCP GCE VMs. If it were already documented, you wouldn't need that background.
That being said, there are all kinds of third- and fourth-party documentation on deploying any of the above distributions in any number of cloud environments, and I would not be surprised if there's a website or YouTube video out there that describes how to do it. If and when you find one, please post back here in this thread and let me know where it is so that I (and other members of the community) can take advantage of the information.
Thanks in advance!