Member since
07-29-2019
640
Posts
114
Kudos Received
48
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 11304 | 12-01-2022 05:40 PM |
| | 2771 | 11-24-2022 08:44 AM |
| | 4166 | 11-12-2022 12:38 PM |
| | 1465 | 10-10-2022 06:58 AM |
| | 2118 | 09-11-2022 05:43 PM |
03-26-2020
07:33 PM
Hi @JB0000012345 ,
Over in this thread, a community member reportedly found a Docker image based on CDH 6.3.0, which is much more up to date than the CDH 5.13-based "Quickstart" you were looking for. Hope this helps.
03-23-2020
01:13 PM
That is correct; there is no current documentation on Cloudera's site about deploying HDP into GCP GCE VMs. When I said above that:
If you have a lot of understanding of build and deployment tools, you can, with some effort, install and configure CDP Data Center (or HDP or CDH, for that matter) on GCE VMs.
…that meant if you (the person deploying) have a deep, thorough and richly contextualized understanding of GCP's tools, as well as of the tools used to build and deploy the various open source components of CDP Data Center (or HDP or CDH), then you can get any of those platforms to work on GCP GCE VMs. If it were already documented, you wouldn't need that background.
That being said, there is all kinds of third- and fourth-party documentation on deploying any of the above distributions in any number of cloud environments, and I would not be surprised if there's a website or YouTube video out there that describes how to do it. If and when you find one, please post back here in this thread and let me know where it is so that I (and other members of the community) can take advantage of the information.
Thanks in advance!
03-22-2020
09:13 PM
1 Kudo
Hi @SwasBigData
HDP was the fully open source distribution of the Apache Hadoop data analysis platform offered commercially by Hortonworks prior to its merger with Cloudera in January 2019. The "Sandbox" is just a version of the HDP binaries configured for VMware or VirtualBox machines (and Docker containers, as well).
Why is it called the sandbox? From Learning the Ropes of the HDP Sandbox:
The Sandbox is a straightforward, pre-configured learning environment that contains the latest developments from Apache Hadoop, specifically the Hortonworks Data Platform (HDP). The Sandbox comes packaged in a virtual environment that can run in the cloud or on your personal machine.
The intent of the HDP Sandbox is to facilitate learning by developers and operators new to Hadoop by drastically reducing the effort involved in getting a Hadoop cluster up and running.
The current product is CDP, Cloudera Data Platform; note that the last letter is P, not H. Previously, Cloudera marketed a distribution of Hadoop called CDH; as a completely open-source software distribution, CDH happens to still be available. The free trial you mentioned is probably for CDP, not CDH. I am not aware of a CDH free trial.
There are currently two "form factors" of CDP, one for Public Cloud Services using AWS or Azure, and another for Data Center deployment. The version of CDP Public Cloud for GCP has not yet been released.
If you have a lot of understanding of build and deployment tools, you can, with some effort, install and configure CDP Data Center (or HDP or CDH, for that matter) on GCE VMs. Once the 60-day trial is over, you would need to acquire a subscription to continue using CDP Data Center. For details on how much this would cost, reach out to a Cloudera Sales Representative.
03-19-2020
10:26 PM
1 Kudo
@Mondi ,
I am assuming that, as a minor release, CDH 6.3.3 does not change the overall Python version dependencies established by CDH 6.x in general. If I am mistaken, I'm sure someone more knowledgeable than I am will weigh in here.
If I am not mistaken, you can check the dependencies at this page, under the subheading Software Dependencies. Hope this helps.
03-19-2020
05:45 PM
Hi @zakariadem
While we welcome your question, just based on the lack of responses to the original question in this thread since it was posted in Nov 2017, we think you would be much more likely to obtain a suitable answer if you posted it to the appropriate AWS forum for EMR.
03-15-2020
01:19 PM
Hi @somi,
Just looking at the error message alone, it seems to me like the problem is with your data file. Can you post the first few lines of the input data file myfile.txt in this thread?
03-15-2020
09:19 AM
1 Kudo
Hi @EltonMesquit
Previously, there was a so-called Quickstart VM installation that collected the bits for CDH 5.13 on a self-contained virtual machine, but it is no longer available, despite what you might think when you see various hyperlinks on the Cloudera website labeled "QuickStart VMs". To the best of my knowledge, there was never a CDH 6.3.x version of the Quickstart VM generally available to the public.
If you'd like to try installing CDH 6.3.x in a non-production environment for proof-of-concept purposes, you can find the instructions on the Cloudera docs site, but be sure to choose CDH 6.3.2 or earlier if your company does not have a current Cloudera support subscription (the repository is in a different place for versions 6.3.3 and later).
The up-to-date product is Cloudera Data Platform, and you can download a trial version of CDP to install on-premises here.
If, for whatever reason, you must have a VM-based installation of Hadoop, I strongly recommend you migrate to the HDP Sandbox, but that will not have the same version of each of the Hadoop ecosystem components as the now unavailable CDH Quickstart VM. Most of them will be more recent in the Sandbox.
03-11-2020
09:50 PM
@Vj1989
As this thread was marked 'Solved' in Dec 2017 you would have a better chance of receiving a useful response by starting a new thread. This will also provide the opportunity to provide details specific to your environment that could aid others in providing a more accurate answer to your question.
03-11-2020
04:31 PM
1 Kudo
Hi @SwasBigData
We understand that there is a lot of third- and fourth-party documentation and video content that shows the Cloudera Quickstart VMs; however, Cloudera in particular and Hadoop in general have moved on. We are in a transition period away from that version of the Cloudera Quickstart VM, as it was extremely outdated and nearing end of support.
As you noted, the up-to-date product is Cloudera Data Platform, which you can download at that URL. If you really must have a VM-based distribution of Hadoop, I strongly recommend you migrate to the HDP Sandbox.
03-08-2020
09:45 AM
Hi @Kart,
As this is a thread which was marked 'Solved' over three years ago, you would have a better chance of receiving a resolution by posting a new question. This will also present you with the opportunity to include details specific to your environment that could aid other members in providing a more relevant answer to your question.