Member since: 02-19-2018
Posts: 99
Kudos Received: 29
Solutions: 32

My Accepted Solutions
Title | Views | Posted
---|---|---
 | 240 | 07-28-2020 07:46 AM
 | 209 | 07-28-2020 07:45 AM
 | 435 | 06-23-2020 11:15 PM
 | 660 | 06-23-2020 11:12 PM
 | 365 | 05-25-2020 02:41 AM
07-28-2020
07:46 AM
1 Kudo
Hi @rok , Cloudera Manager has not been open-sourced yet but will be in the future. Regards, Steve
07-28-2020
07:45 AM
2 Kudos
Hi @kvinod , All of the supported versions of CDH and other Cloudera technology can be found here: https://www.cloudera.com/legal/policies/support-lifecycle-policy.html If you are going for a new deployment I recommend the latest distribution from Cloudera - the Cloudera Data Platform (CDP). CDP supersedes the CDH distribution. Regards, Steve
07-07-2020
11:00 PM
Hi @Archana , Please see this link which explains what you need to install CDH 6.3.3: Installation or Upgrade of Cloudera Manager 6.3.3 and CDH 6.3.3 now requires authentication to access downloads Regards, Steve
06-23-2020
11:15 PM
Hi @premv , The Kafka Connect documentation from Cloudera is available here: https://docs.cloudera.com/runtime/7.2.0/kafka-connect/topics/kafka-connect-setup.html Regards, Steve
06-23-2020
11:12 PM
Hi @Reza1962 , Cloudera Express has been deprecated as of CDH 6.3.3 and will not be supported in any future releases. https://docs.cloudera.com/documentation/enterprise/6/release-notes/topics/rg_cdh_633_new_features.html#d106996e75 Regards, Steve
05-25-2020
02:41 AM
Hi @SivaMalapati , CDP runs on Linux only. It does not run on the Windows operating system. Regards, Steve
05-21-2020
01:31 AM
Hi @raghu9raghavend , I wouldn't recommend putting the node manager on the master nodes. You want the node managers to run where the data are stored, so putting the node manager on the master nodes implies that you would also be using the master nodes as data nodes. Ideally, you want to isolate the master nodes and the services that manage the platform from where the computation and processing happen, i.e. the data nodes. Regards, Steve
05-20-2020
06:30 AM
Hi @raghu9raghavend , Here is a reference to a topological layout of Cloudera roles: https://docs.cloudera.com/documentation/enterprise/latest/topics/cm_ig_host_allocations.html#concept_f43_j4y_dw__section_hmq_5hq_zdb I hope this helps. Regards, Steve
05-14-2020
05:48 AM
Hi @sang06 , Each of the individual projects within CDP has an API. So Cloudera Manager has an API, Ranger has an API and you can call and deploy models in CML using an API. However, to my knowledge, you would need to use the CLI to build the CDP environment and CML workspace in the first place. I'm aware that there isn't any documentation regarding the CLI - that is a bugbear of mine too - I have been working through the CLI the hard way to get it working. Regards, Steve
05-13-2020
08:48 AM
Hi @sang06 , I am assuming that you are referring to the provisioning of a CML workspace on the CDP Public Cloud platform? If that is the case, then yes, it is possible to automate the provisioning of a CML workspace using the CDP CLI. First, you will need to install the CDP CLI: https://docs.cloudera.com/management-console/cloud/cli/topics/mc-cli-client-setup.html Then you can get started with the CDP CLI for machine learning like this: cdp ml create-workspace help Regards, Steve
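To make that a bit more concrete, here is a minimal sketch of the flow, assuming a Python environment is available (the CLI is published on PyPI as cdpcli). The flag names in the final command are illustrative assumptions, so confirm them against the built-in help before relying on them:

```bash
# Install the CDP CLI and configure credentials (access key ID and private key
# generated in the CDP Management Console).
pip install cdpcli
cdp configure

# Explore the machine learning commands; the built-in help lists the exact parameters.
cdp ml create-workspace help

# Illustrative call only - verify the flag names against the help output first.
cdp ml create-workspace \
  --environment-name my-cdp-environment \
  --workspace-name my-cml-workspace
```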
05-13-2020
12:36 AM
2 Kudos
Hi @BSST Here is a reference architecture for CDP: https://docs.cloudera.com/documentation/other/reference-architecture/PDF/cloudera_ref_arch_cdp_dc.pdf Here are the hardware requirements: https://docs.cloudera.com/cdpdc/latest/release-guide/topics/cdpdc-hardware-requirements.html I hope that helps. Regards, Steve
05-11-2020
12:33 AM
Hi @Mondi , No, Apache Sentry is not used for data-at-rest or data-in-motion encryption. To address those requirements you need to follow these guidelines and steps: Encryption in Transit: https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/cm_sg_guide_ssl_certs.html Encryption at Rest: https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/encryption_planning.html Regards, Steve
05-11-2020
12:27 AM
Hi @BSST , The CDP trial installation documentation is here: https://docs.cloudera.com/cdpdc/7.0/installation/topics/cdpdc-trial-installation.html The hardware and operating system requirements are covered here: https://docs.cloudera.com/cdpdc/7.0/release-guide/topics/cdpdc-requirements-supported-versions.html Regards, Steve
04-24-2020
12:23 AM
Hi @denys_tyshetsky , Yes, you are right - the CDP Management Console for public cloud is hosted by Cloudera in our VPC. CDP Public Cloud is a PaaS offering, so we need a control plane through which we launch services. Those services are then launched into the customer's VPC. You don't have control over the CDP Management Console, but you also don't have control over the AWS Management Console - and the AWS Management Console doesn't live in your VPC either.

If you want to try CDP in the public cloud, there is a far easier way of getting started now that we have released a CloudFormation template that performs a lot of the AWS setup: https://docs.cloudera.com/management-console/cloud/aws-quickstart/topics/mc-aws-quickstart.html

My understanding is that when we launch CDP Private Cloud for on-premises deployments, you will be able to host the CDP Management Console on-premises. For the time being, the only way to get access to CDP Public Cloud is via the CDP Management Console hosted by Cloudera.

Of course, if you are interested in an IaaS deployment of CDP, you can install CDP Data Center, which uses Cloudera Manager for managing the deployment and does not use or require the CDP Management Console, which is for cloud deployments only. Regards, Steve
04-19-2020
11:25 PM
Hi @denys_tyshetsky , I don't understand your question - you want to trial CDP in an AWS public cloud environment, but the fact that the CDP Management Console is in the public cloud is an issue? Regards, Steve
04-19-2020
11:22 PM
Hi @muslihuddin , The cluster templates are only available in the CDP Public Cloud form factor at the moment. So for CDP-DC you can install NiFi using a parcel/CSD as you say. It's pretty easy to do. Regards, Steve
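As a rough sketch of the CSD part of that installation (the jar file name below is a placeholder - use the CSD artifact that matches your CFM/NiFi version, downloaded from Cloudera):

```bash
# Copy the NiFi CSD jar into Cloudera Manager's CSD directory
# (the file name is a placeholder for the version you downloaded).
sudo cp NIFI-<version>.jar /opt/cloudera/csd/
sudo chown cloudera-scm:cloudera-scm /opt/cloudera/csd/NIFI-<version>.jar

# Restart Cloudera Manager so it picks up the new service descriptor.
sudo systemctl restart cloudera-scm-server

# Then distribute and activate the matching NiFi parcel from the Parcels page
# in Cloudera Manager, and add the NiFi service to the cluster via "Add Service".
```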
04-19-2020
11:19 PM
Hi @ebeb , Thanks for your question. There are a number of ways to get to CDP. If the public cloud is an option for you then I strongly recommend you explore it, because there are many advantages to that approach. If you are staying on-premises, you can either build a new CDP-DC cluster, move data to the new environment, and migrate content to it using the built-in tools, or you can do an in-place upgrade of your existing CDH cluster. To do the in-place upgrade, CDH needs to be at 5.13 or above. The in-place upgrade approach will be available when we release CDP-DC 7.1. Regards, Steve
04-17-2020
05:24 AM
Hi @sarm , You could do this by using the Cloudera Manager API: https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/cm_intro_api.html Regards, Steve
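For reference, calling the API is just an authenticated HTTP request; the hostname, credentials and API version below are placeholders for your own Cloudera Manager instance:

```bash
# Discover which API version your Cloudera Manager supports.
curl -u admin:admin "http://cm-host.example.com:7180/api/version"

# List clusters, then the services within a cluster (v19 used as an example
# version; the cluster name "Cluster 1" is URL-encoded).
curl -u admin:admin "http://cm-host.example.com:7180/api/v19/clusters"
curl -u admin:admin "http://cm-host.example.com:7180/api/v19/clusters/Cluster%201/services"
```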
04-17-2020
05:15 AM
Hi @muslihuddin , Currently, the CDP Management Console is only available in the public cloud. However, we are also planning to launch a CDP Private Cloud edition which would run on-premises including the Management Console. If you are looking to run CDP on-premises today, you can do that with the CDP Data Center (DC) edition. CDP-DC is managed by Cloudera Manager and does not use the Management Console. I hope that helps. Steve
04-16-2020
01:20 PM
Hi @BaluSaiD , Cloudera Manager does not support SQL Server as a database for its backend services. The databases that are supported are listed here: https://docs.cloudera.com/documentation/enterprise/6/release-notes/topics/rg_database_requirements.html#cdh_cm_supported_db Regards, Steve
04-16-2020
10:02 AM
Hi @Ashik , A good rule of thumb for the amount of HDFS storage required is 4x the raw data volume. HDFS triple-replicates data, and we also need some headroom in the system, which is why it is 4x rather than 3x. This formula is just a rough guide and can change, for example if you compress the data on HDFS. You also need to factor other data processing that you might do into this calculation - for example, if you build data marts on top of the raw data, that is additional data volume - and then you have organic data growth over time.

Regarding cluster topology, there are some guidelines here: https://docs.cloudera.com/documentation/enterprise/5/latest/topics/cm_ig_host_allocations.html

Regarding best practice for cluster sizing, that is covered here: https://docs.cloudera.com/documentation/other/reference-architecture/topics/ra_bare_metal_deployment.html#concept_lzl_vkl_f2b

Regarding hardware recommendations, those are given here: https://docs.cloudera.com/documentation/enterprise/release-notes/topics/hardware_requirements_guide.html

I would recommend that you have:

- 3 x Master Nodes (for high availability)
- N x Data Nodes, where N is a number based on the storage capacity of the data nodes. You need a minimum of N=3 for triple replication of data, and I would recommend N >= 5 for a production system. The more data nodes you have, and the more disks in each of those data nodes, the higher the performance of your system will be because of the distributed throughput and higher disk I/O.
- 1 x Utility Node / Management Node
- 1 x Gateway Node

Regards, Steve
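To make the 4x rule concrete, here is a quick worked example with purely illustrative numbers:

```bash
# 100 TB of raw data x 4 (3x replication plus headroom) = 400 TB of HDFS capacity.
# With data nodes that each hold 12 x 4 TB disks (48 TB per node), you would need
# ceil(400 / 48) = 9 data nodes, before allowing for compression or data growth.
RAW_TB=100
REQUIRED_TB=$((RAW_TB * 4))        # 400
PER_NODE_TB=$((12 * 4))            # 48
echo "Data nodes needed: $(( (REQUIRED_TB + PER_NODE_TB - 1) / PER_NODE_TB ))"   # 9
```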
04-16-2020
09:45 AM
In your example, you could not just give me a Cloudera license, because I would not have a commercial relationship with Cloudera - I wouldn't be able to raise support tickets, for example, because Cloudera would have no record of me. You need to speak to your Cloudera sales account team to discuss your situation. Steve
04-16-2020
04:13 AM
1 Kudo
Hi @Cloudsupport , Thanks for clarifying the situation. If you are happy to message me privately via the Cloudera Community messaging capability and tell me who company X and company Y are, I will see if I can get someone to help you. Regards, Steve
04-16-2020
02:20 AM
Hi @Cloudsupport , I don't completely follow the scenario that you described - are you saying the support department is handing over to another department? What do you mean by 'organization' - another company? That said, to discuss the Cloudera license and commercial arrangements you should reach out to the Cloudera account team for your organization. Regards, Steve
04-16-2020
12:42 AM
Hi @DataMike , I had a chat with some of my colleagues about this and it seems there is no easy way of stopping HDFS from starting when the cluster restarts. You might be able to do something via the Cloudera Manager API, but that is probably quite complicated. If it's any consolation, this is fixed in the next generation of technology from Cloudera, i.e. CDP. When you upgrade to CDP, Sentry is replaced with Ranger and this HDFS dependency for Kafka no longer exists. Regards, Steve
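For reference, the kind of call involved would look roughly like the sketch below (hostname, credentials, API version, cluster and service names are all placeholders); you would still need your own automation to run it after every cluster restart, which is why it gets fiddly:

```bash
# Stop the HDFS service through the Cloudera Manager API once the cluster is up
# (placeholders: cm-host, admin credentials, cluster "Cluster 1", service name "hdfs").
curl -u admin:admin -X POST \
  "http://cm-host.example.com:7180/api/v19/clusters/Cluster%201/services/hdfs/commands/stop"
```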
04-14-2020
02:14 AM
Hi @Anujcomm , What version of HDP are you trying to download? Please could you share the link that you are trying to access? Regards, Steve
04-11-2020
11:38 AM
Hi @pavankad , If you are going to be deploying into Azure then I recommend that you use the Cloudera Data Platform (CDP) rather than HDP. CDP is the latest distribution from Cloudera and can be deployed into Azure using a cloud-native architecture and with cloud consumption-based pricing. Publicly available CDP pricing for Azure can be found here: https://www.cloudera.com/products/pricing/cdp-public-cloud-service-rates.html?tab=1 Regards, Steve
04-11-2020
11:21 AM
Hi @karthik_1984 , To my knowledge, Cloudera won't be providing any upgraded VMs. Therefore you would have to update your installation to support Spark 2. The installation instructions are here: https://docs.cloudera.com/documentation/spark2/latest/topics/spark2_installing.html Regards, Steve
04-10-2020
10:06 AM
Hi @mikejeezy ,
You shouldn't remove the HDFS component but you can stop the HDFS service in the scenario that you describe.
Please refer to the documentation here: Configuring Kafka to Use Sentry Authorization
Sentry requires that your cluster include HDFS. After you install and start Sentry with the correct configuration, you can stop the HDFS service.
Regards,
Steve