Member since
02-19-2018
99
Posts
29
Kudos Received
32
Solutions
My Accepted Solutions
Views | Posted |
---|---|
1131 | 07-28-2020 07:46 AM |
1035 | 07-28-2020 07:45 AM |
2011 | 06-23-2020 11:15 PM |
3139 | 06-23-2020 11:12 PM |
1461 | 05-25-2020 02:41 AM |
02-05-2020
04:34 AM
Hi @ahmedalsaidi, Apache NiFi does not have a generic CDC processor as such. One way to achieve a CDC-like approach is to use the QueryDatabaseTable processor. Please take a look at this article on how to use the QueryDatabaseTable processor to do an incremental fetch of new rows in the source database: https://community.cloudera.com/t5/Community-Articles/Incremental-Fetch-in-NiFi-with-QueryDatabaseTable/ta-p/247073 Alternatively, you can use a vendor that specializes in CDC. Please accept this answer as a solution if it helps you. Steve
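To make the incremental-fetch idea concrete, here is a minimal sketch of the underlying technique: remember the largest value seen in a tracking column and only select rows above it on the next run. This is not NiFi code and not how the processor is implemented; it only mirrors the concept behind the processor's Maximum-value Columns property. The `orders` table, the `fetch_new_rows` helper and the use of SQLite are all assumptions made for illustration.

```python
import sqlite3


def fetch_new_rows(conn, table, max_value_column, last_seen):
    """Return rows whose tracking column exceeds last_seen, plus the new high-water mark.

    Hypothetical helper for illustration only. The table and column names are
    interpolated into the SQL, so pass trusted identifiers only.
    """
    query = (
        f"SELECT * FROM {table} "
        f"WHERE {max_value_column} > ? "
        f"ORDER BY {max_value_column}"
    )
    cursor = conn.execute(query, (last_seen,))
    columns = [d[0] for d in cursor.description]
    rows = [dict(zip(columns, r)) for r in cursor.fetchall()]
    # Keep the previous mark when nothing new arrived.
    new_mark = rows[-1][max_value_column] if rows else last_seen
    return rows, new_mark


def demo():
    # In-memory database standing in for the real source system (assumption).
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT)")
    conn.executemany(
        "INSERT INTO orders (id, item) VALUES (?, ?)", [(1, "a"), (2, "b")]
    )

    # First run: nothing seen yet, so every row is new.
    first, mark = fetch_new_rows(conn, "orders", "id", 0)

    # A new row arrives in the source table.
    conn.execute("INSERT INTO orders (id, item) VALUES (3, 'c')")

    # Second run: only rows above the stored mark come back.
    second, mark = fetch_new_rows(conn, "orders", "id", mark)
    return first, second, mark
```

Note that this approach only picks up new rows (or updated rows, if the tracking column is a last-modified timestamp). It cannot see deletes, which is one reason a dedicated CDC tool may still be the better fit.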
02-05-2020
02:32 AM
2 Kudos
Hi @carlaurrea, The Cloudera Data Platform (CDP) has a number of form factors, including a PaaS model in the cloud and a data center edition for on-premises deployments. I'm going to assume that your HDP 2.6.5 deployment is on-premises. The Cloudera Data Platform Data Center edition (CDP-DC) is already available for download and installation: https://www.cloudera.com/downloads.html However, it is not currently possible to do an in-place upgrade with this version. We are planning to support this shortly with the release of CDP-DC 7.1; once it is released, you will be able to perform an in-place upgrade from HDP 2.6.5. Regards, Steve
02-04-2020
07:39 AM
Hi @mranjank, In your setup, your 4 data nodes are your worker nodes. HDFS, i.e. where the data is stored, lives on the data nodes and, because of data locality, this is where your 'compute' or 'worker' roles run too. If you log into Cloudera Manager, click 'Hosts' at the top and then 'Roles' in the dropdown, you will see which services are running on which nodes. I hope that helps. Steve
02-04-2020
04:11 AM
Hi @piosobc, The CDH pricing is under the 'Enterprise Data Hub' column on this page: https://www.cloudera.com/products/pricing.html So yes, it's the same as CDP. But as @cjervis mentioned in an earlier post, it is best to discuss this with someone in our sales organization via the Contact Us page on Cloudera.com. I think Cloudera Express will work with more than 100 nodes for CDH / CM 6.0, as this limit was introduced in 6.1. However, I don't recommend this approach, as you would surely benefit from the enterprise-class features on a cluster of more than 100 nodes. Also, as I mentioned in my earlier post, Cloudera is moving away from shipping Cloudera Express editions. Your best option is to consider a supported version of CDH / CDP. Regards, Steve
02-03-2020
02:31 AM
2 Kudos
In part 1 of this series we talked about the growing relevance of streaming technologies and covered the need for existing Cloudera customers currently using Apache Flume to consider moving over to Cloudera DataFlow (CDF).
Cloudera DataFlow is an umbrella term that covers the streaming technologies from Cloudera. CDF is supported on CDH 5 / CDH 6 and HDP 2 / HDP 3. So there is nothing stopping customers adopting Cloudera DataFlow right now so that they are in a supported configuration for when they upgrade to the new Cloudera Data Platform (CDP).
CDF includes the technology to address a number of areas:
Edge Flow Management
Core Flow Management
Stream Processing
Streaming Analytics
Enterprise Services
A good summary of these components can be found in the blog post Introducing Cloudera DataFlow (CDF).
Cloudera DataFlow - Data-In-Motion Platform
If you are a traditional Cloudera customer using the Cloudera Distribution of Apache Kafka, there are a number of new and exciting management technologies available via CDF. For example, the Cloudera Streams Management component includes:
Cloudera Stream Messaging Manager which provides a visual and interactive user interface for managing topics in Apache Kafka.
Cloudera Streams Replication Manager for managing replication between Kafka clusters based on MirrorMaker 2.
However, Apache Flume has been replaced in CDF by Apache NiFi and MiNiFi. There are a number of benefits of using Apache NiFi / MiNiFi over Apache Flume:
It is very simple to use with an intuitive user interface. This enhances user productivity with a drag and drop approach to designing data pipelines rather than having to develop lots of lines of code and configuration files.
There are 290+ pre-built processors for data source connectivity, ingestion, transformation, and content routing.
NiFi supports NiFi Registry for version controlling dataflows and also supports the software development lifecycle (SDLC) when it comes to promoting flows from one environment to another, e.g. development to production.
Point-in-time capability - allowing you to go back to a previous point in time and inspect the data as it was at that point and replay it again downstream.
Scale-out architecture - adding more nodes increases the network and disk bandwidth for ingestion and transformation.
Data lineage and provenance are built-in features of Apache NiFi, with graphical information and metrics that describe data on its journey from source to target.
Cloudera Edge Management (CEM) provides a management user interface for deploying and managing MiNiFi agents on edge devices.
Continuous data delivery, streaming applications and real-time analysis are becoming increasingly important and more widely adopted as part of a data architecture strategy. However, so is the need to comply with data regulation and protection laws such as GDPR in the EU and CCPA in California. This is why technologies such as Apache NiFi, with graphical data pipelines and built-in support for data lineage and provenance, provide a strong framework to work towards meeting regulatory compliance requirements.
One reason customers adopt Cloudera technology is the portfolio of technology that we offer, all under a governed, secure and integrated data and analytics platform. This means that we can integrate and build differing streaming applications to address a variety of use cases. For example, Cloudera supports Apache HBase and Apache Kudu as backend storage for real-time applications. In addition, Cloudera Machine Learning means that we can build predictive models and manage and deploy them into streaming applications. This is why we describe Cloudera as an end-to-end Edge2AI platform.
01-31-2020
09:48 AM
1 Kudo
Hello, You cannot buy Cloudera Express, and from CDH 6.1 onwards it does not support more than 100 nodes. Please see this note: https://community.cloudera.com/t5/Product-Announcements/ANNOUNCE-Cloudera-Manager-6-1-Express-functionality-change/td-p/84180 It is also worth pointing out that with newer versions of the software Cloudera will not be shipping a Cloudera Express edition, although you will still be able to download the software and do a trial. The list prices for Cloudera's commercial products are here: https://www.cloudera.com/products/pricing.html Regards, Steve
01-31-2020
02:58 AM
1 Kudo
Hello, You can follow the steps outlined here. This is for CDH 6.3: https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/cm_mc_adding_hosts.html#cmug_topic_7_5_1__title_215 Regards, Steve
01-29-2020
07:53 AM
1 Kudo
Hello, Unfortunately, there is no way to independently upgrade Cloudera Hue within the VM. You would need to upgrade all of CDH 5.13 to the version that you require, e.g. CDH 6.3. As far as I am aware, there are no plans to provide new releases of the QuickStarts based on later versions of CDH. That said, it is quite straightforward to install a trial proof-of-concept CDH cluster (for example, you could do this on a single node using cloud-based infrastructure): https://docs.cloudera.com/documentation/enterprise/latest/topics/cm_ig_non_production.html#install_embedded_db Regards, Steve
01-29-2020
06:45 AM
Hi, No, CDH and the components under CDF are licensed separately. In your case, where you want to use Apache NiFi, you would license Cloudera Flow Management, which includes Apache NiFi and NiFi Registry. Regards, Steve
01-29-2020
12:31 AM
1 Kudo
Hi, Yes, NiFi is supported on both CDH 5 and CDH 6. Cloudera has a number of streaming products that are collectively referred to as Cloudera DataFlow. Cloudera DataFlow covers:
Edge Management
Cloudera Flow Management
Stream Processing
Streaming Analytics
Apache NiFi and NiFi Registry are covered under Cloudera Flow Management (CFM). You can find the Cloudera Flow Management documentation here: https://docs.cloudera.com/cfm/1.0.1/index.html This documentation includes the system requirements and the installation process for CFM, which would give you Apache NiFi in your environment. Regards, Steve