Member since
06-26-2015
Posts: 515
Kudos Received: 137
Solutions: 114
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 2022 | 09-20-2022 03:33 PM
 | 5602 | 09-19-2022 04:47 PM
 | 3037 | 09-11-2022 05:01 PM
 | 3353 | 09-06-2022 02:23 PM
 | 5297 | 09-06-2022 04:30 AM
08-11-2025
10:26 AM
@AlokKumar User authentication using OpenID Connect is documented here: OpenID Connect. If any of the provided solutions assisted you with your query, please take a moment to log in and click Accept as Solution below each response that helped. Thank you, Matt
07-21-2025
10:36 PM
Cloudera’s Data In Motion Team is pleased to announce the release of the Cloudera Streaming Analytics - Kubernetes Operator 1.3, an integral component of Cloudera Streaming - Kubernetes Operators. This release includes rebases to Apache Flink 1.20 and Apache Flink Kubernetes Operator 1.11.0. Other changes and updates focus on enhancing security and usability and making the product more robust.

Release Highlights

Rebase to Flink 1.20: For more information, see the Flink 1.20 Release Announcement and the Release Notes.

Rebase to Flink Kubernetes Operator 1.11.0: For more information, see the Apache Flink Kubernetes Operator 1.11.0 Release Announcement and the Release Notes.

Flink OpenTelemetry Metrics Reporter (Technical Preview): The OpenTelemetry metrics reporter is now included, in Technical Preview, in the operator image. It makes it easier and more efficient to aggregate job metrics to a central service, such as Prometheus or any OpenTelemetry-compatible service, using open standards. To learn more about using the OpenTelemetry reporter with Flink, see Using the OpenTelemetry Collector [Technical Preview].

Flink image with Hadoop, Hive, Iceberg, and Kudu connectors: Cloudera now offers alternative images for Cloudera Streaming Analytics - Kubernetes Operator that include popular connectors and their dependencies. This makes it easier to launch jobs right after installation and saves time because there is no need to create custom images. For more information, refer to the Operator's Installation Overview.

Basic authentication for the Flink UI and REST API: The Operator now includes the Flink Basic Authentication Handler, enabling users to secure their deployments without the need for external or third-party JARs.

Please see the Release Notes for the complete list of fixes and improvements.

Getting the New Release

To upgrade to Cloudera Streaming Analytics - Kubernetes Operator 1.3, check out the upgrade guide.
If you are installing this operator for the first time, consult the installation overview.

Use Cases

Event-Driven Applications: Stateful applications that ingest events from one or more event streams and react to incoming events by triggering computations, state updates, or external actions. Apache Flink excels at handling the concepts of time and state for these applications and can scale to manage very large data volumes (up to several terabytes). It has a rich set of APIs, ranging from low-level controls to high-level functionality such as Flink SQL, enabling developers to choose the most suitable options for implementing advanced business logic. One of Apache Flink’s most popular features for event-driven applications is its support for savepoints. A savepoint is a consistent state image that can be used as a starting point for compatible applications. With a savepoint, an application can be updated or rescaled, or multiple versions of an application can be started for A/B testing.

Examples:
Fraud detection
Anomaly detection
Rule-based alerting
Business process monitoring
Web application (social network)

Data Analytics Applications: With a sophisticated stream processing engine, analytics can be performed in real time. Streaming queries or applications ingest real-time event streams and continuously produce and update results as events are consumed. The results are written to an external database or maintained as internal state. A dashboard application can read the latest results from the external database or directly query the internal state of the application. Apache Flink supports streaming as well as batch analytical applications.

Examples:
Quality monitoring of telco networks
Analysis of product updates and experiment evaluation in mobile applications
Ad-hoc analysis of live data in consumer technology
Large-scale graph analysis

Data Pipeline Applications: Streaming data pipelines serve a similar purpose as Extract-Transform-Load (ETL) jobs. They transform and enrich data and can move it from one storage system to another. However, they operate in a continuous streaming mode instead of being periodically triggered. Hence, they can read records from sources that continuously produce data and move them with low latency to their destination.

Examples:
Real-time search index building in e-commerce
Continuous ETL in e-commerce

Resources

New - What’s New in Cloudera Streaming Analytics - Kubernetes Operator 1.3
Updated - Cloudera Streaming Analytics - Kubernetes Operator Documentation
Cloudera Stream Processing Product Page
Cloudera Kubernetes Operators documentation homepage
Accelerate Streaming Pipeline Deployments with New Kubernetes Operators (webinar recording)
Cloudera Stream Processing & Analytics Support Lifecycle Policy
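The OpenTelemetry metrics reporter introduced in this release is configured through Flink's standard metrics-reporter options. A minimal sketch follows; the factory class and option names below are assumptions for illustration only, so verify them against the Using the OpenTelemetry Collector [Technical Preview] documentation:

```yaml
# Hypothetical flink-conf.yaml fragment enabling an OpenTelemetry reporter.
# The factory class and option names are illustrative -- check them against
# the "Using the OpenTelemetry Collector [Technical Preview]" docs.
metrics.reporter.otel.factory.class: org.apache.flink.metrics.otel.OpenTelemetryMetricsReporterFactory
metrics.reporter.otel.exporter.endpoint: http://otel-collector:4317
metrics.reporter.otel.interval: 30 SECONDS
```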
06-07-2025
07:03 AM
Hi @araujo,

Thanks for the uploaded docker-compose.yml. I tried the steps described here, but the command "docker-compose up -d" failed on the proxy container because ./nginx.conf should have been a file. Could you elaborate on the volumes setting for the proxy? The proxy service from your repo is excerpted below:

  proxy:
    image: nginx:latest
    container_name: proxy
    # volumes:
    #   - ./nginx.conf:/etc/nginx/nginx.conf:ro
    ports:
      - "8443:8443"
    networks:
      - nifi
    depends_on:
      - nifi0
      - nifi1

Secondly, after I brought up all containers with the two lines above commented out, I cannot access this two-node NiFi cluster with the curl command below:

  $ curl -v https://localhost:8443/nifi
  *   Trying 127.0.0.1:8443...
  * Connected to localhost (127.0.0.1) port 8443 (#0)
  * ALPN, offering h2
  * ALPN, offering http/1.1
  *  CAfile: /etc/ssl/certs/ca-certificates.crt
  *  CApath: /etc/ssl/certs
  * TLSv1.0 (OUT), TLS header, Certificate Status (22):
  * TLSv1.3 (OUT), TLS handshake, Client hello (1):
  * TLSv1.0 (OUT), TLS header, Unknown (21):
  * TLSv1.3 (OUT), TLS alert, decode error (562):
  * error:0A000126:SSL routines::unexpected eof while reading
  * Closing connection 0
  curl: (35) error:0A000126:SSL routines::unexpected eof while reading

Thanks,
David
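For reference, in setups like this the missing ./nginx.conf is often a TLS-passthrough proxy, since NiFi terminates TLS itself. The sketch below is a hypothetical minimal config (not the author's actual file); the upstream names assume the nifi0/nifi1 services from the compose excerpt above:

```nginx
# Hypothetical nginx.conf for TLS passthrough to the NiFi nodes.
# NiFi terminates TLS itself, so we proxy raw TCP with the stream module
# instead of terminating TLS in nginx. Service names assume the compose file.
worker_processes 1;

events {
    worker_connections 1024;
}

stream {
    upstream nifi_cluster {
        server nifi0:8443;
        server nifi1:8443;
    }

    server {
        listen 8443;
        proxy_pass nifi_cluster;
    }
}
```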
04-29-2025
09:58 PM
The Data In Motion team is pleased to announce the release of Cloudera Streaming Analytics 1.15 for Cloudera 7.3.1. This release focuses on enhancing the user experience and adding important new features to the product, and includes improvements to SQL Stream Builder as well as a rebase to Apache Flink 1.20.1.

Release Highlights

Cloudera platform support: Cloudera Streaming Analytics 1.15 is supported on Cloudera 7.3.1.100 (Cumulative Hotfix 1). Review the 7.3.1.100 Release Notes and Support Matrix to understand which operating system, database, and JDK versions are supported for Cloudera Streaming Analytics.

Rebase to Apache Flink 1.20: Streaming analytics deployments, including SQL Stream Builder, now support Apache Flink 1.20. For more information on what is included in Apache Flink 1.20, see the Apache Flink 1.20 Release Announcement and Release Notes.

Support for batch-mode queries in Cloudera SQL Stream Builder: Users can select "batch" as the runtime mode for production-mode jobs that run in an isolated Flink cluster. For more information, see Executing SQL jobs in production mode.

OpenTelemetry Metrics Reporter (Technical Preview): Cloudera Streaming Analytics now includes, in Technical Preview, the OpenTelemetry Metrics Reporter to aggregate metrics to a third-party tool using open standards. To learn more about using the OpenTelemetry reporter with Flink, see the documentation for Flink and Cloudera SQL Stream Builder, as well as the Apache Flink documentation.

Python for table transformations and webhook connector: Support for Python UDFs in table transformations and the webhook connector has been added to Cloudera SQL Stream Builder and Flink. To learn more, see the Webhook Connector documentation.

Please see the Release Notes for the complete list of fixes and improvements.
Getting to the New Release

To upgrade to Cloudera Streaming Analytics 1.15.0 on Cloudera on premises, check out this upgrade guide. If you are using Cloudera on cloud, refer to the Upgrade advisor documentation.

Use Cases

Event-Driven Applications: Stateful applications that ingest events from one or more event streams and react to incoming events by triggering computations, state updates, or external actions. Apache Flink excels at handling the concepts of time and state for these applications and can scale to manage very large data volumes (up to several terabytes) with exactly-once consistency guarantees. Apache Flink’s support for event time, highly customizable window logic, and the fine-grained control of time provided by the ProcessFunction enable the implementation of advanced business logic, and it features a library for Complex Event Processing (CEP) to detect patterns in data streams. Apache Flink’s outstanding feature for event-driven applications, however, is its support for savepoints. A savepoint is a consistent state image that can be used as a starting point for compatible applications. Given a savepoint, an application can be updated or rescaled, or multiple versions of an application can be started for A/B testing.

Examples:
Fraud detection
Anomaly detection
Rule-based alerting
Business process monitoring
Web application (social network)

Data Analytics Applications: With a sophisticated stream processing engine, analytics can also be performed in real time. Streaming queries or applications ingest real-time event streams and continuously produce and update results as events are consumed. The results are written to an external database or maintained as internal state. A dashboard application can read the latest results from the external database or directly query the internal state of the application. Apache Flink supports streaming as well as batch analytical applications.

Examples:
Quality monitoring of telco networks
Analysis of product updates and experiment evaluation in mobile applications
Ad-hoc analysis of live data in consumer technology
Large-scale graph analysis

Data Pipeline Applications: Streaming data pipelines serve a similar purpose as Extract-Transform-Load (ETL) jobs. They transform and enrich data and can move it from one storage system to another. However, they operate in a continuous streaming mode instead of being periodically triggered. Hence, they can read records from sources that continuously produce data and move them with low latency to their destination.

Examples:
Real-time search index building in e-commerce
Continuous ETL in e-commerce

Resources

New - What’s New in Cloudera Streaming Analytics 1.15
Updated - Cloudera Streaming Analytics Documentation
Cloudera Stream Processing Product Page
Cloudera Kubernetes Operators documentation homepage
Cloudera Stream Processing Community Edition
Accelerate Streaming Pipeline Deployments with New Kubernetes Operators (webinar recording)
Updated - Cloudera Stream Processing & Analytics Support Lifecycle Policy
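The batch runtime mode described under Release Highlights maps to Flink's standard execution.runtime-mode setting. In plain Flink SQL it can be selected as sketched below; Cloudera SQL Stream Builder exposes an equivalent choice in its production execution mode settings, and the table name here is illustrative:

```sql
-- Select batch runtime mode for the session (plain Flink SQL syntax;
-- SSB exposes this as a job-level setting). Table name is illustrative.
SET 'execution.runtime-mode' = 'batch';

SELECT customer_id, SUM(amount) AS total_spend
FROM orders
GROUP BY customer_id;
```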
03-12-2025
09:33 PM
Release Highlights

Rebase on Flink 1.19.2: For more information, see the Flink 1.19.2 Release Announcement and the Release Notes.

General availability of Cloudera SQL Stream Builder: Cloudera SQL Stream Builder, previously in Technical Preview, is now generally available in Cloudera Streaming Analytics - Kubernetes Operator. It is a comprehensive interactive user interface for creating stateful stream processing jobs using SQL. For more information about SQL Stream Builder and its features, see the Getting started with SQL Stream Builder page.

LDAP authentication in SQL Stream Builder: LDAP-based authentication is available for Cloudera SQL Stream Builder. For more information, refer to LDAP authentication.

Custom truststores: You can now specify custom truststores when installing the Cloudera Streaming Analytics - Kubernetes Operator. For more information, refer to Security configurations.

Secure TLS connections to the SSB UI and API: Users can connect to Cloudera SQL Stream Builder and its API via a TLS-encrypted channel. For more information, refer to Routing with ingress.

Python UDFs for table transformations and the webhook connector: Due to the deprecation of JavaScript user-defined functions (UDFs) in Cloudera Streaming Analytics - Kubernetes Operator (see Deprecation notices), support for Python UDFs in table transformations and the webhook connector has been added to Cloudera SQL Stream Builder. For more information, refer to Creating Python User-defined Functions.

Configurable requests and limits for all resources: Resource requests and limits can now be defined in the resource configuration. For more information, refer to Resource requests and limits.

Please see the Release Notes for the complete list of fixes and improvements.

Getting to the New Release

To upgrade to Cloudera Streaming Analytics - Kubernetes Operator 1.2, check out this upgrade guide. If you are installing for the first time, use this installation overview.
Use Cases

Event-Driven Applications: Stateful applications that ingest events from one or more event streams and react to incoming events by triggering computations, state updates, or external actions. Apache Flink excels at handling the concepts of time and state for these applications and can scale to manage very large data volumes (up to several terabytes). It has a rich set of APIs, ranging from low-level controls to high-level functionality such as Flink SQL, enabling developers to choose the most suitable options for implementing advanced business logic. Apache Flink’s outstanding feature for event-driven applications is its support for savepoints. A savepoint is a consistent state image that can be used as a starting point for compatible applications. Given a savepoint, an application can be updated or rescaled, or multiple versions of an application can be started for A/B testing.

Examples:
Fraud detection
Anomaly detection
Rule-based alerting
Business process monitoring
Web application (social network)

Data Analytics Applications: With a sophisticated stream processing engine, analytics can be performed in real time. Streaming queries or applications ingest real-time event streams and continuously produce and update results as events are consumed. The results are written to an external database or maintained as internal state. A dashboard application can read the latest results from the external database or directly query the internal state of the application. Apache Flink supports streaming as well as batch analytical applications.

Examples:
Quality monitoring of telco networks
Analysis of product updates and experiment evaluation in mobile applications
Ad-hoc analysis of live data in consumer technology
Large-scale graph analysis

Data Pipeline Applications: Streaming data pipelines serve a similar purpose as Extract-Transform-Load (ETL) jobs. They transform and enrich data and can move it from one storage system to another. However, they operate in a continuous streaming mode instead of being periodically triggered. Hence, they can read records from sources that continuously produce data and move them with low latency to their destination.

Examples:
Real-time search index building in e-commerce
Continuous ETL in e-commerce

Public Resources

New - What’s New in Cloudera Streaming Analytics - Kubernetes Operator 1.2
Updated - Cloudera Streaming Analytics - Kubernetes Operator Documentation
Cloudera Stream Processing Product Page
Cloudera Kubernetes Operators documentation homepage
Cloudera Stream Processing Community Edition
Accelerate Streaming Pipeline Deployments with New Kubernetes Operators (webinar recording)
Updated - Cloudera Stream Processing & Analytics Support Lifecycle Policy
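As an illustration of the high-level Flink SQL functionality mentioned in the use cases above, a streaming analytics query over a one-minute tumbling window might look like the following sketch; the table and column names are hypothetical:

```sql
-- Continuous per-minute aggregation over an event stream using a
-- tumbling-window table-valued function. Names are hypothetical.
SELECT window_start, window_end, item_id, COUNT(*) AS order_count
FROM TABLE(
    TUMBLE(TABLE orders_stream, DESCRIPTOR(order_time), INTERVAL '1' MINUTE))
GROUP BY window_start, window_end, item_id;
```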
03-10-2025
10:01 PM
Cloudera’s Data In Motion Team is pleased to announce the release of Cloudera Streams Messaging - Kubernetes Operator 1.3, an integral component of Cloudera Streaming - Kubernetes Operator. With this release, customers receive a rebase to Kafka 3.9, automatic cluster rebalancing, better offset management capabilities for Kafka connectors, and more!

Release Highlights

Rebase to Kafka 3.9: For more information, see the Kafka 3.9 Release Notes and the list of notable changes.

Rebase to Strimzi 0.45.0: For more information, see the Strimzi 0.44.0 Release Notes and Strimzi 0.45.0 Release Notes.

KRaft (Kafka Raft) is generally available: You can now deploy Kafka clusters that use KRaft instead of ZooKeeper for metadata management, and you can migrate existing ZooKeeper-based Kafka clusters to KRaft. With the addition of KRaft, ZooKeeper mode is deprecated: deploying new Kafka clusters in ZooKeeper mode or using existing ones is deprecated, and ZooKeeper support will be removed in a future release. When deploying new Kafka clusters, deploy them in KRaft mode; Cloudera encourages you to migrate existing clusters to KRaft. For cluster deployment instructions, see Deploying a Kafka cluster. For migration instructions, see Migrating Kafka clusters from ZooKeeper to KRaft.

Auto-rebalancing when scaling the cluster: You can now enable auto-rebalancing for Kafka clusters. If auto-rebalancing is enabled, the Strimzi Cluster Operator automatically initiates a rebalance with Cruise Control when you scale the Kafka cluster. Cloudera recommends enabling this feature, as it makes scaling easier and faster. For more information, see Scaling brokers.

Offset management through KafkaConnector resources: Connector offsets can now be managed directly by configuring your KafkaConnector resources. Cloudera recommends this feature over the Kafka Connect REST API for managing connector offsets.
For more information, see Managing connector offsets and Configuring data replication offsets. The recommended method for managing replication offsets when replicating data with Kafka Connect-based replication has also changed; these documents describe the recommended approaches.

Please see the Release Notes for the complete list of fixes and improvements.

Getting to the New Release

To upgrade to Cloudera Streams Messaging - Kubernetes Operator 1.3, check out this upgrade guide. If you are installing for the first time, use this installation overview.

Use Cases

Flexible, agile, and rapid Kafka deployments: Deploy Apache Kafka in seconds on existing Kubernetes infrastructure. Cloudera Streams Messaging - Kubernetes Operator has very lightweight dependencies and system requirements for Kafka-centric deployments. It simplifies and standardizes Kafka deployments and provides auto-scaling support for variable workloads.

Operational efficiency with simple upgrades: The complexity of Kafka rolling upgrades is handled by Cloudera Streams Messaging - Kubernetes Operator, making them simpler and safer to execute.

Loading and unloading data from Kafka: Kafka Connect gives Kafka users a simple way to access data quickly from a source and feed it to a Kafka topic. It also allows them to get data from a topic and copy it to an external destination. Cloudera Streams Messaging - Kubernetes Operator includes Kafka Connect support to give our customers a tool for moving data in and out of Kafka efficiently.

Replicating data to other sites: Disaster resilience is an important aspect of any Kafka production deployment. Cloudera Streams Messaging - Kubernetes Operator supports configuring and running Kafka replication flows across any two Kafka clusters. These clusters could be in different data centers to provide increased resilience against disasters.

Kafka migrations: Customers can migrate or replicate data between containerized Kafka clusters and on-premises or cloud-based clusters. Using Cloudera Streams Messaging - Kubernetes Operator, data can be replicated in any direction and between two or more clusters at a time.

Resources

New - What’s New in Cloudera Streams Messaging - Kubernetes Operator 1.3
Updated - Cloudera Streams Messaging - Kubernetes Operator Documentation
Cloudera Stream Processing Product Page
Cloudera Kubernetes Operators documentation homepage
Cloudera Stream Processing Community Edition
Accelerate Streaming Pipeline Deployments with New Kubernetes Operators (webinar recording)
Updated - Cloudera Stream Processing & Analytics Support Lifecycle Policy
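Deploying a KRaft-mode cluster as described above is done declaratively through Strimzi custom resources. The sketch below follows the upstream Strimzi node-pool pattern; the resource names, sizes, and annotation spellings are assumptions and should be verified against the Deploying a Kafka cluster documentation:

```yaml
# Sketch of a KRaft-mode Kafka cluster using a Strimzi node pool.
# Names and annotations follow upstream Strimzi conventions and should
# be checked against the Cloudera deployment documentation.
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaNodePool
metadata:
  name: dual-role
  labels:
    strimzi.io/cluster: my-cluster
spec:
  replicas: 3
  roles:
    - controller
    - broker
  storage:
    type: ephemeral
---
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: my-cluster
  annotations:
    strimzi.io/node-pools: enabled
    strimzi.io/kraft: enabled
spec:
  kafka:
    version: 3.9.0
    listeners:
      - name: plain
        port: 9092
        type: internal
        tls: false
  entityOperator: {}
```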
01-16-2025
05:31 AM
Hi @SAMSAL @MaarufB, I had the same problem but solved it. The root cause was the CSRF security filter, which resulted in a 403 error. You can resolve this by setting the Request-Token header with the cookie value obtained during access token issuance. For more details, please refer to the following link: NiFi Administration Guide - CSRF Protection.

Hope this helps!
12-23-2024
04:20 PM
We are pleased to announce the release of Cloudera Streams Messaging - Kubernetes Operator 1.2. With this release, customers receive better security integration and an update to Kafka 3.8, among other improvements.

Release Highlights

Rebase on Kafka 3.8: For more information, see the Kafka 3.8 Release Notes and the list of notable changes.

Rebase on Strimzi 0.43.0: For more information, see the Strimzi 0.43.0 Release Notes.

Apache Ranger authorization: Support for Apache Ranger authorization is now available. Customers can integrate Kafka clusters deployed with Cloudera Streams Messaging - Kubernetes Operator with a remote Ranger service running on Cloudera Private Cloud Base. If configured, the Ranger service provides authorization for your Kafka cluster. For more information, see Apache Ranger authorization.

Improvements to Kafka replication: Rebased and backported changes make Kafka replication more resilient and reliable when handling heartbeats and offset translation.

Performance improvements for the Cloudera diagnostics tool: The report.sh tool, used by clients to provide Cloudera support with key information when dealing with support cases, now runs its subprocesses in parallel, accelerating run times. For more information, see Diagnostics.

For the complete list of fixes and improvements, read these Release Notes.

Getting to the New Release

To upgrade to Cloudera Streams Messaging - Kubernetes Operator 1.2, check out this upgrade guide. Please note: if you are installing for the first time, use this installation overview.

Use Cases

Flexible, agile, and rapid Kafka deployments: Deploy Apache Kafka in seconds on existing Kubernetes infrastructure. Cloudera Streams Messaging - Kubernetes Operator has very lightweight dependencies and system requirements for Kafka-centric deployments. It simplifies and standardizes Kafka deployments and provides auto-scaling support for variable workloads.

Operational efficiency with simple upgrades: The complexity of Kafka rolling upgrades is handled by Cloudera Streams Messaging - Kubernetes Operator, making them simpler and safer to execute.

Loading and unloading data from Kafka: Kafka Connect gives Kafka users a simple way to access data quickly from a source and feed it to a Kafka topic. It also allows them to get data from a topic and copy it to an external destination. The operator includes Kafka Connect support to give our customers a tool for moving data in and out of Kafka efficiently.

Replicating data to other sites: Disaster resilience is an important aspect of any Kafka production deployment. Cloudera Streams Messaging - Kubernetes Operator supports configuring and running Kafka replication flows across any two Kafka clusters. These clusters could be in the same or in different data centers to provide increased resilience against disasters.

Kafka migrations: Customers can migrate or replicate data between containerized Kafka clusters and on-premises or cloud-based clusters. Using Cloudera Streams Messaging - Kubernetes Operator, data can be replicated in any direction and between two or more clusters at a time.

Public Resources

New - What’s New in Cloudera Stream Operator 1.2
Updated - Cloudera Streams Messaging - Kubernetes Operator Documentation
Cloudera Stream Processing Product Page
Cloudera Kubernetes Operators documentation homepage
Cloudera Stream Processing Community Edition
Accelerate Streaming Pipeline Deployments with New Kubernetes Operators (webinar recording)
Updated - Cloudera Stream Processing & Analytics Support Lifecycle Policy
12-16-2024
07:17 PM
1 Kudo
We are pleased to announce the release of Cloudera Streaming Analytics 1.14 for Cloudera Public Cloud and Cloudera Private Cloud Base 7.3.1. This release includes improvements to SQL Stream Builder as well as an update to Apache Flink 1.19.1.

Release Highlights

Rebase to Apache Flink 1.19.1: Streaming analytics deployments, including SQL Stream Builder, now support Apache Flink 1.19.1, which includes the Apache Flink improvements below. For more information on these improvements and deprecations, please check the Apache Flink 1.19.1 release announcement.

Custom parallelism for Table/SQL sources: The DataGen connector now supports setting custom parallelism for performance tuning via the scan.parallelism option. Support for other connectors will come in future releases.

Configure different state time-to-live (TTL) values using a SQL hint: Users now have a more flexible way to specify custom TTL values for the state of regular joins and group aggregations directly within their queries, using the STATE_TTL hint.

Named parameters: Named parameters can now be used when calling a function or stored procedure in Flink SQL.

Support for SESSION window table-valued functions (TVFs) in streaming mode: Users can now use SESSION window TVFs in streaming mode.

Support for changelog inputs for window TVF aggregation: Window aggregation operators can now handle changelog streams (for example, Change Data Capture [CDC] data sources).

New UDF type AsyncScalarFunction: The new AsyncScalarFunction is a user-defined asynchronous ScalarFunction that allows issuing concurrent function calls asynchronously.

MiniBatch optimization for regular joins: The new mini-batch optimization can be used in regular joins to reduce intermediate results, especially in cascading join scenarios.

Dynamic source parallelism inference for batch jobs: Allows source connectors to dynamically infer the parallelism based on the actual amount of data to consume.

Standard YAML for Apache Flink configuration: Apache Flink has officially introduced full support for the standard YAML 1.2 syntax in the configuration file.

Profiling the JobManager/TaskManager on the Apache Flink web UI: Support for triggering profiling at the JobManager/TaskManager level.

New config options for administrator Java Virtual Machine (JVM) options: A set of administrator JVM options is available to prepend the user-set JVM options with default values for platform-wide JVM tuning.

Larger checkpointing interval when the source is processing backlog: Users can set execution.checkpointing.interval-during-backlog to use a larger checkpoint interval to enhance throughput while the job is processing backlog, if the source is backlog-aware.

CheckpointsCleaner cleans individual checkpoint states in parallel: When disposing of no-longer-needed checkpoints, every state handle/state file is disposed of in parallel for better performance.

Trigger checkpoints through the command line client: The command line interface supports triggering a checkpoint manually.

New interfaces to SinkV2 that are consistent with the Source API.

New committer metrics to track the status of committables.

Support for Python user-defined functions (UDFs) in SQL Stream Builder: The current JavaScript UDFs in SQL Stream Builder will not work on Java 17 and later versions due to the deprecation and removal of the Nashorn engine from the Java Development Kit (JDK). The addition of Python UDFs to SQL Stream Builder allows customers to use Python to create new UDFs that will continue to be supported on future JDKs. JavaScript UDFs are deprecated in this release and will be removed in a future release. Cloudera recommends that customers use Python UDFs for all new development and start migrating their JavaScript UDFs to Python UDFs to prepare for future upgrades. Note: Currently, Cloudera Streaming Analytics 1.14 only supports JDK versions 8 and 11.
SQL Stream Builder support for load balancing via Knox for HA deployments: Knox now automatically discovers and provides a load-balanced endpoint for SQL Stream Builder when multiple instances of the streaming engine are deployed.

Global logging configuration for all SSB jobs: A new global settings view enables default logging configurations to be set by the administrator. These settings are applied to all streaming jobs by default and can be overridden at the job level, ensuring that a consistent logging standard is applied by default for all users and developers.

Please see the Release Notes for the complete list of fixes and improvements.

Use Cases

Event-Driven Applications: Stateful applications that ingest events from one or more event streams and react to incoming events by triggering computations, state updates, or external actions. Apache Flink excels at handling the concepts of time and state for these applications and can scale to manage very large data volumes (up to several terabytes) with exactly-once consistency guarantees. Apache Flink’s support for event time, highly customizable window logic, and the fine-grained control of time provided by the ProcessFunction enable the implementation of advanced business logic, and it features a library for Complex Event Processing (CEP) to detect patterns in data streams. Apache Flink’s outstanding feature for event-driven applications, however, is its support for savepoints. A savepoint is a consistent state image that can be used as a starting point for compatible applications. Given a savepoint, an application can be updated or rescaled, or multiple versions of an application can be started for A/B testing.

Examples:
Fraud detection
Anomaly detection
Rule-based alerting
Business process monitoring
Web application (social network)

Data Analytics Applications: With a sophisticated stream processing engine, analytics can also be performed in real time. Streaming queries or applications ingest real-time event streams and continuously produce and update results as events are consumed. The results are written to an external database or maintained as internal state. A dashboard application can read the latest results from the external database or directly query the internal state of the application. Apache Flink supports streaming as well as batch analytical applications.

Examples:
Quality monitoring of telco networks
Analysis of product updates and experiment evaluation in mobile applications
Ad-hoc analysis of live data in consumer technology
Large-scale graph analysis

Data Pipeline Applications: Streaming data pipelines serve a similar purpose as Extract-Transform-Load (ETL) jobs. They transform and enrich data and can move it from one storage system to another. However, they operate in a continuous streaming mode instead of being periodically triggered. Hence, they can read records from sources that continuously produce data and move them with low latency to their destination.

Examples:
Real-time search index building in e-commerce
Continuous ETL in e-commerce

Getting to the New Release

To upgrade to Cloudera Streaming Analytics 1.14, first ensure that your Cloudera Private Cloud Base environment is already upgraded to version 7.3.1 and then follow the instructions in the Cloudera Streaming Analytics upgrade guide.
Resources

New - What’s New in Cloudera Streaming Analytics 1.14
Updated - Cloudera Streaming Analytics Documentation
Updated - Cloudera Stream Processing Product Page
Cloudera Stream Processing Community Edition
Accelerate Streaming Pipeline Deployments with New Kubernetes Operators (webinar recording)
Updated - Cloudera Stream Processing & Analytics Support Lifecycle Policy
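The STATE_TTL hint mentioned in the release highlights above lets a query keep join state for each input alive for a different duration. A sketch with illustrative table names, following the upstream Flink 1.19 hint syntax:

```sql
-- Keep Orders join state for 1 day and Customers state for 20 days
-- (Flink 1.19 STATE_TTL hint syntax; table names are illustrative).
SELECT /*+ STATE_TTL('o' = '1d', 'c' = '20d') */
    o.order_id,
    c.customer_name
FROM Orders o
JOIN Customers c ON o.customer_id = c.customer_id;
```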