Member since: 02-01-2022
Posts: 288
Kudos Received: 103
Solutions: 60
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1266 | 05-15-2025 05:45 AM |
| | 5297 | 06-12-2024 06:43 AM |
| | 8296 | 04-12-2024 06:05 AM |
| | 6138 | 12-07-2023 04:50 AM |
| | 3415 | 12-05-2023 06:22 AM |
05-10-2023 05:56 AM
@zzeng Great article. Reach out to me on internal channels. I would love to show you my Oracle to Kudu demo, using Kafka and Schema Registry.
05-04-2023 06:30 AM
You do not install the Cloudera version on your laptop 🙂 You need Cloudera DataFlow for Public Cloud (CDF-PC), which means a license and some services. As @steven-matison has already given you the perfect answer to your question, he may also be able to help you further with everything you need to know about Cloudera DataFlow and its Public Cloud offering. Unfortunately, I am still learning what Cloudera offers and how, so I am not the best person to answer your question. If you are going to use NiFi for real data processing, I strongly recommend having a look at Cloudera DataFlow, as it will solve many issues and headaches 🙂
04-14-2023 09:28 AM
1 Kudo
@kishan1 If you click into any property in the NiFi UI, it will indicate whether parameters are accepted or not. This SAS Token property does accept parameters. Reference:
03-31-2023 12:35 PM
Hello, thanks for your help. I tried your proposal of using --useSSL=false, but it did not work for me:

Unrecognized argument: --useSSL=false
Unrecognized argument: -useSSL=false

I solved my issue by using:

java version "1.8.0_60"
Java(TM) SE Runtime Environment (build 1.8.0_60-b27)
Java HotSpot(TM) 64-Bit Server VM (build 25.60-b23, mixed mode)

instead of:

openjdk version "1.8.0_352"
OpenJDK Runtime Environment (build 1.8.0_352-b08)
OpenJDK 64-Bit Server VM (build 25.352-b08, mixed mode)
03-30-2023 04:25 AM
Hi @ShobhitSingh

You need to adjust the CSV file.

sample.csv
=========
COL1|COL2|COL3|COL4
1st Data|2nd|3rd data|4th data
1st Data|2nd \\P data|3rd data|4th data
"1st Data"|"2nd '\\P' data"|"3rd data"|"4th data"
"1st Data"|"2nd '\\\\P' data"|"3rd data"|"4th data"

Spark Code:
spark.read.format("csv").option("header","true").option("inferSchema","true").option("delimiter","|").load("/tmp/sample.csv").show(false)

Output:
+--------+--------------+----------+--------+
|COL1    |COL2          |COL3      |COL4    |
+--------+--------------+----------+--------+
|1st Data|2nd           |3rd data  |4th data|
|1st Data|2nd \\P data  |3rd data  |4th data|
|1st Data|2nd '\P' data |3rd data  |4th data|
|1st Data|2nd '\\P' data|3rd data  |4th data|
+--------+--------------+----------+--------+
03-21-2023 06:16 AM
Yes, I am doing the same. How can I maintain the order of the dynamic properties in the InvokeHTTP processor? The order matters; otherwise the request fails with a bad request error. Whenever I add the properties, they are automatically arranged in alphabetical order instead of the order in which I added them.
03-20-2023 05:48 AM
@Fahmihamzah84 This appears to be an issue with your schema. The BigQuery error suggests a problem casting a string into a collection (array, list, etc.). It is hard to tell which array is causing the issue, as there are many. My suggestion is to set the processor's log level to DEBUG and see if you can get a more verbose error. This will help you figure out which field or fields are the culprit. Keep in mind it could be one of the empty arrays too.

I do not suggest the following as a solution, just as a path to figuring out where the problem is. Sometimes, when I have issues with type casting, I temporarily make everything a string during development. If you do this carefully, one field at a time, the field you changed when the error goes away is the culprit. This also gives you a working state for your flow, an operational base from which to work toward the final schema in the format you need.
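A rough sketch of that one-field-at-a-time approach, in Python. Everything in it is hypothetical: `toy_accepts` only stands in for the real write to BigQuery, and the field names and the `string`/`array` type labels are made up for illustration. It is not NiFi or BigQuery API code, just the search logic.

```python
from typing import Any, Callable, Dict, Optional

Record = Dict[str, Any]
Schema = Dict[str, str]  # field name -> "string" or "array" (illustrative labels)


def toy_accepts(record: Record, schema: Schema) -> bool:
    """Hypothetical stand-in for the real write to the target table.

    An "array" column rejects anything that is not a list; a "string"
    column accepts any value, as if it were serialized to text first.
    """
    for field, field_type in schema.items():
        if field_type == "array" and not isinstance(record.get(field), list):
            return False
    return True


def find_culprit(
    record: Record,
    schema: Schema,
    accepts: Callable[[Record, Schema], bool] = toy_accepts,
) -> Optional[str]:
    """Relax non-string columns to "string" one at a time, cumulatively.

    Returns the field whose relaxation made the error go away, or None if
    the record was accepted from the start or was never accepted at all.
    """
    relaxed = dict(schema)  # work on a copy; the real schema stays untouched
    if accepts(record, relaxed):
        return None
    for field, field_type in schema.items():
        if field_type == "string":
            continue
        relaxed[field] = "string"
        if accepts(record, relaxed):
            return field
    return None


# Made-up example: "labels" arrives as a plain string but is declared as an array.
schema = {"id": "string", "tags": "array", "labels": "array", "owners": "array"}
record = {"id": "1", "tags": ["a", "b"], "labels": "not-a-list", "owners": []}
print(find_culprit(record, schema))  # prints: labels
```

If more than one field is bad, the error only clears once the last of them has been relaxed, so it is worth re-running the check after fixing each field.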
03-16-2023 08:16 AM
1 Kudo
Awesome news, +2 solutions here.
03-02-2023 06:44 AM
1 Kudo
@fahed What you see with the CDP Public Cloud Data Hubs using GCS (or another object store) is a modernization of the platform around object storage. This removes differences across AWS, Azure, and on-prem (when Ozone is used). The change was driven by customer demand, so that workloads can be built and deployed with minimal changes from on-prem to cloud or cloud to cloud. Unfortunately, that creates the difference you describe above, but those are risks we are willing to take ourselves in favor of a modern data architecture. If you are looking for performance, you should take a look at some of the newer options for databases: Impala and Kudu (this one uses local disk). We also have Iceberg coming into this space.