Member since: 02-01-2022
Posts: 269
Kudos Received: 95
Solutions: 59
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1910 | 06-12-2024 06:43 AM |
| | 2667 | 04-12-2024 06:05 AM |
| | 1977 | 12-07-2023 04:50 AM |
| | 1178 | 12-05-2023 06:22 AM |
| | 2077 | 11-28-2023 10:54 AM |
03-16-2023
05:29 AM
1 Kudo
@Sivagopal Check out this post for a similar scenario. It includes a solution: https://community.cloudera.com/t5/Support-Questions/Nifi-1-16-fails-to-start-with-Decryption-exception/m-p/358190
03-13-2023
07:30 AM
@Meeran Going out on a limb, but I think the conflict is related to the "logicalType" of uuid not preparing the downstream value in the format Cassandra expects. The error seems to indicate the logical type of the string is in the wrong format. I believe the error is at the Cassandra driver level, not with the schema or NiFi itself. One suggestion: make that field a plain string (without the logical type) and see if the driver accepts it. If it does, then you know the issue is just related to the logicalType handling. As long as your UUID is not manipulated, you do not need to re-confirm that it is actually a UUID in the NiFi data flow.
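If it helps, here is a minimal sketch of that test, written as Python dicts for readability (in your Avro schema these would be the equivalent JSON; the field name "id" is an assumption):

```python
# Current definition, with the logical type attached:
field_with_logical_type = {
    "name": "id",
    "type": {"type": "string", "logicalType": "uuid"},
}

# Test variant: plain string, no logicalType, to see if the driver accepts it.
field_as_plain_string = {
    "name": "id",
    "type": "string",
}
```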
03-13-2023
07:18 AM
@larsfrancke Unfortunately I do not have the exact solution or information you need. However, I do have multiple customers who have gotten their CDP on Isilon kerberized and into production. There were some tickets on our support side walking through the Kerberos setup, but the specific technical solution came from Dell's side, since this is a supported solution for Isilon. My recommendation is to work with Cloudera Support to see if they have suggestions, and then work with Dell Support coming out of that. Your Cloudera account team and Dell partner should have access to deeper resources if both support teams cannot resolve it.
03-02-2023
06:44 AM
1 Kudo
@fahed What you see with the CDP Public Cloud Data Hubs using GCS (or any object store) is a modernization of the platform around object storage. This removes differences across AWS, Azure, and on-prem (when Ozone is used). It is a change driven by customer demand, so that workloads can be built and deployed with minimal changes from on-prem to cloud or cloud to cloud. Unfortunately that creates the difference you describe above, but those are risks we are willing to take ourselves in favor of a modern data architecture. If you are looking for performance, you should take a look at some of the newer options for databases: Impala and Kudu (the latter uses local disk). We also have Iceberg coming into this space.
03-01-2023
04:14 AM
Nice and Quick! Excellent!
03-01-2023
04:13 AM
@Pierro6AS The first thing you should do is increase the backpressure thresholds on the connection (queue). The defaults are quite low (10,000 objects and 1 GB). It is possible to see this error if flowfiles have been sitting in the queue for too long. It is also possible to see it if the file system has other usage outside of NiFi. For best performance, NiFi's backing repositories (the content and flowfile repositories) should sit on dedicated disks that are larger than the demand of the flow (especially during heavy, unexpected volume). You can find more about this in these posts: https://community.cloudera.com/t5/Support-Questions/Unable-to-write-flowfile-content-to-content-repository/td-p/346984 https://community.cloudera.com/t5/Support-Questions/Problem-with-Merge-Content-Processor-after-switch-to-v-1-16/m-p/346096/highlight/true#M234750
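For reference, the backpressure defaults and the repository locations live in nifi.properties. A minimal sketch is below; the paths are assumptions for your environment, and note that the two backpressure properties only set the default for newly created connections (existing connections are changed per-connection in the UI):

```
# Default backpressure thresholds applied to new connections
nifi.queue.backpressure.count=10000
nifi.queue.backpressure.size=1 GB

# Put these repositories on dedicated disks sized above peak flow demand
# (paths below are hypothetical examples)
nifi.content.repository.directory.default=/data1/nifi/content_repository
nifi.flowfile.repository.directory=/data2/nifi/flowfile_repository
```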
02-24-2023
05:59 AM
@saketa Magic sauce right here, great article!!
02-24-2023
05:55 AM
1 Kudo
@kishan1 In order to restart a specific process group you will need to use some command-line magic against the NiFi REST API. For example, this could be done by using a call to stop the process group, then restarting NiFi, then starting the process group again. You can certainly be creative in how you handle that approach once you have experimented with the API. https://nifi.apache.org/docs/nifi-docs/rest-api/index.html
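Here is a minimal sketch of that sequence, assuming an unsecured/dev instance (on a secured cluster you would also fetch and send a bearer token); the host and process group id are placeholders:

```python
import time
import requests

NIFI_API = "http://nifi-host:8080/nifi-api"
PG_ID = "your-process-group-id"  # visible in the URL when the group is selected

def set_group_state(state):
    # PUT /flow/process-groups/{id} schedules/unschedules everything in the group
    resp = requests.put(f"{NIFI_API}/flow/process-groups/{PG_ID}",
                        json={"id": PG_ID, "state": state})
    resp.raise_for_status()

set_group_state("STOPPED")   # stop the process group
# restart NiFi out of band here, e.g. bin/nifi.sh restart, and wait for it
time.sleep(120)              # crude wait; better to poll the API until it responds
set_group_state("RUNNING")   # start the process group back up
```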
02-24-2023
05:51 AM
@mmaher22 You may want to run the Python job inside of ExecuteScript. In this manner, you can push output downstream during your loop's iterations with session.commit(). This call is implied at the end of the code execution in ExecuteScript, which sends output to the next processor (one flowfile). So if you put it inline with your loop, the script will run and send a flowfile for every iteration. For a full rundown of how to use ExecuteScript, be sure to see these great articles: https://community.hortonworks.com/articles/75032/executescript-cookbook-part-1.html https://community.hortonworks.com/articles/75545/executescript-cookbook-part-2.html https://community.hortonworks.com/articles/77739/executescript-cookbook-part-3.html
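A rough Jython sketch of the pattern (the loop body and payload here are placeholders; session and REL_SUCCESS are provided by ExecuteScript):

```python
from org.apache.nifi.processor.io import OutputStreamCallback

class WriteText(OutputStreamCallback):
    """Writes a string into the flowfile's content."""
    def __init__(self, text):
        self.text = text
    def process(self, outputStream):
        outputStream.write(self.text.encode('utf-8'))

for i in range(10):                      # stand-in for your real loop
    ff = session.create()                # one new flowfile per iteration
    ff = session.write(ff, WriteText('result %d' % i))
    session.transfer(ff, REL_SUCCESS)
    session.commit()                     # ship it downstream immediately
```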
02-23-2023
05:09 AM
1 Kudo
@fahed That size is there so the cluster can grow and serve workloads in a production manner. At first, that disk usage could be low. For Data Hubs, my recommendation is to start small and grow as needed. Most of your workload data should be in object store(s) for the Data Hubs, so don't think of that "hdfs" disk as being size-constrained at the initial creation of the hub.