Member since: 08-08-2024
Posts: 108
Kudos Received: 27
Solutions: 10
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 416 | 04-15-2026 11:56 AM |
| | 996 | 04-07-2026 02:00 PM |
| | 434 | 03-12-2026 09:53 AM |
| | 457 | 03-04-2026 03:07 PM |
| | 651 | 02-10-2026 07:31 PM |
02-04-2026
09:02 AM
Hello @SalimAlhajri, Transparent Authentication is built into Cloudera AI. When the application starts, the REMOTE-USER and REMOTE-USER-PERM HTTP headers are injected automatically. That is why it is called transparent: no manual intervention is needed. See https://docs.cloudera.com/machine-learning/1.5.5/applications/topics/ml-securing-applications.html
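As a rough illustration of how an application could pick up the injected header, here is a minimal Python sketch. Only the header names REMOTE-USER and REMOTE-USER-PERM come from the reply above; the helper name `remote_user_from_headers` and the sample header values are hypothetical, and how your web framework exposes request headers is an assumption you will need to adapt.

```python
from typing import Mapping, Optional


def remote_user_from_headers(headers: Mapping[str, str]) -> Optional[str]:
    """Return the REMOTE-USER header value, or None when it is absent.

    HTTP header names are case-insensitive, so the lookup ignores case.
    """
    for name, value in headers.items():
        if name.lower() == "remote-user":
            return value
    return None


if __name__ == "__main__":
    # Hypothetical request headers, for illustration only.
    sample = {"Host": "app.example.com", "Remote-User": "jdoe"}
    print(remote_user_from_headers(sample))
```

In a real application you would pass in whatever header mapping your framework provides for the current request.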
02-04-2026
07:53 AM
Hello @zzzz77, Maybe this blog can help you: https://community.cloudera.com/t5/Community-Articles/Understanding-how-NiFi-s-Content-Repository-Archiving-works/ta-p/249418 It explains how the content repository archive is handled, and it could cover what you need. There are also other options, such as the Reporting Tasks documented here: https://nifi.apache.org/docs/nifi-docs/ For example, SiteToSiteProvenanceReportingTask could fit your use case.
02-04-2026
06:58 AM
1 Kudo
Hello @NadirHamburg, Thanks for being part of our Community. I'm not an expert on ClickHouse, but from what I have read, something on the database side could be causing batches to be repeated, which would explain that number of duplicated records. On the NiFi side, you can try setting the batch size to the same number of records; this should work for you, although I know it could be a problem for big databases. On the ClickHouse side, I found this documentation: https://clickhouse.com/docs/engines/table-engines/mergetree-family It covers ReplicatedMergeTree, which should be a good option to avoid duplicates. Is your table configured with that engine? Do you see any errors in the PutDatabaseRecord log? If so, can you share them?
02-04-2026
06:39 AM
Hello @garb, Thanks for being part of our community. I was reviewing the information, and even though Calcite is the engine used for the SQL, it looks like not all functions are officially supported. Looking in several places, I do not see LPAD used anywhere. Something that may work for what you need is CONCAT, which should give you the correct format and is broadly used in the community:

SELECT
MsgSeqNbr, PostTime, SSN, EmployeeID,
LName, FName, MName,
CAST(TRIM(LCN) AS BIGINT) AS LCN,
RIGHT(CONCAT('00000', PIN), 5) AS PIN,
EmployeeType, ValidityCode, AgencyOwner, AgencyLocated,
BadgeCreatedBy, BadgeCreatedTime,
BadgeModifiedBy, BadgeModifiedTime,
Clearance, Error, Status
FROM FLOWFILE

Based on the error "No match found for function signature LPAD", it looks like the engine as configured for NiFi does not support LPAD, even though Calcite itself does. I tried to find the list of supported functions in the code, but did not find LPAD. This looks to be the most accurate reference we have where CONCAT does appear: https://github.com/apache/nifi/blob/main/nifi-docs/src/main/asciidoc/record-path-guide.adoc
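To make the padding logic of RIGHT(CONCAT('00000', PIN), 5) easier to reason about, here is a small Python sketch that mirrors it step by step. The function name `pad_pin` is hypothetical and the code only models the string arithmetic; it is not how NiFi or Calcite evaluates the query.

```python
def pad_pin(pin: str, width: int = 5) -> str:
    """Left-pad `pin` with zeros by prepending zeros and keeping the rightmost `width` characters."""
    padded = "0" * width + pin  # same idea as CONCAT('00000', PIN)
    return padded[-width:]      # same idea as RIGHT(..., 5)


if __name__ == "__main__":
    for value in ["7", "42", "12345"]:
        print(value, "->", pad_pin(value))
```

One thing to keep in mind with this approach: a value longer than five characters is cut down to its rightmost five, so it is worth checking that no PIN in your data exceeds that length.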
02-02-2026
08:53 AM
Hello @backtohome, As far as I know, we do support GPUs for Spark workloads on CML. The documentation covers this:

"Autoscaling: Cloudera AI also supports native cloud autoscaling via Kubernetes. When clusters do not have the required capacity to run workloads, they can automatically scale up additional nodes. Administrators can configure auto-scaling upper limits, which determine how large a compute cluster can grow. Since compute costs increase as cluster size increases, having a way to configure upper limits gives administrators a method to stay within a budget. Autoscaling policies can also account for heterogeneous node types such as GPU nodes."
https://docs.cloudera.com/machine-learning/1.5.5/spark/topics/ml-apache-spark-overview.html

You have to configure the GPUs by following this doc: https://docs.cloudera.com/machine-learning/1.5.5/gpu/topics/ml-gpu.html If GPUs are not configured on CML, the UI will not show you the GPU options.
01-30-2026
12:48 PM
Hello @zzzz77, Glad to have you in the community. What you are asking for can be done with this kind of flow: GetFile → SplitContent → Transfer → MergeContent → PutFile. SplitContent splits the file, and the attributes are copied to every split, because they are stored on the FlowFile, not in the content. Additional attributes are added to track the fragments. MergeContent then rebuilds the content and restores the original attributes, so the metadata is not lost.
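The idea can be modeled outside NiFi with a short Python sketch. Everything here is a simplified assumption for illustration: a FlowFile is represented as an (attributes, content) pair, the helper names `split_content` and `merge_content` are hypothetical, and the `fragment.*` attribute names are modeled on NiFi's fragment attributes without claiming to reproduce the processors' exact behavior.

```python
import uuid
from typing import Dict, List, Tuple

FlowFile = Tuple[Dict[str, str], bytes]  # (attributes, content)

FRAGMENT_KEYS = ("fragment.identifier", "fragment.index", "fragment.count")


def split_content(flowfile: FlowFile, delimiter: bytes) -> List[FlowFile]:
    """Split the content on `delimiter`; every piece gets a copy of the attributes."""
    attributes, content = flowfile
    parts = content.split(delimiter)
    fragment_id = str(uuid.uuid4())
    fragments = []
    for index, part in enumerate(parts, start=1):
        attrs = dict(attributes)  # original metadata travels with each piece
        attrs["fragment.identifier"] = fragment_id
        attrs["fragment.index"] = str(index)
        attrs["fragment.count"] = str(len(parts))
        fragments.append((attrs, part))
    return fragments


def merge_content(fragments: List[FlowFile], delimiter: bytes) -> FlowFile:
    """Reassemble the pieces in fragment order and keep the attributes they all share."""
    ordered = sorted(fragments, key=lambda f: int(f[0]["fragment.index"]))
    content = delimiter.join(part for _, part in ordered)
    first_attrs = ordered[0][0]
    common = {
        key: value
        for key, value in first_attrs.items()
        if all(f[0].get(key) == value for f in ordered)
    }
    for key in FRAGMENT_KEYS:  # drop the bookkeeping added by the split
        common.pop(key, None)
    return common, content


if __name__ == "__main__":
    original = ({"filename": "data.txt", "owner": "etl"}, b"a\nb\nc")
    pieces = split_content(original, b"\n")
    print(merge_content(pieces, b"\n") == original)
```

The sketch shows why the metadata survives: the attributes ride along on each fragment, so the merge step only needs the fragment bookkeeping to put the content back in order.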
01-28-2026
05:32 AM
Hello @raghavhinduja26, The Cloudera documentation has the steps for installing on Ubuntu. Take a look here: https://docs.cloudera.com/cloudera-manager/7.13.1/cloudera-manager-installation/topics/cdpdc-installing-cm-runtime.html Each step has 3 tabs: RHEL, SLES and Ubuntu. Follow the steps in the Ubuntu tab and you should be good to go.
01-19-2026
09:21 AM
Hello @Runa27, Thanks for being part of our community. Several things could lead to this, but the two main ones I can think of are these:
1. You do not have the web proxy host configured. Add this setting and test: nifi.web.proxy.host=UIhostname:8443
2. The files are big and the browser is restricting the large response.
What would help here is to look at nifi-app.log at the moment of the failure; it may give us some clues.
01-08-2026
09:25 AM
Hello @Esteban2026, The Spark version is tied to the VC that was created. You should be able to select the version when creating a new VC. Do you not see those options in the trial?
01-07-2026
03:01 PM
Hello @jame1997, Can you tell us which version of NiFi you are using? I want to check whether a bug has been reported for it. Something you can try is changing the State Manager to Tracking Timestamps, which is more reliable.