Member since: 09-04-2019
Posts: 62
Kudos Received: 17
Solutions: 11
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 810 | 10-30-2023 06:50 AM |
| | 10755 | 02-27-2023 09:25 AM |
| | 1930 | 07-07-2022 09:17 AM |
| | 1795 | 01-26-2022 06:25 AM |
| | 2651 | 01-25-2022 06:19 AM |
10-29-2024
07:02 PM
Cloudera DataFlow (CDF) is an integrated data management platform designed to handle real-time streaming data pipelines. It allows organizations to ingest, process, analyze, and distribute data in motion across on-premises, public cloud, and hybrid environments. CDF uses Kubernetes as the foundational infrastructure for deploying, managing, and scaling data flow applications, and NiFi as the core engine for data movement, transformation, and management.

When developing or troubleshooting DataFlows you might need to set processors to DEBUG. NiFi uses Logback as its default logging framework for application activity and error messages. With the example below you can set a processor to DEBUG knowing only the processor name.

The example assumes your shell has an environment variable named KUBECONFIG that points to the appropriate kubeconfig file. Alternatively, you can always add the --kubeconfig=/path/to/your/kubeconfig-file flag.

You will need to know which namespace your DataFlow is in:

kubectl get namespaces

*** Deployed DataFlows use the naming convention dfx-<Flow name>-ns ***
*** If you need to set DEBUG on a DataFlow in Flow Designer, the naming convention is dfx-ts-<numbers>-ns ***

You will also need to know the name of the processor you want to set to DEBUG.

Finally, after you run the command below, your processor will begin logging at DEBUG after 30 seconds.

The following is a one-line command that sets DEBUG on the processor you want, on the pod named dfx-nifi-0.

*** If your DataFlow autoscales, you can run it against the other pods as well; their names increment by 1 (for example dfx-nifi-1) ***

Change the environment-specific values (highlighted in red in the original post) to match your environment. In the example below the namespace is dfx-logback-test-ns and the processor being set to DEBUG is queryrecord.

kubectl exec -n dfx-logback-test-ns dfx-nifi-0 -c nifi -- bash -c \
"CLASS=\$(zcat /opt/nifi/nifi-current/data/flow.json.gz | \
jq -r '.. | .processors? // empty | .[].type | select(test(\"queryrecord\"; \"i\"))' | \
sort | uniq); sed -i '/<logger name=\"org.apache.nifi\" \
level=\"INFO\"\/>/a \
<logger name=\"'\$CLASS'\" level=\"DEBUG\"\/>' \
/opt/nifi/nifi-current/conf/logback.xml"
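
To see what the sed half of the one-liner does before touching a pod, the same edit can be tried on a scratch file. This is only a minimal sketch under a few assumptions: GNU sed (needed for the one-line `a` form), a hand-written sample standing in for the real logback.xml, and a hard-coded class name in place of the value the one-liner derives from flow.json.gz with jq. The helper name add_debug_logger is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: replays the sed step of the one-liner on a local scratch file.
# Assumes GNU sed. add_debug_logger is a hypothetical helper name.

add_debug_logger() {
  local class="$1" file="$2"
  # Append a DEBUG logger for $class right after the stock org.apache.nifi INFO logger.
  sed -i "/<logger name=\"org.apache.nifi\" level=\"INFO\"\/>/a <logger name=\"${class}\" level=\"DEBUG\"/>" "$file"
}

# Hand-written stand-in for /opt/nifi/nifi-current/conf/logback.xml (not the real file).
scratch="$(mktemp)"
cat > "$scratch" <<'EOF'
<configuration scan="true" scanPeriod="30 seconds">
    <logger name="org.apache.nifi" level="INFO"/>
</configuration>
EOF

# Hard-coded here; in the one-liner this value comes from flow.json.gz via jq.
add_debug_logger "org.apache.nifi.processors.standard.QueryRecord" "$scratch"
cat "$scratch"
```

If the real logback.xml is configured with a 30 second scan period like the sample above, that would explain why the new level takes effect about 30 seconds after the file is changed, without restarting NiFi.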
09-09-2024
12:27 PM
Hello @mfhanif, what region are you in? And just to make sure we are looking at the same thing: this InvokeHTTP is the one labeled "Get Recent Wikipedia Changes", correct?
10-30-2023
06:50 AM
1 Kudo
Hello @ipson, first, sorry that my previous comment was not posted for some reason. It is great to see you are exploring the DataFlow Data Service on Cloudera Public Cloud.

To answer your question: Does it persist state? Unfortunately there is currently no ability to change the flow on a running deployment, so you would have to:
- publish v2 to the catalog from Flow Designer
- terminate the original deployment
- perform a new deployment with v2

Deploying a new version of the flow means you are standing up a new namespace, which in turn has its own new ZooKeeper, NiFi, and disk, among other things contained within the Kubernetes namespace, so state from the original deployment is not carried over.
04-11-2023
06:56 AM
@Jame1979, as @cotopaul mentions, you seem to have a hung thread that you tried to terminate. Restarting NiFi should solve that. Additionally, and importantly, you are never really "spreading" the work among your 3 nodes. The List happens on the primary node and the listing stays on the primary node. What you should do is right-click the connection between List and UpdateAttribute and select the load balancing strategy "Round Robin".
04-10-2023
12:02 PM
Could you explain more about the limitations you have on CDP? Is this Public Cloud? We certainly expect there to be no limitations. To get the PG IDs you can run this API call: flow/process-groups/root/status?recursive=true
02-27-2023
09:25 AM
2 Kudos
Yes, SplitRecord is what you should use. Attached is a flow definition as an example. Note that I had to give the file a "txt" extension; once you download it, rename it to a .json extension. You can then drag a process group onto the canvas, which gives you the option to upload the flow definition. The example generates a file with 102 records, and in SplitRecord we use a JsonTreeReader that splits every 3 records and writes the FlowFiles out, in this case 3 records per FlowFile, generating 34 FlowFiles: 102 / 3 = 34. In your case, based on your screenshot, I would change the split count to 1500000 (or another number based on your needs).
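
As a quick way to estimate how many FlowFiles a given split count will produce, the arithmetic above can be sketched in shell. This is only an illustration: split_count is a hypothetical helper, and it assumes that when the record count is not an exact multiple, the final FlowFile simply carries the remaining records (hence the ceiling division).

```shell
#!/usr/bin/env bash
# Sketch only: split_count is a hypothetical helper, not part of NiFi.
# Assumes the last FlowFile holds whatever records remain (ceiling division).

split_count() {
  local records="$1" per_split="$2"
  echo $(( (records + per_split - 1) / per_split ))
}

# The example flow: 102 records, 3 records per split.
split_count 102 3
```

For the example flow this prints 34, matching the 34 FlowFiles described above. The same helper can be run with your own record count and a split count of 1500000 to see how many FlowFiles to expect.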
07-18-2022
12:39 PM
nifi-api/flow/process-groups/root/status?recursive=true
This endpoint returns the process groups.
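
A rough sketch of calling this endpoint from a shell follows. Everything environment-specific here is an assumption: the host nifi.example.com:8443 is a placeholder, pg_status_url is a hypothetical helper, and the commented curl/jq line assumes bearer-token authentication, jq being installed, and a response in which each process group appears under a processGroupStatusSnapshot object with id and name fields. Check the response from your own instance before relying on those field names.

```shell
#!/usr/bin/env bash
# Sketch only: pg_status_url is a hypothetical helper that builds the request URL
# from a NiFi base URL (a trailing slash on the base URL is tolerated).

pg_status_url() {
  printf '%s/nifi-api/flow/process-groups/root/status?recursive=true\n' "${1%/}"
}

# Placeholder host; replace with your own NiFi base URL.
pg_status_url "https://nifi.example.com:8443"

# Possible call (assumes $NIFI_URL, a bearer token in $TOKEN, and jq installed;
# the field names in the jq filter are an assumption about the response shape):
# curl -sk -H "Authorization: Bearer $TOKEN" "$(pg_status_url "$NIFI_URL")" \
#   | jq -r '.. | .processGroupStatusSnapshot? // empty | "\(.id)\t\(.name)"'
```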
07-18-2022
12:20 PM
1 Kudo
Those processors are bundled in Cloudera's CFM package, in particular CFM 2.1.4; they are not released in the Apache NiFi bundle.
07-12-2022
12:35 PM
1 Kudo
Hi @rafy, please refer to the user authentication section of this doc: https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#user_authentication Given that you are familiar with the toolkit and only want to authenticate 3 users, creating 3 user certs is the simplest approach. Just have them add the created certs to their browser.
07-07-2022
09:17 AM
1 Kudo
@linssab I am not well versed in the Docker aspect of this, however I see this:

GET http://localhost:8443/nifi-api/flow/current-user

That request is going over http. I think this is happening because you have this in your yaml file:

ports:
- '8443:8080'
cpus : 2
mem_limit: 2G
mem_reservation: 2G
environment:
- NIFI_WEB_HTTP_PORT=8080

To me that says: anything coming in on 8443 is routed to 8080, and 8080 is set in nifi.properties as the value of nifi.web.http.port= because you are passing NIFI_WEB_HTTP_PORT=8080 through your yaml file.

I recommend that you change

ports:
- '8443:8080'

to

ports:
- '8443:8443'

AND

environment:
- NIFI_WEB_HTTP_PORT=8080

to

environment:
- NIFI_WEB_HTTPS_PORT=8443

Notice the "S" in HTTP<S>. NiFi should then start up securely and you will get a login page where you can enter the values you set for:

- SINGLE_USER_CREDENTIALS_USERNAME=
- SINGLE_USER_CREDENTIALS_PASSWORD=