Created on 10-29-2024 07:02 PM - edited 10-29-2024 07:03 PM
Cloudera DataFlow (CDF) is an integrated data management platform designed to handle real-time streaming data pipelines. It allows organizations to ingest, process, analyze, and distribute data in motion across various environments such as on-premises, public clouds, or hybrid architectures.
Cloudera CDF leverages Kubernetes as the foundational infrastructure for deploying, managing, and scaling data flow applications. Cloudera CDF also uses NiFi as the core engine for data movement, transformation, and management.
When developing or troubleshooting DataFlows, you might need to set processors to DEBUG mode. NiFi uses Logback as its default logging framework to log application activity and error messages.
With the examples below, you can set a processor to DEBUG mode just by knowing the processor name.
The examples assume that you point to the appropriate kubeconfig file through an environment variable named KUBECONFIG in your shell. However, you can always add the --kubeconfig=/path/to/your/kubeconfig-file flag instead.
You will need to know which namespace your DataFlow is in:
kubectl get namespaces
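On a busy cluster the namespace list can be long, so it may help to narrow it to names that match the dfx-...-ns pattern that DataFlow namespaces use. The helper below is only a sketch: list_dfx_namespaces is a made-up name for illustration, and it assumes kubectl already points at the right cluster.

```shell
# Sketch: list only namespaces that look like DataFlow namespaces (dfx-...-ns).
# list_dfx_namespaces is a hypothetical helper name, not part of kubectl or CDF.
list_dfx_namespaces() {
  # "-o name" prints one "namespace/<name>" entry per line;
  # strip the prefix, then keep names that start with dfx- and end with -ns.
  kubectl get namespaces -o name \
    | sed 's|^namespace/||' \
    | grep -E '^dfx-.*-ns$'
}
```

Calling list_dfx_namespaces prints one matching namespace per line, which covers both deployed DataFlows and Flow Designer sessions.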
*** Deployed DataFlows have the naming convention dfx-<Flow name>-ns ***
If you need to set DEBUG on a DataFlow in Flow Designer, the naming convention is dfx-ts-<numbers>-ns
You will also need to know the name of the processor you want to set to DEBUG.
Finally, after you run the command below, your processor will begin to log at DEBUG after 30 seconds.
The following one-line command sets DEBUG on the processor you choose, on the pod named dfx-nifi-0.
*** If your DataFlow autoscales, you can also set it on the other pods, whose names increment by 1 (dfx-nifi-1, dfx-nifi-2, and so on).
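To apply a change to every pod of an autoscaled DataFlow, the pod names can be generated from the pod count. This is a small sketch; nifi_pod_names is a hypothetical helper, and the count is something you would supply yourself (for example after checking the pods in the namespace).

```shell
# Sketch: print the pod names dfx-nifi-0 .. dfx-nifi-(N-1) for a given pod count N.
# nifi_pod_names is a hypothetical helper name used only for illustration.
nifi_pod_names() {
  npn_count="$1"
  npn_i=0
  while [ "$npn_i" -lt "$npn_count" ]; do
    printf 'dfx-nifi-%s\n' "$npn_i"
    npn_i=$((npn_i + 1))
  done
}
```

For instance, nifi_pod_names 3 prints dfx-nifi-0, dfx-nifi-1 and dfx-nifi-2 on separate lines, which can be fed into a loop that repeats the same command per pod.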
Just change the text in red to match your environment.
In the example below, I am setting the processor queryrecord to DEBUG.
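As a rough illustration of the general idea (not necessarily the exact one-liner used in this article), a Logback logger can be switched to DEBUG by adding a logger element to logback.xml inside the pod. Everything environment-specific here is an assumption: the helper names logger_element and set_processor_debug are made up, the path to logback.xml must be supplied by you, the file is assumed to be writable, GNU-style sed is assumed to exist in the container, and the logger is assumed to be named after the processor's fully qualified class (for QueryRecord, org.apache.nifi.processors.standard.QueryRecord).

```shell
# Sketch only: one possible way to raise a processor's log level through Logback.
# Assumptions (not from the article): the logback.xml path, that the file is
# writable in the pod, that sed is available there, and that the logger is
# named after the processor's fully qualified class.

# Print a Logback <logger> element for the given logger name and level.
logger_element() {
  printf '<logger name="%s" level="%s"/>' "$1" "${2:-DEBUG}"
}

# Add a DEBUG logger for one processor class to logback.xml inside a pod.
# Usage: set_processor_debug <namespace> <pod> <logger-name> <path-to-logback.xml>
set_processor_debug() {
  spd_element="$(logger_element "$3" DEBUG)"
  # Insert the new element on its own line just before </configuration>.
  kubectl exec -n "$1" "$2" -- \
    sed -i "s|</configuration>|  ${spd_element}\\n</configuration>|" "$4"
}
```

A call for the queryrecord case might look like set_processor_debug dfx-<Flow name>-ns dfx-nifi-0 org.apache.nifi.processors.standard.QueryRecord /path/to/conf/logback.xml, where the path is a placeholder. Running it twice would add a duplicate element, so check the file first. If the Logback configuration is set to rescan itself periodically, the new level takes effect without a restart, which would be consistent with the 30-second delay mentioned above.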