Member since: 03-01-2021
Posts: 402
Kudos Received: 3
Solutions: 4
My Accepted Solutions
Title | Views | Posted |
---|---|---|
| 167 | 04-22-2025 04:56 AM |
| 398 | 03-27-2025 06:33 AM |
| 1042 | 10-06-2023 06:36 AM |
| 10052 | 06-22-2023 06:24 AM |
04-22-2025
04:56 AM
@broobalaji It is not clear whether the processor validated successfully. Please check that all of the files (config files) provided are readable by the nifi user and that the properties defined in the PutHDFS processor are correct; a quick access check is sketched below. Cloudera reference doc: https://docs.cloudera.com/cfm/4.0.0/nifi-ozone/topics/cfm-ozone-target.html
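A minimal sketch of that check, assuming the config files live under /etc/hadoop/conf (a placeholder; use whatever paths the processor properties actually point to):
# sudo -u nifi ls -l /etc/hadoop/conf/core-site.xml /etc/hadoop/conf/ozone-site.xml
# sudo -u nifi head -n 1 /etc/hadoop/conf/core-site.xml
If either command fails, fix the ownership/permissions before re-validating the processor.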
04-15-2025
12:45 AM
@Mamun_Shaheed Did the response help resolve your query? If so, kindly mark the relevant reply as the solution; this will help others locate the answer more easily in the future.
03-27-2025
06:33 AM
1. Check whether any hints (broadcast) are set at the query level.
2. Try increasing spark.sql.shuffle.partitions.
3. You can set SQL hints such as MERGE to use a sort-merge join instead of a broadcast join; see the sketch below.
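A minimal sketch, assuming the query can be run through spark-sql and using placeholder table/column names; setting spark.sql.autoBroadcastJoinThreshold=-1 (which disables automatic broadcast) is illustrative, not required:
# spark-sql --conf spark.sql.shuffle.partitions=400 --conf spark.sql.autoBroadcastJoinThreshold=-1 -e "SELECT /*+ MERGE(b) */ a.id, b.val FROM table_a a JOIN table_b b ON a.id = b.id"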
03-13-2025
11:13 PM
Hi @jaris How are you passing the Kerberos ticket? Did you run a kinit before executing the Livy statement? Obtain a Kerberos ticket via kinit (see the example after these steps) and then try to execute the sample job below.
1. Copy the JAR to HDFS:
# hdfs dfs -put /opt/cloudera/parcels/CDH/jars/spark-examples<VERSION>.jar /tmp
2. Make sure the JAR is present:
# hdfs dfs -ls /tmp/
3. Run the Spark job through the Livy API with curl:
# curl -v -u: --negotiate -X POST --data '{"className": "org.apache.spark.examples.SparkPi", "jars": ["/tmp/spark-examples<VERSION>.jar"], "name": "livy-test", "file": "hdfs:///tmp/spark-examples<VERSION>.jar", "args": [10]}' -H "Content-Type: application/json" -H "X-Requested-By: User" http://<LIVY_NODE>:<PORT>/batches
4. Check the running and completed Livy sessions:
# curl http://<LIVY_NODE>:<PORT>/batches/ | python -m json.tool
NOTE:
* Change the JAR version (<VERSION>) according to your CDP version.
* Replace LIVY_NODE and PORT with the actual values.
* If the cluster runs in secure mode, make sure you have a valid Kerberos ticket and use Kerberos authentication in the curl command.
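For reference, a minimal way to obtain and verify the ticket first (the keytab path and principal are placeholders):
# kinit -kt /etc/security/keytabs/user.keytab user@EXAMPLE.COM
# klist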
03-12-2025
03:51 AM
@MarinaM The template is sent by the NetFlow server. We need to check when the NetFlow server sends the template; once received, it can be cached on the collector end (the ListenNetflow processor).
08-08-2024
03:42 AM
1 Kudo
@Brunno In ExecuteSQL, the query result is converted to Avro format. Therefore, use ConvertAvro* or related record processors to convert the Avro data into a correctly formatted file, and then update the target DB; one common pattern is sketched below. https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/2.0.0/org.apache.nifi.processors.standard.ExecuteSQL/index.html
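A minimal flow sketch, assuming the record-based processors are available (the reader/writer choices here are illustrative, not required):
ExecuteSQL (emits Avro) -> ConvertRecord (Record Reader: AvroReader; Record Writer: e.g. JsonRecordSetWriter) -> PutDatabaseRecord (writes the records to the target database)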
10-06-2023
06:36 AM
1 Kudo
You can check the CDP API reference docs: https://docs.cloudera.com/cdp-public-cloud/cloud/api/topics/mc-api-overview.html#mc-api-overview Specific to Data Hub: https://cloudera.github.io/cdp-dev-docs/api-docs/datahub/index.html#_healthcheck A CLI example follows. If this answers your query, please accept this post as a solution.
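As a quick illustration, assuming the CDP CLI is installed and configured (the cluster name is a placeholder), the same Data Hub details can also be pulled from the command line:
# cdp datahub list-clusters
# cdp datahub describe-cluster --cluster-name <CLUSTER_NAME>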
06-22-2023
06:24 AM
The spark.kryoserializer.buffer.max limit is fixed at 2 GB and cannot be extended. You can try calling repartition() on the DataFrame in the Spark code so that each task serializes less data; a submit-time example follows.
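A minimal sketch, assuming the job is launched via spark-submit (the script name is a placeholder; 2047m is the highest value the setting accepts, since the hard cap is 2 GB):
# spark-submit --conf spark.kryoserializer.buffer.max=2047m your_job.py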