Member since: 12-14-2020
Posts: 95
Kudos Received: 8
Solutions: 2
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 8224 | 02-29-2024 09:08 PM
 | 1903 | 03-10-2023 12:55 AM
11-07-2024
10:06 PM
1 Kudo
The JSON_EXTRACT function in the QueryRecord processor may not be interpreting Src_obj__event_metadata as a JSON object. Instead, it likely sees Src_obj__event_metadata as a plain string, so it cannot directly access the "$.timestamp" field. We may need to use an EvaluateJsonPath processor first to extract timestamp from Src_obj__event_metadata into a new field. Use the following configuration in the Properties tab:
• Destination: flowfile-content
• Return Type: json
• JSON Path Expression: add a property named timestamp with the value $.Src_obj__event_metadata.timestamp
Once timestamp has been extracted as a separate column, we can reference it directly in the QueryRecord processor:
SELECT * FROM flowfile ORDER BY timestamp ASC
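For illustration, a minimal sketch of a record where the metadata arrives as a JSON string rather than a nested object (event_id, source, and the sample values are made up; only Src_obj__event_metadata and timestamp come from the question):

```json
{
  "event_id": "e-123",
  "Src_obj__event_metadata": "{\"timestamp\": \"2024-11-01T10:15:30Z\", \"source\": \"app-a\"}"
}
```

Because the whole metadata value is a single quoted string, a record-oriented processor sees one text field instead of an object with a timestamp member, which is why "$.timestamp" is not directly addressable from QueryRecord.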
11-06-2024
10:27 PM
1 Kudo
If the CAST and JSON_PARSE functions are not supported in the NiFi processor you're using, we may try extracting the timestamp value as a string and sorting it alphabetically:
SELECT * FROM flowfile ORDER BY JSON_EXTRACT_SCALAR(Src_obj__event_metadata, "$.timestamp") ASC
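A note on why a plain string sort can still give the right order: with a fixed-width, zero-padded format such as ISO-8601, lexicographic order matches chronological order, e.g. (sample values made up for illustration) "2024-11-01T09:05:00Z" < "2024-11-01T10:15:30Z" < "2024-11-02T08:00:00Z". If the timestamps are epoch numbers of varying length or use mixed formats, string ordering may not match chronological ordering.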
11-06-2024
09:56 PM
1 Kudo
The field Src_obj__event_metadata is a JSON string, so to access fields within it, you might need to parse it into a JSON object first. Some systems may require you to explicitly parse JSON strings before extracting fields. Please try:
SELECT * FROM flowfile ORDER BY CAST(JSON_EXTRACT(Src_obj__event_metadata, "$.timestamp") AS TIMESTAMP) ASC
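If the cast succeeds but the ordering still looks wrong, a small variation of the same query can help with debugging by exposing the parsed value as its own column (a sketch only; event_ts is an illustrative alias and the query assumes the same record schema):

```sql
SELECT *,
       CAST(JSON_EXTRACT(Src_obj__event_metadata, '$.timestamp') AS TIMESTAMP) AS event_ts
FROM flowfile
ORDER BY event_ts ASC
```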
07-29-2024
10:38 PM
1 Kudo
Hi, please see if you can access:
http(s)://<CDSW_DOMAIN>/<USER>/<PROJECT>/settings/delete
If you can, use the delete button there to delete the project. If you cannot access the page, then we can only remove it from the backend psql db:
# kubectl exec -it $(kubectl get pods -l role=db -o jsonpath='{.items[*].metadata.name}') -- psql -P pager=off -U sense
Find the project table and remove the related entry.
Note: please be very careful when operating on the backend db. An incorrect operation may cause irreversible loss.
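A sketch of what the lookup inside psql might look like, assuming the project metadata lives in a table named projects with id and name columns (these names are assumptions; verify the real schema with \dt and \d before changing anything):

```sql
-- Locate the row for the project first and confirm it is the right one.
SELECT id, name FROM projects WHERE name = '<PROJECT>';

-- Only after double-checking the id, remove the entry.
DELETE FROM projects WHERE id = <PROJECT_ID>;
```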
02-29-2024
09:08 PM
1 Kudo
The Kafka broker address is specified as "localhost". Is the broker running on the same host as the producer? I guess not.
prop.setProperty("bootstrap.servers", "localhost:9092");
Please use the IP address or hostname of the host where your Kafka broker is running and try again.
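A minimal sketch of the producer setup with the broker address corrected (broker-host.example.com:9092 and test-topic are placeholders; substitute your actual broker address, port, and topic):

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class ProducerCheck {
    public static void main(String[] args) {
        Properties prop = new Properties();
        // Point bootstrap.servers at the host where the broker actually runs,
        // not localhost (unless the producer is on the same machine).
        prop.setProperty("bootstrap.servers", "broker-host.example.com:9092");
        prop.setProperty("key.serializer", StringSerializer.class.getName());
        prop.setProperty("value.serializer", StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(prop)) {
            producer.send(new ProducerRecord<>("test-topic", "key", "value"));
            producer.flush();
        }
    }
}
```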
02-28-2024
11:24 PM
1 Kudo
Hello @Andy1989, "socket connection setup timeout" sounds like a network issue on the client side. May I know how you specify the Kafka broker address and port in your code? Is the address resolvable from the client side, and is the Kafka broker's port number correct?
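Both can be checked quickly from the client host (broker-host.example.com and 9092 are placeholders for your actual broker address and port):

```shell
# Can the client resolve the broker hostname?
nslookup broker-host.example.com

# Can the client open a TCP connection to the broker port?
nc -vz broker-host.example.com 9092
```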
03-10-2023
12:55 AM
1 Kudo
Hi @Hyeonseo, if we print the package locations from predict.py and from the command line respectively:
python -c 'import site; print(site.getsitepackages())'
do you find that the results are different? If we add all of the package locations, is there still such an issue?
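If the locations do differ, a minimal sketch of pulling the missing location onto the search path inside predict.py (the directory below is a placeholder; use the site-packages path printed by the environment where the package imports correctly):

```python
import site
import sys

# Show where this interpreter currently looks for installed packages.
print(site.getsitepackages())

# Placeholder: the site-packages directory reported by the working environment.
missing_location = "/path/to/other/site-packages"
if missing_location not in sys.path:
    sys.path.append(missing_location)
```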
01-30-2023
01:21 AM
May I know what values you have set for the properties below?
yarn.nodemanager.local-dirs
yarn.nodemanager.log-dirs
Also, please make sure you don't have the noexec or nosuid flags set on the corresponding disks. You can check this with the "mount" command.
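For example, if yarn.nodemanager.local-dirs points somewhere under /data1 (a placeholder path), the mount options of the backing filesystem can be checked with:

```shell
# Look for noexec or nosuid in the option list of the matching mount.
mount | grep /data1
```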
06-17-2022
12:33 AM
1 Kudo
Hi, this is most likely due to long-running jobs such as Spark Streaming, which continuously generate logs while they run. We need to adjust logging on the application side. Taking Spark Streaming as an example, we can add a rolling appender to the application's log4j.properties, so the job rotates its logs at the size limit you set in the log4j file. For detailed steps please refer to:
https://my.cloudera.com/knowledge/Long-running-Spark-streaming-applications-s-YARN-container?id=90615
https://my.cloudera.com/knowledge/Video-KB-How-to-configure-log4j-for-Spark-on-YARN-cluster?id=271201
Other types of jobs are handled similarly: the application team needs to tune the logging configuration so the jobs do not generate an unbounded amount of logs.
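A sketch of such a rolling appender in log4j.properties for a Spark on YARN application (the file size, backup count, and file name are illustrative; the KB articles above describe how to ship this file with the job):

```properties
# Route application logging through a size-based rolling appender.
log4j.rootLogger=INFO, rolling

log4j.appender.rolling=org.apache.log4j.RollingFileAppender
log4j.appender.rolling.layout=org.apache.log4j.PatternLayout
log4j.appender.rolling.layout.conversionPattern=[%d] %p %m (%c)%n

# Keep at most 5 files of 50 MB each per container.
log4j.appender.rolling.maxFileSize=50MB
log4j.appender.rolling.maxBackupIndex=5

# Write into the YARN container log directory so the logs stay with the container.
log4j.appender.rolling.file=${spark.yarn.app.container.log.dir}/spark.log
```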
04-22-2022
02:55 AM
Hi @reca, may I know whether you have specified the Kerberos principal and keytab in your Flume conf:
https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/cm_sg_use_subs_vars_s11.html
If you have many long-running jobs, we would recommend increasing the default HDFS delegation token max lifetime and renew interval. Add the following properties to "HDFS Service Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml":
dfs.namenode.delegation.token.max-lifetime: default 604800000 (7 days) -> increase to 30 days
dfs.namenode.delegation.token.renew-interval: default 86400000 (1 day) -> increase to 30 days
You can set max-lifetime to as much as 1 year; the renew interval just needs to be equal to or smaller than max-lifetime.
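A sketch of the corresponding safety valve entries, assuming 30 days for both values (30 days = 2592000000 ms):

```xml
<!-- Maximum lifetime of an HDFS delegation token: raised from 7 days to 30 days. -->
<property>
  <name>dfs.namenode.delegation.token.max-lifetime</name>
  <value>2592000000</value>
</property>
<!-- How often the token must be renewed: raised from 1 day to 30 days;
     must stay less than or equal to the max lifetime. -->
<property>
  <name>dfs.namenode.delegation.token.renew-interval</name>
  <value>2592000000</value>
</property>
```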