Member since: 07-29-2020
Posts: 574
Kudos Received: 323
Solutions: 176
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 1974 | 12-20-2024 05:49 AM
 | 2200 | 12-19-2024 08:33 PM
 | 2020 | 12-19-2024 06:48 AM
 | 1324 | 12-17-2024 12:56 PM
 | 1885 | 12-16-2024 04:38 AM
01-18-2023
12:26 PM
Hi, I was able to obtain the required result using the following processors:

1- SplitText: splits each JSON record into its own flowfile.

2- UpdateRecord: updates the date fields and converts them to the required format using a JSON Record Reader/Writer. The value used to convert each date field:

${field.value:toDate("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'"):format("yyyy-MM-dd HH:mm:ss.SSS")}

More info on UpdateRecord: https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.7.1/org.apache.nifi.processors.standard.UpdateRecord/additionalDetails.html

Note: the only problem I noticed is that null values will be converted to "". Not sure if that will cause you a problem, but you can use ReplaceText or a Jolt transform to convert the values back to null. If you need the records to be merged back together before inserting into Hive, you can use the MergeRecord processor.

If that helps, please accept the solution. Thanks
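For reference, here is a rough Python sketch of the same conversion the expression above performs (a standalone equivalent for testing the format outside NiFi; the function name and sample timestamp are my own, not from the flow):

```python
from datetime import datetime

def convert_timestamp(value):
    """Convert "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'" to "yyyy-MM-dd HH:mm:ss.SSS"."""
    if value is None:
        # Note: NiFi's UpdateRecord may emit "" here instead of null
        return None
    dt = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    # %f is microseconds (6 digits); drop the last three to keep milliseconds
    return dt.strftime("%Y-%m-%d %H:%M:%S.%f")[:-3]

print(convert_timestamp("2023-01-18T12:26:05.123Z"))  # → 2023-01-18 12:26:05.123
```

This mirrors the toDate/format chain field by field; in the actual flow the expression runs per record inside UpdateRecord.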
01-17-2023
11:16 AM
Hi, not sure if you are looking for exactly this, but the following Jolt spec should give you the expected output from the sample you provided: [
// Shift each nested entry under "content": the matched key
// names map to error/product/ErrorType, and the value maps to q.
{
"operation": "shift",
"spec": {
"content": {
"*": {
"*": {
"*": {
"$": "error",
"$1": "product",
"$2": "ErrorType",
"@": "q"
}
}
}
}
}
}
]
If that helps, please accept the solution. Thanks
01-13-2023
06:53 AM
I was finally able to figure out the problem. To resolve this issue, the py/jar file specified in "appResource" & "spark.jars" needs to be accessible by all nodes in the cluster. For example, if you have a network path, you can specify it in both attributes as follows:

"appResource": "file:////Servername/somefolder/HelloWorld.jar",
...
"spark.jars": "file:////Servername/someFolder/HelloWorld.jar",

Not sure why this is needed if the job is being submitted to the master. If anybody knows, please help me understand.
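For context, a minimal submission body for Spark's standalone REST submission endpoint might look like the sketch below (the Spark version, main class, master host, and port are placeholders, not values from my setup):

```json
{
  "action": "CreateSubmissionRequest",
  "appResource": "file:////Servername/somefolder/HelloWorld.jar",
  "clientSparkVersion": "3.3.0",
  "mainClass": "HelloWorld",
  "appArgs": [],
  "environmentVariables": { "SPARK_ENV_LOADED": "1" },
  "sparkProperties": {
    "spark.app.name": "HelloWorld",
    "spark.jars": "file:////Servername/someFolder/HelloWorld.jar",
    "spark.master": "spark://master-host:6066",
    "spark.submit.deployMode": "cluster"
  }
}
```

Note that both "appResource" and "spark.jars" point at the same shared network path, which is what resolved the issue for me.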
12-15-2022
05:16 AM
@Bello as this is an older post, you would have a better chance of receiving a resolution by starting a new thread. This will also be an opportunity to provide details specific to your environment that could aid others in assisting you with a more accurate answer to your question. You can link this thread as a reference in your new post.
12-03-2022
09:35 AM
I am dealing with a Kafka dataset where multiple types of message data are coming in.

Sample data:

EventType 1:

{
  "type": "record",
  "name": "Dispatch_Accepted",
  "namespace": "accepted.avro",
  "fields": [
    { "name": "John", "type": "string", "doc": "Name of the user account" },
    { "name": "email", "type": "string", "doc": "The email of the user logging message on the blog" },
    { "name": "timestamp", "type": "long", "doc": "time in seconds" }
  ],
  "doc:": "A basic schema of Dispatch_Rejected"
}

EventType 2:

{
  "type": "record",
  "name": "Dispatch_Rejected",
  "namespace": "rejected.avro",
  "fields": [
    { "name": "Merry", "type": "string", "doc": "Name of the user" },
    { "name": "email", "type": "string", "doc": "The email of the user logging message on the blog" },
    { "name": "timestamp", "type": "long", "doc": "time in seconds" }
  ],
  "doc:": "A basic schema Rejected data"
}

The schema of the data is validated against the Confluent Schema Registry (working fine). I need to apply a filter on the schema name (Dispatch_Rejected and Dispatch_Accepted) and create two separate data files for each, so I am using a QueryRecord processor with the queries below:

<Dispatch_Rejected> = SELECT * FROM FLOWFILE WHERE name='Dispatch_Rejected'
<Dispatch_Accepted> = SELECT * FROM FLOWFILE WHERE name='Dispatch_Accepted'

This is not working; it can't identify the schema name. The controller service is working fine.

1- How can I pick the schema name from the controller service?
2- Should I assign the value ${schema.name} to another variable <My_Schema> and write the SELECT statements like:

<Dispatch_Rejected> = SELECT * FROM FLOWFILE WHERE My_Schema.name='Dispatch_Rejected'
<Dispatch_Accepted> = SELECT * FROM FLOWFILE WHERE My_Schema.name='Dispatch_Accepted'

Summary: I want to filter the data based on eventType and create separate data files. Please help.
12-01-2022
02:33 AM
Hi, thanks for the details. Unfortunately it is not working; I get an empty array [] as output. I have tried it with both extract and split mode. I applied the Schema Text property as suggested, with "NestedKey" and "nestedValue" as names. Neither gives me any output. Meanwhile I have achieved what I wanted using SplitContent and then another Jolt processor afterwards. Of course it would be more elegant if I could make it work with ForkRecord.
11-29-2022
09:27 AM
Hi, I think after you split your CSV you need to extract the values of both columns, status and client_id, into attributes and then use them in the ExecuteSQL processor. For that you need to:

1- Convert the record from CSV to JSON format using the ConvertRecord processor.

2- Use EvaluateJsonPath to extract both columns into defined attributes (dynamic properties). Make sure to set the Destination property to "flowfile-attribute".

After that you can reference those attributes in the SQL query as ${status} & ${client_id}, assuming that's how you named the attributes in step 2.

Another option, if you don't want to use two processors, is to use the ExtractText processor and provide a regex to extract each value, but you have to be careful how you define the regex for each value to make sure you are pulling only those values and nothing else.

Hope that helps. If that answers your question, please accept the solution. Thanks
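As a sketch, the steps above could be configured like this (the JSON paths, attribute names, and table name are assumptions based on your column names, not from your flow):

```
# EvaluateJsonPath dynamic properties (Destination = flowfile-attribute)
status    = $.status
client_id = $.client_id

# ExecuteSQL "SQL select query" property (my_table is a placeholder)
SELECT * FROM my_table
WHERE status = '${status}' AND client_id = '${client_id}'
```

Each dynamic property name becomes the attribute name on the flowfile, which is why the Expression Language references in the query must match them exactly.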
11-28-2022
08:51 PM
@SAMSAL Thank you for your help.
11-28-2022
12:01 PM
@Mohamed_Shaaban I recommend starting a new community question with the details specific to your setup. This allows the community to address/assist with your specific setup versus comparing your issue to what was shared in this post. Thanks, Matt