Member since: 11-28-2017
Posts: 6
Kudos Received: 0
Solutions: 0
01-25-2018
08:51 PM
Thanks for the reply and help, Matt. I was actually looking for a way to go through the output of a query record by record, and I don't think I explained it properly in my question. I thought there was no way to do it in NiFi, since the output of ExecuteSQL is a continuous stream, and that I might have to devise a way to loop through the output. But then I found out about SplitAvro, and using it after the ExecuteSQL processor with 1 record per split did exactly what I wanted :). Thanks, Samir
01-22-2018
08:58 PM
Here is my problem. I have multiple rows coming out of a table, and I have to read each row one by one and execute a SQL query using values from that row. For example, if I have 10 ids and their related fields, I have to read the first row, execute a SQL query using its values, then move to the next id and its related fields and execute the SQL again with the 2nd id's details, and so on. Is there a looping mechanism in NiFi I can use? I have seen examples of EvaluateJsonPath followed by RouteOnAttribute; however, as my values are unknown, I can't use RouteOnAttribute. I was thinking of writing the rows to a file and then reading them one by one with a loop in ExecuteScript. Is that the only choice? Appreciate your help! @Matt Burgess @Bryan Bende
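Outside NiFi, the per-row pattern being described looks like the minimal sketch below (the tables, columns, and queries are made up for illustration; in NiFi the same effect comes from splitting the result set and running a query per record):

```python
import sqlite3

# Hypothetical tables just for illustration; in NiFi this loop is what
# ExecuteSQL -> SplitAvro (1 record per split) -> per-record query achieves.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ids (id INTEGER, region TEXT)")
conn.execute("CREATE TABLE details (id INTEGER, amount REAL)")
conn.executemany("INSERT INTO ids VALUES (?, ?)", [(1, "east"), (2, "west")])
conn.executemany("INSERT INTO details VALUES (?, ?)", [(1, 10.0), (2, 20.0)])

results = []
# Read each row once, then run a second query using that row's values.
for row_id, region in conn.execute("SELECT id, region FROM ids ORDER BY id"):
    cur = conn.execute("SELECT amount FROM details WHERE id = ?", (row_id,))
    results.append((row_id, region, cur.fetchone()[0]))

print(results)  # -> [(1, 'east', 10.0), (2, 'west', 20.0)]
```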
Labels:
- Apache NiFi
01-12-2018
09:48 PM
Thanks for the response, Bryan. In the step ListHDFS -> RPG (this part only runs on the primary node), the question is: how do we restrict this flow to only the primary node? Whatever I create on one node is replicated on all the nodes, so when I create an RPG pointing at, say, node2, it actually has the same flows as node1, and then it looks like a loop. The flows I have created are kind of complex and I can't share them here, but I hope the attached diagram (nifi-node-flow-replication-confusion.jpg) makes clear what I am talking about. I think we have to configure things so that the primary node's flows are not replicated on the slave nodes. Really appreciate your help. Thanks, Samir
01-10-2018
10:30 PM
Hi Bryan, really nice and helpful article! I am new to NiFi and trying to use clustering for a data-pull scenario from SQL and Greenplum database servers. I was thinking the best approach for clustering would be similar to the HDFS example you've shown here. However, the part I am getting confused by is that when I create an RPG, it links to another node of the cluster carrying the same data flow as the parent one (where the RPG is created), so it looks like an endless loop: the first node feeding the 2nd node and back. So I am not sure if I am doing the right thing. Moreover, the cluster hangs when I do this; maybe it's related to how we configured it, but I still wanted to know if this is the way it should be done. Do you know of any place with sample/example XML dataflows using an RPG? Please let me know if you need more clarification on this. Appreciate your help! Thanks, Samir
12-12-2017
07:25 PM
Thanks so much for the reply and details. I removed the ConvertAvroToJSON and ConvertJSONToAvro processors, replaced them with ConvertRecord, and it worked fine! However, after conversion to JSON, the decimal numbers are missing their trailing zeroes. For example, if the number in the database is 37.8531000000, it is getting stripped down to 37.8531, losing the trailing zeroes. I know it should not matter, but we're trying to pull the data as-is without any changes, so I just wanted to know if there is any way to retain them? Thanks again, appreciate your help!
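For what it's worth, the trailing zeroes are lost because a JSON number has no notion of them: 37.8531000000 and 37.8531 are the same numeric value. A quick sketch in plain Python (not NiFi itself) shows the behavior, and one workaround if the consumer can parse the field as a decimal:

```python
import json
from decimal import Decimal

# A JSON number round-tripped through a float drops trailing zeroes,
# because the zeroes are not part of the numeric value.
parsed = json.loads("37.8531000000")
print(parsed)  # 37.8531

# Keeping the digits requires treating the field as a decimal (or string),
# e.g. parsing with Decimal, which preserves the original digit string:
exact = json.loads('{"field2": 37.8531000000}', parse_float=Decimal)
print(exact["field2"])  # 37.8531000000
```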
12-08-2017
10:01 PM
convertavrotojson.jpg convertjsontoavro.jpg

Hi All, I am new to NiFi and trying to solve an issue I am facing with Avro-to-JSON and JSON-to-Avro conversion using the ConvertAvroToJSON and ConvertJSONToAvro processors. Everything works fine when I use string, int, or boolean datatypes; however, when I throw a long decimal number into the schema, it refuses to work:

1. ConvertAvroToJSON - I receive an ArrayIndexOutOfBounds exception when I supply the Avro schema.
2. ConvertJSONToAvro - I receive "Failed to convert 1/1 records from Json to Avro" (I am running one record to test).

The flows I am using are straightforward:

1. ExecuteSQL (from MSSQL database) > ConvertAvroToJSON (I get the above error when I specify the Avro schema) > PutFile
2. ExecuteSQL (from MSSQL database) > ConvertAvroToJSON (no schema in the Avro schema property) > ConvertJSONToAvro (here I specify the Avro schema) > PutFile

Following is the Avro schema I am using for the decimal logical type:

{
  "type": "record",
  "name": "test",
  "namespace": "any.data",
  "fields": [
    {
      "name": "field1",
      "type": ["null", "string"]
    },
    {
      "name": "field2",
      "type": [
        "null",
        {
          "type": "bytes",
          "logicalType": "decimal",
          "precision": 32,
          "scale": 10
        }
      ]
    },
    {
      "name": "field3",
      "type": ["null", "boolean"]
    }
  ]
}

I've tried different versions of the schema but it's still not working. The one row I am pulling has the field2 value 37.8531000000.

PS - I am using all of this because I want to use DetectDuplicate, with a flow like: ExecuteSQL > ConvertAvroToJSON > EvaluateJsonPath > DetectDuplicate > ConvertJSONToAvro > ... {use more processors to do validations} ... > PutFile. Using DetectDuplicate I am basically limiting ExecuteSQL to writing only one file, by setting the Age Off Duration to 5 hours (we're just doing batch processing for now), instead of ExecuteSQL writing multiple files. Really appreciate your help!!
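As a side note on what the decimal logical type in the schema above means: Avro stores the unscaled integer value (value × 10^scale) as big-endian two's-complement bytes, with precision and scale carried only in the schema. This is a small sketch of that encoding in plain Python for the sample value, independent of NiFi and not a fix for the processor errors:

```python
from decimal import Decimal

# Avro's decimal logical type stores the unscaled integer as big-endian
# two's-complement bytes; precision and scale live only in the schema.
scale = 10
value = Decimal("37.8531000000")

unscaled = int(value.scaleb(scale))  # 378531000000
raw = unscaled.to_bytes((unscaled.bit_length() + 8) // 8, "big", signed=True)
print(unscaled, raw.hex())

# Decoding reverses the process and restores the scale from the schema:
decoded = Decimal(int.from_bytes(raw, "big", signed=True)).scaleb(-scale)
print(decoded)  # 37.8531000000
```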
Labels:
- Apache NiFi