Member since 06-08-2017 | 1049 Posts | 517 Kudos Received | 312 Solutions
09-24-2017
04:55 AM
Hi @Biswajit Chakrabort, since your logs roll daily, I tried the TailFile processor with the File(s) to Tail property set as follows:
/my/path/directory/my-app-${now():format("yyyy-MM-dd")}.log
This expression resolves to my-app-2017-09-24.log in /my/path/directory (for today's date) and tails that file if it is present in the directory.
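To check which file name the expression points at on a given day, here is a minimal Python sketch. It only mirrors the date formatting outside NiFi; the function name resolve_tail_path is hypothetical, and the directory and file prefix are the ones used in the post.

```python
from datetime import date


def resolve_tail_path(today: date) -> str:
    """Mimic ${now():format("yyyy-MM-dd")} for the path used in the post."""
    # strftime's %Y-%m-%d corresponds to the yyyy-MM-dd pattern in the expression.
    return f"/my/path/directory/my-app-{today.strftime('%Y-%m-%d')}.log"


print(resolve_tail_path(date(2017, 9, 24)))
```

Because the date is evaluated at run time, the resolved path changes at midnight, which is what lets one property value follow a daily rolling log.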
09-24-2017
02:36 AM
@Shailesh Nookala I don't think the header is causing these issues, as I tried the same approach as you and it works as expected. However, I noticed that in your flow you have connected every available relationship on each processor to the next processor, and that is what is causing the issues. I also see a PutFile processor instead of PutSQL in your flow. Are you storing the prepared records to a file? I thought you were pushing the prepared records to SQL Server; is my assumption wrong?
Connect only the following relationships to the next processors:-
GetFile(Success)-->InferAvroSchema(Success)-->ConvertCSVToAvro(Success)-->ConvertAvroToJSON(Success)-->SplitJson(Split)-->EvaluateJsonPath(Matched)-->ReplaceText(Success)-->PutSQL(auto-terminate Success).
I have attached my .xml file (putsql.xml); use it and change it as per your needs.
Input:-
name,adresse,zip
1,hi,8000
InferAvroSchema configs:- infer-avro.png
ConvertCSVToAvro configs:- csvtoavro.png
Output:-
{
"name" : 1,
"adresse" : "hi",
"zip" : 8000
}
09-22-2017
08:04 PM
@Ravi Koppula, I think you haven't removed the run schedules that you used to trigger the ExecuteStreamCommand processors before adding the GenerateFlowFile processor. Remove all of those run schedules and set them back to the default: Timer driven with a Run Schedule of 0 sec. Alternatively, you can use Event driven scheduling, so that the processor is triggered whenever a flowfile arrives.
09-22-2017
06:35 PM
1 Kudo
Hi @Ravi Koppula, I don't think there is a way to schedule process groups, but there is a way to trigger all of the processors in one shot.
1. As you are using the ExecuteStreamCommand processor, it accepts incoming connections, and it can be scheduled as Timer driven, CRON driven, or Event driven.
2. Use a GenerateFlowFile processor, connect its success relationship to all of the ExecuteStreamCommand processors, and schedule GenerateFlowFile to run on either a CRON or a timer-driven schedule.
3. GenerateFlowFile runs at the scheduled time and sends a trigger flowfile that runs all of the ExecuteStreamCommand processors.
Sample flow:-
09-22-2017
01:57 PM
Hi @Simon Jespersen, in your EvaluateJsonPath processor you have set the Path Not Found Behavior property to warn, i.e. the processor generates a warning whenever a JSON path expression is not found. In your CSV file, some of the records don't have any data for zip. This WARN message won't affect your flowfile: the flowfile still routes to the success relationship, all of the available content is extracted as attributes, and attributes with no content get the value Empty string set.
If you don't want to see those WARN messages on the processor, just change the Path Not Found Behavior property to ignore (the default), which ignores any JSON path expression for which no content is found.
Example:- I recreated the same WARN message that you are seeing with the JSON document below:
{
"name" : "else",
"adresse" : "route 66",
"by" : "Hadoop City"
}
With ignore:- With Path Not Found Behavior set to ignore, the processor does not return any WARN messages for this document, because it ignores JSON path expressions that have no content.
With warn:- If you change Path Not Found Behavior to warn, the processor returns the same WARN message that you have in the question.
Both cases produce the same output: the zip attribute value is Empty string set and the flowfile routes to the success relationship.
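The difference between the two settings can be pictured with a small Python sketch. This is not NiFi code: extract_attributes and its not_found parameter are hypothetical names, and only top-level keys are looked up, as a stand-in for simple JSON paths such as $.zip.

```python
import json
import logging

logger = logging.getLogger("evaluate_json_path_sketch")


def extract_attributes(doc: str, keys: list, not_found: str = "ignore") -> dict:
    """Copy top-level JSON fields into string attributes.

    A missing field becomes an empty string in both modes; the only
    difference is whether a warning is logged for it.
    """
    record = json.loads(doc)
    attributes = {}
    for key in keys:
        if key in record:
            attributes[key] = str(record[key])
        else:
            if not_found == "warn":
                logger.warning("no value found for JSON path $.%s", key)
            attributes[key] = ""
    return attributes


doc = '{"name": "else", "adresse": "route 66", "by": "Hadoop City"}'
keys = ["name", "adresse", "by", "zip"]
ignored = extract_attributes(doc, keys, not_found="ignore")
warned = extract_attributes(doc, keys, not_found="warn")
```

Both calls return the same attributes, with zip set to an empty string; only the warn call logs a message.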
09-21-2017
10:03 PM
@Shailesh Nookala Sure, if you want to insert the data into SQL Server, we can do that with NiFi. You don't have to download the file at all. There are a couple of ways to do it. One is to use the ExtractText processor to extract all the contents of the CSV file into attributes, prepare the insert statement with the ReplaceText processor, and push the data to SQL Server. Another method is to prepare a JSON document from the input CSV file, extract its fields as flowfile attributes, and prepare the records to insert into SQL Server.
I'm using the JSON method, which makes it easy to extract the attributes of flowfiles.
Example:- Let's say you have a CSV file with just 2 records in it:
1,test,1000
2,test1,2000
As I'm using NiFi 1.1, there is no single processor that converts the incoming CSV file to JSON. We need to follow InferAvroSchema --> ConvertCSVToAvro --> ConvertAvroToJSON; the output will be in JSON format with the same contents as our input CSV file.
In newer versions of NiFi (I think from 1.3) you can convert CSV to JSON directly with the ConvertRecord processor.
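As a rough picture of what the CSV-to-JSON conversion produces, here is a small Python sketch. It does not use Avro or NiFi; the csv_to_json helper is hypothetical, the id,name,salary header is supplied by hand, and turning all-digit fields into integers is a simplification standing in for schema inference.

```python
import csv
import io
import json


def csv_to_json(csv_text: str, header: list) -> str:
    """Turn header-less CSV rows into a JSON array of records."""
    reader = csv.DictReader(io.StringIO(csv_text), fieldnames=header)
    records = []
    for row in reader:
        # Simplified type inference: all-digit fields become integers.
        records.append({k: int(v) if v.isdigit() else v for k, v in row.items()})
    return json.dumps(records)


json_text = csv_to_json("1,test,1000\n2,test1,2000\n", ["id", "name", "salary"])
print(json_text)
```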
InferAvroSchema:- In this processor I'm giving the definition of the incoming CSV file. Alternatively, you can read the header definition from the incoming flowfile by changing the Get CSV Header Definition From Data property to true, so that NiFi reads the definition from the first line of the file.
As I know my incoming file has id,name,salary, I have given this definition and set Schema Output Destination to flowfile-attribute, so that we can make use of this schema in the ConvertCSVToAvro processor.
ConvertCSVToAvro:- Change the Record Schema property to ${inferred.avro.schema}, which reads the inferred schema from the flowfile attribute.
ConvertAvroToJSON:- Drag and drop a ConvertAvroToJSON processor and leave the properties at their defaults.
Output:-
[{"id": 1, "name": "test", "salary": 1000},{"id": 2, "name": "test1", "salary": 2000}]
SplitJson:- If you have more than 1 record in your CSV file, use the SplitJson processor, because when ConvertAvroToJSON finds more than one record in the flowfile it builds an array and keeps all the records inside it, and we need only 1 record per flowfile before inserting the data into SQL Server.
If your flowfile contains an array, the JsonPath Expression property should be $.* so that the array of records is split dynamically into one record per flowfile.
Input:-
[{"id": 1, "name": "test", "salary": 1000},{"id": 2, "name": "test1", "salary": 2000}]
Output:-
flowfile1:-
{"id": 1, "name": "test", "salary": 1000}
flowfile2:-
{"id": 2, "name": "test1", "salary": 2000}
SplitJson has split the array into individual records.
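The effect of splitting on $.* can be sketched in plain Python. The split_json_array helper is a hypothetical stand-in for the processor: it takes the array content of one flowfile and returns one JSON string per record.

```python
import json


def split_json_array(content: str) -> list:
    """Return one JSON document per element of the top-level array."""
    records = json.loads(content)
    return [json.dumps(record) for record in records]


content = (
    '[{"id": 1, "name": "test", "salary": 1000},'
    '{"id": 2, "name": "test1", "salary": 2000}]'
)
flowfiles = split_json_array(content)
for number, flowfile in enumerate(flowfiles, start=1):
    print(f"flowfile{number}: {flowfile}")
```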
EvaluateJsonPath configs:- In this processor we extract all the contents of the flowfile into attributes. Change the Destination property to flowfile-attribute, then add new properties by clicking the + symbol in the right corner:
id as $.id
name as $.name
salary as $.salary
ReplaceText configs:- In this processor we prepare the insert statement. Change the Replacement Strategy property to Always Replace, then write the insert statement in Replacement Value with the destination table name, using the extracted attributes to fill in the values dynamically. String values such as name need single quotes, otherwise SQL Server reads them as column names:
insert into sqlserver_tablename (id,name,salary) values (${id},'${name}',${salary})
Output:-
flowfile1 content:-
insert into sqlserver_tablename (id,name,salary) values (1,'test',1000)
flowfile2 content:-
insert into sqlserver_tablename (id,name,salary) values (2,'test1',2000)
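The attribute substitution can be tried outside NiFi with Python's string.Template, whose ${name} placeholders happen to look like the NiFi expressions here. This is only an illustration: build_insert is a hypothetical helper, the name value is wrapped in single quotes because SQL Server expects string literals that way, and no escaping of quotes inside values is attempted.

```python
from string import Template

# Statement from the post, with the string column quoted.
INSERT_TEMPLATE = Template(
    "insert into sqlserver_tablename (id,name,salary) "
    "values (${id},'${name}',${salary})"
)


def build_insert(attributes: dict) -> str:
    """Fill the insert statement from one record's extracted attributes."""
    return INSERT_TEMPLATE.substitute(attributes)


records = [
    {"id": 1, "name": "test", "salary": 1000},
    {"id": 2, "name": "test1", "salary": 2000},
]
statements = [build_insert(record) for record in records]
for statement in statements:
    print(statement)
```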
PutSQL configs:- Use the PutSQL processor, create a connection pool to SQL Server, enable it, and select it in the processor. PutSQL then executes the insert statements held in the contents of the flowfiles, and all the data gets inserted into SQL Server.
Flow screenshot:-
09-21-2017
04:52 AM
1 Kudo
Hi @Shailesh Nookala, the PutFile processor writes the contents of a FlowFile to the local file system.
1. You can find the file only on the nodes where NiFi is running. As the Create Missing Directories property is true, NiFi creates those directories for you and stores the CSV file in your directory (i.e. /Users/nraghuram/...).
Example:-
1. Say you have 2 NiFi nodes (E01, E02) and you are running the GetFile processor only on the primary node (consider E02 to be primary). The file then gets stored in the directory specified in your processor on the primary node (in our case E02).
2. In this example E02 is the primary node, and I am running the PutFile processor to store the file in a local directory.
3. As highlighted below, the file is on the E02 node.
4. When I store the file, it ends up inside the /d1 directory on the E02 node only, because I'm running GetFile only on the primary node.
09-20-2017
06:31 PM
@sally sally, if you have the filename attribute, use a RouteOnAttribute processor to send flowfiles with the same filename to the same MergeContent processor. This way all flowfiles with a given filename arrive at one MergeContent processor, which should help resolve the issues with merging the flowfiles.
RouteOnAttribute properties:- To get only the flowfiles whose filename is 1:
${filename:equals('1')}
Example:- In RouteOnAttribute, add properties 1 and 2 and connect the 1 and 2 relationships to two separate MergeContent processors.
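The routing idea can be sketched in Python as grouping by the filename attribute before merging. Everything here is an assumption made for illustration: flowfiles are modelled as dicts with filename and content keys, and merging is plain concatenation, which is far simpler than what MergeContent offers.

```python
from collections import defaultdict


def route_and_merge(flowfiles: list) -> dict:
    """Group flowfile contents by filename, then merge each group."""
    bins = defaultdict(list)
    for flowfile in flowfiles:
        # Same filename -> same bin, like one relationship per filename.
        bins[flowfile["filename"]].append(flowfile["content"])
    return {name: "".join(parts) for name, parts in bins.items()}


incoming = [
    {"filename": "1", "content": "a"},
    {"filename": "2", "content": "x"},
    {"filename": "1", "content": "b"},
]
merged = route_and_merge(incoming)
print(merged)
```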
09-20-2017
04:03 PM
@Sami Ahmad, make sure you have given the paths to hive-site.xml, core-site.xml and hdfs-site.xml in the Hadoop Configuration Resources property. If you are using Kerberos, you also need to specify the Kerberos Principal and Kerberos Keytab properties in the PutHDFS processor.
Example:-
09-19-2017
08:56 PM
@sally sally, can you try the Search Value below?
^<[^>]+>(.*)\<\/\?.*\>$
ReplaceText configs:-
Input:-
<?xml version="1.0" encoding="utf-8"?>abc</?xml version="1.0" encoding="utf-8"?>
Output:-
<DailyData>abc</DailyData>
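The regular expression can be checked outside NiFi with a short Python sketch. The replacement value is not shown in the post, so the one below is an assumption chosen to reproduce the stated output; Python writes the first capture group as \1, where NiFi's ReplaceText would use $1.

```python
import re

# Search value copied from the post.
SEARCH_VALUE = r"^<[^>]+>(.*)\<\/\?.*\>$"
# Assumed replacement: wrap the captured text in <DailyData> tags.
REPLACEMENT = r"<DailyData>\1</DailyData>"


def rewrite(content: str) -> str:
    """Replace the malformed opening and closing tags around the payload."""
    return re.sub(SEARCH_VALUE, REPLACEMENT, content)


sample = '<?xml version="1.0" encoding="utf-8"?>abc</?xml version="1.0" encoding="utf-8"?>'
result = rewrite(sample)
print(result)
```

The first group captures everything between the opening tag and the literal </? that starts the malformed closing tag, which is abc for this input.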