Member since: 06-08-2017
Posts: 1049
Kudos Received: 517
Solutions: 312
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 9692 | 04-15-2020 05:01 PM
 | 5782 | 10-15-2019 08:12 PM
 | 2317 | 10-12-2019 08:29 PM
 | 9338 | 09-21-2019 10:04 AM
 | 3435 | 09-19-2019 07:11 AM
09-24-2017 04:55 AM
Hi @Biswajit Chakrabort, since your logs roll daily, I tried the TailFile processor with the File(s) to Tail property set as follows: /my/path/directory/my-app-${now():format("yyyy-MM-dd")}.log The above expression looks for the my-app-2017-09-24.log file in /my/path/directory and tails that file if it is present in the directory.
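For reference, a minimal Python sketch of how that expression resolves on a given day (NiFi evaluates the Expression Language itself; this only mirrors the date formatting):

```python
from datetime import date

# NiFi's ${now():format("yyyy-MM-dd")} corresponds to strftime("%Y-%m-%d").
path = f"/my/path/directory/my-app-{date.today().strftime('%Y-%m-%d')}.log"
print(path)  # e.g. /my/path/directory/my-app-2017-09-24.log
```

Because the expression is re-evaluated on each run, TailFile picks up the new file automatically when the date rolls over.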
09-24-2017 02:36 AM
@Shailesh Nookala I don't think the header is causing these issues; I tried the same approach as you and it works as expected. However, I noticed that in your flow you have connected all the available relationships on each processor to the next processor, and that is what is causing your issues. I also see a PutFile processor instead of PutSQL in your flow. Are you storing the prepared records to a file? I thought you were pushing the prepared records to SQL Server; is my assumption wrong? Connect only the following relationships to the next processors:
GetFile(Success)-->InferAvroSchema(Success)-->ConvertCSVToAvro(Success)-->ConvertAvroToJSON(Success)-->SplitJson(Split)-->EvaluateJsonPath(Matched)-->ReplaceText(Success)-->PutSQL(auto-terminate Success)
I have attached my template (putsql.xml); use it and make changes as per your needs.
Input:-
name,adresse,zip
1,hi,8000
InferAvroSchema configs:- infer-avro.png
ConvertCSVToAvro configs:- csvtoavro.png
Output:-
{
  "name" : 1,
  "adresse" : "hi",
  "zip" : 8000
}
09-22-2017 08:04 PM
@Ravi Koppula, I think you haven't removed the run schedules that you used to trigger the ExecuteStreamCommand processors before adding the GenerateFlowFile processor. Remove all of those run schedules and keep the defaults: Scheduling Strategy as Timer driven and Run Schedule as 0 sec. Alternatively, you can use Event Driven, so that whenever a flowfile arrives it triggers the processor.
09-22-2017 06:35 PM
1 Kudo
Hi @Ravi Koppula, I don't think there is a way to schedule process groups directly, but there is a way to trigger all of them in one shot.
1. Since you are using the ExecuteStreamCommand processor, it accepts incoming connections, and the processor can be scheduled as Timer driven, CRON driven, or Event driven.
2. Just use a GenerateFlowFile processor, connect its success relationship to all of the ExecuteStreamCommand processors, and schedule GenerateFlowFile to run on either a CRON or Timer driven schedule.
3. The GenerateFlowFile processor then runs at the scheduled time and emits a trigger flowfile that runs all of the other ExecuteStreamCommand processors.
Sample Flow:-
09-22-2017 01:57 PM
Hi @Simon Jespersen, in your EvaluateJsonPath processor you have the Path Not Found Behavior property set to warn, i.e. it generates a warning whenever a JSON path expression is not found, and in your CSV file some of the records have no data for zip. This WARN message doesn't affect your flowfile: the flowfile still routes to the success relationship, all the available content is extracted as attributes, and attributes with no content get the value "Empty string set". If you don't want to see those WARN messages on the processor, just change the Path Not Found Behavior property to ignore (the default), which ignores any JSON path expression whose content is not found.
Example:- I recreated the same WARN message you are seeing with the JSON doc below:
{
  "name" : "else",
  "adresse" : "route 66",
  "by" : "Hadoop City"
}
With the ignore property: feeding this JSON doc to the EvaluateJsonPath processor with ignore as the Path Not Found Behavior, the processor won't return any WARN messages, since it ignores JSON path expressions with no matching content. With the warn property: if you change the Path Not Found Behavior to warn, the processor returns the same WARN message you have in the question. Both cases produce the same output: the zip attribute value is "Empty string set" and the flowfile routes to the success relationship.
09-21-2017 10:03 PM
@Shailesh Nookala Sure, if you want to insert the data into SQL Server, we can do that with NiFi; you don't have to download the file at all. There are a couple of ways: use an ExtractText processor to extract all the contents of the CSV file into attributes, prepare the insert statement with a ReplaceText processor, and push the data to SQL Server; or prepare a JSON document from the input CSV file, extract all the attributes from the flowfile, and prepare the record to insert into SQL Server.
I'm using the JSON method, which makes it easy to extract flowfile attributes.
Example:- Let's say your CSV file has just 2 records in it:
1,test,1000
2,test1,2000
As I'm using NiFi 1.1, there is no direct processor that can convert the incoming CSV file to JSON; we need to chain InferAvroSchema --> ConvertCSVToAvro --> ConvertAvroToJSON, and the output will be JSON with the same contents as our input CSV file. In newer versions of NiFi (I think from 1.3) there are processors to convert CSV to JSON directly using the ConvertRecord processor.
InferAvroSchema:- In this processor I'm giving the incoming CSV file definition, or you can read the header definition from the incoming flowfile by setting the Get CSV Header Definition From Data property to true; NiFi can read the definition from the first line of the file. Since I know my incoming file has id,name,salary, I have given this definition and set Schema Output Destination to flowfile-attribute so we can make use of this schema in the ConvertCSVToAvro processor.
ConvertCSVToAvro:- Change the Record Schema property to ${inferred.avro.schema}, which infers the schema from the flowfile attribute.
ConvertAvroToJSON:- Drag and drop a ConvertAvroToJSON processor and leave the properties at their defaults.
Output:-
[{"id": 1, "name": "test", "salary": 1000},{"id": 2, "name": "test1", "salary": 2000}]
SplitJson:- If you have more than 1 record in your CSV file, use a SplitJson processor, because when the ConvertAvroToJSON processor finds more than one record in the flowfile it builds an array and keeps all the records inside it, and we need only 1 record per flowfile before inserting data into SQL Server. If your flowfile contains an array, the JsonPath Expression property should be $.*, which splits the array of records into one record per flowfile.
Input:-
[{"id": 1, "name": "test", "salary": 1000},{"id": 2, "name": "test1", "salary": 2000}]
Output:-
flowfile1: {"id": 1, "name": "test", "salary": 1000}
flowfile2: {"id": 2, "name": "test1", "salary": 2000}
SplitJson splits the array into individual records.
EvaluateJsonPath configs:- In this processor we extract all the contents of the flowfile into attributes by changing the Destination property to flowfile-attribute, then extract the contents of the JSON file by adding new properties with the + symbol in the top right corner:
id as $.id
name as $.name
salary as $.salary
ReplaceText configs:- We prepare the insert statement in this processor; change the Replacement Strategy property to Always Replace. Use an insert statement with the destination table name, and use the extracted attributes to fill in the values dynamically:
insert into sqlserver_tablename (id,name,salary) values (${id},${name},${salary})
Output:-
flowfile1 content: insert into sqlserver_tablename (id,name,salary) values (1,test,1000)
flowfile2 content: insert into sqlserver_tablename (id,name,salary) values (2,test1,2000)
PutSQL configs:- Use a PutSQL processor, create a connection pool to SQL Server, enable it, and use it. The PutSQL processor then executes all the insert statements we have as flowfile content, and all the data gets inserted into SQL Server.
FlowScreenshot:-
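To tie the steps together, here is a minimal Python sketch of what the chain does end to end, assuming the sample CSV and the placeholder table name from above (it mimics the transformations only; NiFi itself does not run this code):

```python
import csv
import io

csv_content = "1,test,1000\n2,test1,2000"
header = ["id", "name", "salary"]  # the definition given to InferAvroSchema

# InferAvroSchema + ConvertCSVToAvro + ConvertAvroToJSON: CSV -> list of records
records = [dict(zip(header, row)) for row in csv.reader(io.StringIO(csv_content))]

# SplitJson with $.*: one record per flowfile
for record in records:
    # EvaluateJsonPath: each field becomes a flowfile attribute;
    # ReplaceText (Always Replace): build the statement PutSQL will execute.
    stmt = ("insert into sqlserver_tablename (id,name,salary) "
            "values ({id},{name},{salary})".format(**record))
    print(stmt)
```

Running this prints the same two insert statements shown as the flowfile contents above.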
09-21-2017 04:52 AM
1 Kudo
Hi @Shailesh Nookala, the PutFile processor writes the contents of a flowfile to the local file system of the node it runs on. You will find the file only on the NiFi node(s) where the processor actually ran; since the Create Missing Directories property is true, NiFi creates those directories for you and stores the CSV file in your directory (i.e. /Users/nraghuram/...).
Example:-
1. Let's say you have 2 NiFi nodes (E01, E02) and you run the GetFile processor on the primary node only (consider E02 the primary); the flowfiles then exist only on the primary node, so the file gets stored in the directory specified in your processor on the primary node (in our case E02, because it is the primary node).
2. In this example E02 is the primary node, and I'm running the PutFile processor to store the file in a local directory.
3. As highlighted below, the file is on the E02 node.
4. When I store the file, it ends up inside the /d1 directory on the E02 node only, because I'm running GetFile only on the primary node.
09-20-2017 06:31 PM
@sally sally, if you have a filename attribute, then use a RouteOnAttribute processor to route files with the same filename into one MergeContent processor. This way, all flowfiles with the same filename arrive at a single MergeContent processor, which helps resolve the issues with merging the flowfiles.
RouteOnAttribute properties:- To route only filename 1 from the flowfiles: ${filename:equals('1')}
Example:- RouteOnAttribute:- Add properties 1 and 2, and connect the 1 and 2 relationships to two separate MergeContent processors.
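For intuition, a small Python sketch of the grouping idea, using the filenames 1 and 2 from the example (the flowfile contents are made up for illustration):

```python
# Group flowfiles by their filename attribute, mimicking RouteOnAttribute
# sending each ${filename:equals(...)} match to its own MergeContent queue.
flowfiles = [
    {"filename": "1", "content": "part-a"},
    {"filename": "2", "content": "part-b"},
    {"filename": "1", "content": "part-c"},
]

queues = {}
for ff in flowfiles:
    queues.setdefault(ff["filename"], []).append(ff["content"])

# Each queue would feed one MergeContent processor.
for name, parts in queues.items():
    print(name, "->", "".join(parts))
```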
09-20-2017 04:03 PM
@Sami Ahmad, make sure you have given the hive-site.xml, core-site.xml, and hdfs-site.xml paths in the Hadoop Configuration Resources property, and if you are using Kerberos then you also need to specify the Kerberos Principal and Kerberos Keytab properties in the PutHDFS processor. Example:-
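Since the original screenshot isn't available, here is a hypothetical set of property values; the property names are the PutHDFS ones from above, but the paths and principal are assumptions you must adapt to your cluster:

```python
# Hypothetical PutHDFS property values; adjust paths/principal to your cluster.
puthdfs_properties = {
    "Hadoop Configuration Resources":
        "/etc/hadoop/conf/core-site.xml,/etc/hadoop/conf/hdfs-site.xml",
    "Kerberos Principal": "nifi@EXAMPLE.COM",                  # assumption
    "Kerberos Keytab": "/etc/security/keytabs/nifi.keytab",    # assumption
}
for name, value in puthdfs_properties.items():
    print(f"{name} = {value}")
```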
09-19-2017 08:56 PM
@sally sally, can you make use of the Search Value property below?
^<[^>]+>(.*)\<\/\?.*\>$
ReplaceText configs:-
Input:- <?xml version="1.0" encoding="utf-8"?>abc</?xml version="1.0" encoding="utf-8"?>
Output:- <DailyData>abc</DailyData>
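To sanity-check the regex, a small Python sketch; the Replacement Value <DailyData>$1</DailyData> is my assumption based on the output shown (NiFi's ReplaceText uses $1 for the first capture group, Python's re.sub uses \1):

```python
import re

search = r'^<[^>]+>(.*)\<\/\?.*\>$'
text = '<?xml version="1.0" encoding="utf-8"?>abc</?xml version="1.0" encoding="utf-8"?>'

# Assumed Replacement Value: <DailyData>$1</DailyData> (re.sub syntax: \1)
result = re.sub(search, r'<DailyData>\1</DailyData>', text)
print(result)  # <DailyData>abc</DailyData>
```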