Member since: 06-08-2017
Posts: 1049
Kudos Received: 518
Solutions: 312
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 11242 | 04-15-2020 05:01 PM |
| | 7150 | 10-15-2019 08:12 PM |
| | 3129 | 10-12-2019 08:29 PM |
| | 11552 | 09-21-2019 10:04 AM |
| | 4360 | 09-19-2019 07:11 AM |
09-13-2018
05:27 PM
1 Kudo
@Mustafa Ali Qizilbash
You can achieve this case by using the MergeContent processor. MergeContent configs: by configuring Minimum Number of Entries to 2, the processor waits until it has 2 entries. But if we get 2 flowfiles from Location1 itself, MergeContent is going to merge those flowfiles into 1. This flow only works when we get exactly one flowfile from each source; if you haven't got any flowfile from Location2, the processor just waits indefinitely until it gets another flowfile. To avoid this case, use a reasonable Max Bin Age for your use case; the processor will then forcefully route the flowfile to the merged relationship. Please refer to this link for configuring the MergeContent processor.
(or) If your header is always the same:
1. With the new record-oriented processor capabilities you can ignore the header coming from Location1 and configure the ConvertRecord processor to add the header to the incoming data.
2. Using the ReplaceText processor we can add the header to the Location2 file. Refer to this link for more details regarding this method.
09-12-2018
03:18 PM
@sandra Alvarez
The ExecuteInfluxDBQuery processor was added in NiFi 1.7.0+; I think you are using a version older than NiFi 1.7. You need to update your HDF/NiFi version to get these new processors, or, if you have Hortonworks support, contact that team to get this processor added to your NiFi version.
09-12-2018
02:55 PM
@sandra Alvarez
All the Put processors in NiFi are used to store data into filesystems/databases/tables. There is an ExecuteInfluxDBQuery processor to get data from InfluxDB; refer to this and this link for more details regarding the ExecuteInfluxDBQuery processor. Then you can use the PutElasticsearch processors to store the data into Elasticsearch.
- If the answer helped resolve your issue, click the Accept button below to accept it; that helps community users find solutions for these kinds of issues quickly.
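As a rough sketch only, the Query property of ExecuteInfluxDBQuery just takes an InfluxQL statement; the measurement, field, and tag names below are assumptions for illustration, not from your setup:
-- hypothetical InfluxQL query for the ExecuteInfluxDBQuery "Query" property
-- "cpu_load", "value", and "host" are assumed names; replace them with your own schema
SELECT mean("value")
FROM "cpu_load"
WHERE time > now() - 1h
GROUP BY time(5m), "host"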
09-12-2018
01:12 PM
1 Kudo
@n c As the first column in your table has a struct type, we need to use the named_struct function while inserting the data.
Table definition:
hive> desc testtbl;
+-----------+-------------------------------------------------------+----------+--+
| col_name | data_type | comment |
+-----------+-------------------------------------------------------+----------+--+
| id | struct<tid:string,action:string,createdts:timestamp> | |
| cid | string | |
| anumber | string | |
+-----------+-------------------------------------------------------+----------+--+
Inserting data into testtbl:
hive> insert into testtbl select named_struct('tid',"1",'action',"post",'createdts',timestamp(150987427)),string("1241"),string("124") from (select '1') t;
Selecting data from the table:
hive> select * from testtbl;
+--------------------------------------------------------------------+--------------+------------------+--+
| testtbl.id | testtbl.cid | testtbl.anumber |
+--------------------------------------------------------------------+--------------+------------------+--+
| {"tid":"1","action":"post","createdts":"1970-01-02 12:56:27.427"} | 1241 | 124 |
+--------------------------------------------------------------------+--------------+------------------+--+
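Individual members of the struct can also be read back with dot notation; a quick sketch against the same table:
-- dot notation pulls each field out of the id struct column
select id.tid, id.action, id.createdts from testtbl;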
- If the answer helped resolve your issue, click the Accept button below to accept it; that helps community users find solutions for these kinds of issues quickly.
09-11-2018
03:58 AM
@Andrew Bailon By using the PutDatabaseRecord processor we don't need any of the ReplaceText, PutSQL, or ConvertJSONToSQL processors at all: PutDatabaseRecord reads the incoming data using the configured Record Reader controller service, creates the insert/update statements based on the Statement Type we have selected, and executes them against the target database.
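For intuition only: assuming an incoming JSON record with fields emp_id and emp_name, Statement Type set to INSERT, and a hypothetical employee table (all names here are assumptions), the processor ends up executing something roughly equivalent to:
-- roughly the statement PutDatabaseRecord builds from the record fields
-- (table and column names are assumed for illustration)
INSERT INTO employee (emp_id, emp_name) VALUES (1, 'Ann');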
09-08-2018
01:27 PM
@Saravanan Subramanian For this case, please look into this link for storing and fetching the state from a distributed map cache. With this approach we update the state only when the pull has succeeded; if the pull fails, we don't store the state.
09-07-2018
03:19 PM
2 Kudos
@Kamlesh Pant
In your ExecuteSQL processor, set the property Normalize Table/Column Names to true.
ExecuteSQL configs: in the query, enclose the column names in square brackets [].
- If the answer helped resolve your issue, click the Accept button below to accept it; that helps community users find solutions for these kinds of issues quickly.
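For example, in the SQL you configure on ExecuteSQL you could bracket the problematic column names and alias them to Avro-friendly names; the table and column names below are assumptions, and the bracket syntax assumes a SQL Server-style source:
-- square brackets protect names with spaces/special characters,
-- and the aliases give ExecuteSQL clean Avro field names
SELECT [Order Id]  AS order_id,
       [Ship Date] AS ship_date
FROM   [Sales Orders];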
09-07-2018
12:55 PM
@Mitthu Wagh We can use the Wait/Notify processors with a Target Signal Count of 2; the Wait processor releases the flowfile once the signal count reaches 2. Refer to this link for usage/configuration of the Wait/Notify processors. Another way is to use the MergeContent processor with a Correlation Attribute Name and Minimum Number of Entries of 2; the processor then waits for 2 flowfiles with the same attribute value, merges them, and the mail can be sent based on that. Refer to this and this link for the same use case with the MergeContent processor.
- If the answer helped resolve your issue, click the Accept button below to accept it; that helps community users find solutions for these kinds of issues quickly.
09-05-2018
10:01 PM
@Andrew Bailon We need to extract the values, add them as flowfile attributes, and in the ReplaceText processor prepare an insert statement that maps to those values, e.g. insert into employee values (${value1},${value2}..). Refer to this and this link to get familiar with how to replace the question marks and with the usage of the PutSQL processor.
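A minimal sketch of what the ReplaceText Replacement Value could look like (Replacement Strategy = Always Replace), assuming the extracted attributes are named value1 and value2 and the target table is employee:
-- the NiFi Expression Language references are resolved per flowfile before PutSQL executes the statement;
-- quote or not depending on the column types of your table
INSERT INTO employee VALUES ('${value1}', '${value2}')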
09-05-2018
02:36 AM
@Jatinmaya Choudhury
I'm not able to recreate the same scenario on my end. Here is what I tried.
JSON file:
{"user":{"userlocation":"Cinderford, Gloucestershire","id":230231618,"name":"Aimee","screenname":"Aimee_Cottle","geoenabled":true},"tweetmessage":"Gastroenteritis has pretty much killed me this week :( off work for a few days whilst I recover!","createddate":"2013-06-20T12:08:14","geolocation":null}
{"user":{"userlocation":"Garena ID : NuraBlazee","id":635239939,"name":"Axyraf.","screenname":"Asyraf_Fauzi","geoenabled":false},"tweetmessage":"RT @abhigyantweets: Can't stop a natural disaster but so many lives in U'khand wouldn't have been lost if there was disaster preparedness. É","createddate":"2013-06-20T12:08:16","geolocation":null}
{"user":{"userlocation":"Gemert,Netherlands","id":21418083,"name":"Ad van Steenbruggen","screenname":"torment00","geoenabled":true},"tweetmessage":"? Listening to 'The Constant' by 'Anthrax' from 'Worship Music","createddate":"2013-06-20T12:08:20","geolocation":null}
Hive DDL:
Create external table tweets(
`user` struct<userlocation:string,id:string,name:string,screenname:string,geoenabled:string>,tweetmessage string,createddate string,geolocation string)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe' LOCATION '/user/hive/data/';
hive> select * from tweets;
+----------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------+----------------------+--------------+--+
| user | tweetmessage | createddate | geolocation |
+----------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------+----------------------+--------------+--+
| {"userlocation":"Cinderford, Gloucestershire","id":"230231618","name":"Aimee","screenname":"Aimee_Cottle","geoenabled":"true"} | Gastroenteritis has pretty much killed me this week :( off work for a few days whilst I recover! | 2013-06-20T12:08:14 | NULL |
| {"userlocation":"Garena ID : NuraBlazee","id":"635239939","name":"Axyraf.","screenname":"Asyraf_Fauzi","geoenabled":"false"} | RT @abhigyantweets: Can't stop a natural disaster but so many lives in U'khand wouldn't have been lost if there was disaster preparedness. É | 2013-06-20T12:08:16 | NULL |
| {"userlocation":"Gemert,Netherlands","id":"21418083","name":"Ad van Steenbruggen","screenname":"torment00","geoenabled":"true"} | ? Listening to 'The Constant' by 'Anthrax' from 'Worship Music | 2013-06-20T12:08:20 | NULL |
+----------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------+----------------------+--------------+--+
Selecting a specific field:
hive> select `user`.userlocation from tweets;
+------------------------------+--+
| userlocation |
+------------------------------+--+
| Cinderford, Gloucestershire |
| Garena ID : NuraBlazee |
| Gemert,Netherlands |
+------------------------------+--+
As I'm able to get all three rows of data, try the above Hive DDL statement and check whether you are able to get the data or not.
- If the answer helped resolve your issue, click the Accept button below to accept it; that helps community users find solutions for these kinds of issues quickly.
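One extra note on the SerDe used in the DDL above: org.apache.hive.hcatalog.data.JsonSerDe ships in the hive-hcatalog-core jar, so if Hive complains that the class is not found you may need to add that jar first. The path below is a typical HDP location and is only an assumption; adjust it for your install:
-- register the hcatalog jar that provides the JSON SerDe class
ADD JAR /usr/hdp/current/hive-webhcat/share/hcatalog/hive-hcatalog-core.jar;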