Member since: 06-08-2017
Posts: 1049
Kudos Received: 518
Solutions: 312
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 11242 | 04-15-2020 05:01 PM |
| | 7150 | 10-15-2019 08:12 PM |
| | 3129 | 10-12-2019 08:29 PM |
| | 11552 | 09-21-2019 10:04 AM |
| | 4360 | 09-19-2019 07:11 AM |
09-13-2018
05:27 PM
1 Kudo
@Mustafa Ali Qizilbash
You can achieve this case by using the MergeContent processor. MergeContent configs: by configuring Minimum Number of Entries to 2, the processor waits until it has 2 entries. But if we get 2 flowfiles from Location1 itself, MergeContent is going to merge those flowfiles into 1. This flow only works when we get exactly one flowfile from each source; if you haven't got any flowfile from Location2, the processor just waits indefinitely until it gets another flowfile. To avoid this case, use a reasonable Max Bin Age for your use case; the processor will then forcefully route the flowfile to the merged relationship. Please refer to this link for configuring the MergeContent processor.
(or) If your header is always the same:
1. With the new record-oriented processor capabilities you can ignore the header coming from Location1 and configure the ConvertRecord processor to add the header to the incoming data.
2. Using the ReplaceText processor we can add the header to the Location2 file. Refer to this link for more details regarding this method.
09-12-2018
03:18 PM
@sandra Alvarez
The ExecuteInfluxDBQuery processor was added in NiFi 1.7.0+; I think you are using a version older than NiFi 1.7. You need to update your HDF/NiFi version to get these new processors, or, if you have Hortonworks support, contact that team to get this processor added to your NiFi version.
09-12-2018
02:55 PM
@sandra Alvarez
All the Put processors in NiFi are used to store data into filesystems/databases/tables. There is an ExecuteInfluxDBQuery processor to get data from InfluxDB; refer to this and this link for more details regarding the ExecuteInfluxDBQuery processor. Then you can use the PutElasticsearch processors to store the data into Elasticsearch.
- If the answer helped resolve your issue, click the Accept button below to accept it; that helps community users find solutions for these kinds of issues quickly.
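As a rough sketch only, the Query property of ExecuteInfluxDBQuery just takes an InfluxQL statement; the measurement, field, and tag names below are assumptions for illustration, not from your setup:
-- hypothetical InfluxQL query for the ExecuteInfluxDBQuery "Query" property
-- "cpu_load", "value", and "host" are assumed names; replace them with your own schema
SELECT mean("value")
FROM "cpu_load"
WHERE time > now() - 1h
GROUP BY time(5m), "host"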
09-12-2018
01:12 PM
1 Kudo
@n c As the first column in your table has a struct type, we need to use the named_struct function while inserting the data.
Table definition:
hive> desc testtbl;
+-----------+-------------------------------------------------------+----------+--+
| col_name | data_type | comment |
+-----------+-------------------------------------------------------+----------+--+
| id | struct<tid:string,action:string,createdts:timestamp> | |
| cid | string | |
| anumber | string | |
+-----------+-------------------------------------------------------+----------+--+
Inserting data into testtbl:
hive> insert into testtbl select named_struct('tid',"1",'action',"post",'createdts',timestamp(150987427)),string("1241"),string("124") from (select '1') t;
Selecting data from the table:
hive> select * from testtbl;
+--------------------------------------------------------------------+--------------+------------------+--+
| testtbl.id | testtbl.cid | testtbl.anumber |
+--------------------------------------------------------------------+--------------+------------------+--+
| {"tid":"1","action":"post","createdts":"1970-01-02 12:56:27.427"} | 1241 | 124 |
+--------------------------------------------------------------------+--------------+------------------+--+
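Individual members of the struct can also be read back with dot notation; a quick sketch against the same table:
-- dot notation pulls each field out of the id struct column
select id.tid, id.action, id.createdts from testtbl;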
- If the answer helped resolve your issue, click the Accept button below to accept it; that helps community users find solutions for these kinds of issues quickly.
09-11-2018
03:58 AM
@Andrew Bailon By using the PutDatabaseRecord processor we don't need any of the ReplaceText, PutSQL, or ConvertJSONToSQL processors at all: PutDatabaseRecord reads the incoming data using the configured Record Reader controller service, creates the insert/update statements based on the Statement Type we have selected, and executes them against the target database.
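For intuition only: assuming an incoming JSON record with fields emp_id and emp_name, Statement Type set to INSERT, and a hypothetical employee table (all names here are assumptions), the processor ends up executing something roughly equivalent to:
-- roughly the statement PutDatabaseRecord builds from the record fields
-- (table and column names are assumed for illustration)
INSERT INTO employee (emp_id, emp_name) VALUES (1, 'Ann');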
09-08-2018
01:27 PM
@Saravanan Subramanian For this case, please look into this link for storing and fetching the state from a distributed map cache. With this approach we update the state only when the pull has succeeded; if the pull fails, we don't store the state.
09-07-2018
03:19 PM
2 Kudos
@Kamlesh Pant
In your ExecuteSQL processor, set the property Normalize Table/Column Names to true.
ExecuteSQL configs: in the query, enclose the column names in square brackets [].
- If the answer helped resolve your issue, click the Accept button below to accept it; that helps community users find solutions for these kinds of issues quickly.
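For example, in the SQL you configure on ExecuteSQL you could bracket the problematic column names and alias them to Avro-friendly names; the table and column names below are assumptions, and the bracket syntax assumes a SQL Server-style source:
-- square brackets protect names with spaces/special characters,
-- and the aliases give ExecuteSQL clean Avro field names
SELECT [Order Id]  AS order_id,
       [Ship Date] AS ship_date
FROM   [Sales Orders];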
09-07-2018
12:55 PM
@Mitthu Wagh We can use the Wait/Notify processors with a Target Signal Count of 2; the Wait processor releases the flowfile once the signal count reaches 2. Refer to this link for usage/configuration of the Wait/Notify processors. Another way is to use the MergeContent processor with a Correlation Attribute Name and Minimum Number of Entries of 2; the processor then waits for 2 flowfiles with the same attribute value, merges them, and the mail can be sent based on that. Refer to this and this link for the same use case with the MergeContent processor.
- If the answer helped resolve your issue, click the Accept button below to accept it; that helps community users find solutions for these kinds of issues quickly.
09-05-2018
10:01 PM
@Andrew Bailon We need to extract the values, add them as flowfile attributes, and in the ReplaceText processor prepare an insert statement that maps to those values, e.g. insert into employee values (${value1},${value2}..). Refer to this and this link to get familiar with how to replace the question marks and with the usage of the PutSQL processor.
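A minimal sketch of what the ReplaceText Replacement Value could look like (Replacement Strategy = Always Replace), assuming the extracted attributes are named value1 and value2 and the target table is employee:
-- the NiFi Expression Language references are resolved per flowfile before PutSQL executes the statement;
-- quote or not depending on the column types of your table
INSERT INTO employee VALUES ('${value1}', '${value2}')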
09-05-2018
02:36 AM
@Jatinmaya Choudhury
I'm not able to recreate the same scenario on my end. Here is what I tried.
JSON file:
{"user":{"userlocation":"Cinderford, Gloucestershire","id":230231618,"name":"Aimee","screenname":"Aimee_Cottle","geoenabled":true},"tweetmessage":"Gastroenteritis has pretty much killed me this week :( off work for a few days whilst I recover!","createddate":"2013-06-20T12:08:14","geolocation":null}
{"user":{"userlocation":"Garena ID : NuraBlazee","id":635239939,"name":"Axyraf.","screenname":"Asyraf_Fauzi","geoenabled":false},"tweetmessage":"RT @abhigyantweets: Can't stop a natural disaster but so many lives in U'khand wouldn't have been lost if there was disaster preparedness. É","createddate":"2013-06-20T12:08:16","geolocation":null}
{"user":{"userlocation":"Gemert,Netherlands","id":21418083,"name":"Ad van Steenbruggen","screenname":"torment00","geoenabled":true},"tweetmessage":"? Listening to 'The Constant' by 'Anthrax' from 'Worship Music","createddate":"2013-06-20T12:08:20","geolocation":null}
Hive DDL:
Create external table tweets(
`user` struct<userlocation:string,id:string,name:string,screenname:string,geoenabled:string>,tweetmessage string,createddate string,geolocation string)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe' LOCATION '/user/hive/data/';
hive> select * from tweets;
+----------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------+----------------------+--------------+--+
| user | tweetmessage | createddate | geolocation |
+----------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------+----------------------+--------------+--+
| {"userlocation":"Cinderford, Gloucestershire","id":"230231618","name":"Aimee","screenname":"Aimee_Cottle","geoenabled":"true"} | Gastroenteritis has pretty much killed me this week :( off work for a few days whilst I recover! | 2013-06-20T12:08:14 | NULL |
| {"userlocation":"Garena ID : NuraBlazee","id":"635239939","name":"Axyraf.","screenname":"Asyraf_Fauzi","geoenabled":"false"} | RT @abhigyantweets: Can't stop a natural disaster but so many lives in U'khand wouldn't have been lost if there was disaster preparedness. É | 2013-06-20T12:08:16 | NULL |
| {"userlocation":"Gemert,Netherlands","id":"21418083","name":"Ad van Steenbruggen","screenname":"torment00","geoenabled":"true"} | ? Listening to 'The Constant' by 'Anthrax' from 'Worship Music | 2013-06-20T12:08:20 | NULL |
+----------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------+----------------------+--------------+--+
Selecting a specific field:
hive> select `user`.userlocation from tweets;
+------------------------------+--+
| userlocation |
+------------------------------+--+
| Cinderford, Gloucestershire |
| Garena ID : NuraBlazee |
| Gemert,Netherlands |
+------------------------------+--+
As I'm able to get all three rows of data, try the above Hive DDL statement and check whether you are able to get the data or not.
- If the answer helped resolve your issue, click the Accept button below to accept it; that helps community users find solutions for these kinds of issues quickly.
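One extra note on the SerDe used in the DDL above: org.apache.hive.hcatalog.data.JsonSerDe ships in the hive-hcatalog-core jar, so if Hive complains that the class is not found you may need to add that jar first. The path below is a typical HDP location and is only an assumption; adjust it for your install:
-- register the hcatalog jar that provides the JSON SerDe class
ADD JAR /usr/hdp/current/hive-webhcat/share/hcatalog/hive-hcatalog-core.jar;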