Member since: 07-14-2017
Posts: 99
Kudos Received: 5
Solutions: 4
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 1420 | 09-05-2018 09:58 AM
 | 1904 | 07-31-2018 12:59 PM
 | 1406 | 01-15-2018 12:07 PM
 | 1312 | 11-23-2017 04:19 PM

09-17-2018 01:32 PM
Hi All,

I have a use case where I want to count the occurrences of a word and perform an action based on that count. Example:

1. Multiple flow files come in.
2. I want to extract a word (say, user_name) using the ExtractText processor.
3. Count the word.
4. If user_name_count = 10:
5. Use ReplaceText to replace 10 with 1.
6. Use PutEmail to notify user_name that the user_name count is 10.

Can you please let me know which processors would be helpful for this use case? Suggestions are appreciated.
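
The counting logic in the steps above can be sketched outside NiFi as plain Python. This is only an illustration of the intended behaviour, not a NiFi configuration: the function name `process`, the notification text, and the reading of "replace 10 as 1" as "reset the counter to 1" are all assumptions.

```python
from collections import Counter

THRESHOLD = 10  # from the post: act when the count reaches 10


def process(user_names, threshold=THRESHOLD):
    """Count each extracted user_name and record a notification at the threshold.

    Assumption: "replace 10 as 1" means the counter is reset to 1
    once it reaches the threshold.
    """
    counts = Counter()
    notifications = []
    for name in user_names:
        counts[name] += 1
        if counts[name] == threshold:
            # Stand-in for the PutEmail step (hypothetical message text).
            notifications.append(f"{name} count is {threshold}")
            # Stand-in for the ReplaceText step.
            counts[name] = 1
    return counts, notifications


if __name__ == "__main__":
    names = ["alice"] * 10 + ["bob"] * 3
    counts, notifications = process(names)
    print(dict(counts))
    print(notifications)
```

The point of the sketch is that the count has to be kept somewhere across flow files; whichever processors are chosen, they need access to shared state of this kind.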
Labels: Apache NiFi

09-12-2018 01:57 PM
@rtheron For some reason I cannot follow the first approach. I tried creating an intermediate ORC table with partitions and loaded the data into it from the external table. Now, when I load into the destination table from the intermediate table, PutHiveQL takes a lot of time. Any suggestions are appreciated.

09-07-2018 01:35 PM
Hi All,

A 10 GB file arrives at a location (/dir) every minute, and there is an external table defined on that location. A sample of the file:

karlon,n_d_1,26,6234,2019-09-08,1536278400
d'lov,research,20,1001,2019-09-08,1536278400
kris'a,b_x_3,20,4532,2019-09-08,1536278400

External table name: ex_t

name | department | age | id | date | time
---|---|---|---|---|---
karlon | n_d_1 | 26 | 6234 | 2019-09-08 | 1536278400
d'lov | research | 20 | 1001 | 2018-09-08 | 1536278400

My flow has a PutHiveQL processor that reads from the external table and inserts into multiple ORC tables: table_1, table_2, table_3, table_4, table_5, table_6. Every ORC table has the same columns: name (string), department (string), age (int), id (int), date (string), partition_value (int).

The flow file sent to PutHiveQL contains multiple insert queries:

INSERT INTO table_1 PARTITION(partition_value) SELECT name, department, age, id, date, cast(regexp_replace(date,'-','') as int) AS partition_value FROM ex_t WHERE department = 'research' AND time='1536278400';
INSERT INTO table_2 PARTITION(partition_value) SELECT name, department, age, id, date, cast(regexp_replace(date,'-','') as int) AS partition_value FROM ex_t WHERE department = 'n_d_1' AND time='1536278400';
INSERT INTO table_3 PARTITION(partition_value) SELECT name, department, age, id, date, cast(regexp_replace(date,'-','') as int) AS partition_value FROM ex_t WHERE department = 'b_x_3' AND time='1536278400';
INSERT INTO table_4 PARTITION(partition_value) SELECT name, department, age, id, date, cast(regexp_replace(date,'-','') as int) AS partition_value FROM ex_t WHERE department = 'research' AND time='1536278400';
INSERT INTO table_5 PARTITION(partition_value) SELECT name, department, age, id, date, cast(regexp_replace(date,'-','') as int) AS partition_value FROM ex_t WHERE department = 'x_in_1' AND time='1536278400';
INSERT INTO table_6 PARTITION(partition_value) SELECT name, department, age, id, date, cast(regexp_replace(date,'-','') as int) AS partition_value FROM ex_t WHERE department = 'z_e_3' AND time='1536278400';

The statements above are sent as a flow file to PutHiveQL, which is scheduled to run every minute because a file arrives every minute. PutHiveQL is very slow to process them, and the inserts do not happen frequently. Can you please suggest how to improve the performance of PutHiveQL? I have increased the number of concurrent tasks, but it did not help; sometimes the flow files (which contain the insert statements) get queued and never execute. Suggestions are highly appreciated.
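
The six statements differ only in the target table and the department filter, and each one scans ex_t separately. The Python sketch below makes that structure explicit by generating the same statements from a mapping, and shows what the partition expression computes. The names `TABLE_DEPARTMENTS`, `build_statements`, and `partition_value` are hypothetical helpers for illustration; the SQL text itself is copied from the post.

```python
import re

# Target table -> department filter, copied from the six statements in the post.
TABLE_DEPARTMENTS = {
    "table_1": "research",
    "table_2": "n_d_1",
    "table_3": "b_x_3",
    "table_4": "research",
    "table_5": "x_in_1",
    "table_6": "z_e_3",
}

TEMPLATE = (
    "INSERT INTO {table} PARTITION(partition_value) "
    "SELECT name, department, age, id, date, "
    "cast(regexp_replace(date,'-','') as int) AS partition_value "
    "FROM ex_t WHERE department = '{department}' AND time='{time}';"
)


def build_statements(time_value):
    """Return one INSERT per target table for the given time value."""
    return [
        TEMPLATE.format(table=table, department=department, time=time_value)
        for table, department in TABLE_DEPARTMENTS.items()
    ]


def partition_value(date_text):
    """Python equivalent of cast(regexp_replace(date,'-','') as int)."""
    return int(re.sub("-", "", date_text))


if __name__ == "__main__":
    for statement in build_statements("1536278400"):
        print(statement)
    print(partition_value("2019-09-08"))
```

Seen this way, every minute's flow file triggers six independent reads of the same external data, which is one place to look when investigating the slowness described above.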
Labels: Apache Hive, Apache NiFi

09-06-2018 03:35 AM
@Bryan Bende I have checked my data; it has no extra blank spaces, but it was arriving in batches. I am merging the files and appending them using PutHDFS. When I use the configuration you suggested, I sometimes get a blank line at the beginning of the file that is appended using PutHDFS. Can you please help me avoid the blank line at the beginning of the file? Note that the file is big (1 GB).

09-05-2018 09:58 AM
I found an alternate way of doing it. Thank you.

09-05-2018 09:42 AM
Hi All,

I have JSON data in multiple small files (sometimes only one line per file). I want to merge all the small files into a single large file, but the merged file comes out in an unexpected format. Example:

file 1:
{"code"="1", "color"="green"}
{"code"="2", "color"="blue"}
{"code"="3", "color"="orange"}

file 2:
{"code"="4", "color"="yellow"}
{"code"="5", "color"="red"}

Output I am getting after using MergeContent:
{"code"="1", "color"="green"}
{"code"="2", "color"="blue"}
{"code"="3", "color"="orange"}{"code"="4", "color"="yellow"}
{"code"="5", "color"="red"}

Expected output:
{"code"="1", "color"="green"}
{"code"="2", "color"="blue"}
{"code"="3", "color"="orange"}
{"code"="4", "color"="yellow"}
{"code"="5", "color"="red"}
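
The fused line in the actual output is what plain concatenation produces when a file does not end with a newline. The Python sketch below reproduces the symptom with the sample data and shows that joining the pieces with a newline separator gives the expected result. It is a model of the behaviour, not NiFi code; `merge_raw` and `merge_with_demarcator` are hypothetical names, and the assumption is that file 1 has no trailing newline.

```python
# Sample data copied from the post; assumption: neither file ends with "\n".
FILE_1 = (
    '{"code"="1", "color"="green"}\n'
    '{"code"="2", "color"="blue"}\n'
    '{"code"="3", "color"="orange"}'
)
FILE_2 = (
    '{"code"="4", "color"="yellow"}\n'
    '{"code"="5", "color"="red"}'
)


def merge_raw(fragments):
    """Concatenate fragments back to back, with nothing in between."""
    return "".join(fragments)


def merge_with_demarcator(fragments, demarcator="\n"):
    """Join fragments with a separator placed only between them.

    Trailing newlines are stripped first so that a fragment which
    already ends with one does not produce a blank line.
    """
    return demarcator.join(fragment.rstrip("\n") for fragment in fragments)


if __name__ == "__main__":
    print(merge_raw([FILE_1, FILE_2]))
    print("---")
    print(merge_with_demarcator([FILE_1, FILE_2]))
```

In MergeContent the analogous setting is a newline demarcator between merged flow files; whether that alone is enough depends on how the incoming files are terminated.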
Labels: Apache NiFi

08-31-2018 04:18 AM
It looks like the JPGs are not aligned as expected, but their names are listed below in order. Thank you.

08-31-2018 04:14 AM
Hi,

I am receiving a plain JSON stream with a '\n' delimiter over TCP. I am listening with ListenTCP, with the batch size set to 10000. The JSON records have varying fields, for example:

{"a":"20180831","b":"b"}
{"a":"20180831","b":"b","c":"c"}

I want to add a partition_value field to every line in the JSON stream at once. The field a is always present in the JSON, so I want to use its value as partition_value. The result should look like:

{"a":"20180831","b":"b","partition_value":"20180831"}
{"a":"20180831","b":"b","c":"c","partition_value":"20180831"}

I used the UpdateRecord processor with the following configuration:

UpdateRecord -> JsonTreeReader -> AvroSchemaRegistry
UpdateRecord -> AvroRecordSetWriter

Then I used ConvertAvroToJSON. I am getting only one line as output:

{"a":"20180831","b":"b","c":null,"partition_value":"20180831"}

Can you please suggest where it is going wrong, or let me know if there is a better way to do it? Thank you.
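
The desired transformation can be stated precisely as a small Python sketch over newline-delimited JSON. It only pins down the expected input and output; it is not a NiFi configuration, and `add_partition_value` is a hypothetical name. It assumes every line is a JSON object that contains the key a.

```python
import json


def add_partition_value(stream_text):
    """Add partition_value (copied from field "a") to every JSON line.

    Each non-empty line is parsed on its own, so records with different
    sets of fields keep exactly the fields they arrived with.
    """
    output_lines = []
    for line in stream_text.splitlines():
        if not line.strip():
            continue  # skip blank lines between records
        record = json.loads(line)
        record["partition_value"] = record["a"]
        # Compact separators match the formatting used in the post.
        output_lines.append(json.dumps(record, separators=(",", ":")))
    return "\n".join(output_lines)


if __name__ == "__main__":
    sample = (
        '{"a":"20180831","b":"b"}\n'
        '{"a":"20180831","b":"b","c":"c"}'
    )
    print(add_partition_value(sample))
```

Unlike a fixed Avro schema, this per-line approach does not introduce a null c into records that never had that field.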
Labels: Apache NiFi

08-14-2018 03:17 PM
@Felix Albani Thanks for the helping hand. I will go through them and may ask for suggestions if required. Thank you.

08-14-2018 08:46 AM
@Felix Albani, can you please suggest?