Member since: 11-07-2016
Posts: 70
Kudos Received: 40
Solutions: 16
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 4101 | 02-22-2018 09:20 PM |
| | 7047 | 01-30-2018 03:41 PM |
| | 1281 | 10-25-2017 06:09 PM |
| | 10949 | 08-15-2017 10:54 PM |
| | 3423 | 06-26-2017 05:05 PM |
06-22-2017
07:48 PM
Hi Matt, thanks for your response. The solution at your link won't work for Jython in NiFi (in fact, it created a lot of issues and I had to restart the NiFi services). But it gave me some ideas about what I can do and how. Once I complete all the tests, I'll post an answer with recommendations for others. As for a permanent solution, I think the performance impact of checking whether the file has changed every time the processor is triggered by an incoming flow file would be too big. Instead, it could be done on "Start" only. But this still won't resolve issues with classes (modules) that have the same name but are deployed under different locations (paths), which makes environment sharing or versioning impossible (dev and qa, for example, or different builds/versions during dev stages). I would suggest having a custom class loader (with modules defined) at the processor level instead of a global one.
06-20-2017
05:30 PM
Hi All, We use the ExecuteScript processor to run some Python code. The Python file used in the processor is just a wrapper, which invokes the actual Python logic. My problem is that when I change a file with Python code on the file system, it is not reloaded in NiFi flows until I fully restart the NiFi cluster. I understand that this happens because the classes are loaded into the JVM's classloader after first use (since it is actually Jython). Question: is there a workaround to reload classes after the Python code is changed on the file system, instead of restarting the NiFi cluster? Thanks!
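One workaround discussed in threads like this is to drop the module from the interpreter's cache before importing it again, so the next call re-reads the source from disk. A minimal sketch (the module name `my_module` and the `process` entry point are hypothetical, not from the original post):

```python
# Sketch of a reload-on-demand workaround for script engines that
# cache imported modules (e.g. Jython inside NiFi's ExecuteScript).
# "my_module" is a hypothetical module whose .py file may change on
# disk while the host process keeps running.
import sys

def load_fresh(module_name):
    """Drop any cached copy of the module and import it again."""
    if module_name in sys.modules:
        del sys.modules[module_name]   # forget the cached module object
    return __import__(module_name)     # re-read the source from disk

# mod = load_fresh('my_module')
# mod.process(flowfile)               # hypothetical entry point
```

Note this only refreshes modules the wrapper imports by name; it does not solve the same-class-name-under-different-paths problem raised later in the thread.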
Labels:
- Apache NiFi
06-08-2017
07:56 PM
Thanks for your comment. Yup, agreed. I gave my answer as an alternative for @Timo Burmeister.
06-08-2017
07:39 PM
@Matt Burgess, Love the JOLT transformation solution; we use it a lot with dynamic JSONs, transposes, etc. But in this case I would go simple and just replace text. Let me know if I missed something (check my alternative).
06-08-2017
07:35 PM
1 Kudo
@Timo Burmeister, Keep it simple: use the ReplaceText processor with this configuration:
Search Value: [{]
Replacement Value: { "ctime":"${now()}",
Replacement Strategy: Regex Replace
Evaluation Mode: Entire text
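The effect of this configuration can be checked outside NiFi. A Python sketch using `re.sub` to stand in for ReplaceText (the fixed timestamp stands in for NiFi's `${now()}` expression; `count=1` is an assumption that keeps nested objects untouched, whereas NiFi's "Entire text" mode would rewrite every `{`):

```python
import re

def add_ctime(json_text, stamp):
    """Inject a "ctime" field right after the opening brace,
    mimicking ReplaceText with Search Value [{] and
    Replacement Value { "ctime":"...",  on a flat JSON object."""
    return re.sub(r'[{]', '{ "ctime":"%s",' % stamp, json_text, count=1)

# add_ctime('{"name":"x"}', '2017-06-08T19:35:00Z')
# → '{ "ctime":"2017-06-08T19:35:00Z","name":"x"}'
```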
05-25-2017
01:27 PM
@Dhanya Kumar Heballi Shivamurthy, please accept the answer to close the thread.
05-23-2017
05:29 PM
1 Kudo
@Dhanya Kumar Heballi Shivamurthy, assuming both arrays have the same size (per record): select
c1,
c21,
c31,
c4
from (
select 100 c1, split('Delta|Alpha|Beta','\\|') c2, split('Source|Varied|Volume','\\|') c3, 'AppData' c4
) foo
LATERAL VIEW posexplode(c2) n1 as c22, c21
LATERAL VIEW posexplode(c3) n2 as c32, c31
where c22=c32;
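Outside Hive, the equal-size case is just a positional zip of the two arrays; a small Python sketch of what the double posexplode joined on matching positions (c22 = c32) produces (function name is illustrative):

```python
def explode_pair(c1, c2, c3, c4):
    """Pair two arrays element-by-element, emitting one row per
    position, like LATERAL VIEW posexplode on both arrays joined
    on equal positions."""
    return [(c1, a, b, c4) for a, b in zip(c2, c3)]

rows = explode_pair(100,
                    'Delta|Alpha|Beta'.split('|'),
                    'Source|Varied|Volume'.split('|'),
                    'AppData')
# → [(100, 'Delta', 'Source', 'AppData'),
#    (100, 'Alpha', 'Varied', 'AppData'),
#    (100, 'Beta', 'Volume', 'AppData')]
```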
If the array lengths can differ, then you need to add more conditions: select c1, c222 c2, c333 c3, c4
from (
select
c1,
c22, c32, -- keep indices
case when c32 < size(c2) then c21 else null end c222,
case when c22 < size(c3) then c31 else null end c333,
c4
from (
select 100 c1, split('Delta|Alpha|Beta','\\|') c2, split('Source|Varied|Volume|Owner','\\|') c3, 'AppData' c4
) foo
LATERAL VIEW posexplode(c2) n1 as c22, c21
LATERAL VIEW posexplode(c3) n2 as c32, c31
) bar
where c22=c32 or (c222 is null and c22=0) or (c333 is null and c32=0);
Result: +------+--------+---------+----------+--+
| c1 | c2 | c3 | c4 |
+------+--------+---------+----------+--+
| 100 | Delta | Source | AppData |
| 100 | Alpha | Varied | AppData |
| 100 | Beta | Volume | AppData |
| 100 | NULL | Owner | AppData |
+------+--------+---------+----------+--+
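The unequal-length behavior amounts to pairing by position and padding the shorter array with NULL. A Python sketch of the same result, with `itertools.zip_longest` standing in for the CASE/size() conditions in the query (function name is illustrative):

```python
from itertools import zip_longest

def explode_pair_outer(c1, c2, c3, c4):
    """Pair arrays by position, padding the shorter one with None,
    like the posexplode query with the extra NULL-handling
    conditions in its WHERE clause."""
    return [(c1, a, b, c4) for a, b in zip_longest(c2, c3)]

rows = explode_pair_outer(100,
                          'Delta|Alpha|Beta'.split('|'),
                          'Source|Varied|Volume|Owner'.split('|'),
                          'AppData')
# last row: (100, None, 'Owner', 'AppData')
```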
04-21-2017
08:15 PM
@Mushtaq Rizvi, if that worked for you, please accept the answer so your question is marked as resolved. Thanks.
04-20-2017
12:59 PM
1 Kudo
@Mushtaq Rizvi, you can define a tab-delimited table. That already gives you 7 columns, without any regex for them. Then, to extract the individual parts from each product column, use "(\d+):([^-\t]+)-(\d+)" as the regex (except for the first column, which would be just "(\d+):([^-\t]+)"). You could also use split instead of regex. Note the doubled backslashes below: inside Hive string literals, \d must be written as \\d.
insert overwrite table recommendation
SELECT
regexp_extract(col_p, '(\\d+):', 1) productId,
regexp_extract(col_p, ':(.+)', 1) productName,
regexp_extract(col_p1, '(\\d+):', 1) productId1,
regexp_extract(col_p1, ':([^-]+)', 1) productName1,
regexp_extract(col_p1, '-(.+)', 1) productCount1,
regexp_extract(col_p2, '(\\d+):', 1) productId2,
regexp_extract(col_p2, ':([^-]+)', 1) productName2,
regexp_extract(col_p2, '-(.+)', 1) productCount2,
regexp_extract(col_p3, '(\\d+):', 1) productId3,
regexp_extract(col_p3, ':([^-]+)', 1) productName3,
regexp_extract(col_p3, '-(.+)', 1) productCount3,
regexp_extract(col_p4, '(\\d+):', 1) productId4,
regexp_extract(col_p4, ':([^-]+)', 1) productName4,
regexp_extract(col_p4, '-(.+)', 1) productCount4,
regexp_extract(col_p5, '(\\d+):', 1) productId5,
regexp_extract(col_p5, ':([^-]+)', 1) productName5,
regexp_extract(col_p5, '-(.+)', 1) productCount5,
regexp_extract(col_p6, '(\\d+):', 1) productId6,
regexp_extract(col_p6, ':([^-]+)', 1) productName6,
regexp_extract(col_p6, '-(.+)', 1) productCount6
from temp_recommendation;
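The three extraction patterns can be sanity-checked with Python's `re` module standing in for Hive's regexp_extract (the function name and the sample "id:name-count" values are illustrative, not from the original table):

```python
import re

def parse_product(col):
    """Split an "id:name-count" cell the way the three
    regexp_extract calls do; the count part is optional,
    as with the first product column."""
    pid   = re.search(r'(\d+):', col)
    name  = re.search(r':([^-]+)', col)
    count = re.search(r'-(.+)', col)
    return (pid.group(1) if pid else None,
            name.group(1) if name else None,
            count.group(1) if count else None)

# parse_product('42:widget-7') → ('42', 'widget', '7')
# parse_product('42:widget')   → ('42', 'widget', None)
```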
04-19-2017
07:22 PM
1 Kudo
@Anusha Akula, Let me start with #2. The number of mappers is determined by the number of input splits. Since you have 21 partitions, you have at least 21 files, which determines the number of mappers you get (actually, you have exactly 21 files: one under each partition). Regarding #1: the regex you are using is a bit complicated. Try reducing its complexity: SELECT REGEXP_EXTRACT((get_json_object(line, '$.result._raw')),'.*X_Requested_With=(\\w.+?)\\s',1) as X_INF_RequestID FROM tablename;
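The simplified pattern can be verified in Python, with `re` standing in for Hive's REGEXP_EXTRACT (single backslashes here, since there is no SQL string-literal escaping; the raw log line is a hypothetical example, not data from the original question):

```python
import re

# Hypothetical raw log line; in the query, the input comes from
# get_json_object(line, '$.result._raw').
raw = 'ts=123 X_Requested_With=XMLHttpRequest status=200'

m = re.search(r'.*X_Requested_With=(\w.+?)\s', raw)
x_inf_request_id = m.group(1) if m else None
# → 'XMLHttpRequest'
```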