About eberezitsky

eberezitsky · ‎06-22-2017

Hi Matt, thanks for your response. The solution from your link doesn't work for Jython in NiFi (it actually caused a lot of issues, and I had to restart the NiFi services). But it gave me some ideas on what I can do and how. Once I complete all the tests, I'll post an answer with recommendations for others. As for a permanent solution, I think the performance impact would be too big if the processor checked whether the file has changed every time it is triggered by an incoming flow file. Instead, the check could be done on "Start" only. But this still won't resolve the issue of classes (modules) that have the same name but are deployed under different locations (paths), which makes environment sharing or versioning impossible (dev and qa, for example, or different builds/versions during dev stages). I would suggest having a custom class loader (with its modules defined) at the processor level instead of a global one.
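To illustrate the "check whether the file has changed" idea, here is a minimal sketch in plain Python. It is not a NiFi API and has not been verified inside NiFi's Jython engine; `load_if_changed` and its arguments are hypothetical names. It compares the source file's modification time with the one seen at the last load and reloads the module only when they differ, so it could be called once on start rather than per flow file. It does not address same-named modules deployed under different paths.

```python
import importlib
import os
import sys

try:
    from importlib import reload  # Python 3
except ImportError:
    pass  # Python 2 / Jython 2.7: reload() is a builtin

# Hypothetical bookkeeping: module name -> mtime seen at the last (re)load.
_last_mtime = {}


def load_if_changed(module_name, source_path):
    """Import module_name; reload it only if source_path changed on disk."""
    mtime = os.path.getmtime(source_path)
    module = sys.modules.get(module_name)
    if module is None:
        # First use: a normal import (the module must be on sys.path).
        module = importlib.import_module(module_name)
    elif _last_mtime.get(module_name) != mtime:
        # Already imported, but the file is newer than what we last saw.
        module = reload(module)
    _last_mtime[module_name] = mtime
    return module
```

Calling the function a second time without touching the file returns the cached module, so the only recurring cost is one `os.path.getmtime` call.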

eberezitsky · ‎06-20-2017

Hi All, we use the ExecuteScript processor to run some Python code. The Python file used in the processor is just a wrapper, which invokes the actual Python process. My problem is that when I change a file with Python code on the file system, it is not reloaded in NiFi flows until I fully restart the NiFi cluster. I understand this happens because the classes are loaded into the JVM's classloader after first use (since it is actually Jython). Question: is there a workaround to reload classes after the Python code is changed on the file system, instead of restarting the NiFi cluster? Thanks!

eberezitsky · ‎06-08-2017

Thanks for your comment. Yup, agreed. Gave my answer as an alternative for @Timo Burmeister

eberezitsky · ‎06-08-2017

@Matt Burgess, I love the JOLT transformation solution; we use it a lot with dynamic JSON, transposing, etc. But in this case I would keep it simple and just replace text. Let me know if I missed something (check my alternative).

eberezitsky · ‎06-08-2017

@Timo Burmeister, keep it simple: use the ReplaceText processor with this configuration:

Search Value: [{]
Replacement Value: { "ctime":"${now()}",
Replacement Strategy: Regex Replace
Evaluation Mode: Entire text
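To see what this search-and-replace does to the flow file content, here is a small Python sketch that mimics it outside NiFi. The function name `add_ctime` and the sample input are hypothetical, and the timestamp is passed in rather than produced by `${now()}`. Note that `re.sub` replaces every `{` it finds, so the sketch assumes a flat, non-empty JSON object with a single opening brace.

```python
import json
import re


def add_ctime(json_text, timestamp):
    """Insert a "ctime" key right after each opening brace."""
    replacement = '{ "ctime":"%s",' % timestamp
    # A function is used as the replacement so the text is taken literally.
    return re.sub(r'[{]', lambda match: replacement, json_text)


if __name__ == "__main__":
    result = add_ctime('{"name": "sensor-1", "value": 42}', "2017-06-08 10:00:00")
    print(result)
    print(json.loads(result)["ctime"])
```

For nested JSON, or an empty object `{}`, this simple replacement would add the key in more than one place or leave a trailing comma, which is where a JOLT transformation is the safer choice.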

eberezitsky · ‎05-25-2017

@Dhanya Kumar Heballi Shivamurthy, please accept the answer to close the thread.

eberezitsky · ‎05-23-2017

eberezitsky · ‎04-21-2017

@Mushtaq Rizvi, if that worked for you, please accept the answer so your question will be marked as resolved. Thanks.

eberezitsky · ‎04-20-2017

@Mushtaq Rizvi, you can define a tab-delimited table. That already gives you 7 columns without any regex. Then, to extract particular parts from each product column, use "(\d+):([^-\t]+)-(\d+)" as the regex (except for the first column, which would be just "(\d+):([^-\t]+)"), or use split instead of regex.

insert overwrite table recommendation
SELECT
  regexp_extract(col_p, '(\\d+):', 1) productId,
  regexp_extract(col_p, ':(.+)', 1) productName,
  regexp_extract(col_p1, '(\\d+):', 1) productId1,
  regexp_extract(col_p1, ':([^-]+)', 1) productName1,
  regexp_extract(col_p1, '-(.+)', 1) productCount1,
  regexp_extract(col_p2, '(\\d+):', 1) productId2,
  regexp_extract(col_p2, ':([^-]+)', 1) productName2,
  regexp_extract(col_p2, '-(.+)', 1) productCount2,
  regexp_extract(col_p3, '(\\d+):', 1) productId3,
  regexp_extract(col_p3, ':([^-]+)', 1) productName3,
  regexp_extract(col_p3, '-(.+)', 1) productCount3,
  regexp_extract(col_p4, '(\\d+):', 1) productId4,
  regexp_extract(col_p4, ':([^-]+)', 1) productName4,
  regexp_extract(col_p4, '-(.+)', 1) productCount4,
  regexp_extract(col_p5, '(\\d+):', 1) productId5,
  regexp_extract(col_p5, ':([^-]+)', 1) productName5,
  regexp_extract(col_p5, '-(.+)', 1) productCount5,
  regexp_extract(col_p6, '(\\d+):', 1) productId6,
  regexp_extract(col_p6, ':([^-]+)', 1) productName6,
  regexp_extract(col_p6, '-(.+)', 1) productCount6
from temp_recommendation;
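The pattern can be tried outside Hive first. The Python sketch below applies the same regex to one product column and also shows the split-based alternative; the function names and the sample value `12:Widget-3` are hypothetical, chosen only to match the `id:name-count` shape the regex implies.

```python
import re

# Same pattern as in the answer: id, then ':', name, then '-', count.
PRODUCT_RE = re.compile(r'(\d+):([^-\t]+)-(\d+)')


def parse_with_regex(column_value):
    """Return (id, name, count) from 'id:name-count', or None if no match."""
    match = PRODUCT_RE.search(column_value)
    return match.groups() if match else None


def parse_with_split(column_value):
    """Same result using split instead of a regex."""
    product_id, rest = column_value.split(':', 1)
    name, count = rest.split('-', 1)
    return (product_id, name, count)


if __name__ == "__main__":
    print(parse_with_regex("12:Widget-3"))
    print(parse_with_split("12:Widget-3"))
```

Both helpers assume the product name itself contains no hyphen, which is the same assumption the `[^-]+` part of the regex makes.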

eberezitsky · ‎04-19-2017

@Anusha Akula, let me start with #2. The number of mappers is determined by the number of input splits. Since you have 21 partitions, you have at least 21 files, which determines the number of mappers you get (actually, you have exactly 21 files: one under each partition).

Regarding #1: the regex you are using is a bit complicated. Try to reduce its complexity:

SELECT REGEXP_EXTRACT((get_json_object(line, '$.result._raw')),'.*X_Requested_With=(\\w.+?)\\s',1) as X_INF_RequestID FROM tablename;
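The simplified pattern can be checked quickly in Python before running it in Hive. In this sketch the sample line and the function name `extract_request_id` are hypothetical; only the regex comes from the query above (with single backslashes, since Python raw strings do not need the extra escaping).

```python
import re

# Same pattern as in the Hive query, written as a Python raw string.
REQUEST_RE = re.compile(r'.*X_Requested_With=(\w.+?)\s')


def extract_request_id(raw_line):
    """Return the value after 'X_Requested_With=' up to the next whitespace."""
    match = REQUEST_RE.search(raw_line)
    # Returning None on no match is a choice made for this sketch only.
    return match.group(1) if match else None


if __name__ == "__main__":
    sample = "ts=1 X_Requested_With=abc123 status=200"
    print(extract_request_id(sample))
```

Because the capture group is `\w.+?`, the value must be at least two characters long and must be followed by whitespace for the pattern to match.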

Last Visited	‎02-06-2019 04:58 PM

Member Since	‎11-07-2016 04:38 PM
Posts	70
Kudos received	40

Cloudera Community

Re: Why doesn't ListSFTP allow upstream connection...

Re: Hive in built function greatest is not workin...

Re: skip directories using nifi getftp processor

Re: Updating hive table with sqoop from mysql tabl...

Re: NiFi Execute Script - Reload Classes

Re: NiFi Execute Script - Reload Classes

NiFi Execute Script - Reload Classes

Re: Nifi - How to add key:value to json

Re: Nifi - How to add key:value to json

Re: Nifi - How to add key:value to json

Re: Hive transpose concatenated data in columns to...

Re: Hive transpose concatenated data in columns to...

Re: Getting exception while inserting the file hav...

Re: Getting exception while inserting the file hav...

Re: HIVE Regex extract taking time