About PratikGhatak

PratikGhatak · 01-27-2021

@sow I am also having the same issue. Did you find a resolution?

luodonghua · 09-03-2018

After you create multiple input ports in NiFi, linking your processor to your RPG will prompt you to choose which input port to use from the "To Input" selection list.

mburgess · 11-14-2017

I'm not familiar with the innards of either Groovy or Jython, but I am guessing that Jython is slower for the following reasons:

1) Groovy was built "for the JVM" and leverages/integrates with Java more cleanly.
2) Jython is an implementation of Python for the JVM. Looking briefly at the code, it appears to go back and forth between the Java and Python idioms, so it is more "emulated" than Groovy.
3) Apache Groovy has a large, very active community that consistently works to improve the performance of the code, both compiled and interpreted.

In my own experience, Groovy and JavaScript (Nashorn) perform much better in the scripted processors than Jython or JRuby. If you choose Jython, there are still a couple of things you can do to improve performance:

- Use InvokeScriptedProcessor (ISP) instead of ExecuteScript. ISP is faster because it only loads the script once, then invokes methods on it, rather than ExecuteScript which evaluates the script each time. I have an ISP template in Jython which should make porting your ExecuteScript code easier.
- Use ExecuteStreamCommand with command-line Python instead. You won't have the flexibility of accessing attributes, processor state, etc. but if you're just transforming content you should find ExecuteStreamCommand with Python faster.
- No matter which language you choose, you can often improve performance if you use session.get(int) instead of session.get(). That way if there are a lot of flow files in the queue, you could call session.get(1000) or something, and process up to 1000 flow files per execution. If your script has a lot of overhead, you may find handling multiple flow files per execution can significantly improve performance.
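As a rough illustration of the session.get(int) tip, here is a minimal sketch of batched handling. The names process_batch and FakeSession, and the "batch.processed" attribute, are hypothetical and only there to make the snippet runnable outside NiFi. FakeSession is an in-memory stand-in that mimics just the three ProcessSession calls used here (get, putAttribute, transfer), not the real class.

```python
# Sketch only: batched flow-file handling in the style of an ExecuteScript body.
# process_batch and FakeSession are illustrative names, not part of NiFi.

BATCH_SIZE = 1000  # value borrowed from the post's session.get(1000) example


def process_batch(session, rel_success, batch_size=BATCH_SIZE):
    """Pull up to batch_size flow files in one call and route each to success."""
    flow_files = session.get(batch_size)  # returns a (possibly empty) list
    count = 0
    for flow_file in flow_files:
        # Per-file work goes here; setting an attribute is just a placeholder.
        flow_file = session.putAttribute(flow_file, "batch.processed", "true")
        session.transfer(flow_file, rel_success)
        count += 1
    return count


class FakeSession(object):
    """Hypothetical in-memory stand-in for a NiFi session (for local testing)."""

    def __init__(self, queued):
        self.queue = list(queued)
        self.transferred = []

    def get(self, max_results):
        # Hand back at most max_results items and keep the rest queued.
        batch = self.queue[:max_results]
        self.queue = self.queue[max_results:]
        return batch

    def putAttribute(self, flow_file, key, value):
        # Return an updated copy rather than mutating the original.
        updated = dict(flow_file)
        updated[key] = value
        return updated

    def transfer(self, flow_file, relationship):
        self.transferred.append((flow_file, relationship))
```

Inside ExecuteScript, the idea would be to call process_batch(session, REL_SUCCESS) with the variables the processor binds for the script. The per-execution overhead (imports, setup) is then paid once per batch rather than once per flow file.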

PratikGhatak · 10-23-2017

Yes, I have connected to Hive using DBCPConnectionPool. But is there any way to replace those jars with lower-version ones?

rishisaxena2018 · 12-17-2018

Hi @Pratik Ghatak @Rudolf Schimmel @Timothy Spann

I am getting a similar error while connecting to Hive using DBCPConnectionPool in the ExecuteSQL processor in NiFi 1.7.1.

Hive version: Hive 1.1.0-cdh5.12.0

My DBCP Connection Pool:

DB Connection URL: jdbc:hive2://MyServer:10000/default;AuthMech=GSSAPI;KrbRealm=HADOOP.TEST.COMPANY.COM;KrbHostFQDN=_HOST;KrbServiceName=hive
DB Driver Location: /opt/flow/nifi/lib/hive-jdbc-1.1.0-cdh5.12.0.jar, /opt/flow/nifi/lib/hive-service-1.1.0-cdh5.12.0.jar, /opt/flow/nifi/lib/libthrift-0.9.3.jar
DB Driver Class: org.apache.hive.jdbc.Hive Driver
Database User: No value set
Password: No value set

Further, I have added the property below to hive-site.xml:

<property>
  <name>hive.server2.transport.mode</name>
  <value>binary</value>
</property>

If possible, could you please share your working DBCPConnectionPool settings?

I am able to connect to Hive from Python using the pyhs2 and impala.dbapi modules:

pyhs2.connect(host='myserver',port=10000,authMechanism='KERBEROS')
conn=connect(host='myserver', port=10000, kerberos_service_name='hive', auth_mechanism='GSSAPI')

Kindly advise. Thanks so much.

Last Visited	‎01-27-2021 05:17 AM

Member Since	‎10-06-2017 09:29 AM
Posts	12

Cloudera Community

Re: sqoop import of BLOB columns from oracle datab...

Re: Getting error: Does not have any input ports f...

Re: Performance of Python Script in NiFi is slower...

Re: How to replace jar from a nar file in NiFI?

Re: NiFi Error: failed to process session due to j...