Member since
10-06-2017
12
Posts
0
Kudos Received
0
Solutions
01-27-2021
01:54 AM
@sow I am having the same issue. Did you find a resolution?
09-03-2018
01:57 PM
After you create multiple input ports in NiFi and link your processor to your RPG, NiFi will prompt you to choose which input port to use, via the "To Input" selection list.
11-14-2017
02:06 PM
1 Kudo
I'm not familiar with the innards of either Groovy or Jython, but I am guessing that Jython is slower for the following reasons:

1) Groovy was built "for the JVM" and leverages/integrates with Java more cleanly.
2) Jython is an implementation of Python for the JVM. Looking briefly at the code, it appears to go back and forth between the Java and Python idioms, so it is more "emulated" than Groovy.
3) Apache Groovy has a large, very active community that consistently works to improve the performance of the code, both compiled and interpreted.

In my own experience, Groovy and JavaScript (Nashorn) perform much better in the scripted processors than Jython or JRuby. If you choose Jython, there are still a couple of things you can do to improve performance:

- Use InvokeScriptedProcessor (ISP) instead of ExecuteScript. ISP is faster because it only loads the script once, then invokes methods on it, rather than ExecuteScript which evaluates the script each time. I have an ISP template in Jython which should make porting your ExecuteScript code easier.
- Use ExecuteStreamCommand with command-line Python instead. You won't have the flexibility of accessing attributes, processor state, etc., but if you're just transforming content you should find ExecuteStreamCommand with Python faster.
- No matter which language you choose, you can often improve performance if you use session.get(int) instead of session.get(). That way, if there are a lot of flow files in the queue, you could call session.get(1000) or something, and process up to 1000 flow files per execution. If your script has a lot of overhead, you may find handling multiple flow files per execution can significantly improve performance.
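To illustrate the session.get(int) suggestion, here is a minimal sketch of the batching pattern. It assumes the script only needs to route each flow file onward. FakeSession and process_batch are hypothetical names made up for this example; the stub only imitates the get(int) and transfer calls so the snippet can run outside NiFi, and it is not part of the NiFi API.

```python
# Sketch of handling many flow files per execution.
# FakeSession is a hypothetical stand-in so this runs outside NiFi.

class FakeSession(object):
    """Hypothetical stub imitating the two ProcessSession calls used below."""

    def __init__(self, queued):
        self.queue = list(queued)
        self.transferred = []

    def get(self, max_results):
        # Like session.get(int): hand back up to max_results flow files.
        batch = self.queue[:max_results]
        self.queue = self.queue[max_results:]
        return batch

    def transfer(self, flow_file, relationship):
        self.transferred.append((flow_file, relationship))


def process_batch(session, rel_success, batch_size=1000):
    """Pull up to batch_size flow files and route each one to rel_success."""
    flow_files = session.get(batch_size)
    for flow_file in flow_files:
        # Per-flow-file work would go here.
        session.transfer(flow_file, rel_success)
    return len(flow_files)


# Simulate a queue of 2500 flow files drained over four executions.
session = FakeSession("flowfile-%d" % i for i in range(2500))
counts = [process_batch(session, "success") for _ in range(4)]
print(counts)
```

Inside an actual ExecuteScript Jython script, the session and REL_SUCCESS variables that the processor binds for you would take the place of the stub and the "success" string, so any per-script overhead is paid once per batch rather than once per flow file.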
10-23-2017
02:29 PM
Yes, I have connected to Hive using DBCPConnectionPool. But is there any way to replace those jars with lower-version ones?
12-17-2018
08:02 PM
Hi @Pratik Ghatak @Rudolf Schimmel @Timothy Spann,

I am getting a similar error while connecting to Hive using DBCPConnectionPool in the ExecuteSQL processor in NiFi 1.7.1.

Hive version: Hive 1.1.0-cdh5.12.0

My DBCP Connection Pool:
DB Connection URL: jdbc:hive2://MyServer:10000/default;AuthMech=GSSAPI;KrbRealm=HADOOP.TEST.COMPANY.COM;KrbHostFQDN=_HOST;KrbServiceName=hive
DB Driver Location: /opt/flow/nifi/lib/hive-jdbc-1.1.0-cdh5.12.0.jar, /opt/flow/nifi/lib/hive-service-1.1.0-cdh5.12.0.jar, /opt/flow/nifi/lib/libthrift-0.9.3.jar
DB Driver Class: org.apache.hive.jdbc.HiveDriver
Database User: No value set
Password: No value set

Further, I have added the property below to hive-site.xml:

<property>
<name>hive.server2.transport.mode</name>
<value>binary</value>
</property>

If possible, could you please share your working DBCPConnectionPool settings?

I am able to connect to Hive in Python using the pyhs2 and impala.dbapi modules:

pyhs2.connect(host='myserver',port=10000,authMechanism='KERBEROS')
conn=connect(host='myserver', port=10000, kerberos_service_name='hive', auth_mechanism='GSSAPI')

Kindly advise. Thanks so much.