I'm not familiar with the innards of either Groovy or Jython, but I am guessing that Jython is slower for the following reasons:
1) Groovy was built "for the JVM" and leverages/integrates with Java more cleanly
2) Jython is an implementation of Python for the JVM. Looking briefly at the code, it appears to go back and forth between the Java and Python idioms, so it is more "emulated" than Groovy.
3) Apache Groovy has a large, very active community that consistently works to improve the performance of the code, both compiled and interpreted.
That said, there are a few options for improving performance:
- Use InvokeScriptedProcessor (ISP) instead of ExecuteScript. ISP is faster because it loads the script only once and then invokes methods on it, whereas ExecuteScript evaluates the script on every execution. I have an ISP template in Jython that should make porting your ExecuteScript code easier.
- Use ExecuteStreamCommand with command-line Python instead. You won't have the flexibility of accessing attributes, processor state, etc., but if you're just transforming content, you should find ExecuteStreamCommand with Python faster.
- No matter which language you choose, you can often improve performance by using session.get(int) instead of session.get(). session.get() returns a single flow file, while session.get(int) returns a list of up to that many. That way, if there are a lot of flow files in the queue, you could call, say, session.get(1000) and process up to 1000 flow files per execution. If your script has a lot of per-execution overhead, handling multiple flow files per execution can significantly improve performance.
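
To illustrate the last point, here is a rough sketch of the batching pattern in Jython-style Python. It assumes the `session` and `REL_SUCCESS` variables that ExecuteScript binds for you. The function name `process_batch` and the `batch.processed` attribute are made up for the example, so swap in whatever your script actually does per flow file.

```python
def process_batch(session, rel_success, batch_size=1000):
    """Pull up to batch_size flow files in one call and route each to success.

    `session` is expected to behave like NiFi's ProcessSession:
    get(int) returns a (possibly empty) list of flow files.
    """
    flow_files = session.get(batch_size)
    count = 0
    for flow_file in flow_files:
        # Hypothetical per-file work: tag the flow file, then transfer it.
        # putAttribute returns the updated flow file, so keep the new reference.
        flow_file = session.putAttribute(flow_file, 'batch.processed', 'true')
        session.transfer(flow_file, rel_success)
        count += 1
    return count

# Inside ExecuteScript the call would look like this, using the
# variables NiFi provides to the script:
# process_batch(session, REL_SUCCESS)
```

The point is that any per-execution setup cost (imports, compiling patterns, opening connections) is paid once per batch rather than once per flow file. An empty queue just yields an empty list, so the loop does nothing.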