Created on 12-27-2019 07:57 AM - edited 09-16-2022 07:34 AM
I have a use case coming up next week where I will need to convert some Parquet to CSV. I created a few demos with NiFi 1.10, but unfortunately it is not possible to use this version of NiFi in the customer's environment. I know I can still do what I need with 1.9 without the new 1.10 Parquet features, but those new features would be more efficient.
Is it possible to drop the required artifacts into 1.9 and achieve Parquet Reader functionality?
Created 12-27-2019 02:25 PM
I don't see any reason why you cannot do this. Do NOT put the 1.10 nar into NiFi's default lib directory.
Instead, create a new custom lib directory by adding a new property to the nifi.properties file:
nifi.nar.library.directory.custom1=/<path>/custom-libs
Copy the 1.10 parquet nars into this directory and make sure the permissions and ownership are set correctly for this custom path and its nars so that the NiFi service user can access them.
Once NiFi is restarted, you should see both versions of the parquet processor available for adding to the NiFi canvas.
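The staging steps above could be scripted roughly as follows. This is only a sketch, not an official NiFi tool: the function names `stage_nars` and `has_custom_nar_dir` and the default glob pattern are assumptions made for illustration; only the `nifi.nar.library.directory.custom1` style of property comes from the reply above.

```python
import shutil
from pathlib import Path


def stage_nars(source_lib, custom_dir, pattern="nifi-parquet-nar-*.nar"):
    """Copy NAR files matching `pattern` from source_lib into custom_dir.

    Returns the sorted names of the files that were copied.
    """
    source = Path(source_lib)
    target = Path(custom_dir)
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for nar in sorted(source.glob(pattern)):
        shutil.copy2(nar, target / nar.name)
        copied.append(nar.name)
    return copied


def has_custom_nar_dir(properties_path, custom_dir):
    """Check that some nifi.nar.library.directory.<name> key points at custom_dir."""
    prefix = "nifi.nar.library.directory."  # trailing dot skips the default lib key
    for line in Path(properties_path).read_text().splitlines():
        line = line.strip()
        if line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if key.strip().startswith(prefix) and Path(value.strip()) == Path(custom_dir):
            return True
    return False
```

The sketch does not change ownership or permissions; those still need to be set so the NiFi service user can read the copied files.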
Note: Adding additional versions of the same nars can make upgrading a bit more challenging.
After copying the new nar into the custom nar lib, you now have both 1.9 and 1.10 versions of the NiFi components. Let's say you later upgrade to 1.11 when it becomes available. NiFi normally handles upgrading components to the new version during startup (1.9 processors upgraded to 1.11), but in your case, after the upgrade you will have both the 1.10 nar from this custom lib and the 1.11 nar that came with the upgrade. Any 1.9 parquet components from your flow.xml.gz will become ghost processors on the canvas because NiFi no longer has a 1.9 version and there are two candidates (1.10 and 1.11), so it picks neither. You would need to either delete the ghost processors and add the 1.10 or 1.11 version in their place, or manually edit the flow.xml.gz to change the version number to the desired available version.
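The manual flow.xml.gz edit mentioned above might be scripted along these lines. Treat it as a sketch under assumptions: it assumes each component in flow.xml.gz carries a `<bundle>` element with `<group>`, `<artifact>` and `<version>` children, and the function name `retarget_bundle_version` is made up for this example. It writes to a separate output file so the original flow stays available as a backup.

```python
import gzip
import xml.etree.ElementTree as ET


def retarget_bundle_version(flow_in, flow_out, artifact, old_version, new_version):
    """Rewrite <bundle> versions for one artifact; return how many were changed."""
    with gzip.open(flow_in, "rb") as handle:
        tree = ET.parse(handle)
    changed = 0
    # Assumed layout: <bundle><group/><artifact/><version/></bundle>
    for bundle in tree.iter("bundle"):
        art = bundle.find("artifact")
        ver = bundle.find("version")
        if art is None or ver is None:
            continue
        if art.text == artifact and ver.text == old_version:
            ver.text = new_version
            changed += 1
    # Serialize to bytes first, then compress into the new file.
    data = ET.tostring(tree.getroot(), encoding="utf-8")
    with gzip.open(flow_out, "wb") as handle:
        handle.write(b'<?xml version="1.0" encoding="UTF-8"?>\n')
        handle.write(data)
    return changed
```

A call such as `retarget_bundle_version("flow.xml.gz", "flow-new.xml.gz", "nifi-parquet-nar", "1.9.0", "1.11.0")` would then only touch the parquet components; the version strings here are placeholders and need to match what is actually in your flow.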
Hope this helps,
Matt
Created on 01-08-2020 07:36 AM - edited 01-08-2020 08:56 AM
To get the nar files I needed, I downloaded NiFi 1.10 and copied the files from /root/nifi-1.10.0/lib to a custom folder on all my NiFi nodes (/app/nifi/custom-libs/).
I got the following error and needed another nar file:
While loading 'org.apache.nifi:nifi-parquet-nar:1.10.0' unable to locate exact NAR dependency 'org.apache.nifi:nifi-hadoop-libraries-nar:1.10.0'
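A missing parent nar like this can be spotted before restarting by reading the nar's manifest. The sketch below assumes the manifest uses the attribute names `Nar-Dependency-Group`, `Nar-Dependency-Id` and `Nar-Dependency-Version`; the helper names `read_nar_manifest` and `nar_dependency` are hypothetical.

```python
import zipfile


def read_nar_manifest(nar_path):
    """Return the main attributes of a NAR's META-INF/MANIFEST.MF as a dict."""
    with zipfile.ZipFile(nar_path) as nar:
        text = nar.read("META-INF/MANIFEST.MF").decode("utf-8")
    attrs = {}
    last_key = None
    for line in text.splitlines():
        if not line.strip():
            break  # a blank line ends the main section
        if line.startswith(" ") and last_key:
            attrs[last_key] += line[1:]  # continuation of a wrapped value
            continue
        key, _, value = line.partition(":")
        last_key = key.strip()
        attrs[last_key] = value.strip()
    return attrs


def nar_dependency(nar_path):
    """Return 'group:id:version' of the NAR's parent dependency, or None."""
    attrs = read_nar_manifest(nar_path)
    dep_id = attrs.get("Nar-Dependency-Id")
    if not dep_id:
        return None
    return ":".join([
        attrs.get("Nar-Dependency-Group", ""),
        dep_id,
        attrs.get("Nar-Dependency-Version", ""),
    ])
```

Running `nar_dependency` on each nar in the custom folder and comparing the results against the nars actually present should show which extra files need to be copied.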
Once I had both nars, I was able to restart and see the following in my NiFi log (set to DEBUG):
/lib/nifi/work/nar/extensions/nifi-parquet-nar-1.9.0.3.4.1.1-4.nar-unpacked]
org.apache.nifi.processors.parquet.FetchParquet
    org.apache.nifi:nifi-parquet-nar:1.10.0 || /var/lib/nifi/work/nar/extensions/nifi-parquet-nar-1.10.0.nar-unpacked
    org.apache.nifi:nifi-parquet-nar:1.9.0.3.4.1.1-4 || /var/lib/nifi/work/nar/extensions/nifi-parquet-nar-1.9.0.3.4.1.1-4.nar-unpacked
org.apache.nifi.processors.parquet.FetchParquet
    org.apache.nifi:nifi-parquet-nar:1.10.0 || /var/lib/nifi/work/nar/extensions/nifi-parquet-nar-1.10.0.nar-unpacked
    org.apache.nifi:nifi-parquet-nar:1.9.0.3.4.1.1-4 || /var/lib/nifi/work/nar/extensions/nifi-parquet-nar-1.9.0.3.4.1.1-4.nar-unpacked
org.apache.nifi.processors.parquet.PutParquet
    org.apache.nifi:nifi-parquet-nar:1.10.0 || /var/lib/nifi/work/nar/extensions/nifi-parquet-nar-1.10.0.nar-unpacked
    org.apache.nifi:nifi-parquet-nar:1.9.0.3.4.1.1-4 || /var/lib/nifi/work/nar/extensions/nifi-parquet-nar-1.9.0.3.4.1.1-4.nar-unpacked
org.apache.nifi.processors.parquet.ConvertAvroToParquet
    org.apache.nifi:nifi-parquet-nar:1.10.0 || /var/lib/nifi/work/nar/extensions/nifi-parquet-nar-1.10.0.nar-unpacked
    org.apache.nifi:nifi-parquet-nar:1.9.0.3.4.1.1-4 || /var/lib/nifi/work/nar/extensions/nifi-parquet-nar-1.9.0.3.4.1.1-4.nar-unpacked
org.apache.nifi.processors.parquet.PutParquet
    org.apache.nifi:nifi-parquet-nar:1.10.0 || /var/lib/nifi/work/nar/extensions/nifi-parquet-nar-1.10.0.nar-unpacked
    org.apache.nifi:nifi-parquet-nar:1.9.0.3.4.1.1-4 || /var/lib/nifi/work/nar/extensions/nifi-parquet-nar-1.9.0.3.4.1.1-4.nar-unpacked
org.apache.nifi.processors.parquet.ConvertAvroToParquet
    org.apache.nifi:nifi-parquet-nar:1.10.0 || /var/lib/nifi/work/nar/extensions/nifi-parquet-nar-1.10.0.nar-unpacked
    org.apache.nifi:nifi-parquet-nar:1.9.0.3.4.1.1-4 || /var/lib/nifi/work/nar/extensions/nifi-parquet-nar-1.9.0.3.4.1.1-4.nar-unpacked
org.apache.nifi.parquet.ParquetRecordSetWriter
    org.apache.nifi:nifi-parquet-nar:1.10.0 || /var/lib/nifi/work/nar/extensions/nifi-parquet-nar-1.10.0.nar-unpacked
org.apache.nifi.parquet.ParquetReader
    org.apache.nifi:nifi-parquet-nar:1.10.0 || /var/lib/nifi/work/nar/extensions/nifi-parquet-nar-1.10.0.nar-unpacked
Anyone working with NiFi and custom nars, or nars from another version, will also need to remove the NiFi "work" directory after every change. In my HDF cluster that meant an rm -rf /var/lib/nifi/work on all 4 NiFi nodes. If this folder is not deleted, the restart can fail outright, or the work directory will not be rebuilt with the new nar files.
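For anyone scripting this cleanup per node, the same step might look like the sketch below. The function name `clear_work_dir` is hypothetical, the path is the one from my cluster and may differ on yours, and I would only run it with NiFi stopped on that node.

```python
import shutil
from pathlib import Path


def clear_work_dir(work_dir):
    """Delete NiFi's work directory so nars are unpacked again on the next start.

    Returns True if a directory was removed, False if none existed.
    """
    path = Path(work_dir)
    if not path.is_dir():
        return False
    shutil.rmtree(path)
    return True
```

For example, `clear_work_dir("/var/lib/nifi/work")` mirrors the rm -rf above and reports whether there was anything to delete.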
Once NiFi was starting, I could see the 1.10 nar in /var/lib/nifi/work/extensions, but I was not seeing any of the 1.10 processors or controller services. I removed the work directory again, restarted all NiFi nodes, and was then able to see all the 1.10 Parquet processors and the Parquet Reader & Writer controller services (what I needed).
Once I was working with the Reader/Writer in the NiFi ConvertRecord processor, I hit an error stating that the 1.10 RecordReader was not compatible with the 1.9 RecordReaderFactory. This required more nar files:
Copied to all nodes, deleted work dir, restarted NiFi from Ambari. Looks like I am going to have to use 1.10 versions of everything related to the flow (ConvertRecord & CSVWriter).
Thanks @MattWho you are the boss!
Created 02-04-2020 11:41 AM
@MattWho do you know what version of Parquet is supported by the new readers?
Created 02-04-2020 12:46 PM