I first have to read a file from a local directory using GetFile and then load it into HDFS. Once the file is in HDFS, I have to copy it from a Hive table and insert it into another Hive table. From the NiFi documentation, I learned that I need to use PutHiveQL for the insert/select. But I could not find any option in PutHiveQL where I can write my insert-select query. Can anybody tell me where the option to write the Hive query is in PutHiveQL?
Thanks in advance.
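PutHiveQL is the correct processor to use. However, it has no property for the query itself: it executes the HiveQL statement contained in the content of the incoming FlowFile, so the SQL/HiveQL has to be formatted and passed to it by the previous processor.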
For example, you can use the ReplaceText processor to build a HiveQL statement (e.g. INSERT INTO nifitest ...), and then pass the output to the PutHiveQL processor.
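As a minimal sketch of what the ReplaceText Replacement Value could contain, assuming the target table is the nifitest table mentioned above and the source is a hypothetical table named source_table (substitute your own table names and columns):

```sql
-- Sketch of a Replacement Value for ReplaceText; PutHiveQL would then
-- execute this statement as the FlowFile content.
-- "source_table" is a hypothetical name, not taken from the original question.
-- The trailing semicolon is left off because some Hive JDBC versions may reject it.
INSERT INTO TABLE nifitest
SELECT * FROM source_table
```

With this in place, the FlowFile that reaches PutHiveQL carries the statement as its content, regardless of what the file content was before ReplaceText.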
The above is the correct way to do it. However, there is also a workaround; by using the SelectHiveQL processor, you can write an insert statement as below.
Take a look at the below two links that will clarify everything with examples:
I am using PutHiveQL along with ReplaceText.
The data flow runs successfully, but my insert-select did not execute: the target table of the insert-select is still empty.
In the ReplaceText properties, I left Search Value at its default and wrote my insert-select query in Replacement Value.
When I checked the log, I found the error below:
Could not open client transport with JDBC Uri: jdbc:hive2://sandbox.hortonworks.com:2181/testdb: null
It looks like you are trying to connect via the Zookeeper port, but there are some issues with this:
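For these reasons, I recommend you connect via the standard HiveServer2 port (10000), which is exposed by default by the sandbox's NAT configuration. With the host and database from your error message, the connection URL would be jdbc:hive2://sandbox.hortonworks.com:10000/testdb.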