Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

how to use puthiveql in NIFI

avatar
Contributor

Hi,

I have to first read a file from local directory using Getfile and then will be loading it into HDFS. Once the file is in HDFS, then I have copy it from hive based table and insert it into another hive table. For insert/select , I came to know that I need to use puthiveql through NIFI documentation. But I could not find any option in PutHiveQL where I can write my Insert-select query. Can anybody tell me that where is option in PutHiveQL to write the Hive query?

Thanks in advance.

3 REPLIES 3

avatar

@Muhammad Umar

PutHiveQL is the correct processor to use. However, the SQL/HiveQL has to be formatted and passed to it by the previous processor.

For example, you can use the ReplaceText processor to build a HiveQL statement (e.g. INSERT INTO nifitest ...), and then pass the output to the PutHiveQL processor.

The above is the correct way to do it. However, there is also a workaround; by using the SelectHiveQL processor, you can write an insert statement as below.

16196-hiveql.jpg

Take a look at the below two links that will clarify everything with examples:

https://community.hortonworks.com/questions/52328/how-to-insert-data-into-hive-using-nifi.html

https://community.hortonworks.com/articles/45706/using-the-new-hiveql-processors-in-apache-nifi-070....

avatar
Contributor

@Eyad Garelnabi,

I am using PutHiveql along with ReplaceText.

Data flow is running succesfully but my insert-select did not execute because final table is empty in which I am performing insert-select.

In replace text properties, I have left the search value as default and in Replacement Value I have written my insert-select query.

When I checked the log, I found below error:

Could not open client transport with JDBC Uri: jdbc:hive2://sandbox.hortonworks.com:2181/testdb: null

avatar
Master Guru

It looks like you are trying to connect via the Zookeeper port, but there are some issues with this:

  1. You'll likely have trouble because of the issue described in NIFI-2575. Even if that were not the issue, the following two things would have to be done.
  2. I believe you'd have to set some variables in the URL (such as serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2)
  3. The Zookeeper port is not exposed via the sandbox by default, you would have to forward that port if using NAT.

For these reasons, I recommend you connect via the standard HiveServer2 port (10000), which is exposed by default for the sandbox's NAT configuration.