About KFredrickson

KFredrickson · ‎02-06-2022

This was a result of a bug in my code and not anything to do with Hive itself - please ignore.

KFredrickson · ‎02-03-2022

Hi, I am seeing situations where two Hive SQL commands run concurrently and I get a lost update. I am running Hive 2.3.6 on EMR with hive.support.concurrency = true, and based on what I understand about Hive table locking I believe this shouldn't happen. (I am not using ACID transactions, but as far as I know the table locking should still prevent lost updates.)

Specifically, I have a "load data" statement loading data into table T from an S3 location. Concurrently, from another Hive connection, I have an "insert overwrite T select * from T" statement that deletes some rows from T but should not affect the rows from the load data statement. What I see is that the data from the load data statement disappears after the insert overwrite finishes.

My understanding is that load data and insert overwrite should each take an exclusive table lock on T, so each should let the other finish before reading or writing data in T. (I checked this using "show locks" and they definitely do create an exclusive lock.) Has anyone seen this issue before, and are there any Hive settings I can try changing to prevent this behavior?

KFredrickson · ‎02-12-2021

Looks like the files are available here: https://repo.hortonworks.com/content/repositories/releases/org/apache/nifi/nifi-hive-nar/

KFredrickson · ‎01-01-2020

I found that it's possible to fix this problem (as well as a different problem we were having with accessing Hive via a ZooKeeper connection string) by using a custom NiFi Hive NAR file that contains the Hortonworks versions of the hive, hadoop and zookeeper jars. This gets rid of both the problem with backticks and the problem with the ZooKeeper connection string.

To create the NAR file, I just unzipped the nifi-hive-nar-1.10.0.nar that comes with the Apache NiFi distro, then replaced all the hive-*, hadoop-*, and zookeeper-* jars with the ones in http://repo.spring.io/hortonworks/org/apache/nifi/nifi-hive-nar/1.9.0.3.4.1.9-2/
You can treat the NAR files as regular ZIP files; there is no need to compile anything or use Maven.

We have been using this custom NAR for a few weeks and the NiFi Hive processors seem to be working without any problems.
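For anyone who would rather script the jar swap than do it by hand, here is a minimal Python sketch of the steps described above. It is not the exact procedure used for the NAR mentioned in this post. It assumes you have already downloaded both NAR files locally, and the function names (is_swapped_jar, repack_nar) and file paths are hypothetical. It relies only on NAR files being ordinary ZIP archives and matches jars by file name prefix, wherever they sit inside the archive.

```python
import zipfile
from pathlib import PurePosixPath

# Jar name prefixes taken from the post (hive-*, hadoop-*, zookeeper-*).
SWAP_PREFIXES = ("hive-", "hadoop-", "zookeeper-")


def is_swapped_jar(entry_name: str) -> bool:
    """True for .jar entries whose file name starts with one of the prefixes."""
    name = PurePosixPath(entry_name).name
    return name.endswith(".jar") and name.startswith(SWAP_PREFIXES)


def repack_nar(base_nar: str, donor_nar: str, out_nar: str) -> list[str]:
    """Write out_nar as base_nar with the matching jars taken from donor_nar.

    Returns the archive paths of the jars copied from the donor NAR.
    """
    copied = []
    with zipfile.ZipFile(base_nar) as base, \
         zipfile.ZipFile(donor_nar) as donor, \
         zipfile.ZipFile(out_nar, "w", zipfile.ZIP_DEFLATED) as out:
        # Keep everything from the base NAR except the jars being replaced.
        for info in base.infolist():
            if not is_swapped_jar(info.filename):
                out.writestr(info, base.read(info.filename))
        # Add the matching jars from the donor NAR at their original paths.
        for info in donor.infolist():
            if is_swapped_jar(info.filename):
                out.writestr(info, donor.read(info.filename))
                copied.append(info.filename)
    return copied
```

A call might look like repack_nar("nifi-hive-nar-1.10.0.nar", "nifi-hive-nar-1.9.0.3.4.1.9-2.nar", "nifi-hive-nar-custom.nar"), where the second and third file names are placeholders. It is worth listing the returned jar paths and checking them against what you expected before dropping the result into NiFi's lib directory.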

KFredrickson · ‎12-23-2019

When running Hive queries from NiFi 1.10 that contain backticks, I get the following error:

2019-12-23 15:17:00,191 WARN [Timer-Driven Process Thread-2] o.a.nifi.processors.hive.SelectHiveQL SelectHiveQL[id=075b3a1e-7632-1647-68d9-338231b5921b] Failed to parse query: select 1 as `asdf` due to java.lang.NullPointerException:

I thought Hive queries allowed backticks to escape column names, so I'm not sure why NiFi can't parse this. The actual query runs fine on the Hive server, and I get a valid flow file with the query results, but it still raises a red NiFi bulletin (which we would prefer not to have if there is not a real problem). The Hive server running the query is the one that comes with HDP 2.5.

Member Since	‎10-24-2017 04:49 PM
Last Visited	‎07-28-2024 09:03 PM
Posts	17
Kudos received	2

Cloudera Community

Re: Hive concurrency - lost update

Re: Hive concurrency - lost update

Hive concurrency - lost update

Re: NiFi 1.10 Hive processors and backticks

Re: NiFi 1.10 Hive processors and backticks

NiFi 1.10 Hive processors and backticks