Presto (PrestoSQL) can't read ORC ACID transactional tables written by NiFi's PutHive3Streaming

Rising Star

So the title basically states it: I'm running into an issue when using Presto to read from a Hive 3 environment when the table is populated with ORC data by NiFi's PutHive3Streaming processor.

Presto is able to read ORC ACID tables in Hive 3 when they are populated via the command line or other NiFi processors. I also attempted to write the data using PutHive3Streaming from a later version of NiFi (1.11.4), to no avail.
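
For reference, the target is a standard Hive 3 transactional ORC table. The DDL is along these lines (schema and names simplified to placeholders rather than the actual table):

-- placeholder schema: a managed ORC table with transactional=true (Hive 3 ACID)
CREATE TABLE example_db.example_events (
  id BIGINT,
  payload STRING
)
STORED AS ORC
TBLPROPERTIES ('transactional' = 'true');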

 

Error:

io.prestosql.spi.PrestoException: Error opening Hive split hdfs://path/to/bucket (offset=0, length=29205493): rowsInRowGroup must be greater than zero
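
The exception is thrown on a plain read; a simple query like the one below (catalog, schema, and table names are placeholders) is enough to trigger it:

-- placeholder names: hive catalog, example_db schema, example_events table
SELECT * FROM hive.example_db.example_events LIMIT 10;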

 

Versions:
NiFi 1.9 (HDF)
PrestoSQL 331/332

2 Replies

Super Guru

@Eric_B Are the tables Presto cannot read owned by NiFi? The error you shared looks like a permissions issue with the underlying files. Also, if you can, please share screenshots of your processor configurations.

 

 

Rising Star

@stevenmatison 

Thanks for responding!

 

I did think this was a file permissions issue at the start, but I ran some tests.

 

Test 1: I chown'd/chmod'd the underlying files to match ORC files that Presto could read (those not written by PutHive3Streaming). That didn't work.

 

Test 2: I ran NiFi's SelectHive3QL processor (which supports inserts). This wrote the data with file permissions and ownership similar to those from the other processor, and Presto is able to read that data.
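
The statement run through that processor was essentially a plain HiveQL insert, along these lines (table and values are placeholders):

-- placeholder table and values
INSERT INTO example_db.example_events VALUES (1, 'test payload');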

 

Have you been able to get this to work?

Additionally, here's a snippet of my PutHive3Streaming configuration (minus specifics like the table, paths, and databases). I'm using an AvroReader as the record reader.

[Screenshot: PutHive3Streaming processor configuration]
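
In case the image doesn't come through, the relevant processor properties are roughly as follows (all values here are placeholders rather than the real hosts, paths, or names):

Hive Metastore URI: thrift://metastore-host:9083
Hive Configuration Resources: /path/to/hive-site.xml
Database Name: example_db
Table Name: example_events
Record Reader: AvroReader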