Presto (Prestosql) can't read ORC ACID transactional tables written by NiFi's PutHive3Streaming


So the title basically states it: I'm currently running into an issue where Presto can't read from a Hive 3 environment when the table is populated with ORC data by NiFi's PutHive3Streaming processor.

Presto is able to read ORC ACID tables in Hive 3 when they are populated via the command line or by other NiFi processors. I also attempted to write the data using PutHive3Streaming from a later version of NiFi (1.11.4), to no avail.
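For context, by "populated via the command line" I just mean plain HiveQL inserts into a transactional ORC table. A minimal sketch of that path (table name and connection details are placeholders, shown here through the PyHive client rather than beeline):

```python
# Minimal sketch: populate a transactional ORC table with plain HiveQL,
# the "command line" path that Presto reads without issue.
# Host, port, username, and table name are placeholders.
from pyhive import hive

conn = hive.Connection(host="hiveserver2.example.com", port=10000, username="eric")
cur = conn.cursor()

# In Hive 3 stacks, managed ORC tables are typically transactional by default;
# the TBLPROPERTIES clause just makes that explicit.
cur.execute("""
    CREATE TABLE IF NOT EXISTS demo_acid (
        id BIGINT,
        payload STRING
    )
    STORED AS ORC
    TBLPROPERTIES ('transactional'='true')
""")

cur.execute("INSERT INTO demo_acid VALUES (1, 'written via HiveQL')")
```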

 

Error:

io.prestosql.spi.PrestoException: Error opening Hive split hdfs://path/to/bucket (offset=0, length=29205493): rowsInRowGroup must be greater than zero

 

Versions:
NiFi 1.9 (HDF)
PrestoSQL 331/332
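Since the error specifically mentions rowsInRowGroup, one thing worth comparing is the ORC footer metadata of a bucket file written by PutHive3Streaming versus one Presto can read. A rough sketch with pyarrow (paths are placeholders, files copied out of HDFS first with hdfs dfs -get; the row_index_stride property needs a reasonably recent pyarrow):

```python
# Rough sketch: compare ORC footer metadata between a bucket file written
# by PutHive3Streaming and one written by a path Presto can read.
# Paths are placeholders; copy the files out of HDFS first (hdfs dfs -get).
import pyarrow.orc as orc

files = {
    "streaming": "bucket_00000_from_puthive3streaming",
    "readable":  "bucket_00000_from_hiveql_insert",
}

for label, path in files.items():
    f = orc.ORCFile(path)
    print(f"{label}: rows={f.nrows} stripes={f.nstripes} "
          f"row_index_stride={f.row_index_stride}")
```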


@Eric_B Are the tables Presto cannot read owned by NiFi? The error you shared seems like a permissions issue with the underlying files. Also, if you can, please share screenshots of your processor configurations.

 

 


@stevenmatison 

Thanks for responding!

 

I did think this was a file permissions issue at the start, but I ran some tests.

 

Test 1: I chown'd/chmod'd the underlying files to match ORC files that Presto could read (those not written by PutHive3Streaming). Didn't work.

 

Test 2: I ran NiFi's SelectHive3QL processor (which supports inserts). This wrote the data with file permissions and ownership similar to the other processor's, and Presto is able to read that data.
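For anyone trying to reproduce the comparison, running the same query against both tables through the Presto Python client is a quick check; something along these lines (host, catalog, and table names are placeholders):

```python
# Quick check: run the same query against both tables through the Presto
# Python client (presto-python-client). Connection details and table names
# are placeholders.
import prestodb

conn = prestodb.dbapi.connect(
    host="presto-coordinator.example.com",
    port=8080,
    user="eric",
    catalog="hive",
    schema="default",
)
cur = conn.cursor()

for table in ("table_written_by_streaming", "table_written_by_hiveql"):
    try:
        cur.execute(f"SELECT count(*) FROM {table}")
        print(table, "->", cur.fetchone()[0])
    except Exception as exc:
        # The table written by PutHive3Streaming fails here with the
        # "rowsInRowGroup must be greater than zero" error.
        print(table, "-> FAILED:", exc)
```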

 

Were you able to get it to work?

Additionally, here's a snippet of the PutHive3Streaming configuration (minus specifics like table, paths, and DBs), using an AvroReader to write.

[Screenshot: PutHive3Streaming processor configuration (Eric_B_0-1586790329125.png)]
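If it helps to reproduce, sample input for the AvroReader can be generated with a small script like this (the schema and field names are made up purely for illustration):

```python
# Tiny generator for sample Avro input that an AvroReader-based flow can parse.
# The schema and field names are made up purely for illustration.
from fastavro import parse_schema, writer

schema = parse_schema({
    "type": "record",
    "name": "Event",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "payload", "type": "string"},
    ],
})

records = [{"id": i, "payload": f"row-{i}"} for i in range(10)]

with open("sample_events.avro", "wb") as out:
    writer(out, schema, records)
```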

 
