ORC Format Slow JDBC Fetch

New Contributor

We have a Hive table stored in ORC format. The query:

select * from myTable where field = 'value'

executes in less than a second, which is great.

However, when we try to load the data through a JDBC result set, it takes ~50 secs to respond with data, even when the result has a small number of rows (<1000). The smaller the result set, the worse the performance is.

while (rs.next()) {   // the first call takes ~50 secs
    // ...load the data
}

This behavior is limited to ORC tables and doesn't seem to be influenced by compression (we've tried both SNAPPY and ZLIB).

Any ideas what might be happening here? Are there any tuning options which we could employ here?

1 REPLY

Re: ORC Format Slow JDBC Fetch

New Contributor

I've resolved this myself. The application was actually submitting

select * from myTable where field LIKE '%value%'

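This was, of course, causing a full table scan: a pattern with a leading wildcard cannot be checked against ORC's min/max column statistics, so no part of the file can be skipped.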
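For anyone who hits the same symptom, here is a minimal sketch of how one might confirm which statement the application really submits and where the time goes. It assumes a standard `java.sql.Connection` to HiveServer2 is already open. The class name `FetchTiming` and both method names are hypothetical, invented for illustration only. The helper logs the SQL text, binds the value as an equality parameter, and reports the time spent in `executeQuery()` separately from the time spent in the first `rs.next()`.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical helper, not part of Hive or of the original application.
class FetchTiming {

    // Builds an exact-match query. Table and column names are concatenated,
    // so they must come from trusted code, never from user input.
    static String equalityQuery(String table, String column) {
        return "SELECT * FROM " + table + " WHERE " + column + " = ?";
    }

    // Returns {millis spent in executeQuery, millis spent in the first rs.next()}.
    static long[] timeQuery(Connection conn, String sql, String value) throws SQLException {
        // Log exactly what is sent, so a stray LIKE '%...%' is easy to spot.
        System.out.println("Submitting: " + sql + " [param=" + value + "]");
        long start = System.nanoTime();
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, value);
            try (ResultSet rs = ps.executeQuery()) {
                long executed = System.nanoTime();
                rs.next(); // first fetch; the result is ignored, only the timing matters
                long firstRow = System.nanoTime();
                return new long[] {
                    (executed - start) / 1_000_000,
                    (firstRow - executed) / 1_000_000
                };
            }
        }
    }
}
```

Calling `FetchTiming.timeQuery(conn, FetchTiming.equalityQuery("myTable", "field"), "value")` and then repeating the call with the LIKE form of the statement should show whether the two queries really behave differently, and whether the delay sits in execution or in the first fetch.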