New Contributor
Posts: 2
Registered: ‎11-23-2015

Impala query parquet files from S3

Hi,

I am curious: when Impala queries Parquet files on S3, does it seek within the file and download only the needed columns, or does it download the whole file first? As I recall, S3 files are objects, so S3 doesn't allow seeking to specific bytes, which is needed to use Parquet files efficiently.

Thanks.

Cloudera Employee
Posts: 4
Registered: ‎08-26-2014

Re: Impala query parquet files from S3

Impala issues ranged GET requests through the S3A connector, so it downloads only the column chunks the query needs rather than the whole file.
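
To illustrate the idea, here is a rough sketch of how a reader can pull a single column out of a Parquet-style file using only byte-range reads. It is not Impala's or S3A's actual code. `FakeObjectStore`, `build_toy_file`, and `read_column` are hypothetical names, the store is an in-memory stand-in for S3, and the footer is JSON instead of Parquet's real Thrift-encoded metadata. What does match the real format is the tail layout: a 4-byte little-endian footer length followed by the `PAR1` magic. The inclusive `start`/`end` offsets mirror the HTTP `Range: bytes=start-end` header. The object size is assumed to be known up front (in practice it would come from a listing or a HEAD request).

```python
import json
import struct

MAGIC = b"PAR1"  # Parquet files begin and end with this 4-byte marker


class FakeObjectStore:
    """In-memory stand-in for S3 (hypothetical): serves inclusive byte ranges."""

    def __init__(self, data: bytes):
        self.data = data
        self.bytes_served = 0  # running total of bytes "downloaded"

    def range_get(self, start: int, end: int) -> bytes:
        # Same semantics as "Range: bytes=start-end": both ends inclusive.
        chunk = self.data[start:end + 1]
        self.bytes_served += len(chunk)
        return chunk


def build_toy_file(columns: dict) -> bytes:
    """Lay out column chunks, then a footer that indexes them.

    Simplification: the footer is JSON here; real Parquet uses Thrift.
    """
    body = bytearray(MAGIC)
    index = {}
    for name, payload in columns.items():
        index[name] = {"offset": len(body), "length": len(payload)}
        body += payload
    footer = json.dumps(index).encode()
    return bytes(body) + footer + struct.pack("<I", len(footer)) + MAGIC


def read_column(store: FakeObjectStore, size: int, name: str) -> bytes:
    # 1. Fetch the last 8 bytes: footer length (4 bytes, LE) plus magic.
    tail = store.range_get(size - 8, size - 1)
    if tail[4:] != MAGIC:
        raise ValueError("not a Parquet-style file")
    (footer_len,) = struct.unpack("<I", tail[:4])

    # 2. Fetch the footer, which says where each column chunk lives.
    footer_start = size - 8 - footer_len
    index = json.loads(store.range_get(footer_start, size - 9))

    # 3. Fetch only the requested column chunk.
    entry = index[name]
    return store.range_get(entry["offset"], entry["offset"] + entry["length"] - 1)


columns = {"id": b"i" * 1000, "name": b"n" * 5000, "price": b"p" * 200}
blob = build_toy_file(columns)
store = FakeObjectStore(blob)
price = read_column(store, len(blob), "price")
print(f"fetched {store.bytes_served} of {len(blob)} bytes")
```

Three small range reads (tail, footer, one column chunk) replace a full download, so the bytes transferred scale with the columns selected rather than with the file size.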
