How to load existing partitioned Parquet data into a Hive table from an S3 bucket?

New Contributor

Hi,

 

I currently have partitioned Parquet data in my S3 bucket, and I want to bind it dynamically to a Hive table.

 

I can do this for a single partition, but I need help loading the data from all of the existing partitions in the S3 bucket into the table.

 

Thank you.

1 REPLY

Expert Contributor

@codiste_m  By default, Hive uses static partitioning. Hive also supports dynamic partitioning, but I am not sure how well that works with existing data in existing folders. I believe dynamic partitioning works out the correct partitions based on the schema and creates the partition folders as the data is inserted into the storage path.

 

It sounds like you will need to execute a LOAD DATA command for each partition you want to query.
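
If the files already sit in Hive-style `column=value` folders, one possible way to avoid typing a command per partition by hand is to generate the statements from a listing of the S3 keys. The sketch below is only an illustration under several assumptions: the table is an external table whose location is the common prefix, the folder names follow the `column=value` convention, and the table name, bucket, prefix, and keys (`sales`, `my-bucket`, and so on) are hypothetical placeholders. It emits `ALTER TABLE ... ADD IF NOT EXISTS PARTITION` statements rather than LOAD DATA, so the files stay where they are; I have not tested it against your setup.

```python
def partition_spec(key, table_prefix):
    """Return ordered (column, value) pairs parsed from an object key such as
    'sales/year=2023/month=01/part-0.parquet'. Returns () if none are found."""
    if not key.startswith(table_prefix):
        return ()
    relative = key[len(table_prefix):].strip("/")
    pairs = []
    # Every path segment except the last one (the file name) may be a partition folder.
    for segment in relative.split("/")[:-1]:
        if "=" in segment:
            column, value = segment.split("=", 1)
            pairs.append((column, value))
    return tuple(pairs)


def add_partition_statements(table, bucket, table_prefix, keys):
    """Build one ALTER TABLE ... ADD PARTITION statement per distinct partition.

    Values are quoted as strings; 'table', 'bucket' and 'table_prefix' are
    hypothetical inputs supplied by the caller."""
    seen = set()
    statements = []
    for key in keys:
        spec = partition_spec(key, table_prefix)
        if not spec or spec in seen:
            continue  # skip non-partition files and partitions already handled
        seen.add(spec)
        columns = ", ".join(f"{c}='{v}'" for c, v in spec)
        folder = "/".join(f"{c}={v}" for c, v in spec)
        location = f"s3a://{bucket}/{table_prefix.strip('/')}/{folder}/"
        statements.append(
            f"ALTER TABLE {table} ADD IF NOT EXISTS "
            f"PARTITION ({columns}) LOCATION '{location}';"
        )
    return statements


if __name__ == "__main__":
    # Hypothetical key listing; in practice this would come from listing the bucket.
    example_keys = [
        "sales/year=2023/month=01/part-0.parquet",
        "sales/year=2023/month=01/part-1.parquet",
        "sales/year=2023/month=02/part-0.parquet",
        "sales/_SUCCESS",
    ]
    for stmt in add_partition_statements("sales", "my-bucket", "sales/", example_keys):
        print(stmt)
```

The printed statements can then be run from a Hive client. With this folder layout, running `MSCK REPAIR TABLE` on the external table may also pick up the partitions in one go, so it could be worth trying that first.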

 
