Save a DataFrame in Hive with a specific format, compression and the correct schema (PySpark)

Hi guys,

I want to save a DataFrame (read from a CSV file) to a Hive table stored as Parquet with Snappy compression.

I also want to keep the correct schema and pass it on to the Hive table.

I tried:

df.write.saveAsTable('db_name.table_name',format='parquet',compression='snappy',inferSchema=True)

Did I do this the correct way, or is there a more efficient way?

Is passing inferSchema the right way to get the correct schema into the table?
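
For reference, here is a minimal sketch of one way this is commonly done; it is not the only approach. As far as I know, `inferSchema` is an option of the CSV reader, so it belongs on the read side. The DataFrame then carries its schema, and `saveAsTable` passes that schema on to the table. The helper name `csv_to_hive_parquet`, the header row, and the `overwrite` mode are assumptions made for illustration and are not part of the original question.

```python
def csv_to_hive_parquet(spark, csv_path, table_name):
    """Read a CSV with an inferred schema and save it as a Snappy Parquet table.

    `spark` is expected to be a SparkSession created with Hive support.
    """
    # inferSchema is a read option: Spark samples the CSV to guess column types.
    df = (
        spark.read
        .option("header", "true")       # assumption: the file has a header row
        .option("inferSchema", "true")
        .csv(csv_path)
    )

    # The writer takes the schema from the DataFrame itself,
    # so no schema-related option is needed here.
    (
        df.write
        .format("parquet")
        .option("compression", "snappy")
        .mode("overwrite")              # assumption: replacing the table is acceptable
        .saveAsTable(table_name)
    )
    return df
```

With a session built via `SparkSession.builder.enableHiveSupport().getOrCreate()`, this would be called as `csv_to_hive_parquet(spark, "/path/to/file.csv", "db_name.table_name")`, where the CSV path is a placeholder. Calling `printSchema()` on the returned DataFrame before relying on the table is a cheap way to check the inferred types. If inference guesses wrong, passing an explicit schema to the reader with `.schema(...)` is the usual alternative.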
