
SPARK HIVE - Parquet and Snappy format - Table issue


I am trying to create a Hive table in Parquet format with Snappy compression. I am using HiveContext rather than SQLContext, and I save my DataFrame results directly into a table with saveAsTable("<table name>").

I set the compression codec using "hc.setConf('spark.sql.parquet.compression.codec','snappy')"

However, the Hive table is always created as Parquet with gzip compression instead of Snappy compression. Is there a solution for this?


Re: SPARK HIVE - Parquet and Snappy format - Table issue


@Mahendiran Palani Samy

Try setting the codec with .option on the DataFrame writer instead of hc.setConf.

Example:

(dataframe.write                      # in PySpark, write is a property, so no parentheses
    .format("parquet")
    .option("compression", "snappy")  # compression codec for this write only
    .saveAsTable("<table_name>"))
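If the same write is needed in several places, the call chain from the reply above could be wrapped in a small helper. This is only a sketch, assuming PySpark and a DataFrame created through a Hive-enabled context; `save_as_parquet_table` and its parameters are hypothetical names, not part of the Spark API. It uses only the `format`, `option`, and `saveAsTable` calls already shown in this thread.

```python
def save_as_parquet_table(df, table_name, codec="snappy"):
    """Save `df` as a Parquet-backed table with the requested codec.

    Hypothetical helper, not part of Spark. `df` is assumed to be a
    PySpark DataFrame; `table_name` is the target table name.
    """
    (
        df.write                        # DataFrameWriter (a property in PySpark)
        .format("parquet")              # store the table as Parquet
        .option("compression", codec)   # per-write codec, as in the reply above
        .saveAsTable(table_name)
    )
```

A call would look like save_as_parquet_table(dataframe, "<table_name>"). Passing the codec as a per-write option keeps the setting next to the write it applies to, rather than relying on a context-wide configuration value.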


Re: SPARK HIVE - Parquet and Snappy format - Table issue


Thanks, Shu. It is working now.
