
Cannot set Spark Parquet compression codec to "uncompressed"

New Contributor

Hi! I am preparing for the CCA175 certification, and I cannot set the Spark Parquet compression codec to "uncompressed".

In particular, I write in spark-shell:

scala> sqlContext.setConf("spark.sql.parquet.compression.codec", "uncompressed")

 

and I get:

18/05/27 10:53:43 WARN metastore.ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.1.0-cdh5.13.0
18/05/27 10:53:44 WARN metastore.ObjectStore: Failed to get database default, returning NoSuchObjectException

 

Any help would be much appreciated. Thank you very much in advance! 🙂

2 Replies

Cloudera Employee

I tried this on my CDH 6.3 cluster and found no errors:

scala> spark.sql("set spark.sql.parquet.compression.codec=uncompressed")
res8: org.apache.spark.sql.DataFrame = [key: string, value: string]

scala>
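
To double-check that the setting took effect, you can read the conf back and write a small test DataFrame. This is a minimal sketch; the scratch path /tmp/parquet_codec_test is a hypothetical stand-in:

scala> // confirm the session-level codec setting
scala> spark.conf.get("spark.sql.parquet.compression.codec")
res9: String = uncompressed

scala> spark.range(10).write.parquet("/tmp/parquet_codec_test")

With "uncompressed", the part files come out named like part-00000-<uuid>-c000.parquet, with no codec infix; under the default snappy codec they would be named part-00000-<uuid>-c000.snappy.parquet.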

Cloudera Employee

Alternatively, you can use the DataFrameWriter option to set the codec for a single write:

rs.write.option("compression", "uncompressed").parquet("/user/output01/")
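
Note that this per-write option takes precedence over the session-level spark.sql.parquet.compression.codec setting. For reference, a minimal self-contained sketch, where the DataFrame rs and the output path are hypothetical stand-ins:

scala> val rs = spark.range(100).toDF("id")  // any DataFrame works here
scala> // "uncompressed" disables compression for this write only;
scala> // other accepted values include "snappy" (the default) and "gzip"
scala> rs.write.option("compression", "uncompressed").parquet("/user/output01/")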