Created 05-27-2018 01:20 PM
Hi! I am preparing for the CCA175 certification, and I cannot set the Spark Parquet compression codec to "uncompressed".
In particular, I write in spark-shell:
scala> sqlContext.setConf("spark.sql.parquet.compression.codec", "uncompressed")
and I get:
18/05/27 10:53:43 WARN metastore.ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.1.0-cdh5.13.0
18/05/27 10:53:44 WARN metastore.ObjectStore: Failed to get database default, returning NoSuchObjectException
Any help would be much appreciated. Thank you very much in advance! 🙂
Created 10-01-2019 02:18 AM
I tried this on my CDH 6.3 cluster and found no errors:
scala> spark.sql("set spark.sql.parquet.compression.codec=uncompressed")
res8: org.apache.spark.sql.DataFrame = [key: string, value: string]
scala>
Created on 10-01-2019 03:00 AM - edited 10-01-2019 03:01 AM
Alternatively, you can set the compression codec per write with the DataFrameWriter option:
rs.write.option("compression", "uncompressed").parquet("/user/output01/")
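For reference, a minimal sketch combining both approaches, assuming Spark 2.x where `spark` is the `SparkSession`; the variable `rs` and the input path `/user/input01/` are hypothetical placeholders:

```scala
// Session-wide default: subsequent Parquet writes use no compression.
spark.sql("set spark.sql.parquet.compression.codec=uncompressed")

// Hypothetical input path for illustration.
val rs = spark.read.parquet("/user/input01/")

// Per-write setting: overrides the session default for this write only.
rs.write
  .option("compression", "uncompressed")
  .parquet("/user/output01/")
```

Note that the `WARN metastore.ObjectStore` messages in the original question come from the Hive metastore, not from the compression setting itself; the `setConf` call still takes effect despite them.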