How to compress with bzip2 and insert into Hive
Labels: Apache Spark
Explorer
Created ‎02-27-2018 03:06 PM
Hi,
I am trying to save my DataFrame as an ORC table with bzip2 compression, but it throws the following error:
java.lang.IllegalArgumentException: Codec [bzip2] is not available. Available codecs are uncompressed, lzo, snappy, zlib, none.
at org.apache.spark.sql.hive.orc.OrcOptions.<init>(OrcOptions.scala:49)
at org.apache.spark.sql.hive.orc.OrcOptions.<init>(OrcOptions.scala:32)
at org
My code is:
fields.write.format("orc").option("compression","bzip2").saveAsTable("prasadtest.descargatest")
I am using Spark 2 for this.
3 REPLIES
Guru
Created ‎02-27-2018 03:25 PM
Hi Prasad
You need to import the BZip2Codec class in your code. Simply add the following line to your code and it should work fine.
import org.apache.hadoop.io.compress.BZip2Codec;
Expert Contributor
Created ‎02-27-2018 04:10 PM
Hi, @prasad raju
Unfortunately, ORC doesn't support BZip2, so neither Hive nor Spark can write ORC files with it.
Master Guru
Created ‎02-27-2018 04:18 PM
Use Snappy as your compression codec.
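For anyone landing on this thread, here is a minimal sketch of that suggestion. It assumes the codec list printed in the error message above is the full set accepted by the ORC writer in this Spark 2 build. The `OrcCodec` object and its `resolve` helper are hypothetical names made up for illustration, not part of Spark; they only pick a codec name that the writer should accept.

```scala
import java.util.Locale

object OrcCodec {
  // Codec names copied from the error message in the question
  // (assumed to be the complete list for this Spark 2 build).
  val supported: Set[String] = Set("uncompressed", "lzo", "snappy", "zlib", "none")

  // Hypothetical helper: return the requested codec if the ORC writer
  // accepts it, otherwise fall back to a supported one (snappy by default).
  def resolve(requested: String, fallback: String = "snappy"): String = {
    require(supported.contains(fallback), s"fallback codec '$fallback' is not supported")
    val name = requested.toLowerCase(Locale.ROOT)
    if (supported.contains(name)) name else fallback
  }
}

// Usage with the DataFrame from the question
// (needs a Spark 2 session with Hive support, so it is left commented out):
//
// fields.write
//   .format("orc")
//   .option("compression", OrcCodec.resolve("bzip2")) // bzip2 is not in the list, so snappy is used
//   .saveAsTable("prasadtest.descargatest")
```

If no fallback logic is needed, passing "snappy" or "zlib" directly to option("compression", ...) in the original write call should be enough.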
