How to compress with bzip2 and insert into Hive


I am trying to insert my DataFrame into Hive as ORC with bzip2 compression, but it throws the following error:

java.lang.IllegalArgumentException: Codec [bzip2] is not available. Available codecs are uncompressed, lzo, snappy, zlib, none.
  at org.apache.spark.sql.hive.orc.OrcOptions.<init>(OrcOptions.scala:49)
  at org.apache.spark.sql.hive.orc.OrcOptions.<init>(OrcOptions.scala:32)
My code is


I am using Spark 2 for this.

Hi Prasad

You need to import the BZip2Codec class in your code. Simply add the following line to your code and it should work fine.


Hi, @prasad raju

Unfortunately, ORC doesn't support BZip2, so Hive and Spark don't support it for ORC either.

- ORC Source Code

- HIVE-5067

Use Snappy as your compression codec instead.
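As a rough illustration of that suggestion, here is a minimal PySpark-style sketch. The codec set is copied from the error message in the question; the helper names `pick_orc_codec` and `write_orc` and the table name in the usage note are hypothetical, and the original poster's actual code is not shown in the thread, so this is an assumption about what the write call might look like rather than a fix to that code.

```python
# Codec names copied from the error message in the question (Spark 2 ORC writer).
SUPPORTED_ORC_CODECS = {"uncompressed", "lzo", "snappy", "zlib", "none"}


def pick_orc_codec(requested, fallback="snappy"):
    """Return `requested` (lower-cased) if the ORC writer lists it, else `fallback`."""
    name = requested.lower()
    return name if name in SUPPORTED_ORC_CODECS else fallback


def write_orc(df, table_name, requested_codec="bzip2"):
    """Append a Spark DataFrame to a Hive table as ORC, using a supported codec.

    `df` is assumed to be a pyspark.sql.DataFrame created from a
    Hive-enabled SparkSession; `table_name` is a hypothetical table.
    """
    codec = pick_orc_codec(requested_codec)
    (df.write
       .format("orc")
       .option("compression", codec)  # e.g. "snappy" instead of "bzip2"
       .mode("append")
       .saveAsTable(table_name))
    return codec
```

With a real DataFrame, `write_orc(df, "mydb.mytable")` would fall back to Snappy, since bzip2 is not among the codecs the ORC writer reports.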