Regarding the text file compression
Labels: Apache Hive, Apache Impala
Created 03-29-2018 03:32 PM
Is it possible to compress a 'TEXTFILE' in hive/impala, without converting to other formats (like parquet and orc)?
Thanks
Created 03-30-2018 03:05 AM
In your Hive session, set the following properties:
set hive.exec.compress.output=true;
set mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.GzipCodec;
The first property enables output compression and the second sets the compression codec, gzip in this case.
Now you can write the data to an HDFS directory, and the output will be gzip-compressed text:
insert overwrite directory 'myHDFSDirectory' row format delimited fields terminated by ',' select * from myTable;
This stores the output of the select * query in the HDFS directory as comma-delimited text files compressed with gzip.
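If you would rather keep the compressed data as a queryable TEXTFILE table instead of a bare directory, the same settings should apply to a table write as well. A minimal sketch, assuming a hypothetical target table name myTable_gz (not from the question) and the same comma delimiter:

```sql
-- Sketch only: myTable_gz is a hypothetical table name used for illustration.
-- Enable compressed output and select the gzip codec.
SET hive.exec.compress.output=true;
SET mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.GzipCodec;

-- Create a comma-delimited TEXTFILE table populated from the source table.
-- The storage format stays TEXTFILE; only the files Hive writes are compressed.
CREATE TABLE myTable_gz
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
AS SELECT * FROM myTable;

-- Read it back like any other table; Hive decompresses the files on read.
SELECT COUNT(*) FROM myTable_gz;
```

Keep in mind that gzip files are not splittable, so each compressed file is read by a single task. For large tables it may be worth checking how many output files the query produces.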
Let me know if that works for you.
Created 04-01-2018 03:23 PM
Did the answer help resolve your query? If so, please close the thread by marking the answer as Accepted.
