Created 04-26-2016 10:06 PM
Assuming compression is enabled of course.
Created 04-26-2016 10:44 PM
@Terry Padgett These are stored as uncompressed text files.
Created 04-27-2016 01:38 PM
Which temporary tables are we talking about?
Tables you create with CREATE TEMPORARY TABLE?
These can have any storage format you want. So if you create it as ORC, it definitely WILL be compressed.
Or what do you mean by "compression is enabled"?
There are also some internal structures, for example the dataset that the Tez job generates before HiveServer2 returns it to the client. This can be a text or sequence file (configurable), but I heard there is a JIRA to use ORC for it instead.
Created 04-27-2016 02:18 PM
@Benjamin Leonhardi Yes, these are Hive temporary tables. The feature is newish and I wanted to know if there are any surprises not mentioned in the language manual. Memory is one of the options for temporary table storage, and I want to see if it is possible to fit the tables into memory. The tables are short-lived, so I don't think ORC is a realistic choice at the moment, but that could change.
Created 04-27-2016 02:49 PM
What do you mean by memory? As far as I know, a temporary table is just like any other table, with the one exception that it is cleaned up when the session ends. So you can choose any storage format, but it will be on HDFS. So it depends: if you only need it once, I agree ORC is most likely not a good fit, but if you create a temp table once and then query it a couple of times, ORC definitely makes sense to me.
Edit: Interesting. You could use the HDFS storage policies here. Do you have a cluster that has been set up like this? You could still use any kind of storage you want, compressed or not, and I still think ORC will be good if you use your temporary table a couple of times.
Starting in Hive 1.1.0 the storage policy for temporary tables can be set to memory, ssd, or default with the hive.exec.temporary.table.storage configuration parameter (see HDFS Storage Types and Storage Policies).
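As a rough sketch of how that parameter might be used in a session: the table names tmp_params and table_params below are only illustrative, and the memory policy is only expected to help if the cluster's DataNodes actually have memory-backed storage configured, which is an assumption here.

```sql
-- Ask Hive to place temporary table data under the HDFS memory storage policy.
-- Valid values per the language manual: memory, ssd, default.
SET hive.exec.temporary.table.storage=memory;

-- tmp_params and table_params are hypothetical names used for illustration.
-- The table is dropped automatically when the session ends.
CREATE TEMPORARY TABLE tmp_params AS
SELECT * FROM table_params;
```

The storage policy and the file format are independent choices, so a STORED AS clause can still be added to the CREATE statement.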
Created 04-27-2016 03:14 PM
If you want to store these temporary tables as ORC, it is still possible. Here is an example.
create temporary table tp1 stored as orcfile as select count(*) from table_params;
My earlier answer addressed whether the text format, which is the default, is compressed on HDFS.
Created 04-27-2016 04:31 PM
Ya @Ravi Mutyala, the temporary tables are only in use for a few minutes. My concern is also about any additional time spent writing the table as ORC. I'll probably have to run a bake-off to see how it works in this case.
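One way such a bake-off could look, as a minimal sketch: src_events is a hypothetical source table, and the SELECT at the end stands in for whatever query the temporary table really serves. Comparing the elapsed time the client reports for each statement gives a first impression of write cost versus read cost.

```sql
-- Variant 1: default text format (src_events is a hypothetical source table).
CREATE TEMPORARY TABLE tmp_text STORED AS TEXTFILE AS
SELECT * FROM src_events;

-- Variant 2: ORC, which costs more to write but is compressed and columnar.
CREATE TEMPORARY TABLE tmp_orc STORED AS ORC AS
SELECT * FROM src_events;

-- Run the same representative query against both and compare timings.
SELECT COUNT(*) FROM tmp_text;
SELECT COUNT(*) FROM tmp_orc;
```

Repeating the read query as many times as the real workload would makes the comparison fairer, since ORC's write overhead is only paid once.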