Community Articles

Find and share helpful community-sourced technical articles.
Labels (2)
avatar
Expert Contributor

A temporary table is a convenient way for an application to automatically manage intermediate data generated during a complex query. Rather than manually deleting tables needed only as temporary data in a complex query, Hive automatically deletes all temporary tables at the end of the Hive session in which they are created.

The data in these tables is stored in the user's scratch directory rather than in the Hive warehouse directory. The scratch directory effectively acts as the user' data sandbox, located by default in /tmp/hive-<username>.

Hive users create temporary tables using the TEMPORARY keyword

CREATE TEMPORARY TABLE tmp1(c1 string);
CREATE TEMPORARY TABLE tmp2 AS..
CREATE TEMPORARY TABLE tmp3 LIKE..

Multiple Hive users can create multiple Hive temporary tables with the same name because each table resides in a separate session.

Temporary tables support most table options, but not all. The following features are not supported:

  • Partition columns
  • Indexes

A temporary table with the same name as a permanent table will cause all references to that table name to resolve to the temporary table. The user cannot access the permanent table during that session without dropping or renaming the temporary table.

71,996 Views
Comments
avatar
New Contributor

what if the same user has two different sessions and uses the same table name. Will the table in each session be given its own scratch space?

avatar
New Contributor

@rbanerjee, to answer your question, yes, temporary tables are independent and isolated between sessions for the same user. Here, https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-TemporaryTable..., it says "A table that has been created as a temporary table will only be visible to the current session"

 

I have verified that. As the same user I created a temporary table of the same name in two concurrent Beeline sessions. They are independent, as confirmed by INSERT/SELECT statements and conclusively by their locations in HDFS:

 

Session 1: SHOW CREATE TABLE...

LOCATION ... /tmp/hive/hive/4d4e879d-5806-425e-9051-d764b117c71d/_tmp_space.db/d3b5f635-1251-4cda-be59-79555c8643fd

 

Session 2: SHOW CREATE TABLE...

LOCATION  .. /tmp/hive/hive/15049889-fc99-4890-b3eb-0a5d22c73083/_tmp_space.db/ac5a2d3c-a61f-4797-afa4-c88940a10147