Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

Loading data into a Hive table from HDFS deletes the file from the source directory (HDFS).

Rising Star

Hi All,

When we load data into a Hive table from HDFS, the file is deleted from the source directory (HDFS). Is there a way to keep the file in the source directory and still load the data into the Hive table?

I used the query below:

LOAD DATA INPATH 'source_file_path' OVERWRITE INTO TABLE TABLENAME;

1 ACCEPTED SOLUTION

Super Guru

@Ravikumar Kumashi

If you don't want to lose the source data while loading, the best way is to create an external table over the existing HDFS directory. Alternatively, you can make a copy of your source directory and create an external Hive table that points to the new directory location.

hadoop fs -cp /path/old/hivetable /path/new/hivetable
create external table table_name ( id int, myfields string )
 location '/path/new/hivetable';
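
For the first option (an external table directly over the existing directory, with no copy), a minimal sketch might look like the following. The path and column list are reused from the example above; the comma delimiter and text file format are assumptions for illustration and should be adjusted to the actual data.

```sql
-- Assumption: the source files already sit in /path/old/hivetable
-- and are comma-delimited text. Adjust the delimiter and columns as needed.
CREATE EXTERNAL TABLE IF NOT EXISTS table_name (
  id INT,
  myfields STRING
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/path/old/hivetable';

-- Hive reads the files where they are; nothing is moved out of the directory.
SELECT * FROM table_name LIMIT 10;
```

Because the table is external, Hive only records metadata that points at the directory, so by default dropping the table later leaves the files in HDFS.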

3 REPLIES

Rising Star

@Jitendra Yadav

Thank you! That works for me.

I thought there was a way to keep the file in the source directory and also load the data into a managed table, but it looks like there is no way to do that.
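
For readers who still need the data in a managed table, one possible workaround is to leave the files under the external table and copy the rows across with INSERT ... SELECT. This is only a sketch: it assumes the external table table_name from the accepted solution already exists, and managed_table_name is a hypothetical name chosen for illustration.

```sql
-- Hypothetical managed table with the same columns as the external one.
CREATE TABLE IF NOT EXISTS managed_table_name (
  id INT,
  myfields STRING
);

-- Copy the rows from the external table into the managed table.
-- Hive writes new files under the managed table's own directory;
-- the source files behind table_name are only read, not moved.
INSERT OVERWRITE TABLE managed_table_name
SELECT id, myfields
FROM table_name;
```

The trade-off is that the data is stored twice, once in the source directory and once in the managed table's directory.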

Visitor

In my case, the source file is removed when I load a single file with the 'OVERWRITE' clause.

The files stay when I load a set of files matching a pattern (say _*.txt) without the 'OVERWRITE' clause.
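
For comparison, the Hive documentation describes the difference in terms of the LOCAL keyword rather than OVERWRITE: without LOCAL the source path is in HDFS and the files are moved into the table, while with LOCAL the files are copied from the local file system. A small sketch, where both paths are placeholders rather than real locations and tablename is the table from the original question:

```sql
-- HDFS source (no LOCAL): the file is moved into the table's directory,
-- so it no longer appears at the source path afterwards.
LOAD DATA INPATH '/path/in/hdfs/data.txt' INTO TABLE tablename;

-- Local file system source (LOCAL): the file is copied into the table,
-- so the original stays where it was.
LOAD DATA LOCAL INPATH '/path/on/local/disk/data.txt' INTO TABLE tablename;
```

Listing the source directory with hadoop fs -ls before and after the load is a quick way to check which behavior applies in a given setup.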