Data vanishing from HDFS after loading it into a Hive table?
Labels: Apache Hive, Apache Impala, HDFS, Quickstart VM
Created on ‎05-24-2017 08:25 AM - edited ‎09-16-2022 04:39 AM
Hi all,
I am using Quickstart VM 5.8.
I have loaded some flat files into HDFS.
I created an external table in Hive as below:
CREATE EXTERNAL TABLE abc (ID int, Price double, Start_DTTM string, DEL_DT_TM string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE;
LOAD DATA INPATH '/user/cloudera/CPC/QSM/QSM_MarToApr2016.csv' INTO TABLE abc;
The data loaded successfully into the Hive table.
But the data is vanishing from its original HDFS location.
Please suggest.
Thanks,
Syam.
Created ‎05-25-2017 10:47 AM
You did not specify a LOCATION in the CREATE EXTERNAL TABLE statement. I believe it then defaults to the warehouse directory.
LOAD DATA INPATH moves the data from the specified path to the table's location. I think it moved the file from /user/cloudera/CPC/QSM/QSM_MarToApr2016.csv to /user/hive/warehouse/abc/...
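One way to confirm where the table's data actually lives is to ask Hive for the table metadata. A minimal sketch, assuming the table abc from the question; the Location field in the output shows the directory Hive reads from, and the exact path depends on how the warehouse is configured:

```sql
-- Show table metadata, including the Location and Table Type fields.
DESCRIBE FORMATTED abc;
```

If Location points under the warehouse directory, the CSV file should now be found there rather than at its original path.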
Created ‎05-28-2017 11:33 PM
Hi Mbigelow,
Thanks for the reply.
I have uploaded the flat file to an HDFS location (/user/clouder/QSM/).
I created the table as above and loaded the data.
The data loaded successfully into Hive.
But I don't want the data to be moved to the Hive warehouse.
Hive queries should return results without the data vanishing from its HDFS location.
Please guide me.
Thanks,
Syam.
Created ‎05-28-2017 11:48 PM
On the last statement, are you saying that after loading the data in the table it was no longer in the original location but you also weren't getting data returned from the table?
Created ‎05-28-2017 11:52 PM
Okay, I will set the HDFS location when creating the table.
The data vanishes from the original HDFS location, but it is present in the Hive location.
Thanks,
Syam.
Created ‎05-29-2017 04:25 AM
When you create the external table, specify the LOCATION clause (i.e., LOCATION overrides the default location of the Hive table).
Then load data from HDFS using LOAD DATA INPATH. If you drop the table, it will only remove the table's metadata (the pointer to the data) and will not delete the data in HDFS.
CREATE EXTERNAL TABLE text1 (wban INT, `date` STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LOCATION '/hive/data/text1';
LOAD DATA INPATH 'hdfs:/data/2000.txt' INTO TABLE text1;
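If the goal is to leave the file exactly where it is, one option is to point LOCATION at the directory that already holds the file and skip LOAD DATA altogether, since LOAD DATA INPATH moves the file into the table's location. A sketch using the table and path from the question; it assumes /user/cloudera/CPC/QSM contains only CSV files matching this schema, because Hive reads every file in that directory, and that any existing table named abc has been dropped first:

```sql
-- LOCATION must be a directory, not a single file.
CREATE EXTERNAL TABLE abc (ID int, Price double, Start_DTTM string, DEL_DT_TM string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/user/cloudera/CPC/QSM';

-- No LOAD DATA step: the files are queried in place.
SELECT * FROM abc LIMIT 10;
```

Because the table is external, dropping it later removes only the metadata and leaves the files in /user/cloudera/CPC/QSM.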
