- Subscribe to RSS Feed
- Mark Question as New
- Mark Question as Read
- Float this Question for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
Add filename to import in Hive
- Labels:
-
Apache Hive
-
Apache Impala
-
Cloudera Hue
Created on ‎06-13-2016 05:22 AM - edited ‎09-16-2022 03:24 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi guys,
I am new to Cloudera and have only used Impala and Hive through Hue thus far.
I have a folder of CSVs that I imported, but would like to add the filename as a column to each row that I import.
How do I do this?
I came across the function
INPUT__FILE__NAME
but I don't see it actually doing anything.
My import is very easy - hoping you can help me?
create external table hive_magnum
(
col1 string,
col2 string,
col3 string
)
row format delimited fields terminated by ","
location '/user/xxxxxx/magnum';
Created on ‎06-17-2016 02:56 PM - edited ‎06-17-2016 02:57 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi Lucille,
For the example you provided, you could get the file names with a query like this:
SELECT hive_magnum.col1, hive_magnum.col2, hive_magnum.col3, hive_magnum.INPUT__FILE__NAME FROM hive_magnum;
It will actually provide the full HDFS location, which includes the file name.
I hope this is helpful.
Created on ‎06-17-2016 02:56 PM - edited ‎06-17-2016 02:57 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi Lucille,
For the example you provided, you could get the file names with a query like this:
SELECT hive_magnum.col1, hive_magnum.col2, hive_magnum.col3, hive_magnum.INPUT__FILE__NAME FROM hive_magnum;
It will actually provide the full HDFS location, which includes the file name.
I hope this is helpful.
Created ‎06-20-2016 01:58 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
thank you so much - this worked 🙂
I just have one more questions.. to get this into the another table - it doesn't seem to accept this.. what is wrong with it?
It also did not like the <pre> bit... If I take it out, it runs..
create table tester
(
SELECT hive_magnum.col1, hive_magnum.col2, hive_magnum.col3, hive_magnum.INPUT__FILE__NAME as file_name FROM hive_magnum)
Created ‎06-20-2016 06:59 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
bit was an attempt to format the post. It wasn't supposed to be
part of the query, sorry about that.
Try adding "as" after "create table tester" and before the nested select
query.
