Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

Add filename to import in Hive

SOLVED Go to solution

Add filename to import in Hive

New Contributor

Hi guys,

 

I am new to Cloudera and have only used Impala and Hive through Hue thus far.

 

I have a folder of CSVs that I imported, but would like to add the filename as a column to each row that I import.

 

How do I do this?

 

I came across the function 

INPUT__FILE__NAME

 but I don't see it actually doing anything.

 

My import is very easy - hoping you can help me?

 

create external table hive_magnum
(
col1 string,
col2 string,
col3 string
)
row format delimited fields terminated by "," 
location '/user/xxxxxx/magnum';

1 ACCEPTED SOLUTION

Accepted Solutions

Re: Add filename to import in Hive

Cloudera Employee

Hi Lucille,

 

For the example you provided, you could get the file names with a query like this:

 

SELECT hive_magnum.col1, hive_magnum.col2, hive_magnum.col3, hive_magnum.INPUT__FILE__NAME
FROM
hive_magnum;

It will actually provide the full HDFS location, which includes the file name. 

 

I hope this is helpful.

3 REPLIES 3

Re: Add filename to import in Hive

Cloudera Employee

Hi Lucille,

 

For the example you provided, you could get the file names with a query like this:

 

SELECT hive_magnum.col1, hive_magnum.col2, hive_magnum.col3, hive_magnum.INPUT__FILE__NAME
FROM
hive_magnum;

It will actually provide the full HDFS location, which includes the file name. 

 

I hope this is helpful.

Re: Add filename to import in Hive

New Contributor

thank you so much - this worked :)

 

I just have one more questions.. to get this into the another table - it doesn't seem to accept this.. what is wrong with it?

It also did not like the <pre> bit... If I take it out, it runs..


create table tester
(
SELECT hive_magnum.col1, hive_magnum.col2, hive_magnum.col3, hive_magnum.INPUT__FILE__NAME as file_name FROM hive_magnum)

 

 

Re: Add filename to import in Hive

Cloudera Employee
The
 bit was an attempt to format the post. It wasn't supposed to be
part of the query, sorry about that.

Try adding "as" after "create table tester" and before the nested select
query.