How to build an HDFS path to load data from?


Hi,

I have a 'simple' use case (at least I thought it was simple ;) ).

I want to call a Hive script with a parameter in the format YYYY-MM-DD. Inside the script I want to load data from the HDFS path /a/b/c/d_${parameter}.

How can I build the path dynamically so that I can place it into LOAD DATA INPATH '...' OVERWRITE INTO TABLE data_staging; ?

Even assigning a hardcoded path to a variable and using that variable as the INPATH doesn't work.

I tried:

set p="/a/b/c/d_2013-07-08/data.tsv";
LOAD DATA INPATH "${hiveconf:p}" OVERWRITE INTO TABLE data_staging;

=> Error while compiling statement: FAILED: ParseException line 1:19 mismatched input '/' expecting INTO near '""' in load statement

 

If I remove the quotes around the parameter placeholder, I get an error as well: file not found, even though the file definitely exists in HDFS (I checked many times).

set p="/a/b/c/d_2013-07-08/data.tsv";
LOAD DATA INPATH ${hiveconf:p} OVERWRITE INTO TABLE data_staging;

=> Error while compiling statement: FAILED: SemanticException Line 1:17 Invalid path '"/a/b/c/d_2013-07-08/data.tsv"': No files matching path hdfs://nameservice1/a/b/c/d_2013-07-08/data.tsv

 

How can I build the path variable inside the Hive script so that I can load data files from that path?

I want to pass the day as a parameter and read files from the HDFS path /a/b/c/d_YYYY-MM-DD/

Thanks in advance.

PS: I am using the latest CDH 5.0.1.

Accepted Solution

Re: How to build an HDFS path to load data from?


Sorry for the bother ;)

The issue has been solved. The error was caused by several people working concurrently in the same folder, so the "file not found" message did make sense.
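
For anyone who lands here with the same question, here is a minimal sketch of one way to pass the day in from outside, assuming the Hive CLI is on the PATH. The wrapper itself, the file name load_day.hql and the variable name day are made up for illustration; the path and the table are the ones from the question. In `set p="...";` the quotes apparently become part of the value, which would explain why wrapping ${hiveconf:p} in a second pair of quotes fails to parse, so the sketch passes the value unquoted and quotes it exactly once in the statement.

```shell
#!/bin/sh
# Hypothetical wrapper; DAY, load_day.hql and the variable name "day" are
# illustrative and not taken from the original post.
DAY="${1:-2013-07-08}"

# Write the Hive script. The quoted 'EOF' stops the shell from expanding
# ${hiveconf:day}, so Hive does the substitution itself at run time.
cat > load_day.hql <<'EOF'
LOAD DATA INPATH '/a/b/c/d_${hiveconf:day}/data.tsv' OVERWRITE INTO TABLE data_staging;
EOF

# Call Hive only where the CLI is actually installed.
if command -v hive >/dev/null 2>&1; then
  hive --hiveconf day="$DAY" -f load_day.hql
fi
```

Called as `sh load_day.sh 2014-06-01` (the script name is again hypothetical), this would load /a/b/c/d_2014-06-01/data.tsv. Keep in mind that LOAD DATA INPATH moves the source file into the table's location, so a second load of the same file, by you or by a colleague working in the same folder, will not find it anymore.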
