Created on 05-16-2014 05:25 AM - edited 09-16-2022 01:59 AM
Hi,
I have a 'simple' use case (at least I thought it was simple 😉).
I want to call a Hive script with a parameter in the format YYYY-MM-DD. Inside the script I want to load data from the HDFS path /a/b/c/d_${parameter}.
How can I build the path dynamically so that I can use it in LOAD DATA INPATH '...' OVERWRITE INTO TABLE data_staging; ?
Even setting the path in a (hardcoded) variable and using that as the INPATH doesn't work.
I tried:
set p="/a/b/c/d_2013-07-08/data.tsv";
LOAD DATA INPATH "${hiveconf:p}" OVERWRITE INTO TABLE data_staging;
=> Error while compiling statement: FAILED: ParseException line 1:19 mismatched input '/' expecting INTO near '""' in load statement
If I remove the quotes around the parameter placeholder I get an error, too: file not found, even though the file does exist in HDFS. I checked it many times.
set p="/a/b/c/d_2013-07-08/data.tsv";
LOAD DATA INPATH ${hiveconf:p} OVERWRITE INTO TABLE data_staging;
=> Error while compiling statement: FAILED: SemanticException Line 1:17 Invalid path '"/a/b/c/d_2013-07-08/data.tsv"': No files matching path hdfs://nameservice1/a/b/c/d_2013-07-08/data.tsv
How can I build the path variable inside the Hive script so that I can load data files from that path?
I want to pass the day as a parameter and read files from the HDFS path /a/b/c/d_YYYY-MM-DD/
Thanks in advance.
PS: I'm using the latest CDH 5.0.1.
Created 05-16-2014 11:34 AM
Sorry for bothering 😉
The issue has been solved. The error was caused by several people working concurrently in the same folder, so the "file not found" message did make sense.
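For anyone landing here with the same question, below is a minimal sketch of the parameterised load. It rests on assumptions that are not part of the original post: the script is run with the Hive CLI, the day is passed in as a hiveconf variable named day, and the file name load_staging.hql is made up for illustration. Hive substitutes ${hiveconf:...} as plain text before parsing, and set keeps any quotes as part of the value. That is why "${hiveconf:p}" expanded to a doubly quoted string and failed to parse, while the unquoted form parsed fine. Keeping the quotes in the statement and out of the value avoids the problem.

```sql
-- load_staging.hql (hypothetical file name)
-- Assumed invocation from the shell:
--   hive --hiveconf day=2013-07-08 -f load_staging.hql

-- Variant 1: build the path from the day parameter.
-- The placeholder is replaced textually, so it can sit inside the quoted path.
LOAD DATA INPATH '/a/b/c/d_${hiveconf:day}/data.tsv'
OVERWRITE INTO TABLE data_staging;

-- Variant 2: hardcoded variable, as in the question.
-- No quotes in the value; the quotes go around the placeholder instead.
set p=/a/b/c/d_2013-07-08/data.tsv;
LOAD DATA INPATH '${hiveconf:p}' OVERWRITE INTO TABLE data_staging;
```

Only one of the two variants is needed. Note that LOAD DATA INPATH without LOCAL moves the source files into the table's location, so after a successful load the file is no longer at the source path. In a folder shared by several people, that can make a later "file not found" look spurious when it is not.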