Hi,
I have a 'simple' use case (at least I thought it was simple 😉).
I want to call a Hive script with a parameter in the format YYYY-MM-DD. Inside the script I want to load data from the HDFS path /a/b/c/d_${parameter}.
How can I build the path dynamically so that I can use it in LOAD DATA INPATH '...' OVERWRITE INTO TABLE data_staging; ?
Even setting the path in a variable (hardcoded) and using that variable as the INPATH doesn't work.
I tried:
set p="/a/b/c/d_2013-07-08/data.tsv";
LOAD DATA INPATH "${hiveconf:p}" OVERWRITE INTO TABLE data_staging;
=> Error while compiling statement: FAILED: ParseException line 1:19 mismatched input '/' expecting INTO near '""' in load statement
If I remove the quotes around the parameter placeholder, I get an error too: file not found, even though the file definitely exists in HDFS (I have checked it many times).
set p="/a/b/c/d_2013-07-08/data.tsv";
LOAD DATA INPATH ${hiveconf:p} OVERWRITE INTO TABLE data_staging;
=> Error while compiling statement: FAILED: SemanticException Line 1:17 Invalid path '"/a/b/c/d_2013-07-08/data.tsv"': No files matching path hdfs://nameservice1/a/b/c/d_2013-07-08/data.tsv
How can I build the path variable inside the Hive script so that I can load data files from that path?
I want to pass the day as a parameter and read files from the HDFS path /a/b/c/d_YYYY-MM-DD/
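For reference, here is a rough sketch of the kind of call I am aiming for. It rests on two assumptions: that `set` stores everything after the `=` literally, so the double quotes in my attempts ended up as part of the value, and that a value passed with `--hiveconf` is substituted as plain text into the script, even inside single quotes. The file name load_staging.hql and the variable name day are made up for illustration. The wrapper only writes the script and prints the command, it does not run Hive itself.

```shell
#!/bin/sh
# Hypothetical wrapper: writes a small Hive script and prints the hive call.
# The day is hardcoded here; in practice it could come from "$1".
day="2013-07-08"

# Quoted 'EOF' keeps the shell from expanding ${hiveconf:day} itself.
cat > load_staging.hql <<'EOF'
-- The placeholder sits inside single quotes; the value of "day" carries no quotes.
LOAD DATA INPATH '/a/b/c/d_${hiveconf:day}/data.tsv' OVERWRITE INTO TABLE data_staging;
EOF

# Pass the day on the command line instead of using "set" inside the script.
cmd="hive --hiveconf day=$day -f load_staging.hql"
echo "$cmd"
```

If the assumptions hold, running the printed command should make Hive read /a/b/c/d_2013-07-08/data.tsv. I have not verified this on CDH 5.0.1.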
Thanks in advance.
PS: I am using the latest CDH 5.0.1.