Apologies in advance, I am just starting out in the land of Big Data, Hive, Cloudera ....
Possibly to make things worse, until now, I have only used DAS for queries.
I would like to know if I can use a text file as the source for the WHERE clause in my query, i.e. read a string (actually a phone number) from each row of the file and query using that string, repeating the query for every row in the file.
The input file is actually semicolon-delimited, although I would be happy to awk it if necessary to create an input file with one phone number per line.
As you would expect, I have googled, but that only turned up a discussion of using OPENROWSET in SQL. I couldn't find a Hive alternative, and my results repeatedly mention OPENQUERY (I am wondering if linking OPENROWSET to HiveQL has sent me in the wrong direction).
Is it possible to use a file in this manner from HiveQL?
I guess I would have to leave the comfort of the DAS GUI so that I can reference the file location from command line. As you can tell, I would appreciate some detail on how to achieve this!
A further warning, the terminology and concepts of Hadoop are new to me as well. I have effectively considered Big Data to be an SQL database, and so introducing new ideas may need some hand holding.
In Hive you would probably have to create an external table pointing to your text file,
and then JOIN your source table to the newly created external table.
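As a rough sketch of that approach, assuming the file has already been reduced to one phone number per line and copied into its own HDFS directory: the path, the table names (`phone_filter`, `call_records`) and the column name (`phone_number`) below are all made up for illustration, so substitute your own. Note that LOCATION has to be a directory rather than a single file, and Hive will read every file it finds there.

```sql
-- Hypothetical external table over the uploaded text file(s).
-- Each line of the file becomes one row with a single STRING column.
CREATE EXTERNAL TABLE IF NOT EXISTS phone_filter (
  phone_number STRING
)
ROW FORMAT DELIMITED
  LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION '/user/your_user/phone_filter/';  -- assumed HDFS directory

-- Keep only the rows of the (hypothetical) source table whose phone
-- number appears in the file. This replaces running one query per number.
SELECT c.*
FROM call_records c
JOIN phone_filter f
  ON c.phone_number = f.phone_number;
```

If the file can contain the same number more than once, a plain JOIN will return the matching rows once per duplicate; Hive's `LEFT SEMI JOIN` in place of `JOIN` avoids that. Because the table is external, dropping it later removes only the table definition and leaves the file in place.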
A colleague has just suggested the same thing. A shame there is no way to use the file directly, but I very much appreciate your answer.