
NiFi FTP to HDFS to Hive :: problems in Hive

Hello. I'm creating a flow that takes files (*.csv.bz2) from an FTP server, puts them in HDFS, and then loads them into Hive: first into a temporary table, and after that into a partitioned table (ORC format).

Summary:

1) ListFTP

2) RouteOnAttribute

3) FetchFTP

4) UpdateAttribute

5) PutHDFS ----- At this point my flow works well and the files are in HDFS -----

6) ListHDFS (on the path where step 5 stored the *.csv.bz2 files)

7) RouteOnAttribute

8) ReplaceText

9) PutHiveQL

Image:

12262-captura-de-pantalla-2017-02-07-a-las-214545.jpg

My problem appears when I try to load the files into Hive. First I load the file into a temporary table:

load data inpath "${path}/${filename}" overwrite into table staging.callis_ccm_test

Next I have to load the data from that table into the final table:

insert into table staging.callis_ccm_test_hist partition (day=${filename:substring(12,16)}${filename:substring(17,19)}${filename:substring(20,22)} , hour=${filename:substring(23,25)}) select * from staging.callis_ccm_test
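To make the partition expression easier to check, here is a minimal Python sketch of what the substring calls extract. It assumes that substring(start, end) in the NiFi Expression Language is zero-based with an exclusive end index, as in Java. The filename is hypothetical, since the post does not show a real one.

```python
from typing import Tuple


def partition_values(filename: str) -> Tuple[str, str]:
    """Mimic the substring calls used in the insert statement."""
    # day = substring(12,16) + substring(17,19) + substring(20,22)
    day = filename[12:16] + filename[17:19] + filename[20:22]
    # hour = substring(23,25)
    hour = filename[23:25]
    return day, hour


# Hypothetical filename, chosen only so that the offsets line up.
example = "callis_ccm1_2017-02-07_21.csv.bz2"
day, hour = partition_values(example)
print(f"day={day}, hour={hour}")
```

Running the same check against a real filename is a quick way to confirm that the offsets in the expression match the actual naming pattern.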

So the problem is that ReplaceText does not accept two statements in the Replacement Value (the place where I put the SQL statement).

ReplaceText images:

12263-captura-de-pantalla-2017-02-09-a-las-095901.jpg

12264-captura-de-pantalla-2017-02-09-a-las-100139.jpg

The error:

12265-eeecaptura-de-pantalla-2017-02-09-a-las-103456.jpg

Note:

The load statement works fine, but the insert statement does not work and the flow file goes to the "retry" relationship. I checked the path in HDFS and it is OK; the file arrives correctly.

I thought the solution would be to create another ReplaceText and PutHiveQL for the insert statement, but the insert statement must be executed for the same flow file that ran the load statement, and before the next flow file executes its own load statement (which overwrites the table).

Error image when ReplaceText has two statements:

This procedure is easy in Bash or Java, but in NiFi it is giving me trouble. Can someone help me?

2 REPLIES

Sorry (update): the error with the two statements in ReplaceText is:

12267-eeeecaptura-de-pantalla-2017-02-09-a-las-104947.jpg

Hi @MARTIN GATTO,

The issue is that the PutHiveQL processor does not support multi-statement operations in the version you are using (it will be possible in the next release of Apache NiFi). In your case, I can see two potential options:

-> ReplaceText (to create the load statement) -> PutHiveQL -> ReplaceText (to create the insert statement) -> PutHiveQL

or

-> ReplaceText (as you did) -> SplitText (line count = 1, to have one flow file per line) -> PutHiveQL

Be careful with the last one: it assumes single-threaded processors and sequential processing, so that the order of the flow files is preserved.
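As a rough illustration of the second option, the Python sketch below splits a two-line text into one statement per line and keeps the original order, which is what the load-then-insert sequence depends on. It does not call NiFi or Hive. Skipping blank lines is a choice made for this sketch, not a claim about how SplitText behaves, and the shortened insert statement is only a placeholder.

```python
from typing import List


def split_statements(text: str) -> List[str]:
    """Return one statement per non-empty line, in the original order."""
    return [line.strip() for line in text.splitlines() if line.strip()]


# Two statements in one text, one per line, as ReplaceText would emit them.
# The insert statement is abbreviated here for readability.
two_statements = (
    'load data inpath "${path}/${filename}" overwrite into table staging.callis_ccm_test\n'
    "insert into table staging.callis_ccm_test_hist select * from staging.callis_ccm_test\n"
)

statements = split_statements(two_statements)

# Handle the statements strictly one after another: the load must finish
# before the insert reads from the staging table.
for position, statement in enumerate(statements, start=1):
    print(position, statement)
```

If several threads consumed these pieces concurrently, the insert could run before the load, which is why the ordering caveat above matters.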

Hope this helps.