Member since: 06-08-2017
Posts: 1049
Kudos Received: 518
Solutions: 312

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 11283 | 04-15-2020 05:01 PM |
|  | 7176 | 10-15-2019 08:12 PM |
|  | 3163 | 10-12-2019 08:29 PM |
|  | 11623 | 09-21-2019 10:04 AM |
|  | 4396 | 09-19-2019 07:11 AM |
04-03-2018
10:51 AM
@Faheem Shoukat The Additional Where Clause property was introduced in NiFi 1.4.0. Since you are using NiFi 1.3.0, that is why you cannot find the Additional Where Clause property on your QueryDatabaseTable processor. Possible ways to achieve where-clause functionality:

1. Upgrade your NiFi instance to a newer version, or
2. Use the ExecuteSQL processor (add the where clause in your query), store the state in Hive/HBase, and pull the state back on each incremental run using ExecuteSQL, or
3. Use a combination of GenerateTableFetch (which supports the Additional Where Clause property) + Remote Process Group + ExecuteSQL processors.

NiFi 1.4 QueryDatabaseTable processor configs: once you have NiFi 1.4, click Configure on the QueryDatabaseTable processor and you will see the Additional Where Clause property. Below is the reference link for the Jira ticket covering the additional where clause property:
https://issues.apache.org/jira/browse/NIFI-4257
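Option 2 above can be sketched outside NiFi. This is a minimal, hypothetical Python illustration (sqlite3 stands in for the source database, and a dict stands in for the state you would keep in Hive/HBase; all table and column names are made up):

```python
# Sketch of "ExecuteSQL + stored state" incremental pulls: keep the
# last-seen max id as state and use it as the where clause next run.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, payload TEXT)")
conn.executemany("INSERT INTO events VALUES (?, ?)",
                 [(1, "a"), (2, "b"), (3, "c")])

state = {"max_id": 0}   # stand-in for the state row kept in Hive/HBase

def incremental_pull(conn, state):
    # The where clause plays the role of the Additional Where Clause
    rows = conn.execute(
        "SELECT id, payload FROM events WHERE id > ? ORDER BY id",
        (state["max_id"],)).fetchall()
    if rows:
        state["max_id"] = rows[-1][0]   # persist the new high-water mark
    return rows

first = incremental_pull(conn, state)    # pulls ids 1..3
conn.execute("INSERT INTO events VALUES (4, 'd')")
second = incremental_pull(conn, state)   # pulls only the new id 4
```

The same pattern applies regardless of which store holds the state; only rows newer than the stored high-water mark are fetched on each run.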
04-02-2018
06:26 AM
@Vivek Singh Did the answer help? If so, please click the Accept button below to accept it; that helps community users find solutions to these kinds of issues quickly.
04-02-2018
06:23 AM
3 Kudos
@Vivek Singh
We can add a user-defined header using the ReplaceText processor instead of writing any script. For this case, in your CSVRecordSetWriter controller service change the property below to false:

Include Header Line: false

Now the header line is no longer included in the output flowfile; then use a ReplaceText processor to add our own header to the flowfile content.

ReplaceText configs:
Search Value: (?s)(^.*$)
Replacement Value: "_id", "name","time" to "id", "browser_name","duration"
Character Set: UTF-8
Maximum Buffer Size: 1 MB (adjust this to the file size we get after the ConvertRecord processor)
Replacement Strategy: Prepend (we prepend the header line above to the content)
Evaluation Mode: Entire text

Input content (sample content from the ConvertRecord processor):
"1","foo","12:00AM" to "123","Mozilla","1hr"

Output content (ReplaceText prepends our header to the flowfile content):
"_id", "name","time" to "id", "browser_name","duration"
"1","foo","12:00AM" to "123","Mozilla","1hr"

If the answer helped resolve your issue, click the Accept button below to accept it; that helps community users find solutions to these kinds of issues quickly.
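The ReplaceText mechanics can be sketched outside NiFi. This is a minimal Python illustration (not NiFi code): Evaluation Mode "Entire text" means the pattern `(?s)(^.*$)` matches the whole flowfile content, and the Prepend strategy puts the replacement value in front of the match; the explicit `"\n"` here stands in for the line break between header and content:

```python
# Simulate ReplaceText with Replacement Strategy = Prepend,
# Evaluation Mode = Entire text, Search Value = (?s)(^.*$).
import re

content = '"1","foo","12:00AM"\n"123","Mozilla","1hr"'   # sample rows
header = '"_id","name","time"'                            # our header line

# Prepend == replacement value, a newline, then the matched text
result = re.sub(r"(?s)^(.*)$", header + "\n" + r"\1", content, count=1)

print(result)   # header is now the first line of the content
```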
04-01-2018
05:51 PM
@Yassine
Specify the type as BIGINT, which is the equivalent of a long type; Hive does not have a LONG data type.

hive> alter table <table> change <col> <col> bigint;

For more reference:
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-IntegralTypes(TINYINT,SMALLINT,INT/INTEGER,BIGINT)
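As a concrete sketch (hypothetical table and column names), note that CHANGE takes the old column name, the new column name, and the new type, so the column name is repeated when only the type changes:

```sql
-- Hypothetical table/column names; only the type changes, so the
-- column name appears twice (old name, new name, new type).
ALTER TABLE web_logs CHANGE duration_ms duration_ms BIGINT;

-- Verify the new type:
DESCRIBE web_logs;
```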
04-01-2018
02:41 PM
@siva vulli Did the answer resolve your issue? If so, click the Accept button below to accept it and close the thread; that helps community users find solutions to these kinds of issues quickly.
04-01-2018
02:40 PM
@Mark McGowan Did the answer resolve your issue? If so, click the Accept button below to accept it and close the thread; that helps community users find solutions to these kinds of issues quickly.
04-01-2018
02:30 PM
@Omer Shahzad With the PutHiveQL processor we can execute Hive DDL/DML statements. Since you need to copy data between Hive tables, prepare the Hive statements and execute them using the PutHiveQL processor.

Example: say I need to copy 5 source tables in Hive to a target database in Hive. For this case, keep all your insert/insert overwrite statements in a GenerateFlowFile processor.

GenerateFlowFile processor configs: I have 5 target tables with different partition columns (dt, yr, ts), and the source tables in Hive have the same partition columns. In the statements below I select a specific partition and insert the data into the final table for that partition.

Hive DML statements:

insert into target_db_name.target_table_name1 partition(dt) select * from source_db_name.tablename1 where dt='partition_value'
insert into target_db_name.target_table_name2 partition(yr) select * from source_db_name.tablename2 where yr='partition_value'
insert into target_db_name.target_table_name3 partition(dt) select * from source_db_name.tablename3 where dt='partition_value'
insert into target_db_name.target_table_name4 partition(ts) select * from source_db_name.tablename4 where ts='partition_value'
insert into target_db_name.target_table_name5 partition(dt) select * from source_db_name.tablename5 where dt='partition_value'

Schedule this processor with either a cron or timer-driven schedule to run daily, or as your use case requires. If you want the partition column value to change based on time, use expression language to set it dynamically. Example:
dt='${now():format("yyyy-MM-dd")}'
output:
dt='2018-04-01'
Refer to the link below for more date manipulations in NiFi.

SplitText processor: the GenerateFlowFile processor above puts all the statements into one flowfile, so use a SplitText processor to split the flowfile content line by line into new flowfiles. Since we have 5 insert statements, we get 5 flowfiles, each containing one insert statement.

PutHiveQL processor: PutHiveQL expects the content of each incoming flowfile to be the HiveQL command to execute. Since our flowfile content is an insert statement, connect the splits relationship of SplitText to PutHiveQL; each statement will then be executed.

With this method, if you need to add a new table to copy from source to target, just add the appropriate insert statement to the GenerateFlowFile processor and reuse the same flow. In addition, if you want to distribute the flowfiles, use a Remote Process Group after the SplitText processor. The following link explains how to configure a Remote Process Group:
https://community.hortonworks.com/articles/16461/nifi-understanding-how-to-use-process-groups-and-r.html

Flow:
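The GenerateFlowFile → SplitText handoff can be sketched in plain Python (illustrative only; `${today}` is a made-up placeholder standing in for the `${now():format("yyyy-MM-dd")}` expression, and the table names are invented):

```python
# Simulate: one GenerateFlowFile body, expression substitution,
# then SplitText emitting one statement (flowfile) per line.
from datetime import date

statements_body = (
    "insert into target_db.t1 partition(dt) "
    "select * from source_db.t1 where dt='${today}'\n"
    "insert into target_db.t2 partition(yr) "
    "select * from source_db.t2 where yr='${today}'"
)

# Substitute the date placeholder, as NiFi expression language would
body = statements_body.replace("${today}", date.today().isoformat())

# SplitText with Line Split Count = 1: one statement per flowfile
flowfiles = [line for line in body.split("\n") if line.strip()]

print(len(flowfiles))   # each element would go to PutHiveQL
```

Adding a sixth table to the copy job then only means adding one more line to the body; the split and execute stages stay unchanged.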
03-30-2018
02:52 PM
@Stefan Constantin
I tried with your sample record and I am able to get the CSV data as mentioned in the comment above. I have also attached my XML; you can save it, upload it to your NiFi instance, and compare the two to find which configs you are missing. Could you share screenshots of your ConvertRecord processor's Record Reader, Record Writer, and Avro Schema Registry configs?
03-30-2018
02:45 PM
@Stefan Constantin Since you are using ExecuteSQL to pull the data, the output flowfile is in Avro format with the schema embedded in the content, so we can use the embedded Avro schema as the Schema Access Strategy in the ConvertRecord processor. By using the embedded schema, we skip defining a Schema Registry in both the Reader and Writer controller services.

ConvertRecord processor configs:
Record Reader: AvroReader
AvroReader configs: (Schema Access Strategy: use embedded Avro schema)
Record Writer: CSVRecordSetWriter
CSVRecordSetWriter configs:

With these configs we read the Avro data and convert it to CSV; if you want a header, set the Include Header Line property to true. Flow XML for your reference: avro-to-csv.xml
03-30-2018
02:02 PM
@Stefan Constantin If you want a header with lower- and camel-case names, change the property below to false in the CSVRecordSetWriter controller service:

Include Header Line: false

Now the header line is no longer included in the output flowfile; then use a ReplaceText processor to add our own header to the file.

ReplaceText configs:
Search Value: (?s)(^.*$)
Replacement Value: inv_idn,inv_number,usr_mdf,invIssDat,invCliCod,invCliNam,invCli_RegCountry,invoiceClass,corCliCod,ofoChiNmb,bsnCtgDsc,bsnCtgCod,bsnCtgDspCod,invTyp,invStu,invOrgCod,cryCod,amnWthVat,amnWthVatInEur,mediaFeesInEur,kcFeesInEur
Character Set: UTF-8
Maximum Buffer Size: 1 MB (adjust this to the file size we get after the ConvertRecord processor)
Replacement Strategy: Prepend (we prepend the header line above to the content)
Evaluation Mode: Entire text

Input content (from the ConvertRecord processor):
247048764,181120060,15/03/2018 08:34:00 by LDL,15/03/2018,FUNDQ,FUNDQUEST,FR,CUS - assujetti in EU outside Lx,BNPAMFR,20173748,Fund Data Management,LIS,FDM,Credit Note,Validated,LU,EUR,"-7,543.23","-7,543.23",0,-7543.23

Output content (ReplaceText prepends our header):
inv_idn,inv_number,usr_mdf,invIssDat,invCliCod,invCliNam,invCli_RegCountry,invoiceClass,corCliCod,ofoChiNmb,bsnCtgDsc,bsnCtgCod,bsnCtgDspCod,invTyp,invStu,invOrgCod,cryCod,amnWthVat,amnWthVatInEur,mediaFeesInEur,kcFeesInEur
247048764,181120060,15/03/2018 08:34:00 by LDL,15/03/2018,FUNDQ,FUNDQUEST,FR,CUS - assujetti in EU outside Lx,BNPAMFR,20173748,Fund Data Management,LIS,FDM,Credit Note,Validated,LU,EUR,"-7,543.23","-7,543.23",0,-7543.23
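As a quick sanity check (done in Python, outside NiFi), the csv module confirms that the prepended header lines up with the data row and that the quoted "-7,543.23" values survive as single fields rather than splitting on their embedded commas:

```python
# Verify header/data field counts match and quoting is preserved.
import csv

header = ('inv_idn,inv_number,usr_mdf,invIssDat,invCliCod,invCliNam,'
          'invCli_RegCountry,invoiceClass,corCliCod,ofoChiNmb,bsnCtgDsc,'
          'bsnCtgCod,bsnCtgDspCod,invTyp,invStu,invOrgCod,cryCod,'
          'amnWthVat,amnWthVatInEur,mediaFeesInEur,kcFeesInEur')
row = ('247048764,181120060,15/03/2018 08:34:00 by LDL,15/03/2018,FUNDQ,'
       'FUNDQUEST,FR,CUS - assujetti in EU outside Lx,BNPAMFR,20173748,'
       'Fund Data Management,LIS,FDM,Credit Note,Validated,LU,EUR,'
       '"-7,543.23","-7,543.23",0,-7543.23')

fields = next(csv.reader([header]))   # 21 column names
values = next(csv.reader([row]))      # 21 values

print(len(fields), len(values))
print(values[17])   # the quoted amount stays one field
```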