Member since: 06-08-2017
Posts: 1049
Kudos Received: 518
Solutions: 312
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 11304 | 04-15-2020 05:01 PM |
|  | 7198 | 10-15-2019 08:12 PM |
|  | 3172 | 10-12-2019 08:29 PM |
|  | 11678 | 09-21-2019 10:04 AM |
|  | 4408 | 09-19-2019 07:11 AM |
03-02-2018
10:28 AM
2 Kudos
@Simon Jespersen
You have two forward slashes in your curl call after https://localhost:9091; use a single slash and run your curl call again:
curl -k -i -H 'Content-Type: application/json' -X PUT -d '{"id":"cdb54c9a-0158-1000-5566-c45ca9692f85","state":"RUNNING"}' https://localhost:9091/nifi-api/flow/process-groups/a9d5c45f-015b-1000-0000-00006d9844d3
If you are still facing issues, follow the steps below to start/stop process groups on Kerberized HDF 2.1.1. When HDF is Kerberized, we need to pass an access token with the curl API call.
Steps to start/stop a process group:
1. First do a kinit on your NiFi node:
bash$ kinit
2. Check the validity of the Kerberos ticket and make sure your ticket is valid:
bash$ klist
3. Now create an access token:
bash$ token=`curl -k -X POST --negotiate -u : https://localhost:9091/nifi-api/access/kerberos`
4. Use the created token in your curl call to start the process group:
bash$ curl -k --header "Authorization: Bearer $token" -i -H 'Content-Type: application/json' -X PUT -d '{"id":"cdb54c9a-0158-1000-5566-c45ca9692f85","state":"RUNNING"}' https://localhost:9091/nifi-api/flow/process-groups/cdb54c9a-0158-1000-5566-c45ca9692f85
5. Use the created token in your curl call to stop the process group:
bash$ curl -k --header "Authorization: Bearer $token" -i -H 'Content-Type: application/json' -X PUT -d '{"id":"cdb54c9a-0158-1000-5566-c45ca9692f85","state":"STOPPED"}' https://localhost:9091/nifi-api/flow/process-groups/cdb54c9a-0158-1000-5566-c45ca9692f85
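If you do this often, the two calls are easy to wrap in a small script. A minimal sketch, assuming the same Kerberized setup as above; the host, port, and the file name pg-state.sh are placeholders for your environment:

#!/bin/bash
# pg-state.sh -- set a NiFi process group to RUNNING or STOPPED via the REST API.
# Usage: ./pg-state.sh <process-group-id> <RUNNING|STOPPED>
# Requires a valid Kerberos ticket (run kinit first).
NIFI_URL="https://localhost:9091"   # adjust to your NiFi node
PG_ID="$1"
STATE="$2"
# Fetch a fresh access token using the Kerberos ticket
TOKEN=$(curl -s -k -X POST --negotiate -u : "$NIFI_URL/nifi-api/access/kerberos")
# PUT the desired state to the flow endpoint for the process group
curl -k -i -H "Authorization: Bearer $TOKEN" \
     -H 'Content-Type: application/json' \
     -X PUT \
     -d "{\"id\":\"$PG_ID\",\"state\":\"$STATE\"}" \
     "$NIFI_URL/nifi-api/flow/process-groups/$PG_ID"

For example: ./pg-state.sh cdb54c9a-0158-1000-5566-c45ca9692f85 STOPPED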
02-28-2018
12:46 PM
2 Kudos
@Gayathri Devi
Could you try the query below? We read the 2018-02-27T02:00 value and convert it to a timestamp value.
Query:
hive> select from_unixtime(unix_timestamp('2018-02-27T02:00',"yyyy-MM-dd'T'HH:mm"),'yyyy-MM-dd HH:mm:ss');
+----------------------+--+
| _c0 |
+----------------------+--+
| 2018-02-27 02:00:00 |
+----------------------+--+

(or)

Using the regexp_replace function, we can replace the T in your timestamp value:
hive> select regexp_replace('2018-02-27T02:00','T',' ');
+-------------------+--+
| _c0 |
+-------------------+--+
| 2018-02-27 02:00 |
+-------------------+--+

Then use the concat function to append the missing :00, turning the above value into a valid Hive timestamp:
hive> select concat(regexp_replace('2018-02-27T02:00','T',' '),":00");
+----------------------+--+
| _c0 |
+----------------------+--+
| 2018-02-27 02:00:00 |
+----------------------+--+
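The same conversion works on a table column rather than a literal; a minimal sketch, assuming a table events with a string column event_ts holding values like 2018-02-27T02:00 (the table and column names are hypothetical):

hive> select cast(from_unixtime(unix_timestamp(event_ts,"yyyy-MM-dd'T'HH:mm"),'yyyy-MM-dd HH:mm:ss') as timestamp) event_time from events;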
02-28-2018
03:39 AM
1 Kudo
@Gayathri Devi
You can use the INPUT__FILE__NAME virtual column (it gives the input file name for every row of the table), construct your query around it, and then store the results of that query into the final table. You need to create a temp table and keep your akolp9app1a_170905_0000.txt file in that table's location. Then use:
hive> select INPUT__FILE__NAME from table; //this statement returns your akolp9app1a_170905_0000.txt file name
+---------------------------------------------------------------------------------+--+
| input__file__name |
+---------------------------------------------------------------------------------+--+
| /apps/hive/warehouse/sales/akolp9app1a_170905_0000.txt |
+---------------------------------------------------------------------------------+--+

You can then use all the string functions, like substring, on the input__file__name field and extract the hostname and date fields from it:
hive> select substring(INPUT__FILE__NAME,20,30) hostname,substring(INPUT__FILE__NAME,40,50) `date` from table;
Then you can have a final table into which you insert the hostname and date values from the above select statement:
hive> insert into finaltable select substring(INPUT__FILE__NAME,20,30) hostname,substring(INPUT__FILE__NAME,40,50) `date` from table;
For more reference:
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+VirtualColumns
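Fixed substring offsets will break if the directory depth or file-name length ever changes, so a regex is usually sturdier. A minimal sketch, assuming file names shaped like the akolp9app1a_170905_0000.txt above, i.e. <hostname>_<yymmdd>_<sequence>.txt (finaltable is hypothetical):

hive> insert into finaltable
    > select regexp_extract(INPUT__FILE__NAME,'([^/]+)_(\\d{6})_\\d+\\.txt$',1) hostname,
    >        regexp_extract(INPUT__FILE__NAME,'([^/]+)_(\\d{6})_\\d+\\.txt$',2) `date`
    > from table;

The first capture group grabs everything between the last / and the date, and the second grabs the six-digit date, regardless of how deep the file sits in the warehouse path.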
02-28-2018
02:14 AM
@Jyoti Ambi Instead of doing all of this with a ConvertRecord processor, choose either of the methods below.
Method 1:
ExecuteSQL processor properties:
SQL Query: select MAX(CREATE_DATE) CREATE_DATE from your table
Then use a ConvertAvroToJSON processor to convert the Avro data to JSON. Your data would then look like this:
[{"CREATE_DATE": "2018-10-12 09:09:09"}]
Then use an EvaluateJsonPath processor to extract the create_date value as the flowfile content. The output flowfile content from EvaluateJsonPath would be:
2018-10-12 09:09:09
Then you can use a PutFile processor to store your file into a local directory.
Flow: ExecuteSQL --> ConvertAvroToJSON --> EvaluateJsonPath --> PutFile
(or)
Method 2:
If you want a header when keeping the file in your directory, then in the EvaluateJsonPath processor change the Destination property to flowfile-attribute, then add a ReplaceText processor (a sketch of its configuration follows below). We create new flowfile content by keeping CREATE_DATE as the header and, on a new line, our create_date attribute value (i.e. 2018-10-12 09:09:09).
Output:
CREATE_DATE
2018-10-12 09:09:09
Then use a PutFile processor to store the above file locally.
Flow: ExecuteSQL --> ConvertAvroToJSON --> EvaluateJsonPath (Destination as flowfile-attribute) --> ReplaceText --> PutFile
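A minimal sketch of the ReplaceText configuration for Method 2, assuming EvaluateJsonPath stored the value in an attribute named create_date (the attribute name is whatever you used as the property name in EvaluateJsonPath):

Replacement Strategy: Always Replace
Replacement Value:
CREATE_DATE
${create_date}

With Always Replace, the incoming content is discarded and the two-line Replacement Value (the header plus the attribute resolved through expression language) becomes the new flowfile content.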
02-27-2018
03:37 AM
@Jyoti Ambi I think your AvroReader configs look correct; in the CSVRecordSetWriter, change the configs as described here (a sketch of the likely settings follows below). I have set the Schema Access Strategy to inherit the record schema, since we are inheriting the schema from the content of the flowfile. I haven't mentioned any Schema Registry value because we don't want to fetch the schema from any registry: the schema is available with the record, so we inherit it. If you are still having issues, share some sample data with us, say 10 records in CSV (with header) or JSON format, so that we can recreate your scenario and help you solve your issue.
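A minimal sketch of the settings described above (property names as in stock NiFi; adjust to your version):

CSVRecordSetWriter:
Schema Access Strategy: Inherit Record Schema
Include Header Line: true

AvroReader:
Schema Access Strategy: Use Embedded Avro Schema

Use Embedded Avro Schema works here because ExecuteSQL embeds the schema in the Avro it emits, so no registry lookup is needed.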
02-26-2018
08:39 PM
@YoungHeon Kim The ExecuteSQL processor's SQL select query property supports expression language, so you can use:
select * from tmp_table where std_date='${now():format('yyyy-MM-dd')}'
This query is equivalent to:
select * from tmp_table where std_date='2018-02-26'
For more details refer to https://nifi.apache.org/docs/nifi-docs/html/expression-language-guide.html#now
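The same expression-language chain can be shifted for relative dates. A small sketch for yesterday's date: now() is converted to epoch milliseconds, one day (86400000 ms) is subtracted, and the result is reformatted:

select * from tmp_table where std_date='${now():toNumber():minus(86400000):format('yyyy-MM-dd')}'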
02-23-2018
01:52 PM
@Jyoti Ambi
1. The ExecuteSQL processor always returns its results in Avro format, so once you get results in Avro you need to use a ConvertRecord processor:
Record Reader --> AvroReader //reads the incoming Avro-format flowfile contents
Record Writer --> CSVRecordSetWriter //writes the output results in CSV format
Then use a PutFile processor to store the success flowfiles output by ConvertRecord. To configure the ConvertRecord processor, please refer to the links below:
https://blogs.apache.org/nifi/entry/record-oriented-data-with-nifi
https://community.hortonworks.com/articles/115311/convert-csv-to-json-avro-xml-using-convertrecord-p.html
https://community.hortonworks.com/questions/167066/how-to-split-the-csv-files-into-multiple-files-bas.html
2. The Wait and Notify processors work together: the Wait processor routes incoming flowfiles to the 'wait' relationship until a matching release signal is stored in the distributed cache by a corresponding Notify processor. When a matching release signal is identified, the waiting flowfile is routed to the 'success' relationship, with attributes copied from the flowfile that produced the release signal in the Notify processor. For how the Wait and Notify processors work in practice, see:
http://ijokarumawak.github.io/nifi/2017/02/02/nifi-notify-batch/
02-21-2018
05:52 AM
1 Kudo
@Gopal Mehakare
To execute Linux commands you can use either the ExecuteStreamCommand processor, which accepts incoming connections, or the ExecuteProcess processor, which won't accept incoming connections, depending on your requirements (a sketch of both configurations follows below).
ExecuteStreamCommand configs: let's say your input flowfile content is as follows:
hi
hcc
nifi
The output from the ExecuteStreamCommand processor would be just hi, because we are executing the bash command head -1 on the input flowfile content, and the output stream relationship will transfer hi as the new flowfile content.
Output: hi
ExecuteProcess configs: the success relationship flowfile content will have just hi in it, as we are executing echo, and this processor can run on its own (no need for any incoming connections).
Output: hi
For more reference:
https://community.hortonworks.com/questions/150122/zip-folder-using-nifi.html
https://stackoverflow.com/questions/42443101/nifi-how-to-reference-a-flowfile-in-executestreamcommand
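A minimal sketch of the two configurations described above (property names from stock NiFi; the binary paths may differ on your system):

ExecuteStreamCommand:
Command Path: /usr/bin/head
Command Arguments: -1

ExecuteProcess:
Command: echo
Command Arguments: hi

ExecuteStreamCommand pipes the incoming flowfile content to the command's stdin and writes its stdout to the output stream relationship; ExecuteProcess simply captures the command's stdout as new flowfiles on a schedule.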
02-21-2018
12:04 AM
2 Kudos
@Mark You need to use the Regex SerDe while creating the Hive table, with a regex whose capture groups match the fields you want: each field you need must be captured in its own group. Some references on how to create Regex SerDe tables:
https://community.hortonworks.com/articles/58591/using-regular-expressions-to-extract-fields-for-hi.html
https://stackoverflow.com/questions/31008371/hive-using-regexserde-to-define-input-format
https://stackoverflow.com/questions/9102184/regex-for-access-log-in-hive-serde
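As a minimal sketch, assuming log lines shaped like host level message (a hypothetical three-field layout; swap in your own pattern and columns):

hive> CREATE TABLE raw_logs (host STRING, level STRING, message STRING)
    > ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
    > WITH SERDEPROPERTIES ("input.regex" = "(\\S+) (\\S+) (.*)")
    > STORED AS TEXTFILE;

Each capturing group in input.regex maps, in order, to a column of the table; lines that don't match the regex come back as NULLs.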
02-18-2018
04:16 PM
@Mark
Since you are trying to insert values that have a semicolon (;) in them, Hive treats the semicolon as the end of the statement even if you escape it with a backslash (\). To insert values containing a semicolon, use the Unicode escape \u003B for the semicolon in your insert values statement, and a backslash to escape /, space, and ).
Insert statement:
hive> insert into semicolon values ('Mozilla\/5\.0\ \(iPhone\u003B\ CPU\ iPhone\ OS\ 5_0\)');
hive> select * from semicolon;
+------------------------------------------+--+
| a |
+------------------------------------------+--+
| Mozilla/5.0 (iPhone; CPU iPhone OS 5_0) |
+------------------------------------------+--+

(or)

Keep your data file in an HDFS directory and create the semicolon table with a string datatype, pointing at that HDFS directory.
Semicolon table reading from the HDFS directory:
hive> select * from semicolon;
+------------------------------------------+--+
| a |
+------------------------------------------+--+
| Mozilla/5.0 (iPhone; CPU iPhone OS 5_0) |
+------------------------------------------+--+

The results will be the same either way.
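A minimal sketch of the second (HDFS directory) option described above, assuming the raw file sits in a directory such as /tmp/semicolon_data (the path is a placeholder):

hive> create external table semicolon (a string)
    > location '/tmp/semicolon_data';

Since the file is read as-is, no escaping is needed: the semicolons inside the data never pass through the statement parser.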