Member since: 06-08-2017
Posts: 1049
Kudos Received: 518
Solutions: 312
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 11137 | 04-15-2020 05:01 PM |
| | 7032 | 10-15-2019 08:12 PM |
| | 3076 | 10-12-2019 08:29 PM |
| | 11278 | 09-21-2019 10:04 AM |
| | 4193 | 09-19-2019 07:11 AM |
11-14-2017
08:08 AM
1 Kudo
@Sanaz Janbakhsh I tried with your JSON message and an EvaluateJsonPath processor with a payload property of $.payload, and the processor works as expected. Can you share screenshots of your processor configurations, the input to the processor, and your flow? That would be a great help in finding the root cause of your issue.
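For comparison, a minimal sketch of the configuration being described (the Destination value is an assumption; the payload property comes from this thread):
EvaluateJsonPath
    Destination : flowfile-attribute
    payload     : $.payload    //dynamic property; writes the value found at $.payload into a payload attribute on the flowfile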
11-14-2017
01:18 AM
@Tarek Elgamal, I don't think the corresponding parameters are in the nifi.properties file, and these params apply to each node, not to the whole cluster. The concurrent tasks assigned to your processors pull threads from the Maximum Timer Driven Thread Count pool. You can refer to the below community links to get more info about these parameters:
https://community.hortonworks.com/questions/104718/handlehttprequest-configure-to-process-millions-of.html
https://community.hortonworks.com/questions/140889/in-nifi-all-the-processors-in-the-data-flow-are-ex.html?childToView=140999#comment-140999
https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#system_properties
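As a worked illustration of how that pool is shared (the numbers here are assumptions, not from this thread): if Maximum Timer Driven Thread Count is 10 on a node and three processors are each configured with 4 Concurrent Tasks, the processors can request up to 12 threads, but at most 10 of them will ever run at once on that node.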
11-13-2017
09:19 PM
1 Kudo
@LOKASHIS RANA
As the log shows ORA-01840: input value not long enough for date format, try using a value like '1900-01-01 00:00:00.0' for the last value instead of 0, and run again with the below sqoop import statement:
sqoop import --connect jdbc:oracle:thin:system/system@192.168.XXX.XXX:1521:xe --username sonu --password sonu --table DEMO_ORDERS --target-dir '/user/root/orders/' --as-textfile --split-by CUSTOMER_ID --incremental lastmodified --check-column ORDER_TIMESTAMP --last-value '1900-01-01 00:00:00.0' --num-mappers 5 --fields-terminated-by ','
In addition, check what the data looks like in the ORDER_TIMESTAMP column of the source table. If it holds just a date, like 2011-01-01, then you need to pass a last-value in the same format as the data in the source.
Example: if the source table's ORDER_TIMESTAMP data looks like 2011-01-01, then use --last-value 1900-01-01 //the last value's format must match the data in the source table.
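If you are unsure of the column's format before choosing a last-value, a quick check against the source table can help (an illustrative Oracle query; the table and column names are taken from this thread):
SELECT ORDER_TIMESTAMP FROM DEMO_ORDERS WHERE ROWNUM <= 5;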
11-13-2017
04:19 PM
1 Kudo
Hi @Tarek Elgamal, click on the Global menu, then click the Controller Settings button; in the General tab you can control the threads running in the NiFi instance. In addition, you can refer to the below community link:
https://community.hortonworks.com/questions/76117/how-to-improve-nifi-concurrency.html
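For reference, a sketch of the two fields on that General tab (the values shown are assumed to be NiFi's defaults):
Maximum Timer Driven Thread Count : 10
Maximum Event Driven Thread Count : 5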
11-13-2017
03:37 PM
1 Kudo
@Sanaz Janbakhsh, Use the below spec; you will get the gatewayMetaDataList array as-is.
[{
"operation": "shift",
"spec": {
"payloadMetaData": {
"*": "&",
"applicationMetaData": {
"id": {
"entityType": "entityType",
"id": "ApplicationId"
},
"customerId": {
"entityType": "entityTypeCustomer",
"id": "CustomerId"
},
"subCustomerId": "subCustomerId",
"name": "SubCustomerName"
},
"deviceMetaData": {
"id": {
"entityType": "entityTypeDevice",
"id": "DeviceId"
},
"name": "DeviceName",
"deviceClass": "deviceClass",
"deviceEUI": "deviceEUI",
"appEUI": "appEUI"
},
"rawPayload": "rawPayload",
"fcount": "fcount",
"fport": "fport"
},
"payload": "payload"
}
}]
Flow:
1. JoltTransform //flatten the whole message except the gatewayMetaDataList array
2. EvaluateJsonPath //extract all the content into attributes on the flowfile; change the Destination property to flowfile-attribute
3. SplitJson //split the JSON on the gatewayMetaDataList array (see the sketch after this list)
4. JoltTransform //flatten the gatewayMetaDataList array
5. EvaluateJsonPath //extract all the content into attributes on the flowfile; change the Destination property to flowfile-attribute
6. AttributesToJSON //keep all the attribute names; change the Destination property to flowfile-content
You can use the below xml for reference and change the EvaluateJsonPath expressions as per your requirements. jolt-11-13.xml
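A sketch of the SplitJson step (step 3 above). JsonPath Expression is SplitJson's property name; the path itself is an assumption consistent with the spec above, which keeps gatewayMetaDataList at the top level:
SplitJson
    JsonPath Expression : $.gatewayMetaDataList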
11-10-2017
07:04 PM
1 Kudo
@sally sally Yeah, you can do that by using a ReplaceText processor with the Search Value property as
<details>\s*([\s\S]+.*)\n+\s+<\/details> //captures everything enclosed in the details tag as capture group 1
then in the Replacement Value:
<details>
${filename}
$1
</details>
You can customize the replacement value as per your needs. A sketch of the full processor configuration follows the example below.
Input:
<?xml version="1.0" encoding="UTF-8"?>
<service>
<Person>
<details>
<start>2017-10-22</start>
<id>*******</id>
<makeVersion>1</makeVersion>
<patch>patch</patch>
<parameter>1</parameter>
</details>
</Person>
</service>
Output:
<?xml version="1.0" encoding="UTF-8"?>
<service>
<Person>
<details>
1497701925152409
<start>2017-10-22</start>
<id>*******</id>
<makeVersion>1</makeVersion>
<patch>patch</patch>
<parameter>1</parameter>
</details>
</Person>
</service>
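A sketch of the ReplaceText configuration described above (the Replacement Strategy and Evaluation Mode values are assumptions; the search and replacement values come from this answer):
ReplaceText
    Replacement Strategy : Regex Replace
    Search Value         : <details>\s*([\s\S]+.*)\n+\s+<\/details>
    Replacement Value    : <details>
                           ${filename}
                           $1
                           </details>
    Evaluation Mode      : Entire text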
11-10-2017
02:53 PM
1 Kudo
@Sanaz Janbakhsh
Can you try the below spec once:
[{
"operation": "shift",
"spec": {
"payloadMetaData": {
"rawPayload": "rawPayload",
"fcount": "fcount",
"fport": "fport",
"applicationMetaData": {
"id": {
"entityType": "entityType",
"id": "id"
},
"customerId": {
"entityType": "entityTypeCustomer",
"id": "idCustomer"
}
},
"gatewayMetaDataList": {
"*": { //* means if you are having more than 1 message in array then it results array of mac addresses in it.
"mac": "mac"
}
},
"deviceMetaData": {
"id": {
"entityType": "entityTypeDevice",
"id": "DeviceId"
},
"name": "DeviceName",
"deviceClass": "deviceClass",
"deviceEUI": "deviceEUI",
"appEUI": "appEUI"
}
},
"payload": "payload"
}
}]
Input:
{
"payloadMetaData": {
"applicationMetaData": {
"id": {
"entityType": "APPLICATION",
"id": "7d11abf0-aa82-11e7-afd4-63e5a28ad7fa"
},
"customerId": {
"entityType": "CUSTOMER",
"id": "a84ab1a0-aa81-11e7-afd4-63e5a28ad7fa"
},
"subCustomerId": null,
"name": "TekTelic-Industrial"
},
"gatewayMetaDataList": [
{
"id": {
"entityType": "GATEWAY",
"id": "e5ca6840-aa81-11e7-afd4-63e5a28ad7fa"
},
"name": "ColinsGateway",
"mac": "647FDAFFFE00417C",
"latitude": 51.04697912502887,
"longitude": -114.06121730804443,
"altitude": null,
"rxInfo": {
"channel": 2,
"codeRate": "4/5",
"crcStatus": 1,
"dataRate": {
"modulation": "LORA",
"spreadFactor": 7,
"bandwidth": 125
},
"frequency": 902700000,
"loRaSNR": 6.8,
"mac": "647fdafffe00417c",
"rfChain": 0,
"rssi": -51,
"size": 21,
"time": "2017-11-01T21:51:27Z",
"timestamp": 1400148651,
"rsig": null
}
}
],
"deviceMetaData": {
"id": {
"entityType": "DEVICE",
"id": "61348ab0-ada2-11e7-ac06-63e5a28ad7fa"
},
"name": "TekTelic-Industrial-0017",
"deviceClass": "CLASS_A",
"deviceEUI": "647FDA0000000155",
"appEUI": "647FDA8000000155"
},
"fcount": 3706,
"fport": 10
}
}
Output:
{
"fcount" : 3706,
"fport" : 10,
"entityType" : "APPLICATION",
"id" : "7d11abf0-aa82-11e7-afd4-63e5a28ad7fa",
"entityTypeCustomer" : "CUSTOMER",
"idCustomer" : "a84ab1a0-aa81-11e7-afd4-63e5a28ad7fa",
"mac" : "647FDAFFFE00417C",
"entityTypeDevice" : "DEVICE",
"DeviceId" : "61348ab0-ada2-11e7-ac06-63e5a28ad7fa",
"DeviceName" : "TekTelic-Industrial-0017",
"deviceClass" : "CLASS_A",
"deviceEUI" : "647FDA0000000155",
"appEUI" : "647FDA8000000155"
}
In addition, if you want to flatten out all the elements of the array: extract all the JSON message contents by using an EvaluateJsonPath processor with Destination as flowfile-attribute, then use a SplitJson processor with JsonPath Expression as $.gatewayMetaDataList, then use a JoltTransform processor with the below spec:
[{
"operation": "shift",
"spec": {
"*": {
"*": "id-&",
"id": {
"*": "&"
},
"rxInfo": {
"*": "rxInfo-&",
"dataRate": {
"*": "dataRate-&"
}
}
}
}
}]
Right now you are flattening out the JSON array elements; then use an AttributesToJSON processor and keep all the attributes that you have extracted. This processor will create a new JSON message, flattened out, with each array element in it. (or) If you want to add all array elements to their respective keys, then use the below spec:
[{
"operation": "shift",
"spec": {
"payloadMetaData": {
"gatewayMetaDataList": {
"*": {
"*": "id-&",
"id": {
"*": "&"
},
"rxInfo": {
"*": "rxInfo-&",
"dataRate": {
"*": "dataRate-&"
}
}
}
}
}
}
}]
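If you take the attribute route, a sketch of the final AttributesToJSON step (the property names are NiFi's; the attribute list here is illustrative, use the names you actually extracted):
AttributesToJSON
    Attributes List : mac,fcount,fport,DeviceId,DeviceName
    Destination     : flowfile-content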
11-10-2017
01:36 AM
@swathi thukkaraju As you are using a string data type for the timestamp field, cast it to bigint or int as per your requirements and then from_unixtime will work. Possible outputs for your timestamp value 1465876799, which you can check in the hive (or) beeline shell:
hive> select from_unixtime(1465876799, 'yyyy-MM-dd');
2016-06-13
hive> select from_unixtime(CAST(1465876799000 as int), 'yyyy-MM-dd');
2010-12-21
hive> select from_unixtime(CAST(1465876799000 as bigint), 'yyyy-MM-dd');
48421-10-14
hive> select from_unixtime(CAST(1465876799000/1000 as BIGINT), 'yyyy-MM-dd');
2016-06-13
Error:
hive> select from_unixtime(CAST(1465876799000 as string), 'yyyy-MM-dd');
FAILED: SemanticException [Error 10014]: Line 1:7 Wrong arguments ''yyyy-MM-dd'': No matching method for class org.apache.hadoop.hive.ql.udf.UDFFromUnixTime with (string, string). Possible choices: _FUNC_(bigint) _FUNC_(bigint, string) _FUNC_(int) _FUNC_(int, string)
As you can see above, I cast 1465876799000 as string but it gives an error; the possible choices are bigint and int.
Possible query for your case:
val df = sqlContext.sql("select from_unixtime(cast(timestamp as bigint), 'yyyy-MM-dd') as ts from stamp")
(or) change the data type in the case class:
case class flight(display_id: Int, uuid: String, document_id: Int, timestamp: BigInt, platformgeo_location: String)
val df = sqlContext.sql("select from_unixtime(timestamp, 'yyyy-MM-dd') as ts from stamp")
I have shown all the possible outputs above by testing them in the hive shell with int and bigint data types; you can pick whichever best fits your case.
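A note on why those outputs differ (arithmetic, not from the thread): 1465876799000 does not fit in a 32-bit int, so the int cast wraps around modulo 2^32 to 1292951064, and 1292951064 seconds after the epoch is 2010-12-21. The bigint cast preserves the full value, but from_unixtime treats it as seconds rather than milliseconds, hence year 48421; dividing by 1000 first converts milliseconds to seconds, giving the expected 2016-06-13.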
11-09-2017
06:51 PM
1 Kudo
@dhieru singh, If your file has a new line as its first line, that line won't show in the NiFi UI. You need to use another ReplaceText processor after the existing one, with the Search Value property as
(\s$|^\s*)(.*) //matches any new lines at the starting and ending positions of the file
and the Replacement Value property as
$2
Why won't the existing ReplaceText match these empty lines? Your existing ReplaceText processor uses \n+\s, which matches only empty lines in between the text, not at the start. If you have empty lines in between the text, you replace them with shift+enter. But in this case you have an empty line at the start of the file, so you need to take it off; that's why you need another ReplaceText processor.
Flow:
First ReplaceText processor //replaces empty lines within the text with shift+enter
Second ReplaceText processor //removes empty lines at the first and last lines of the file
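A sketch of that second processor's configuration (the Replacement Strategy and Evaluation Mode values are assumptions; the search and replacement values come from this answer):
ReplaceText
    Replacement Strategy : Regex Replace
    Search Value         : (\s$|^\s*)(.*)
    Replacement Value    : $2
    Evaluation Mode      : Entire text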
11-09-2017
05:23 PM
1 Kudo
@dhieru singh Can you check your ReplaceText processor configuration once? If the Replacement Strategy property is set to Prepend, that means you are adding a new line at the start of the file. That new line won't show in the NiFi UI, but when you cat the file, it does show up.