Member since: 06-08-2017
Posts: 1049
Kudos Received: 518
Solutions: 312
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 11137 | 04-15-2020 05:01 PM |
| | 7032 | 10-15-2019 08:12 PM |
| | 3076 | 10-12-2019 08:29 PM |
| | 11278 | 09-21-2019 10:04 AM |
| | 4193 | 09-19-2019 07:11 AM |
11-14-2017
08:08 AM
1 Kudo
@Sanaz Janbakhsh I tried with your JSON message and an EvaluateJsonPath processor with a payload property of $.payload, and the processor works as expected. Can you share screenshots of your processor configurations, the input to the processor, and your flow? That would be a great help in finding the root cause of your issue.
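For comparison, a minimal sketch of the configuration being described (the Destination value is an assumption; the payload property comes from this thread):
EvaluateJsonPath
    Destination : flowfile-attribute
    payload     : $.payload    //dynamic property; writes the value found at $.payload into a payload attribute on the flowfile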
11-14-2017
01:18 AM
@Tarek Elgamal, I don't think the corresponding parameters are in the nifi.properties file, and these params apply to each node, not to the whole cluster. The concurrent tasks assigned to your processors pull threads from the Maximum Timer Driven Thread Count pool. You can refer to the below community links to get more info about these parameters:
https://community.hortonworks.com/questions/104718/handlehttprequest-configure-to-process-millions-of.html
https://community.hortonworks.com/questions/140889/in-nifi-all-the-processors-in-the-data-flow-are-ex.html?childToView=140999#comment-140999
https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#system_properties
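As a worked illustration of how that pool is shared (the numbers here are assumptions, not from this thread): if Maximum Timer Driven Thread Count is 10 on a node and three processors are each configured with 4 Concurrent Tasks, the processors can request up to 12 threads, but at most 10 of them will ever run at once on that node.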
11-13-2017
09:19 PM
1 Kudo
@LOKASHIS RANA
As the log shows ORA-01840: input value not long enough for date format, try using a value like '1900-01-01 00:00:00.0' for the last value instead of 0, and run again with the below sqoop import statement:
sqoop import --connect jdbc:oracle:thin:system/system@192.168.XXX.XXX:1521:xe --username sonu --password sonu --table DEMO_ORDERS --target-dir '/user/root/orders/' --as-textfile --split-by CUSTOMER_ID --incremental lastmodified --check-column ORDER_TIMESTAMP --last-value '1900-01-01 00:00:00.0' --num-mappers 5 --fields-terminated-by ','
In addition, check what the data looks like in the ORDER_TIMESTAMP column of the source table. If it holds just a date, like 2011-01-01, then you need to pass a last-value in the same format as the data in the source.
Example: if the source table's ORDER_TIMESTAMP data looks like 2011-01-01, then use --last-value 1900-01-01 //the last value's format must match the data in the source table.
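If you are unsure of the column's format before choosing a last-value, a quick check against the source table can help (an illustrative Oracle query; the table and column names are taken from this thread):
SELECT ORDER_TIMESTAMP FROM DEMO_ORDERS WHERE ROWNUM <= 5;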
11-13-2017
04:19 PM
1 Kudo
Hi @Tarek Elgamal, click on the Global menu, then click the Controller Settings button; in the General tab you can control the threads running in the NiFi instance. In addition, you can refer to the below community link:
https://community.hortonworks.com/questions/76117/how-to-improve-nifi-concurrency.html
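For reference, a sketch of the two fields on that General tab (the values shown are assumed to be NiFi's defaults):
Maximum Timer Driven Thread Count : 10
Maximum Event Driven Thread Count : 5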
11-13-2017
03:37 PM
1 Kudo
@Sanaz Janbakhsh, Use the below spec; you will get the gatewayMetaDataList array as-is.
[{
"operation": "shift",
"spec": {
"payloadMetaData": {
"*": "&",
"applicationMetaData": {
"id": {
"entityType": "entityType",
"id": "ApplicationId"
},
"customerId": {
"entityType": "entityTypeCustomer",
"id": "CustomerId"
},
"subCustomerId": "subCustomerId",
"name": "SubCustomerName"
},
"deviceMetaData": {
"id": {
"entityType": "entityTypeDevice",
"id": "DeviceId"
},
"name": "DeviceName",
"deviceClass": "deviceClass",
"deviceEUI": "deviceEUI",
"appEUI": "appEUI"
},
"rawPayload": "rawPayload",
"fcount": "fcount",
"fport": "fport"
},
"payload": "payload"
}
}]
Flow:
1. JoltTransform //flatten the whole message except the gatewayMetaDataList array
2. EvaluateJsonPath //extract all the content into attributes on the flowfile; change the Destination property to flowfile-attribute
3. SplitJson //split the JSON on the gatewayMetaDataList array (see the sketch after this list)
4. JoltTransform //flatten the gatewayMetaDataList array
5. EvaluateJsonPath //extract all the content into attributes on the flowfile; change the Destination property to flowfile-attribute
6. AttributesToJSON //keep all the attribute names; change the Destination property to flowfile-content
You can use the below xml for reference and change the EvaluateJsonPath expressions as per your requirements. jolt-11-13.xml
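A sketch of the SplitJson step (step 3 above). JsonPath Expression is SplitJson's property name; the path itself is an assumption consistent with the spec above, which keeps gatewayMetaDataList at the top level:
SplitJson
    JsonPath Expression : $.gatewayMetaDataList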
11-10-2017
07:04 PM
1 Kudo
@sally sally Yeah, you can do that by using a ReplaceText processor with the Search Value property as
<details>\s*([\s\S]+.*)\n+\s+<\/details> //captures everything enclosed in the details tag as capture group 1
then in the Replacement Value:
<details>
${filename}
$1
</details>
You can customize the replacement value as per your needs. A sketch of the full processor configuration follows the example below.
Input:
<?xml version="1.0" encoding="UTF-8"?>
<service>
<Person>
<details>
<start>2017-10-22</start>
<id>*******</id>
<makeVersion>1</makeVersion>
<patch>patch</patch>
<parameter>1</parameter>
</details>
</Person>
</service>
Output:
<?xml version="1.0" encoding="UTF-8"?>
<service>
<Person>
<details>
1497701925152409
<start>2017-10-22</start>
<id>*******</id>
<makeVersion>1</makeVersion>
<patch>patch</patch>
<parameter>1</parameter>
</details>
</Person>
</service>
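A sketch of the ReplaceText configuration described above (the Replacement Strategy and Evaluation Mode values are assumptions; the search and replacement values come from this answer):
ReplaceText
    Replacement Strategy : Regex Replace
    Search Value         : <details>\s*([\s\S]+.*)\n+\s+<\/details>
    Replacement Value    : <details>
                           ${filename}
                           $1
                           </details>
    Evaluation Mode      : Entire text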
11-10-2017
02:53 PM
1 Kudo
@Sanaz Janbakhsh
Can you try the below spec once:
[{
"operation": "shift",
"spec": {
"payloadMetaData": {
"rawPayload": "rawPayload",
"fcount": "fcount",
"fport": "fport",
"applicationMetaData": {
"id": {
"entityType": "entityType",
"id": "id"
},
"customerId": {
"entityType": "entityTypeCustomer",
"id": "idCustomer"
}
},
"gatewayMetaDataList": {
"*": { //* means if you are having more than 1 message in array then it results array of mac addresses in it.
"mac": "mac"
}
},
"deviceMetaData": {
"id": {
"entityType": "entityTypeDevice",
"id": "DeviceId"
},
"name": "DeviceName",
"deviceClass": "deviceClass",
"deviceEUI": "deviceEUI",
"appEUI": "appEUI"
}
},
"payload": "payload"
}
}]
Input:
{
"payloadMetaData": {
"applicationMetaData": {
"id": {
"entityType": "APPLICATION",
"id": "7d11abf0-aa82-11e7-afd4-63e5a28ad7fa"
},
"customerId": {
"entityType": "CUSTOMER",
"id": "a84ab1a0-aa81-11e7-afd4-63e5a28ad7fa"
},
"subCustomerId": null,
"name": "TekTelic-Industrial"
},
"gatewayMetaDataList": [
{
"id": {
"entityType": "GATEWAY",
"id": "e5ca6840-aa81-11e7-afd4-63e5a28ad7fa"
},
"name": "ColinsGateway",
"mac": "647FDAFFFE00417C",
"latitude": 51.04697912502887,
"longitude": -114.06121730804443,
"altitude": null,
"rxInfo": {
"channel": 2,
"codeRate": "4/5",
"crcStatus": 1,
"dataRate": {
"modulation": "LORA",
"spreadFactor": 7,
"bandwidth": 125
},
"frequency": 902700000,
"loRaSNR": 6.8,
"mac": "647fdafffe00417c",
"rfChain": 0,
"rssi": -51,
"size": 21,
"time": "2017-11-01T21:51:27Z",
"timestamp": 1400148651,
"rsig": null
}
}
],
"deviceMetaData": {
"id": {
"entityType": "DEVICE",
"id": "61348ab0-ada2-11e7-ac06-63e5a28ad7fa"
},
"name": "TekTelic-Industrial-0017",
"deviceClass": "CLASS_A",
"deviceEUI": "647FDA0000000155",
"appEUI": "647FDA8000000155"
},
"fcount": 3706,
"fport": 10
}
}
Output:
{
"fcount" : 3706,
"fport" : 10,
"entityType" : "APPLICATION",
"id" : "7d11abf0-aa82-11e7-afd4-63e5a28ad7fa",
"entityTypeCustomer" : "CUSTOMER",
"idCustomer" : "a84ab1a0-aa81-11e7-afd4-63e5a28ad7fa",
"mac" : "647FDAFFFE00417C",
"entityTypeDevice" : "DEVICE",
"DeviceId" : "61348ab0-ada2-11e7-ac06-63e5a28ad7fa",
"DeviceName" : "TekTelic-Industrial-0017",
"deviceClass" : "CLASS_A",
"deviceEUI" : "647FDA0000000155",
"appEUI" : "647FDA8000000155"
}
In addition, if you want to flatten out all the elements of the array: extract all the JSON message contents by using an EvaluateJsonPath processor with Destination as flowfile-attribute, then use a SplitJson processor with JsonPath Expression as $.gatewayMetaDataList, then use a JoltTransform processor with the below spec:
[{
"operation": "shift",
"spec": {
"*": {
"*": "id-&",
"id": {
"*": "&"
},
"rxInfo": {
"*": "rxInfo-&",
"dataRate": {
"*": "dataRate-&"
}
}
}
}
}]
Right now you are flattening out the JSON array elements; then use an AttributesToJSON processor and keep all the attributes that you have extracted. This processor will create a new JSON message, flattened out, with each array element in it. (or) If you want to add all array elements to their respective keys, then use the below spec:
[{
"operation": "shift",
"spec": {
"payloadMetaData": {
"gatewayMetaDataList": {
"*": {
"*": "id-&",
"id": {
"*": "&"
},
"rxInfo": {
"*": "rxInfo-&",
"dataRate": {
"*": "dataRate-&"
}
}
}
}
}
}
}]
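If you take the attribute route, a sketch of the final AttributesToJSON step (the property names are NiFi's; the attribute list here is illustrative, use the names you actually extracted):
AttributesToJSON
    Attributes List : mac,fcount,fport,DeviceId,DeviceName
    Destination     : flowfile-content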
11-10-2017
01:36 AM
@swathi thukkaraju As you are using a string data type for the timestamp field, cast it to bigint or int as per your requirements and then from_unixtime will work. Possible outputs for your timestamp value 1465876799, which you can check in the hive (or) beeline shell:
hive> select from_unixtime(1465876799, 'yyyy-MM-dd');
2016-06-13
hive> select from_unixtime(CAST(1465876799000 as int), 'yyyy-MM-dd');
2010-12-21
hive> select from_unixtime(CAST(1465876799000 as bigint), 'yyyy-MM-dd');
48421-10-14
hive> select from_unixtime(CAST(1465876799000/1000 as BIGINT), 'yyyy-MM-dd');
2016-06-13
Error:
hive> select from_unixtime(CAST(1465876799000 as string), 'yyyy-MM-dd');
FAILED: SemanticException [Error 10014]: Line 1:7 Wrong arguments ''yyyy-MM-dd'': No matching method for class org.apache.hadoop.hive.ql.udf.UDFFromUnixTime with (string, string). Possible choices: _FUNC_(bigint) _FUNC_(bigint, string) _FUNC_(int) _FUNC_(int, string)
As you can see above, I cast 1465876799000 as string but it gives an error; the possible choices are bigint and int.
Possible query for your case:
val df = sqlContext.sql("select from_unixtime(cast(timestamp as bigint), 'yyyy-MM-dd') as ts from stamp")
(or) change the data type in the case class:
case class flight(display_id: Int, uuid: String, document_id: Int, timestamp: BigInt, platformgeo_location: String)
val df = sqlContext.sql("select from_unixtime(timestamp, 'yyyy-MM-dd') as ts from stamp")
I have shown all the possible outputs above by testing them in the hive shell with int and bigint data types; you can pick whichever best fits your case.
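A note on why those outputs differ (arithmetic, not from the thread): 1465876799000 does not fit in a 32-bit int, so the int cast wraps around modulo 2^32 to 1292951064, and 1292951064 seconds after the epoch is 2010-12-21. The bigint cast preserves the full value, but from_unixtime treats it as seconds rather than milliseconds, hence year 48421; dividing by 1000 first converts milliseconds to seconds, giving the expected 2016-06-13.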
11-09-2017
06:51 PM
1 Kudo
@dhieru singh, If your file has a new line as its first line, that line won't show in the NiFi UI. You need to use another ReplaceText processor after the existing one, with the Search Value property as
(\s$|^\s*)(.*) //matches any new lines at the starting and ending positions of the file
and the Replacement Value property as
$2
Why won't the existing ReplaceText match these empty lines? Your existing ReplaceText processor uses \n+\s, which matches only empty lines in between the text, not at the start. If you have empty lines in between the text, you replace them with shift+enter. But in this case you have an empty line at the start of the file, so you need to take it off; that's why you need another ReplaceText processor.
Flow:
First ReplaceText processor //replaces empty lines within the text with shift+enter
Second ReplaceText processor //removes empty lines at the first and last lines of the file
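A sketch of that second processor's configuration (the Replacement Strategy and Evaluation Mode values are assumptions; the search and replacement values come from this answer):
ReplaceText
    Replacement Strategy : Regex Replace
    Search Value         : (\s$|^\s*)(.*)
    Replacement Value    : $2
    Evaluation Mode      : Entire text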
11-09-2017
05:23 PM
1 Kudo
@dhieru singh Can you check your ReplaceText processor configuration once? If the Replacement Strategy property is set to Prepend, that means you are adding a new line at the start of the file. That new line won't show in the NiFi UI, but when you cat the file, it does show up.