Member since: 06-08-2017
Posts: 1049
Kudos Received: 518
Solutions: 312
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 11242 | 04-15-2020 05:01 PM |
| | 7152 | 10-15-2019 08:12 PM |
| | 3129 | 10-12-2019 08:29 PM |
| | 11556 | 09-21-2019 10:04 AM |
| | 4360 | 09-19-2019 07:11 AM |
08-20-2018
02:09 PM
1 Kudo
@Lakshmana Maddineni
The issue is caused by duplicate rows. When a MERGE statement combines a 'when matched' clause with a 'when not matched' clause, Hive applies a cardinality check by default, and the statement fails if more than one source row matches the same target row. To use both clauses with this data, disable the check: set the following property in your Hive shell and then run the MERGE statement again.
set hive.merge.cardinality.check=false;
Refer to this support KB article for more details on this exact issue.
-
If the answer helped to resolve your issue, click the Accept button below to accept it. That helps other Community users find solutions to this kind of issue quickly.
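To find the rows that trip the cardinality check, it can help to look for merge keys that appear more than once in the source data. The snippet below is a minimal Python sketch of that idea, not Hive code; the `id` key name and the sample rows are assumptions made up for illustration.

```python
from collections import Counter


def duplicate_merge_keys(source_rows, key="id"):
    """Return the key values that occur more than once in the source rows.

    `key` is a hypothetical merge-key column name; replace it with the
    column used in the ON clause of the MERGE statement.
    """
    counts = Counter(row[key] for row in source_rows)
    return sorted(value for value, n in counts.items() if n > 1)


# Hypothetical sample data: id 2 appears twice, so two source rows
# would match the same target row.
rows = [
    {"id": 1, "name": "a"},
    {"id": 2, "name": "b"},
    {"id": 2, "name": "c"},
]
print(duplicate_merge_keys(rows))
```

If the list comes back non-empty, deduplicating the source on that key is an alternative to disabling the check.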
08-20-2018
01:53 PM
1 Kudo
@CHEH YIH LIM
I think you are using the ExtractText processor to extract the content and keep it as an attribute of the flowfile. If so, change these two property values to suit your flowfile size:
Maximum Buffer Size (default 1 MB): Specifies the maximum amount of data to buffer (per file) in order to apply the regular expressions. Files larger than the specified maximum will not be fully evaluated.
Maximum Capture Group Length (default 1024): Specifies the maximum number of characters a given capture group value can have. Any characters beyond the max will be truncated.
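The effect of the two limits can be pictured with a rough Python model. This is only a sketch of the behaviour described in the property documentation, not NiFi's implementation; the function name and its parameters are made up for illustration.

```python
import re


def extract_capture(content, pattern,
                    max_buffer_bytes=1024 * 1024, max_group_len=1024):
    """Model of the two limits: only the first `max_buffer_bytes` of the
    content are evaluated, and the first capture group is cut off after
    `max_group_len` characters."""
    buffered = content[:max_buffer_bytes].decode("utf-8", errors="ignore")
    match = re.search(pattern, buffered, re.DOTALL)
    if match is None:
        return None
    return match.group(1)[:max_group_len]


content = b"x" * 5000  # hypothetical 5000-byte flowfile content

# With the default capture group length the value is truncated.
print(len(extract_capture(content, r"(.*)")))
# Raising the limit keeps the whole content.
print(len(extract_capture(content, r"(.*)", max_group_len=5000)))
```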
08-18-2018
06:44 PM
1 Kudo
@Saravanan Subramanian
Try the Jolt spec below:
[
  {
    "operation": "shift",
    "spec": {
      "*": "Data.&"
    }
  },
  {
    "operation": "default",
    "spec": {
      "Data": {}
    }
  }
]
Input:
{
  "primary": {
    "value": 4
  },
  "quality": {
    "value": 3
  }
}
Output:
{
  "Data" : {
    "primary" : {
      "value" : 4
    },
    "quality" : {
      "value" : 3
    }
  }
}
Another way of doing this is to use the ReplaceText processor: capture the entire content of the flowfile and wrap it in the replacement value, with these settings:
Search Value: (?s)(^.*$)
Replacement Value: { "Data": $1 }
Character Set: UTF-8
Maximum Buffer Size: 1 MB
Replacement Strategy: Regex Replace
Evaluation Mode: Entire text
The (?s) flag lets . match line breaks, so the capture group covers multi-line JSON like the input above.
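For readers who want to check the expected result outside NiFi, here is a small Python sketch of both approaches. It is an illustration only: `wrap_with_shift` mirrors what the shift spec does to the top-level keys, and `wrap_with_regex` mirrors the whole-content regex replace; both function names are made up for this example.

```python
import json
import re


def wrap_with_shift(doc):
    """Move every top-level key under "Data", like the shift spec "*": "Data.&"."""
    return {"Data": dict(doc)}


def wrap_with_regex(text):
    """Capture the entire text and wrap it, like the ReplaceText settings.

    (?s) makes . match newlines so the group spans multi-line JSON.
    A function is used as the replacement to avoid escape handling.
    """
    return re.sub(r"(?s)(^.*$)",
                  lambda m: '{ "Data": ' + m.group(1) + " }",
                  text, count=1)


source = {"primary": {"value": 4}, "quality": {"value": 3}}

print(json.dumps(wrap_with_shift(source), indent=2))
print(wrap_with_regex(json.dumps(source, indent=2)))
```

Both calls yield the same JSON structure; the regex variant never parses the content, so it also passes through content that is not valid JSON.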
08-16-2018
11:31 AM
@Parth Karkhanis
Could you try adding a SplitAvro processor to your flow after the QueryDatabaseTable processor, and configure it to split the output into smaller flowfiles instead of one big Avro file? Then try running your commands again.
08-15-2018
09:45 PM
3 Kudos
@Sai Krishna Makineni
You can use either the ListHDFS or the GetHDFSFileInfo processor. GetHDFSFileInfo does not store state, so you can schedule it to run nightly. Once the files from HDFS are listed, use the hdfs.lastModified attribute, or extract the timestamp from the filename with the substringAfter function, and check the value in a RouteOnAttribute processor. Then feed the files that are older than your threshold to a DeleteHDFS processor to delete them.
ListHDFS, in contrast, stores state and only lists incrementally. If you want it to list everything again, clear the state first through the REST API with /processors/{id}/state/clear-requests and then run the processor.
Flow with ListHDFS:
1. ListHDFS
2. RouteOnAttribute //check the filename (or) last modified time
3. DeleteHDFS //delete the files in HDFS
Flow with GetHDFSFileInfo:
1. GenerateFlowFile
2. GetHDFSFileInfo
3. RouteOnAttribute
4. DeleteHDFS
(or) You can use the GetHDFS processor (with Keep Source File set to true), which doesn't store state. However, GetHDFS fetches the file content from HDFS, so large files put a lot of load on NiFi.
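The age check and the state-clearing call can be sketched in Python as follows. This is a minimal sketch under stated assumptions: `hdfs.lastModified` is treated as epoch milliseconds, the base URL and processor id in the usage lines are hypothetical placeholders, and a secured NiFi instance would additionally need authentication, which is not shown. The request is only built here, not sent.

```python
import time
import urllib.request

MS_PER_DAY = 24 * 60 * 60 * 1000


def is_older_than(last_modified_ms, max_age_days, now_ms=None):
    """True when a file's last-modified time (epoch milliseconds) is more
    than `max_age_days` in the past; the same test RouteOnAttribute would do."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    return now_ms - last_modified_ms > max_age_days * MS_PER_DAY


def build_clear_state_request(base_url, processor_id):
    """Build (but do not send) the POST request that clears a processor's state."""
    url = f"{base_url}/processors/{processor_id}/state/clear-requests"
    return urllib.request.Request(url, data=b"", method="POST")


# Hypothetical values for illustration only.
request = build_clear_state_request("http://localhost:8080/nifi-api", "abc-123")
print(request.get_method(), request.full_url)
print(is_older_than(0, 7, now_ms=8 * MS_PER_DAY))
```

Passing the request to `urllib.request.urlopen` would send it.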
08-15-2018
12:46 AM
@shraddha srivastav
Stop the UpdateRecord processor, list the queue after the GetFile processor, and share about 10 sample input records (as text, not screenshots) along with the expected output. That would help to recreate and resolve the issue.
08-15-2018
12:23 AM
@shraddha srivastav
Use the UpdateRecord processor with the configs below. In the CSVRecordSetWriter controller service, add a filename column of string type as the last field in the Avro schema.
UpdateRecord configs:
Add a new property to the UpdateRecord processor:
/filename
concat(/UutId,/Test) //column names are case sensitive
Because the Replacement Value Strategy is set to Record Path Value, the UpdateRecord processor concatenates the UutId and Test values into the filename column. Refer to this link for more details regarding the UpdateRecord processor.
Example:
Input data:
UutId,Test
1,2
CSVRecordSetWriter Avro schema:
{
  "namespace": "nifi",
  "name": "balances",
  "type": "record",
  "fields": [
    {"name": "UutId", "type": ["null", "string"]},
    {"name": "Test", "type": ["null", "string"]},
    {"name": "filename", "type": ["null", "string"]}
  ]
}
Output:
UutId,Test,filename
1,2,12
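To double-check the expected output for a sample file before configuring the processor, the same transformation can be reproduced in a few lines of Python. This is only a sketch of what concat(/UutId,/Test) produces for each record; the function name is made up for illustration.

```python
import csv
import io


def add_filename_column(csv_text):
    """Append a filename column holding UutId and Test joined together."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out,
                            fieldnames=reader.fieldnames + ["filename"],
                            lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Column names are case sensitive, as in the record path.
        row["filename"] = row["UutId"] + row["Test"]
        writer.writerow(row)
    return out.getvalue()


print(add_filename_column("UutId,Test\n1,2\n"))
```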
08-14-2018
04:28 AM
@Gillu Varghese
Both cron triggers in the screenshot are equivalent, so you can use either of them for scheduling. With one cron expression we cannot also trigger at exactly 3 AM; the latest time it can trigger is 2:59:59 AM.
08-14-2018
03:30 AM
@Gillu Varghese
A Quartz cron expression needs at least 6 fields; the last field (year) is optional.
0 0/15 2 ? * * //no specific value for day of month; the hour field is 2, so the cron triggers start at 2 AM
0 0/15 2 1/1 ? //invalid, as the month field doesn't allow ? in it
0 0/15 2 1/1 * ? (or) 0 0/15 2 * * ? //start on the first day of the month and execute every day at 2 AM
In this case both expressions are equivalent.
Please refer to this awesome explanation of Quartz cron:
0 0 0/1 1/1 * ? *
| | |   |   | | |
| | |   |   | | +-- Year (range: 1970-2099)
| | |   |   | +---- Day of the Week (range: 1-7 or SUN-SAT)
| | |   |   +------ Month of the Year (range: 1-12 or JAN-DEC)
| | |   +---------- Day of the Month (range: 1-31)
| | +-------------- Hour (range: 0-23)
| +---------------- Minute (range: 0-59)
+------------------ Second (range: 0-59)
* (“all values”) is used to select all values within a field. For example, “*” in the minute field means “every minute”.
? (“no specific value”) is useful when you need to specify something in one of the two fields in which the character is allowed, but not the other. For example, if I want my trigger to fire on a particular day of the month (say, the 10th), but don’t care what day of the week that happens to be, I would put “10” in the day-of-month field, and “?” in the day-of-week field.
/ is used to specify increments. For example: “0/15” in the seconds field means “the seconds 0, 15, 30, and 45”. And “5/15” in the seconds field means “the seconds 5, 20, 35, and 50”. You can also specify ‘/’ after the ‘*’ character - in this case ‘*’ is equivalent to having ‘0’ before the ‘/’. ‘1/3’ in the day-of-month field means “fire every 3 days starting on the first day of the month”.
To explain the difference between ? and * in the expressions, first of all take a look at this table:
| Field Name | Mandatory | Allowed Values | Allowed Special Characters |
|---|---|---|---|
| Seconds | YES | 0-59 | , - * / |
| Minutes | YES | 0-59 | , - * / |
| Hours | YES | 0-23 | , - * / |
| Day of month | YES | 1-31 | , - * ? / L W (allows '?') |
| Month | YES | 1-12 or JAN-DEC | , - * / |
| Day of week | YES | 1-7 or SUN-SAT | , - * ? / L # (allows '?') |
| Year | NO | empty, 1970-2099 | , - * / |
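The two rules that decide the examples in this answer (6 or 7 fields, and '?' only in day of month or day of week) can be expressed as a small Python check. This is a sketch covering only those two rules, not a full Quartz validator; the function name is made up for illustration.

```python
FIELD_NAMES = ["seconds", "minutes", "hours", "day-of-month",
               "month", "day-of-week", "year"]
ALLOWS_QUESTION_MARK = {"day-of-month", "day-of-week"}


def check_quartz_fields(expression):
    """Return a list of problems; an empty list means both checks passed."""
    fields = expression.split()
    problems = []
    if not 6 <= len(fields) <= 7:
        problems.append(f"expected 6 or 7 fields, got {len(fields)}")
    for name, value in zip(FIELD_NAMES, fields):
        if "?" in value and name not in ALLOWS_QUESTION_MARK:
            problems.append(f"'?' is not allowed in the {name} field")
    return problems


for expr in ["0 0/15 2 ? * *", "0 0/15 2 1/1 * ?", "0 0/15 2 1/1 ?"]:
    print(expr, "->", check_quartz_fields(expr) or "ok")
```

The third expression is reported twice: it has only five fields, and its '?' lands in the month position.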
08-14-2018
02:11 AM
@Gillu Varghese
> NiFi uses Quartz cron expressions. For your case, use the expression below to run the processor:
0 0/15 2 1/1 * ? *
It runs at 2:00 AM, 2:15 AM, 2:30 AM and 2:45 AM.
> If you also want to run at 3:00 AM, you need another trigger processor scheduled separately with the cron expression below:
0 0 3 1/1 * ? *
Refer to this link to create/validate Quartz cron expressions and this for more details regarding cron scheduling in NiFi.
In addition, we can also list minutes with comma separators:
59 0,15,30,45,59 2 1/1 * ? *
It runs at 2:00:59 AM, 2:15:59 AM, 2:30:59 AM, 2:45:59 AM and 2:59:59 AM.
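The fire times listed above can be reproduced with a short Python sketch that expands only the seconds, minutes and hours fields. It is an illustration, not the Quartz parser: it handles plain numbers, comma lists, '*' and start/step increments, and ignores ranges and the date fields. The function names are made up for this example.

```python
def expand(field, low, high):
    """Expand one cron field into the sorted list of values it selects."""
    values = set()
    for part in field.split(","):
        if "/" in part:
            start, step = part.split("/")
            first = low if start == "*" else int(start)
            values.update(range(first, high + 1, int(step)))
        elif part == "*":
            values.update(range(low, high + 1))
        else:
            values.add(int(part))
    return sorted(values)


def daily_fire_times(expression):
    """List the HH:MM:SS times at which the expression fires within one day."""
    seconds, minutes, hours = expression.split()[:3]
    return [f"{h:02d}:{m:02d}:{s:02d}"
            for h in expand(hours, 0, 23)
            for m in expand(minutes, 0, 59)
            for s in expand(seconds, 0, 59)]


print(daily_fire_times("0 0/15 2 1/1 * ? *"))
print(daily_fire_times("59 0,15,30,45,59 2 1/1 * ? *"))
```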