How to convert unix timestamp to datetime in Apache NiFi?
- Labels: Apache NiFi
Created ‎04-20-2018 01:31 PM
Hi everyone!
In my NiFi flow, I use several components to receive Squid logs as plain text. I then split each log line into its elements and add column names. The output file is CSV, because it will later be loaded into a Hive database.
The problem occurs when I try to convert the "datetime" field from a Unix timestamp to a regular date and time.
My flow is based on this solution: https://community.hortonworks.com/articles/131320/using-partitionrecord-grokreaderjsonwriter-to-pars...
This is what a log line sent by Squid through a proxy looks like:
1518442283.483 161 127.0.0.1 TCP_MISS/200 103701 GET http://www.cnn.com/ matt DIRECT/199.27.79.73 text/html
This is the Grok Expression (in GrokReader):
%{NUMBER:timestamp}\s+%{NUMBER:duration}\s%{IP:client_address}\s%{WORD:cache_result}/%{POSINT:status_code}\s%{NUMBER:bytes}\s%{WORD:request_method}\s%{NOTSPACE:url}\s(%{NOTSPACE:user}|-)\s%{WORD:hierarchy_code}/%{IPORHOST:server}\s%{NOTSPACE:content_type}
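To make it easier to see which field ends up where, here is a minimal Python sketch that splits the sample line in roughly the same way as the Grok expression above. The regular expressions are simplified stand-ins for the Grok patterns (an assumption, not the exact Grok definitions), and `parse_squid_line` is a hypothetical helper name.

```python
import re

# Simplified stand-ins for the Grok patterns (assumptions, not the real
# Grok definitions): NUMBER, IP, WORD, POSINT and NOTSPACE are approximated.
SQUID_LINE = re.compile(
    r"(?P<timestamp>\d+(?:\.\d+)?)\s+"
    r"(?P<duration>\d+)\s"
    r"(?P<client_address>\d{1,3}(?:\.\d{1,3}){3})\s"
    r"(?P<cache_result>\w+)/(?P<status_code>\d+)\s"
    r"(?P<bytes>\d+)\s"
    r"(?P<request_method>\w+)\s"
    r"(?P<url>\S+)\s"
    r"(?P<user>\S+)\s"  # also matches "-" when no user is logged
    r"(?P<hierarchy_code>\w+)/(?P<server>\S+)\s"
    r"(?P<content_type>\S+)"
)


def parse_squid_line(line: str) -> dict:
    """Return the named fields of one Squid access-log line as strings."""
    match = SQUID_LINE.match(line)
    if match is None:
        raise ValueError(f"line does not match the expected format: {line!r}")
    return match.groupdict()


sample = (
    "1518442283.483 161 127.0.0.1 TCP_MISS/200 103701 GET "
    "http://www.cnn.com/ matt DIRECT/199.27.79.73 text/html"
)
fields = parse_squid_line(sample)
print(fields["timestamp"], fields["duration"], fields["user"])
```

Note that every field, including the timestamp, comes out as a string at this stage, so any date conversion has to happen in a later step.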
The following links show my configuration:
→ General appearance of components: https://imgur.com/a/fn2Q03N
→ Settings for the UpdateRecord component: https://imgur.com/a/E8gfMj8
→ Settings for the UpdateRecord/GrokReader(nifi_mid): https://imgur.com/a/l8He75D
→ Settings for the UpdateRecord/CSVRecordSetWriter: https://imgur.com/a/aqiZ98M
My problem is described on stackoverflow: https://stackoverflow.com/questions/48885675/conversion-unix-timestamp-attribute-to-normal-date
Created ‎04-20-2018 03:25 PM
Did you just leave out the timestamp from the log entry, or does a log line look exactly as you provided it? In your example you have only one number preceding the IP address, but you try to read two numbers (timestamp and duration).
Created ‎04-23-2018 09:28 AM
I'm sorry, I left out the Unix timestamp value by mistake. The timestamp value is 1518442283.483, and the remaining fields are:
duration: 161,
client_address: 127.0.0.1,
cache_result: TCP_MISS,
status_code: 200,
bytes: 103701,
request_method: GET,
url: http://www.cnn.com/,
user: matt,
hierarchy_code: DIRECT,
server: 199.27.79.73,
content_type: text/html.
Created ‎04-23-2018 10:42 AM
Hi @Wojtek,
I believe the problem is that the format() function works only with numbers. Your timestamp contains a '.' (dot), which causes it to be interpreted as a string.
To solve the problem, I used the following EL in an UpdateAttribute processor:
${ts:substringBefore('.'):append(${ts:substringAfter('.')}) :toNumber():format('MM/dd/yyyy HH:mm:ss.SSS') }
The ts attribute contains the value 1518442283.483.
The result is 02/12/2018 13:31:23.483
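For anyone who wants to check the arithmetic outside NiFi, here is a minimal Python sketch of the same idea: remove the dot so that seconds plus a three-digit fraction become epoch milliseconds, then format that number as a date. The function names are hypothetical. Two assumptions are made: UTC is used because it reproduces the result quoted above (as far as I know, format() in NiFi uses the local time zone unless a zone is passed), and the fraction is padded or truncated to three digits, which the expression itself does not do (it simply concatenates the two halves, so it yields milliseconds only when the fraction has exactly three digits).

```python
from datetime import datetime, timezone


def squid_ts_to_millis(ts: str) -> int:
    """Turn 'seconds.fraction' into epoch milliseconds by dropping the dot."""
    seconds, _, fraction = ts.partition(".")
    # Assumption of this sketch: normalise the fraction to exactly 3 digits.
    millis = (fraction + "000")[:3]
    return int(seconds + millis)


def format_millis_utc(millis: int) -> str:
    """Format epoch milliseconds as MM/dd/yyyy HH:mm:ss.SSS in UTC."""
    seconds, ms = divmod(millis, 1000)  # integer math avoids float rounding
    dt = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return dt.strftime("%m/%d/%Y %H:%M:%S") + f".{ms:03d}"


millis = squid_ts_to_millis("1518442283.483")
print(millis)                     # 1518442283483
print(format_millis_utc(millis))  # 02/12/2018 13:31:23.483
```

On a server in another time zone, the hour in the formatted value would shift accordingly.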
I hope I have been of help
Created on ‎04-23-2018 01:29 PM - edited ‎08-17-2019 06:41 PM
Hello @Davide Isoardi,
thank you for your answer!
Is this how it should look when I add this expression to the UpdateAttribute component?
Unfortunately, nothing has changed.
Created ‎04-23-2018 01:58 PM
You must change the attribute on which the transformation is performed. I have never used the GrokReader, but I believe you have to change ts to timestamp in the EL string.
Created on ‎04-23-2018 02:40 PM - edited ‎08-17-2019 06:41 PM
I noticed, however, that the problem lies elsewhere. As a test, I tried to change the value of the user attribute with the toUpper() function. Unfortunately, the user name in the output was still written in lowercase.
The result is:
datetime,duration,client_address,cache_result,status_code,bytes,request_method,url,user,hierarchy_code,server,content_type
1518442283.483,161,127.0.0.1,TCP_MISS,200,103701,GET,http://www.cnn.com/,matt,DIRECT,199.27.79.73,text/html
Created ‎04-23-2018 02:47 PM
The UpdateAttribute processor does not read the content of a FlowFile. For the above Expression Language statements to work, the incoming FlowFiles must have the FlowFile attributes "user" and "datetime" set on them.
Stop the UpdateAttribute processor and allow a few FlowFiles to queue. Then list that queue and verify which attributes currently exist on the listed FlowFiles.
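To illustrate the distinction, here is a toy Python model. It is not the NiFi API: `ToyFlowFile` and `toy_update_attribute` are hypothetical names, and treating a missing attribute as an empty string is an assumption of the model. It only shows why an attribute-level update leaves the CSV content untouched when the values live in the content.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class ToyFlowFile:
    """Hypothetical stand-in for a FlowFile: a payload plus key/value metadata."""
    content: str
    attributes: Dict[str, str] = field(default_factory=dict)


def toy_update_attribute(
    flowfile: ToyFlowFile,
    name: str,
    source_attr: str,
    transform: Callable[[str], str],
) -> ToyFlowFile:
    """Set attribute `name` from another attribute; the content is never read."""
    # Assumption of this model: a missing attribute behaves like "".
    value = flowfile.attributes.get(source_attr, "")
    flowfile.attributes[name] = transform(value)
    return flowfile


csv_row = (
    "1518442283.483,161,127.0.0.1,TCP_MISS,200,103701,GET,"
    "http://www.cnn.com/,matt,DIRECT,199.27.79.73,text/html"
)
ff = ToyFlowFile(content=csv_row)  # no "user" attribute has been set

toy_update_attribute(ff, "user", "user", str.upper)
print(ff.attributes)          # {'user': ''}
print(ff.content == csv_row)  # True: "matt" in the content is still lowercase
```

In this model, the upper-case conversion only has an effect once a "user" attribute actually exists on the FlowFile, which is what listing the queue lets you verify.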
Thanks,
Matt
Created on ‎04-24-2018 10:45 AM - edited ‎08-17-2019 06:40 PM
@Matt Clarke, thank you for your answer!
I turned off the UpdateAttribute component and turned on the entire flow again. Below are screenshots:
When I re-enabled the UpdateAttribute component, the attribute information confirmed that @Matt Clarke is right ↓
Created ‎04-23-2018 03:00 PM
Sorry for the mistake. The operation is of course performed on the attributes, not on the FlowFile content. So you must first set the values you need as attributes and then transform them.
Thanks @Matt Clarke
