Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

Specify timezone when parsing CSV data?

avatar
New Contributor

Hello,

 

we are using Apache NiFi to process CSV data that contains time stamps without a specified time zone.  By default, it seems that NiFi considers timestamps parsed by CSVReader with a parse specification like "yyyy-MM-dd HH:mm:ss" to be in UTC, no matter what the user.timezone property of the JVM or the TZ environment variable is set to.  Is there a way to force the parsing to happen in a specific or the system time zone instead?

 

Thanks,

Hans

1 ACCEPTED SOLUTION

avatar
New Contributor

After I have not been able to find a solution that would be easy to implement inside of NiFi, I've written a small perl (yuk) script that can be used to adjust timestamps in a CSV file to be in ISO8601 format.  Maybe it is useful to someone else:

#!/bin/perl -w

# This perl script adds timezone information to timestamps without a
# timezone.  All timestamps in the input file that follow the format
# "YYYY-MM-DD HH:MM:SS" are converted to ISO8601 timestamps.

use strict;

use DateTime::Format::Strptime;

my $time_zone = 'Europe/Amsterdam';
my $parser = DateTime::Format::Strptime->new(
    pattern   => '%Y-%m-%d %T',
    time_zone => $time_zone
    );
my $printer = DateTime::Format::Strptime->new(
    pattern   => '%FT%T%z',
    time_zone => $time_zone
    );

while (<>) {
    s/(?<=")(\d\d\d\d-\d\d-\d\d \d\d:\d\d:\d\d)(?=")/
        my $dt = $parser->parse_datetime($1);
        $printer->format_datetime($dt);
    /ge;
    print;
}

 

View solution in original post

2 REPLIES 2

avatar
Super Guru

@HansH   I am not sure if this helps but just yesterday i had to use the argument in the reader jdbc connection string like this:

 

?serverTimezone=UTC

 

 

avatar
New Contributor

After I have not been able to find a solution that would be easy to implement inside of NiFi, I've written a small perl (yuk) script that can be used to adjust timestamps in a CSV file to be in ISO8601 format.  Maybe it is useful to someone else:

#!/bin/perl -w

# This perl script adds timezone information to timestamps without a
# timezone.  All timestamps in the input file that follow the format
# "YYYY-MM-DD HH:MM:SS" are converted to ISO8601 timestamps.

use strict;

use DateTime::Format::Strptime;

my $time_zone = 'Europe/Amsterdam';
my $parser = DateTime::Format::Strptime->new(
    pattern   => '%Y-%m-%d %T',
    time_zone => $time_zone
    );
my $printer = DateTime::Format::Strptime->new(
    pattern   => '%FT%T%z',
    time_zone => $time_zone
    );

while (<>) {
    s/(?<=")(\d\d\d\d-\d\d-\d\d \d\d:\d\d:\d\d)(?=")/
        my $dt = $parser->parse_datetime($1);
        $printer->format_datetime($dt);
    /ge;
    print;
}