Support Questions

Find answers, ask questions, and share your expertise
Announcements
We’ve updated our product names and community labels - click here for full details

Specify timezone when parsing CSV data?

avatar
Frequent Visitor

Hello,

 

we are using Apache NiFi to process CSV data that contains time stamps without a specified time zone.  By default, it seems that NiFi considers timestamps parsed by CSVReader with a parse specification like "yyyy-MM-dd HH:mm:ss" to be in UTC, no matter what the user.timezone property of the JVM or the TZ environment variable is set to.  Is there a way to force the parsing to happen in a specific or the system time zone instead?

 

Thanks,

Hans

1 ACCEPTED SOLUTION

avatar
Frequent Visitor

After I have not been able to find a solution that would be easy to implement inside of NiFi, I've written a small perl (yuk) script that can be used to adjust timestamps in a CSV file to be in ISO8601 format.  Maybe it is useful to someone else:

#!/bin/perl -w

# This perl script adds timezone information to timestamps without a
# timezone.  All timestamps in the input file that follow the format
# "YYYY-MM-DD HH:MM:SS" are converted to ISO8601 timestamps.

use strict;

use DateTime::Format::Strptime;

my $time_zone = 'Europe/Amsterdam';
my $parser = DateTime::Format::Strptime->new(
    pattern   => '%Y-%m-%d %T',
    time_zone => $time_zone
    );
my $printer = DateTime::Format::Strptime->new(
    pattern   => '%FT%T%z',
    time_zone => $time_zone
    );

while (<>) {
    s/(?<=")(\d\d\d\d-\d\d-\d\d \d\d:\d\d:\d\d)(?=")/
        my $dt = $parser->parse_datetime($1);
        $printer->format_datetime($dt);
    /ge;
    print;
}

 

View solution in original post

2 REPLIES 2

avatar
Super Guru

@HansH   I am not sure if this helps but just yesterday i had to use the argument in the reader jdbc connection string like this:

 

?serverTimezone=UTC

 

 

avatar
Frequent Visitor

After I have not been able to find a solution that would be easy to implement inside of NiFi, I've written a small perl (yuk) script that can be used to adjust timestamps in a CSV file to be in ISO8601 format.  Maybe it is useful to someone else:

#!/bin/perl -w

# This perl script adds timezone information to timestamps without a
# timezone.  All timestamps in the input file that follow the format
# "YYYY-MM-DD HH:MM:SS" are converted to ISO8601 timestamps.

use strict;

use DateTime::Format::Strptime;

my $time_zone = 'Europe/Amsterdam';
my $parser = DateTime::Format::Strptime->new(
    pattern   => '%Y-%m-%d %T',
    time_zone => $time_zone
    );
my $printer = DateTime::Format::Strptime->new(
    pattern   => '%FT%T%z',
    time_zone => $time_zone
    );

while (<>) {
    s/(?<=")(\d\d\d\d-\d\d-\d\d \d\d:\d\d:\d\d)(?=")/
        my $dt = $parser->parse_datetime($1);
        $printer->format_datetime($dt);
    /ge;
    print;
}