Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

Time drift in hive,added by one hour in inner join

Highlighted

Time drift in hive,added by one hour in inner join

New Contributor

1 hive version

[hadoop@gxdh01 ~]$ hive --version
Hive 1.1.0-cdh5.7.5
Subversion file:///data/jenkins/workspace/generic-package-rhel64-6-0/topdir/BUILD/hive-1.1.0-cdh5.7.5 -r Unknown
Compiled by jenkins on Wed Nov 2 11:53:22 PDT 2016
From source with checksum 0e57e48866fa2d9aa5d36cca7bffae5e

 

2 when I do a hive inner join query ,a timestamp columns's value added by an hour.

hive> SELECT COMMUNITY_ID, BUILDENDDATE
    >   FROM DW_ODS.ODS_COMMUNITY A
    >  WHERE A.DAY_ID = '20170324'
    >    AND COMMUNITY_ID = 5086560
    > ;
Query ID = hadoop_20170328085656_331bf931-3ad9-4162-9a1a-89e46586ccef
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1490177280986_5130, Tracking URL = http://gxdh01:8088/proxy/application_1490177280986_5130/
Kill Command = /opt/cloudera/parcels/CDH-5.7.5-1.cdh5.7.5.p0.3/lib/hadoop/bin/hadoop job  -kill job_1490177280986_5130
Hadoop job information for Stage-1: number of mappers: 4; number of reducers: 0
2017-03-28 08:56:16,231 Stage-1 map = 0%,  reduce = 0%
2017-03-28 08:56:22,409 Stage-1 map = 75%,  reduce = 0%, Cumulative CPU 6.4 sec
2017-03-28 08:56:24,460 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 10.07 sec
MapReduce Total cumulative CPU time: 10 seconds 70 msec
Ended Job = job_1490177280986_5130
MapReduce Jobs Launched: 
Stage-Stage-1: Map: 4   Cumulative CPU: 10.07 sec   HDFS Read: 494905939 HDFS Write: 28 SUCCESS
Total MapReduce CPU Time Spent: 10 seconds 70 msec
OK
5086560 1971-01-01 00:00:00
Time taken: 14.011 seconds, Fetched: 1 row(s)
hive> 

you can see BUILDENDDATE's value is 1971-01-01 00:00:00

 

you can see BUILDENDDATE's value is 1971-01-01 00:00:00

 

when I do a inner join

hive> SELECT T1.COMMUNITY_ID, T1.BUILDENDDATE
    >   FROM (SELECT *
    >           FROM DW_ODS.ODS_COMMUNITY A
    >          WHERE A.DAY_ID = '20170324'
    >            AND COMMUNITY_ID = 5086560) T1
    >  INNER JOIN (SELECT ADDRESS_ID
    >                FROM DW_ODS.ODS_STREET_NUMBER SN
    >               WHERE DAY_ID = '20170324') T2
    >     ON T1.COMMUNITY_ID = T2.ADDRESS_ID;
Query ID = hadoop_20170328085858_57223eaf-6b40-407d-8ec9-4212f1db0668
Total jobs = 1
Stage-1 is selected by condition resolver.
Launching Job 1 out of 1
Number of reduce tasks not specified. Estimated from input data size: 12
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1490177280986_5131, Tracking URL = http://gxdh01:8088/proxy/application_1490177280986_5131/
Kill Command = /opt/cloudera/parcels/CDH-5.7.5-1.cdh5.7.5.p0.3/lib/hadoop/bin/hadoop job  -kill job_1490177280986_5131
Hadoop job information for Stage-1: number of mappers: 8; number of reducers: 12
2017-03-28 08:58:16,658 Stage-1 map = 0%,  reduce = 0%
2017-03-28 08:58:22,817 Stage-1 map = 38%,  reduce = 0%, Cumulative CPU 6.99 sec
2017-03-28 08:58:23,843 Stage-1 map = 75%,  reduce = 0%, Cumulative CPU 18.22 sec
2017-03-28 08:58:24,867 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 27.43 sec
2017-03-28 08:58:29,994 Stage-1 map = 100%,  reduce = 83%, Cumulative CPU 52.4 sec
2017-03-28 08:58:31,018 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 58.12 sec
MapReduce Total cumulative CPU time: 58 seconds 120 msec
Ended Job = job_1490177280986_5131
MapReduce Jobs Launched: 
Stage-Stage-1: Map: 8  Reduce: 12   Cumulative CPU: 58.12 sec   HDFS Read: 743128560 HDFS Write: 84 SUCCESS
Total MapReduce CPU Time Spent: 58 seconds 120 msec
OK
5086560 1971-01-01 01:00:00
5086560 1971-01-01 01:00:00
5086560 1971-01-01 01:00:00
Time taken: 19.316 seconds, Fetched: 3 row(s)
hive> 

you can see the column's value of BUILDENDDATE added by an hour,orignal value is 1971-01-01 00:00:00,but inner join's value is 1971-01-01 01:00:00

please help me,thanks!

why?

1 REPLY 1

Re: Time drift in hive,added by one hour in inner join

New Contributor

please help me,thanks

Don't have an account?
Coming from Hortonworks? Activate your account here