Support Questions

Find answers, ask questions, and share your expertise

Hive table creation for month wise data

avatar
Explorer
  • I am working on month-wise data when I give a select command the result is coming alphabetical order, I would like to get a result in the order from Jan to Dec. Below i am leaving my code.
    hive> SELECT monthcrime13.month,monthcrime13.monthcrime13,monthcrime14.monthcrime14 FROM monthcrime13 FULL OUTER JOIN monthcrime14 ON monthcrime13.month = monthcrime14.month ORDER BY month;
    Total MapReduce jobs = 2
    Launching Job 1 out of 2
    Number of reduce tasks not specified. Estimated from input data size: 1
    In order to change the average load for a reducer (in bytes):
      set hive.exec.reducers.bytes.per.reducer=<number>
    In order to limit the maximum number of reducers:
      set hive.exec.reducers.max=<number>
    In order to set a constant number of reducers:
      set mapred.reduce.tasks=<number>
    Starting Job = job_201708240332_0048, Tracking URL = http://localhost.localdomain:50030/jobdetails.jsp?jobid=job_201708240332_0048
    Kill Command = /usr/lib/hadoop/bin/hadoop job  -kill job_201708240332_0048
    Hadoop job information for Stage-1: number of mappers: 2; number of reducers: 1
    2017-08-24 17:50:14,915 Stage-1 map = 0%,  reduce = 0%
    2017-08-24 17:50:29,064 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 2.17 sec
    2017-08-24 17:50:30,079 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 2.17 sec
    2017-08-24 17:50:31,088 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 2.17 sec
    2017-08-24 17:50:32,096 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 2.17 sec
    2017-08-24 17:50:33,103 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 2.17 sec
    2017-08-24 17:50:34,112 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 2.17 sec
    2017-08-24 17:50:35,120 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 2.17 sec
    2017-08-24 17:50:36,131 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 2.17 sec
    2017-08-24 17:50:37,151 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 4.13 sec
    2017-08-24 17:50:38,167 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 4.13 sec
    2017-08-24 17:50:39,194 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 4.13 sec
    2017-08-24 17:50:40,214 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 4.13 sec
    2017-08-24 17:50:41,229 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 4.13 sec
    MapReduce Total cumulative CPU time: 4 seconds 130 msec
    Ended Job = job_201708240332_0048
    Launching Job 2 out of 2
    Number of reduce tasks determined at compile time: 1
    In order to change the average load for a reducer (in bytes):
      set hive.exec.reducers.bytes.per.reducer=<number>
    In order to limit the maximum number of reducers:
      set hive.exec.reducers.max=<number>
    In order to set a constant number of reducers:
      set mapred.reduce.tasks=<number>
    Starting Job = job_201708240332_0049, Tracking URL = http://localhost.localdomain:50030/jobdetails.jsp?jobid=job_201708240332_0049
    Kill Command = /usr/lib/hadoop/bin/hadoop job  -kill job_201708240332_0049
    Hadoop job information for Stage-2: number of mappers: 1; number of reducers: 1
    2017-08-24 17:50:46,809 Stage-2 map = 0%,  reduce = 0%
    2017-08-24 17:50:51,886 Stage-2 map = 100%,  reduce = 0%, Cumulative CPU 0.06 sec
    2017-08-24 17:50:52,892 Stage-2 map = 100%,  reduce = 0%, Cumulative CPU 0.06 sec
    2017-08-24 17:50:53,905 Stage-2 map = 100%,  reduce = 0%, Cumulative CPU 0.06 sec
    2017-08-24 17:50:54,920 Stage-2 map = 100%,  reduce = 0%, Cumulative CPU 0.06 sec
    2017-08-24 17:50:55,936 Stage-2 map = 100%,  reduce = 0%, Cumulative CPU 0.06 sec
    2017-08-24 17:50:56,952 Stage-2 map = 100%,  reduce = 0%, Cumulative CPU 0.06 sec
    2017-08-24 17:50:57,968 Stage-2 map = 100%,  reduce = 100%, Cumulative CPU 0.47 sec
    2017-08-24 17:50:58,984 Stage-2 map = 100%,  reduce = 100%, Cumulative CPU 0.47 sec
    2017-08-24 17:50:59,999 Stage-2 map = 100%,  reduce = 100%, Cumulative CPU 0.47 sec
    2017-08-24 17:51:01,015 Stage-2 map = 100%,  reduce = 100%, Cumulative CPU 0.47 sec
    2017-08-24 17:51:02,025 Stage-2 map = 100%,  reduce = 100%, Cumulative CPU 0.47 sec
    MapReduce Total cumulative CPU time: 470 msec
    Ended Job = job_201708240332_0049
    MapReduce Jobs Launched: 
    Job 0: Map: 2  Reduce: 1   Cumulative CPU: 4.13 sec   HDFS Read: 580 HDFS Write: 420 SUCCESS
    Job 1: Map: 1  Reduce: 1   Cumulative CPU: 0.47 sec   HDFS Read: 764 HDFS Write: 144 SUCCESS
    Total MapReduce CPU Time Spent: 4 seconds 600 msec
    OK
    APR	526	517
    AUG	500	509
    DEC	452	538
    FEB	497	504
    JAN	547	447
    JUL	540	528
    JUN	527	468
    MAR	544	554
    MAY	524	456
    NOV	530	561
    OCT	564	565
    SEP	490	483
    Time taken: 52.88 seconds
    Kindly anyone help me out from this.
1 ACCEPTED SOLUTION

avatar
Explorer

This is the solution for the above question

hive> select a.mon,a.no,b.no from t13 a join t14 b on (a.mon = b.mon);
Query ID = tester_20181105125833_11998cc2-0343-4f56-9e85-2f4628e72e2b
Total jobs = 1
Launching Job 1 out of 1
Status: Running (Executing on YARN cluster with App id application_1541140095101_0005)

--------------------------------------------------------------------------------
        VERTICES      STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
--------------------------------------------------------------------------------
Map 1 ..........   SUCCEEDED      1          1        0        0       0       0
Map 2 ..........   SUCCEEDED      1          1        0        0       0       0
--------------------------------------------------------------------------------
VERTICES: 02/02  [==========================>>] 100%  ELAPSED TIME: 0.48 s
--------------------------------------------------------------------------------
OK
JAN     447     547
FEB     504     497
MAR     554     544
APR     517     526
MAY     456     524
JUN     468     527
JUL     528     540
AUG     509     500
SEP     483     490
OCT     565     564
NOV     561     530
DEC     538     452
Time taken: 1.315 seconds, Fetched: 12 row(s)

View solution in original post

5 REPLIES 5

avatar
Champion

I dont have my vm with me right know , I did not test it But I think it should help you 

 

 SELECT monthcrime13.month,monthcrime13.monthcrime13,monthcrime14.monthcrime14 FROM monthcrime13 
 FULL OUTER JOIN monthcrime14 ON monthcrime13.month = monthcrime14.month 
 ORDER BY  monthcrime13.month ASC

if you feel the result is not what you expect , please let me know . 

 

avatar
Explorer

Bro..!! it is giving the same result as like as I got earlier.

hive> SELECT monthcrime13.month,monthcrime13.monthcrime13,monthcrime14.monthcrime14 FROM monthcrime13 
    >  FULL OUTER JOIN monthcrime14 ON monthcrime13.month = monthcrime14.month 
    >  ORDER BY  monthcrime13.month ASC;
Total MapReduce jobs = 2
Launching Job 1 out of 2
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
Starting Job = job_201708291557_0001, Tracking URL = http://localhost.localdomain:50030/jobdetails.jsp?jobid=job_201708291557_0001
Kill Command = /usr/lib/hadoop/bin/hadoop job  -kill job_201708291557_0001
Hadoop job information for Stage-1: number of mappers: 2; number of reducers: 1
2017-08-29 16:00:01,392 Stage-1 map = 0%,  reduce = 0%
2017-08-29 16:00:16,576 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 1.34 sec
2017-08-29 16:00:17,594 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 1.34 sec
2017-08-29 16:00:18,615 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 1.34 sec
2017-08-29 16:00:19,627 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 1.34 sec
2017-08-29 16:00:20,652 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 1.34 sec
2017-08-29 16:00:21,671 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 1.34 sec
2017-08-29 16:00:22,693 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 1.34 sec
2017-08-29 16:00:23,708 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 1.34 sec
2017-08-29 16:00:24,725 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 1.34 sec
2017-08-29 16:00:25,761 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 3.35 sec
2017-08-29 16:00:26,776 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 3.35 sec
2017-08-29 16:00:27,810 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 3.35 sec
2017-08-29 16:00:28,832 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 3.35 sec
2017-08-29 16:00:29,860 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 3.35 sec
2017-08-29 16:00:30,897 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 3.35 sec
MapReduce Total cumulative CPU time: 3 seconds 350 msec
Ended Job = job_201708291557_0001
Launching Job 2 out of 2
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
Starting Job = job_201708291557_0002, Tracking URL = http://localhost.localdomain:50030/jobdetails.jsp?jobid=job_201708291557_0002
Kill Command = /usr/lib/hadoop/bin/hadoop job  -kill job_201708291557_0002
Hadoop job information for Stage-2: number of mappers: 1; number of reducers: 1
2017-08-29 16:00:39,038 Stage-2 map = 0%,  reduce = 0%
2017-08-29 16:00:45,081 Stage-2 map = 100%,  reduce = 0%, Cumulative CPU 0.43 sec
2017-08-29 16:00:46,092 Stage-2 map = 100%,  reduce = 0%, Cumulative CPU 0.43 sec
2017-08-29 16:00:47,109 Stage-2 map = 100%,  reduce = 0%, Cumulative CPU 0.43 sec
2017-08-29 16:00:48,279 Stage-2 map = 100%,  reduce = 0%, Cumulative CPU 0.43 sec
2017-08-29 16:00:49,294 Stage-2 map = 100%,  reduce = 0%, Cumulative CPU 0.43 sec
2017-08-29 16:00:50,309 Stage-2 map = 100%,  reduce = 0%, Cumulative CPU 0.43 sec
2017-08-29 16:00:51,325 Stage-2 map = 100%,  reduce = 0%, Cumulative CPU 0.43 sec
2017-08-29 16:00:52,360 Stage-2 map = 100%,  reduce = 0%, Cumulative CPU 0.43 sec
2017-08-29 16:00:53,477 Stage-2 map = 100%,  reduce = 0%, Cumulative CPU 0.43 sec
2017-08-29 16:00:54,497 Stage-2 map = 100%,  reduce = 0%, Cumulative CPU 0.43 sec
2017-08-29 16:00:55,534 Stage-2 map = 100%,  reduce = 100%, Cumulative CPU 2.12 sec
2017-08-29 16:00:56,553 Stage-2 map = 100%,  reduce = 100%, Cumulative CPU 2.12 sec
2017-08-29 16:00:57,566 Stage-2 map = 100%,  reduce = 100%, Cumulative CPU 2.12 sec
2017-08-29 16:00:58,580 Stage-2 map = 100%,  reduce = 100%, Cumulative CPU 2.12 sec
2017-08-29 16:00:59,597 Stage-2 map = 100%,  reduce = 100%, Cumulative CPU 2.12 sec
2017-08-29 16:01:00,613 Stage-2 map = 100%,  reduce = 100%, Cumulative CPU 2.12 sec
MapReduce Total cumulative CPU time: 2 seconds 120 msec
Ended Job = job_201708291557_0002
MapReduce Jobs Launched: 
Job 0: Map: 2  Reduce: 1   Cumulative CPU: 3.35 sec   HDFS Read: 580 HDFS Write: 420 SUCCESS
Job 1: Map: 1  Reduce: 1   Cumulative CPU: 2.12 sec   HDFS Read: 764 HDFS Write: 144 SUCCESS
Total MapReduce CPU Time Spent: 5 seconds 470 msec
OK
APR	526	517
AUG	500	509
DEC	452	538
FEB	497	504
JAN	547	447
JUL	540	528
JUN	527	468
MAR	544	554
MAY	524	456
NOV	530	561
OCT	564	565
SEP	490	483
Time taken: 80.716 seconds
hive> 

avatar
Champion

Could you give me the table schema . 

avatar
Explorer

 

This is 2014 crime data :
JAN,447
FEB,504
MAR,554
APR,517
MAY,456
JUN,468
JUL,528
AUG,509
SEP,483
OCT,565
NOV,561
DEC,538 
 
This is 2013 table:
JAN,547
FEB,497
MAR,544
APR,526
MAY,524
JUN,527
JUL,540
AUG,500
SEP,490
OCT,564
NOV,530
DEC,452
 
schema is :
 
use database;
 
create table monthcrime13( month string, monthcrime int)
row format delimited
fields terminated  by ',';
 
load data local inpath 'Desktop/2013' into table monthcrime13;
 
Select * from monthcrime13;
( it is giving the result in the proper order from jan to dec)
 
create table monthcrime14( month string, monthcrime int)
row format delimited
fields terminated  by ',';
 
load data local inpath 'Desktop/2014' into table monthcrime14;
 
SELECT monthcrime13.month ...........
(here im trying to connect two table using full outer join connection, here im getting the problem).
 

avatar
Explorer

This is the solution for the above question

hive> select a.mon,a.no,b.no from t13 a join t14 b on (a.mon = b.mon);
Query ID = tester_20181105125833_11998cc2-0343-4f56-9e85-2f4628e72e2b
Total jobs = 1
Launching Job 1 out of 1
Status: Running (Executing on YARN cluster with App id application_1541140095101_0005)

--------------------------------------------------------------------------------
        VERTICES      STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
--------------------------------------------------------------------------------
Map 1 ..........   SUCCEEDED      1          1        0        0       0       0
Map 2 ..........   SUCCEEDED      1          1        0        0       0       0
--------------------------------------------------------------------------------
VERTICES: 02/02  [==========================>>] 100%  ELAPSED TIME: 0.48 s
--------------------------------------------------------------------------------
OK
JAN     447     547
FEB     504     497
MAR     554     544
APR     517     526
MAY     456     524
JUN     468     527
JUL     528     540
AUG     509     500
SEP     483     490
OCT     565     564
NOV     561     530
DEC     538     452
Time taken: 1.315 seconds, Fetched: 12 row(s)