Hive table creation for month-wise data
Labels: Apache Hive, Apache Impala, Cloudera Manager
Created on ‎08-25-2017 02:06 PM - edited ‎09-16-2022 05:09 AM
I am working on month-wise data. When I run a SELECT, the result comes back in alphabetical order, but I would like it in calendar order, from Jan to Dec. Below is my code.
hive> SELECT monthcrime13.month,monthcrime13.monthcrime13,monthcrime14.monthcrime14 FROM monthcrime13 FULL OUTER JOIN monthcrime14 ON monthcrime13.month = monthcrime14.month ORDER BY month;
Total MapReduce jobs = 2
Launching Job 1 out of 2
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
Starting Job = job_201708240332_0048, Tracking URL = http://localhost.localdomain:50030/jobdetails.jsp?jobid=job_201708240332_0048
Kill Command = /usr/lib/hadoop/bin/hadoop job -kill job_201708240332_0048
Hadoop job information for Stage-1: number of mappers: 2; number of reducers: 1
2017-08-24 17:50:14,915 Stage-1 map = 0%, reduce = 0%
2017-08-24 17:50:29,064 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.17 sec
2017-08-24 17:50:30,079 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.17 sec
2017-08-24 17:50:31,088 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.17 sec
2017-08-24 17:50:32,096 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.17 sec
2017-08-24 17:50:33,103 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.17 sec
2017-08-24 17:50:34,112 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.17 sec
2017-08-24 17:50:35,120 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.17 sec
2017-08-24 17:50:36,131 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.17 sec
2017-08-24 17:50:37,151 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 4.13 sec
2017-08-24 17:50:38,167 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 4.13 sec
2017-08-24 17:50:39,194 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 4.13 sec
2017-08-24 17:50:40,214 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 4.13 sec
2017-08-24 17:50:41,229 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 4.13 sec
MapReduce Total cumulative CPU time: 4 seconds 130 msec
Ended Job = job_201708240332_0048
Launching Job 2 out of 2
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
Starting Job = job_201708240332_0049, Tracking URL = http://localhost.localdomain:50030/jobdetails.jsp?jobid=job_201708240332_0049
Kill Command = /usr/lib/hadoop/bin/hadoop job -kill job_201708240332_0049
Hadoop job information for Stage-2: number of mappers: 1; number of reducers: 1
2017-08-24 17:50:46,809 Stage-2 map = 0%, reduce = 0%
2017-08-24 17:50:51,886 Stage-2 map = 100%, reduce = 0%, Cumulative CPU 0.06 sec
2017-08-24 17:50:52,892 Stage-2 map = 100%, reduce = 0%, Cumulative CPU 0.06 sec
2017-08-24 17:50:53,905 Stage-2 map = 100%, reduce = 0%, Cumulative CPU 0.06 sec
2017-08-24 17:50:54,920 Stage-2 map = 100%, reduce = 0%, Cumulative CPU 0.06 sec
2017-08-24 17:50:55,936 Stage-2 map = 100%, reduce = 0%, Cumulative CPU 0.06 sec
2017-08-24 17:50:56,952 Stage-2 map = 100%, reduce = 0%, Cumulative CPU 0.06 sec
2017-08-24 17:50:57,968 Stage-2 map = 100%, reduce = 100%, Cumulative CPU 0.47 sec
2017-08-24 17:50:58,984 Stage-2 map = 100%, reduce = 100%, Cumulative CPU 0.47 sec
2017-08-24 17:50:59,999 Stage-2 map = 100%, reduce = 100%, Cumulative CPU 0.47 sec
2017-08-24 17:51:01,015 Stage-2 map = 100%, reduce = 100%, Cumulative CPU 0.47 sec
2017-08-24 17:51:02,025 Stage-2 map = 100%, reduce = 100%, Cumulative CPU 0.47 sec
MapReduce Total cumulative CPU time: 470 msec
Ended Job = job_201708240332_0049
MapReduce Jobs Launched:
Job 0: Map: 2 Reduce: 1 Cumulative CPU: 4.13 sec HDFS Read: 580 HDFS Write: 420 SUCCESS
Job 1: Map: 1 Reduce: 1 Cumulative CPU: 0.47 sec HDFS Read: 764 HDFS Write: 144 SUCCESS
Total MapReduce CPU Time Spent: 4 seconds 600 msec
OK
APR 526 517
AUG 500 509
DEC 452 538
FEB 497 504
JAN 547 447
JUL 540 528
JUN 527 468
MAR 544 554
MAY 524 456
NOV 530 561
OCT 564 565
SEP 490 483
Time taken: 52.88 seconds
Could anyone kindly help me out with this?
Created on ‎08-27-2017 06:35 PM - edited ‎08-27-2017 06:47 PM
I don't have my VM with me right now, so I have not tested this, but I think it should help you:
SELECT monthcrime13.month, monthcrime13.monthcrime13, monthcrime14.monthcrime14
FROM monthcrime13
FULL OUTER JOIN monthcrime14 ON monthcrime13.month = monthcrime14.month
ORDER BY monthcrime13.month ASC
If the result is not what you expect, please let me know.
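One thing to watch: month is a STRING column, so ORDER BY monthcrime13.month ASC still sorts alphabetically. Below is a minimal sketch of one way to get calendar order instead, assuming the month values are the three-letter abbreviations shown in the question; the month_num alias is only an illustrative name, and the helper column can be hidden by wrapping the query in an outer SELECT.

SELECT
  COALESCE(monthcrime13.month, monthcrime14.month) AS month,  -- COALESCE keeps the month name even when one side of the FULL OUTER JOIN has no match
  monthcrime13.monthcrime13,
  monthcrime14.monthcrime14,
  CASE COALESCE(monthcrime13.month, monthcrime14.month)       -- map the abbreviation to 1..12 so the sort is calendar order, not alphabetical
    WHEN 'JAN' THEN 1  WHEN 'FEB' THEN 2  WHEN 'MAR' THEN 3
    WHEN 'APR' THEN 4  WHEN 'MAY' THEN 5  WHEN 'JUN' THEN 6
    WHEN 'JUL' THEN 7  WHEN 'AUG' THEN 8  WHEN 'SEP' THEN 9
    WHEN 'OCT' THEN 10 WHEN 'NOV' THEN 11 WHEN 'DEC' THEN 12
  END AS month_num
FROM monthcrime13
FULL OUTER JOIN monthcrime14
  ON monthcrime13.month = monthcrime14.month
ORDER BY month_num;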
Created ‎08-29-2017 04:02 PM
Bro, it is giving the same result as the one I got earlier.
hive> SELECT monthcrime13.month,monthcrime13.monthcrime13,monthcrime14.monthcrime14 FROM monthcrime13
    > FULL OUTER JOIN monthcrime14 ON monthcrime13.month = monthcrime14.month
    > ORDER BY monthcrime13.month ASC;
Total MapReduce jobs = 2
Launching Job 1 out of 2
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
Starting Job = job_201708291557_0001, Tracking URL = http://localhost.localdomain:50030/jobdetails.jsp?jobid=job_201708291557_0001
Kill Command = /usr/lib/hadoop/bin/hadoop job -kill job_201708291557_0001
Hadoop job information for Stage-1: number of mappers: 2; number of reducers: 1
2017-08-29 16:00:01,392 Stage-1 map = 0%, reduce = 0%
2017-08-29 16:00:16,576 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.34 sec
2017-08-29 16:00:17,594 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.34 sec
2017-08-29 16:00:18,615 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.34 sec
2017-08-29 16:00:19,627 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.34 sec
2017-08-29 16:00:20,652 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.34 sec
2017-08-29 16:00:21,671 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.34 sec
2017-08-29 16:00:22,693 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.34 sec
2017-08-29 16:00:23,708 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.34 sec
2017-08-29 16:00:24,725 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.34 sec
2017-08-29 16:00:25,761 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 3.35 sec
2017-08-29 16:00:26,776 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 3.35 sec
2017-08-29 16:00:27,810 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 3.35 sec
2017-08-29 16:00:28,832 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 3.35 sec
2017-08-29 16:00:29,860 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 3.35 sec
2017-08-29 16:00:30,897 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 3.35 sec
MapReduce Total cumulative CPU time: 3 seconds 350 msec
Ended Job = job_201708291557_0001
Launching Job 2 out of 2
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
Starting Job = job_201708291557_0002, Tracking URL = http://localhost.localdomain:50030/jobdetails.jsp?jobid=job_201708291557_0002
Kill Command = /usr/lib/hadoop/bin/hadoop job -kill job_201708291557_0002
Hadoop job information for Stage-2: number of mappers: 1; number of reducers: 1
2017-08-29 16:00:39,038 Stage-2 map = 0%, reduce = 0%
2017-08-29 16:00:45,081 Stage-2 map = 100%, reduce = 0%, Cumulative CPU 0.43 sec
2017-08-29 16:00:46,092 Stage-2 map = 100%, reduce = 0%, Cumulative CPU 0.43 sec
2017-08-29 16:00:47,109 Stage-2 map = 100%, reduce = 0%, Cumulative CPU 0.43 sec
2017-08-29 16:00:48,279 Stage-2 map = 100%, reduce = 0%, Cumulative CPU 0.43 sec
2017-08-29 16:00:49,294 Stage-2 map = 100%, reduce = 0%, Cumulative CPU 0.43 sec
2017-08-29 16:00:50,309 Stage-2 map = 100%, reduce = 0%, Cumulative CPU 0.43 sec
2017-08-29 16:00:51,325 Stage-2 map = 100%, reduce = 0%, Cumulative CPU 0.43 sec
2017-08-29 16:00:52,360 Stage-2 map = 100%, reduce = 0%, Cumulative CPU 0.43 sec
2017-08-29 16:00:53,477 Stage-2 map = 100%, reduce = 0%, Cumulative CPU 0.43 sec
2017-08-29 16:00:54,497 Stage-2 map = 100%, reduce = 0%, Cumulative CPU 0.43 sec
2017-08-29 16:00:55,534 Stage-2 map = 100%, reduce = 100%, Cumulative CPU 2.12 sec
2017-08-29 16:00:56,553 Stage-2 map = 100%, reduce = 100%, Cumulative CPU 2.12 sec
2017-08-29 16:00:57,566 Stage-2 map = 100%, reduce = 100%, Cumulative CPU 2.12 sec
2017-08-29 16:00:58,580 Stage-2 map = 100%, reduce = 100%, Cumulative CPU 2.12 sec
2017-08-29 16:00:59,597 Stage-2 map = 100%, reduce = 100%, Cumulative CPU 2.12 sec
2017-08-29 16:01:00,613 Stage-2 map = 100%, reduce = 100%, Cumulative CPU 2.12 sec
MapReduce Total cumulative CPU time: 2 seconds 120 msec
Ended Job = job_201708291557_0002
MapReduce Jobs Launched:
Job 0: Map: 2 Reduce: 1 Cumulative CPU: 3.35 sec HDFS Read: 580 HDFS Write: 420 SUCCESS
Job 1: Map: 1 Reduce: 1 Cumulative CPU: 2.12 sec HDFS Read: 764 HDFS Write: 144 SUCCESS
Total MapReduce CPU Time Spent: 5 seconds 470 msec
OK
APR 526 517
AUG 500 509
DEC 452 538
FEB 497 504
JAN 547 447
JUL 540 528
JUN 527 468
MAR 544 554
MAY 524 456
NOV 530 561
OCT 564 565
SEP 490 483
Time taken: 80.716 seconds
hive>
Created ‎08-29-2017 05:37 PM
Could you give me the table schema?
Created on ‎08-30-2017 02:45 PM - edited ‎08-30-2017 02:46 PM
This is the 2014 crime data:
JAN,447
FEB,504
MAR,554
APR,517
MAY,456
JUN,468
JUL,528
AUG,509
SEP,483
OCT,565
NOV,561
DEC,538

This is the 2013 table:
JAN,547
FEB,497
MAR,544
APR,526
MAY,524
JUN,527
JUL,540
AUG,500
SEP,490
OCT,564
NOV,530
DEC,452

The schema is:
use database;
create table monthcrime13(month string, monthcrime int) row format delimited fields terminated by ',';
load data local inpath 'Desktop/2013' into table monthcrime13;
select * from monthcrime13;   (this gives the result in the proper order, from Jan to Dec)
create table monthcrime14(month string, monthcrime int) row format delimited fields terminated by ',';
load data local inpath 'Desktop/2014' into table monthcrime14;
SELECT monthcrime13.month ...........   (here I am trying to connect the two tables using a FULL OUTER JOIN, and this is where I get the problem)
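A design alternative, if editing the source files is an option: carry a numeric month key in the data itself so any query can sort on it directly. A minimal sketch, assuming hypothetical input lines such as 1,JAN,547 and a hypothetical table name monthcrime13_v2 (neither is part of the original data):

create table monthcrime13_v2 (
  monthnum   int,     -- 1..12, added as the first field of each input line
  month      string,  -- three-letter month abbreviation
  monthcrime int      -- crime count for that month
)
row format delimited fields terminated by ',';

load data local inpath 'Desktop/2013' into table monthcrime13_v2;

-- monthnum is kept in the select list so the ORDER BY also works on older Hive releases
select monthnum, month, monthcrime from monthcrime13_v2 order by monthnum;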
Created ‎11-04-2018 11:30 PM
This is the solution to the question above:
hive> select a.mon,a.no,b.no from t13 a join t14 b on (a.mon = b.mon);
Query ID = tester_20181105125833_11998cc2-0343-4f56-9e85-2f4628e72e2b
Total jobs = 1
Launching Job 1 out of 1
Status: Running (Executing on YARN cluster with App id application_1541140095101_0005)
--------------------------------------------------------------------------------
        VERTICES      STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
--------------------------------------------------------------------------------
Map 1 ..........   SUCCEEDED      1          1        0        0       0       0
Map 2 ..........   SUCCEEDED      1          1        0        0       0       0
--------------------------------------------------------------------------------
VERTICES: 02/02  [==========================>>] 100%  ELAPSED TIME: 0.48 s
--------------------------------------------------------------------------------
OK
JAN 447 547
FEB 504 497
MAR 554 544
APR 517 526
MAY 456 524
JUN 468 527
JUL 528 540
AUG 509 500
SEP 483 490
OCT 565 564
NOV 561 530
DEC 538 452
Time taken: 1.315 seconds, Fetched: 12 row(s)
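Two caveats for anyone reusing this: it is an inner JOIN, so a month missing from either table would be dropped (the original question used a FULL OUTER JOIN), and there is no ORDER BY, so the calendar order above is simply the order the rows happened to come back from the join and is not guaranteed. A hedged sketch that makes the order explicit, reusing the CASE mapping from the earlier reply (mon_num, no_t13 and no_t14 are illustrative aliases):

select a.mon,
       a.no as no_t13,   -- count from t13
       b.no as no_t14,   -- count from t14
       case a.mon
         when 'JAN' then 1  when 'FEB' then 2  when 'MAR' then 3
         when 'APR' then 4  when 'MAY' then 5  when 'JUN' then 6
         when 'JUL' then 7  when 'AUG' then 8  when 'SEP' then 9
         when 'OCT' then 10 when 'NOV' then 11 when 'DEC' then 12
       end as mon_num    -- explicit sort key: the calendar position of the month
from t13 a
join t14 b on (a.mon = b.mon)
order by mon_num;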
