Member since 09-25-2015 | 230 Posts | 276 Kudos Received | 39 Solutions
12-18-2015
10:43 AM
3 Kudos
@yjiang Add tez.queue.name to your custom hiveserver2-site in Ambari and restart HiveServer2; that value becomes your default queue. You can also specify the queue in the connection URL when you connect:
jdbc:hive2://localhost:10000?tez.queue.name=hive2
or set it in the session before submitting your query:
set tez.queue.name=hive2;
Also check this post from @David Streever for more detailed information: https://streever.atlassian.net/wiki/pages/viewpage.action?pageId=4390918
12-18-2015
10:31 AM
1 Kudo
@Divya Gehlot Try running it logged in as the hive user instead of the hdfs user.
12-18-2015
10:25 AM
1 Kudo
@Suresh Bonam You have to use LATERAL VIEW to do it. See this: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+LateralView
And an example here:
select s.code, exp.splitted
from sample_07 s
lateral view explode(split('asdfa adsfa asdaf asdfad','\\s')) exp as splitted
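If it helps to see the shape of the result before running it in Hive, here is a minimal Python sketch that mimics the row fan-out of explode(split(...)). It is only an illustration: the function name and the sample codes are made up and stand in for the code column of sample_07.

```python
import re


def lateral_view_explode(codes, text, pattern=r"\s"):
    """Pair every source row with every piece of the split string,
    the way LATERAL VIEW explode(split(text, pattern)) fans rows out."""
    pieces = re.split(pattern, text)
    return [(code, piece) for code in codes for piece in pieces]


# Hypothetical stand-in values for sample_07.code.
sample_codes = ["00-0000", "11-0000"]

for code, splitted in lateral_view_explode(sample_codes, "asdfa adsfa asdaf asdfad"):
    print(code, splitted)
```

Each source row is repeated once per split piece, so 2 rows and 4 words give 8 output rows.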
12-18-2015
10:19 AM
1 Kudo
@Suresh Bonam you are right. I updated the answer. Thank you!
12-17-2015
02:42 PM
@William Gonzalez Maybe a cast can work as a workaround, like this:
--query "select col1 as col1, col2 as col2, cast('' as varchar(10)) as col3, col4 as col4 from input_table" \
12-17-2015
01:35 PM
2 Kudos
@Ranjith M Try the two options below. Each one reads your source table just once and produces 2 rows, one for the month and one for the year.
First option, a single select using inline():
create external table test_groupby
(sale_id string, salevalue double, datex string)
row format delimited fields terminated by ','
stored as textfile
location '/tmp/groupby'
;
select * from test_groupby;
select inline(
array(
named_struct('type', 'month', 'value', sum(case when substr(datex, 1,7) = '2015/12' then salevalue end)),
named_struct('type', 'year', 'value', sum(case when substr(datex, 1,4) = '2015' then salevalue end))
))
from test_groupby
;
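Second option, a multi-insert into a partitioned result table (it uses the same source table, so the create is skipped if the table already exists):
create external table if not exists test_groupby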
(sale_id string, salevalue double, datex string)
row format delimited fields terminated by ','
stored as textfile
location '/tmp/groupby'
;
select * from test_groupby;
drop table test_groupby_result;
create table test_groupby_result
(value double)
partitioned by (type string)
;
from test_groupby
insert into table test_groupby_result partition (type = 'month')
select sum(salevalue)
where substr(datex, 1,7) = '2015/12'
insert into table test_groupby_result partition (type = 'year')
select sum(salevalue)
where substr(datex, 1,4) = '2015'
;
select * from test_groupby_result;
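To sanity-check the logic outside Hive, here is a small Python sketch of the same idea: one pass over the rows, two conditional sums. The function name and the sample rows are made up for illustration; it models what the queries compute, not how Hive executes them.

```python
def month_and_year_totals(rows, month="2015/12", year="2015"):
    """One pass over (sale_id, salevalue, datex) rows with two conditional sums,
    mirroring sum(case when substr(datex, 1, 7) = '2015/12' then salevalue end)."""
    month_total = None  # stays None when nothing matches, like a SQL NULL
    year_total = None
    for _sale_id, salevalue, datex in rows:
        if datex[:7] == month:
            month_total = (month_total or 0.0) + salevalue
        if datex[:4] == year:
            year_total = (year_total or 0.0) + salevalue
    return [("month", month_total), ("year", year_total)]


# Made-up sample rows in the same layout as test_groupby.
sales = [
    ("s1", 10.0, "2015/12/01"),
    ("s2", 5.5, "2015/11/20"),
    ("s3", 4.5, "2015/12/17"),
    ("s4", 7.0, "2014/12/31"),
]
print(month_and_year_totals(sales))
```

The month total only picks up the two December 2015 rows, while the year total also includes the November row.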
12-17-2015
12:51 PM
2 Kudos
@Suresh Bonam
Try this:
select weekofyear(from_unixtime(unix_timestamp('Thu Dec 17 15:55:08 IST 2015', 'EEE MMM d HH:mm:ss Z yyyy'),'yyyy-MM-dd')) from sample_07 limit 1;
select weekofyear(from_unixtime(unix_timestamp('Thu Dec 02 15:55:08 IST 2015', 'EEE MMM d HH:mm:ss Z yyyy'),'yyyy-MM-dd')) from sample_07 limit 1;
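If you want to cross-check the week numbers outside Hive, here is a rough Python sketch. It assumes weekofyear follows ISO-8601 week numbering (I believe it does, but verify on your version), and it only looks at the month, day and year, so the weekday name, time and zone in the string are ignored. The function name is made up.

```python
from datetime import datetime


def week_of_year(raw):
    """Return the ISO-8601 week number for a string such as
    'Thu Dec 17 15:55:08 IST 2015'. Only month, day and year are used;
    the weekday name, time and zone tokens are ignored in this sketch."""
    _weekday, month, day, _time, _zone, year = raw.split()
    parsed = datetime.strptime(f"{month} {day} {year}", "%b %d %Y")
    return parsed.isocalendar()[1]


print(week_of_year("Thu Dec 17 15:55:08 IST 2015"))
print(week_of_year("Thu Dec 02 15:55:08 IST 2015"))
```

Because the time and zone are dropped, a timestamp close to midnight could land on a different date than Hive computes in your session time zone.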
12-17-2015
11:36 AM
1 Kudo
@vshukla Is there any equivalent of the HiveServer2 "doAs" setting in Spark Thrift Server?
12-16-2015
11:34 PM
@Pardeep See this thread: https://community.hortonworks.com/questions/4759/hive-explain-says-plan-not-optimized-by-cbo-due-to.html We couldn't find a way to see the column stats (the ones gathered by analyze table t compute statistics for columns). I think describe extended shows only table stats. I'm also still looking for a way to get rid of the warning from that question: "Plan not optimized by CBO due to missing statistics. Please check log for more details."
12-16-2015
09:39 PM
1 Kudo
@Vitor Batista Ambari Metrics monitors all services and hosts, and it consumes resources to run. In small test environments with few CPUs you can stop Ambari Metrics; it won't affect other components. Our Sandbox, for example, does not have Ambari Metrics started by default.