Member since
04-11-2016
13
Posts
4
Kudos Received
0
Solutions
07-13-2018
09:38 AM
so far I know for sure that if vectorization is disabled the problem goes away(because in that case the reader is not vectorized..) set hive.vectorized.execution.enabled=false;
corresponding hive ticket: https://issues.apache.org/jira/browse/HIVE-19943
... View more
07-11-2018
02:08 PM
I think you would probably want to take a look at the "MERGE" statement; https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML#LanguageManualDML-Merge but afaik it would need to only keep the last operation on the row to make it work; which is what you are also suggesting (I,U,D = D)... -- merge is only supported on transactional tables
set hive.support.concurrency=true;
set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
drop table if exists target;
drop table if exists upd;
create table target (id string, t timestamp)
stored as orc
tblproperties('transactional'='true');
create table upd (id string, iud string, t timestamp);
insert into target values ('1','2001-11-11'),
('10','2001-11-11');
insert into upd values ('1','D','2001-11-11'),
('2','U','2001-10-11'),
('2','D','2001-11-11'),
('2','I','2001-12-11'),
('3','U','2001-10-11'),
('3','D','2001-11-11'),
('4','I','2001-11-11'),
('10','D','2001-11-11');
-- the following will filter the "final" updates
select * from (
select *,LEAD(id,1) over (partition by id order by t) as lead
from upd
order by t
) tt where lead is null;
-- this merge statement will bake the updates into the target table
merge into target as ta using
(
select * from (
select *,LEAD(id,1) over (partition by id order by t) as lead
from upd
order by t
) tt where lead is null
) as up
ON ta.id=up.id
when matched and up.iud='D' then DELETE
when matched and up.iud='U' then update set t=up.t
when not matched and up.iud='I' then insert values (up.id,up.t);
for more info about using LEAD in this query: https://sqlsunday.com/2016/04/11/last-row-per-group/
... View more
07-11-2018
11:30 AM
Could you share the hiveserver2 exception backtrace? I'm not sure whats wrong; but that would probably help to locate the source of the problem....I think if it's in core-site.xml that should be enough...
... View more
07-11-2018
11:20 AM
the "alter table" which is happening should be okay; since that call stores some results of the analyze I don't know about any tickets which records that it could change the location by mistake.. But according to the article you've found; this seems to have happened to other people as well... All I can think about right now is that something is not fully thread safe. Is there by any chance multiple analyze statements running at the same time? Are you using standalone hive-cli or beeline/jdbc with hiveserver2 ?
... View more
07-09-2018
03:00 PM
please add also Hue label...because there might be some Hue experts who might be able to help.
... View more
07-09-2018
02:58 PM
I'm not sure what causes the problem; I've tried to reproduce you issue without success: drop table if exists t;
create table t (a int) partitioned by (p string);
insert into t partition(p=1) values(1);
-- runs ok
analyze table t partition(p=1) compute statistics;
-- throws exception for me right now
analyze table t partition(p=2) compute statistics;
Could you please help extend this minimal sample case into a testcase which reproduces your issue? Could you give some more precise information about the hive you are using (full version number)... - if you are using Apache Hive 2.0.0; I would like recommend to try out either 2.3.3 or 3.0.0
... View more
07-09-2018
02:35 PM
1 Kudo
Yes it was missing, I've added it to the documentation; because it's needed to make llap work in the first place. https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties
... View more
07-09-2018
02:23 PM
I'm not sure what your original goal is - the following approach may help; or possibly not... Beeline has some output format features which enables it to output the resultset in csv/tsv2 format. see: https://community.hortonworks.com/questions/25789/how-to-dump-the-output-from-beeline.html?childToView=25950#comment-25950 I think with `concat_ws` you are very close; but I'm afraid its not possible to convert a struct into an array (or at least into a map - in which case you might be able to use `map_values` to obtain the array)
... View more
07-09-2018
01:57 PM
I'm not sure if it matters; but it seems there is a 1.5 hour difference between the 2 ps -ef -s ; it might be possible that somehow one of the hs2 instances have not shut down cleanly earlier; I would recommend to try shutting both hs2 down; and checking that they are both stopped by hand ; check that the configs are updated - and start both of them...
... View more
07-09-2018
01:33 PM
I think you might be looking at Ambari settings. Ambari will install and run 2 kinds of hiveservers when Interactive query is enabled; but in the "advanced" section you always see both hiveserver's configurations; as long as "Interactive query" is off; the hive site of the interactive hs2 is not in action
... View more
03-30-2017
08:52 PM
You may want to take a look at hive-json to extract the json schema (from: https://github.com/hortonworks/hive-json) i've run: bin/find-json-schema ~/projects/structor/modules/druid_overlord/files/TimeSeriesQuery.json and creating the table, but set jsonserde rowformat: create table tbl (
aggregations array <struct <
fieldName: string,
name: string,
type: string>>,
dataSource string,
granularity string,
intervals array <string>,
queryType string
)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
; load the data: LOAD DATA LOCAL INPATH "/home/kirk/projects/structor/modules/druid_overlord/files/TimeSeriesQuery.json" INTO TABLE tbl; switch to orc format (this is a columnar format, it will be better this way): create table tbl_orc (
aggregations array <struct <
fieldName: string,
name: string,
type: string>>,
dataSource string,
granularity string,
intervals array <string>,
queryType string
)
STORED AS ORC
;
insert overwrite table tbl_orc select * from tbl;
select count(*) from tbl_orc where queryType = 'timeseries'; start querying tbl_orc... i'm only aware of it as of today...but it works like a charm 🙂
... View more
11-17-2016
08:53 AM
1 Kudo
can you check that the database password in the xml file(javax.jdo.option.ConnectionPassword) - is it the correct password? (maybe: try to remove any whitespaces before/after the password if there are any)
... View more
11-16-2016
04:15 PM
2 Kudos
from the above: connection reset / broken pipe - I would recommend to also check some basic things like: * `netstat -npl` shows the metastore and hiveserver2 as running and listening on the port... also check that if you shutdown the hiveserver2 and metastore those ports are closed - i would guess that some other application have "hijacked" the hiveserver2 or the metastore port - and the application listening on that port doesn't speak the "hive" protocol 😉 * loopback device is up * the network is working / hostnames can you attach the hiveserver2 and/or metastore logs which could possibly cause this?
... View more