Member since
04-11-2016
13
Posts
4
Kudos Received
0
Solutions
07-13-2018
09:38 AM
so far I know for sure that if vectorization is disabled the problem goes away(because in that case the reader is not vectorized..) set hive.vectorized.execution.enabled=false;
corresponding hive ticket: https://issues.apache.org/jira/browse/HIVE-19943
... View more
07-11-2018
02:08 PM
I think you would probably want to take a look at the "MERGE" statement; https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML#LanguageManualDML-Merge but afaik it would need to only keep the last operation on the row to make it work; which is what you are also suggesting (I,U,D = D)... -- merge is only supported on transactional tables
set hive.support.concurrency=true;
set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
drop table if exists target;
drop table if exists upd;
create table target (id string, t timestamp)
stored as orc
tblproperties('transactional'='true');
create table upd (id string, iud string, t timestamp);
insert into target values ('1','2001-11-11'),
('10','2001-11-11');
insert into upd values ('1','D','2001-11-11'),
('2','U','2001-10-11'),
('2','D','2001-11-11'),
('2','I','2001-12-11'),
('3','U','2001-10-11'),
('3','D','2001-11-11'),
('4','I','2001-11-11'),
('10','D','2001-11-11');
-- the following will filter the "final" updates
select * from (
select *,LEAD(id,1) over (partition by id order by t) as lead
from upd
order by t
) tt where lead is null;
-- this merge statement will bake the updates into the target table
merge into target as ta using
(
select * from (
select *,LEAD(id,1) over (partition by id order by t) as lead
from upd
order by t
) tt where lead is null
) as up
ON ta.id=up.id
when matched and up.iud='D' then DELETE
when matched and up.iud='U' then update set t=up.t
when not matched and up.iud='I' then insert values (up.id,up.t);
for more info about using LEAD in this query: https://sqlsunday.com/2016/04/11/last-row-per-group/
... View more
07-09-2018
03:00 PM
please add also Hue label...because there might be some Hue experts who might be able to help.
... View more
07-09-2018
02:23 PM
I'm not sure what your original goal is - the following approach may help; or possibly not... Beeline has some output format features which enables it to output the resultset in csv/tsv2 format. see: https://community.hortonworks.com/questions/25789/how-to-dump-the-output-from-beeline.html?childToView=25950#comment-25950 I think with `concat_ws` you are very close; but I'm afraid its not possible to convert a struct into an array (or at least into a map - in which case you might be able to use `map_values` to obtain the array)
... View more
07-09-2018
01:57 PM
I'm not sure if it matters; but it seems there is a 1.5 hour difference between the 2 ps -ef -s ; it might be possible that somehow one of the hs2 instances have not shut down cleanly earlier; I would recommend to try shutting both hs2 down; and checking that they are both stopped by hand ; check that the configs are updated - and start both of them...
... View more
11-17-2016
08:53 AM
1 Kudo
can you check that the database password in the xml file(javax.jdo.option.ConnectionPassword) - is it the correct password? (maybe: try to remove any whitespaces before/after the password if there are any)
... View more
11-16-2016
04:15 PM
2 Kudos
from the above: connection reset / broken pipe - I would recommend to also check some basic things like: * `netstat -npl` shows the metastore and hiveserver2 as running and listening on the port... also check that if you shutdown the hiveserver2 and metastore those ports are closed - i would guess that some other application have "hijacked" the hiveserver2 or the metastore port - and the application listening on that port doesn't speak the "hive" protocol 😉 * loopback device is up * the network is working / hostnames can you attach the hiveserver2 and/or metastore logs which could possibly cause this?
... View more