Member since: 08-20-2018
Posts: 4
Kudos Received: 0
Solutions: 0
08-28-2018
08:58 AM
I am trying to use the Hive MERGE command to delete rows on real data, but the MERGE command throws the run-time exception below.

Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: GC overhead limit exceeded
Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:67, Vertex vertex_1534484623192_0265_2_01 [Map 2] killed/failed due to:OWN_TASK_FAILURE]
Vertex killed, vertexName=Reducer 3, vertexId=vertex_1534484623192_0265_2_03, diagnostics=[Vertex received Kill while in RUNNING state., Vertex did not succeed due to OTHER_VERTEX_FAILURE, failedTasks:0 killedTasks:200, Vertex vertex_1534484623192_0265_2_03 [Reducer 3] killed/failed due to:OTHER_VERTEX_FAILURE]
Vertex killed, vertexName=Reducer 4, vertexId=vertex_1534484623192_0265_2_02, diagnostics=[Vertex received Kill while in RUNNING state., Vertex did not succeed due to OTHER_VERTEX_FAILURE, failedTasks:0 killedTasks:2, Vertex vertex_1534484623192_0265_2_02 [Reducer 4] killed/failed due to:OTHER_VERTEX_FAILURE]
DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:2
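For context, the statement is a delete-via-MERGE of roughly this shape (the table and column names target_table, source_table and id are placeholders, not my real schema):

MERGE INTO target_table AS T
USING source_table AS S
ON T.id = S.id
WHEN MATCHED THEN DELETE;

I assume the Tez memory settings, such as hive.tez.container.size and hive.tez.java.opts, are relevant to the GC overhead failure on the Map 2 vertex, but I am not sure what values are appropriate; the values below are only illustrative:

SET hive.tez.container.size=4096;
SET hive.tez.java.opts=-Xmx3276m;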
08-28-2018
07:24 AM
I have two transactional tables, A and B. I want to delete the records from table A that match records in table B on two columns. Please let me know the best way to do this. I have created the sample code below.

CREATE DATABASE merge_data;

DROP TABLE IF EXISTS merge_data.students1;
CREATE TABLE merge_data.students1(
  name string,
  age int,
  gpa double)
CLUSTERED BY (age) INTO 5 BUCKETS
STORED AS ORC
TBLPROPERTIES ('transactional'='true');

INSERT INTO merge_data.students1
VALUES ('fred', 35, 1.28),
       ('fred', 1, 2.32),
       ('fred1', 35, 2.32),
       ('barney', 32, 2.32),
       ('barney', 2, 2.32),
       ('barney1', 32, 2.32),
       ('shyam', 32, 2.32);

DROP TABLE IF EXISTS merge_data.students2;
CREATE TABLE merge_data.students2(
  name string,
  age int)
CLUSTERED BY (age) INTO 5 BUCKETS
STORED AS ORC
TBLPROPERTIES ('transactional'='true');

INSERT INTO merge_data.students2
VALUES ('fred', 35),
       ('barney', 32);

I have used the delete statements below, and they work fine on the sample data.

DELETE FROM merge_data.students1
WHERE CONCAT(CONCAT(name,"-"), age) IN
  (SELECT CONCAT(CONCAT(name,"-"), age) FROM merge_data.students2);

DELETE FROM merge_data.students1
WHERE EXISTS
  (SELECT name, age FROM merge_data.students2
   WHERE students1.name = students2.name
     AND students1.age = students2.age);

But when I run similar queries on the real data, I see the error message below.

Invalid SubQuery expression 'xyz': For Exists/Not Exists operator SubQuery must be Correlated.

I am not able to understand this issue and need your inputs on it.
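For comparison, a sketch of the same two-column delete written as a MERGE with a DELETE clause, which avoids the IN/EXISTS subquery altogether; this is only an illustration against the sample tables above, not something I have confirmed against the real data:

MERGE INTO merge_data.students1 AS T
USING merge_data.students2 AS S
ON T.name = S.name AND T.age = S.age
WHEN MATCHED THEN DELETE;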
08-20-2018
12:57 PM
--- Hive Queries ---

Please let me know why the merge command below is throwing the error message.

CREATE DATABASE merge_data;

CREATE TABLE merge_data.transactions(
  ID int,
  TranValue string,
  last_update_user string)
PARTITIONED BY (tran_date string)
CLUSTERED BY (ID) INTO 5 BUCKETS
STORED AS ORC
TBLPROPERTIES ('transactional'='true');

INSERT INTO merge_data.transactions PARTITION (tran_date)
VALUES (1, 'value_01', 'creation', '20170410'),
       (2, 'value_02', 'creation', '20170410'),
       (3, 'value_03', 'creation', '20170410'),
       (4, 'value_04', 'creation', '20170410'),
       (5, 'value_05', 'creation', '20170413'),
       (6, 'value_06', 'creation', '20170413'),
       (7, 'value_07', 'creation', '20170413'),
       (8, 'value_08', 'creation', '20170413'),
       (9, 'value_09', 'creation', '20170413'),
       (10, 'value_10', 'creation', '20170413');

CREATE TABLE merge_data.merge_source(
  ID int,
  TranValue string,
  tran_date string)
STORED AS ORC;

INSERT INTO merge_data.merge_source
VALUES (1, 'value_01', '20170410'),
       (4, NULL, '20170410'),
       (7, 'value_77777', '20170413'),
       (8, NULL, '20170413'),
       (8, 'value_08', '20170415'),
       (11, 'value_11', '20170415');

MERGE INTO merge_data.transactions AS T
USING merge_data.merge_source AS S
ON T.ID = S.ID
WHEN MATCHED THEN UPDATE SET TranValue = S.TranValue, last_update_user = 'merge_update'
WHEN NOT MATCHED THEN INSERT VALUES (S.ID, S.TranValue, 'merge_insert', S.tran_date);

--- Error ---

Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: Hive Runtime Error while closing operators: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating cardinality_violation(_col0,_col1)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating cardinality_violation(_col0,_col1)
Caused by: java.lang.RuntimeException: Cardinality Violation in Merge statement: [61, 3, 0],20170415
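As far as I understand, the cardinality check fires when more than one source row matches a single target row on the ON condition; in the data above, ID = 8 appears twice in merge_data.merge_source. A sketch of one way this is sometimes handled, keeping a single row per ID in the source with ROW_NUMBER(); the choice of which duplicate to keep (the latest tran_date here) is only an assumption for illustration:

MERGE INTO merge_data.transactions AS T
USING (
  SELECT ID, TranValue, tran_date
  FROM (
    SELECT ID, TranValue, tran_date,
           ROW_NUMBER() OVER (PARTITION BY ID ORDER BY tran_date DESC) AS rn
    FROM merge_data.merge_source
  ) dedup
  WHERE rn = 1
) AS S
ON T.ID = S.ID
WHEN MATCHED THEN UPDATE SET TranValue = S.TranValue, last_update_user = 'merge_update'
WHEN NOT MATCHED THEN INSERT VALUES (S.ID, S.TranValue, 'merge_insert', S.tran_date);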
08-20-2018
12:57 PM
Please let me know why the merge command below is throwing the error message.

--- Merge command ---

MERGE INTO merge_data.transactions AS T
USING merge_data.merge_source AS S
ON T.ID = S.ID
WHEN MATCHED THEN UPDATE SET TranValue = S.TranValue, last_update_user = 'merge_update'
WHEN NOT MATCHED THEN INSERT VALUES (S.ID, S.TranValue, 'merge_insert', S.tran_date);

--- Error ---

Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: Hive Runtime Error while closing operators: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating cardinality_violation(_col0,_col1)
Caused by: java.lang.RuntimeException: Hive Runtime Error while closing operators: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating cardinality_violation(_col0,_col1)
Caused by: java.lang.RuntimeException: Cardinality Violation in Merge statement: [61, 3, 0],20170415
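As a quick diagnostic for a cardinality violation, a sketch of a query that lists source IDs matching more than one row; merge_data.merge_source and ID are the names from the statement above:

SELECT ID, COUNT(*) AS cnt
FROM merge_data.merge_source
GROUP BY ID
HAVING COUNT(*) > 1;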