Support Questions
Find answers, ask questions, and share your expertise

hive updates very slow

Super Collaborator

hi:

iam executing this but its very slow and not all memory ram are using, and just 1 map, iam running with tez, any suggestions??

INFO  : Map 1: 1/1 Reducer 2: 59(+1)/60 
INFO  : Map 1: 1/1 Reducer 2: 59(+1)/60 
INFO  : Map 1: 1/1 Reducer 2: 59(+1)/60
UPDATE desercion_clientes_temp3 SET PROD='CM' WHERE (COD_LINEA ='01' AND ID_GRP_PD='11') OR (COD_LINEA ='01' AND ID_GRP_PD='13') OR (COD_LINEA ='01' AND ID_GRP_PD='21');
UPDATE desercion_clientes_temp3 SET PROD='INGs' WHERE (COD_LINEA ='01' AND ID_GRP_PD='31') OR (COD_LINEA ='01' AND ID_GRP_PD='32') OR (COD_LINEA ='01' AND ID_GRP_PD='33');
UPDATE desercion_clientes_temp3 SET PROD='AXT' WHERE (COD_LINEA ='01' AND ID_GRP_PD='34') OR (COD_LINEA ='01' AND ID_GRP_PD='35') OR (COD_LINEA ='01' AND ID_GRP_PD='36');
UPDATE desercion_clientes_temp3 SET PROD='CRE' WHERE (COD_LINEA ='01' AND ID_GRP_PD='41');
UPDATE desercion_clientes_temp3 SET PROD='TC'  WHERE (COD_LINEA ='01' AND ID_GRP_PD='50') OR (COD_LINEA ='01' AND ID_GRP_PD='51');
UPDATE desercion_clientes_temp3 SET PROD='PRE' WHERE (COD_LINEA ='01' AND ID_GRP_PD='71');
UPDATE desercion_clientes_temp3 SET PROD='VIST' WHERE (COD_LINEA ='03' AND ID_GRP_PD='11') OR (COD_LINEA ='03' AND ID_GRP_PD='21');
UPDATE desercion_clientes_temp3 SET PROD='PLZO' WHERE (COD_LINEA ='03' AND ID_GRP_PD='51');
UPDATE desercion_clientes_temp3 SET PROD='PLZOES' WHERE (COD_LINEA ='03' AND ID_GRP_PD='52');
UPDATE desercion_clientes_temp3 SET PROD='SUBORD' WHERE (COD_LINEA ='03' AND ID_GRP_PD='92') OR (COD_LINEA ='03' AND ID_GRP_PD='93');
UPDATE desercion_clientes_temp3 SET PROD='VAL' WHERE (COD_LINEA ='04' AND ID_GRP_PD='10') OR (COD_LINEA ='04' AND ID_GRP_PD='20') OR (COD_LINEA ='04' AND ID_GRP_PD='40');
UPDATE desercion_clientes_temp3 SET PROD='PPEN' WHERE (COD_LINEA ='04' AND ID_GRP_PD='30') OR (COD_LINEA ='04' AND ID_GRP_PD='34');
UPDATE desercion_clientes_temp3 SET PROD='FONI' WHERE (COD_LINEA ='04' AND ID_GRP_PD='56');
UPDATE desercion_clientes_temp3 SET PROD='SEGAH' WHERE (COD_LINEA ='04' AND ID_GRP_PD='61');
UPDATE desercion_clientes_temp3 SET PROD='REC' WHERE (COD_LINEA ='05' AND ID_GRP_PD='01');
UPDATE desercion_clientes_temp3 SET PROD='PDOM' WHERE (COD_LINEA ='05' AND ID_GRP_PD='02');
UPDATE desercion_clientes_temp3 SET PROD='EMRV' WHERE (COD_LINEA ='05' AND ID_GRP_PD='11');
UPDATE desercion_clientes_temp3 SET PROD='GCOBRO' WHERE (COD_LINEA ='05' AND ID_GRP_PD='12');
UPDATE desercion_clientes_temp3 SET PROD='EMRECDOM' WHERE (COD_LINEA ='05' AND ID_GRP_PD='13');
UPDATE desercion_clientes_temp3 SET PROD='RECVEN' WHERE (COD_LINEA ='05' AND ID_GRP_PD='14');
UPDATE desercion_clientes_temp3 SET PROD='CONFIR' WHERE (COD_LINEA ='05' AND ID_GRP_PD='15');
UPDATE desercion_clientes_temp3 SET PROD='DOMINGREC' WHERE (COD_LINEA ='05' AND ID_GRP_PD='33') OR (COD_LINEA ='05' AND ID_GRP_PD='31');
UPDATE desercion_clientes_temp3 SET PROD='EMISTR' WHERE (COD_LINEA ='05' AND ID_GRP_PD='41');
UPDATE desercion_clientes_temp3 SET PROD='EMISTRMAS' WHERE (COD_LINEA ='05' AND ID_GRP_PD='42');
UPDATE desercion_clientes_temp3 SET PROD='PRVENT' WHERE (COD_LINEA ='05' AND ID_GRP_PD='43');
UPDATE desercion_clientes_temp3 SET PROD='PRDOM' WHERE (COD_LINEA ='05' AND ID_GRP_PD='44');
UPDATE desercion_clientes_temp3 SET PROD='ACPAG' WHERE (COD_LINEA ='05' AND ID_GRP_PD='45');
UPDATE desercion_clientes_temp3 SET PROD='TD' WHERE (COD_LINEA ='05' AND ID_GRP_PD='50') OR (COD_LINEA ='05' AND ID_GRP_PD='51');
UPDATE desercion_clientes_temp3 SET PROD='ACCOMER' WHERE (COD_LINEA ='05' AND ID_GRP_PD='52');
UPDATE desercion_clientes_temp3 SET PROD='TPR' WHERE (COD_LINEA ='05' AND ID_GRP_PD='53');
UPDATE desercion_clientes_temp3 SET PROD='GCEXT' WHERE (COD_LINEA ='05' AND ID_GRP_PD='70') OR (COD_LINEA ='05' AND ID_GRP_PD='71') OR (COD_LINEA ='05' AND ID_GRP_PD='72');
UPDATE desercion_clientes_temp3 SET PROD='SEGVIDA' WHERE (COD_LINEA ='05' AND ID_GRP_PD='81');
UPDATE desercion_clientes_temp3 SET PROD='SEGRIESGO' WHERE (COD_LINEA ='05' AND ID_GRP_PD='82') OR (COD_LINEA ='05' AND ID_GRP_PD='93');

 

4956-snip20160610-3.png

1 ACCEPTED SOLUTION

Accepted Solutions

Re: hive updates very slow

Expert Contributor

The parallelism of the write side is equal to the number of buckets for the table.

For the read side, there was (is) a bug with split computation on Tez. If you do "dfs -lsr" on the table directory, do you see any files called base_xxxxxx? If not, you may want to run a Major compaction. It will generate a base_xxxx file and then your subsequent Updates should have > 1 split. (this was fixed in https://issues.apache.org/jira/browse/HIVE-13821 and requires Tez 0.7)

Alternatively, run the update statement on MR - it doesn't have split calculation issue.

View solution in original post

5 REPLIES 5

Re: hive updates very slow

data set size please?

Re: hive updates very slow

Super Collaborator

from hue i can see that

totalSize 787795564

Re: hive updates very slow

Super Collaborator

Ho can i control the map and reduce with hive or tex paralell???

Re: hive updates very slow

Expert Contributor

The parallelism of the write side is equal to the number of buckets for the table.

For the read side, there was (is) a bug with split computation on Tez. If you do "dfs -lsr" on the table directory, do you see any files called base_xxxxxx? If not, you may want to run a Major compaction. It will generate a base_xxxx file and then your subsequent Updates should have > 1 split. (this was fixed in https://issues.apache.org/jira/browse/HIVE-13821 and requires Tez 0.7)

Alternatively, run the update statement on MR - it doesn't have split calculation issue.

View solution in original post

Re: hive updates very slow

Super Collaborator

colud you please type the command example to run a compaction for my table adn bucket in ORC format??

CREATE EXTERNAL TABLE IF NOT EXISTS canal( CODTF string, CANAL string, FECHAOPRCNF string, FRECUENCIA int, CODNRBEENF string ) COMMENT 'A buckered table canal by CODNRBEENF' CLUSTERED BY (CODNRBEENF) SORTED BY (FECHAOPRCNF) INTO 60 BUCKETS stored as ORC LOCATION '/RSI/tables/logs/canal' TBLPROPERTIES ("immutable"="false", "orc.create.index"="true","orc.compress"="ZLIB", "orc.stripe.size"="67108864","orc.row.index.stride"="25000");