Created on 06-10-2016 06:44 PM - edited 08-19-2019 02:09 AM
hi:
iam executing this but its very slow and not all memory ram are using, and just 1 map, iam running with tez, any suggestions??
INFO : Map 1: 1/1 Reducer 2: 59(+1)/60 INFO : Map 1: 1/1 Reducer 2: 59(+1)/60 INFO : Map 1: 1/1 Reducer 2: 59(+1)/60
UPDATE desercion_clientes_temp3 SET PROD='CM' WHERE (COD_LINEA ='01' AND ID_GRP_PD='11') OR (COD_LINEA ='01' AND ID_GRP_PD='13') OR (COD_LINEA ='01' AND ID_GRP_PD='21'); UPDATE desercion_clientes_temp3 SET PROD='INGs' WHERE (COD_LINEA ='01' AND ID_GRP_PD='31') OR (COD_LINEA ='01' AND ID_GRP_PD='32') OR (COD_LINEA ='01' AND ID_GRP_PD='33'); UPDATE desercion_clientes_temp3 SET PROD='AXT' WHERE (COD_LINEA ='01' AND ID_GRP_PD='34') OR (COD_LINEA ='01' AND ID_GRP_PD='35') OR (COD_LINEA ='01' AND ID_GRP_PD='36'); UPDATE desercion_clientes_temp3 SET PROD='CRE' WHERE (COD_LINEA ='01' AND ID_GRP_PD='41'); UPDATE desercion_clientes_temp3 SET PROD='TC' WHERE (COD_LINEA ='01' AND ID_GRP_PD='50') OR (COD_LINEA ='01' AND ID_GRP_PD='51'); UPDATE desercion_clientes_temp3 SET PROD='PRE' WHERE (COD_LINEA ='01' AND ID_GRP_PD='71'); UPDATE desercion_clientes_temp3 SET PROD='VIST' WHERE (COD_LINEA ='03' AND ID_GRP_PD='11') OR (COD_LINEA ='03' AND ID_GRP_PD='21'); UPDATE desercion_clientes_temp3 SET PROD='PLZO' WHERE (COD_LINEA ='03' AND ID_GRP_PD='51'); UPDATE desercion_clientes_temp3 SET PROD='PLZOES' WHERE (COD_LINEA ='03' AND ID_GRP_PD='52'); UPDATE desercion_clientes_temp3 SET PROD='SUBORD' WHERE (COD_LINEA ='03' AND ID_GRP_PD='92') OR (COD_LINEA ='03' AND ID_GRP_PD='93'); UPDATE desercion_clientes_temp3 SET PROD='VAL' WHERE (COD_LINEA ='04' AND ID_GRP_PD='10') OR (COD_LINEA ='04' AND ID_GRP_PD='20') OR (COD_LINEA ='04' AND ID_GRP_PD='40'); UPDATE desercion_clientes_temp3 SET PROD='PPEN' WHERE (COD_LINEA ='04' AND ID_GRP_PD='30') OR (COD_LINEA ='04' AND ID_GRP_PD='34'); UPDATE desercion_clientes_temp3 SET PROD='FONI' WHERE (COD_LINEA ='04' AND ID_GRP_PD='56'); UPDATE desercion_clientes_temp3 SET PROD='SEGAH' WHERE (COD_LINEA ='04' AND ID_GRP_PD='61'); UPDATE desercion_clientes_temp3 SET PROD='REC' WHERE (COD_LINEA ='05' AND ID_GRP_PD='01'); UPDATE desercion_clientes_temp3 SET PROD='PDOM' WHERE (COD_LINEA ='05' AND ID_GRP_PD='02'); UPDATE desercion_clientes_temp3 SET PROD='EMRV' WHERE (COD_LINEA ='05' AND ID_GRP_PD='11'); UPDATE desercion_clientes_temp3 SET PROD='GCOBRO' WHERE (COD_LINEA ='05' AND ID_GRP_PD='12'); UPDATE desercion_clientes_temp3 SET PROD='EMRECDOM' WHERE (COD_LINEA ='05' AND ID_GRP_PD='13'); UPDATE desercion_clientes_temp3 SET PROD='RECVEN' WHERE (COD_LINEA ='05' AND ID_GRP_PD='14'); UPDATE desercion_clientes_temp3 SET PROD='CONFIR' WHERE (COD_LINEA ='05' AND ID_GRP_PD='15'); UPDATE desercion_clientes_temp3 SET PROD='DOMINGREC' WHERE (COD_LINEA ='05' AND ID_GRP_PD='33') OR (COD_LINEA ='05' AND ID_GRP_PD='31'); UPDATE desercion_clientes_temp3 SET PROD='EMISTR' WHERE (COD_LINEA ='05' AND ID_GRP_PD='41'); UPDATE desercion_clientes_temp3 SET PROD='EMISTRMAS' WHERE (COD_LINEA ='05' AND ID_GRP_PD='42'); UPDATE desercion_clientes_temp3 SET PROD='PRVENT' WHERE (COD_LINEA ='05' AND ID_GRP_PD='43'); UPDATE desercion_clientes_temp3 SET PROD='PRDOM' WHERE (COD_LINEA ='05' AND ID_GRP_PD='44'); UPDATE desercion_clientes_temp3 SET PROD='ACPAG' WHERE (COD_LINEA ='05' AND ID_GRP_PD='45'); UPDATE desercion_clientes_temp3 SET PROD='TD' WHERE (COD_LINEA ='05' AND ID_GRP_PD='50') OR (COD_LINEA ='05' AND ID_GRP_PD='51'); UPDATE desercion_clientes_temp3 SET PROD='ACCOMER' WHERE (COD_LINEA ='05' AND ID_GRP_PD='52'); UPDATE desercion_clientes_temp3 SET PROD='TPR' WHERE (COD_LINEA ='05' AND ID_GRP_PD='53'); UPDATE desercion_clientes_temp3 SET PROD='GCEXT' WHERE (COD_LINEA ='05' AND ID_GRP_PD='70') OR (COD_LINEA ='05' AND ID_GRP_PD='71') OR (COD_LINEA ='05' AND ID_GRP_PD='72'); UPDATE desercion_clientes_temp3 SET PROD='SEGVIDA' WHERE (COD_LINEA ='05' AND ID_GRP_PD='81'); UPDATE desercion_clientes_temp3 SET PROD='SEGRIESGO' WHERE (COD_LINEA ='05' AND ID_GRP_PD='82') OR (COD_LINEA ='05' AND ID_GRP_PD='93');
Created 06-13-2016 06:32 PM
The parallelism of the write side is equal to the number of buckets for the table.
For the read side, there was (is) a bug with split computation on Tez. If you do "dfs -lsr" on the table directory, do you see any files called base_xxxxxx? If not, you may want to run a Major compaction. It will generate a base_xxxx file and then your subsequent Updates should have > 1 split. (this was fixed in https://issues.apache.org/jira/browse/HIVE-13821 and requires Tez 0.7)
Alternatively, run the update statement on MR - it doesn't have split calculation issue.
Created 06-10-2016 07:33 PM
data set size please?
Created 06-10-2016 09:34 PM
from hue i can see that
totalSize 787795564
Created 06-12-2016 05:13 PM
Ho can i control the map and reduce with hive or tex paralell???
Created 06-13-2016 06:32 PM
The parallelism of the write side is equal to the number of buckets for the table.
For the read side, there was (is) a bug with split computation on Tez. If you do "dfs -lsr" on the table directory, do you see any files called base_xxxxxx? If not, you may want to run a Major compaction. It will generate a base_xxxx file and then your subsequent Updates should have > 1 split. (this was fixed in https://issues.apache.org/jira/browse/HIVE-13821 and requires Tez 0.7)
Alternatively, run the update statement on MR - it doesn't have split calculation issue.
Created 06-14-2016 06:43 AM
colud you please type the command example to run a compaction for my table adn bucket in ORC format??
CREATE EXTERNAL TABLE IF NOT EXISTS canal( CODTF string, CANAL string, FECHAOPRCNF string, FRECUENCIA int, CODNRBEENF string ) COMMENT 'A buckered table canal by CODNRBEENF' CLUSTERED BY (CODNRBEENF) SORTED BY (FECHAOPRCNF) INTO 60 BUCKETS stored as ORC LOCATION '/RSI/tables/logs/canal' TBLPROPERTIES ("immutable"="false", "orc.create.index"="true","orc.compress"="ZLIB", "orc.stripe.size"="67108864","orc.row.index.stride"="25000");