Hive Compactor doesn't launch the Cleaner

Hello,

We are having problems with the Hive Compactor. The message "Max block location exceeded for split" appears in hivemetastore.log, and it is appearing more and more often.
After that, the "compactor.Cleaner" is not launched.

We observed that since a Hive Metastore restart, the "compactor.Cleaner" has never been launched again, but the logs don't show any message about it.

Could this be a degradation of the Hive Compactor as delta files accumulate in the partitions?

Regards.

Hive 3.1.0
Hadoop 3.1.1

hive.compactor.initiator.on = true 
hive.compactor.job.queue = compactions 
hive.compactor.worker.threads = 6 
hive.metastore.thrift.compact.protocol.enabled = true 
hive.compactor.check.interval = 120s 
hive.compactor.delta.num.threshold = 5 
hive.compactor.cleaner.run.interval = 5s 
hive.compactor.worker.timeout = 300s 
hive.compactor.max.num.delta = 100 
hive.compactor.abortedtxn.threshold = 100
hive.compactor.initiator.failed.compacts.threshold = 10
hive.compactor.history.retention.succeeded = 5
hive.compactor.history.retention.failed = 5
hive.compactor.history.retention.attempted = 5

The Initiator is not running automatically (I don't understand why), so I launched a major compaction manually:

2021-04-27T13:41:38,542 INFO  [bi-hdp3-master-44]: compactor.Worker (Worker.java:run(165)) - Starting MAJOR compaction for sta_1.draco_cdrs.fecha_dia=20210421
2021-04-27T13:41:40,955 INFO  [bi-hdp3-master-44]: compactor.CompactorMR (CompactorMR.java:launchCompactionJob(578)) - Submitting MAJOR compaction job 'bi-hdp3-master-44-compactor-sta_1.draco_cdrs.fecha_dia=20210421' to compactions queue.  (current delta dirs count=2, obsolete delta dirs count=50. TxnIdRange[7931,7931]
2021-04-27T13:41:42,061 INFO  [bi-hdp3-master-44]: compactor.CompactorMR (CompactorMR.java:launchCompactionJob(586)) - Submitted compaction job 'bi-hdp3-master-44-compactor-sta_1.draco_cdrs.fecha_dia=20210421' with jobID=job_1619518348886_0107 compaction ID=48704
2021-04-27T13:42:02,096 INFO  [bi-hdp3-master-44]: compactor.Worker$StatsUpdater (Worker.java:gatherStats(299)) - id:48704,dbname:sta_1,tableName:draco_cdrs,partName:fecha_dia=20210421,state:,type:MAJOR,properties:null,runAs:null,tooManyAborts:false,highestWriteId:0: running 'analyze table sta_1.draco_cdrs partition(fecha_dia='20210421') compute statistics for columns cdr_direccion,rngroup_tar_id,rngroup_area_negocio_id,cdr_cluster_id,acdr_publisher_id,cdr_landing_id,cdr_subtipo,cdr_duracion,cdr_numeroa_tipo,cdr_numeroc_pais,cdr_nord_id,cdr_numero_num_id,cdr_numero_publico,cdr_event_source,cdr_landings_session_id,cdr_fecha_hora,cdr_merchant_id,cdr_detection_channel,acdr_device_version_id,cdr_payment_method_country,cdr_producto_id,cdr_devicetype,cdr_original_network,cdr_pricepoint_currency,cdr_isp,cdr_pricepoint_pvp,cdr_info,cdr_duracion_fact,cdr_numeroa,cdr_cliente_cli_id,cdr_tarifa_franja,cdr_tipo_evento,cdr_evento_tarificable,cdr_cashflow,rngroup_id,rngroup_pub_medio_id,rngroup_product_id,acdr_device_id,cdr_pgab_id,cdr_numeroa_provincia,cdr_numeroa_pais,cdr_ggab_id,cdr_pais,cdr_devicename,cdr_subs_id,cdr_tipo,cdr_gabinete_costes,cdr_subs_fecha_alta,cdr_fecha,cdr_sord_id,cdr_fecha_creacion,cdr_numerob,cdr_numerob_pais,cdr_operador_divisa,cdr_empresa_editora_id,cdr_keyword,cdr_first_event_user,cdr_operador_refunds,cdr_numeroc_provincia,cdr_operador_costes,cdr_pvp,cdr_network,cdr_id,cdr_cliente_ingresos,cdr_cliente_costes,cdr_event_id,rngroup_pub_id,cdr_operador,cdr_numerob_provincia,cdr_numerob_tipo,cdr_numeroc,cdr_numeroc_tipo,cdr_numeroc_tarifa_id,cdr_operador_ingresos,cdr_cliente_divisa,cdr_ser_id,cdr_traducido,cdr_empresa_emp_id,cdr_grenlaces_gren_id,cdr_operador_ope_id,cdr_tarifa_tar_id,cdr_gab_terminacion,cdr_subs_operacion,cdr_subs_cobro_completo,cdr_app_id,cdr_subpub_id,cdr_design_id,cdr_website_id,cdr_fecha_baja_diferida,rngroup_pub_tar_id,acdr_device_os_id,acdr_app_id,acdr_campaign_id,acdr_adset_id,acdr_ad_id,acdr_pricing_plan_id'

 

There are no errors in the compaction job (5 map attempts OK) and it finished successfully. In "show compactions" I can see the status "ready for cleaning", but the Cleaner is never launched.
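
For reference, the manual compaction and the status check described here correspond to HiveQL statements of roughly this form. This is a sketch only: the table and partition names are taken from the Worker log above, and the exact statement that was run is not shown in this post.

```sql
-- Queue a major compaction for one partition
-- (table and partition names taken from the Worker log above).
ALTER TABLE sta_1.draco_cdrs PARTITION (fecha_dia='20210421') COMPACT 'major';

-- List queued, running and finished compactions with their state
-- (for example "initiated", "working", "ready for cleaning").
SHOW COMPACTIONS;
```

A compaction that stays in "ready for cleaning" has finished its job and is waiting for the Cleaner thread to remove the obsolete delta and base directories.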

 

And finally, in S3 it has added one more base file to the partition, but it hasn't deleted the delta files or the other base:

(Attached screenshot: 2021-04-27_14-00-04.png)

Two questions:

- Why is the Initiator not running automatically?

- Why is the Cleaner never launched?

Any suggestions, please?

1 REPLY

Solved

We moved the hive.compactor.initiator.on setting from the HiveServer2 configuration to the Hive Metastore configuration:

hive.compactor.initiator.on = true
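
As a sketch of the metastore-side configuration after this change: the Hive documentation describes hive.compactor.initiator.on as controlling both the Initiator and the Cleaner threads on a metastore instance, which would explain why neither ran while the property was set only on HiveServer2. The worker thread count below is the value from the question. Placing it in the metastore configuration as well is an assumption based on the same documentation, not something stated in this thread.

```properties
# Hive Metastore configuration (not HiveServer2)
# Enables the Initiator and Cleaner threads on this metastore instance.
hive.compactor.initiator.on = true
# Number of compactor Worker threads on this metastore instance
# (value from the question; placement here is an assumption).
hive.compactor.worker.threads = 6
```

The Metastore service has to be restarted for these settings to take effect.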