Hive error: java.lang.OutOfMemoryError: Java heap space

Hi All,

Could you please help us resolve the issue below?

We are running the script below and it fails with an error. The script and the error log are enclosed for reference.

hive -e "set hive.merge.tezfiles=true;
set hive.merge.mapfiles=true;
set hive.merge.mapredfiles=true;
set tez.queue.name=BIAdv;
set hive.execution.engine=tez;
set hive.vectorized.execution.enabled=true;
set hive.vectorized.execution.reduce.enabled=true;
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.exec.max.dynamic.partitions.pernode=20000;
set hive.exec.max.dynamic.partitions=100000;
set hive.merge.size.per.task=134217724;
set hive.merge.smallfiles.avgsize=134217724;
set tez.grouping.split-count=1;
INSERT OVERWRITE TABLE <table-name1> PARTITION (reported_date,last_usage_date)
SELECT * FROM <table-name1> WHERE reported_date='2022-11-16' AND last_usage_date='2022-04-10';"
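
Settings passed inline through hive -e are easy to mistype, for example with a space before the = sign or a doubled ==, and such a setting may not be applied as intended. Below is a minimal sketch of a pre-flight check for the set statements. The helper name lint_set_statements is hypothetical and not part of Hive, and the naive split on ";" assumes that no value contains a semicolon.

```python
import re
from typing import List

# Hypothetical helper (not part of Hive): flags `set key=value` statements
# whose separator is anything other than a single bare "=".
SET_RE = re.compile(
    r"^set\s+(?P<key>[A-Za-z0-9_.\-]+)(?P<sep>\s*=+\s*)(?P<value>.*)$",
    re.IGNORECASE,
)


def lint_set_statements(script: str) -> List[str]:
    """Return a description of every suspicious `set` statement in the script."""
    problems = []
    # Assumption: statements are separated by ";" and values contain no ";".
    for raw in script.split(";"):
        stmt = " ".join(raw.split())  # collapse newlines and repeated spaces
        if not stmt.lower().startswith("set "):
            continue  # skip the INSERT/SELECT statement and empty fragments
        match = SET_RE.match(stmt)
        if match is None:
            problems.append("unparsed: " + stmt)
        elif match.group("sep") != "=":
            problems.append("suspicious separator: " + stmt)
    return problems


if __name__ == "__main__":
    example = (
        "set hive.execution.engine=tez;\n"
        "set hive.vectorized.execution.enabled =true;\n"
        "set hive.vectorized.execution.reduce.enabled ==true;\n"
    )
    for problem in lint_set_statements(example):
        print(problem)
```

A check like this only looks at the shape of each statement. It does not verify that a property name exists or that its value is sensible for the cluster.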

 

Error log:

----------------------------------------------------------------------------------------------
ERROR : Status: Failed
ERROR : Vertex failed, vertexName=Map 1, vertexId=vertex_1668260139179_120620_1_00, diagnostics=[Task failed, taskId=task_1668260139179_120620_1_00_000002, diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task ( failure ) : java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:3332)
at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:124)
at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:448)
at java.lang.StringBuilder.append(StringBuilder.java:136)
at org.apache.hadoop.hive.serde2.ColumnProjectionUtils.appendReadColumnNames(ColumnProjectionUtils.java:239)
at org.apache.hadoop.hive.serde2.ColumnProjectionUtils.appendReadColumns(ColumnProjectionUtils.java:163)
at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:964)
at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:409)
at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:203)
at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:152)
at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:69)
at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
, errorMessage=Cannot recover from this error:java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:3332)
at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:124)
at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:448)
at java.lang.StringBuilder.append(StringBuilder.java:136)
at org.apache.hadoop.hive.serde2.ColumnProjectionUtils.appendReadColumnNames(ColumnProjectionUtils.java:239)
at org.apache.hadoop.hive.serde2.ColumnProjectionUtils.appendReadColumns(ColumnProjectionUtils.java:163)
at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:964)
at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:409)
at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:203)
at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:152)
at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:69)
at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:6, Vertex vertex_1668260139179_120620_1_00 [Map 1] killed/failed due to:OWN_TASK_FAILURE]
ERROR : Vertex killed, vertexName=Reducer 2, vertexId=vertex_1668260139179_120620_1_01, diagnostics=[Vertex received Kill while in RUNNING state., Vertex did not succeed due to OTHER_VERTEX_FAILURE, failedTasks:0 killedTasks:1009, Vertex vertex_1668260139179_120620_1_01 [Reducer 2] killed/failed due to:OTHER_VERTEX_FAILURE]
ERROR : DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:1
INFO : org.apache.tez.common.counters.DAGCounter:
INFO : NUM_FAILED_TASKS: 1
INFO : NUM_KILLED_TASKS: 6
INFO : NUM_SUCCEEDED_TASKS: 7
INFO : TOTAL_LAUNCHED_TASKS: 14
INFO : OTHER_LOCAL_TASKS: 2
INFO : RACK_LOCAL_TASKS: 5
INFO : AM_CPU_MILLISECONDS: 381090
INFO : AM_GC_TIME_MILLIS: 1830
INFO : File System Counters:
INFO : FILE_BYTES_READ: 339136
INFO : FILE_BYTES_WRITTEN: 4844738
INFO : HDFS_BYTES_READ: 4005665433
INFO : HDFS_BYTES_WRITTEN: 1590596979
INFO : HDFS_READ_OPS: 71884
INFO : HDFS_WRITE_OPS: 2638
INFO : HDFS_OP_CREATE: 1183
INFO : HDFS_OP_GET_FILE_STATUS: 5887
INFO : HDFS_OP_MKDIRS: 279
INFO : HDFS_OP_OPEN: 65997
INFO : HDFS_OP_RENAME: 1176
INFO : org.apache.tez.common.counters.TaskCounter:
INFO : SPILLED_RECORDS: 1176
INFO : GC_TIME_MILLIS: 3749370
INFO : TASK_DURATION_MILLIS: 10490948
INFO : CPU_MILLISECONDS: 35432800
INFO : PHYSICAL_MEMORY_BYTES: 46170898432
INFO : VIRTUAL_MEMORY_BYTES: 65302401024
INFO : COMMITTED_HEAP_BYTES: 46170898432
INFO : INPUT_RECORDS_PROCESSED: 46000210
INFO : INPUT_SPLIT_LENGTH_BYTES: 3020210217
INFO : OUTPUT_RECORDS: 1176
INFO : OUTPUT_LARGE_RECORDS: 0
INFO : OUTPUT_BYTES: 8054528
INFO : OUTPUT_BYTES_WITH_OVERHEAD: 8065910
INFO : OUTPUT_BYTES_PHYSICAL: 4675170
INFO : ADDITIONAL_SPILLS_BYTES_WRITTEN: 0
INFO : ADDITIONAL_SPILLS_BYTES_READ: 0
INFO : ADDITIONAL_SPILL_COUNT: 0
INFO : SHUFFLE_CHUNK_COUNT: 7
INFO : HIVE:
INFO : CREATED_DYNAMIC_PARTITIONS: 110
INFO : CREATED_FILES: 1176
INFO : DESERIALIZE_ERRORS: 0
INFO : RECORDS_IN_Map_1: 46000210
INFO : RECORDS_OUT_1_dim_cd_db.dim_vas_ppu_subs_base: 46000210
INFO : RECORDS_OUT_INTERMEDIATE_Map_1: 1176
INFO : RECORDS_OUT_OPERATOR_FS_3: 46000210
INFO : RECORDS_OUT_OPERATOR_GBY_6: 1176
INFO : RECORDS_OUT_OPERATOR_MAP_0: 0
INFO : RECORDS_OUT_OPERATOR_RS_7: 1176
INFO : RECORDS_OUT_OPERATOR_SEL_2: 46000210
INFO : RECORDS_OUT_OPERATOR_SEL_5: 46000210
INFO : RECORDS_OUT_OPERATOR_TS_0: 46000210
INFO : TaskCounter_Map_1_INPUT_dim_vas_ppu_subs_base:
INFO : INPUT_RECORDS_PROCESSED: 46000210
INFO : INPUT_SPLIT_LENGTH_BYTES: 3020210217
INFO : TaskCounter_Map_1_OUTPUT_Reducer_2:
INFO : ADDITIONAL_SPILLS_BYTES_READ: 0
INFO : ADDITIONAL_SPILLS_BYTES_WRITTEN: 0
INFO : ADDITIONAL_SPILL_COUNT: 0
INFO : OUTPUT_BYTES: 8054528
INFO : OUTPUT_BYTES_PHYSICAL: 4675170
INFO : OUTPUT_BYTES_WITH_OVERHEAD: 8065910
INFO : OUTPUT_LARGE_RECORDS: 0
INFO : OUTPUT_RECORDS: 1176
INFO : SHUFFLE_CHUNK_COUNT: 7
INFO : SPILLED_RECORDS: 1176
INFO : org.apache.hadoop.hive.ql.exec.tez.HiveInputCounters:
INFO : GROUPED_INPUT_SPLITS_Map_1: 14
INFO : INPUT_DIRECTORIES_Map_1: 169
INFO : INPUT_FILES_Map_1: 412191
INFO : RAW_INPUT_SPLITS_Map_1: 412191
ERROR : FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map 1, vertexId=vertex_1668260139179_120620_1_00, diagnostics=[Task failed, taskId=task_1668260139179_120620_1_00_000002, diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task ( failure ) : java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:3332)
at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:124)
at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:448)
at java.lang.StringBuilder.append(StringBuilder.java:136)
at org.apache.hadoop.hive.serde2.ColumnProjectionUtils.appendReadColumnNames(ColumnProjectionUtils.java:239)
at org.apache.hadoop.hive.serde2.ColumnProjectionUtils.appendReadColumns(ColumnProjectionUtils.java:163)
at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:964)
at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:409)
at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:203)
at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:152)
at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:69)
at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
, errorMessage=Cannot recover from this error:java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:3332)
at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:124)
at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:448)
at java.lang.StringBuilder.append(StringBuilder.java:136)
at org.apache.hadoop.hive.serde2.ColumnProjectionUtils.appendReadColumnNames(ColumnProjectionUtils.java:239)
at org.apache.hadoop.hive.serde2.ColumnProjectionUtils.appendReadColumns(ColumnProjectionUtils.java:163)
at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:964)
at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:409)
at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:203)
at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:152)
at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:69)
at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:6, Vertex vertex_1668260139179_120620_1_00 [Map 1] killed/failed due to:OWN_TASK_FAILURE]Vertex killed, vertexName=Reducer 2, vertexId=vertex_1668260139179_120620_1_01, diagnostics=[Vertex received Kill while in RUNNING state., Vertex did not succeed due to OTHER_VERTEX_FAILURE, failedTasks:0 killedTasks:1009, Vertex vertex_1668260139179_120620_1_01 [Reducer 2] killed/failed due to:OTHER_VERTEX_FAILURE]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:1
INFO : Completed executing command(queryId=hive_20221121225133_cda2cb8e-97d3-48be-a1df-c0cd86b00ac5); Time taken: 7826.447 seconds
INFO : Compiling command(queryId=hive_20221121225133_cda2cb8e-97d3-48be-a1df-c0cd86b00ac5): INSERT overwrite TABLE dim_cd_db.dim_vas_ppu_subs_base partition (reported_date,last_usage_date) SELECT * from dim_cd_db.dim_vas_ppu_subs_base where reported_date='2022-09-25'
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:dim_vas_ppu_subs_base.subs_msisdn, type:varchar(25), comment:null), FieldSchema(name:dim_vas_ppu_subs_base.circle_id, type:varchar(4), comment:null), FieldSchema(name:dim_vas_ppu_subs_base.subs_key, type:varchar(25), comment:null), FieldSchema(name:dim_vas_ppu_subs_base.pre_post_ind, type:varchar(2), comment:null), FieldSchema(name:dim_vas_ppu_subs_base.act_mdt_cdr_id_key, type:varchar(70), comment:null), FieldSchema(name:dim_vas_ppu_subs_base.called_calling_number, type:varchar(25), comment:null), FieldSchema(name:dim_vas_ppu_subs_base.short_code, type:varchar(25), comment:null), FieldSchema(name:dim_vas_ppu_subs_base.discovery_bearer, type:varchar(6), comment:null), FieldSchema(name:dim_vas_ppu_subs_base.service_id, type:varchar(30), comment:null), FieldSchema(name:dim_vas_ppu_subs_base.service_sub_sub_type_id, type:varchar(30), comment:null), FieldSchema(name:dim_vas_ppu_subs_base.content_partner_code, type:varchar(30), comment:null), FieldSchema(name:dim_vas_ppu_subs_base.activation_price_amt, type:decimal(14,6), comment:null), FieldSchema(name:dim_vas_ppu_subs_base.current_price_amt, type:decimal(14,6), comment:null), FieldSchema(name:dim_vas_ppu_subs_base.activation_start_date, type:date, comment:null), FieldSchema(name:dim_vas_ppu_subs_base.activation_end_date, type:date, comment:null), FieldSchema(name:dim_vas_ppu_subs_base.last_status_updt_date, type:date, comment:null), FieldSchema(name:dim_vas_ppu_subs_base.load_timestamp, type:timestamp, comment:null), FieldSchema(name:dim_vas_ppu_subs_base.reported_date, type:date, comment:null), FieldSchema(name:dim_vas_ppu_subs_base.last_usage_date, type:date, comment:null)], properties:null)
INFO : Completed compiling command(queryId=hive_20221121225133_cda2cb8e-97d3-48be-a1df-c0cd86b00ac5); Time taken: 1.779 seconds
INFO : Executing command(queryId=hive_20221121225133_cda2cb8e-97d3-48be-a1df-c0cd86b00ac5): INSERT overwrite TABLE dim_cd_db.dim_vas_ppu_subs_base partition (reported_date,last_usage_date) SELECT * from dim_cd_db.dim_vas_ppu_subs_base where reported_date='2022-09-25'
INFO : Query ID = hive_20221121225133_cda2cb8e-97d3-48be-a1df-c0cd86b00ac5
INFO : Total jobs = 3
INFO : Launching Job 1 out of 3
INFO : Starting task [Stage-1:MAPRED] in serial mode
INFO : Subscribed to counters: [] for queryId: hive_20221121225133_cda2cb8e-97d3-48be-a1df-c0cd86b00ac5
INFO : Tez session hasn't been created yet. Opening session
INFO : Dag name: INSERT overwrite TAB...ed_date='2022-09-25' (Stage-1)
INFO : Status: Running (Executing on YARN cluster with App id application_1668260139179_121442)

ERROR : Status: Failed
ERROR : Vertex failed, vertexName=Map 1, vertexId=vertex_1668260139179_121442_1_00, diagnostics=[Task failed, taskId=task_1668260139179_121442_1_00_000002, diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task ( failure ) : java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:3332)
at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:124)
at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:448)
at java.lang.StringBuilder.append(StringBuilder.java:136)
at org.apache.hadoop.hive.serde2.ColumnProjectionUtils.appendReadColumnNames(ColumnProjectionUtils.java:239)
at org.apache.hadoop.hive.serde2.ColumnProjectionUtils.appendReadColumns(ColumnProjectionUtils.java:163)
at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:964)
at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:409)
at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:203)
at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:152)
at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:69)
at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
, errorMessage=Cannot recover from this error:java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:3332)
at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:124)
at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:448)
at java.lang.StringBuilder.append(StringBuilder.java:136)
at org.apache.hadoop.hive.serde2.ColumnProjectionUtils.appendReadColumnNames(ColumnProjectionUtils.java:239)
at org.apache.hadoop.hive.serde2.ColumnProjectionUtils.appendReadColumns(ColumnProjectionUtils.java:163)
at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:964)
at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:409)
at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:203)
at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:152)
at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:69)
at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:6, Vertex vertex_1668260139179_121442_1_00 [Map 1] killed/failed due to:OWN_TASK_FAILURE]
ERROR : Vertex killed, vertexName=Reducer 2, vertexId=vertex_1668260139179_121442_1_01, diagnostics=[Vertex received Kill while in RUNNING state., Vertex did not succeed due to OTHER_VERTEX_FAILURE, failedTasks:0 killedTasks:2, Vertex vertex_1668260139179_121442_1_01 [Reducer 2] killed/failed due to:OTHER_VERTEX_FAILURE]
ERROR : DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:1
INFO : org.apache.tez.common.counters.DAGCounter:
INFO : NUM_FAILED_TASKS: 1
INFO : NUM_KILLED_TASKS: 8
INFO : NUM_SUCCEEDED_TASKS: 7
INFO : TOTAL_LAUNCHED_TASKS: 16
INFO : OTHER_LOCAL_TASKS: 3
INFO : RACK_LOCAL_TASKS: 4
INFO : AM_CPU_MILLISECONDS: 344940
INFO : AM_GC_TIME_MILLIS: 1404
INFO : File System Counters:
INFO : FILE_BYTES_READ: 784
INFO : FILE_BYTES_WRITTEN: 4249582
INFO : HDFS_BYTES_READ: 4005665433
INFO : HDFS_BYTES_WRITTEN: 1590596809
INFO : HDFS_READ_OPS: 71884
INFO : HDFS_WRITE_OPS: 2658
INFO : HDFS_OP_CREATE: 1183
INFO : HDFS_OP_GET_FILE_STATUS: 5887
INFO : HDFS_OP_MKDIRS: 299
INFO : HDFS_OP_OPEN: 65997
INFO : HDFS_OP_RENAME: 1176
INFO : org.apache.tez.common.counters.TaskCounter:
INFO : SPILLED_RECORDS: 1176
INFO : GC_TIME_MILLIS: 2605189
INFO : TASK_DURATION_MILLIS: 8536033
INFO : CPU_MILLISECONDS: 33374430
INFO : PHYSICAL_MEMORY_BYTES: 45646610432
INFO : VIRTUAL_MEMORY_BYTES: 65281306624
INFO : COMMITTED_HEAP_BYTES: 45646610432
INFO : INPUT_RECORDS_PROCESSED: 46000210
INFO : INPUT_SPLIT_LENGTH_BYTES: 3020733963
INFO : OUTPUT_RECORDS: 1176
INFO : OUTPUT_LARGE_RECORDS: 0
INFO : OUTPUT_BYTES: 8054528
INFO : OUTPUT_BYTES_WITH_OVERHEAD: 8059316
INFO : OUTPUT_BYTES_PHYSICAL: 4249190
INFO : ADDITIONAL_SPILLS_BYTES_WRITTEN: 0
INFO : ADDITIONAL_SPILLS_BYTES_READ: 0
INFO : ADDITIONAL_SPILL_COUNT: 0
INFO : SHUFFLE_CHUNK_COUNT: 7
INFO : HIVE:
INFO : CREATED_DYNAMIC_PARTITIONS: 114
INFO : CREATED_FILES: 1176
INFO : DESERIALIZE_ERRORS: 0
INFO : RECORDS_IN_Map_1: 46000210
INFO : RECORDS_OUT_1_dim_cd_db.dim_vas_ppu_subs_base: 46000210
INFO : RECORDS_OUT_INTERMEDIATE_Map_1: 1176
INFO : RECORDS_OUT_OPERATOR_FS_3: 46000210
INFO : RECORDS_OUT_OPERATOR_GBY_6: 1176
INFO : RECORDS_OUT_OPERATOR_MAP_0: 0
INFO : RECORDS_OUT_OPERATOR_RS_7: 1176
INFO : RECORDS_OUT_OPERATOR_SEL_2: 46000210
INFO : RECORDS_OUT_OPERATOR_SEL_5: 46000210
INFO : RECORDS_OUT_OPERATOR_TS_0: 46000210
INFO : TaskCounter_Map_1_INPUT_dim_vas_ppu_subs_base:
INFO : INPUT_RECORDS_PROCESSED: 46000210
INFO : INPUT_SPLIT_LENGTH_BYTES: 3020733963
INFO : TaskCounter_Map_1_OUTPUT_Reducer_2:
INFO : ADDITIONAL_SPILLS_BYTES_READ: 0
INFO : ADDITIONAL_SPILLS_BYTES_WRITTEN: 0
INFO : ADDITIONAL_SPILL_COUNT: 0
INFO : OUTPUT_BYTES: 8054528
INFO : OUTPUT_BYTES_PHYSICAL: 4249190
INFO : OUTPUT_BYTES_WITH_OVERHEAD: 8059316
INFO : OUTPUT_LARGE_RECORDS: 0
INFO : OUTPUT_RECORDS: 1176
INFO : SHUFFLE_CHUNK_COUNT: 7
INFO : SPILLED_RECORDS: 1176
INFO : org.apache.hadoop.hive.ql.exec.tez.HiveInputCounters:
INFO : GROUPED_INPUT_SPLITS_Map_1: 14
INFO : INPUT_DIRECTORIES_Map_1: 169
INFO : INPUT_FILES_Map_1: 412191
INFO : RAW_INPUT_SPLITS_Map_1: 412191
ERROR : FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map 1, vertexId=vertex_1668260139179_121442_1_00, diagnostics=[Task failed, taskId=task_1668260139179_121442_1_00_000002, diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task ( failure ) : java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:3332)
at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:124)
at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:448)
at java.lang.StringBuilder.append(StringBuilder.java:136)
at org.apache.hadoop.hive.serde2.ColumnProjectionUtils.appendReadColumnNames(ColumnProjectionUtils.java:239)
at org.apache.hadoop.hive.serde2.ColumnProjectionUtils.appendReadColumns(ColumnProjectionUtils.java:163)
at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:964)
at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:409)
at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:203)
at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:152)
at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:69)
at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
, errorMessage=Cannot recover from this error:java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:3332)
at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:124)
at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:448)
at java.lang.StringBuilder.append(StringBuilder.java:136)
at org.apache.hadoop.hive.serde2.ColumnProjectionUtils.appendReadColumnNames(ColumnProjectionUtils.java:239)
at org.apache.hadoop.hive.serde2.ColumnProjectionUtils.appendReadColumns(ColumnProjectionUtils.java:163)
at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:964)
at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:409)
at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:203)
at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:152)
at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:69)
at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:6, Vertex vertex_1668260139179_121442_1_00 [Map 1] killed/failed due to:OWN_TASK_FAILURE]Vertex killed, vertexName=Reducer 2, vertexId=vertex_1668260139179_121442_1_01, diagnostics=[Vertex received Kill while in RUNNING state., Vertex did not succeed due to OTHER_VERTEX_FAILURE, failedTasks:0 killedTasks:2, Vertex vertex_1668260139179_121442_1_01 [Reducer 2] killed/failed due to:OTHER_VERTEX_FAILURE]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:1
INFO : Completed executing command(queryId=hive_20221121225133_cda2cb8e-97d3-48be-a1df-c0cd86b00ac5); Time taken: 7380.715 seconds
Error: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map 1, vertexId=vertex_1668260139179_121442_1_00, diagnostics=[Task failed, taskId=task_1668260139179_121442_1_00_000002, diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task ( failure ) : java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:3332)
at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:124)
at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:448)
at java.lang.StringBuilder.append(StringBuilder.java:136)
at org.apache.hadoop.hive.serde2.ColumnProjectionUtils.appendReadColumnNames(ColumnProjectionUtils.java:239)
at org.apache.hadoop.hive.serde2.ColumnProjectionUtils.appendReadColumns(ColumnProjectionUtils.java:163)
at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:964)
at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:409)
at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:203)
at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:152)
at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:69)
at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
, errorMessage=Cannot recover from this error:java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:3332)
at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:124)
at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:448)
at java.lang.StringBuilder.append(StringBuilder.java:136)
at org.apache.hadoop.hive.serde2.ColumnProjectionUtils.appendReadColumnNames(ColumnProjectionUtils.java:239)
at org.apache.hadoop.hive.serde2.ColumnProjectionUtils.appendReadColumns(ColumnProjectionUtils.java:163)
at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:964)
at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:409)
at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:203)
at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:152)
at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:69)
at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:6, Vertex vertex_1668260139179_121442_1_00 [Map 1] killed/failed due to:OWN_TASK_FAILURE]Vertex killed, vertexName=Reducer 2, vertexId=vertex_1668260139179_121442_1_01, diagnostics=[Vertex received Kill while in RUNNING state., Vertex did not succeed due to OTHER_VERTEX_FAILURE, failedTasks:0 killedTasks:2, Vertex vertex_1668260139179_121442_1_01 [Reducer 2] killed/failed due to:OTHER_VERTEX_FAILURE]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:1 (state=08S01,code=2)
Closing: 0: jdbc:hive2://ndc3hdpprodmn05.vodafoneidea.com:2181,ndc3hdpprodmn06.vodafoneidea.com:2181,ndc3hdpprodmn07.vodafoneidea.com:2181/default;httpPath=cliservice;password=biuser2;principal=hive/_HOST@INROOT.IN;serviceDiscoveryMode=zooKeeper;transportMode=http;user=biuser2;zooKeeperNamespace=hiveserver2
[biuser2@ndc3hdpproden01 Sudhakar]$
You have new mail in /var/spool/mail/biuser2
[biuser2@ndc3hdpproden01 Sudhakar]$ cat merge_vas_ppu.out
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/3.1.5.0-152/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/3.1.5.0-152/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Connecting to jdbc:hive2://ndc3hdpprodmn05.vodafoneidea.com:2181,ndc3hdpprodmn06.vodafoneidea.com:2181,ndc3hdpprodmn07.vodafoneidea.com:2181/default;httpPath=cliservice;password=biuser2;principal=hive/_HOST@INROOT.IN;serviceDiscoveryMode=zooKeeper;transportMode=http;user=biuser2;zooKeeperNamespace=hiveserver2
22/11/21 22:51:33 [main]: INFO jdbc.HiveConnection: Connected to NDC3HDPPRODMN06.vodafoneidea.com:10001
Connected to: Apache Hive (version 3.1.0.3.1.5.0-152)
Driver: Hive JDBC (version 3.1.0.3.1.5.0-152)
Transaction isolation: TRANSACTION_REPEATABLE_READ
No rows affected (0.064 seconds)
No rows affected (0.009 seconds)
No rows affected (0.008 seconds)
No rows affected (0.006 seconds)
No rows affected (0.005 seconds)
No rows affected (0.004 seconds)
No rows affected (0.004 seconds)
No rows affected (0.004 seconds)
No rows affected (0.004 seconds)
No rows affected (0.004 seconds)
No rows affected (0.003 seconds)
No rows affected (0.004 seconds)
No rows affected (0.004 seconds)
No rows affected (0.004 seconds)
INFO : Compiling command(queryId=hive_20221121225133_cda2cb8e-97d3-48be-a1df-c0cd86b00ac5): INSERT overwrite TABLE dim_cd_db.dim_vas_ppu_subs_base partition (reported_date,last_usage_date) SELECT * from dim_cd_db.dim_vas_ppu_subs_base where reported_date='2022-09-25'
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:dim_vas_ppu_subs_base.subs_msisdn, type:varchar(25), comment:null), FieldSchema(name:dim_vas_ppu_subs_base.circle_id, type:varchar(4), comment:null), FieldSchema(name:dim_vas_ppu_subs_base.subs_key, type:varchar(25), comment:null), FieldSchema(name:dim_vas_ppu_subs_base.pre_post_ind, type:varchar(2), comment:null), FieldSchema(name:dim_vas_ppu_subs_base.act_mdt_cdr_id_key, type:varchar(70), comment:null), FieldSchema(name:dim_vas_ppu_subs_base.called_calling_number, type:varchar(25), comment:null), FieldSchema(name:dim_vas_ppu_subs_base.short_code, type:varchar(25), comment:null), FieldSchema(name:dim_vas_ppu_subs_base.discovery_bearer, type:varchar(6), comment:null), FieldSchema(name:dim_vas_ppu_subs_base.service_id, type:varchar(30), comment:null), FieldSchema(name:dim_vas_ppu_subs_base.service_sub_sub_type_id, type:varchar(30), comment:null), FieldSchema(name:dim_vas_ppu_subs_base.content_partner_code, type:varchar(30), comment:null), FieldSchema(name:dim_vas_ppu_subs_base.activation_price_amt, type:decimal(14,6), comment:null), FieldSchema(name:dim_vas_ppu_subs_base.current_price_amt, type:decimal(14,6), comment:null), FieldSchema(name:dim_vas_ppu_subs_base.activation_start_date, type:date, comment:null), FieldSchema(name:dim_vas_ppu_subs_base.activation_end_date, type:date, comment:null), FieldSchema(name:dim_vas_ppu_subs_base.last_status_updt_date, type:date, comment:null), FieldSchema(name:dim_vas_ppu_subs_base.load_timestamp, type:timestamp, comment:null), FieldSchema(name:dim_vas_ppu_subs_base.reported_date, type:date, comment:null), FieldSchema(name:dim_vas_ppu_subs_base.last_usage_date, type:date, comment:null)], properties:null)
INFO : Completed compiling command(queryId=hive_20221121225133_cda2cb8e-97d3-48be-a1df-c0cd86b00ac5); Time taken: 1.636 seconds
INFO : Executing command(queryId=hive_20221121225133_cda2cb8e-97d3-48be-a1df-c0cd86b00ac5): INSERT overwrite TABLE dim_cd_db.dim_vas_ppu_subs_base partition (reported_date,last_usage_date) SELECT * from dim_cd_db.dim_vas_ppu_subs_base where reported_date='2022-09-25'
INFO : Query ID = hive_20221121225133_cda2cb8e-97d3-48be-a1df-c0cd86b00ac5
INFO : Total jobs = 3
INFO : Launching Job 1 out of 3
INFO : Starting task [Stage-1:MAPRED] in serial mode
INFO : Subscribed to counters: [] for queryId: hive_20221121225133_cda2cb8e-97d3-48be-a1df-c0cd86b00ac5
INFO : Tez session hasn't been created yet. Opening session
INFO : Dag name: INSERT overwrite TAB...ed_date='2022-09-25' (Stage-1)
INFO : Status: Running (Executing on YARN cluster with App id application_1668260139179_120620)

6 REPLIES

Guru

@pankshiv1809  Please increase the container size:

 

set hive.tez.container.size=10240;

set tez.runtime.io.sort.mb=4096;  -- 40% of hive.tez.container.size

 

If the error persists, keep increasing the container size.
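
For reference, here is a minimal sketch of where these settings would sit relative to the original statement. The values are the ones suggested in this reply, and the table name and partition filter are the placeholders from the question. The right container size for a given cluster is an assumption to be tuned, not a fixed answer.

```sql
-- Sketch only: memory settings placed ahead of the INSERT from the question.
set hive.execution.engine=tez;
set hive.tez.container.size=10240;   -- MB per Tez task container (value from this reply)
set tez.runtime.io.sort.mb=4096;     -- 4096 / 10240 = 40% of the container size

-- <table-name1> is the placeholder used in the question, not a real table name.
INSERT OVERWRITE TABLE <table-name1> PARTITION (reported_date, last_usage_date)
SELECT * FROM <table-name1>
WHERE reported_date='2022-11-16' AND last_usage_date='2022-04-10';
```

If the container size is raised again, the sort buffer would be rescaled to keep the same 40% ratio.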

 

Please also collect table and column statistics: https://docs.cloudera.com/HDPDocuments/HDP2/HDP-2.6.5/bk_cloud-data-access/content/hive-analyzing-ta...

 

Please mark it "Accept As Solution" if your query is answered.

Contributor

Hi Asish,

I was trying to work around the memory issue by adding the parameters below. I will include the new parameters you mentioned in today's run and share an update. I hope they provide enough memory to resolve the return code 2 failure.

set tez.am.resource.memory.mb=16384;
set tez.task.resource.memory.mb=16384;
set hive.tez.container.size=16384;

 

Thanks for sharing the parameters with their values. I will use the following:

set tez.am.resource.memory.mb=10240;
set tez.task.resource.memory.mb=10240;
set tez.runtime.io.sort.mb=4096;
set hive.tez.container.size=10240;

 

 

Guru

Hi @pankshiv1809, don't include set tez.task.resource.memory.mb=10240;

Contributor

Okay. Is there any reason not to include it in the task?

Guru

Please find the difference between hive.tez.container.size and tez.task.resource.memory.mb below.

hive.tez.container.size

This property specifies the Tez container size. Its value should usually be the same as, or a small multiple (1 or 2 times) of, the YARN minimum container size yarn.scheduler.minimum-allocation-mb, and it should not exceed yarn.scheduler.maximum-allocation-mb.

As a general rule, don't set a value higher than the memory available per processor, because you want one processor per container and you want to spin up multiple containers.

You can find a very detailed answer and a great architecture diagram in the Hortonworks community answer here

tez.task.resource.memory.mb

The amount of memory used by a launched task in a Tez container.

tez.task.resource.memory.mb should be set lower than hive.tez.container.size.

This value will be recalculated, so run the job without setting it.
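
As a rough illustration of the sizing rule above, the sketch below first prints the current YARN limits and then picks a container size that fits them. The YARN values in the comments (5120 MB minimum, 20480 MB maximum) are assumptions made up for this example; substitute whatever your cluster reports.

```sql
-- A `set <property>;` with no value prints the current setting in the Hive session.
set yarn.scheduler.minimum-allocation-mb;
set yarn.scheduler.maximum-allocation-mb;

-- Assumed for illustration only: minimum = 5120 MB, maximum = 20480 MB.
set hive.tez.container.size=10240;   -- 2 x 5120, and below the 20480 maximum

-- tez.task.resource.memory.mb is deliberately left unset, per the advice above.
```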

 

Contributor

Good morning Asish, thanks for sharing the difference between the two parameters. I will take note that tez.task.resource.memory.mb should be set lower than hive.tez.container.size.

 

I appreciate your support.

 

Regards,

Pankaj shivankar