Hi,
I am executing following insert query on EMR cluster. The source table (HDFS table ) has 100 records destination table (DynamoDB) has 20 million records.
INSERT INTO TABLE hive_settings_external_step
SELECT athenaTable.col1, athenaTable.col2, athenaTable.col3, athenaTable.col4, athenaTable.col5, athenaTable.col6, athenaTable.col7, athenaTable.temp FROM internal_settings_athena_step_boost athenaTable LEFT OUTER JOIN hive_settings_external_step dynamo ON (athenaTable.col1 = dynamo.col1 and athenaTable.temp = dynamo.temp)
WHERE (dynamo.col1 is NULL AND dynamo.temp is NULL);
Following is the exception trace
Status: Failed
Vertex failed, vertexName=Map 1, vertexId=vertex_1587679313517_0008_2_01, diagnostics=[Task failed, taskId=task_1587679313517_0008_2_01_000000, diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task ( failure ) : attempt_1587679313517_0008_2_01_000000_0:java.lang.RuntimeException: java.lang.RuntimeException: Hive Runtime Error while closing operators
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:211)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:168)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: Hive Runtime Error while closing operators
at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.close(MapRecordProcessor.java:488)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:199)
... 14 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: Unexpected exception from MapJoinOperator : com.amazonaws.SdkClientException: Unable to marshall request to JSON: Unable to marshall request to JSON: Unable to marshall request to JSON: Unable to marshall request to JSON: null