Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

Creating new table in Hive using data of the older table

Highlighted

Creating new table in Hive using data of the older table

New Contributor

We have a table in Hive which is partitioned on one column . It is holding over 2 TBs of data .

We want to create a new table which has 3 partitions and we want add data from the older table to new one.

What approach should we take ?

P.S.- We are using HDP 2.3

3 REPLIES 3

Re: Creating new table in Hive using data of the older table

Super Guru

@Krupal Jagtap

I was assuming that your new table have 3 partition columns.


Best option would be Using Insert statement to dynamically create partitions in new table

Ex:

Insert into new_table partition(partition_col1,partition_col2,partition_col3) 
select non_partition_cols,partition_col1,partition_col2,partition_col3 from old_table;

Use distributeBy,sortBy to control number of files created in the directory.

Refer to this link for more details regards to distributeBy,sortBy.

Re: Creating new table in Hive using data of the older table

New Contributor

Hi @Shu


we tried the approach and got following error


1)

hive> Insert into new_table partition( new_partition1,new_partition2, new_partition3)

> select col1,

> col2 ,

> col3 ,

> colX ,

> new_partition1 ,

> new_partition2 ,

> new_partition3 from old_table where createdate='2016-11-09' ;

FAILED: ArrayIndexOutOfBoundsException -1

Re: Creating new table in Hive using data of the older table

New Contributor

2)

hive> Insert into new_table partition( new_partition1,new_partition2, new_partition3)

> select col1,

> col2 ,

> col3 ,

> colX ,

> new_partition1 ,

> new_partition2 ,

> new_partition3 from old_table;


Query ID = hdfs_20190514114256_81a413f7-49eb-4460-a16f-4bef38f7954a

Total jobs = 1

Launching Job 1 out of 1

Tez session was closed. Reopening...

Session re-established.



Status: Running (Executing on YARN cluster with App id application_1538560024513_0256)


--------------------------------------------------------------------------------

VERTICES STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED

--------------------------------------------------------------------------------

Map 1 FAILED 10 0 0 10 4 0

Reducer 2 KILLED 302 0 0 302 0 0

--------------------------------------------------------------------------------

VERTICES: 00/02 [>>--------------------------] 0% ELAPSED TIME: 5.35 s

--------------------------------------------------------------------------------

Status: Failed

Vertex failed, vertexName=Map 1, vertexId=vertex_1538560024513_0256_1_00, diagnostics=[Task failed, taskId=task_1538560024513_0256_1_00_000002, diagnostics=[TaskAttempt 0 failed, info=[Container launch failed for container_e486_1538560024513_0256_02_000002 : org.apache.hadoop.yarn.exceptions.YarnException: Unauthorized request to start container.

This token is expired. current time is 1557835643652 found 1557834788261

Note: System times on machines may be out of sync. Check system time and time zones.

at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)

at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)

at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)

at java.lang.reflect.Constructor.newInstance(Constructor.java:422)

at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.instantiateException(SerializedExceptionPBImpl.java:168)

at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.deSerialize(SerializedExceptionPBImpl.java:106)

at org.apache.tez.dag.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:180)

at org.apache.tez.dag.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:384)

at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)

at java.lang.Thread.run(Thread.java:745)

], TaskAttempt 1 failed, info=[Container launch failed for container_e486_1538560024513_0256_02_000009 : org.apache.hadoop.yarn.exceptions.YarnException: Unauthorized request to start container.

This token is expired. current time is 1557835644617 found 1557834789325

Note: System times on machines may be out of sync. Check system time and time zones.

at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)

at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)

at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)

at java.lang.reflect.Constructor.newInstance(Constructor.java:422)

at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.instantiateException(SerializedExceptionPBImpl.java:168)

at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.deSerialize(SerializedExceptionPBImpl.java:106)

at org.apache.tez.dag.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:180)

at org.apache.tez.dag.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:384)

at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)

at java.lang.Thread.run(Thread.java:745)

], TaskAttempt 2 failed, info=[Container launch failed for container_e486_1538560024513_0256_02_000010 : org.apache.hadoop.yarn.exceptions.YarnException: Unauthorized request to start container.

This token is expired. current time is 1557835645700 found 1557834790432

Note: System times on machines may be out of sync. Check system time and time zones.

at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)

at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)

at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)

at java.lang.reflect.Constructor.newInstance(Constructor.java:422)

at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.instantiateException(SerializedExceptionPBImpl.java:168)

at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.deSerialize(SerializedExceptionPBImpl.java:106)

at org.apache.tez.dag.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:180)

at org.apache.tez.dag.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:384)

at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)

at java.lang.Thread.run(Thread.java:745)

], TaskAttempt 3 failed, info=[Container launch failed for container_e486_1538560024513_0256_02_000013 : org.apache.hadoop.yarn.exceptions.YarnException: Unauthorized request to start container.

This token is expired. current time is 1557835646991 found 1557834791443

Note: System times on machines may be out of sync. Check system time and time zones.

at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)

at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)

at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)

at java.lang.reflect.Constructor.newInstance(Constructor.java:422)

at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.instantiateException(SerializedExceptionPBImpl.java:168)

at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.deSerialize(SerializedExceptionPBImpl.java:106)

at org.apache.tez.dag.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:180)

at org.apache.tez.dag.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:384)

at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)

at java.lang.Thread.run(Thread.java:745)

]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:9, Vertex vertex_1538560024513_0256_1_00 [Map 1] killed/failed due to:OWN_TASK_FAILURE]

Vertex killed, vertexName=Reducer 2, vertexId=vertex_1538560024513_0256_1_01, diagnostics=[Vertex received Kill while in RUNNING state., Vertex did not succeed due to OTHER_VERTEX_FAILURE, failedTasks:0 killedTasks:302, Vertex vertex_1538560024513_0256_1_01 [Reducer 2] killed/failed due to:OTHER_VERTEX_FAILURE]

DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:1

FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map 1, vertexId=vertex_1538560024513_0256_1_00, diagnostics=[Task failed, taskId=task_1538560024513_0256_1_00_000002, diagnostics=[TaskAttempt 0 failed, info=[Container launch failed for container_e486_1538560024513_0256_02_000002 : org.apache.hadoop.yarn.exceptions.YarnException: Unauthorized request to start container.

This token is expired. current time is 1557835643652 found 1557834788261

Note: System times on machines may be out of sync. Check system time and time zones.

at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)

at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)

at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)

at java.lang.reflect.Constructor.newInstance(Constructor.java:422)

at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.instantiateException(SerializedExceptionPBImpl.java:168)

at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.deSerialize(SerializedExceptionPBImpl.java:106)

at org.apache.tez.dag.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:180)

at org.apache.tez.dag.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:384)

at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)

at java.lang.Thread.run(Thread.java:745)

], TaskAttempt 1 failed, info=[Container launch failed for container_e486_1538560024513_0256_02_000009 : org.apache.hadoop.yarn.exceptions.YarnException: Unauthorized request to start container.

This token is expired. current time is 1557835644617 found 1557834789325

Note: System times on machines may be out of sync. Check system time and time zones.

at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)

at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)

at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)

at java.lang.reflect.Constructor.newInstance(Constructor.java:422)

at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.instantiateException(SerializedExceptionPBImpl.java:168)

at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.deSerialize(SerializedExceptionPBImpl.java:106)

at org.apache.tez.dag.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:180)

at org.apache.tez.dag.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:384)

at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)

at java.lang.Thread.run(Thread.java:745)

], TaskAttempt 2 failed, info=[Container launch failed for container_e486_1538560024513_0256_02_000010 : org.apache.hadoop.yarn.exceptions.YarnException: Unauthorized request to start container.

This token is expired. current time is 1557835645700 found 1557834790432

Note: System times on machines may be out of sync. Check system time and time zones.

at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)

at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)

at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)

at java.lang.reflect.Constructor.newInstance(Constructor.java:422)

at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.instantiateException(SerializedExceptionPBImpl.java:168)

at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.deSerialize(SerializedExceptionPBImpl.java:106)

at org.apache.tez.dag.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:180)

at org.apache.tez.dag.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:384)

at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)

at java.lang.Thread.run(Thread.java:745)

], TaskAttempt 3 failed, info=[Container launch failed for container_e486_1538560024513_0256_02_000013 : org.apache.hadoop.yarn.exceptions.YarnException: Unauthorized request to start container.

This token is expired. current time is 1557835646991 found 1557834791443

Note: System times on machines may be out of sync. Check system time and time zones.

at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)

at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)

at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)

at java.lang.reflect.Constructor.newInstance(Constructor.java:422)

at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.instantiateException(SerializedExceptionPBImpl.java:168)

at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.deSerialize(SerializedExceptionPBImpl.java:106)

at org.apache.tez.dag.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:180)

at org.apache.tez.dag.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:384)

at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)

at java.lang.Thread.run(Thread.java:745)

]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:9, Vertex vertex_1538560024513_0256_1_00 [Map 1] killed/failed due to:OWN_TASK_FAILURE]Vertex killed, vertexName=Reducer 2, vertexId=vertex_1538560024513_0256_1_01, diagnostics=[Vertex received Kill while in RUNNING state., Vertex did not succeed due to OTHER_VERTEX_FAILURE, failedTasks:0 killedTasks:302, Vertex vertex_1538560024513_0256_1_01 [Reducer 2] killed/failed due to:OTHER_VERTEX_FAILURE]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:1