Created on 10-07-2015 09:15 AM - edited 09-16-2022 02:43 AM
Hello everybody,
I have a Hive table and I'm trying to find a way to add an incremental primary key to it. Here is my approach:
create table new_table as select row_number() over () as ID, * from old_table;
It creates a new table with a new incremental column (ID). It works well on small tables, but when I run it on a bigger table (20M records / 500 columns), it fails with this message:
Examining task ID: task_1444013233108_0091_r_000558 (and more) from job job_1444013233108_0091
Examining task ID: task_1444013233108_0091_r_000000 (and more) from job job_1444013233108_0091
Task with the most failures(4):
-----
Task ID:
task_1444013233108_0091_r_000000
URL:
http://chd.moneyball.guru:8088/taskdetails.jsp?jobid=job_1444013233108_0091&tipid=task_1444013233108...
-----
Diagnostic Messages for this Task:
Exception from container-launch.
Container id: container_1444013233108_0091_01_000715
Exit code: 255
Stack trace: ExitCodeException exitCode=255:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
at org.apache.hadoop.util.Shell.run(Shell.java:455)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Container exited with a non-zero exit code 255
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Stage-Stage-1: Map: 142 Reduce: 568 Cumulative CPU: 9791.07 sec HDFS Read: 38198769932 HDFS Write: 54432 FAIL
Total MapReduce CPU Time Spent: 0 days 2 hours 43 minutes 11 seconds 70 msec
-----------------------------
I also tried limiting the number of records in the select:
create table new_table as select row_number() over () as ID, * from old_table limit 1000;
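One thing worth noting about this attempt: in Hive, LIMIT is applied to the query result, so row_number() is still evaluated over the whole table before any rows are discarded, and the job hits the same bottleneck. If the goal is just to test on a small sample, pushing the LIMIT into a subquery restricts the input first (a quick sketch reusing the same table names, not verified at full scale):

create table new_table as
select row_number() over () as ID, t.*
from (select * from old_table limit 1000) t;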
Do you have any idea about this error?
Thanks
Created 10-07-2015 02:16 PM
Can you try using CLUSTER BY? For example:
create table new_table as select row_number() over () as ID, * from old_table cluster by ID;
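Some background on why this query struggles: with an empty OVER() clause there is no PARTITION BY, so Hive has nothing to split the windowing work on and typically pushes every row through a single reducer. On a 20M-row / 500-column table that one reducer can blow past its container limits, and a killed container is a common source of exit code 255. If the failing reducer turns out to be memory-bound, raising the reducer container size may also be worth a try (illustrative values, assuming standard MRv2 settings apply on your cluster):

set mapreduce.reduce.memory.mb=4096;       -- reducer container size
set mapreduce.reduce.java.opts=-Xmx3276m;  -- JVM heap inside that container

As far as I can tell, CLUSTER BY ID mainly redistributes the output rows across reducers; the row_number() computation itself may still funnel through a single windowing stage, so the memory settings can matter either way.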
Created 03-21-2017 07:46 AM
Did you find a solution? I have the same problem.