Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant
Announcements
This board is archived and read-only for historical reference. To ask a new question, please post a new topic on the appropriate active board.

Not able to run hive benchmark test

avatar
New Member

I am trying to do hive bench marking(https://github.com/hortonworks/hive-testbench)

but when I run setup script it loads data is some table but fails after sometime fails with the following error:

OK Time taken: 0.264 seconds + '[' X = X ']' + FORMAT=orc + i=1 + total=24 + DATABASE=tpcds_bin_partitioned_orc_2 + for t in '${FACTS}' + echo 'Optimizing table store_sales (1/24).' Optimizing table store_sales (1/24). + COMMAND='hive -i settings/load-partitioned.sql -f ddl-tpcds/bin_partitioned/store_sales.sql -d DB=tpcds_bin_partitioned_orc_2 -d SCALE=2 -d SOURCE=tpcds_text_2 -d BUCKETS=1 -d RETURN_BUCKETS=1 -d FILE=orc' + runcommand 'hive -i settings/load-partitioned.sql -f ddl-tpcds/bin_partitioned/store_sales.sql -d DB=tpcds_bin_partitioned_orc_2 -d SCALE=2 -d SOURCE=tpcds_text_2 -d BUCKETS=1 -d RETURN_BUCKETS=1 -d FILE=orc' + '[' XON '!=' X ']' + hive -i settings/load-partitioned.sql -f ddl-tpcds/bin_partitioned/store_sales.sql -d DB=tpcds_bin_partitioned_orc_2 -d SCALE=2 -d SOURCE=tpcds_text_2 -d BUCKETS=1 -d RETURN_BUCKETS=1 -d FILE=orc WARNING: Use "yarn jar" to launch YARN applications. Logging initialized using configuration in file:/etc/hive/2.4.0.0-169/0/hive-log4j.properties ...

OK Time taken: 0.948 seconds OK Time taken: 0.238 seconds OK Time taken: 0.629 seconds OK Time taken: 0.248 seconds Query ID = hdfs_20160322014240_60c3f689-816d-409e-b8c7-c6ea636fa12a Total jobs = 1 Launching Job 1 out of 1 Dag submit failed due to Invalid TaskLaunchCmdOpts defined for Vertex Map 1 : Invalid/conflicting GC options found, cmdOpts="-server -Djava.net.preferIPv4Stack=true -Dhdp.version=2.4.0.0-169 -XX:+PrintGCDetails -verbose:gc -XX:+PrintGCTimeStamps -XX:+UseNUMA -XX:+UseParallelGC -XX:+PrintGCDetails -verbose:gc -XX:+PrintGCTimeStamps -XX:+UseNUMA -XX:+UseG1GC -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/ -Dlog4j.configuratorClass=org.apache.tez.common.TezLog4jConfigurator -Dlog4j.configuration=tez-container-log4j.properties -Dyarn.app.container.log.dir=<LOG_DIR> -Dtez.root.logger=INFO,CLA " stack trace: [org.apache.tez.dag.api.DAG.createDag(DAG.java:859), org.apache.tez.client.TezClientUtils.prepareAndCreateDAGPlan(TezClientUtils.java:694), org.apache.tez.client.TezClient.submitDAGSession(TezClient.java:487), org.apache.tez.client.TezClient.submitDAG(TezClient.java:434), org.apache.hadoop.hive.ql.exec.tez.TezTask.submit(TezTask.java:439), org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:180), org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160), org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89), org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:75)] retrying... FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.tez.TezTask + '[' 1 -ne 0 ']' + echo 'Command failed, try '\''export DEBUG_SCRIPT=ON'\'' and re-running' Command failed, try 'export DEBUG_SCRIPT=ON' and re-running + exit 1

Not sure what is wrong.

Anyhelp is appreciated.

1 ACCEPTED SOLUTION

avatar
New Member

So, I found out that the problem was caused by hive-testbench/settings/load-partitioned.sql. This file is used as init file for hive on generating TPC-DS data. It has some configs for hive, including hive.tez.java.opts.

set hive.tez.java.opts=-XX:+PrintGCDetails -verbose:gc -XX:+PrintGCTimeStamps -XX:+UseNUMA -XX:+UseG1GC -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/;

This config conflicts with default HDP Hive config.

Two ways to solve it:

1. Change hive.tez.java.opts in hive-testbench/settings/load-partitioned.sql to use UseParallelGC (recommended).

set hive.tez.java.opts=-XX:+PrintGCDetails -verbose:gc -XX:+PrintGCTimeStamps -XX:+UseNUMA -XX:+UseParallelGC -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/;

or

2. Set hive config on Ambari to use UseG1GC java garbage collecto

tez.am.launch.cmd-opts: -XX:+PrintGCDetails -verbose:gc -XX:+PrintGCTimeStamps -XX:+UseNUMA --XX:+UseG1GC

tez.task.launch.cmd-opts: -XX:+PrintGCDetails -verbose:gc -XX:+PrintGCTimeStamps -XX:+UseNUMA -XX:+UseG1GC

hive.tez.java.opt: -server -Djava.net.preferIPv4Stack=true -XX:NewRatio=8 -XX:+UseNUMA -XX:+UseG1GC -XX:+PrintGCDetails -verbose:gc -XX:+PrintGCTimeStamps

View solution in original post

3 REPLIES 3

avatar
New Member

Same here! Deployed Hortonworks Data Platform on Google Cloud using bdutil. Ambari is ok. When I try to gen data using tpcds setup it fails with same error as you though.

Did you find any solution?

avatar
New Member

So, I found out that the problem was caused by hive-testbench/settings/load-partitioned.sql. This file is used as init file for hive on generating TPC-DS data. It has some configs for hive, including hive.tez.java.opts.

set hive.tez.java.opts=-XX:+PrintGCDetails -verbose:gc -XX:+PrintGCTimeStamps -XX:+UseNUMA -XX:+UseG1GC -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/;

This config conflicts with default HDP Hive config.

Two ways to solve it:

1. Change hive.tez.java.opts in hive-testbench/settings/load-partitioned.sql to use UseParallelGC (recommended).

set hive.tez.java.opts=-XX:+PrintGCDetails -verbose:gc -XX:+PrintGCTimeStamps -XX:+UseNUMA -XX:+UseParallelGC -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/;

or

2. Set hive config on Ambari to use UseG1GC java garbage collecto

tez.am.launch.cmd-opts: -XX:+PrintGCDetails -verbose:gc -XX:+PrintGCTimeStamps -XX:+UseNUMA --XX:+UseG1GC

tez.task.launch.cmd-opts: -XX:+PrintGCDetails -verbose:gc -XX:+PrintGCTimeStamps -XX:+UseNUMA -XX:+UseG1GC

hive.tez.java.opt: -server -Djava.net.preferIPv4Stack=true -XX:NewRatio=8 -XX:+UseNUMA -XX:+UseG1GC -XX:+PrintGCDetails -verbose:gc -XX:+PrintGCTimeStamps

avatar
Master Guru

+1 Another solution is to comment out hive.tez.java.opts in that sql file and manage the GC from Ambari.