Hive benchmarking error tpcds_kit.zip issue

Hi folks,

I am not able to build the hive-testbench TPC-DS tests. I am running an EMR cluster with Hive 1.0.0 and Hadoop 2.7.1 installed on it.

I cloned the code with "git clone https://github.com/hortonworks/hive-testbench.git". When I try to build the TPC-DS tests, it throws the following error:

$ ./tpcds-build.sh
Building TPC-DS Data Generator
mkdir -p target/
cp tpcds_kit.zip target/tpcds_kit.zip
test -d target/tools/ || (cd target; unzip tpcds_kit.zip; cd tools; cat ../../*.patch | patch -p0 )
Archive:  tpcds_kit.zip
  End-of-central-directory signature not found.  Either this file is not
  a zipfile, or it constitutes one disk of a multi-part archive.  In the
  latter case the central directory and zipfile comment will be found on
  the last disk(s) of this archive.
unzip:  cannot find zipfile directory in one of tpcds_kit.zip or
        tpcds_kit.zip.zip, and cannot find tpcds_kit.zip.ZIP, period.
/bin/sh: line 0: cd: tools: No such file or directory
cat: ../../*.patch: No such file or directory
cd target/tools; make clean; make dsdgen
/bin/sh: line 0: cd: target/tools: No such file or directory
make[1]: Entering directory `/home/hadoop/hive-testbench-master/tpcds-gen'
mvn clean
[INFO] Scanning for projects...
[WARNING]
[WARNING] Some problems were encountered while building the effective model for org.notmysock.tpcds:tpcds-gen:jar:1.0-SNAPSHOT
[WARNING] 'build.plugins.plugin.version' for org.apache.maven.plugins:maven-compiler-plugin is missing. @ line 47, column 15
[WARNING] 'build.plugins.plugin.version' for org.apache.maven.plugins:maven-jar-plugin is missing. @ line 54, column 15
[WARNING]
[WARNING] It is highly recommended to fix these problems because they threaten the stability of your build.
[WARNING]
[WARNING] For this reason, future Maven versions might no longer support building such malformed projects.
make[1]: Leaving directory `/home/hadoop/hive-testbench-master/tpcds-gen'
make[1]: Entering directory `/home/hadoop/hive-testbench-master/tpcds-gen'
make[1]: *** No rule to make target `dsdgen'. Stop.
make[1]: Leaving directory `/home/hadoop/hive-testbench-master/tpcds-gen'
make: *** [target/tools/dsdgen] Error 2
TPC-DS Data Generator built, you can now use tpcds-setup.sh to generate data.

I am able to build the TPC-H tests but not the TPC-DS tests. Has anyone successfully built the DS tests recently?

Any help is very much appreciated.

Thanks!

Re: Hive benchmarking error tpcds_kit.zip issue

Hi there @Akhil Chalamlasetty, I usually see this when the zip files could not be downloaded.

Check what happened to the following two curl commands, which should be run as part of tpcds-gen:

curl http://dev.hortonworks.com.s3.amazonaws.com/hive-testbench/tpcds/README

curl --output tpcds_kit.zip http://dev.hortonworks.com.s3.amazonaws.com/hive-testbench/tpcds/TPCDS_Tools.zip
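
A quick way to confirm whether the downloaded tpcds_kit.zip is a real archive is sketched below. It assumes a POSIX shell with head and tr available; looks_like_zip is a hypothetical helper name, not part of hive-testbench. It only checks that the file starts with the "PK" signature that zip archives normally begin with, so treat it as a sanity check rather than a full integrity test.

```shell
#!/bin/sh
# Hypothetical helper (not part of hive-testbench).
# Returns 0 if the file starts with the two-byte "PK" signature that
# zip archives normally begin with, and non-zero otherwise.
looks_like_zip() {
    # A missing path or a directory cannot be an archive.
    [ -f "$1" ] || return 1
    # Read the first two bytes; drop NUL bytes so the shell does not
    # complain about them inside the command substitution.
    magic=$(head -c 2 "$1" 2>/dev/null | tr -d '\0')
    [ "$magic" = "PK" ]
}

# Example:
#   looks_like_zip tpcds_kit.zip || head -c 200 tpcds_kit.zip
```

If the check fails, printing the first bytes of the file (as in the commented example) may show what the server actually returned. For a fuller check, unzip -t tpcds_kit.zip tests the whole archive.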

Re: Hive benchmarking error tpcds_kit.zip issue

Hello Drussell. Thanks for getting back.

I was able to run these commands and get the output.

$ curl http://dev.hortonworks.com.s3.amazonaws.com/hive-testbench/tpcds/README

This artifact was downloaded from http://dev.hortonworks.com.s3.amazonaws.com/hive-testbench/tpcds/README and is available for consumption under section 9 of EULA.txt available.

9. Limited grants of sublicense. You may distribute the Software as provided or as modified as permitted under clause 4 of this Agreement, provided you comply with all of the terms of this Agreement and the following conditions:

a. If you distribute any portion of the Software in its original form you may do so only under this Agreement by including a complete copy of this Agreement with your distribution, and if you distribute the Software in modified form, you may only do so under a license that at a minimum provides all of the protections and conditions of use contained within this Agreement;

b. You must include on each copy of the Software that you distribute the following legend in all caps, at the top of the label and license, and in a font not less than [12] point and no less prominent than any other printing: "THE TPC SOFTWARE IS AVAILABLE WITHOUT CHARGE FROM TPC.";

c. You must retain all copyright, patent, trademark, and attribution notices that are present in the Software; and

d. You may not generate revenue directly or indirectly (e.g., by charging service fees) for distribution of the Software or of any modifications permitted under clause 4.c.

$ curl --output tpcds_kit.zip http://dev.hortonworks.com.s3.amazonaws.com/hive-testbench/tpcds/TPCDS_Tools.zip

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 5585k  100 5585k    0     0  30.3M      0 --:--:-- --:--:-- --:--:-- 30.4M

/tmp/v1.4.0]$ ls

answer_sets EULA.htm EULA.txt query_templates query_variants specification tests tools

What would you suggest I do next?

Re: Hive benchmarking error tpcds_kit.zip issue

@Carter Shanklin @drussell

In my case I'm actually able to build tpcds but cannot build tpch. I'm able to fetch tpch_kit.zip just fine.

[vagrant@c6401 hive-testbench]$ ./tpch-build.sh
Building TPC-H Data Generator
test -d target/tools/ || (cd target; unzip tpch_kit.zip -x __MACOSX/; ln -sf $PWD/*/dbgen/ tools)
Archive:  tpch_kit.zip
  End-of-central-directory signature not found.  Either this file is not
  a zipfile, or it constitutes one disk of a multi-part archive.  In the
  latter case the central directory and zipfile comment will be found on
  the last disk(s) of this archive.
unzip:  cannot find zipfile directory in one of tpch_kit.zip or
        tpch_kit.zip.zip, and cannot find tpch_kit.zip.ZIP, period.
cd target/tools; cat ../../../patches/Linux/*.patch | patch -p0
/bin/sh: line 0: cd: target/tools: No such file or directory
cat: ../../../patches/Linux/*.patch: No such file or directory
cd target/tools; make -f makefile.suite clean; make -f makefile.suite CC=gcc DATABASE=ORACLE MACHINE=LINUX WORKLOAD=TPCH
/bin/sh: line 0: cd: target/tools: No such file or directory
make[1]: Entering directory `/home/vagrant/hive-testbench/tpch-gen'
make[1]: makefile.suite: No such file or directory
make[1]: *** No rule to make target `makefile.suite'.  Stop.
make[1]: Leaving directory `/home/vagrant/hive-testbench/tpch-gen'
make[1]: Entering directory `/home/vagrant/hive-testbench/tpch-gen'
make[1]: makefile.suite: No such file or directory
make[1]: *** No rule to make target `makefile.suite'.  Stop.
make[1]: Leaving directory `/home/vagrant/hive-testbench/tpch-gen'
make: *** [target/tools/dbgen] Error 2
TPC-H Data Generator built, you can now use tpch-setup.sh to generate data.

Re: Hive benchmarking error tpcds_kit.zip issue

Same here. I was able to build tpcds but not tpch.

tpch is failing because the link used to download the tpch zip is no longer valid:

curl --output tpch_kit.zip http://www.tpc.org/tpch/spec/tpch_2_16_0.zip

WORKAROUND:

1. Go to the TPC website and download TPCH_Tools_v2.17.1.zip:

http://www.tpc.org/tpc_documents_current_versions/current_specifications.asp

2. Put the zip file in <hive-testbench>/tpch-gen/ and rename it to 'tpch_kit.zip'

3. Run ./tpch-build.sh
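
Step 2 of the workaround could be scripted roughly as follows. This is only a sketch: install_tpch_kit is a hypothetical helper name, and the paths in the commented example are placeholders for wherever the zip was saved and wherever hive-testbench was cloned.

```shell
#!/bin/sh
# Hypothetical helper (not part of hive-testbench).
# Copies a manually downloaded TPC-H tools zip into
# <hive-testbench>/tpch-gen/ under the name tpch_kit.zip.
install_tpch_kit() {
    src=$1          # path to the zip downloaded from the TPC website
    testbench=$2    # path to the hive-testbench checkout
    if [ ! -f "$src" ]; then
        echo "zip not found: $src" >&2
        return 1
    fi
    if [ ! -d "$testbench/tpch-gen" ]; then
        echo "no tpch-gen directory under: $testbench" >&2
        return 1
    fi
    cp "$src" "$testbench/tpch-gen/tpch_kit.zip"
}

# Example (placeholder paths), followed by step 3:
#   install_tpch_kit ~/Downloads/TPCH_Tools_v2.17.1.zip ~/hive-testbench \
#     && (cd ~/hive-testbench && ./tpch-build.sh)
```

The helper deliberately stops before the build so that a missing file or wrong directory is reported before ./tpch-build.sh runs.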

Re: Hive benchmarking error tpcds_kit.zip issue

Did it work?
