Created on 10-19-2014 06:01 PM - edited 09-16-2022 02:10 AM
Where can I get Impala test data?
Created 10-21-2014 12:31 AM
You can generate Impala TPC-DS test data with this project: https://github.com/cloudera/impala-tpcds-kit
Created 10-19-2014 06:58 PM
set -e
set -u
echo "Copying data files from the share. If a file already exists locally, it" \
  "will not be copied. The files are not checksummed, so if" \
  "you need to force a copy, delete the local directory:" \
  "${IMPALA_HOME}/testdata/impala-data"
DATASRC="http://util-1.ent.cloudera.com/impala-test-data/"
DATADST="${IMPALA_HOME}/testdata/impala-data"
mkdir -p "${DATADST}"
pushd "${DATADST}"
# Download all .tar.gz files from the source, excluding the hostname and directory name.
# If the file already exists locally, skip the download.
wget -q --cut-dirs=1 --no-clobber -r --no-parent -nH --accept="*.tar.gz" "${DATASRC}"
for filename in *.tar.gz
do
  echo "Extracting: ${filename}"
  tar -xzf "${filename}"
done
popd
echo "Test data download successful."
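Since the script above does not checksum the archives, one option is to test each archive's gzip integrity before extracting it. This is only a rough sketch, assuming gzip and tar are installed; the function name extract_if_valid is hypothetical and is not part of the Impala repository:

```shell
# Hypothetical helper (not part of the Impala repo): extract a .tar.gz archive
# only if it passes a gzip integrity test.
extract_if_valid() {
  local archive="$1"
  local dest="${2:-.}"   # destination directory, defaults to the current one

  # gzip -t tests the compressed stream without writing any output.
  if ! gzip -t "${archive}" 2>/dev/null; then
    echo "Skipping corrupt archive: ${archive}" >&2
    return 1
  fi

  mkdir -p "${dest}"
  tar -xzf "${archive}" -C "${dest}"
}
```

In the extraction loop this could be called as extract_if_valid "${filename}" || rm -f "${filename}", so that a damaged download is removed and fetched again on the next run (wget is invoked with --no-clobber, so it would otherwise skip the existing file).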
Created 10-20-2014 06:04 PM
The URL is in this script: https://github.com/cloudera/Impala/blob/master/bin/copy-test-data.sh