Member since 02-11-2017 | 17 Posts | 0 Kudos Received | 0 Solutions
05-30-2017
07:47 PM
Hello, I'm running a TPC-H benchmark on Apache Drill. When I try to run query 21 (code below), it fails with the error "UNSUPPORTED_OPERATION ERROR: This query cannot be planned possibly due to either a cartesian join or an inequality join". I also tried listing all the tables in the FROM clause, but the result is the same. Why is this happening? Can someone provide versions of the queries that work and return the same result?
<code>SELECT S_NAME, COUNT(*) AS NUMWAIT
FROM hive.tpch_flat_orc_30.supplier, hive.tpch_flat_orc_30.nation
join hive.tpch_flat_orc_30.LINEITEM L1 on S_SUPPKEY = L1.L_SUPPKEY
join hive.tpch_flat_orc_30.ORDERS on O_ORDERKEY = L1.L_ORDERKEY
where O_ORDERSTATUS = 'F'
AND L1.L_RECEIPTDATE > L1.L_COMMITDATE
AND EXISTS (SELECT *
FROM hive.tpch_flat_orc_30.lineitem L2
WHERE L2.L_ORDERKEY = L1.L_ORDERKEY
AND L2.L_SUPPKEY <> L1.L_SUPPKEY)
AND NOT EXISTS (SELECT *
FROM hive.tpch_flat_orc_30.lineitem L3
WHERE L3.L_ORDERKEY = L1.L_ORDERKEY
AND L3.L_SUPPKEY <> L1.L_SUPPKEY
AND L3.L_RECEIPTDATE > L3.L_COMMITDATE)
AND S_NATIONKEY = N_NATIONKEY
AND N_NAME = 'SAUDI ARABIA'
GROUP BY S_NAME
ORDER BY NUMWAIT DESC, S_NAME
LIMIT 100;</code>
Labels: Apache Ambari
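One direction that may be worth trying for the question above is to replace the two correlated subqueries (the ones using `<>` on the supplier key) with per-order aggregates joined on equality only. The sketch below is not tested on Drill; it uses Python's built-in sqlite3 with a handful of made-up rows purely to check that the rewritten query returns the same rows as the correlated form. The table contents, the subquery aliases `all_supp` and `late_supp`, and the column names `okey`, `n_supp`, `n_late` are all hypothetical, and the rewrite assumes `l_suppkey` is never NULL.

```python
import sqlite3

# Q21 in its correlated form (EXISTS / NOT EXISTS with <> on the supplier key).
ORIGINAL_Q21 = """
SELECT s_name, COUNT(*) AS numwait
FROM supplier
JOIN nation ON s_nationkey = n_nationkey
JOIN lineitem l1 ON s_suppkey = l1.l_suppkey
JOIN orders ON o_orderkey = l1.l_orderkey
WHERE o_orderstatus = 'F'
  AND l1.l_receiptdate > l1.l_commitdate
  AND EXISTS (SELECT * FROM lineitem l2
              WHERE l2.l_orderkey = l1.l_orderkey
                AND l2.l_suppkey <> l1.l_suppkey)
  AND NOT EXISTS (SELECT * FROM lineitem l3
                  WHERE l3.l_orderkey = l1.l_orderkey
                    AND l3.l_suppkey <> l1.l_suppkey
                    AND l3.l_receiptdate > l3.l_commitdate)
  AND n_name = 'SAUDI ARABIA'
GROUP BY s_name
ORDER BY numwait DESC, s_name
LIMIT 100
"""

# Assumed rewrite: only equality joins.
#   n_supp > 1  <=> some other supplier has a line in the same order (the EXISTS)
#   n_late = 1  <=> l1's supplier is the only late one in the order (the NOT EXISTS),
#                   which holds because l1 itself is already filtered to late lines.
REWRITTEN_Q21 = """
SELECT s_name, COUNT(*) AS numwait
FROM supplier
JOIN nation ON s_nationkey = n_nationkey
JOIN lineitem l1 ON s_suppkey = l1.l_suppkey
JOIN orders ON o_orderkey = l1.l_orderkey
JOIN (SELECT l_orderkey AS okey, COUNT(DISTINCT l_suppkey) AS n_supp
      FROM lineitem
      GROUP BY l_orderkey) all_supp ON all_supp.okey = l1.l_orderkey
JOIN (SELECT l_orderkey AS okey, COUNT(DISTINCT l_suppkey) AS n_late
      FROM lineitem
      WHERE l_receiptdate > l_commitdate
      GROUP BY l_orderkey) late_supp ON late_supp.okey = l1.l_orderkey
WHERE o_orderstatus = 'F'
  AND l1.l_receiptdate > l1.l_commitdate
  AND all_supp.n_supp > 1
  AND late_supp.n_late = 1
  AND n_name = 'SAUDI ARABIA'
GROUP BY s_name
ORDER BY numwait DESC, s_name
LIMIT 100
"""


def build_demo_db():
    """Create an in-memory database with a few invented TPC-H-like rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE nation (n_nationkey INTEGER, n_name TEXT);
        CREATE TABLE supplier (s_suppkey INTEGER, s_name TEXT, s_nationkey INTEGER);
        CREATE TABLE orders (o_orderkey INTEGER, o_orderstatus TEXT);
        CREATE TABLE lineitem (l_orderkey INTEGER, l_suppkey INTEGER,
                               l_commitdate TEXT, l_receiptdate TEXT);
    """)
    conn.executemany("INSERT INTO nation VALUES (?, ?)",
                     [(10, "SAUDI ARABIA"), (20, "FRANCE")])
    conn.executemany("INSERT INTO supplier VALUES (?, ?, ?)",
                     [(1, "S1", 10), (2, "S2", 10), (3, "S3", 20)])
    conn.executemany("INSERT INTO orders VALUES (?, ?)",
                     [(100, "F"), (101, "F"), (102, "O"), (103, "F"), (104, "F")])
    late = ("1995-01-10", "1995-01-20")     # received after the commit date
    on_time = ("1995-01-10", "1995-01-05")  # received before the commit date
    conn.executemany("INSERT INTO lineitem VALUES (?, ?, ?, ?)", [
        (100, 1, *late), (100, 2, *on_time),               # only S1 is late
        (101, 1, *late), (101, 2, *late),                  # two late suppliers
        (102, 2, *late), (102, 3, *on_time),               # order status is not 'F'
        (103, 2, *late),                                   # single-supplier order
        (104, 2, *late), (104, 2, *late), (104, 3, *on_time),  # S2 late twice
    ])
    return conn


def run(conn, sql):
    """Run a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    db = build_demo_db()
    print("correlated:", run(db, ORIGINAL_Q21))
    print("rewritten: ", run(db, REWRITTEN_Q21))
```

Whether Drill plans the rewritten form is something to check on the actual cluster; the harness only shows that the two forms agree on this toy data.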
05-18-2017
05:01 PM
Hello, I'm running a TPC-H benchmark using Drill. The queries are stored in a SQL file and are executed against a Hive schema (with tables stored as ORC). When I try to run query 21 (code below), Drill gives the error "Error: UNSUPPORTED_OPERATION ERROR: This query cannot be planned possibly due to either a cartesian join or an inequality join". The same happens with query 22 of TPC-H. Can someone tell me how to fix the query? Query code:
SELECT S_NAME, COUNT(*) AS NUMWAIT
FROM SUPPLIER, LINEITEM L1, ORDERS, NATION
WHERE S_SUPPKEY = L1.L_SUPPKEY
AND O_ORDERKEY = L1.L_ORDERKEY
AND O_ORDERSTATUS = 'F'
AND L1.L_RECEIPTDATE > L1.L_COMMITDATE
AND EXISTS (SELECT *
FROM LINEITEM L2
WHERE L2.L_ORDERKEY = L1.L_ORDERKEY
AND L2.L_SUPPKEY <> L1.L_SUPPKEY)
AND NOT EXISTS (SELECT *
FROM LINEITEM L3
WHERE L3.L_ORDERKEY = L1.L_ORDERKEY
AND L3.L_SUPPKEY <> L1.L_SUPPKEY
AND L3.L_RECEIPTDATE > L3.L_COMMITDATE)
AND S_NATIONKEY = N_NATIONKEY
AND N_NAME = 'SAUDI ARABIA'
GROUP BY S_NAME
ORDER BY NUMWAIT DESC, S_NAME
LIMIT 100;
Labels: Apache Hive
05-09-2017
12:37 AM
So although the files show up with a .deflate extension, they are basically ORC? Spark can query .parquet files; will it also be able to query these files in deflate format?
05-08-2017
12:40 AM
Hello, I am using hive-testbench (http://blog.moserit.com/benchmarking-hive) to test some queries. When I run ./tpcds-setup.sh 10 with the defaults, what file format will my Hive tables have (in HDFS the files are listed with a .deflate extension)? I think the best file formats for performance are ORC or Parquet; how can I generate the data in those formats? Thanks
Labels: Apache Ambari, Apache Tez
02-13-2017
11:55 PM
I was able to generate the data (10 GB), but now that I've run some queries, I get no results except for Query 1, which returns 4 rows. When I run the queries on the Hive command line, I see the output of the MapReduce jobs, but in the end no rows are returned. Can you give me some help?
02-11-2017
02:46 AM
Good evening. I'm trying to run a TPC-H benchmark on Hive. I downloaded hive-testbench from GitHub (https://github.com/hortonworks/hive-testbench). After building it (./tpch-build.sh), I try to generate the data (./tpch-setup.sh 10), but it fails with an error saying that dbgen.jar does not exist (although it does exist):
ls: `/tmp/tpch-generate/10/lineitem': No such file or directory
Generating data at scale factor 10.
Exception in thread "main" java.io.FileNotFoundException: File file:/home/centos/hive-testbench-hive14/tpch-gen/target/lib/dbgen.jar does not exist
at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:598)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:811)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:588)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:425)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:340)
at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:2042)
at org.notmysock.tpch.GenTable.copyJar(GenTable.java:163)
at org.notmysock.tpch.GenTable.run(GenTable.java:100)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.notmysock.tpch.GenTable.main(GenTable.java:54)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
ls: `/tmp/tpch-generate/10/lineitem': No such file or directory
I already tried generating the data while specifying a directory, but the result is the same. Can you give me some help?
Labels: Apache Hive