Member since: 02-11-2017
Posts: 17
Kudos Received: 0
Solutions: 0
10-26-2017
02:31 AM
Hello, I deployed Hortonworks HDP on a 4-node cluster in order to run some benchmarks comparing tools like Hive and Spark (2.0). Since I started with Hive, I did some research and found that Beeline can be used to query Hive data with Spark, using the command beeline -u "jdbc:hive2://hadoop-1:10001/;transportMode=http;httpPath=cliservice" -n spark --force=true -f tpch_query1.sql. I verified that this actually works, but the performance is surprisingly slower than Hive. Is this a valid comparison between Spark and Hive performance? If not, how can I query the data I have in Hive without losing performance? Another point: I read that Spark uses in-memory processing, following the same logic as tools like Presto, HAWQ, or Cloudera Impala, but when I execute a query using the command above, the processing seems to be done by MapReduce jobs. Can you shed some light on these subjects?
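For reference, Beeline is only a JDBC client: the engine that runs the query is whichever server answers on the port in the connection URL. A hedged sketch of the distinction (port numbers are assumptions; 10001 is typically HiveServer2 in HTTP mode, which explains the MapReduce jobs, while the Spark Thrift Server on HDP commonly listens on 10015/10016 — verify the actual ports in Ambari):

```shell
# Same client, different engines -- the URL decides who executes the SQL.
# HiveServer2 (Hive engine, MapReduce/Tez under the hood):
beeline -u "jdbc:hive2://hadoop-1:10001/;transportMode=http;httpPath=cliservice" \
        -n spark -f tpch_query1.sql

# Spark Thrift Server (Spark SQL engine; 10015 is a common HDP default,
# check "hive.server2.thrift.port" in the Spark config to confirm):
beeline -u "jdbc:hive2://hadoop-1:10015/" -n spark -f tpch_query1.sql
```

Only the second connection would make the run a Spark-versus-Hive comparison.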
Labels:
- Apache Ambari
- Apache Hive
- Apache Spark
10-21-2017
02:08 AM
I have a 4-node cluster with Ambari deployed. A while ago I tried to add PXF and HAWQ; on 3 of the 4 nodes everything installed successfully, but on one node the PXF Agent fails to run because it can't detect Tomcat, which is installed along with the PXF Agent. I already tried to start it manually, but the result is the same. When I try to start PXF it gives the following output, until it ends up failing:

SEVERE: No shutdown port configured. Shut down server through OS signal. Server not shut down.
The stop command failed. Attempting to signal the process to stop through OS signal.
Tomcat stopped.
/var/pxf
Tomcat started.
Checking if tomcat is up and running...
tomcat not responding, re-trying after 1 second (attempt number 1)
tomcat not responding, re-trying after 1 second (attempt number 2)
tomcat not responding, re-trying after 1 second (attempt number 3)
tomcat not responding, re-trying after 1 second (attempt number 4)
tomcat not responding, re-trying after 1 second (attempt number 5)

Can anyone provide some help with this problem?
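A possible diagnostic sketch for this symptom, comparing the failing node against a healthy one (all paths are illustrative and should be checked against a node where PXF starts cleanly; 51200 is the usual PXF default port):

```shell
# "No shutdown port configured" comes from the <Server> element of the
# Tomcat server.xml that ships with PXF -- compare it with a good node:
grep -n '<Server port' /var/pxf/pxf-service/conf/server.xml

# Check whether something else is already bound to the PXF port:
netstat -tlnp | grep 51200

# The real startup failure is usually visible in Tomcat's own log:
tail -n 50 /var/pxf/pxf-service/logs/catalina.out
```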
Labels:
- Apache Ambari
- Apache Hive
09-22-2017
09:45 AM
Hello, I have configured a 4-node cluster and installed Ambari and the required tools. After configuring the basic tools (Hive, Oozie, Spark, ZooKeeper, etc.), I installed PXF and HAWQ with 3 segments. Now I want to query data stored in Hive; I know that I can use HCatalog and avoid creating PXF external tables. What I don't know is how to access HAWQ. I see that it has to be through PostgreSQL, but when I try to access it with psql (logged in as gpadmin) it always gives me this: psql: FATAL: no pg_hba.conf entry for host "[local]", user "gpadmin", database "gpadmin", SSL off. If I'm logged in as another user, the same happens. Can you tell me how to access psql correctly in order to query Hive data? Another thing: I've seen some forums say that if we use HCatalog, only one node is used to query the data. Is this true? If so, I would be obliged to create external tables.
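That FATAL error means the HAWQ master's pg_hba.conf has no rule matching a local gpadmin connection. A hedged fix sketch (the master data-directory path is illustrative — find the real one via the hawq_master_directory setting in Ambari — and the commands should be run as gpadmin on the master node):

```shell
# Allow local connections for gpadmin; "ident" assumes you connect as
# the gpadmin OS user ("trust" is a laxer alternative for test clusters).
echo "local  all  gpadmin  ident" >> /data/hawq/master/pg_hba.conf

# Reload the config without a full restart (-u = reload, -a = no prompt):
hawq stop cluster -u -a

# Then connect to the master (5432 is the usual HAWQ master port):
psql -d gpadmin -U gpadmin
```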
Labels:
- Apache HCatalog
- Apache Hive
07-07-2017
12:17 PM
Hello, I'm trying Pivotal HAWQ with Ambari, and now I'm trying to run some queries over Hive tables with HAWQ. From what I have seen, HAWQ can query Hive tables through HCatalog (https://community.hortonworks.com/articles/43264/hawqhdb-and-hadoop-with-hive-and-hbase.html), so I use the psql tool on the command line to run queries like this: SELECT * FROM hcatalog.hive-db-name.hive-table-name; Previously I ran some queries on Hive to compare results with HAWQ. I was expecting HAWQ to be much faster, but it is much slower; the query response takes much longer than in Hive. The specific query I am trying to run is query 1 from TPC-H, on a Hive table stored as ORC. Hive took 18 seconds; running the query in psql through HCatalog took 6 minutes and 28 seconds. Can someone explain why this is happening?
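One avenue worth comparing against the on-the-fly hcatalog lookup is a persistent PXF external table over the same Hive table, which HAWQ can analyze and plan for like a regular table. A hedged sketch (host, port — 51200 is the usual PXF default — schema, and the abbreviated column list are all illustrative; the full lineitem column list must be spelled out in practice):

```sql
-- Sketch of a PXF external table over a Hive table, as an alternative
-- to hcatalog.<db>.<table>. Profile=Hive reads through the Hive
-- metastore; all names here are placeholders.
CREATE EXTERNAL TABLE lineitem_pxf (
    l_orderkey BIGINT,
    l_suppkey  INTEGER
    -- ... remaining lineitem columns ...
)
LOCATION ('pxf://hawq-master:51200/tpch.lineitem?Profile=Hive')
FORMAT 'CUSTOM' (FORMATTER='pxfwritable_import');
```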
Labels:
- Apache Ambari
- Apache HCatalog
- Apache Hive
06-24-2017
08:10 PM
Hello, I'm trying to generate and load data into Hive tables through hive-testbench (https://github.com/hortonworks/hive-testbench), but when I do the first step, ./tpch-build.sh, it gives the following error:

cd target/; mkdir -p lib/; ( jar cvf lib/dbgen.jar tools/ || gjar cvf lib/dbgen.jar tools/ )
/bin/sh: jar: command not found
/bin/sh: gjar: command not found
make: *** [target/lib/dbgen.jar] Error 127

I already tried to download tpch_kit.zip and place it manually, as someone suggested in this post: https://community.hortonworks.com/questions/25826/hive-benchmarking-error-tpcds-kitzip-issue.html. When I check the target/lib directory, dbgen.jar is not present and I don't know why. Does anyone have a suggestion as to why this is happening? Thanks
Labels:
- Apache Hive
06-19-2017
04:53 PM
Hello, I have a 4-node cluster configured with 1 NameNode and 3 DataNodes. I'm performing a TPC-H benchmark and I would like to know how much data you think my cluster can handle without affecting query response times. The nodes have 16 GB of RAM and 8 cores each, and my total disk space available is ~700 GB. Thank you
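As a rough upper bound, HDFS stores every block several times, so raw disk is not usable dataset size. A back-of-the-envelope sketch (assumptions: default replication factor of 3, and keeping about half of the usable space free for temporary/shuffle data during queries):

```shell
# Raw disk across the cluster, divided by the HDFS replication factor,
# gives the usable capacity; halving that leaves benchmark headroom.
RAW_GB=700
REPLICATION=3
USABLE_GB=$((RAW_GB / REPLICATION))
HEADROOM_GB=$((USABLE_GB / 2))
echo "usable: ${USABLE_GB} GB, comfortable dataset size: ~${HEADROOM_GB} GB"
```

So on this hardware, TPC-H scale factors up to roughly 100 GB of raw data would already be pushing the disk budget.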
05-30-2017
07:47 PM
Hello, I'm performing a TPC-H benchmark on Apache Drill. When I try to run query 21 (code below), it gives the error "UNSUPPORTED_OPERATION ERROR: This query cannot be planned possibly due to either a cartesian join or an inequality join". I tried with all tables in the FROM clause, but the result is the same. Why is this happening? Can someone provide a version of the query that would work and give the same result?

SELECT S_NAME, COUNT(*) AS NUMWAIT
FROM hive.tpch_flat_orc_30.supplier, hive.tpch_flat_orc_30.nation
join hive.tpch_flat_orc_30.LINEITEM L1 on S_SUPPKEY = L1.L_SUPPKEY
join hive.tpch_flat_orc_30.ORDERS on O_ORDERKEY = L1.L_ORDERKEY
where O_ORDERSTATUS = 'F'
AND L1.L_RECEIPTDATE> L1.L_COMMITDATE
AND EXISTS (SELECT *
FROM hive.tpch_flat_orc_30.LINEITEM L2
WHERE L2.L_ORDERKEY = L1.L_ORDERKEY
AND L2.L_SUPPKEY <> L1.L_SUPPKEY)
AND NOT EXISTS (SELECT *
FROM hive.tpch_flat_orc_30.lineitem L3
WHERE L3.L_ORDERKEY = L1.L_ORDERKEY
AND L3.L_SUPPKEY <> L1.L_SUPPKEY
AND L3.L_RECEIPTDATE > L3.L_COMMITDATE)
AND S_NATIONKEY = N_NATIONKEY
AND N_NAME = 'SAUDI ARABIA'
GROUP BY S_NAME
ORDER BY NUMWAIT DESC, S_NAME
LIMIT 100;
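Drill's error usually points at the implicit cross join created by the comma-separated FROM clause: supplier and nation have no join condition attached at plan time (their equality predicate sits far down in the WHERE clause). A possible rewrite sketch that turns every comma join into an explicit JOIN with its equality predicate (same tables and columns as above; if the planner still complains, the inequality predicates inside the EXISTS/NOT EXISTS subqueries are the other likely culprit):

```sql
-- Rewrite sketch: explicit JOINs instead of comma joins, so no
-- table is left without a join condition at planning time.
SELECT S_NAME, COUNT(*) AS NUMWAIT
FROM hive.tpch_flat_orc_30.supplier S
JOIN hive.tpch_flat_orc_30.LINEITEM L1 ON S.S_SUPPKEY = L1.L_SUPPKEY
JOIN hive.tpch_flat_orc_30.ORDERS O ON O.O_ORDERKEY = L1.L_ORDERKEY
JOIN hive.tpch_flat_orc_30.nation N ON S.S_NATIONKEY = N.N_NATIONKEY
WHERE O.O_ORDERSTATUS = 'F'
  AND L1.L_RECEIPTDATE > L1.L_COMMITDATE
  AND N.N_NAME = 'SAUDI ARABIA'
  AND EXISTS (SELECT *
              FROM hive.tpch_flat_orc_30.LINEITEM L2
              WHERE L2.L_ORDERKEY = L1.L_ORDERKEY
                AND L2.L_SUPPKEY <> L1.L_SUPPKEY)
  AND NOT EXISTS (SELECT *
                  FROM hive.tpch_flat_orc_30.LINEITEM L3
                  WHERE L3.L_ORDERKEY = L1.L_ORDERKEY
                    AND L3.L_SUPPKEY <> L1.L_SUPPKEY
                    AND L3.L_RECEIPTDATE > L3.L_COMMITDATE)
GROUP BY S_NAME
ORDER BY NUMWAIT DESC, S_NAME
LIMIT 100;
```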
Labels:
- Apache Ambari
05-18-2017
05:01 PM
Hello, I'm performing a TPC-H benchmark using Drill. The queries are stored in SQL files and executed over a Hive schema (with tables stored as ORC). When I try to run query 21 (code below), Drill gives the error "Error: UNSUPPORTED_OPERATION ERROR: This query cannot be planned possibly due to either a cartesian join or an inequality join". The same happens with query 22 of TPC-H. Can someone tell me how to fix the query? Query code:

SELECT S_NAME, COUNT(*) AS NUMWAIT
FROM SUPPLIER, LINEITEM L1, ORDERS, NATION
WHERE S_SUPPKEY = L1.L_SUPPKEY
  AND O_ORDERKEY = L1.L_ORDERKEY
  AND O_ORDERSTATUS = 'F'
  AND L1.L_RECEIPTDATE > L1.L_COMMITDATE
  AND EXISTS (SELECT * FROM LINEITEM L2
              WHERE L2.L_ORDERKEY = L1.L_ORDERKEY
                AND L2.L_SUPPKEY <> L1.L_SUPPKEY)
  AND NOT EXISTS (SELECT * FROM LINEITEM L3
                  WHERE L3.L_ORDERKEY = L1.L_ORDERKEY
                    AND L3.L_SUPPKEY <> L1.L_SUPPKEY
                    AND L3.L_RECEIPTDATE > L3.L_COMMITDATE)
  AND S_NATIONKEY = N_NATIONKEY
  AND N_NAME = 'SAUDI ARABIA'
GROUP BY S_NAME
ORDER BY NUMWAIT DESC, S_NAME
LIMIT 100;
Labels:
- Apache Hive
05-10-2017
12:22 AM
I was using hive-testbench (https://github.com/hortonworks/hive-testbench) to generate TPC-H data sets. I started by generating a 10 GB dataset for Hive (./tpch-setup.sh 10). Running SELECT COUNT(*) on the generated Hive table "part" gives a total of 2,000,000 rows. Meanwhile, I decided to download the official TPC-H tool 2.17, generate the 10 GB .tbl files, and build a Hive database from them. For the same data size of 10 GB, the same count query on the newly created table gives a total of 86,586,082. How is this possible? The number of rows should be the same. Can anyone give me an idea of what's going on? Thanks
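The TPC-H spec fixes each base table's cardinality as a linear function of the scale factor, so the counts can be checked arithmetically. A sketch of that check, plus a hypothesis for the second number (assumption: the commonly cited lineitem cardinality of 59,986,052 rows at SF 10; the sum of all eight tables then matches 86,586,082 exactly, suggesting the second count covered all the .tbl files loaded into one table, not just part):

```shell
# "part" is specified as SF x 200,000 rows, so the testbench count of
# 2,000,000 at SF=10 is exactly what the spec prescribes.
SF=10
PART_ROWS=$((SF * 200000))

# Summing every TPC-H table at SF=10 reproduces the second count:
LINEITEM=59986052; ORDERS=15000000; PARTSUPP=8000000
PART=2000000; CUSTOMER=1500000; SUPPLIER=100000
NATION=25; REGION=5
TOTAL=$((LINEITEM + ORDERS + PARTSUPP + PART + CUSTOMER + SUPPLIER + NATION + REGION))
echo "expected part rows: ${PART_ROWS}; all tables combined: ${TOTAL}"
```

If that sum matches, the discrepancy is a loading mistake, not a generator difference.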
05-09-2017
12:37 AM
So although it presents itself as .deflate, it's basically ORC? Spark can query .parquet files; will it also be able to query these files in the deflate format?
05-08-2017
12:40 AM
Hello, I am using hive-testbench (http://blog.moserit.com/benchmarking-hive) to test some queries. By default, using ./tpcds-setup.sh 10, what file format will my Hive tables have (since in HDFS they are listed with a .deflate extension)? I think the best file formats for performance are either ORC or Parquet; how can I generate the tables in those formats? Thanks
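For what it's worth, the .deflate files are the raw text staging data compressed with DEFLATE; the final Hive tables are built from them in a separate step whose format the testbench lets you choose. A hedged sketch (the FORMAT environment variable is an assumption based on reading the setup script — confirm it in tpcds-setup.sh before relying on it):

```shell
# Ask the testbench to create the final tables as ORC rather than
# the default; same idea for parquet if the script supports it.
FORMAT=orc ./tpcds-setup.sh 10
```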
Tags:
- Ambari
- Data Processing
- hortonwork
- parquet
- tez
Labels:
- Apache Ambari
- Apache Tez
05-03-2017
02:17 PM
The current value is 1073741. I tried to decrease the number to see if the number of reducers would rise, but the result is the same.
05-03-2017
11:41 AM
Hello, currently I am using Hortonworks HDP 2.6 to perform a TPC-H benchmark with a 10 GB scale factor, but when I execute query 19 it only gets one reducer, and the query never ends because it is complex. So I tried to force the number of reducers with the following commands: set mapred.reduce.tasks = 6; and set mapreduce.job.reduces = 6;. With this, it would be logical to see 6 reducers used, but it stays the same, with only one reducer. Any ideas why I can't increase the number of reducers?
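One hedged explanation: on HDP 2.6 Hive typically runs on Tez, where mapred.reduce.tasks is ignored and the reducer count is derived from the estimated data size per reducer. A sketch of the knobs that usually matter instead (the values are illustrative, not recommendations):

```sql
-- Confirm the engine first; if it reports "tez", the MapReduce
-- reducer settings above have no effect.
set hive.execution.engine;

-- Smaller bytes-per-reducer => more reducers for the same input.
set hive.exec.reducers.bytes.per.reducer=67108864;

-- Let Tez adjust the reducer count at runtime.
set hive.tez.auto.reducer.parallelism=true;
```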
02-13-2017
11:55 PM
I was able to generate the data (10 GB), but now that I've run some queries, I get no results, except for query 1, which returns 4 rows. When I run the queries on the Hive command line it gives me the output of the MapReduce jobs, but in the end it doesn't return any rows. Can you give me some kind of help?
02-11-2017
02:46 AM
Good evening, I'm trying to perform a TPC-H benchmark on Hive. I downloaded hive-testbench (https://github.com/hortonworks/hive-testbench) from GitHub; after building it (./tpch-build.sh) I try to generate the data (./tpch-setup.sh 10), but it gives an error saying that dbgen.jar doesn't exist (but it does exist):

ls: `/tmp/tpch-generate/10/lineitem': No such file or directory
Generating data at scale factor 10.
Exception in thread "main" java.io.FileNotFoundException: File file:/home/centos/hive-testbench-hive14/tpch-gen/target/lib/dbgen.jar does not exist
at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:598)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:811)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:588)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:425)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:340)
at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:2042)
at org.notmysock.tpch.GenTable.copyJar(GenTable.java:163)
at org.notmysock.tpch.GenTable.run(GenTable.java:100)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.notmysock.tpch.GenTable.main(GenTable.java:54)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
ls: `/tmp/tpch-generate/10/lineitem': No such file or directory

I already tried to generate the data specifying a directory, but the result is the same. Can you give me some kind of help?
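The FileNotFoundException points at the build step rather than the setup step: dbgen.jar is produced by tpch-build.sh, and its jar-packaging step fails silently if the JDK's `jar` tool is not on the PATH. A hedged verification sketch (paths follow the repository layout shown in the stack trace):

```shell
# If `jar` is missing, the build could not have produced dbgen.jar;
# install a full JDK (JRE alone is not enough) and rebuild.
command -v jar || echo "jar tool missing -- install a JDK first"
./tpch-build.sh

# The setup script expects the jar here before it can generate data:
ls tpch-gen/target/lib/dbgen.jar && ./tpch-setup.sh 10
```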
Labels:
- Apache Hive