Member since: 05-02-2018
Posts: 7
Kudos Received: 0
Solutions: 0
10-06-2018
10:31 AM
Yes, I second that; surprisingly, I am observing the same behaviour. `desc formatted`/`desc extended` shows no column stats values for the partition, even after running analyze. Is this a reported bug?
08-14-2018
06:56 PM
Hi Gautam, firstly, great article; it helps a lot. Could you please explain the session per queue parameter in a little more detail? This value is currently set to 1 by default on our cluster. Does that mean only one query would be serviced at any given time by HS2 Interactive?
08-03-2018
06:15 PM
Thanks for the quick response, @kgautam.
1. Due to a bug, we have had to disable the default fetch operation, so a Tez session is currently spawned for every `select * limit 1` operation. Also, since we use LLAP, the metrics printed after query execution show considerable IO within the mappers.
2. Is this a good approach?
08-03-2018
05:30 PM
I wanted to know the fastest, most efficient way to check whether data exists in a partitioned table. According to the query plan, `select * from database.table limit 1` does a full table scan. Is there any way to avoid this and quickly determine whether any data exists in a given table or in a partition of a table?
Labels: Apache Hive, Apache Tez
05-02-2018
06:02 PM
While creating the table with `org.apache.spark.sql.execution.datasources.orc`, we see that the SerDe properties that get set are wrong and are not the ORC ones. Do we have to fix that explicitly? Also, does this work seamlessly on Spark 2.2?