Member since: 07-31-2019
Posts: 346
Kudos Received: 259
Solutions: 62
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 2870 | 08-22-2018 06:02 PM |
 | 1662 | 03-26-2018 11:48 AM |
 | 4095 | 03-15-2018 01:25 PM |
 | 5056 | 03-01-2018 08:13 PM |
 | 1415 | 02-20-2018 01:05 PM |
04-02-2017
11:54 AM
Hi @Bala Vignesh N V, truncating the partition will not remove the partition metadata. To remove the metadata for all partitions, you'll want to include the CASCADE clause in an ALTER TABLE statement. This should remove the column metadata for all partitions. https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-ChangeColumnName/Type/Position/Comment
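For example (a hedged sketch; the table and column names below are hypothetical), a column change with CASCADE propagates the metadata change to every partition:
ALTER TABLE sales CHANGE COLUMN amount amount DECIMAL(10,2) CASCADE;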
03-27-2017
07:24 PM
Hi @Abhijeet Rajput, it is recommended to analyze all tables, ORC included, on a regular basis for performance. Statistics are more valuable on larger tables than on smaller ones. Sorting is not necessary and, in fact, is not allowed on ACID tables. As of HDP 2.5, Hive uses both a rules-based optimizer and a cost-based optimizer (CBO) built on Apache Calcite. Enabling the CBO will make the best use of the statistics. Also, you may want to take a look at LLAP, which is in Technical Preview in HDP 2.5 and will be GA in 2.6. Hope this helps.
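As a rough sketch (the table name is hypothetical, and whether you need a PARTITION clause depends on your table), statistics can be gathered and the CBO-related settings enabled like this:
ANALYZE TABLE sales COMPUTE STATISTICS;
ANALYZE TABLE sales COMPUTE STATISTICS FOR COLUMNS;
SET hive.cbo.enable=true;
SET hive.compute.query.using.stats=true;
SET hive.stats.fetch.column.stats=true;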
03-11-2017
02:35 PM
@ccasano I'd also like to add to my comment that, by default, LLAP leverages the LRFU algorithm with a preemption strategy on the frequently-used side. This means that LLAP will always preempt long-running queries in favor of short, ad hoc queries, yet still allows for the occasional "bring-back-all-the-data" scenario without flushing the ad hoc query cache. This allows for better query concurrency and provides optimal performance for the majority of BI workloads. In addition, LLAP does not run each query in its own YARN container, which would limit concurrency since each container is a user session, i.e. a job. Most BI use cases involve many users running ad hoc queries and then keeping their sessions open as they look at the reports. By leveraging Tez AMs for queries, LLAP achieves much higher concurrency. With the combination of LLAP's LRFU algorithm, use of Tez AMs, caching, and AtScale's Adaptive Cache, users get a nice boost in performance and concurrency out of the box.
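If you want to look at or tune the cache policy, these are the LLAP daemon properties I'd start with (treat this as a hedged sketch and verify the names against your Hive version):
hive.llap.io.enabled (turns the LLAP I/O layer and cache on or off)
hive.llap.io.use.lrfu (selects the LRFU eviction policy, which is the default)
hive.llap.io.lrfu.lambda (weights recency vs. frequency within the LRFU policy)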
03-10-2017
03:56 PM
2 Kudos
Hi @ccasano, current limits are more tightly related to query performance, because queries that take a long time can hold threads open, which only serves to back up other users. So getting queries served from the Adaptive Cache and improving query performance is important. That being said, there will be some clients with enterprise-grade concurrency requirements. When that is the case, the recommendation is to stand up additional AtScale nodes, which synchronize with one another, with a load balancer in front. Of course, this also assumes that the increased concurrency demand can be serviced by the HDP cluster leveraging the Adaptive Cache. Hope this helps.
03-01-2017
06:25 PM
https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.QueryDatabaseTable/
02-23-2017
06:54 PM
5 Kudos
Hi @James Dinkel, I'm guessing there is a memory sizing issue. Make sure you follow these sizing rules (a worked example follows the list):
- MemPerDaemon (container size) > LLAP heap size (Java process heap) + cache size (off-heap) + headroom
- MemPerDaemon should be a multiple of the YARN minimum allocation and less than yarn.nodemanager.resource.memory-mb
- Headroom is capped at 6 GB
- QueueSize (YARN queue) >= MemPerDaemon * number of daemons + Slider AM + (Tez AM size * concurrency)
- CacheSize = MemPerDaemon - (Hive Tez container size * number of executors)
- Number of executors per daemon = (MemPerDaemon - CacheSize) / Hive Tez container size
In addition, be sure your LLAP queue is set up appropriately and has sufficient capacity:
- <queue>.user_limit_factor = 1
- <queue>.maximum-am-resource-percent = 1 (it's actually a factor between 0 and 1)
- <queue>.capacity = 100
- <queue>.max-capacity = 100
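As a purely illustrative worked example (the numbers below are assumptions, not recommendations): suppose yarn.nodemanager.resource.memory-mb = 96 GB, the Hive Tez container size is 4 GB, and you want 12 executors per daemon on 3 daemons, with a 1 GB Slider AM, a 4 GB Tez AM, and a query concurrency of 4. Then, roughly:
- LLAP heap ≈ 12 executors * 4 GB = 48 GB
- MemPerDaemon = 90 GB (a multiple of a 2 GB YARN minimum allocation, below the 96 GB node limit)
- CacheSize ≈ 90 GB - 48 GB heap - 6 GB headroom = 36 GB (working back from the first rule so the headroom is preserved)
- QueueSize >= 3 * 90 GB + 1 GB Slider AM + (4 GB Tez AM * 4 concurrency) = 287 GB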
01-24-2017
03:12 PM
Hi @suresh krish, it appears you may have a Ranger policy preventing access to the table. You can disable Ranger authorization for Hive through Ambari in the Hive configs, or review the Hive Ranger policies and grant the appropriate authorization. This HCC thread has some additional information: https://community.hortonworks.com/questions/64345/how-to-add-another-hiveserver-for-current-metastor.html
01-23-2017
07:44 PM
2 Kudos
Hi @Geetha Anne, our most recent release is HDP 2.5.3. We provide support for a rolling window of the previous two versions, which means we still provide support for HDP 2.3. HDP 2.3 was released on June 8, 2015.
01-18-2017
04:08 PM
@Yasir Faiz I believe LLAP will be GA in HDP 2.6, which is due out sometime in Q2. The advantage of HDP 2.5 would be the ability to test LLAP and Hive 2.0. HDP 2.5 also includes a number of fixes to Hive 1.2.1: http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.3/bk_release-notes/content/fixed_issues.html
01-18-2017
03:34 PM
2 Kudos
@Yasir Faiz Hortonworks fully supports Spark, but we do not support Hive on Spark. There are a couple of reasons for this: 1. Performance: Hive on Tez showed a 50x or greater improvement over Hive on MR, while Hive on Spark showed only about a 3x improvement over Hive on MR. 2. Scale: Hive on Tez has been proven to scale to data sets well beyond what Hive on Spark has been shown to handle. Here is a presentation discussing some of the differences: http://www.slideshare.net/hortonworks/hive-on-spark-is-blazing-fast-or-is-it-final With LLAP, Hive speed increases even further, but LLAP is not a replacement for Tez; in fact, LLAP still leverages the Tez engine. Hope this helps!
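For reference (a minimal sketch; Tez is typically already the default engine on HDP), the execution engine is selected per session with:
SET hive.execution.engine=tez;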