Member since: 06-13-2016
Posts: 20
Kudos Received: 0
Solutions: 0
01-30-2018
08:49 AM
Any suggestions?
01-19-2018
03:14 AM
Hi, I am running an ETL job that populates data from Hive into MySQL. The SELECT query has multiple UNIONs, and the job takes about 1 hour to execute. Most of the time the transformation (ETL) succeeds, but occasionally it fails with the exception below. The Hive JDBC connection is to a non-kerberized HDP cluster, and no error is logged in HiveServer2 or the Hive metastore: org.apache.thrift.transport.TTransportException
2018/01/18 15:11:21 - Table input 3.0 -
2018/01/18 15:11:21 - Table input 3.0 - at org.pentaho.di.core.database.Database.openQuery(Database.java:1768)
2018/01/18 15:11:21 - Table input 3.0 - at org.pentaho.di.trans.steps.tableinput.TableInput.doQuery(TableInput.java:236)
2018/01/18 15:11:21 - Table input 3.0 - at org.pentaho.di.trans.steps.tableinput.TableInput.processRow(TableInput.java:140)
2018/01/18 15:11:21 - Table input 3.0 - at org.pentaho.di.trans.step.RunThread.run(RunThread.java:62)
2018/01/18 15:11:21 - Table input 3.0 - at java.lang.Thread.run(Thread.java:748)
2018/01/18 15:11:21 - Table input 3.0 - Caused by: java.sql.SQLException: org.apache.thrift.transport.TTransportException
2018/01/18 15:11:21 - Table input 3.0 - at org.apache.hive.jdbc.HiveStatement.waitForOperationToComplete(HiveStatement.java:365)
2018/01/18 15:11:21 - Table input 3.0 - at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:242)
2018/01/18 15:11:21 - Table input 3.0 - at org.apache.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:437)
2018/01/18 15:11:21 - Table input 3.0 - at org.pentaho.di.core.database.Database.openQuery(Database.java:1757)
2018/01/18 15:11:21 - Table input 3.0 - ... 4 more
2018/01/18 15:11:21 - Table input 3.0 - Caused by: org.apache.thrift.transport.TTransportException
2018/01/18 15:11:21 - Table input 3.0 - at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
2018/01/18 15:11:21 - Table input 3.0 - at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
2018/01/18 15:11:21 - Table input 3.0 - at org.apache.thrift.transport.TSaslTransport.readLength(TSaslTransport.java:376)
2018/01/18 15:11:21 - Table input 3.0 - at org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:453)
2018/01/18 15:11:21 - Table input 3.0 - at org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:435)
2018/01/18 15:11:21 - Table input 3.0 - at org.apache.thrift.transport.TSaslClientTransport.read(TSaslClientTransport.java:37)
2018/01/18 15:11:21 - Table input 3.0 - at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
2018/01/18 15:11:21 - Table input 3.0 - at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429)
2018/01/18 15:11:21 - Table input 3.0 - at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318)
2018/01/18 15:11:21 - Table input 3.0 - at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
2018/01/18 15:11:21 - Table input 3.0 - at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:77)
2018/01/18 15:11:21 - Table input 3.0 - at org.apache.hive.service.cli.thrift.TCLIService$Client.recv_GetOperationStatus(TCLIService.java:413)
2018/01/18 15:11:21 - Table input 3.0 - at org.apache.hive.service.cli.thrift.TCLIService$Client.GetOperationStatus(TCLIService.java:400)
2018/01/18 15:11:21 - Table input 3.0 - at sun.reflect.GeneratedMethodAccessor31.invoke(Unknown Source)
2018/01/18 15:11:21 - Table input 3.0 - at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
2018/01/18 15:11:21 - Table input 3.0 - at java.lang.reflect.Method.invoke(Method.java:498)
2018/01/18 15:11:21 - Table input 3.0 - at org.apache.hive.jdbc.HiveConnection$SynchronizedHandler.invoke(HiveConnection.java:1374)
2018/01/18 15:11:21 - Table input 3.0 - at com.sun.proxy.$Proxy59.GetOperationStatus(Unknown Source)
2018/01/18 15:11:21 - Table input 3.0 - at org.apache.hive.jdbc.HiveStatement.waitForOperationToComplete(HiveStatement.java:332)
2018/01/18 15:11:21 - Table input 3.0 - ... 7 more
Is there any workaround or configuration, in terms of the Hive JDBC connection, to get rid of this error? I appreciate your feedback/suggestions. Thanks!!!
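One thing worth checking, offered as a hedged suggestion rather than a confirmed fix: intermittent TTransportExceptions during a long-running statement are sometimes caused by HiveServer2, or something in between (firewall, load balancer), dropping connections or operations it considers idle while the query is still executing. The HiveServer2 properties below (set in hive-site.xml) control idle-operation and idle-session cleanup; the values are only illustrative and would need to be sized against your roughly one-hour job:
hive.server2.idle.operation.timeout=7200000
hive.server2.idle.session.timeout=28800000
hive.server2.session.check.interval=3600000
If the failures remain intermittent after that, retrying the Table input step at the ETL level is a pragmatic workaround, since the query usually succeeds on a rerun.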
Labels:
- Apache Hive
11-07-2017
07:16 AM
When a couple of users run Hive queries through the AUTO_HIVE20_VIEW instance in Ambari, even simple queries take a long time to respond. I have tried tuning the following parameters in the ambari.properties file:
client.threadpool.size.max = 100
views.ambari.request.read.timeout.millis=12000
views.request.read.timeout.millis=120000
views.ambari.hive.<HIVE_VIEW_INSTANCE_NAME>.result.fetch.timeout=120000
However, it does not help. Moreover, memory utilization on the ambari-server instance is also low. I have observed the following lines in hive20-view.log:
07 Nov 2017 07:16:04,703 ERROR [HiveViewActorSystem-akka.actor.default-dispatcher-1829] [HIVE 2.0.0 AUTO_HIVE20_INSTANCE] OperationController:174 - Cannot update Dag Information for job. Job with id: 271 for instance: AUTO_HIVE20_INSTANCE has either not started or has expired.
07 Nov 2017 07:16:07,716 INFO [ambari-client-thread-16477] [HIVE 2.0.0 AUTO_HIVE20_INSTANCE] Aggregator:328 - Saving DAG information via actor system for job id: 271
07 Nov 2017 07:16:07,716 ERROR [HiveViewActorSystem-akka.actor.default-dispatcher-1829] [HIVE 2.0.0 AUTO_HIVE20_INSTANCE] OperationController:174 - Cannot update Dag Information for job. Job with id: 271 for instance: AUTO_HIVE20_INSTANCE has either not started or has expired.
Kindly help me resolve this issue. Thanks!!
11-07-2017
06:42 AM
Thanks for your reply. So I need to use the INSERT OVERWRITE command to periodically load the data from the orig_log table into the orc_log table, right after creating the partition on the day column.
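For what it's worth, a minimal HiveQL sketch of that approach, with the table names taken from this thread and the column lists left as placeholders (this assumes a day value already exists in, or can be derived from, orig_log):
CREATE TABLE orc_log (<same columns as orig_log, without day>)
PARTITIONED BY (day STRING)
STORED AS ORC
TBLPROPERTIES ("orc.compress"="SNAPPY");

INSERT OVERWRITE TABLE orc_log PARTITION (day='2017-11-04')
SELECT <same columns, without day>
FROM orig_log
WHERE day='2017-11-04';
Because INSERT OVERWRITE replaces only the named partition, rerunning the load for a given day is idempotent, which is convenient for a periodic sync job.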
11-04-2017
06:07 PM
Hi, I have used org.openx.data.jsonserde.JsonSerDe to load log data (fields such as Map, etc.) into a Hive external table, and I am able to query the data properly. However, the query response time is high, so I created an ORC-derived table from the staging table using:
CREATE EXTERNAL TABLE <orc_log> stored as ORC tblproperties("orc.compress"="SNAPPY") AS SELECT * from orig_log;
There are two questions:
1. How do I keep the ORC table in sync with the orig_log table, into which data is loaded incrementally?
2. The ANALYZE TABLE statement fails for both the orig_log and orc_log tables because complex JSON data types such as Map are not supported.
It would be great if you could suggest a way to overcome/resolve this issue. Thanks in advance!!!
Labels:
- Apache Hive
11-01-2017
08:32 AM
Thank you Jay, I will post an update on how it goes. Meanwhile, is it possible to audit the Hive queries executed by users through the Ambari view, or do we need to install Ranger for audit logs?
11-01-2017
02:48 AM
Problem: The scenario is to allow multiple users (created using the Ambari console) to access HiveServer2 installed in the cluster. We therefore created the users/groups using the admin login and also created home directories /user/<username> in HDFS. When a user logs in to the console, opens the Ambari Hive View 2.0 instance, and tries to execute queries, after a few queries there is no response at all, even when trying to stop the execution. But the same set of users can log in to Beeline and execute queries, and there is always a response. Since we need to provide a GUI for users to execute queries (rather than the Hive CLI or Beeline), we thought of using Ambari views. Is there any difference in terms of accessing HiveServer2 through Ambari views versus Beeline? Any help with tuning the usage of the Ambari view is appreciated. Thanks in advance!!!
Labels:
- Apache Ambari
- Apache Hive
11-01-2017
02:23 AM
Thanks so much for your detailed reply, it really helps!!!
10-27-2017
03:07 AM
We have a 2-node cluster (1 master: 4 CPU, 16 GB RAM + 1 data node: 8 CPU, 30 GB RAM). However, in the Ambari console I can see that the total cluster memory is only 22 GB. Is there a way to allocate more cluster memory (around 36 GB) out of the 46 GB of physical memory we have across the master and data nodes? Moreover, the number of containers is only 5, whereas 8 vcores are already available. I have attached a screenshot for your reference. Please suggest a way to improve cluster resource utilization. Thank you in advance.
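Not a definitive answer, but a hedged pointer: the total that Ambari shows is not physical RAM, it is the sum of yarn.nodemanager.resource.memory-mb across the NodeManagers, and the container count follows from that total divided by the container size. Assuming both nodes run a NodeManager (using Ambari config groups if the two nodes need different values), settings along these lines in yarn-site.xml would raise the cluster total toward the ~36 GB you mention; the numbers are illustrative only and leave headroom for the OS and the master services:
yarn.nodemanager.resource.memory-mb=8192   (on the 16 GB master)
yarn.nodemanager.resource.memory-mb=26624  (on the 30 GB data node)
yarn.scheduler.minimum-allocation-mb=2048
yarn.scheduler.maximum-allocation-mb=8192
A smaller minimum allocation generally yields more, smaller containers, which is what lets the 8 available vcores actually be used.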
Labels:
- Apache Ambari
- Apache YARN
10-25-2017
01:24 AM
We have a 2-node cluster (1 master: 4 CPU, 16 GB RAM + 1 data node: 8 CPU, 30 GB RAM), and the estimated amount of data being processed through Hive tables is 100 GB. We are using an Ambari Hive 2.0 view instance running on the master, and the estimated number of support/analytics users is around 15-20. When we access the Hive instance separately for each user (per session), all Hive queries (using Tez) are processed via the YARN default queue. The expectation is to get Hive results in parallel for each session, but these Tez jobs are executed in sequence, and performance is the major constraint here. We don't want to add more nodes, as the data being processed is still in the GBs, and we want to improve the parallelism of Hive query execution with the current hardware configuration. We have also applied Hive tuning parameters such as:
set hive.cbo.enable=true;
set hive.compute.query.using.stats=true;
set hive.stats.fetch.column.stats=true;
set hive.stats.fetch.partition.stats=true;
along with converting the table into ORC format. Even then, query response time and parallelism have not improved. Any help related to this is highly appreciated. Thanks!!!
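One direction to look at, offered as a suggestion rather than a verified fix for this cluster: HiveServer2 can keep several Tez sessions open against the default queue so that queries from different users do not have to wait behind a single session. These are standard HiveServer2/Tez properties; the values are illustrative only:
hive.server2.tez.initialize.default.sessions=true
hive.server2.tez.default.queues=default
hive.server2.tez.sessions.per.default.queue=4
hive.tez.container.size=2048
Smaller Tez containers matter here because, with only a handful of containers available on a 2-node cluster, a single query can otherwise occupy the whole queue and force the other sessions to wait.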
Labels:
- Apache Hive
10-12-2017
04:15 PM
Thanks much
10-12-2017
10:49 AM
Thanks for your response. Is it possible to minimize the response time by converting my table to ORC or Parquet format?
10-12-2017
10:48 AM
Thanks for your reply. Yes, true, we cannot compare with an RDBMS, as Hive and an RDBMS are meant for different purposes. However, it is evident that Hive is still useful for batch analytics, but not for interactive analytics (at least for now).
10-11-2017
04:07 AM
My use case is to perform interactive analytics on top of log data (JSON format) stored in HDFS and in a Hive table (TEXTFILE format). We have around 30 million records, and the size of the dataset is around 60 GB. Since Tez is the default query engine for my Hive version, I expected the query results to be fast enough, but the response time for even a count() was around 30 seconds. What would be the best practice or recommendation for performing interactive log analytics using Hive? Do I need to use a Hive table in RC/ORC format rather than TEXT? My customer is comparing the query response time with an RDBMS in this case. I appreciate your suggestions on an approach/solution to satisfy my use case. Thanks!!!
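A rough sketch of the usual direction, with the caveat that whether it is fast enough for a 60 GB dataset is something you would have to measure: keep the JSON in the external text table for landing, maintain an ORC copy partitioned by date for the interactive queries, and enable vectorized execution (which in this Hive generation effectively requires a columnar format like ORC rather than TEXTFILE). Table and column names below are placeholders:
CREATE TABLE log_orc (<columns parsed from the JSON>)
PARTITIONED BY (dt STRING)
STORED AS ORC;

INSERT OVERWRITE TABLE log_orc PARTITION (dt='2017-10-10')
SELECT <columns> FROM log_text WHERE dt='2017-10-10';

set hive.vectorized.execution.enabled=true;
set hive.vectorized.execution.reduce.enabled=true;
Partition pruning plus ORC's built-in statistics are what typically bring simple counts and filters down from tens of seconds, though Hive on Tez will still not match an indexed RDBMS for point lookups.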
Labels:
- Apache Hive
- Apache Tez
07-14-2016
01:14 AM
Hello, is there a way to restrict/protect access to the following service URLs through a browser? As of now, all of these URLs are accessible without authentication, and our security assessment team has listed them as vulnerabilities.
http://domainame:50070/logs/
http://domainame:50070/explorer.html#/
http://domainame:50070/dfshealth.html#tab-datanode
http://domainame:16030/rs-status
http://domainame:8088/cluster/cluster
http://domainame:8188/applicationhistory
http://domainame:8042/node
http://secondarynamenode:16010/logs/
http://datanode:61310/logs/
Your speedy response is highly appreciated. Thanks
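I cannot say which option your assessment team would accept, but one commonly used approach for the Hadoop web UIs (NameNode, ResourceManager, timeline/application history, DataNode/NodeManager pages) is to enable HTTP authentication through Hadoop's AuthenticationFilter in core-site.xml. SPNEGO/Kerberos is the usual production setup; the property names below are the standard ones, while the values only show the shape of the configuration and assume a Kerberos environment:
hadoop.http.filter.initializers=org.apache.hadoop.security.AuthenticationFilterInitializer
hadoop.http.authentication.type=kerberos
hadoop.http.authentication.kerberos.principal=HTTP/_HOST@EXAMPLE.COM
hadoop.http.authentication.kerberos.keytab=/etc/security/keytabs/spnego.service.keytab
hadoop.http.authentication.simple.anonymous.allowed=false
The HBase UIs (the rs-status and port 16010 pages) are configured separately in hbase-site.xml, and restricting the ports at the network/firewall level is the usual fallback if enabling Kerberos is not an option on this cluster.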
Tags:
- Security
Labels:
- Security
06-14-2016
09:25 AM
Thanks, Deepesh, for your very helpful response.
06-13-2016
10:56 PM
We are using the HDP 2.2 stack, and MySQL 5.6.x is bundled by default. The security team's assessment report shows that there are vulnerabilities in MySQL 5.6, and the remedy could be to upgrade to MySQL 5.6.30 or above. I am not sure how to upgrade MySQL alone in HDP 2.2, so it would be good to hear from anyone who has experience upgrading MySQL 5.6.x to 5.6.30 or above. Please reply.
Labels: