Member since: 03-29-2020
Posts: 110
Kudos Received: 10
Solutions: 16
06-29-2021
07:32 AM
Hello @K_K

Once you run the query in Beeline, note its queryID and trace that queryID in the HiveServer2 logs. This shows how much time the query spends in the HTTP handler thread and in the background thread, so you can spot any slowness at that stage.

After that stage the job is submitted to YARN, so check the YARN application log for the query to see where it slows down: at the AM level, at container allocation, or at the task level. This way you can see where the time is going.

If it is a managed table, you can run a major compaction on the table to merge all the delta files into a single base file, which avoids scanning many separate files on HDFS when the query runs. You can also run EXPLAIN on the query to see the plan and how much data it processes, and run ANALYZE on the table to collect table and column statistics, which generally improves query performance.

Not every job can complete in under 4 seconds.

Reference:
https://cwiki.apache.org/confluence/display/Hive/StatsDev#StatsDev-ANALYZETABLE%3Ctable1%3ECACHEMETADATA
https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.0.1/performance-tuning/content/hive_query_result_cache_ms_cache.html
https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.5/using-hiveql/content/hive_hive_3_tables.html
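The compaction, statistics, and explain steps mentioned in this post can be written out as HiveQL statements. The following is only a minimal sketch that builds those statements as strings so they can be pasted into Beeline. The helper name `tuning_statements`, the table name `sales`, and the sample query are hypothetical and not from the original post. For a partitioned table, compaction is normally requested per partition, which this sketch does not cover.

```python
from typing import List


def tuning_statements(table: str, query: str) -> List[str]:
    """Return HiveQL statements for the checks described in the post.

    `table` and `query` are supplied by the caller; nothing is executed here.
    """
    # Drop a trailing semicolon so EXPLAIN wraps a bare query.
    bare_query = query.strip().rstrip(";").rstrip()
    return [
        # Merge the delta files of a managed (ACID) table into one base file.
        f"ALTER TABLE {table} COMPACT 'major'",
        # Collect table-level statistics.
        f"ANALYZE TABLE {table} COMPUTE STATISTICS",
        # Collect column-level statistics.
        f"ANALYZE TABLE {table} COMPUTE STATISTICS FOR COLUMNS",
        # Show the execution plan without running the query.
        f"EXPLAIN {bare_query}",
    ]


if __name__ == "__main__":
    # Hypothetical table and query, for illustration only.
    for stmt in tuning_statements("sales", "SELECT count(*) FROM sales;"):
        print(stmt + ";")
```

Compaction runs asynchronously, so the effect on query time may only show up once the compaction has finished.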
06-26-2021
09:40 AM
1 Kudo
Hi @PURUSHOTHAMAN_S I can see there are a lot of alerts (28) in Ambari. If I were you, I would start with the HDFS service and confirm that the NameNode is up and running, because other services depend on it to come up. Then check YARN, and after that you can concentrate on the others. Check the Ambari startup logs to see why and where the startup is failing. Hope it helps.
06-23-2021
10:50 PM
Hello @K_K Hope you are doing great. MapReduce2 and Tez can return output in under 4 seconds, but that depends on many factors, such as query complexity, queue sizing, input data, and resource availability.
06-20-2021
06:55 AM
@Bryan_zh I believe HDP 3.1.5 supports Spark 2.x only. Please check the link below:
https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.5/spark-overview/content/analyzing_data_with_apache_spark.html
How to integrate Hive and Spark:
https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.0.1/integrating-hive/content/hive_hivewarehouseconnector_for_handling_apache_spark_data.html
06-20-2021
06:38 AM
Hello @prasanna06 Could you check the link below and see if it helps? https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/auto_tls.html
06-20-2021
06:16 AM
Hello @vidanimegh

Error: Error while compiling statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Application application_1623850591633_0042 failed 2 times due to AM Container for appattempt_1623850591633_0042_000002 exited with exitCode: -104
Failing this attempt.Diagnostics: [2021-06-18 18:32:52.722]Container [pid=32822,containerID=container_e49_1623850591633_0042_02_000001] is running 34230272B beyond the 'PHYSICAL' memory limit. Current usage: 2.0 GB of 2 GB physical memory used; 3.9 GB of 4.2 GB virtual memory used. Killing container.

As I can see, your job is failing because the AM container exceeds its PHYSICAL memory limit. Could you set the properties below at the Beeline session level, re-run the analysis query, and see how it goes?

set hive.tez.container.size=8192;
set hive.tez.java.opts=-Xmx6553m;
set tez.runtime.io.sort.mb=3072;
set tez.task.resource.memory.mb=8192;
set tez.am.resource.memory.mb=8192;
set tez.am.launch.cmd-opts=-Xmx6553m;
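The heap values suggested in this post (6553 for an 8192 MB container) work out to roughly 80% of the container size. The sketch below shows that arithmetic and prints matching `set` statements for a chosen container size. The 0.8 fraction is an assumption inferred from the numbers in the post, not a documented default, and the helper name `tez_memory_settings` is hypothetical. `tez.runtime.io.sort.mb` is left out because the post gives no rule for deriving it.

```python
from typing import Dict


def tez_memory_settings(container_mb: int, heap_fraction: float = 0.8) -> Dict[str, str]:
    """Derive session-level Tez memory settings from one container size.

    heap_fraction is an assumed ratio of JVM heap to container memory;
    the remainder is headroom for non-heap JVM memory.
    """
    heap_mb = int(container_mb * heap_fraction)  # truncate to whole MB
    return {
        "hive.tez.container.size": str(container_mb),
        "hive.tez.java.opts": f"-Xmx{heap_mb}m",  # note the 'm' unit suffix
        "tez.task.resource.memory.mb": str(container_mb),
        "tez.am.resource.memory.mb": str(container_mb),
        "tez.am.launch.cmd-opts": f"-Xmx{heap_mb}m",
    }


if __name__ == "__main__":
    # 8192 MB is the container size used in the post.
    for key, value in tez_memory_settings(8192).items():
        print(f"set {key}={value};")
```

The unit suffix on `-Xmx` matters: without the trailing `m` the JVM reads the value as bytes.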
06-19-2021
11:36 PM
Hello @Bryan_zh Hive 3 is the default version in HDP 3.1.5, and you cannot downgrade it to Hive 2.3.7. Downgrading Hive from 3.x to 2.x is also not recommended.
05-16-2021
10:05 AM
Hello @ryu Could you take a screenshot of the message and share it with us? Which HDP and Ambari versions are you using?
05-16-2021
10:03 AM
Hello @Enigmat Could you try DISTINCT to remove duplicate entries?
https://dwgeek.com/identify-and-remove-duplicate-records-from-hive-table.html/
https://stackoverflow.com/questions/43280052/how-to-delete-duplicate-records-from-hive-table
05-16-2021
09:56 AM
Hello @bsaad
1. Could you check whether you can reach the internet from the Oracle VM, for example with a ping test to google.com?
2. Could you verify that port 8889 is up and listening by running the following command as the root user?
# netstat -ntpla | grep 8889
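If netstat is not available on the VM, a rough equivalent of the port check can be sketched with a plain TCP connection attempt. This is only an illustration: the helper name `port_is_listening` is hypothetical, and it assumes the service is reachable on 127.0.0.1. Unlike netstat, it does not show which process owns the port.

```python
import socket


def port_is_listening(port: int, host: str = "127.0.0.1", timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # 8889 is the port mentioned in the post.
    print("port 8889 listening:", port_is_listening(8889))
```

A False result can also mean a firewall is blocking the connection, so it is worth confirming with netstat on the host itself when possible.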