Member since
03-29-2020
110
Posts
10
Kudos Received
16
Solutions
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 273 | 01-08-2022 07:17 PM
 | 701 | 09-22-2021 09:39 AM
 | 3447 | 09-14-2021 04:21 AM
 | 944 | 09-01-2021 10:28 PM
 | 1191 | 08-31-2021 08:04 PM
01-08-2022
07:17 PM
Can you take a look at the below links for the question about creating an ORC table with Snappy compression?
https://community.cloudera.com/t5/Support-Questions/Data-Compression-Doesn-t-work-in-ORC-with-SNAPPY-Compression/td-p/172151
https://community.cloudera.com/t5/Support-Questions/Snappy-vs-Zlib-Pros-and-Cons-for-each-compression-in-Hive/m-p/97110
https://community.cloudera.com/t5/Community-Articles/Performance-Comparison-b-w-ORC-SNAPPY-and-ZLib-in-hive-ORC/ta-p/246948
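As a quick hedged sketch of what those threads land on (the table name sales_orc, its columns, and the file name are made up for illustration), the ORC compression codec is usually set through TBLPROPERTIES:

```shell
# Minimal sketch: write the DDL for a Snappy-compressed ORC table to an
# HQL file. Names below are hypothetical, not from the linked threads.
cat > create_orc_snappy.hql <<'EOF'
CREATE TABLE sales_orc (id INT, amount DOUBLE)
STORED AS ORC
TBLPROPERTIES ("orc.compress"="SNAPPY");
EOF

# On a real cluster you would then run it with beeline, e.g.:
#   beeline -u "jdbc:hive2://<FQDN>:10000" -f create_orc_snappy.hql
cat create_orc_snappy.hql
```

The TBLPROPERTIES route pins the codec to the table itself, rather than relying on session-level compression settings.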
11-09-2021
06:44 AM
Hi @mhchethan Generally, we suggest using a single LDAP URL. If you want more than one, you can configure a load balancer (LB), and that LB should connect to your backend LDAP servers.
https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.0.0/securing-hive/content/hive_secure_hiveserver_using_ldap.html
If you are happy with the reply, mark it "Accept as Solution".
11-09-2021
02:26 AM
Hi @Aarth You can add the Hive properties to the HQL file itself so that they are applied at the session level. Example:
beeline -u "jdbc:hive2://<FQDN>:10000" -f rajkumar.hql
Add all the required properties at the top of the HQL file so they take effect for that session when the file is executed:
set hive.compute.query.using.stats=false;
set hive.fetch.task.conversion=none;
If you are processing more than one HQL file, add the needed properties to each of them.
If you are happy with the reply, mark it "Accept as Solution".
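A minimal end-to-end sketch of the approach above (the query inside the file is a made-up placeholder, and the beeline step only works on a real cluster, so it is shown commented):

```shell
# Put the session-level properties at the top of the HQL file, followed by
# the actual query (placeholder query shown here).
cat > rajkumar.hql <<'EOF'
set hive.compute.query.using.stats=false;
set hive.fetch.task.conversion=none;
SELECT COUNT(*) FROM my_db.my_table;
EOF

# Then execute the whole file in one beeline session (cluster-only step):
#   beeline -u "jdbc:hive2://<FQDN>:10000" -f rajkumar.hql
cat rajkumar.hql
```

Because the set statements run first in the same session, they affect only that session and every statement after them in the file.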
10-11-2021
02:13 AM
Hi @PawanUppala It would be really helpful if you could share the complete error stack from just before HS2 goes down.
10-05-2021
06:07 AM
Hi @Sadique1 Can you run "which spark-shell" and share the results with us?
10-05-2021
06:06 AM
Hi @Swagat
(1). Can you run the below command in your Linux terminal and then re-run the "yarn logs -applicationId <appID>" command:
export HADOOP_USER_NAME=hdfs
(2). Based on the logs, the parent folder "/tmp/logs/hive" and its subfolders have 770 permissions with owner hive and group hadoop, so other users do not have permission to access them. The users ledapp3, calapr01, and gaiapr01 fall under "others", so they are not able to access the logs. Can you run one of the below commands as the hdfs user and ask your end users to try again:
hdfs dfs -chmod 775 <parent folder>
hdfs dfs -chmod -R 775 <parent folder>
hdfs dfs -chmod -R 777 <parent folder>   ### least recommended
Similar issue references:
https://community.cloudera.com/t5/Support-Questions/Permission-denied-user-root-access-WRITE-inode-quot-user/td-p/4943
https://community.cloudera.com/t5/Support-Questions/Permission-denied-as-I-am-unable-to-delete-a-directory-in/m-p/322469#M228801
If you are happy with the reply, mark it "Accept as Solution".
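HDFS permission bits follow the POSIX owner/group/others model, so the effect of 770 vs 775 can be sketched on a local directory (demo_logs is a made-up stand-in for /tmp/logs/hive):

```shell
# With 770, "others" have no bits at all, so any user outside the owning
# user and group is denied -- the situation described above.
mkdir -p demo_logs
chmod 770 demo_logs
stat -c '%a' demo_logs    # prints 770

# With 775, others gain read+execute, enough to list and traverse the logs.
chmod 775 demo_logs
stat -c '%a' demo_logs    # prints 775
```

The same triplet logic applies to hdfs dfs -chmod; only the tool differs.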
10-01-2021
02:38 AM
Hi @Sam2020 Can you check the below links and see whether they help?
Upgrading Cloudera Manager:
https://docs.cloudera.com/cdp-private-cloud-upgrade/latest/upgrade-cdp/topics/ug_cdh_upgrading_top.html
Upgrading a CDP cluster to a higher version:
https://docs.cloudera.com/cdp-private-cloud-upgrade/latest/upgrade-cdp/topics/ug-cdpdc.html
If you are happy with the reply, mark it "Accept as Solution".
09-28-2021
07:37 PM
Hi @Kicker
Question: How can I find the list of active sessions on my cluster, or how can I check by sessionId whether a session is active in Hive?
Answer: Please take a look at the below link, where we discussed a similar question:
https://community.cloudera.com/t5/Support-Questions/How-many-users-connected-to-HiveServer2/m-p/322372#M228765
You may have to log in to the HiveServer2 host and run the below commands to see the active connections to HiveServer2:
netstat -ntpla | grep 10000 | grep -i ESTABLISHED   ### replace 10000 with your HS2 port (10001 if you use HTTP transport mode)
netstat -ntpla | grep 10000   ### this also shows ESTABLISHED, CLOSE_WAIT, and other states
If you want to check the active sessions, you can find the details in the HiveServer2 Web UI:
Ambari > Hive > Quick Links > HiveServer2 Web UI
If you are happy with the reply, mark it "Accept as Solution".
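As a hedged illustration of how the ESTABLISHED count falls out of that first command, here is the same pipeline run over fabricated netstat-style lines (on the HS2 host you would pipe the real `netstat -ntpla` instead):

```shell
# Fabricated sample: two established clients plus the listener itself.
sample='tcp 0 0 10.0.0.5:10000 10.0.0.21:51512 ESTABLISHED 4242/java
tcp 0 0 10.0.0.5:10000 10.0.0.22:40100 ESTABLISHED 4242/java
tcp 0 0 0.0.0.0:10000 0.0.0.0:* LISTEN 4242/java'

# Keep only lines touching port 10000, then count the ESTABLISHED ones.
printf '%s\n' "$sample" | grep ':10000 ' | grep -c ESTABLISHED    # prints 2
```

The LISTEN line is the server socket itself, which is why the second grep is needed to count actual client connections.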
09-22-2021
09:39 AM
1 Kudo
Hi @nareshbattula
Q1: Hive Metastore performance.
Performance can vary based on memory tuning. Please check the below link for how to set the memory-related properties:
https://docs.cloudera.com/documentation/enterprise/5-7-x/topics/admin_hive_tuning.html
Q2: Which tables/databases are putting the most pressure on the Hive Metastore?
You may have to check the HMS logs to see which queries are taking a long time. If you are using HDP, you can find the current memory pressure and heap usage details via the below links:
https://docs.cloudera.com/HDPDocuments/Ambari-2.7.5.0/using-ambari-core-services/content/amb_hive_hivemetastore.html
https://docs.cloudera.com/HDPDocuments/Ambari-2.7.5.0/using-ambari-core-services/content/amb_hive_home.html
If you are using CM, you can see the details here:
CM > Hive > HiveMetastore > Charts > JVM Heap Usage / JVM Pause Time
Q3: Number of connections on HMS?
You can run the below commands to see the established connections to HMS:
netstat -ntpla | grep 9083
lsof -p <hms pid> | grep -i "ESTABLISHED"
If you are using CM, you can also see this here:
CM > Hive > HiveMetastore > Charts > Open Connections
If you are happy with the reply, mark it "Accept as Solution".
09-20-2021
11:22 PM
Hi @vciampa Please check the below documents, which should help you understand the requirements for upgrading from an HDP 2.x to a CDP 7.x cluster, including Hive:
https://docs.cloudera.com/cdp-private-cloud-upgrade/latest/upgrade-hdp/topics/amb-hdp-cdp-upg.html
https://docs.cloudera.com/cdp-private-cloud-upgrade/latest/upgrade-hdp/topics/ug_hive_validations.html
https://docs.cloudera.com/cdp-private-cloud-upgrade/latest/upgrade-cdh/topics/ug_hive_changes_in_cdp.html
If you are happy with the comment, mark it "Accept as Solution".
09-17-2021
06:56 AM
Hi @manojamr I am glad to know your original issue got resolved. As per your last comment, your query took 9.5 hours to complete. In this case, we need to check whether there is a delay, a hang, a resource crunch, or whether this runtime is normal. To figure that out, we would need the beeline console output, the queryId, the application log, and all HS2 and HMS logs. It would be great if you could create a case with Cloudera; we would be happy to assist you.
If you are happy with the reply, mark it "Accept as Solution".
09-17-2021
06:44 AM
1 Kudo
Hi @Kiddo Could you check whether the below links help with your query?
https://community.cloudera.com/t5/Support-Questions/Hive-Do-we-have-checksum-in-hive/td-p/104490
https://community.cloudera.com/t5/Support-Questions/Hive-Can-t-get-the-md5-value/m-p/117696
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF
If you are happy with the reply, mark it "Accept as Solution".
09-14-2021
04:21 AM
Hi @manojamr
Step 1: Run the following commands to gather table and column statistics for all the tables involved in the query:
analyze table <TABLE-NAME> compute statistics;
analyze table <TABLE-NAME> compute statistics for columns;
Reference: https://cwiki.apache.org/confluence/display/Hive/StatsDev
Step 2: Set the following properties at the session level:
set hive.tez.container.size=10240;
set hive.tez.java.opts=-Xmx8192m;
set tez.runtime.io.sort.mb=4096;
set tez.task.resource.memory.mb=7680;
set tez.am.resource.memory.mb=10240;
set tez.am.launch.cmd-opts=-Xmx8192m;
Step 3: Re-run the job in the beeline session. If it succeeds, you are done; if it fails again, please share the below details:
1. The complete query,
2. The beeline console output,
3. The queryId of the job,
4. The HS2 and HMS logs, and
5. The application logs.
09-01-2021
11:44 PM
@Eric_B Yes, your understanding is correct.
09-01-2021
10:28 PM
1 Kudo
Hi @saikat As I understand it, you are running a merge query and it is failing with a java.lang.OutOfMemoryError.
Step 1: Please run a major compaction on all the tables involved in the merge query (if they are ACID tables; otherwise skip Step 1). Once the major compaction is triggered, make sure it has completed by running the "show compactions;" command in beeline. This reduces some of the stats-collection burden on Hive. To run a major compaction:
alter table <table name> compact 'MAJOR';
Step 2: Once Step 1 is done, set the following properties at the beeline session level and re-run the merge query:
set hive.tez.container.size=16384;
set hive.tez.java.opts=-Xmx13107m;
set tez.runtime.io.sort.mb=4096;
set tez.task.resource.memory.mb=16384;
set tez.am.resource.memory.mb=16384;
set tez.am.launch.cmd-opts=-Xmx13107m;
set hive.auto.convert.join=false;
The Tez container and AM size are set to 16 GB here. If the query still fails, you can increase them to 20 GB (hive.tez.java.opts and tez.am.launch.cmd-opts then need to be set to 80% of the container/AM size, i.e. 16384 MB for a 20 GB container). If the query succeeds with 16 GB, you can try decreasing to 14/12/10 GB to find the threshold where it fails versus succeeds. In this way, you can save resources.
If you are happy with the comment, mark it "Accept as Solution".
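The 80% rule of thumb is simple arithmetic; a quick sketch using the sizes from this thread:

```shell
# JVM -Xmx (hive.tez.java.opts / tez.am.launch.cmd-opts) is set to roughly
# 80% of the container/AM size, leaving headroom for non-heap memory.
container_mb=16384                      # hive.tez.container.size (16 GB)
echo $(( container_mb * 80 / 100 ))     # prints 13107, i.e. -Xmx13107m

bigger_mb=20480                         # the suggested 20 GB fallback
echo $(( bigger_mb * 80 / 100 ))        # prints 16384, i.e. -Xmx16384m
```

The ~20% gap covers JVM overhead (metaspace, thread stacks, direct buffers) so the container is not killed for exceeding its memory limit.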
08-31-2021
08:04 PM
If I installed a later version of ZooKeeper (for example), would Ambari recognize that later version in its management? Or would it exist in parallel with the version of ZooKeeper packaged with 3.1.5?
> You have to install ZooKeeper, or any other component, via Ambari only. If you install it manually (via yum or apt) on the server, Ambari will not recognize or manage it.
Grafana is running v6.4.2, but has a major security issue that was patched in later releases: https://grafana.com/blog/2020/06/03/grafana-6.7.4-and-7.0.2-released-with-important-security-fix/ Infra Solr is running Solr 7.7 and has an RCE vulnerability; this was patched in Solr 8.3, which is not part of Ambari 2.7.5's Infra Solr. The packaged ZooKeeper is 3.4.6, but SSL support was added in 3.5.5.
> As mentioned already, please create a support case with Cloudera along with the CVE number of the vulnerability, so we can check with our team and confirm whether our product is affected by the security concern. If it is, we can provide a patch to overcome it.
If you are happy with the comment, mark it "Accept as Solution".
08-31-2021
01:36 AM
Hi @Eric_B
I saw some questions talking about "Patch Upgrades", but is there a guide to upgrading individual components in a cluster via Ambari or otherwise?
> You are not able to upgrade individual components via Ambari. You can either install a component or upgrade to the next available HDP 3.x version, but I can see you are already on the latest version, 3.1.5. If you believe your Hadoop components have a particular vulnerability, please feel free to raise a case with Cloudera and we will check and clarify. If the vulnerability is legitimate and could harm your infrastructure, we can provide a patch for the issue so you can overcome it.
If you are happy with the comment, mark it "Accept as Solution".
08-22-2021
12:09 AM
Hi @Nil_kharat If the issue is still not resolved, you may need to check the HS2 logs and application logs to figure out the slowness.
> And one more thing: how can we track the jobs that are run by each user?
1. Go to RM UI > Running/Finished/Killed and check the User column.
2. CM > YARN > Applications: you can search here based on the user.
If you are happy with the response, mark it "Accept as Solution".
08-19-2021
12:59 AM
Hi @Nil_kharat Generally, in Hive you may see issues such as query slowness, query failures, configuration issues, alerts, services going down, vulnerability concerns, and the occasional bug.
08-16-2021
09:06 AM
Hi @Nil_kharat If your jobs are stuck in the ACCEPTED state, it is most probably because the AM does not have enough memory to launch in that particular queue. You can click on the particular ACCEPTED job to see the details. Can you try increasing the Maximum AM Resource (Ambari > Tile Icon > YARN Queue Manager > particular queue) to 50%, then re-run the query and check?
If you are happy with the response, mark it "Accept as Solution".
08-15-2021
10:43 PM
Hi @Nil_kharat If you are running a Hive job and notice it is slower than an earlier run, you may need to check a few things, in the below order:
1. Is there any change in the code?
2. Has any new data been loaded into the table being processed?
3. The beeline console output, where you can see the Hive counters.
4. The HiveServer2 and Hive Metastore logs, to compare the compilation and execution times of the faster and slower runs.
5. The queryId of the job (to search for in the HS2 and application logs).
6. The application logs of the faster and slower runs, to identify where the slowness occurs.
7. "explain <query>" for the faster and slower runs (to see record-count and stats changes).
If you are happy with the response, mark it "Accept as Solution".
08-14-2021
07:41 PM
Hi @Nil_kharat Yes, you can also use that lsof command to figure out the number of established connections on a particular port.
08-13-2021
04:50 AM
1 Kudo
Hi @amitshanker Thanks for the update. I can see that you have set hive.server2.webui.use.spnego to true, which means Kerberos is enabled in your cluster. If SPNEGO and Kerberos are enabled, a few settings need to be changed in the browser. Could you please follow the below link:
https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/cdh_sg_browser_access_kerberos_protected_url.html
08-12-2021
09:37 PM
1 Kudo
Thanks for the details.
Can you share the hive-site.xml file?
Can you share a complete screenshot of the error you are facing?
Do you have Kerberos enabled in your cluster?
08-12-2021
01:46 AM
1 Kudo
Hi @amitshanker
Could you let me know whether you are using CDH, HDP, or CDP?
Which document did you follow to enable the HS2 UI?
Can you share a screenshot of the error you are facing?
Do you have Kerberos enabled in your cluster?
08-12-2021
01:44 AM
Hi @ryu
[root@test02 ~]# hdfs dfs -rmr /tmp/root/testdirectory
...
21/08/11 12:08:30 WARN fs.TrashPolicyDefault: Can't create trash directory: hdfs://test/user/root/.Trash/Current/tmp/root
...
rmr: Failed to move to trash: hdfs://test/tmp/root/testdirectory: Permission denied: user=root, access=WRITE, inode="/user":hdfs:hdfs:drwxr-xr-x
Looking at the logs: you are trying to delete testdirectory, and since trash is enabled, HDFS first tries to move it to the trash directory under "/user/root/.Trash". The folder /user (inode="/user":hdfs:hdfs:drwxr-xr-x) is owned by user hdfs and group hdfs, so the user root falls under "others" (the third set of bits, r-x). Others do not have write permission, which is why the move failed. Either grant the root user write permission on the folder, or delete the folder as the hdfs user to overcome the issue.
If you are happy with the reply, mark it "Accept as Solution".
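The "others" bits being referred to can be read straight off the mode string in the error message; a small sketch:

```shell
# In "drwxr-xr-x", character 1 is the file type and characters 2-10 are
# three triplets: owner (2-4), group (5-7), others (8-10).
mode='drwxr-xr-x'
echo "$mode" | cut -c8-10    # prints r-x: no write bit for others
```

Since root is neither the owner (hdfs) nor in the owning group, it gets exactly those r-x bits, hence the Permission denied on WRITE.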
08-10-2021
10:33 PM
Hi @Nil_kharat
Question: Can anyone tell me how to check how many users are connected to HiveServer2?
Answer: Log in to the server where HiveServer2 is running and run the below command to see the connections to HS2:
# netstat -ntpla | grep 10000
In the above command, 10000 is the HS2 port number.
You can also check this from Cloudera Manager:
CM > Hive_on_Tez > Instances > HiveServer2 > Open Connections
> And along with that, can we see which user is connected?
If you enable the HiveServer2 UI, then under Active Sessions you can see which user connected from which IP. Please check the below link:
https://cwiki.apache.org/confluence/display/Hive/Setting+Up+HiveServer2#SettingUpHiveServer2-WebUIforHiveServer2
If you are happy with the answer, mark it "Accept as Solution".
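To go from raw netstat output toward "who is connected", the foreign-address column can be reduced to distinct client IPs. A hedged sketch over fabricated sample lines (on the HS2 host you would pipe the real `netstat -ntpla` instead):

```shell
# Fabricated sample: three connections from two distinct client IPs.
sample='tcp 0 0 10.0.0.5:10000 10.0.0.21:51512 ESTABLISHED 4242/java
tcp 0 0 10.0.0.5:10000 10.0.0.21:51600 ESTABLISHED 4242/java
tcp 0 0 10.0.0.5:10000 10.0.0.22:40100 ESTABLISHED 4242/java'

# Column 5 is the foreign (client) address; strip the port, de-duplicate.
printf '%s\n' "$sample" | awk '{split($5, a, ":"); print a[1]}' | sort -u
```

This only identifies client hosts, not usernames; mapping connections to users still needs the HiveServer2 UI's Active Sessions view mentioned above.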
07-25-2021
07:21 PM
Hi @michael_boulter
Could you let us know which HDP version you are using? There is no Tez UI from HDP 3 onward, nor in CDP.
Could you check whether Hive is listening on port 10000/10001?
Could you cross-check that you are able to log in to beeline and run some test queries?
07-19-2021
06:36 AM
I can see you are on HDP but you are using the Cloudera driver (https://www.cloudera.com/downloads/connectors/hive/odbc/2-6-4.html). The first thing to do, since you are using Ambari, is to start using the Hortonworks driver; the download link is at https://www.cloudera.com/downloads/hdp.html
How to configure it? Check out the below link:
https://docs.cloudera.com/HDPDocuments/other/connectors/hive-jdbc/2.6.7/Simba-Hive-JDBC-Install-and-Configuration-Guide.pdf
The error that was shared should not prevent connecting to Hive. To figure out the exact reason for the ODBC failure, you need to enable TRACE logging as below.
If the client uses ODBC on an MS Windows machine, set the log level on the driver to TRACE, attempt a connection, and gather the driver logs:
ODBC Driver Configuration > Data Source Administrator > select the DSN > Configure > Logging Options > set Log Level: TRACE
ODBC Driver Configuration > Data Source Administrator > select the DSN > Configure > Logging Options > set Log Path: "C:/" > OK > OK
ODBC Driver Configuration > Data Source Administrator > select the DSN > Configure > Test...
07-11-2021
06:52 PM
Hi @srinivasp I believe you are missing the SELECT and LIST privileges for the table in Ranger for that user. Can you try listing the tables from beeline as the hive user and check whether you see the same problem?