Member since
07-30-2019
117
Posts
6
Kudos Received
4
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2163 | 01-17-2020 01:21 AM |
| | 16746 | 10-24-2019 10:32 PM |
| | 11112 | 10-23-2019 11:39 PM |
| | 11126 | 10-23-2019 04:31 AM |
01-17-2020
01:21 AM
1 Kudo
@kentlee406 The issue seems to be on the YARN side, more specifically with the Resource Manager. We can see the below messages:

20/01/16 13:13:12 INFO client.RMProxy: Connecting to ResourceManager at quickstart.cloudera/10.0.2.15:8032
20/01/16 13:13:15 WARN hdfs.DFSClient: Caught exception
java.lang.InterruptedException
        at java.lang.Object.wait(Native Method)
        at java.lang.Thread.join(Thread.java:1281)
        at java.lang.Thread.join(Thread.java:1355)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.closeResponder(DFSOutputStream.java:967)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.endBlock(DFSOutputStream.java:705)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:894)

It would be great if you can check whether any other YARN applications are running in your cluster. I would also suggest a complete restart of your YARN services.
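To check for other running applications, `yarn application -list` shows what is currently occupying the cluster. The sketch below works off a made-up sample of that output (the application IDs, names, and the temp file path are all hypothetical) so the filtering step is clear; on a live cluster you would pipe the command output directly instead:

```shell
# Hypothetical capture of `yarn application -list` output; on a real cluster run:
#   yarn application -list | grep -c 'RUNNING'
cat > /tmp/yarn_apps.txt <<'EOF'
application_1579000000000_0001  wordcount  MAPREDUCE  cloudera  default  RUNNING
application_1579000000000_0002  sparkpi    SPARK      cloudera  default  ACCEPTED
EOF

# Count applications already RUNNING (and possibly holding the cluster's resources):
grep -c 'RUNNING' /tmp/yarn_apps.txt
```

If other applications are RUNNING and the cluster is small (like the quickstart VM), new submissions can sit waiting for resources, which matches the symptom above.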
10-24-2019
10:32 PM
4 Kudos
Hi @BTKR/Tej, Your subject and your actual question differ a lot; they are two completely different things. If you are concerned about the number of connections going to the Metastore database from the Hive Metastore (HMS) process, you can check it this way:

1. Find the PID of the HMS process on the server using the below command:

ps -ef | grep -i hivemetastore

2. Once you have the PID, get the output of the below command:

lsof -p PID | grep ESTABLISHED

This will list all the connections made to and by the HiveMetastore process. It will also include the connections made TO the HiveMetastore process from clients, i.e. from Hive CLI shells. Look for the database type in the output to identify the connections FROM HiveMetastore to the HMS DB. In my environment, for example, all the entries containing "mysql" are connections made from the HiveMetastore process to the HMS DB.

As per the question in your description, if you want to find out how many threads are trying to connect to the database at any given time, you can collect a jstack of the HMS process and look for threads referencing the mysql calls (mysql is the database type in my case; look for oracle or postgres if either of those is your database type).

Also, I get the feeling you are concerned about the number of connections being made to the database. You can check the below property via the Hive CLI and beeline (this property will not be listed in Ambari, as it is built in):

set datanucleus.connectionPool.maxPoolSize; -- This will give you the connection pool size. 10 is the default value; if it is set to something else, please let me know.

Also, share the output of the below query:

set datanucleus.connectionPoolingType;

Do confirm the exact HDP version you are on!

Please note that a connection pool size of 10 does not mean there will only be 10 connections to the HMS DB; there can be more. But if this value is increased, the number of connections to the HMS DB grows with it. Sometimes it is suggested to increase the connection pool size to accommodate a heavy query load on Hive. So, if you are using your Hive services extensively and the connection pool size is set to a higher value, I would suggest also adjusting the HMS DB side to allow more connections. For example, on MySQL there is max_connections, which you can increase to 1000 or more.

Let me know if the above information was helpful! Thanks, Rohit Rai Malhotra
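Steps 1-2 can be sketched end to end. The `lsof` output below is fabricated for illustration (the hostnames, PID, and temp file path are all made up); on a live server you would substitute the real PID from `ps -ef | grep -i hivemetastore`:

```shell
# Hypothetical sample of `lsof -p <HMS_PID>` output; on a real server run:
#   lsof -p <HMS_PID> | grep ESTABLISHED | grep -c 'mysql'
cat > /tmp/hms_lsof.txt <<'EOF'
java 4321 hive 321u IPv4 0t0 TCP hms.example.com:40001->db.example.com:mysql (ESTABLISHED)
java 4321 hive 322u IPv4 0t0 TCP hms.example.com:40002->db.example.com:mysql (ESTABLISHED)
java 4321 hive 323u IPv4 0t0 TCP hms.example.com:9083->client.example.com:50123 (ESTABLISHED)
EOF

# Count only the connections FROM HMS TO the backing database: the DB-bound
# sockets name the database port ("mysql"), while client connections do not.
grep ESTABLISHED /tmp/hms_lsof.txt | grep -c 'mysql'
```

The third line in the sample is a client connecting to the Metastore's own port (9083), which is why filtering on the database type matters.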
10-23-2019
11:39 PM
Hey @Jesse_s, You can check the below link for the documentation: https://www.cloudera.com/downloads.html Also, please accept the answer if it helped.
10-23-2019
11:34 PM
Hi @soumya, Can you please confirm exactly how the issue was resolved? Please do accept the answer which helped you.
10-23-2019
05:01 AM
Hi @soumya, It seems there is some confusion here. As per my understanding, you are trying to connect to the Spark Thrift Server via beeline. Please correct me if I am wrong.

For you to be able to connect to the Spark Thrift Server via beeline, you need to make sure you are providing the correct hostname and port number in the JDBC URL you are using in beeline. For example:

jdbc:hive2://host:portnumber/

Here, "host" is the hostname of the server where the Spark Thrift Server is running; let us say it is running on abc.soumya.com. The default port number for the Spark Thrift Server is 10000, but this can be configured to something else as well, so you need to find the correct port number. Thus, your connect string would look like below:

jdbc:hive2://abc.soumya.com:10000/

You can refer to the below link for more information on this: https://spark.apache.org/docs/latest/sql-distributed-sql-engine.html
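As a small sketch of the above, assembling the connect string from the host and port makes it easy to double-check before handing it to beeline (abc.soumya.com and 10000 are just the example values from above; substitute your own):

```shell
# Example values from the explanation above; replace with your own cluster's.
STS_HOST="abc.soumya.com"
STS_PORT=10000

# Assemble the JDBC URL beeline expects:
JDBC_URL="jdbc:hive2://${STS_HOST}:${STS_PORT}/"
echo "$JDBC_URL"

# On a host with the Hive client installed you would then connect with:
#   beeline -u "$JDBC_URL"
```

Checking that the host resolves and the port is open (e.g. with `nc -z`) before trying beeline can save a round of confusing JDBC errors.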
10-23-2019
04:31 AM
Hi Jesse, There seems to be some encryption enabled on your end that does not allow the other user to access the ODBC driver connection. This could be a Windows-level configuration or some security parameter. I found the below articles, relevant to similar errors:
https://stackoverflow.com/questions/30886839/key-not-valid-for-use-in-specified-state-how-to-load-profile-of-user-to-imper
https://bytes.com/topic/asp-net/answers/566477-dpapi-decrypt-error-decryption-failed-key-not-valid-use-specified-state
10-23-2019
03:53 AM
It would be great if you could share the exact error you are facing. Also, can you please try creating the table as below:

CREATE EXTERNAL TABLE IF NOT EXISTS tsvtab (
  name string,
  region_code int,
  sal int,
  add string
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE;

load data inpath 'hdfs://hadoophdinsightigi-2019-10-21t07-33-15-078z@hadohdistorage.blob.core.windows.net/user/HadoopPOCDir/data.tsv' into table tsvtab;

OR

CREATE EXTERNAL TABLE IF NOT EXISTS tsvtab (
  name string,
  region_code int,
  sal int,
  add string
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE;

load data inpath '/user/HadoopPOCDir/data.tsv' into table tsvtab;

NOTE: I have changed the scheme from "wasb://" to "hdfs://" in the first command and removed the storage-account details from the path in the second command.