Member since: 01-25-2019
Posts: 75
Kudos Received: 10
Solutions: 13

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1089 | 02-25-2021 02:10 AM |
| | 637 | 02-23-2021 11:31 PM |
| | 1017 | 02-18-2021 10:18 PM |
| | 1439 | 02-11-2021 10:08 PM |
| | 7896 | 02-01-2021 01:47 PM |
08-10-2021
07:34 AM
Hello @Moudma RStudio is an application which, I guess, runs on Windows or macOS. On these operating systems you download the ODBC driver application, where you configure the DSN parameters. Within that DSN configuration there is an Advanced Options section where you will find the properties.
08-10-2021
12:45 AM
Hello @Moudma Open the ODBC application in Windows, click on any one of the DSNs, and then click on Advanced Options... to get the above window. Regards, Tushar
06-16-2021
12:57 AM
Hello @pauljoshiva If the crashed DataNode comes back up, the data replica count will be 4, which is the over-replicated blocks case. HDFS will automatically delete the excess replicas, since the default replication factor of 3 has to be maintained; the replica on the now-active DataNode is the one that will be removed. https://docs.cloudera.com/runtime/7.2.9/hdfs-overview/topics/hdfs-how-namenode-manages-blocks-on-a-failed-datanode.html
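To confirm, you can run fsck and look for the over-replicated count in its summary (run it against the path you care about; / here is just an example):
hdfs fsck / | grep -i 'over-replicated'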
05-20-2021
04:44 AM
It seems a wrong configuration/password is passed in the Ranger configuration, which is therefore unable to open the keystore. Check with the below command whether you are able to list the keystore contents using the password you pass:
$JAVA_HOME/bin/keytool -list -keystore <keystore path with .keystore.jks> -storepass <password>
Ensure the same password is configured in the Ranger configuration.
05-20-2021
04:40 AM
@dmharshit A znode is supposed to be created under ZooKeeper. Check the ZooKeeper logs during the HS2 start to see why HS2 is not able to create the znode.
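As a quick check, you can list the HS2 instances registered in ZooKeeper (this assumes the default hive.server2.zookeeper.namespace of hiveserver2; adjust the path if yours differs):
zookeeper-client -server <zk-host>:2181
ls /hiveserver2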
05-20-2021
04:39 AM
1 Kudo
Try the below:
- When you start HS2, track the logs and look for the first error in HS2.
- In parallel, ensure that your ZooKeeper is up and running, and check the ZK logs during the restart to validate whether connections are coming to ZK and whether ZK is accepting or rejecting them.
See the sketch below for both checks.
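A rough sketch (the log path may vary by setup, and ruok may need to be on the ZooKeeper four-letter-word allowlist):
tail -f /var/log/hive/hiveserver2.log
echo ruok | nc <zk-host> 2181    # a healthy ZooKeeper replies "imok"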
05-14-2021
06:39 AM
Hello @snm1523 Could you please help me with the below details?
Step 1: Connect to any of the edge nodes and run the below:
id -Gn <new username>
kinit <username>
Connect to HS2 using the JDBC connection string, run select current_user();, and then run the query select * from tablename which gives the error.
Step 2: Now from the same edge node, connect to one of the Impala nodes:
id -Gn <new username>
kinit <username>
Connect to Impala using the JDBC connection string:
beeline -u "jdbc:hive2://<impala-coordinator-node>:21050/default;principal=impala/<LB FQDN NAME>@REALM_NAME;ssl=true;sslTrustStore=<truststore path>"
** Use the LB FQDN name in the principal section if you have an LB; otherwise use the coordinator FQDN hostname above. If you have SSL enabled, use sslTrustStore=<truststore path>; otherwise remove it from the connection string.
Then run select current_user(); and run the same query here.
Step 3: SSH to the host running the Hue server. If you have multiple hosts running the Hue server, SSH to them one by one and repeat the same:
id -Gn <username>
kinit <username>
Connect once to HS2 and to Impala via beeline, then run select current_user(); and the query.
Once done, share all the results so we can validate things.
05-13-2021
02:59 PM
Hello @snm1523 First and foremost, ensure the users are synced on all the nodes. For all the newly added users, run id -Gn <username> on all the Impala and HS2 nodes and ensure the output matches across all the nodes.
05-13-2021
02:52 PM
Hello @Charles25 I would like to see the HS2 logs from when the connection is being initiated. We need to understand why HS2 is rejecting connections if the connection from the client is hitting HS2.
05-10-2021
11:26 PM
Issue
Partition discovery settings such as metastore.partition.management.task.frequency and partition.retention.period do not take effect in Hive if metastore.compactor.initiator.on is not turned on (it is off by default on HMS). That property is responsible for activating the Hive Metastore background tasks, which include partition discovery.
Resolution
To ensure partition discovery works as expected, do the following:
1. Go to CM > Hive > Configuration.
2. Search for Hive Metastore Server Advanced Configuration Snippet (Safety Valve) for hive-site.xml.
3. Click + and add the following:
Name: metastore.compactor.initiator.on
Value: true
4. Save the changes and restart the service.
5. Ensure the property discover.partitions is set to true on the tables that need discovery, as shown below.
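For example, enabling discovery on a single table might look like this (the table name is a placeholder):
ALTER TABLE my_table SET TBLPROPERTIES ('discover.partitions' = 'true');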
05-03-2021
10:50 PM
1 Kudo
In Beeline, the command-line options such as sslTrustStore, trustStorePassword, showDbInPrompt, etc. are case sensitive.
For example, below is a working connection string from a test bed:
beeline -u "jdbc:hive2://host-A-fqdn:21051/default;principal=impala/host-A-fqdn@COE.CLOUDERA.COM;ssl=true;sslTrustStore=/opt/cloudera/security/truststore.jks"
In the above example, the common mistakes are principal written as Principal and sslTrustStore written as ssltruststore.
Here, if the case sensitivity is not followed, Beeline silently ignores the options and drops them:
//Sample string
beeline -u "jdbc:hive2://host-A-fqdn:21051/default;Principal=impala/host-A-fqdn@COE.CLOUDERA.COM;ssl=true;ssltruststore=/opt/cloudera/security/truststore.jks"
If you use the above connection string, you will first encounter a Kerberos issue, as the property "principal" will be dropped and the actual Kerberos authentication will fail. If you fix the Kerberos issue, you will then encounter an SSL-related error, as ssltruststore needs to be written as sslTrustStore.
You can find the other command-line options under Beeline Command Options.
04-23-2021
12:04 PM
Hello Team, First, are you able to connect to HS2 from any of the edge nodes? If that connects successfully, could you share that connection string, to ensure we form the right connection string here? Also, could you attach the trace logs here, along with the HS2 logs from the same time?
02-25-2021
09:42 AM
1 Kudo
Hello @marccasajus
Yes, this has been documented internally as a bug (OPSAPS-53043) and is currently not fixed.
Also, it looks like you have already applied the changes which would address this.
02-25-2021
02:10 AM
Hello @SajawalSultan It seems you are running the job as user cloudera_user, and it needs access to the /user/<username> directory to create scratch directories, which it is unable to do because cloudera_user does not have permissions: hdfs:supergroup:drwxr-xr-x /user. Run hdfs dfs -chmod 777 /user as the hdfs user to get proper access to the /user directory. Let me know if this solves your Sqoop import.
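A minimal sketch of the commands, assuming you have sudo access to the hdfs account:
sudo -u hdfs hdfs dfs -chmod 777 /user
hdfs dfs -ls /    # verify the new permissions on /user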
02-25-2021
02:05 AM
Hello @Sample If we didn't have the Hadoop ecosystem, Hive and Impala would not exist in the first place. Say you have Hive on one side (basically the Hadoop ecosystem) and MySQL on the other end: if you want to import data into Hive from MySQL, you will have to use Sqoop to do so, and vice versa. Let me know if the above answers all your questions.
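For illustration, a minimal Sqoop import from MySQL into Hive might look like this (host, database, and table names are placeholders):
sqoop import --connect jdbc:mysql://<mysql-host>/<database> --username <user> -P --table <table> --hive-import --hive-table <hive_table>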
02-25-2021
12:27 AM
Hello @saamurai Thanks for the confirmation. Cheers! Was your question answered? Make sure to mark the answer as the accepted solution. If you find a reply useful, say thanks by clicking on the thumbs up button.
02-24-2021
11:44 PM
Hello @saamurai We have separate drivers for Impala and Hive, and I am not sure why you intend to use the Hive driver for Impala. We do connect to Impala from edge nodes via Beeline, which is JDBC, but the sole purpose is to test whether connectivity works. We do not recommend using Beeline for Impala, as we have impala-shell designed for that. Cloudera recommends using the specific driver, with version compatibility, for each component.
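For reference, a typical impala-shell connection on a Kerberized, SSL-enabled cluster might look like this (the hostname and CA path are placeholders):
impala-shell -i <coordinator-host>:21000 -k --ssl --ca_cert=/path/to/ca.pem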
02-23-2021
11:31 PM
1 Kudo
Hello @Benj1029 You need to go to the below path on the host running the HiveServer2 process: cd /var/log/hive/ and open the hiveserver2.log file. Look at the stack trace just before the shutdown; that should give you some pointers.
02-18-2021
10:18 PM
1 Kudo
Well @ryu, my understanding is that when you are storing Hive data on HDFS, it is best to use managed tables, keeping in mind that CDP now comes with compaction features whereby the small-files issue gets addressed automatically. Compaction will not happen on external tables. One would prefer external tables if the data is stored outside HDFS, for example on S3. This is my understanding, but again it can vary from customer to customer based on their use cases.
02-18-2021
10:38 AM
Hello @ryu There is no single best path, but it is obviously not the /tmp location. You can create a path such as /user/external_tables and create the tables under it. Again, it totally depends on how you design it and on your use case.
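As an illustration (the path and schema are placeholders):
CREATE EXTERNAL TABLE web_logs (id INT, msg STRING)
LOCATION '/user/external_tables/web_logs';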
02-18-2021
10:34 AM
Hello @bb9900m The use case in your scenario calls for a load balancer. You can use an external load balancer and place the Impala coordinators behind it. This way, the load balancer IP (a virtual IP) is exposed to the client, and the balancer distributes the load based on the load-balancing mechanism you set. Attaching the link with a sample configuration for your reference: https://docs.cloudera.com/runtime/7.2.6/impala-manage/topics/impala-load-balancer-configure.html When it comes to SSL, make sure you have the LB hostname added as a SAN name on all the Impala certificates. Let me know if you have issues with any of the points above.
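For instance, a rough haproxy sketch for the JDBC port (hostnames are placeholders; see the linked doc for the full sample configuration):
listen impala-jdbc
    bind :21050
    mode tcp
    balance leastconn
    server coord1 coord1.example.com:21050 check
    server coord2 coord2.example.com:21050 check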
02-18-2021
10:26 AM
Hello @saamurai Could you please share the link where you read that? Meanwhile, you can use the latest drivers below to connect to Hive and Impala respectively on CDP: //For Hive https://docs.cloudera.com/documentation/other/connectors/hive-jdbc/2-6-13/Cloudera-JDBC-Driver-for-Apache-Hive-Install-Guide.pdf //For Impala https://docs.cloudera.com/documentation/other/connectors/impala-jdbc/latest/Cloudera-JDBC-Driver-for-Impala-Install-Guide.pdf You have separate drivers for Hive and Impala. Let me know if the above helps.
02-17-2021
01:20 AM
Hello @uk_travler Compaction will not honour hive.compactor.job.queue. Basically, compaction works differently for full ACID tables and insert-only tables. For full ACID tables, when you perform a manual/auto compaction, two jobs are spawned: an MR job responsible for the compaction itself, which will honour the compaction queue, and a Tez job responsible for stats analysis, which is submitted to the default queue. For insert-only tables, when you perform a manual/auto compaction, one Tez job is spawned and submitted to the default queue. There is a Jira raised which is being worked on; bug details for your reference: HIVE-24781. Let me know if you have any doubts on the above.
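For reference, a manual compaction can be triggered and then tracked like this (the table name is a placeholder):
ALTER TABLE my_table COMPACT 'major';
SHOW COMPACTIONS;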
02-11-2021
10:08 PM
1 Kudo
Hello @ryu The purpose of Ranger is to give users the necessary authorization to access tables/databases. If you allow a certain user access to a particular table/database, that user will be able to perform those actions on it, and an unauthorized user automatically will not be able to remove the table. Say there are two users, test1 and test2. If I allow test1 access to table t1 and test2 access to table t2, test1 will not be able to see table t2, and test2 will not be able to see table t1. You can further add granularity as to which user can perform which actions on a table. This authorization is checked via the Ranger hook present in HiveServer2. Let me know if the above answers your queries.
02-10-2021
09:45 PM
Hello @ryu Well, you can make that user similar to the hive user: the hive user is mapped to the hadoop group, and you can make alterations so a normal user simulates the hive user. But again, as I mentioned earlier, you'll have to spend time managing it and will eventually end up spending even more time troubleshooting when things break. Remember, Hadoop is a complex setup with multiple components talking to each other. 🙂
02-10-2021
09:41 PM
Hello @dwill @Srivatsan CDH 5.7.3 is a very old version and there have been a lot of fixes since then. Coming back to that error: generally, when you see the query in the CREATED state, we need to check where exactly the query is waiting. For example, a query can be in the CREATED state if it is not able to fetch metadata from the catalog server, which it needs before submitting the query. It can also be in the CREATED state if resources are low and the query is queued.
02-09-2021
07:46 AM
Hello @ryu If you run the job as the end user, you will eventually end up managing internal permissions and job-submission permissions yourself. You will also find it difficult to integrate things, in my experience. But if you submit the job and let the hive user take care of the file creation and management in the backend, the admin's life becomes easier, and you will be able to hook/integrate things more cleanly. The above is just a gist; the recommendation is to authenticate as the end user but keep impersonation off and let Hive take care of things in the backend.
02-01-2021
09:06 PM
@anujseeker And the HMS logs?
02-01-2021
01:52 PM
Hello @pphot You can migrate the HMS and HS2 instances to any other hosts: add new hosts for the HS2 and HMS instances and remove the previous ones once the new ones are added and functioning normally. For the backend database, if you migrate it, you have to update the configuration in CM so that HMS has the updated information and access, i.e. which host it needs to communicate with to reach its backend DB. Please note this is mandatory, as Hive has all its information stored in its backend DB. Let me know if the above helps. Regards, Tushar
02-01-2021
01:47 PM
Hello @BhaveshP The error you are seeing is because of a certificate CN/SAN name mismatch with the hostname. Let me try to explain with an example:
client [A] --> Balancer [B] --> NiFi nodes [C, D, E]
Here you have a client A who wants to access NiFi nodes C, D, and E via B. When you create SSL certificates for C, D, and E, you create 3 certificates with 3 different Common Names: C, D, and E respectively. When you connect to NiFi nodes C, D, or E directly from client A, you will not observe the issue. But when you access C, D, or E via the balancer or any proxy B, you are likely to get the error.
WHY? Client A is trying to talk to NiFi node C, D, or E, but to the client the NiFi node is B. During the SSL handshake, the certificate is presented by the NiFi server to A, and the client gets confused because it wanted to talk to B yet the certificate it got is from C. The names don't match.
FIX: Make use of SAN (Subject Alternative Name) entries in the certificates. Issue the certificates for C, D, and E such that each also carries B as a SAN name: the certificate on NiFi node C should carry both C and B, node D both D and B, and node E both E and B. This way, when the client talks to B and receives the certificate from C, it will not get confused, because the certificate will have both names, B and C, present in the SAN.
Let me know if the above gives some clarity as to what exactly is happening.
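To verify the SAN entries on a node's certificate, something like the below should work (hostname and port are placeholders):
openssl s_client -connect <nifi-node>:<port> </dev/null 2>/dev/null | openssl x509 -noout -text | grep -A1 'Subject Alternative Name'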