Member since: 11-11-2019
Posts: 610
Kudos Received: 33
Solutions: 25
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 1490 | 02-28-2023 09:32 PM
 | 2428 | 02-27-2023 03:33 AM
 | 25054 | 12-24-2022 05:56 AM
 | 2006 | 12-05-2022 06:17 AM
 | 5217 | 11-25-2022 07:37 AM
09-05-2024
04:27 AM
1 Kudo
Hi @denysobukhov, is your cluster SSL- and LDAP-enabled? Are you able to connect from Beeline? Please review https://community.cloudera.com/t5/Community-Articles/How-to-Connect-to-Hiveserver2-Using-Cloudera-JDBC-driver/ta-p/376336 and adapt the connection string to your setup.
08-19-2024
06:43 AM
Hi @APentyala

1. Data Modeling Design: Which model is best suited for a Lakehouse implementation, star schema or snowflake schema?
Ans: We do not provide such designs, and we are not aware of any.

2. We are using CDP (Private) and need to implement updates and deletes (SCD Type 1 & 2). Are there any limitations with Hive external tables?
Ans: There are no limitations for EXTERNAL tables. Are you using HDFS or Isilon for storage? A sketch of a MERGE-based upsert follows below.

3. Are there any pre-built dimension models or ER models available for reference?
Ans: We don't have anything as such.
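For the SCD Type 1 part of question 2, here is a minimal HiveQL sketch of a MERGE-based upsert. The table and column names (dim_customer, stg_customer, customer_id, name, city) are hypothetical, and note that Hive's MERGE statement requires the target to be a transactional (ACID) table:

```sql
-- SCD Type 1: overwrite changed attributes in place, insert new keys.
-- All names are hypothetical; the target must be a transactional (ACID) table.
MERGE INTO dim_customer AS d
USING stg_customer AS s
ON d.customer_id = s.customer_id
WHEN MATCHED THEN UPDATE SET name = s.name, city = s.city
WHEN NOT MATCHED THEN INSERT VALUES (s.customer_id, s.name, s.city);
```

For SCD Type 2 you would instead close out the matched row (for example, by updating an end-date column) and insert the new version as a separate row.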
08-14-2024
09:08 PM
2 Kudos
@APentyala Please find the answers below:

1. Which data modeling approach is recommended for this domain?
Ans: For large data volumes, we recommend partitioning or multi-level partitioning. You can add bucketing if the data inside a partition is still large.

2. Are there any sample models available for reference?
Ans: For partitioning versus bucketing, see https://www.linkedin.com/pulse/what-partitioning-vs-bucketing-apache-hive-shrivastava/. You can build a new table from an existing one by performing a CTAS with dynamic partitioning; see https://www.geeksforgeeks.org/overview-of-dynamic-partition-in-hive/ and the sketch after this list.

3. What best practices should we follow to ensure data integrity and performance?
Ans: Please follow these best practices:
a. Partition and bucket the data.
b. Use Iceberg tables, which significantly reduce the load on the Metastore, if you are on CDP Public Cloud or CDP Private Cloud (ECS/OpenShift).
c. Use ORC/Parquet file formats.
d. Use EXTERNAL tables if you don't perform UPDATE/DELETE, as reading an external table is faster.

4. How can we efficiently manage large-scale data ingestion and processing?
Ans: A typical pipeline is Kafka/Spark Streaming for ingestion, Spark for data modelling, and Hive for warehousing, where you query the data. Please be specific about the use case.

5. Are there any specific challenges or pitfalls we should be aware of when implementing a lakehouse in this sector?
Ans: There should be no particular challenges; we would request more detail on this point.
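A minimal HiveQL sketch of the dynamic-partitioning suggestion in point 2, written here as a CREATE TABLE plus a dynamic-partition INSERT ... SELECT (which works even where CTAS into a partitioned table is not supported); all table and column names (sales_raw, sales_part, region, ...) are hypothetical:

```sql
-- Let Hive derive every partition from the data rather than a static spec.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- Hypothetical partitioned + bucketed target table in ORC.
CREATE TABLE sales_part (
  order_id    BIGINT,
  customer_id BIGINT,
  amount      DECIMAL(10,2)
)
PARTITIONED BY (region STRING)
CLUSTERED BY (customer_id) INTO 16 BUCKETS
STORED AS ORC;

-- Dynamic-partition load: the partition column must come last in the SELECT;
-- Hive creates one partition per distinct region value.
INSERT OVERWRITE TABLE sales_part PARTITION (region)
SELECT order_id, customer_id, amount, region
FROM sales_raw;
```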
07-30-2024
08:19 AM
@Maicat You cannot typecast an array to a string. There are two approaches you can use:

1. Select the nth element of the array:
SELECT level5[0] AS first_genre FROM my_table;
where 0 is the first element.

2. Flatten it, producing one row per element:
SELECT genre FROM my_table LATERAL VIEW explode(level5) genre_table AS genre;

A self-contained sketch of both is below.
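The sketch, with a hypothetical table and data (level5 is an array&lt;string&gt; column):

```sql
-- Hypothetical table: one array<string> column.
CREATE TABLE my_table (title STRING, level5 ARRAY<STRING>);
-- Hive's INSERT ... VALUES does not accept complex types, so populate via SELECT.
INSERT INTO my_table SELECT 'Alien', array('horror', 'sci-fi');

-- 1. Index into the array: returns 'horror'.
SELECT level5[0] AS first_genre FROM my_table;

-- 2. Flatten: returns one row per element ('horror' and 'sci-fi').
SELECT genre
FROM my_table
LATERAL VIEW explode(level5) genre_table AS genre;
```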
02-23-2024
07:48 AM
If you deactivate and reactivate with "Public Load Balancer" and "Public Executor", this should work.
02-12-2024
11:57 PM
1 Kudo
To use an internal load balancer for Cloudera Data Warehouse (CDW), you must select the option to enable an internal load balancer while activating the Azure environment from the CDW UI. Otherwise, CDW uses the Standard public load balancer that is enabled by default when you provision an AKS cluster. Before activating, you should remove the existing load balancer. See https://docs.cloudera.com/data-warehouse/cloud/azure-environments/topics/dw-azure-enable-internal-aks-lb.html
02-02-2024
12:01 AM
1 Kudo
Can you deactivate the environment, remove the load balancer, then activate and check? I want to isolate whether the issue is with the load balancer or with Hue itself.
01-30-2024
01:33 AM
1 Kudo
I also see there is an open Jira, DWX-135, which is not resolved yet. We need to wait until it is fixed.
01-30-2024
01:28 AM
1 Kudo
@andym Did you follow the steps at https://docs.cloudera.com/data-warehouse/cloud/azure-environments/topics/dw-azure-enable-internal-aks-lb.html ?
09-27-2023
06:56 AM
2 Kudos
If you need to connect to HiveServer2 from third-party tools like DBeaver or Power BI, or from any Java client, you can use the Cloudera JDBC driver. The driver can be downloaded from Hive JDBC Connector 2.6.21 for Cloudera Enterprise. We will cover the SSL, Zookeeper, Kerberos, LDAP, and load-balancer scenarios for connecting to HiveServer2 using this driver. Please note that the Beeline URL format is not the same as the JDBC driver's. The list of properties for the JDBC driver is given in Cloudera JDBC Driver 2.6.21 for Apache Hive. You need to use the com.cloudera.hive.jdbc.HS2Driver class to connect to HiveServer2.

Let's assume:
Host: c2345.node.cloudera.com
Kerberos Realm: EXAMPLE.COM

Plain connection:
jdbc:hive2://c2345.node.cloudera.com:10000/default;LogLevel=6;LogPath=/tmp
where LogLevel=6 enables a more verbose level of logging.

Kerberos+SSL+binary:
jdbc:hive2://c2345.node.cloudera.com:10000/default;SSL=1;SSLTrustStore=/home/keystore-cdp/cm-auto-global_truststore.jks;SSLTrustStorePwd=xxxxxxxxxx;LogLevel=6;LogPath=/tmp/logs;KrbRealm=EXAMPLE.COM;KrbHostFQDN=c2345.node.cloudera.com;KrbServiceName=hive;AuthMech=1
where AuthMech=1 selects Kerberos authentication.
If you import the root certificate of HiveServer2 into CACERTS in the JDK, you do not need to specify SSLTrustStore and SSLTrustStorePwd; the driver takes the truststore and password from CACERTS:
jdbc:hive2://c2345.node.cloudera.com:10000/default;SSL=1;LogLevel=6;LogPath=/tmp/logs;KrbRealm=EXAMPLE.COM;KrbHostFQDN=c2345.node.cloudera.com;KrbServiceName=hive;AuthMech=1

LDAP+SSL+binary:
jdbc:hive2://c2345.node.cloudera.com:10000/default;SSL=1;SSLTrustStore=/home/keystore-cdp/cm-auto-global_truststore.jks;SSLTrustStorePwd=xxxx;LogLevel=6;LogPath=/tmp/logs;AuthMech=3;UID=test1;PWD=Password1
where AuthMech=3 selects LDAP authentication, and UID and PWD are the credentials of a user present in the LDAP.

LDAP+SSL+HTTP:
jdbc:hive2://c2345.node.cloudera.com:10001/default;SSL=1;SSLTrustStore=/home/keystore-cdp/cm-auto-global_truststore.jks;SSLTrustStorePwd=xxxx;LogLevel=6;LogPath=/tmp/logs;AuthMech=3;UID=test1;PWD=Password1;transportMode=http;httpPath=cliservice
The port has changed from 10000 to 10001, and transportMode and httpPath are added.

Kerberos+SSL+HTTP:
jdbc:hive2://c2345.node.cloudera.com:10001/default;SSL=1;SSLTrustStore=/home/keystore-cdp/cm-auto-global_truststore.jks;SSLTrustStorePwd=xxxx;LogLevel=6;LogPath=/tmp/logs;AuthMech=1;KrbRealm=EXAMPLE.COM;KrbHostFQDN=c2345.node.cloudera.com;KrbServiceName=hive;transportMode=http;httpPath=cliservice
The port has changed from 10000 to 10001, and transportMode and httpPath are added.

Zookeeper+SSL+LDAP:
You can use Zookeeper to connect to HiveServer2 for high availability (HA):
jdbc:hive2://zk=c2345.node2.cloudera.com:2181/hiveserver2,c2345.node3.cloudera.com:2181/hiveserver2,c2345.node4.cloudera.com:2181/hiveserver2;SSL=1;SSLTrustStore=/home/keystore-cdp/cm-auto-global_truststore.jks;SSLTrustStorePwd=xxxx;LogLevel=6;LogPath=/tmp/logs;AuthMech=3;UID=test1;PWD=Password1

Zookeeper+SSL+Kerberos:
jdbc:hive2://zk=c2345.node2.cloudera.com:2181/hiveserver2,c2345.node3.cloudera.com:2181/hiveserver2,c2345.node4.cloudera.com:2181/hiveserver2;SSL=1;SSLTrustStore=/home/keystore-cdp/cm-auto-global_truststore.jks;SSLTrustStorePwd=xxx;LogLevel=6;LogPath=/tmp/logs;AuthMech=1;KrbRealm=EXAMPLE.COM;KrbHostFQDN=_HOST;KrbServiceName=hive
KrbHostFQDN=_HOST is used so the driver can connect to any HiveServer2 host; _HOST is replaced internally by the exact hostname it connects to.

HA-Proxy+SSL+Kerberos:
Configure HA for HiveServer2 as described in Configuring the HiveServer load balancer, then connect using the URL below:
jdbc:hive2://ha-proxy-host.com:11000/default;SSL=1;SSLTrustStore=/home/keystore-cdp/cm-auto-global_truststore.jks;SSLTrustStorePwd=xxx;LogLevel=6;LogPath=/tmp/logs;AuthMech=1;KrbRealm=EXAMPLE.COM;KrbHostFQDN=_HOST;KrbServiceName=hive
Again, KrbHostFQDN=_HOST is replaced internally by the exact HiveServer2 hostname the driver connects to.

HA-Proxy+SSL+LDAP:
jdbc:hive2://ha-proxy-host.com:11000/default;SSL=1;SSLTrustStore=/home/keystore-cdp/cm-auto-global_truststore.jks;SSLTrustStorePwd=xxx;LogLevel=6;LogPath=/tmp/logs;AuthMech=3;UID=test1;PWD=Password1