Member since 07-31-2019 | 14 Posts | 1 Kudos Received | 0 Solutions
05-19-2022
09:05 AM
Hi Community, We are working on a new requirement where I need to disable table-level TTL and use cell-level TTL instead for one of our HBase tables. I can disable the table-level TTL by altering the table property and set a cell-level TTL on new incoming data. However, I can't find anything on how to set a cell-level TTL on the data / rows that already exist in the table. Any suggestions would be very helpful. Thank you, Snehasish
01-14-2022
03:00 AM
Hi, For me, changing the load balancer's listener protocol and the target group's protocol to TCP did the trick. If that does not work for you, could you please share some more details about your setup so that the community can help?
03-12-2021
02:29 AM
1 Kudo
Hi Community, I'm having a hard time understanding the difference between Impala's 'refresh' command and the 'compute stats' command. I understand that 'refresh' reloads the metadata of a database / table and 'compute stats' calculates the volume of data and its distribution, but my confusion is: isn't this recalculation already done as part of 'refresh'? My understanding might be completely wrong, hence I'm reaching out to the SMEs. Can anyone please help explain when to use 'refresh' and when to use 'compute stats'? Thank you in advance, Snehasish
Labels: Apache Impala
01-08-2020
07:00 AM
Hi @vaishaakb, Thank you for the reply. I understand that dropping a table / database in the cluster does not replicate the drop to the cloud backup. We have a use case where each month our internal customers create some tables / databases, work on them for a few days, and then drop them once they are done. As a result, the S3 bucket holds many abandoned databases, and their number is growing day by day. It would be really helpful if you could advise a way to keep the S3 bucket in sync with Hive. Thank you, Snehasish
12-30-2019
02:05 AM
Hello Community, I want to decommission an entire cluster, so I looked for documentation describing a recommended procedure but couldn't find any. Hence I am turning to the community: has anyone here decommissioned a kerberized and Sentry-enabled cluster, and can you advise me on any best practices to follow? I think it is worth mentioning that our cluster was created using Cloudera Altus Director. Thank you, Snehasish
11-14-2019
09:24 AM
Hi Community, In my cluster, I'm using Hive Replication to S3 to back up databases on a daily basis. I was referring to the documentation and couldn't find anything on whether a database / table / file in a user's personal directory is also deleted from S3 when it is dropped from the cluster. From the documentation: "If you configure replication of a Hive table and then later drop that table, the table remains on the destination cluster. The table is not dropped when subsequent replications occur." Can anyone please confirm whether the above point also applies when replicating to S3? What approach is taken / recommended to keep the cluster and the backup on S3 in sync? Thank you, Snehasish
10-03-2019
06:36 AM
Hi Community,
We are using an AWS Network Load Balancer to balance traffic across 6 Impala daemons.
Recently we started facing an issue where 2 of the Impala daemons stop receiving queries and connections over port 21050 simply hang.
On further investigation we found roughly 5k connections in the CLOSE_WAIT state between the LB and the Impala daemon:
tcp 1 0 10.XXX.XXX.68:21050 10.XXX.XXX.80:64135 CLOSE_WAIT 11084/impalad
tcp 1 0 10.XXX.XXX.68:21050 10.XXX.XXX.80:58169 CLOSE_WAIT 11084/impalad
tcp 1 0 10.XXX.XXX.68:21050 10.XXX.XXX.80:64652 CLOSE_WAIT 11084/impalad
tcp 1 0 10.XXX.XXX.68:21050 10.XXX.XXX.80:62075 CLOSE_WAIT 11084/impalad
tcp 1 0 10.XXX.XXX.68:21050 10.XXX.XXX.80:52393 CLOSE_WAIT 11084/impalad
tcp 1 0 10.XXX.XXX.68:21050 10.XXX.XXX.80:41447 CLOSE_WAIT 11084/impalad
tcp 1 0 10.XXX.XXX.68:21050 10.XXX.XXX.80:47034 CLOSE_WAIT 11084/impalad
tcp 1 0 10.XXX.XXX.68:21050 10.XXX.XXX.80:49452 CLOSE_WAIT 11084/impalad
tcp 1 0 10.XXX.XXX.68:21050 10.XXX.XXX.80:28327 CLOSE_WAIT 11084/impalad
tcp 1 0 10.XXX.XXX.68:21050 10.XXX.XXX.80:52498 CLOSE_WAIT 11084/impalad
tcp 1 0 10.XXX.XXX.68:21050 10.XXX.XXX.80:21168 CLOSE_WAIT 11084/impalad
tcp 1 0 10.XXX.XXX.68:21050 10.XXX.XXX.80:40079 CLOSE_WAIT 11084/impalad
tcp 1 0 10.XXX.XXX.68:21050 10.XXX.XXX.80:35664 CLOSE_WAIT 11084/impalad
tcp 1 0 10.XXX.XXX.68:21050 10.XXX.XXX.80:4191 CLOSE_WAIT 11084/impalad
tcp 1 0 10.XXX.XXX.68:21050 10.XXX.XXX.80:14935 CLOSE_WAIT 11084/impalad
tcp 1 0 10.XXX.XXX.68:21050 10.XXX.XXX.80:63036 CLOSE_WAIT 11084/impalad
tcp 1 0 10.XXX.XXX.68:21050 10.XXX.XXX.80:5158 CLOSE_WAIT 11084/impalad
tcp 1 0 10.XXX.XXX.68:21050 10.XXX.XXX.80:60134 CLOSE_WAIT 11084/impalad
Every time I restart the daemons, the CLOSE_WAIT connections disappear and new connections are established successfully, but after a few minutes the CLOSE_WAIT connections pile up and choke the connection again.
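To quantify the pile-up, the netstat lines above can be tallied per remote address. The following is a minimal Python sketch and not part of the original troubleshooting: the function name count_close_wait and the sample addresses (10.0.0.x) are hypothetical, and it assumes the seven-column layout shown above (proto, recv-q, send-q, local, remote, state, pid/program).

```python
from collections import Counter


def count_close_wait(netstat_output: str, port: int = 21050) -> Counter:
    """Count CLOSE_WAIT sockets on the given local port, grouped by remote IP."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed layout: proto recv-q send-q local remote state pid/program
        if len(fields) < 6 or fields[5] != "CLOSE_WAIT":
            continue
        local, remote = fields[3], fields[4]
        # Keep only sockets whose local port matches the Impala port.
        if local.rsplit(":", 1)[-1] != str(port):
            continue
        counts[remote.rsplit(":", 1)[0]] += 1
    return counts


# Hypothetical sample input; real input would come from a netstat capture.
sample = """\
tcp 1 0 10.0.0.68:21050 10.0.0.80:64135 CLOSE_WAIT 11084/impalad
tcp 1 0 10.0.0.68:21050 10.0.0.80:58169 CLOSE_WAIT 11084/impalad
tcp 0 0 10.0.0.68:21050 10.0.0.81:40000 ESTABLISHED 11084/impalad
"""

if __name__ == "__main__":
    for remote_ip, n in count_close_wait(sample).most_common():
        print(remote_ip, n)
```

Running this periodically against fresh netstat output would show whether the count keeps growing for the load balancer's address after a daemon restart.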
Our cluster is Kerberized and TLS-SSL enabled. The NLB is internal and there is only one user using it through JDBC driver.
I have been stuck on this issue for over a week now; any suggestions would be very helpful.
Thank you.
Labels: Apache Impala
09-13-2019
03:54 AM
Hi @EricL, Thank you for your reply. We use the Client/Server SSL encryption method with a CA-signed certificate. Best, Snehasish
09-11-2019
11:07 AM
Hi Community,
I'm working on adding a load balancer in front of my Impala daemons.
I will be using AWS ELB.
As this is my first time, I have a few questions:
1. When I create the ELB, which SSL certificate should I provide: the root CA certificate, or one of the certificates that Cloudera Manager creates while configuring AUTO_TLS, for example cm-auto-global_cacerts.pem? (My cluster is already kerberized and TLS/SSL is enabled.)
2. Once the ELB is created, I will have to add its information to the Impala service settings in CM and regenerate the Kerberos credentials. Will I be able to revert if something goes wrong in the process? That is, if I remove the added configuration from the Impala service settings in CM and restart, will it safely go back to working as before?
Thank you in advance.
Labels: Apache Impala, Cloudera Manager
08-22-2019
05:10 AM
Hello Community,
I have a CDH 5.14 cluster hosted on EC2 machines, which is kerberized with Active Directory and uses Sentry for database authorization.
I want to use SSSD to secure my Linux hosts (RHEL 7.x) with Active Directory.
I have been going through this post, but a few questions are keeping me from proceeding:
1. There are service users (Hive, YARN, etc.) in AD that were created during the Kerberization of my cluster. If I go ahead and implement SSSD, will these pre-existing service users still be able to communicate?
2. If something goes wrong, will I be able to roll back? If yes, how?
Labels: Apache Hadoop
08-08-2019
01:28 AM
lp.security.ldapConfig.activeDirectory.roleMapping.DirectorAdminGroupCn: <ADMIN_GROUP_CN>
As per the documentation and my understanding, the proper syntax would be:
lp.security.ldapConfig.activeDirectory.roleMapping.<Active_Directory_Group_CN>: <ADMIN> / <READONLY>
Please correct me if I'm wrong. Thank you.
08-07-2019
03:07 AM
Scenario: I recently integrated Altus Director with Active Directory for role-based authentication and authorisation. After the implementation I noticed that the default admin credential (admin/admin) no longer worked, which was expected.
My questions are:
1. Is it possible/recommended to create another 'admin' user in Altus Director (or in Active Directory) as a master credential just for backup?
2. Does Altus Director have an Authentication Backend Order (e.g. Database then External) like we have in Cloudera Manager?
3. If a user is present in the admin group as well as the readonly group, which role does Altus Director assume for that user?
Thank you.
Labels: Cloudera Manager
07-31-2019
05:50 AM
Scenario:
I submit a heavy query in Impala through Hue that is expected to take around 2-3 hours.
If my Hue session times out due to inactivity, will my query keep running in the background, or will it be cancelled?
Hue's idle session timeout has been set to 20 minutes, and all other configurations are left at their default values.
Labels: Apache Impala, Cloudera Hue