Member since
12-11-2015
200
Posts
29
Kudos Received
30
Solutions
03-05-2017
03:20 AM
The command to test reverse resolution is described at this link: http://linuxcommando.blogspot.in/2008/07/how-to-do-reverse-dns-lookup.html
03-05-2017
03:15 AM
1 Kudo
First of all, I am sorry for getting back late on this question.
One of the key factors in Kerberos authentication is its reliance on DNS reverse resolution.
Quote from the MIT Kerberos documentation: https://web.mit.edu/kerberos/krb5-1.4/krb5-1.4.4/doc/krb5-admin/Getting-DNS-Information-Correct.html
----
[1]
Getting DNS Information Correct
Several aspects of Kerberos rely on name service. In order for Kerberos to provide its high level of security, it is less forgiving of name service problems than some other parts of your network. It is important that your Domain Name System (DNS) entries and your hosts have the correct information.
----
Let's say the virtual IP name is haproxy.com
and the load balancers are running on the nodes below:
haproxy1.com - 10.0.0.1
haproxy2.com - 10.0.0.2
haproxy3.com - 10.0.0.3
impalad is running on these nodes:
impalad1 - 10.0.0.4
impalad2 - 10.0.0.5
impalad3 - 10.0.0.6
====
# Forward resolution configs for DNS
haproxy1.com IN A 10.0.0.1
haproxy.com IN CNAME haproxy1.com
====
Now haproxy.com resolves to IP 10.0.0.1, but
reverse resolution of IP 10.0.0.1 returns haproxy1.com. This breaks the expectation of Kerberos [1] authentication, so the service ticket request will fail when you run impala-shell -i haproxy.com
[2]
So our aim is to achieve DNS resolution like this.
haproxy.com -> 10.0.0.1
10.0.0.1 -> haproxy.com
We can now alter the reverse resolution of DNS to achieve this
Reverse zone configuration:
====
inverse-query-haproxy1.com IN PTR haproxy.com
10.0.0.1 IN CNAME inverse-query-haproxy1.com
====
With the above set of configs, we achieve the forward and reverse resolution expected in [2].
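A quick way to confirm that forward and reverse resolution agree is a round-trip lookup. The following is a minimal Python sketch, not a definitive tool: the helper name forward_reverse_match and the stub resolvers are hypothetical, and the stubs only mirror the example records from this post. By default the helper would use the standard socket.gethostbyname and socket.gethostbyaddr calls against your real DNS.

```python
import socket


def forward_reverse_match(name, forward=socket.gethostbyname,
                          reverse=socket.gethostbyaddr):
    """Resolve name to an IP, resolve that IP back, and compare the names.

    Returns (ip, reverse_name, matches). The resolvers are injectable so the
    check can be exercised without touching real DNS.
    """
    ip = forward(name)
    # gethostbyaddr returns (hostname, aliaslist, ipaddrlist)
    reverse_name = reverse(ip)[0]
    matches = reverse_name.rstrip(".").lower() == name.rstrip(".").lower()
    return ip, reverse_name, matches


# Stub resolvers mirroring the example in this post (assumed data, no DNS used).
def stub_forward(name):
    return {"haproxy.com": "10.0.0.1"}[name]


def stub_reverse_before(ip):
    # Before the reverse zone change: the PTR answer is the node's own name.
    return ("haproxy1.com", [], [ip])


def stub_reverse_after(ip):
    # After the reverse zone change: the PTR answer is the VIP name.
    return ("haproxy.com", [], [ip])


print(forward_reverse_match("haproxy.com", stub_forward, stub_reverse_before))
print(forward_reverse_match("haproxy.com", stub_forward, stub_reverse_after))
```

Calling forward_reverse_match("haproxy.com") with no extra arguments runs the same check against whatever resolver the client host is configured with, which is the view the Kerberos client will have.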
Caution Note:
If you run CM agents on one of the proxy machines, i.e. it is part of the cluster, its identity will have to change permanently to the VIP name, because reverse DNS will now never show the original hostname. This could cause issues for other services unless listening_hostname is configured to use the VIP name. Ideally, to avoid this, the haproxy machine should not be added to the cluster in the CM hosts control; it should be a standalone box.
02-19-2017
03:13 AM
@bushnoh This is normal. Once you set up a load balancer in front of impalad, each impalad exposes itself to external clients through the service principal name (SPN) of the load balancer.
If you check the varz page of an individual impalad, you will notice the following parameters:
https://<impalad-hostname>:25000/varz
principal ==> LB SPN
be_principal ==> IMPALAD SPN
This shows that impalad expects the LB's SPN for client communication, whereas for internal communication (between impalads) it uses its own SPN. be_principal is the backend principal for internal communication. Hence clients are required to contact impalad with the LB's SPN.
01-19-2017
09:19 AM
You're welcome!
01-19-2017
09:05 AM
1 Kudo
@ski309 Was the action below performed after moving the NameNode?
https://www.cloudera.com/documentation/enterprise/5-7-x/topics/admin_nn_migrate_roles.html#concept_ff5_tdg_ts
The Hive Metastore database maintains the locations of tables and databases, so after moving the NameNode it is necessary to perform the above step to update the locations in HMS.
01-19-2017
08:57 AM
1. Can you check the value of "fs.defaultFS" in the core-site.xml file in the impalad process directory?
a. The impalad process directory is /var/run/cloudera-scm-agent/process/<num>-impala-IMPALAD
Replace <num> with the latest number under the process directory.
Then you can run
grep -Rn 8020 * -b1
Please let me know if the hostname in the value tag matches the current NameNode.
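As an alternative to grep, the value can be read directly from the XML. This is a minimal Python sketch under stated assumptions: it relies on the standard Hadoop configuration layout (property elements with name and value children), the helper names get_hadoop_property and namenode_host are hypothetical, and the sample hostname namenode.example.com is a placeholder, not a value from this thread.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse


def get_hadoop_property(xml_text, name):
    """Return the value of a property in a Hadoop-style config, or None."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if (prop.findtext("name") or "").strip() == name:
            return (prop.findtext("value") or "").strip()
    return None


def namenode_host(xml_text):
    """Extract the hostname part of fs.defaultFS, or None if it is not set."""
    value = get_hadoop_property(xml_text, "fs.defaultFS")
    return urlparse(value).hostname if value else None


# Placeholder content for illustration only (hypothetical hostname).
SAMPLE = """<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode.example.com:8020</value>
  </property>
</configuration>"""

print(get_hadoop_property(SAMPLE, "fs.defaultFS"))
print(namenode_host(SAMPLE))
```

To check a real file, read core-site.xml from the process directory mentioned above and pass its text to namenode_host, then compare the result with the current NameNode hostname.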
12-05-2016
10:46 AM
Compute stats is an expensive operation.
For a table with the definition below:
CREATE TABLE default.test123 ( a INT, b INT ) PARTITIONED BY ( c STRING );
when you run compute stats, it spawns child queries such as the following:
SELECT NDV(a) AS a, CAST(-1 as BIGINT), 4, CAST(4 as DOUBLE),
NDV(b) AS b, CAST(-1 as BIGINT), 4, CAST(4 as DOUBLE),
NDV(c) AS c, CAST(-1 as BIGINT), MAX(length(c)), AVG(length(c)),
NDV(d) AS d, CAST(-1 as BIGINT), 8, CAST(8 as DOUBLE),
NDV(e) AS e, CAST(-1 as BIGINT), 8, CAST(8 as DOUBLE)
FROM default.mytable
That is, compute stats computes the number of distinct values (NDV) in each column, plus the maximum and average length of the values in string columns. On a big table this is an expensive operation that takes considerable time and resources.
The profile of a compute stats query contains the section below, which shows the time taken for "Child queries" in nanoseconds:
Start execution: 0
Planning finished: 1999998
Child queries finished: 550999506
Metastore update finished: 847999239
Rows available: 847999239
Profile Collection:
==================
a. Go to Impala > Queries
b. Identify the query you are interested in and, from the dropdown on the right, select "Query Details"
c. On the resulting page, select "Download Profile"
1/ To understand the CPU utilisation that you highlighted here, could you please provide the profile of the insert query?
2/ How did you confirm that Impala is causing 100% CPU utilisation? Did you run top and notice the impalad process taking all the CPU?
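To read the timeline above, it helps to turn the event values into per-phase durations. The following is a small Python sketch, assuming each value is the elapsed time since the start of the query (cumulative), so a phase's duration is the difference from the previous event. The helper name phase_durations is hypothetical; the numbers are the ones quoted in this post.

```python
# Timeline events as quoted above: (event, elapsed nanoseconds since start).
TIMELINE = [
    ("Start execution", 0),
    ("Planning finished", 1999998),
    ("Child queries finished", 550999506),
    ("Metastore update finished", 847999239),
    ("Rows available", 847999239),
]


def phase_durations(timeline):
    """Convert cumulative event times into (event, duration_ns) pairs."""
    durations = []
    previous = 0
    for event, elapsed_ns in timeline:
        durations.append((event, elapsed_ns - previous))
        previous = elapsed_ns
    return durations


for event, ns in phase_durations(TIMELINE):
    # Print nanoseconds alongside milliseconds for readability.
    print(f"{event}: {ns} ns ({ns / 1e6:.1f} ms)")
```

Under that assumption, the child queries account for 548999508 ns (about 549 ms) and the metastore update for 296999733 ns (about 297 ms) of this example.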
10-25-2016
06:58 AM
You are right. The Hive CLI won't reach the metastore if it doesn't have the proper configuration files. Did the node on which you ran the Hive CLI have a Hive gateway role assigned to it? If not, the table that you created would have been created in the Derby database (the embedded DB of HMS), which in turn makes the changes invisible to Impala, because the Impala catalog always fetches metadata from the remote metastore. This also explains the other behaviour that you noticed, i.e. a database created on one node was not reflected in the Hive CLI opened on another node, because the metadata was being added to the embedded DB.
10-25-2016
04:31 AM
1 Kudo
Do you have Sentry enabled in the cluster? Do you see the database when you run the show databases query in beeline as the same user? With the Hive CLI, the results of the show databases query are not filtered based on user privileges. With beeline and impala-shell, however, the user's privileges are evaluated before the result is rendered. If the user doesn't have read privilege on the database, it won't be listed in the result.
03-30-2016
12:09 PM
Hi, Encryption at rest protects your data from an unauthorized user who has no read permission in HDFS, or has no access to the cluster, and is trying to read it directly from the disk. In your example, the directory /tmp/user1zone1 grants read access to all cluster users, and hence user2 is allowed to read from it.
drwxr-xr-x - user1 supergroup 0 2016-02-10 02:42 /tmp/user1zone1
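The permission string in that listing can be decoded mechanically. Below is a minimal Python sketch, with hypothetical helper names decode_mode and others_can_read, that splits a POSIX-style mode string into owner, group, and other permissions. It only interprets the string shown in the listing and does not query HDFS.

```python
def decode_mode(mode):
    """Split a mode string such as 'drwxr-xr-x' into its permission triples."""
    mode = mode[:10]  # ignore any trailing marker after the 10 mode characters
    if len(mode) != 10:
        raise ValueError("expected a 10-character mode string")
    return {
        "type": "directory" if mode[0] == "d" else "file",
        "owner": mode[1:4],
        "group": mode[4:7],
        "other": mode[7:10],
    }


def others_can_read(mode):
    """True if users who are neither the owner nor in the group may read."""
    return decode_mode(mode)["other"][0] == "r"


print(decode_mode("drwxr-xr-x"))
print(others_can_read("drwxr-xr-x"))
```

For the directory above, the "other" triple is r-x, which is why user2 can list and read it even though user1 is the owner.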