Member since: 12-26-2016
Posts: 15
Kudos Received: 1
Solutions: 1
My Accepted Solutions
Title | Views | Posted
--- | --- | ---
 | 2926 | 08-19-2018 03:43 AM
08-20-2018 02:57 PM
1 Kudo
@rabbit, the path is already defined in Ambari (HDFS -> Configs -> DataNode directories). This is the path you defined for HDFS to write block data to the actual disk location on each data node. In your case, this path must have been defined in Ambari as /data/hadoop-data/dn/ - under this, HDFS creates the remaining folders, starting with "current". Please check your Ambari -> HDFS properties and confirm. I hope this helps you.
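If it helps, here is a quick way to confirm what the DataNodes are actually using; this is only a sketch, and the /data/hadoop-data/dn path is taken from your case:
# Print the configured DataNode data directories (dfs.datanode.data.dir)
$ hdfs getconf -confKey dfs.datanode.data.dir
# On a data node, HDFS lays out its block-pool folders under each configured directory
$ ls /data/hadoop-data/dn/current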
08-19-2018 01:48 PM
Hi, please double-check the HiveServer2 port: if you want to connect to HS2 in HTTP mode, the default port is 10001. Also make sure you are connecting to the right HS2 server, i.e. one that is actually running in HTTP mode.
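For reference, a typical Beeline connection string for HTTP transport mode looks roughly like the following; the host name is a placeholder and the httpPath may differ in your setup:
# Connect to HiveServer2 over HTTP transport (default port 10001, default httpPath "cliservice")
$ beeline -u 'jdbc:hive2://<hs2-host>:10001/default;transportMode=http;httpPath=cliservice'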
08-19-2018 03:43 AM
One is a large environment, 20+ PB in size, and its data is completely different from the other environment's data. The reasons for separate lakes are that they belong to different internal departments, the data is different, and the customers are also different. Depending on the data, these clusters' servers are located in different data centers; one is open to the company-wide enterprise network, and the others are open only to an internal network within the enterprise network.
08-18-2018 10:01 PM
No. Once you bring the node back with a new drive, the NameNode sees the new working drive and will start allocating new blocks to it. In none of the above operations do I see any reason for data on other drives or on other nodes to be corrupted or deleted. This happens all the time in any production environment.
08-18-2018 06:05 PM
Hi, not sure if my answer helps you or not, but I can give you some details. We built an enterprise data lake using HDP 2.x. How many data lakes (environments) you want to build depends on the data and the requirements. At my workplace we have multiple production environments with different kinds of data, and we enabled distcp between a couple of environments to get some data feeds from the others, but the end users and requirements are clearly different for these environments. Another difference is the variety of end users and data, and the multiple ways they can access these environments (some want the data in NRT (near real time), and some users can wait for the results). So we provided multiple ways to access and get data from our data lake, and end users chose the way that best meets their requirements. Hope this helps.
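As an illustration of the distcp feed between environments, a minimal command looks something like this; the cluster hosts and paths are placeholders, not our actual setup:
# Copy a dataset from the source lake to the destination lake over HDFS (placeholder hosts/paths)
$ hadoop distcp hdfs://source-nn.example.com:8020/data/feed hdfs://dest-nn.example.com:8020/landing/feed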
08-18-2018 05:20 PM
Hello, in our environment (in fact in any environment) this is very common; we face these disk failure issues all the time, and we usually do not shut down that node. Instead, we note down the node details and replace the faulty drive at a later point in time. When a drive fails, the NameNode identifies the faulty drive and the missing blocks, and it takes care of those missing blocks by re-replicating them from another datanode/drive. In my opinion, you don't really have to worry about a faulty disk on a datanode; you can bring the node back and integrate it with the cluster, so that the cluster can use the remaining good disks on that node. Thanks,
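If you want to double-check that HDFS has taken care of the affected blocks, a couple of standard checks (just a sketch, run as the hdfs user):
# Cluster-wide summary, including missing and under-replicated block counts
$ hdfs fsck / | tail -n 30
# Per-datanode capacity and failed-volume information
$ hdfs dfsadmin -report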
05-12-2017 09:01 PM
dvillarreal, thank you for the response. We actually found the root cause; apologies for not posting the solution we implemented on this forum earlier. After enabling debug, we found that our Knox hosts were not able to connect to the individual datanodes on port 1022; once this firewall issue was resolved, our external tool user was able to read a file from HDFS. However, we are still having one issue: when I tried using a curl command to read a file from an edge node going through Knox, I was still not able to connect, and the output log shows it was trying to hit the KNOX LB URL on port 8443 (the KNOX LB listens on port 443). In the Knox config we have a front-end URL of <KNOX LB URL>/gateway, but when we pursued this with HW support, they told me we need to change the front-end URL to <KNOX LB URL>:443/gateway; by default it was trying to hit the KNOX LB on port 8443, so to avoid this they asked me to include port 443 as well. My concern is that our external tool user is able to use Knox without any issues, so I am not sure whether to make this change or not. Can you please advise me?
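For what it's worth, one way to test from the edge node is to spell out port 443 on the LB explicitly and see whether the redirect still goes to 8443; this is only a sketch with placeholder host and user values:
# Explicitly target port 443 on the Knox load balancer to rule out the 8443 redirect
$ curl -i -k -u <user id> 'https://<KNOX LB URL>:443/gateway/default/webhdfs/v1/user/<user id>?op=LISTSTATUS'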
04-24-2017 04:54 PM
Friends, can anyone please help me with the following Knox "read a file" issue? Listing a file (using "ListStatus") currently works with our Knox Gateway load balancer URL: $ curl -i -k -L -u <user id> 'https://knoxgateway.<CORP DOMAIN>:443/gateway/default/webhdfs/v1/user/<user id>/servers?op=ListStatus'
Enter host password for user '<user id>':
HTTP/1.1 200 OK
Set-Cookie: JSESSIONID=9m4tcprbrs1eapxa0ljk5sfj;Path=/gateway/default;Secure;HttpOnly
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Cache-Control: no-cache
Expires: Mon, 24 Apr 2017 13:25:28 GMT
Date: Mon, 24 Apr 2017 13:25:28 GMT
Pragma: no-cache
Expires: Mon, 24 Apr 2017 13:25:28 GMT
Date: Mon, 24 Apr 2017 13:25:28 GMT
Pragma: no-cache
Server: Jetty(6.1.26.hwx)
Content-Type: application/json
Content-Length: 281
{"FileStatuses":{"FileStatus":[{"accessTime":1492803070763,"blockSize":134217728,"childrenNum":0,"fileId":219423467,"group":"hdfs","length":249,"modificationTime":1492803071085,"owner":"<user id>","pathSuffix":"","permission":"777","replication":3,"storagePolicy":0,"type":"FILE"} Now, i want to read this file using the same KNOX GATEWAY LB URL: $curl -i -k -L -u <user id> 'https://knoxgateway.<CORP DOMAIN>:443/gateway/test/webhdfs/v1/user/<user id>/servers?op=OPEN' <
Enter host password for user '<user id>':
HTTP/1.1 307 Temporary Redirect
Set-Cookie: JSESSIONID=1sopqfdu53c61xutx0ufk0hij;Path=/gateway/test;Secure;HttpOnly
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Cache-Control: no-cache
Expires: Mon, 24 Apr 2017 13:34:31 GMT
Date: Mon, 24 Apr 2017 13:34:31 GMT
Pragma: no-cache
Expires: Mon, 24 Apr 2017 13:34:31 GMT
Date: Mon, 24 Apr 2017 13:34:31 GMT
Pragma: no-cache
Location: https://knoxgateway.<CORP DOMAIN>/knox/test/webhdfs/data/v1/webhdfs/v1/user/<user id>/servers?_=AAAACAAAABAAAAEAlcGYLi4LTj7bhrrDPr1o2u6UIMEkO_aYiGAxiS4hu39uo-Homt5CbB2pwJ9p0Lkl2-7-l0vxINRjR70Ub7SA3D_ZKcoN46q0Bj97ceByV8hZgwEiIvyZmwSYEdKTVRCKV3VOhbuw1peDAJMhlS8SwYoPsRUOmPsdbmX5NLysp7mM7qktkmbHJyf_qXiAwNYuXmIhPBW_PZMmwjmQXckj7mDGAk61P-qWy1rSPoyPZ5oZ6y-7Uwijew0C3FNZzISDJICX6ePU2ptLEJOu1G8FaQonOUi37pvblYUuKSo-0wiLnBKRIvzrjfPzvh0tKrXi7FbCQnbn9sG0IyFjWssqlIoOlUVbf-Jo9eVF653ZyIqGjIYn9aX-7g
Server: Jetty(6.1.26.hwx)
Content-Type: application/octet-stream
Content-Length: 0
HTTP/1.1 404 Not Found
Cache-Control: must-revalidate,no-cache,no-store
Content-Type: text/html;charset=ISO-8859-1
Content-Length: 1324
Server: Jetty(8.1.14.v20131031)
<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1"/>
<title>Error 404 Not Found</title>
</head>
<body>
<h2>HTTP ERROR: 404</h2>
<p>Problem accessing /knox/default/webhdfs/data/v1/webhdfs/v1/user/<user id>/servers. Reason:
<pre> Not Found</pre></p>
<hr /><i><small>Powered by Jetty://</small></i>
I am really confused by this error: "Problem accessing /knox/default/webhdfs/data/v1/webhdfs/v1/user/<user id>/servers". To me this path looks wrong, and I guess the request is being redirected to a wrong path by 'rewrite.xml'? But I don't think we ever modified this file during the Knox setup. Can anyone please help or guide me in fixing this issue? I greatly appreciate your help. Thank you,
Labels:
- Apache Hadoop
- Apache Knox
04-24-2017 02:19 PM
Hi Arpit, thank you for the response. Our issue was actually resolved after refreshing the client configs on the NameNode host. It looks like the NameNode had cached the old configuration for the DN. HW support asked us to restart the NameNode or, if that was not possible, at least refresh the client configs; we refreshed the client configs first, and that resolved our issue.
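After the refresh, one simple way to confirm the rack assignments the NameNode is now using (just a sanity check, run on the NameNode host as the hdfs user):
# Show each datanode and the rack it is currently mapped to
$ hdfs dfsadmin -printTopology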
04-20-2017 07:24 PM
Good morning experts, I am currently facing the following issue: we have a 200-node Hadoop cluster with rack awareness configured. Suddenly we noticed that one of the datanodes was missing from Ambari, even though the datanode process was still running on that particular node. When we looked at the logs, we noticed the following error: Initialization failed for Block pool BP-3x84848-92929299 (Datanode Uuid 6048438486-d001-47af-a899-6493aca15c4c) service to hostname.com/<data node ip>:8020 Failed to add /default-rack/<datanode ip>:1019: You cannot have a rack and a non-rack node at the same level of the network topology. We added the datanode again from Ambari, but after starting the datanode it is still complaining with the above errors in the datanode logs. I didn't see any similar question in the community, so I am looking for your help. Since this is currently an issue in our production cluster, can anyone please help me quickly? I greatly appreciate your quick help. Thanks, ~hdpadmin
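For context, this error generally means the topology mapping is returning rack paths of different depths for different nodes (e.g. /default-rack for some and /dc1/rack1 for others). As a purely hypothetical illustration of a consistent mapping that a topology script might return (the exact file and format depend on your setup):
# Every node resolves to a rack path of the same depth
# 10.0.1.11  /dc1/rack1
# 10.0.1.12  /dc1/rack1
# 10.0.2.21  /dc1/rack2
# Mixing a bare /default-rack entry in with these deeper paths would trigger the
# "rack and a non-rack node at the same level" error shown above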
Labels:
- Apache Hadoop
03-12-2017 08:54 PM
Hi Jay, thank you for the response. We actually had all of these things in place, but I realized that our truststore password was incorrect, and I was able to fix that issue. It then complained about a self-signed cert on the Ranger Admin server, so I imported the Hive cert into the Ranger truststore, set the common name correctly in the Hive/Ranger configuration, and finally my issue was resolved. Thanks again for the reply. Subrah.
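For anyone hitting the same thing, the cert import was a standard keytool command along these lines; the alias, file paths, and password are placeholders, not our actual values:
# Import the Hive certificate into the Ranger truststore (placeholder alias/paths/password)
$ keytool -importcert -alias hiveserver2 -file /tmp/hiveserver2.crt \
      -keystore /path/to/ranger-truststore.jks -storepass <truststore password>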
03-12-2017 04:55 AM
Friends, I need your advice/help on the following issue: we have successfully configured HiveServer2 / SSL Ranger plugin / Kerberos, but I had never tested it before. Recently I found a Hive/Ranger plugin issue: when I tried to connect to HiveServer2 through Beeline, I was able to connect, but when I typed 'show databases' I got no result, and in the HiveServer2 logs I found the following errors. Here is the HiveServer2 log: 2017-03-11 22:46:56,026 ERROR [Thread-9]: util.RangerRESTClient (RangerRESTClient.java:getTrustManagers(342)) - Unable to read the necessary SSL Keystore and TrustStore Files
java.io.IOException: Keystore was tampered with, or password was incorrect
at sun.security.provider.JavaKeyStore.engineLoad(JavaKeyStore.java:780)
at sun.security.provider.JavaKeyStore$JKS.engineLoad(JavaKeyStore.java:56)
at sun.security.provider.KeyStoreDelegator.engineLoad(KeyStoreDelegator.java:225)
at sun.security.provider.JavaKeyStore$DualFormatJKS.engineLoad(JavaKeyStore.java:70)
at java.security.KeyStore.load(KeyStore.java:1445)
at org.apache.ranger.plugin.util.RangerRESTClient.getTrustManagers(RangerRESTClient.java:323)
at org.apache.ranger.plugin.util.RangerRESTClient.buildClient(RangerRESTClient.java:190)
at org.apache.ranger.plugin.util.RangerRESTClient.getClient(RangerRESTClient.java:177)
at org.apache.ranger.plugin.util.RangerRESTClient.getResource(RangerRESTClient.java:157)
at org.apache.ranger.admin.client.RangerAdminRESTClient.createWebResource(RangerAdminRESTClient.java:162)
at org.apache.ranger.admin.client.RangerAdminRESTClient.getServicePoliciesIfUpdated(RangerAdminRESTClient.java:70)
at org.apache.ranger.plugin.util.PolicyRefresher.loadPolicyfromPolicyAdmin(PolicyRefresher.java:215)
at org.apache.ranger.plugin.util.PolicyRefresher.loadPolicy(PolicyRefresher.java:183)
at org.apache.ranger.plugin.util.PolicyRefresher.run(PolicyRefresher.java:156)
Caused by: java.security.UnrecoverableKeyException: Password verification failed
at sun.security.provider.JavaKeyStore.engineLoad(JavaKeyStore.java:778)
... 13 more
2017-03-11 22:46:56,027 ERROR [Thread-9]: util.PolicyRefresher (PolicyRefresher.java:loadPolicyfromPolicyAdmin(238)) - PolicyRefresher(serviceName=TEST_hive): failed to refresh policies. Will continue to use last known version of policies (-1)
java.lang.IllegalArgumentException: SSLContext must not be null
I have verified the Java keystore and truststore passwords, as I was able to list both stores with the keytool command using those passwords. Can anyone please help me resolve this issue? Thank you, Subrah
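For reference, the check I ran was essentially the following; the store paths are placeholders for the keystore/truststore configured for the Ranger Hive plugin:
# List the contents of the plugin's keystore and truststore with the configured passwords
$ keytool -list -keystore /path/to/ranger-plugin-keystore.jks -storepass <keystore password>
$ keytool -list -keystore /path/to/ranger-plugin-truststore.jks -storepass <truststore password>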
Labels:
- Apache Hive
- Apache Ranger