Member since: 05-03-2016
Posts: 23
Kudos Received: 2
Solutions: 4
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 1673 | 04-08-2019 01:04 PM
 | 949 | 02-21-2019 02:37 PM
 | 1399 | 11-22-2017 11:35 AM
 | 1138 | 06-03-2016 06:49 AM
04-08-2019
01:04 PM
Got the hint; the logs point exactly to the issue. "ls -ld /" shows that / has 777 permissions. I just removed the write permission for group and other users, and my issue is solved. All this while I had only checked the permissions of the folders below "/", but the problem was with "/" itself.
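For anyone hitting the same error, a minimal sketch of the check and fix (the mode bits shown are illustrative):

```bash
# Verify the permissions on / itself, not just its subdirectories:
ls -ld /
# drwxrwxrwx ...  <- 0777 on / is what trips the DataNode's socket check

# Remove write permission for group and others, as the log message suggests:
chmod go-w /
ls -ld /
# drwxr-xr-x ...  <- 0755; the domain socket path now validates
```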
04-08-2019
12:35 PM
After a successful fresh installation of HDP 3.1.0 on two Ubuntu 18.04 instances, we were able to get all the services up and running. But after a night off, the DataNode doesn't start. Following is the error:

ERROR datanode.DataNode (DataNode.java:secureMain(2883)) - Exception in secureMain
java.io.IOException: The path component: '/' in '/var/lib/hadoop-hdfs/dn_socket' has permissions 0777 uid 0 and gid 0. It is not protected because it is world-writable. This might help: 'chmod o-w /'. For more information: https://wiki.apache.org/hadoop/SocketPathSecurity
at org.apache.hadoop.net.unix.DomainSocket.validateSocketPathSecurity0(Native Method)
at org.apache.hadoop.net.unix.DomainSocket.bindAndListen(DomainSocket.java:193)
at org.apache.hadoop.hdfs.net.DomainPeerServer.<init>(DomainPeerServer.java:40)
at org.apache.hadoop.hdfs.server.datanode.DataNode.getDomainPeerServer(DataNode.java:1194)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initDataXceiver(DataNode.java:1161)
at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:1416)
at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:500)
at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2782)
at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2690)
at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2732)
at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2876)
at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2900)
2019-04-08 17:36:52,452 INFO util.ExitUtil (ExitUtil.java:terminate(210)) - Exiting with status 1: java.io.IOException: The path component: '/' in '/var/lib/hadoop-hdfs/dn_socket' has permissions 0777 uid 0 and gid 0. It is not protected because it is world-writable. This might help: 'chmod o-w /'. For more information: https://wiki.apache.org/hadoop/SocketPathSecurity
2019-04-08 17:36:52,456 INFO datanode.DataNode (LogAdapter.java:info(51)) - SHUTDOWN_MSG

I did check the socket file and its parent directories, and I don't see 777 permissions at any level. But the error still appears while starting the DataNode. I couldn't find any solution, so I am posting here for help. I have also uploaded the complete logs for the DataNode: hadoop-hdfs-datanode.txt

Regards, Vinay MP
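Edit: for anyone checking the same thing, each path component's permissions can be inspected in one shot (a sketch; namei ships with util-linux and is available on Ubuntu 18.04):

```bash
# Print owner and permissions for every component of the socket path,
# including / itself, which a manual walk of the subdirectories can miss:
namei -l /var/lib/hadoop-hdfs/dn_socket

# Equivalent manual check:
ls -ld / /var /var/lib /var/lib/hadoop-hdfs
```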
02-21-2019
02:37 PM
I didn't find any such configuration on the Kylin side; Tomcat is bundled inside Kylin. So changing the port in Tomcat's server.xml helped.
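For anyone else looking for it, a sketch of the change (paths follow a typical binary Kylin install; adjust $KYLIN_HOME and the new port for your setup):

```bash
# The bundled Tomcat's configuration lives inside the Kylin install dir:
vi $KYLIN_HOME/tomcat/conf/server.xml

# In server.xml, change the HTTP <Connector>'s port attribute from 7070 to
# a free port (e.g. port="7071"), then restart Kylin:
$KYLIN_HOME/bin/kylin.sh stop
$KYLIN_HOME/bin/kylin.sh start
```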
02-20-2019
06:10 PM
How can I change the Apache Kylin port? The default listen port is 7070, and I have salt-bootstrap running on that port on the Azure VMs. I went through kylin.properties and didn't find a relevant property for the listen port.
Tags:
- kylin
07-19-2018
11:30 AM
I faced the same problem. I had created a cluster template with Cloudbreak 2.4.0. When I used the same template with Cloudbreak 2.7.0, cluster creation failed with the "Failed to retrieve server certificate" error. After reading this thread, I compared the image IDs used in the 2.4.0 and 2.7.0 templates and found them to be different. So if there is a problem with the image used to create the instances, it can lead to this error.
11-22-2017
11:35 AM
Hey all, after a series of tests we decided to move to CentOS 7.4 and upgrade to HDP 2.6.3.0. With CentOS 7.4 and Ambari 2.6.0.0, I don't see this issue even though I have Python 2.7.5. With reference to my previous comment, it looks to be an Ambari issue.
10-09-2017
07:28 AM
@Akhil S Naik @Jay SenSharma It was a firewall issue. The Ambari server is now responding properly. I did go through Jay's article; thanks for sharing, it will help in the future. Regards, Vinay MP
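For anyone else hitting this on CentOS 7, a sketch of the kind of firewall check and fix involved (8080 is the default Ambari web port; adjust if yours differs):

```bash
# Is firewalld running and blocking the Ambari port?
systemctl status firewalld

# Open the Ambari server port rather than disabling the firewall outright:
firewall-cmd --permanent --add-port=8080/tcp
firewall-cmd --reload
```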
10-08-2017
11:22 AM
Ambari server performance is way too slow. I did a fresh install of Ambari server on CentOS 7.3 with Oracle JDK 1.8. I know CentOS is only supported up to 7.2, but on the same configuration Ambari 2.2.2 works absolutely fine. I tested with Chrome, IE and Firefox; performance is bad in all of them. It takes nearly 2 minutes to log in. P.S.: This is a fresh installation and I am trying to launch the Install wizard. Navigating to every page takes 2-3 minutes, and I am not able to proceed past the "Get started" tab. Are there any known issues? Regards, Vinay MP
Labels:
- Apache Ambari
09-28-2017
06:09 AM
@Jay SenSharma I haven't found a feasible solution. As mentioned in the issue description, downgrading to Python 2.6 is not feasible as there are OS dependencies, and based on the link below I got the suggestion that it's not a good idea to disable certificate verification in Python: https://stackoverflow.com/questions/46274499/ambari-agent-certificate-verify-failed-is-it-safe-to-disable-the-certificate

Sharing some more information from our investigation, in case it helps others. We use AWS EC2. With Python 2.7, JDK 1.8 and CentOS 7.2 there is no issue; everything is smooth. With Python 2.7, JDK 1.8 and CentOS 7.3 or 7.4 we are seeing this issue. What I reported here is with respect to CentOS 7.3; with CentOS 7.4 the issue is slightly different: certificate verification fails while adding nodes to the cluster itself.

Downgrading from CentOS 7.3 to 7.2 is not straightforward. The AWS EC2 Marketplace provides a CentOS 7.0 image, and when we create an instance from it, it applies security and patch updates, resulting in CentOS 7.3. We could create our own CentOS 7.3 image from existing servers, but it's always good to be on the latest OS update for security reasons.

In short, we have workarounds but not a solution yet 🙂 Thanks for your help. I will post whichever solution we end up following. Regards, Vinay MP
09-18-2017
08:01 AM
Ambari version: 2.2.2.18
HDP stack: 2.4.3
OS: CentOS 7.3

Issue description: The Ambari server can't communicate with the Ambari agent. I can see the below error in the ambari-agent logs:

ERROR 2017-09-18 06:35:34,684 NetUtil.py:84 - [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:579)
ERROR 2017-09-18 06:35:34,684 NetUtil.py:85 - SSLError: Failed to connect. Please check openssl library versions.

I am facing this issue recently, and it can be replicated consistently after the instances are restarted (I am using EC2 instances). I am able to register agent nodes successfully, install the HDP cluster, run YARN jobs etc. with no problem at all. Once I restart my instances, I see this problem. There are some solutions already posted for this problem:
- Downgrade Python from 2.7 to a lower version; this is a known problem of Ambari with Python 2.7.
- Disable certificate verification by setting "verify = disable" under /etc/python/cert-verification.cfg.

I don't want to play with Python as it can disrupt many other things like Cassandra, the yum package manager etc. The second workaround is very easy and it works well! Now comes my question: is it safe to disable certificate verification in Python, i.e. by setting verify = disable? Regards, Vinay MP
Labels:
- Apache Ambari
06-06-2017
11:39 AM
I did face the same problem in an HDP 2.3 cluster with OpenJDK 1.6. I tried the above solution, but it didn't work for me. I then decided to try HDP 2.4 with OpenJDK 1.7, and now the Kerberos setup is successful.
01-25-2017
06:19 AM
Hey @rguruvannagari, sorry for replying so late and taking so long on this. Since the Thrift server wasn't required for our project, we decided to stop it in the cluster. And thank you for the suggestion; I got some free time and verified it. YARN was keeping the application in the ACCEPTED state as long as memory wasn't available. Once memory is available, I can see the hive prompt as the application goes to the RUNNING state.
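A quick way to watch this happen (assumes a working YARN client on the node):

```bash
# Applications waiting for resources sit in the ACCEPTED state:
yarn application -list -appStates ACCEPTED

# Once the Spark Thrift Server releases its containers, the Hive session's
# application should show up here instead:
yarn application -list -appStates RUNNING
```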
01-10-2017
09:38 AM
Thanks. I had created a few folders under /usr/hdp and faced the same issue. It's good practice not to create any files or folders under /usr/hdp, as the script doesn't like it. It's easy to move or create the folders somewhere else if required (rather than modifying the script). And that solves my issue!
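A quick sanity check before rerunning the script (a sketch; the assumption is that /usr/hdp should contain only version directories and the 'current' symlink):

```bash
# Anything listed here other than 'current' and version directories
# (e.g. 2.3.0.0-2557) is a likely culprit:
ls /usr/hdp | grep -vE '^(current|[0-9])'
```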
01-04-2017
12:50 PM
Hey all, I faced this issue in an HDP 2.4 cluster running on CentOS. When I run the 'hive' command, it always used to hang like below:

I tried adding the proxyuser properties for hosts and groups, and finally found that's not what is actually causing this. When I stop the Spark Thrift server, it immediately gives me the hive prompt and I can work with the Hive CLI. Has anybody faced similar problems? Regards, Vinay MP
Labels:
- Apache Hive
- Apache Spark
06-03-2016
06:49 AM
Finally I managed to get a new 16 GB machine where I can run the VM with good performance. For initial practice I had been using an 8 GB machine. I used the same VM; the command went through fine on the 16 GB machine and failed on the 8 GB machine. I'm not exactly sure whether memory was insufficient to run these tests on the 8 GB machine (I didn't see any OOM or related exceptions there), but I am glad the problem is solved. @Ian Roberts, @Predrag Minovic, thanks for taking the time to reply. Regards, Vinay MP
05-04-2016
07:26 AM
Hi @Ian Roberts, @Predrag Minovic, thanks for the suggestions. I will try them and update. For now I checked netstat, and I was able to see that the ResourceManager was up and listening on 8030, 8050 and a few more ports. All of a sudden I am not able to open a terminal session to node1 (one of the hosts in my VM). I will fix that and then verify the MapReduce example. Regards, Vinay MP
05-03-2016
12:52 PM
Hello, I am running the below command from the MapReduce examples for pi. It is failing, and I can see a socket timeout exception in the logs. I have not been able to find a solution anywhere so far, and would be glad if someone can help.

Command (from the directory /usr/hdp/2.3.0.0-2557/hadoop-mapreduce):

yarn jar hadoop-mapreduce-examples.jar pi 5 10

Below is the log trace:

2016-04-20 06:12:48,333 WARN [RMCommunicator Allocator] org.apache.hadoop.ipc.Client: Exception encountered while connecting to the server : java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/172.17.0.2:53751 remote=node1/172.17.0.2:8030]
2016-04-20 06:12:51,884 ERROR [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator: ERROR IN CONTACTING RM.
java.io.IOException: Failed on local exception: java.io.IOException: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/172.17.0.2:53751 remote=node1/172.17.0.2:8030]; Host Details : local host is: "node1/172.17.0.2"; destination host is: "node1":8030;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:773

I can see the property in Advanced yarn-site: yarn.resourcemanager.scheduler.address = node1:8030

Hosts file entry:

[root@node1 ~]# cat /etc/hosts
172.17.0.2 node1
127.0.0.1 localhost
::1 localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
[root@node1 ~]#

Not sure what the problem is. I can ping localhost, node1 and 127.0.0.1 from the node1 terminal. Regards, Vinay MP
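Edit: for completeness, the kind of connectivity checks relevant here (a sketch; adjust host and ports to your setup):

```bash
# 1. Is the ResourceManager scheduler actually listening on 8030?
netstat -tlnp | grep 8030

# 2. Can the port be reached over TCP? (ping only proves ICMP works)
telnet node1 8030

# 3. Does node1 resolve to the address the ResourceManager bound to?
getent hosts node1
```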
Labels:
- Apache Hadoop
- Apache YARN