Member since
07-15-2014
57
Posts
9
Kudos Received
6
Solutions
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 6809 | 06-05-2015 05:09 PM |
 | 1714 | 12-19-2014 01:03 PM |
 | 3199 | 12-17-2014 08:23 PM |
 | 8264 | 12-16-2014 03:07 PM |
 | 13741 | 08-30-2014 11:14 PM |
12-15-2014
10:40 AM
This is the error.

CRITICAL Initialization failed for Block pool BP-1219478626-192.168.1.20-1418484473049 (Datanode Uuid null) service to nn1home/10.192.128.227:8022
Datanode denied communication with namenode because the host is not in the include-list: DatanodeRegistration(10.192.128.231, datanodeUuid=ff6a2644-3140-4451-a59f-496478a000d7, infoPort=50075, ipcPort=50020, storageInfo=lv=-56;cid=cluster18;nsid=850143528;c=0)
at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.registerDatanode(DatanodeManager.java:889)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.registerDatanode(FSNamesystem.java:4798)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.registerDatanode(NameNodeRpcServer.java:1037)
at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.registerDatanode(DatanodeProtocolServerSideTranslatorPB.java:92)
at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:26378)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:587)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
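The message says the NameNode rejected the DataNode because its address is not in the include list. A minimal sketch, assuming the include file holds one host or IP per line (commonly the file named by dfs.hosts in hdfs-site.xml), that pulls the rejected address out of the error text and checks it against the entries. The function names and the sample include entries are hypothetical.

```python
import re

# Matches the address inside "DatanodeRegistration(<ip>, ...)" as it appears
# in the NameNode error message.
REGISTRATION_RE = re.compile(r"DatanodeRegistration\((\d{1,3}(?:\.\d{1,3}){3}),")


def rejected_datanode_ips(log_text):
    """Return the IPs named in DatanodeRegistration(...) entries."""
    return REGISTRATION_RE.findall(log_text)


def missing_from_include(ips, include_lines):
    """Return the IPs that do not appear among the include-file entries.

    include_lines is assumed to hold one host or IP per line; blank lines
    and surrounding whitespace are ignored.
    """
    allowed = {line.strip() for line in include_lines if line.strip()}
    return [ip for ip in ips if ip not in allowed]


if __name__ == "__main__":
    error = ("Datanode denied communication with namenode because the host is "
             "not in the include-list: DatanodeRegistration(10.192.128.231, "
             "datanodeUuid=ff6a2644-3140-4451-a59f-496478a000d7, infoPort=50075)")
    include = ["192.168.1.21\n", "192.168.1.22\n"]  # hypothetical old entries
    print(missing_from_include(rejected_datanode_ips(error), include))
```

If the new address turns out to be missing, the usual follow-up is to add it to the include file on the NameNode and run hdfs dfsadmin -refreshNodes so the list is re-read.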
12-15-2014
10:33 AM
I have an existing demo cluster that I built. It was working perfectly well. However, because of some changes, I had to change the IP address of each of the data nodes. I ran grep -R 'oldIP' /etc on each machine, edited the files that contained the old IP addresses, and replaced them with the new IPs. Then I rebooted each machine. Despite that, when I run sudo -u hdfs hadoop dfsadmin -report, it shows 2 dead data nodes and lists the old IP addresses. How can I remove the old IPs and replace them with the new IP addresses?
Labels: Apache Hadoop, HDFS
08-30-2014
11:14 PM
1 Kudo
I was able to solve the problem. I had to specify "python" explicitly in the mapper, like this:

sudo -u hdfs hadoop jar /usr/lib/hadoop-0.20-mapreduce/contrib/streaming/hadoop-streaming.jar -input /sample/cite75_99.txt -output /foo -mapper 'python RandomSample.py 10' -file RandomSample.py -numReduceTasks 1
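A streaming mapper is just a command that reads records on stdin and writes to stdout, so the exact mapper command can be tried on a few lines locally before submitting the job. The sketch below is one way to do that; run_mapper_locally is a hypothetical helper, not part of Hadoop, and the stand-in mapper simply counts lines.

```python
import subprocess
import sys


def run_mapper_locally(command, lines):
    """Feed lines to a mapper command on stdin and return its stdout."""
    result = subprocess.run(
        command,
        input="".join(lines),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout


if __name__ == "__main__":
    # Stand-in mapper that counts its input lines, similar to `wc -l`.
    counter = [sys.executable, "-c",
               "import sys; print(sum(1 for _ in sys.stdin))"]
    print(run_mapper_locally(counter, ["a,1\n", "b,2\n", "c,3\n"]))
```

Passing ["python", "RandomSample.py", "10"] as the command mirrors the working -mapper value above. Passing ["./RandomSample.py", "10"] instead fails locally if the script is not executable, which may be the same reason the job failed without the "python" prefix.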
08-30-2014
10:40 PM
I changed my python code to

#!/usr/bin/env python
import sys, random
file = open("/tmp/log.txt", "w")
for line in sys.stdin:
    file.write("line: " + line + "\n")
file.close()

When I run my job, I see exactly the same error, and the file /tmp/log.txt is not created on any machine. So I guess the script is not even being invoked.
08-30-2014
09:11 PM
I have a 5 node hadoop cluster on which I can execute the following streaming job successfully

sudo -u hdfs hadoop jar /usr/lib/hadoop-0.20-mapreduce/contrib/streaming/hadoop-streaming.jar -input /sample/apat63_99.txt -output /foo1 -mapper 'wc -l' -numReduceTasks 0

But when I try to execute a streaming job using python

sudo -u hdfs hadoop jar /usr/lib/hadoop-0.20-mapreduce/contrib/streaming/hadoop-streaming.jar -input /sample/apat63_99.txt -output /foo5 -mapper 'AttributeMax.py 8' -file '/tmp/AttributeMax.py' -numReduceTasks 1

I get an error

packageJobJar: [/tmp/AttributeMax.py, /tmp/hadoop-hdfs/hadoop-unjar2062240123197790813/] [] /tmp/streamjob4074525553604040275.jar tmpDir=null
14/08/29 11:22:58 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
14/08/29 11:22:58 INFO mapred.FileInputFormat: Total input paths to process : 1
14/08/29 11:22:59 INFO streaming.StreamJob: getLocalDirs(): [/tmp/hadoop-hdfs/mapred/local]
14/08/29 11:22:59 INFO streaming.StreamJob: Running job: job_201408272304_0030
14/08/29 11:22:59 INFO streaming.StreamJob: To kill this job, run:
14/08/29 11:22:59 INFO streaming.StreamJob: UNDEF/bin/hadoop job -Dmapred.job.tracker=jt1:8021 -kill job_201408272304_0030
14/08/29 11:22:59 INFO streaming.StreamJob: Tracking URL: http://jt1:50030/jobdetails.jsp?jobid=job_201408272304_0030
14/08/29 11:23:00 INFO streaming.StreamJob: map 0% reduce 0%
14/08/29 11:23:46 INFO streaming.StreamJob: map 100% reduce 100%
14/08/29 11:23:46 INFO streaming.StreamJob: To kill this job, run:
14/08/29 11:23:46 INFO streaming.StreamJob: UNDEF/bin/hadoop job -Dmapred.job.tracker=jt1:8021 -kill job_201408272304_0030
14/08/29 11:23:46 INFO streaming.StreamJob: Tracking URL: http://jt1:50030/jobdetails.jsp?jobid=job_201408272304_0030
14/08/29 11:23:46 ERROR streaming.StreamJob: Job not successful. Error: NA
14/08/29 11:23:46 INFO streaming.StreamJob: killJob...

In my job tracker console I see errors

java.io.IOException: log:null
R/W/S=2359/0/0 in:NA [rec/s] out:NA [rec/s]
minRecWrittenToEnableSkip_=9223372036854775807 LOGNAME=null
HOST=null
USER=mapred
HADOOP_USER=null
last Hadoop input: |null|
last tool output: |null|
Date: Fri Aug 29 11:22:43 CDT 2014
java.io.IOException: Broken pipe
at java.io.FileOutputStream.writeBytes(Native Method)
at java.io.FileOutputStream.write(FileOutputStream.java:282)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:105)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:109)
at java.io.DataOutputStream.write(DataOutputStream.java:90)
at org.apache.hadoop.streaming.io.TextInputWriter.writeUTF8(TextInputWriter.java:72)
at org.apache.hadoop.streaming.io.TextInputWriter.writeValue(TextInputWriter.java:51)
at org.apache.hadoop.streaming.PipeMapper.map(PipeMapper.java:110)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.streaming.Pipe

The python code itself is pretty simple

#!/usr/bin/env python
import sys
index = int(sys.argv[1])
max = 0
for line in sys.stdin:
    fields = line.strip().split(",")
    if fields[index].isdigit():
        val = int(fields[index])
        if (val > max):
            max = val
else:
    print max
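To rule out the script logic itself, the same computation can be written as a plain function and exercised without Hadoop. This is only a sketch in Python 3 syntax: attribute_max is a hypothetical name, and unlike the script above it skips rows that have too few columns instead of raising an error.

```python
def attribute_max(lines, index):
    """Return the largest non-negative integer in column `index` of CSV lines."""
    largest = 0
    for line in lines:
        fields = line.strip().split(",")
        # Skip short rows and non-numeric values such as header text.
        if index < len(fields) and fields[index].isdigit():
            largest = max(largest, int(fields[index]))
    return largest


if __name__ == "__main__":
    # Hypothetical sample rows; the real job reads /sample/apat63_99.txt.
    sample = ['"PATENT","GYEAR","CLAIMS"\n', "3070801,1963,12\n", "3070802,1963,7\n"]
    print(attribute_max(sample, 2))
```

If this returns sensible values on a few lines of the input file, the failure is more likely in how the task launches the script than in the code.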
Labels: Apache Hadoop, HDFS, MapReduce
08-01-2014
08:34 AM
2 Kudos
When I try to start the job tracker using this command

service hadoop-0.20-mapreduce-jobtracker start

I can see this error

org.apache.hadoop.security.AccessControlException: Permission denied: user=mapred, access=WRITE, inode="/":hdfs:supergroup:drwxr-xr-x
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:224)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:204)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:149)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:4891)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:4873)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:4847)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:3192)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:3156)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3137)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:669)

I found this blog post which tries to address this issue
http://blog.spryinc.com/2013/06/hdfs-permissions-overcoming-permission.html
I followed the steps there and did

groupadd supergroup
usermod -a -G supergroup mapred
usermod -a -G supergroup hdfs

but I still get this problem. The only difference between the blog entry and my case is that for me the error is on the root directory ("/"), whereas in the blog it is on "/user".

Here is my mapred-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>jt1:8021</value>
</property>
<property>
<name>mapred.local.dir</name>
<value>/tmp/mapred/jt</value>
</property>
<property>
<name>mapred.system.dir</name>
<value>/tmp/mapred/system</value>
</property>
<property>
<name>mapreduce.jobtracker.staging.root.dir</name>
<value>/user</value>
</property>
<property>
<name>mapred.job.tracker.persist.jobstatus.active</name>
<value>true</value>
</property>
<property>
<name>mapred.job.tracker.persist.jobstatus.hours</name>
<value>24</value>
</property>
<property>
<name>mapred.jobtracker.taskScheduler</name>
<value>org.apache.hadoop.mapred.FairScheduler</value>
</property>
<property>
<name>mapred.fairscheduler.poolnameproperty</name>
<value>user.name</value>
</property>
<property>
<name>mapred.fairscheduler.allocation.file</name>
<value>/etc/hadoop/conf/fair-scheduler.xml</value>
</property>
<property>
<name>mapred.fairscheduler.allow.undeclared.pools</name>
<value>true</value>
</property>
</configuration>

I also found this blog
http://www.hadoopinrealworld.com/fixing-org-apache-hadoop-security-accesscontrolexception-permission-denied/
I did

sudo -u hdfs hdfs dfs -mkdir /home
sudo -u hdfs hdfs dfs -chown mapred:mapred /home
sudo -u hdfs hdfs dfs -mkdir /home/mapred
sudo -u hdfs hdfs dfs -chown mapred /home/mapred
sudo -u hdfs hdfs dfs -chown hdfs:supergroup /

but the problem is still not resolved 😞 Please help. I wonder why it is checking the root directory: inode="/":hdfs:supergroup:drwxr-xr-x
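One possible reading of the trace: the mkdirs call goes through checkAncestorAccess, which would check the nearest existing ancestor of the directory being created. Assuming the JobTracker was creating mapred.system.dir (/tmp/mapred/system) and /tmp did not yet exist in HDFS, that ancestor is "/", which only hdfs can write to. The sketch below models this with plain POSIX-style permission bits. The function names and the set of existing directories are hypothetical, and the HDFS super-user bypass is deliberately not modelled.

```python
import posixpath


def nearest_existing_ancestor(path, existing_dirs):
    """Walk up from path until a directory listed in existing_dirs is found."""
    current = posixpath.normpath(path)
    while current not in existing_dirs and current != "/":
        current = posixpath.dirname(current)
    return current


def can_write(user, user_groups, owner, group, mode):
    """Check the write bit that applies to user for an inode.

    mode is the 10-character listing string, e.g. "drwxr-xr-x".
    Owner bits apply to the owner, group bits to members of the inode's
    group, and the remaining bits to everyone else.
    """
    owner_bits, group_bits, other_bits = mode[1:4], mode[4:7], mode[7:10]
    if user == owner:
        return "w" in owner_bits
    if group in user_groups:
        return "w" in group_bits
    return "w" in other_bits


if __name__ == "__main__":
    existing = {"/", "/user"}  # assumption: /tmp has not been created in HDFS
    target = nearest_existing_ancestor("/tmp/mapred/system", existing)
    print(target)
    # Values taken from the error: inode="/":hdfs:supergroup:drwxr-xr-x
    print(can_write("mapred", {"mapred"}, "hdfs", "supergroup", "drwxr-xr-x"))
```

If that reading is right, creating the directory as hdfs and handing it to mapred (sudo -u hdfs hdfs dfs -mkdir -p /tmp/mapred/system, then sudo -u hdfs hdfs dfs -chown -R mapred /tmp/mapred) would keep the JobTracker from needing write access on "/".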
Labels: Apache Hadoop, HDFS, MapReduce, Security
07-29-2014
04:35 PM
Thank you so much. Your answer is absolutely correct. I went to each server and did

nn1: service zookeeper-server init --myid=1 --force
nn2: service zookeeper-server init --myid=2 --force
jt1: service zookeeper-server init --myid=3 --force

Earlier I had chosen an ID of 1 on every machine. I also corrected my zoo.cfg to make sure it has the right entries. Now it works and I am able to do

sudo -u hdfs hdfs zkfc -formatZK

Thank you so much!
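The fix comes down to every server having a distinct myid that matches its server.N line in zoo.cfg. A small sketch of that consistency check follows; the helper names are hypothetical, and the 2888 and 3888 ports are the conventional quorum and election ports, assumed here rather than taken from the actual zoo.cfg.

```python
from collections import defaultdict


def duplicate_ids(myid_by_host):
    """Return {myid: [hosts]} for every myid used by more than one host."""
    hosts_by_id = defaultdict(list)
    for host in sorted(myid_by_host):
        hosts_by_id[myid_by_host[host]].append(host)
    return {i: hosts for i, hosts in hosts_by_id.items() if len(hosts) > 1}


def server_lines(myid_by_host, peer_port=2888, election_port=3888):
    """Render zoo.cfg server.N entries, ordered by myid (ports are assumed)."""
    ordered = sorted(myid_by_host.items(), key=lambda item: item[1])
    return ["server.%d=%s:%d:%d" % (myid, host, peer_port, election_port)
            for host, myid in ordered]


if __name__ == "__main__":
    before = {"nn1": 1, "nn2": 1, "jt1": 1}  # the original mistake
    after = {"nn1": 1, "nn2": 2, "jt1": 3}   # ids used in the fix above
    print(duplicate_ids(before))
    print(duplicate_ids(after))
    print("\n".join(server_lines(after)))
```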
07-26-2014
09:44 PM
I am reading this article http://hadoop.apache.org/docs/r1.2.1/mapred_tutorial.html

I am having trouble visualizing how this code executes in a distributed environment, that is, what happens when I package it as a JAR and run it on a Hadoop cluster. Below is my understanding, along with my doubts and questions.

1. First the run method is called, which sets up the JobConf object and runs the job. (Which machine does the main method execute on? The job tracker node? A task tracker node?)
2. Now suppose a machine is randomly chosen to run the main method. My understanding is that the JAR file is serialized and sent to a few machines running a task tracker, where the map function runs first. For this, the input file is split and the fragments are serialized to the nodes running the map tasks. (Question: does Hadoop persist these splits on HDFS as well, or are the splits held in memory?)
3. The map function creates key-value pairs, and the output is sorted as well. (Question: does Hadoop persist the output of the map functions to HDFS before handing it to the reduce processes?)
4. Now Hadoop starts reduce processes across the cluster to run the reduce code. This code is given the output of the map tasks.
5. My biggest confusion is what happens after each reduce has run and we have output from each reduce process. How do we then merge those outputs into the final output? For example, if we were calculating the value of pi (there is a sample for that), how is the final value calculated from the output of the different reduce tasks?

Sorry if this question is very basic or very broad. I am just trying to learn.
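The data flow described in steps 2 to 5 can be imitated in a single process, which may make the phases easier to picture. The sketch below is a word count in the spirit of the tutorial, not Hadoop code: map_fn, reduce_fn and run_job are hypothetical names, and each inner list stands in for one input split.

```python
from collections import defaultdict


def map_fn(line):
    """Emit a (word, 1) pair for every word in the line."""
    for word in line.split():
        yield word, 1


def reduce_fn(key, values):
    """Combine all values seen for one key into a single output pair."""
    return key, sum(values)


def run_job(splits):
    # Map phase: on a cluster each split is handled by a separate map task.
    intermediate = []
    for split in splits:
        for line in split:
            intermediate.extend(map_fn(line))
    # Shuffle and sort: group the values by key between the two phases.
    grouped = defaultdict(list)
    for key, value in intermediate:
        grouped[key].append(value)
    # Reduce phase: one call per key, with keys visited in sorted order.
    return dict(reduce_fn(key, grouped[key]) for key in sorted(grouped))


if __name__ == "__main__":
    print(run_job([["hello world", "bye world"], ["hello hadoop"]]))
```

On a real cluster the intermediate map output is kept on the local disks of the task nodes rather than in HDFS, and each reduce task writes its own part file to the output directory. Nothing merges those files automatically, so a job that needs a single final value typically uses one reducer or combines the part files in the driver afterwards.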
Labels: Apache Hadoop, HDFS
07-26-2014
12:47 AM
Yes, your suggestion is right. The .32 machine (jt1) did not have the firewall switched off. When I ran systemctl stop firewalld.service and systemctl disable firewalld.service, it started to work fine.
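A quick way to spot a host whose firewall is still blocking traffic is to try a TCP connection to each port the ensemble needs from every other machine. The sketch below does that with the standard socket module. port_open and blocked_endpoints are hypothetical helpers, and which ports to probe beyond the 2181 client port (for example 2888 and 3888, the conventional quorum and election ports) is an assumption.

```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def blocked_endpoints(hosts, ports, timeout=2.0):
    """Return the (host, port) pairs that could not be reached."""
    return [(host, port)
            for host in hosts
            for port in ports
            if not port_open(host, port, timeout)]
```

Calling blocked_endpoints(["nn1", "nn2", "jt1"], [2181, 2888, 3888]) from each node in turn lists the endpoints that node cannot reach. A port that is open on some hosts but not on others points at that host's firewall.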
07-25-2014
08:16 AM
When I issue the command

sudo -u hdfs hdfs zkfc -formatZK

I get the error

14/07/24 00:24:34 INFO zookeeper.ClientCnxn: Opening socket connection to server nn1/192.168.1.30:2181. Will not attempt to authenticate using SASL (java.lang.SecurityException: Unable to locate a login configuration)
14/07/24 00:24:34 INFO zookeeper.ClientCnxn: Socket connection established to nn1/192.168.1.30:2181, initiating session
14/07/24 00:24:34 INFO zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect
14/07/24 00:24:35 INFO zookeeper.ClientCnxn: Opening socket connection to server nn2/192.168.1.31:2181. Will not attempt to authenticate using SASL (java.lang.SecurityException: Unable to locate a login configuration)
14/07/24 00:24:35 INFO zookeeper.ClientCnxn: Socket connection established to nn2/192.168.1.31:2181, initiating session
14/07/24 00:24:35 INFO zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect
14/07/24 00:24:35 INFO zookeeper.ClientCnxn: Opening socket connection to server jt1/192.168.1.32:2181. Will not attempt to authenticate using SASL (java.lang.SecurityException: Unable to locate a login configuration)
14/07/24 00:24:35 INFO zookeeper.ClientCnxn: Socket connection established to jt1/192.168.1.32:2181, initiating session
14/07/24 00:24:35 INFO zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect
14/07/24 00:24:37 INFO zookeeper.ClientCnxn: Opening socket connection to server nn1/192.168.1.30:2181. Will not attempt to authenticate using SASL (java.lang.SecurityException: Unable to locate a login configuration)
14/07/24 00:24:37 INFO zookeeper.ClientCnxn: Socket connection established to nn1/192.168.1.30:2181, initiating session
14/07/24 00:24:37 INFO zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect
14/07/24 00:24:37 INFO zookeeper.ClientCnxn: Opening socket connection to server nn2/192.168.1.31:2181. Will not attempt to authenticate using SASL (java.lang.SecurityException: Unable to locate a login configuration)
14/07/24 00:24:37 INFO zookeeper.ClientCnxn: Socket connection established to nn2/192.168.1.31:2181, initiating session
14/07/24 00:24:37 INFO zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect
14/07/24 00:24:37 INFO zookeeper.ClientCnxn: Opening socket connection to server jt1/192.168.1.32:2181. Will not attempt to authenticate using SASL (java.lang.SecurityException: Unable to locate a login configuration)
14/07/24 00:24:37 INFO zookeeper.ClientCnxn: Socket connection established to jt1/192.168.1.32:2181, initiating session
14/07/24 00:24:37 INFO zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect
14/07/24 00:24:39 INFO zookeeper.ClientCnxn: Opening socket connection to server nn1/192.168.1.30:2181. Will not attempt to authenticate using SASL (java.lang.SecurityException: Unable to locate a login configuration)
14/07/24 00:24:39 INFO zookeeper.ClientCnxn: Socket connection established to nn1/192.168.1.30:2181, initiating session
14/07/24 00:24:39 INFO zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect
14/07/24 00:24:39 ERROR ha.ActiveStandbyElector: Connection timed out: couldn't connect to ZooKeeper in 5000 milliseconds
14/07/24 00:24:40 INFO zookeeper.ZooKeeper: Session: 0x0 closed
14/07/24 00:24:40 INFO zookeeper.ClientCnxn: EventThread shut down
14/07/24 00:24:40 FATAL ha.ZKFailoverController: Unable to start failover controller. Unable to connect to ZooKeeper quorum at nn1:2181,nn2:2181,jt1:2181. Please check the configured value for ha.zookeeper.quorum and ensure that ZooKeeper is running.

I have confirmed that the zookeeper service is running on every machine by

[root@nn1 ~]# service zookeeper-server start
JMX enabled by default
Using config: /etc/zookeeper/conf/zoo.cfg
Starting zookeeper ... already running as process 1065.

I can also do an nc from every machine to every machine

[root@nn1 ~]# nc nn1 2181
^C
[root@nn1 ~]# nc nn2 2181
^C
[root@nn1 ~]# nc jt1 2181
^C
[root@nn1 ~]#

I can see this in the zookeeper event log

2014-07-24 00:24:18,706 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@774] - Notification time out: 60000
2014-07-24 00:24:34,956 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /192.168.1.30:35151
2014-07-24 00:24:34,956 [myid:1] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running
2014-07-24 00:24:34,956 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed socket connection for client /192.168.1.30:35151 (no session established for client)
2014-07-24 00:24:37,075 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /192.168.1.30:35154
2014-07-24 00:24:37,076 [myid:1] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running
2014-07-24 00:24:37,076 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed socket connection for client /192.168.1.30:35154 (no session established for client)
2014-07-24 00:24:39,432 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /192.168.1.30:35157
2014-07-24 00:24:39,433 [myid:1] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running
2014-07-24 00:24:39,433 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed socket connection for client /192.168.1.30:35157 (no session established for client)
2014-07-24 00:25:18,709 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@190] - Have smaller server identifier, so dropping the connection: (2, 1)
2014-07-24 00:25:18,710 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@190] - Have smaller server identifier, so dropping the connection: (3, 1)
2014-07-24 00:25:18,711 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@774] - Notification time out: 60000
2014-07-24 00:26:18,713 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@190] - Have smaller server identifier, so dropping the connection: (2, 1)
2014-07-24 00:26:18,715 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@190] - Have smaller server identifier, so dropping the connection: (3, 1)
2014-07-24 00:26:18,716 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@774] - Notification time out: 60000
2014-07-24 00:26:40,619 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /192.168.1.30:35170
2014-07-24 00:26:43,508 [myid:1] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@349] - caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid 0x0, likely client has closed socket
at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:220)
at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
at java.lang.Thread.run(Thread.java:662)
2014-07-24 00:26:43,511 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed socket connection for client /192.168.1.30:35170 (no session established for client)
2014-07-24 00:27:18,717 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@190] - Have smaller server identifier, so dropping the connection: (2, 1)
2014-07-24 00:27:18,719 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@190] - Have smaller server identifier, so dropping the connection: (3, 1)
2014-07-24 00:27:18,719 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@774] - Notification time out: 60000
Labels: Apache Zookeeper, HDFS