Member since: 07-25-2016
Posts: 40
Kudos Received: 5
Solutions: 0
08-29-2017
07:17 AM
I am using Apache Hadoop 2.7.1 on CentOS 7
in an HA cluster that consists of two namenodes and 6 datanodes,
and I keep seeing the following error in my logs:
DataXceiver error processing WRITE_BLOCK operation src: /172.16.1.153:38360 dst: /172.16.1.153:50010
java.io.IOException: Connection reset by peer
So I updated the following property in hdfs-site.xml
in order to increase the number of available transfer threads on all datanodes:
<property>
<name>dfs.datanode.max.transfer.threads</name>
<value>16000</value>
</property>
I also increased the number of open files by adding ulimit -n 16384 to .bashrc,
but I am still getting this error in my datanode logs. While sending write requests to the cluster, I ran the following command on the datanodes to check the number of threads: cat /proc/processid/status
It never exceeds 100 threads. To check the number of open files I ran
sysctl fs.file-nr, and it never exceeds 300 open files. So why am I still getting this error in the datanode logs, and what is its effect on performance?
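For reference, here is a minimal sketch of how I check the limits and thread counts that are actually in effect for the running DataNode process rather than for my shell (the pgrep pattern is an assumption; adjust it to your install, and jstack needs the JDK):
# Find the DataNode PID (the class name pattern is an assumption; adjust if needed)
DN_PID=$(pgrep -f org.apache.hadoop.hdfs.server.datanode.DataNode | head -n 1)
# Open-file limit applied to the DataNode itself (a ulimit in .bashrc only affects new shells)
grep "Max open files" /proc/$DN_PID/limits
# File descriptors the DataNode currently holds
ls /proc/$DN_PID/fd | wc -l
# Live DataXceiver threads (run jstack as the same user as the DataNode)
jstack $DN_PID | grep -c DataXceiver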
... View more
- Tags:
- hadoop
- Hadoop Core
Labels:
- Apache Hadoop
08-22-2017
01:36 PM
iam using hadoop apache 2.7.1 and it's high available and when issuing the command hdfs dfs -put file1 /hadoophome/ iam not able to put my file with following log in one of the available data nodes i have 6 data nodes and the replication factor is 3 2017-08-22 15:01:07,351 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-840293587-192.168.25.22-1499689510217:blk_1074111953_371166, type=HAS_DOWNSTREAM_IN_PIPELINE terminating 2017-08-22 15:01:07,938 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-840293587-192.168.25.22-1499689510217:blk_1074111949_371162, type=HAS_DOWNSTREAM_IN_PIPELINE java.io.EOFException: Premature EOF: no length prefix available at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2280) at org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:244) at org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.run(BlockReceiver.java:1237) at java.lang.Thread.run(Thread.java:748) 2017-08-22 15:01:08,984 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-840293587-192.168.25.22-1499689510217:blk_1074111954_371167 src: /192.168.25.8:35957 dest: /192.168.25.2:50010 2017-08-22 15:01:09,146 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /192.168.25.8:35957, dest: /192.168.25.2:50010, bytes: 82, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_-1028946205_23, offset: 0, srvID: 075f3a14-9b13-404d-8ba8-4066212655d7, blockid: BP-840293587-192.168.25.22-1499689510217:blk_1074111954_371167, duration: 3404900 2017-08-22 15:01:09,146 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-840293587-192.168.25.22-1499689510217:blk_1074111954_371167, type=HAS_DOWNSTREAM_IN_PIPELINE terminating 2017-08-22 15:01:09,280 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: IOException in BlockReceiver.run(): java.net.SocketTimeoutException: 300 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/192.168.25.2:50010 remote=/192.168.25.2:48085] at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164) at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:159) at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:117) at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82) at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140) at java.io.DataOutputStream.flush(DataOutputStream.java:123) at org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.sendAckUpstreamUnprotected(BlockReceiver.java:1473) at org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.sendAckUpstream(BlockReceiver.java:1410) at org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.run(BlockReceiver.java:1323) at java.lang.Thread.run(Thread.java:748) 2017-08-22 15:01:09,281 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-840293587-192.168.25.22-1499689510217:blk_1074111949_371162, type=HAS_DOWNSTREAM_IN_PIPELINE java.net.SocketTimeoutException: 300 millis timeout while waiting for channel to be ready for write. 
ch : java.nio.channels.SocketChannel[connected local=/192.168.25.2:50010 remote=/192.168.25.2:48085] at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164) at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:159) at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:117) at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82) at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140) at java.io.DataOutputStream.flush(DataOutputStream.java:123) at org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.sendAckUpstreamUnprotected(BlockReceiver.java:1473) at org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.sendAckUpstream(BlockReceiver.java:1410) at org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.run(BlockReceiver.java:1323) at java.lang.Thread.run(Thread.java:748) 2017-08-22 15:01:09,281 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-840293587-192.168.25.22-1499689510217:blk_1074111949_371162, type=HAS_DOWNSTREAM_IN_PIPELINE terminating 2017-08-22 15:01:10,808 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Slow BlockReceiver write data to disk cost:3173ms (threshold=300ms) 2017-08-22 15:01:10,808 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exception for BP-840293587-192.168.25.22-1499689510217:blk_1074111949_371162 java.nio.channels.ClosedByInterruptException at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202) at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:417) at org.apache.hadoop.net.SocketInputStream$Reader.performIO(SocketInputStream.java:57) at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142) at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161) at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131) at java.io.BufferedInputStream.fill(BufferedInputStream.java:235) at java.io.BufferedInputStream.read1(BufferedInputStream.java:275) at java.io.BufferedInputStream.read(BufferedInputStream.java:334) at java.io.DataInputStream.read(DataInputStream.java:149) at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:199) at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:213) at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:134) at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109) at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:472) at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:849) at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:804) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74) at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:251) at java.lang.Thread.run(Thread.java:748) 2017-08-22 15:01:10,809 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: opWriteBlock BP-840293587-192.168.25.22-1499689510217:blk_1074111949_371162 received exception java.nio.channels.ClosedByInterruptException 2017-08-22 15:01:10,809 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: dn2:50010:DataXceiver error processing WRITE_BLOCK operation src: /192.168.25.2:48085 dst: /192.168.25.2:50010 
java.nio.channels.ClosedByInterruptException at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202) at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:417) at org.apache.hadoop.net.SocketInputStream$Reader.performIO(SocketInputStream.java:57) at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142) at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161) at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131) at java.io.BufferedInputStream.fill(BufferedInputStream.java:235) at java.io.BufferedInputStream.read1(BufferedInputStream.java:275) at java.io.BufferedInputStream.read(BufferedInputStream.java:334) at java.io.DataInputStream.read(DataInputStream.java:149) at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:199) at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:213) at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:134) at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109) at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:472) at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:849) at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:804) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74) at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:251) at java.lang.Thread.run(Thread.java:748) 2017-08-22 15:01:11,314 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-840293587-192.168.25.22-1499689510217:blk_1074111949_371162 src: /192.168.25.2:48091 dest: /192.168.25.2:50010 2017-08-22 15:01:11,314 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Recover RBW replica BP-840293587-192.168.25.22-1499689510217:blk_1074111949_371162 2017-08-22 15:01:11,314 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Recovering ReplicaBeingWritten, blk_1074111949_371162, RBW getNumBytes() = 53932032 getBytesOnDisk() = 53932032 getVisibleLength()= 53932032 getVolume() = /hdd/data_dir/current getBlockFile() = /hdd/data_dir/current/BP-840293587-192.168.25.22-1499689510217/current/rbw/blk_1074111949 bytesAcked=53932032 bytesOnDisk=53932032 2017-08-22 15:01:11,619 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-840293587-192.168.25.22-1499689510217:blk_1074111949_371169, type=HAS_DOWNSTREAM_IN_PIPELINE java.io.EOFException: Premature EOF: no length prefix available at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2280) at org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:244) at org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.run(BlockReceiver.java:1237) at java.lang.Thread.run(Thread.java:748) 2017-08-22 15:01:11,630 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exception for BP-840293587-192.168.25.22-1499689510217:blk_1074111949_371169 java.net.SocketTimeoutException: 300 millis timeout while waiting for channel to be ready for read. 
ch : java.nio.channels.SocketChannel[connected local=/192.168.25.2:50010 remote=/192.168.25.2:48091] at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164) at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161) at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131) at java.io.BufferedInputStream.fill(BufferedInputStream.java:235) at java.io.BufferedInputStream.read1(BufferedInputStream.java:275) at java.io.BufferedInputStream.read(BufferedInputStream.java:334) at java.io.DataInputStream.read(DataInputStream.java:149) at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:199) at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:213) at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:134) at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109) at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:472) at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:849) at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:804) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74) at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:251) at java.lang.Thread.run(Thread.java:748) 2017-08-22 15:01:11,630 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-840293587-192.168.25.22-1499689510217:blk_1074111949_371169, type=HAS_DOWNSTREAM_IN_PIPELINE: Thread is interrupted. 2017-08-22 15:01:11,630 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-840293587-192.168.25.22-1499689510217:blk_1074111949_371169, type=HAS_DOWNSTREAM_IN_PIPELINE terminating 2017-08-22 15:01:11,630 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: opWriteBlock BP-840293587-192.168.25.22-1499689510217:blk_1074111949_371169 received exception java.net.SocketTimeoutException: 300 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/192.168.25.2:50010 remote=/192.168.25.2:48091] 2017-08-22 15:01:11,631 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: dn2:50010:DataXceiver error processing WRITE_BLOCK operation src: /192.168.25.2:48091 dst: /192.168.25.2:50010 java.net.SocketTimeoutException: 300 millis timeout while waiting for channel to be ready for read. 
ch : java.nio.channels.SocketChannel[connected local=/192.168.25.2:50010 remote=/192.168.25.2:48091] at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164) at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161) at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131) at java.io.BufferedInputStream.fill(BufferedInputStream.java:235) at java.io.BufferedInputStream.read1(BufferedInputStream.java:275) at java.io.BufferedInputStream.read(BufferedInputStream.java:334) at java.io.DataInputStream.read(DataInputStream.java:149) at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:199) at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:213) at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:134) at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109) at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:472) at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:849) at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:804) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74) at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:251) at java.lang.Thread.run(Thread.java:748) 2017-08-22 15:01:11,959 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-840293587-192.168.25.22-1499689510217:blk_1074111956_371170 src: /192.168.25.8:35958 dest: /192.168.25.2:50010 2017-08-22 15:01:11,982 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /192.168.25.8:35958, dest: /192.168.25.2:50010, bytes: 82, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_899288339_23, offset: 0, srvID: 075f3a14-9b13-404d-8ba8-4066212655d7, blockid: BP-840293587-192.168.25.22-1499689510217:blk_1074111956_371170, duration: 22005000 2017-08-22 15:01:11,982 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-840293587-192.168.25.22-1499689510217:blk_1074111956_371170, type=LAST_IN_PIPELINE, downstreams=0:[] terminating 2017-08-22 15:01:15,027 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-840293587-192.168.25.22-1499689510217:blk_1074111957_371172 src: /192.168.25.2:48097 dest: /192.168.25.2:50010 2017-08-22 15:01:15,037 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /192.168.25.2:48097, dest: /192.168.25.2:50010, bytes: 82, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_-1375829046_24, offset: 0, srvID: 075f3a14-9b13-404d-8ba8-4066212655d7, blockid: BP-840293587-192.168.25.22-1499689510217:blk_1074111957_371172, duration: 3663100 2017-08-22 15:01:15,038 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-840293587-192.168.25.22-1499689510217:blk_1074111957_371172, type=HAS_DOWNSTREAM_IN_PIPELINE terminating any help to determine what is the problem please?
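For reference, a minimal sketch of what I would check next: the 300 ms in these timeouts is far below the usual HDFS socket-timeout defaults, so I suspect a lowered setting, and the "Slow BlockReceiver write data to disk" warning points at disk latency (the property names are standard HDFS keys; iostat needs the sysstat package):
# Effective timeout settings as resolved on a datanode host
hdfs getconf -confKey dfs.datanode.socket.write.timeout
hdfs getconf -confKey dfs.client.socket-timeout
# Disk latency on the datanode while the put is running (requires sysstat)
iostat -x 5 3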
... View more
Labels:
- Apache Hadoop
08-15-2017
12:33 PM
I am using Apache Hadoop 2.7.1
and I have configured the datanode data directory to point to multiple directories:
<property>
<name>dfs.data.dir</name>
<value>/opt/hadoop/data_dir,file:///hdd/data_dir/</value>
<final>true</final>
</property>
My understanding of this configuration is that file data should be written to both directories,
/opt/hadoop/data_dir and file:///hdd/data_dir/, with the same block names in the same sub-directories. But in my cluster this is not happening: sometimes blocks are written to the
local directory /opt/hadoop/data_dir and sometimes to the external drive file:///hdd/data_dir. What are the possible reasons, and how can I control this behavior?
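For reference, a minimal sketch of how I inspect where the blocks of one file actually land (the HDFS path is hypothetical; the two local paths are the ones from my dfs.data.dir above):
# Which datanodes and blocks hold the file (hypothetical test path)
hdfs fsck /hadoophome/file1 -files -blocks -locations
# On a datanode, see which configured directory each block file landed in
find /opt/hadoop/data_dir /hdd/data_dir -name 'blk_*' | sort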
... View more
Labels:
- Apache Hadoop
08-09-2017
01:55 PM
I discovered that my problem was in the journal node, not in the namenode.
Even though the namenode log shows the error mentioned in the question, jps lists the journal node, but that is misleading: the journal node service is actually down
even though it still appears in the jps output. As a solution I issued hadoop-daemon.sh stop journalnode
and then hadoop-daemon.sh start journalnode, and after that the namenode started working again.
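For anyone hitting the same thing, a minimal sketch of how I now verify that the journal node is really serving and not just listed by jps (the default RPC port 8485 and default hadoop-daemon.sh log naming are assumed):
# Is the JournalNode process listed?
jps | grep JournalNode
# Is it actually listening on its RPC port? (8485 is the default)
ss -ltn | grep 8485
# Check the tail of the journal node log for errors (default log location assumed)
tail -n 50 $HADOOP_HOME/logs/hadoop-*-journalnode-*.log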
... View more
08-05-2017
08:12 AM
1 Kudo
I am using an Apache Hadoop 2.7.1 high-availability cluster that consists of
two namenodes, mn1 and mn2, and 3 journal nodes,
but while working on the cluster I ran into the following problem.
When I issue start-dfs.sh, mn1 is standby and mn2 is active,
but after that, if one of these two namenodes goes down, there is no way
to bring it back up again.
Here are the last lines of the log of one of these two namenodes:
2017-08-05 09:37:21,063 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Need to save fs image? false (staleImage=true, haEnabled=true, isRollingUpgrade=false)
2017-08-05 09:37:21,063 INFO org.apache.hadoop.hdfs.server.namenode.NameCache: initialized with 3 entries 72 lookups
2017-08-05 09:37:21,088 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Finished loading FSImage in 7052 msecs
2017-08-05 09:37:21,300 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: RPC server is binding to mn2:8020
2017-08-05 09:37:21,304 INFO org.apache.hadoop.ipc.CallQueueManager: Using callQueue class java.util.concurrent.LinkedBlockingQueue
2017-08-05 09:37:21,316 INFO org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 8020
2017-08-05 09:37:21,353 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered FSNamesystemState MBean
2017-08-05 09:37:21,354 WARN org.apache.hadoop.hdfs.server.common.Util: Path /opt/hadoop/metadata_dir should be specified as a URI in configuration files. Please update hdfs configuration.
2017-08-05 09:37:21,361 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: Failed to start namenode.
java.lang.IllegalStateException
at com.google.common.base.Preconditions.checkState(Preconditions.java:129)
at org.apache.hadoop.hdfs.server.namenode.LeaseManager.getNumUnderConstructionBlocks(LeaseManager.java:119)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getCompleteBlocksTotal(FSNamesystem.java:5741)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startCommonServices(FSNamesystem.java:1063)
at org.apache.hadoop.hdfs.server.namenode.NameNode.startCommonServices(NameNode.java:678)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:664)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:811)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:795)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1488)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1554)
2017-08-05 09:37:21,364 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2017-08-05 09:37:21,365 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at mn2/192.168.25.22
************************************************************/
... View more
- Tags:
- hadoop-core
Labels:
- Apache Hadoop
07-16-2017
08:40 AM
In order to analyze my problem, and because I am working on virtual machines, I cloned the second namenode VM (the namenode that behaves well) and changed its IP and hostname to match the first, slower namenode, so that both had the same network and software configuration, and I got the same results. Strangely, the issue seems tied to the hostname alone. To restate my problem: when mn1 is off and mn2 is active, everything is fine, but when mn2 is off and mn1 is active, mn1 responds more slowly because the datanodes keep sending requests to the hostname of the dead mn2, and I don't know why. This happens only for webhdfs requests, not for hdfs commands, even though my client application directs its requests to the active namenode.
... View more
07-10-2017
12:46 PM
That didn't work. Any other options?
... View more
07-09-2017
06:36 AM
I am using Apache Hadoop 2.7.1 with a ZooKeeper quorum to achieve automatic failover, on CentOS 7. Everything works, but I am facing a strange problem. Suppose my two namenodes are mn1 and mn2. When mn1 is off, automatic failover makes mn2 active; the client directs webhdfs requests to mn2 and performance is great. But in the reverse situation, when mn2 is off and mn1 becomes active and webhdfs requests are directed to mn1, performance gets stuck until mn2 is on again, even if we only turn on a mock mn2, i.e. a host with the same name and no services. Notice 1: mn1 and mn2 have the same configuration, and all cluster nodes are VMs connected to a virtual switch, so there are no network issues to worry about. Notice 2: there is no difference between the two namenodes when issuing hdfs commands; the difference shows up only for webhdfs interface requests (curl requests). What could be the possible reasons for this difference in performance between the two active namenodes?
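For reference, a minimal sketch of how I compare the two cases from a client machine, assuming a file that already exists in HDFS (the path /hadoophome/file1 is hypothetical; mn1 and mn2 are my namenode hostnames):
# Time the same webhdfs call against each namenode (only the active one will serve it)
time curl -s -o /dev/null "http://mn1:50070/webhdfs/v1/hadoophome/file1?op=OPEN&user.name=root"
time curl -s -o /dev/null "http://mn2:50070/webhdfs/v1/hadoophome/file1?op=OPEN&user.name=root"
# On a datanode, check how long resolving the dead namenode's hostname takes
time getent hosts mn2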
... View more
Labels:
- Apache Hadoop
04-17-2017
12:50 PM
I found the answer in this earlier question: https://community.hortonworks.com/questions/23046/question-on-hdfs-automatic-failover.html
... View more
04-17-2017
12:50 PM
I am using Apache Hadoop 2.7.1 on a cluster that consists of three nodes: nn1 (master namenode), nn2 (second namenode), and dn1 (datanode). I have configured high availability and a nameservice, and ZooKeeper is running on all three nodes,
started on nn2 as leader. First of all, I have to mention that
nn1 is active and nn2 is standby. When I kill the namenode process on nn1, nn2 becomes active, so automatic failover does happen. But in the following scenario (which I apply when nn1 is active and nn2 is standby): when I turn off nn1 entirely (the whole nn1 machine crashes), nn2 stays standby and does not become active, so automatic failover does not happen, with a noticeable error in the log: Unable to trigger a roll of the active NN (which was nn1 and is of course now down).
Shouldn't automatic failover still happen with the two remaining journal nodes on nn2 and dn1? What could be the possible reasons?
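For reference, a minimal sketch of the checks I run on nn2 while nn1 is powered off, assuming nn1 and nn2 are the namenode IDs from dfs.ha.namenodes and the default hadoop-daemon.sh log naming:
# Is the ZKFC daemon running on nn2?
jps | grep DFSZKFailoverController
# What HA state does each namenode report? (the call to the dead nn1 is expected to fail)
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2
# Look at the ZKFC log for fencing errors (default log location assumed)
tail -n 100 $HADOOP_HOME/logs/hadoop-*-zkfc-*.log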
... View more
Labels:
- Apache Hadoop
04-11-2017
12:36 PM
I am working on Apache Hadoop 2.7.1 and I have a cluster consisting of 3 nodes: nn1, nn2, and dn1. nn1 is the dfs.default.name, so it is the master namenode. I have installed HttpFS and started it, of course after restarting all the services. When nn1 is active and nn2 is standby I can send this request,
http://nn1:14000/webhdfs/v1/aloosh/oula.txt?op=open&user.name=root
from the browser, and a dialog to open or save the file appears. But when I kill the namenode on nn1 and start it again, high availability makes nn1 standby and nn2 active, so HttpFS should keep working even though nn1 is now standby. However, sending the same request now,
http://nn1:14000/webhdfs/v1/aloosh/oula.txt?op=open&user.name=root
gives me the error {"RemoteException":{"message":"Operation category READ is not supported in state standby","exception":"RemoteException","javaClassName":"org.apache.hadoop.ipc.RemoteException"}}
Shouldn't HttpFS work around nn1 being standby and still return the file? Is this because of a wrong configuration, or is there another reason? My core-site.xml has:
<property>
<name>hadoop.proxyuser.root.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.root.groups</name>
<value>*</value>
</property>
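For reference, a minimal sketch of how I exercise the same requests from the command line instead of the browser, before and after failover (hostnames are from my setup; the WebHDFS docs spell the op value in uppercase):
# Ask the HttpFS gateway (port 14000) for the file
curl -i "http://nn1:14000/webhdfs/v1/aloosh/oula.txt?op=OPEN&user.name=root"
# Compare with the plain WebHDFS endpoint of whichever namenode is currently active
curl -i "http://nn2:50070/webhdfs/v1/aloosh/oula.txt?op=OPEN&user.name=root"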
... View more
Labels:
- Apache Hadoop
04-11-2017
08:20 AM
I solved this problem by
issuing stop-all.sh, then start-all.sh, then httpfs.sh start, even
though I had applied this solution before but with stop-dfs.sh, then
start-dfs.sh, then httpfs.sh start, and I don't see the difference. But it
worked for me, so restarting all services solved my problem.
... View more
04-11-2017
06:14 AM
I am using Hadoop 2.7.1, not Azure, and this is the output of hadoop version: Hadoop 2.7.1
Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r 15ecc87ccf4a0228f35af08fc56de536e6ce657a
Compiled by jenkins on 2015-06-29T06:04Z
Compiled with protoc 2.5.0
From source with checksum fc0a1a23fc1868e4d5ee7fa2b28a58a
I have no SSL configured, and I am able to log in to my cluster. But is this call, http://192.168.4.128:14000/webhdfs/v1/hadoophome/myfile.txt/?op=open&user.name=root, the right call to send an HttpFS request from a Windows browser to the Hadoop cluster, or can I not make an HttpFS request from a Windows browser because it depends on the Tomcat server?
... View more
04-10-2017
02:12 PM
1 Kudo
I am using Apache Hadoop 2.7.1 and I have a 3-node cluster. I configured HttpFS and started it on the namenode (192.168.4.128). We know that we can make a WebHDFS request from the browser; for example, to open a file through WebHDFS we call the following URL from the browser:
http://192.168.4.128:50070/webhdfs/v1/hadoophome/myfile.txt/?user.name=root&op=OPEN
and the response is the save-or-open file dialog. But if we are using HttpFS, can we make an HttpFS request from the browser? If I call the following request from the browser,
http://192.168.4.128:14000/webhdfs/v1/hadoophome/myfile.txt/?op=open&user.name=root
I get the following error: {"RemoteException":{"message":"org.apache.hadoop.fs.FileSystem: Provider org.apache.hadoop.fs.azure.NativeAzureFileSystem not a subtype","exception":"ServiceConfigurationError","javaClassName":"java.util.ServiceConfigurationError"}}
And if I issue https://192.168.4.128:14000/webhdfs/v1/aloosh/oula.txt/?op=open&user.name=root
I get the error: An error occurred during a connection to 192.168.4.128:14000. SSL received a record that exceeded the maximum permissible length. Error code: SSL_ERROR_RX_RECORD_TOO_LONG
So can an HttpFS request be made from the browser? My core-site.xml has these HttpFS proxy-user properties:
<property>
<name>hadoop.proxyuser.root.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.root.groups</name>
<value>*</value>
</property>
... View more
Labels:
- Apache Hadoop
04-09-2017
10:28 AM
I am using Apache Hadoop 2.7.1 on CentOS 7 and I am new to Linux. I want to use HttpFS, so I should follow this link: https://hadoop.apache.org/docs/current/hadoop-hdfs-httpfs/ServerSetup.html. The problem is that I can't find any download for httpfs-3.0.0-alpha2.tar.gz
to untar. Please, is there any link or mirror that can help, and will that HttpFS version work with my existing Hadoop version?
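For context, a minimal sketch of what I would check first, since, as far as I understand (assuming a standard Hadoop 2.7.x binary tarball layout), HttpFS already ships inside the Hadoop distribution itself, so a separate download may not be needed:
# HttpFS bundled with the Hadoop 2.7.x binary distribution (paths assume a standard tarball layout)
ls $HADOOP_HOME/sbin/httpfs.sh
ls $HADOOP_HOME/share/hadoop/httpfs
# Start the bundled HttpFS server (listens on port 14000 by default)
$HADOOP_HOME/sbin/httpfs.sh start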
... View more
Labels:
- Apache Hadoop
04-05-2017
11:17 AM
I am using Apache Hadoop 2.7.1. After setting up high availability in the Hadoop cluster, the automatic ZooKeeper failover controller (ZKFC) applies a fencing method to fence (stop) one of the two namenodes if it goes down, and the dfs.ha.fencing.methods property in hdfs-site.xml selects this method as sshfence. My question is: if we have password-protected SSH, can fencing still happen, or does automatic failover work only with passwordless SSH? Is there any way to make sshfence include a password in its SSH configuration?
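For reference, a minimal sketch of setting up passwordless key-based SSH between the two namenodes for the user that runs the ZKFC, which is what sshfence with dfs.ha.fencing.ssh.private-key-files expects (the hostname nn2 and the root user are placeholders from my setup):
# On each namenode, generate a key without a passphrase for the fencing user
ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
# Copy the public key to the other namenode so sshfence can log in without a password
ssh-copy-id root@nn2
# Verify that no password prompt appears
ssh root@nn2 hostname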
... View more
Labels:
- Apache Hadoop
04-04-2017
05:47 AM
I am using Apache Hadoop 2.7.1 and I followed the link you provided, https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html, and finally tried to force one of the two namenodes to become active manually by running hdfs haadmin -transitionToActive hadoop-master, with the following response:
17/04/04 03:13:06 WARN ha.HAAdmin: Proceeding with manual HA state management even though automatic failover is enabled for NameNode at hadoop-slave-1/192.168.4.111:8020
17/04/04 03:13:07 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/04/04 03:13:07 WARN ha.HAAdmin: Proceeding with manual HA state management even though automatic failover is enabled for NameNode at hadoop-master/192.168.4.128:8020
Operation failed: End of File Exception between local host is: "hadoop-master/192.168.4.128"; destination host is: "hadoop-master":8020; : java.io.EOFException; For more details see: http://wiki.apache.org/hadoop/EOFException
What should I do with two standby namenodes? Should I apply a namenode format on one of them?
... View more
04-03-2017
02:04 PM
Running ./zkCli.sh on both namenodes shows the same error:
Welcome to ZooKeeper!
JLine support is enabled
[zk: localhost:2181(CONNECTING) 0] 2017-04-03 09:57:34,141 [myid:] - INFO [main-SendThread(127.0.0.1:2181):ClientCnxn$SendThread@1032] - Opening socket connection to server 127.0.0.1/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2017-04-03 09:57:34,148 [myid:] - WARN [main-SendThread(127.0.0.1:2181):ClientCnxn$SendThread@1162] - Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:744)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1141)
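For reference, a minimal sketch of how I check whether a ZooKeeper server is actually up and listening on each node; the Connection refused above usually means nothing is bound to port 2181 (the ruok four-letter command and nc are assumed to be available):
# Is the ZooKeeper server process running?
jps | grep QuorumPeerMain
# Is anything listening on the client port?
ss -ltn | grep 2181
# ZooKeeper's four-letter health check; a healthy server answers "imok"
echo ruok | nc localhost 2181
zkServer.sh status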
... View more
04-03-2017
01:29 PM
Even though I started the ZooKeeper server and get leader mode on one of the two namenodes and follower mode on the other namenode and the datanode, I still have the same problem: both namenodes are standby. Also, there are no log files under the log directory configured in zoo.cfg, so I can't see ZooKeeper's errors. But I think that when zkServer.sh status reports a state (follower or leader), it indicates that everything is all right with ZooKeeper, isn't it?
... View more
04-03-2017
01:01 PM
I have configured high availability in my cluster,
which consists of three nodes:
hadoop-master (192.168.4.128) (namenode)
hadoop-slave-1 (192.168.4.111) (another namenode)
hadoop-slave-2 (192.168.4.106) (datanode)
without formatting the namenode (converting a non-HA-enabled cluster to be HA-enabled), as described here:
https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html
But I ended up with both namenodes in standby,
so I tried to transition one of these two nodes to active by applying the following command:
hdfs haadmin -transitionToActive mycluster --forcemanual
with the following output:
17/04/03 08:07:35 WARN ha.HAAdmin: Proceeding with manual HA state management even though automatic failover is enabled for NameNode at hadoop-master/192.168.4.128:8020
17/04/03 08:07:36 WARN ha.HAAdmin: Proceeding with manual HA state management even though automatic failover is enabled for NameNode at hadoop-slave-1/192.168.4.111:8020
Illegal argument: Unable to determine service address for namenode 'mycluster'
My core-site.xml is:
<property>
<name>dfs.tmp.dir</name>
<value>/opt/hadoop/data15</value>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://hadoop-master:8020</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/usr/local/journal/node/local/data</value>
</property>
<property>
<name>fs.defaultFS</name>
<value>hdfs://mycluster</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/tmp</value>
</property>
My hdfs-site.xml is:
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.name.dir</name>
<value>/opt/hadoop/data16</value>
<final>true</final>
</property>
<property>
<name>dfs.data.dir</name>
<value>/opt/hadoop/data17</value>
<final>true</final>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>hadoop-slave-1:50090</value>
</property>
<property>
<name>dfs.nameservices</name>
<value>mycluster</value>
<final>true</final>
</property>
<property>
<name>dfs.ha.namenodes.mycluster</name>
<value>hadoop-master,hadoop-slave-1</value>
<final>true</final>
</property>
<property>
<name>dfs.namenode.rpc-address.mycluster.hadoop-master</name>
<value>hadoop-master:8020</value>
</property>
<property>
<name>dfs.namenode.rpc-address.mycluster.hadoop-slave-1</name>
<value>hadoop-slave-1:8020</value>
</property>
<property>
<name>dfs.namenode.http-address.mycluster.hadoop-master</name>
<value>hadoop-master:50070</value>
</property>
<property>
<name>dfs.namenode.http-address.mycluster.hadoop-slave-1</name>
<value>hadoop-slave-1:50070</value>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://hadoop-master:8485;hadoop-slave-2:8485;hadoop-slave-1:8485/mycluster</value>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>hadoop-master:2181,hadoop-slave-1:2181,hadoop-slave-2:2181</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>root/.ssh/id_rsa</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.connect-timeout</name>
<value>3000</value>
</property>
What should the service address value be, and what possible solutions can I apply in order
to bring one of the two namenodes to the active state? Note: the ZooKeeper server on all three nodes is stopped.
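For reference, a minimal sketch based on the hdfs haadmin usage as I understand it: -transitionToActive takes a namenode ID from dfs.ha.namenodes.mycluster (hadoop-master or hadoop-slave-1 in my hdfs-site.xml above), not the nameservice name:
# Query the state of each namenode by its namenode ID
hdfs haadmin -getServiceState hadoop-master
hdfs haadmin -getServiceState hadoop-slave-1
# Manually transition one namenode ID to active (--forcemanual bypasses the automatic-failover warning)
hdfs haadmin -transitionToActive hadoop-master --forcemanual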
... View more
Labels:
- Apache Hadoop
12-14-2016
09:49 AM
Thanks for your brilliant, detailed answer.
... View more
12-14-2016
09:48 AM
The source is a SQL Server table and the destination is a Hive table. I haven't configured any permissions in Hadoop yet, so my problem comes down to PolyBase's limit on inserted rows. Thank you.
... View more
12-13-2016
02:09 PM
2 Kudos
When we use PolyBase, which is a SQL Server 2016 feature,
and add an external table that points to a table in Hive,
and we want to insert data into this external table (thereby inserting the data into the associated Hive table),
my question is:
is there any limit on the maximum number of records inserted into the external table?
I mean, if I am inserting data into the external table from another SQL Server table that has more than 30000 records,
I encounter this error:
Cannot execute the query "Remote Query" against OLE DB provider "SQLNCLI11" for linked server "SQLNCLI11". 110802;An internal DMS error occurred that caused this operation to fail. Details: Exception: Microsoft.SqlServer.DataWarehouse.DataMovement.Common.ExternalAccess.HdfsAccessException, Message: Java exception raised on call to HdfsBridge_DestroyRecordWriter: Error [0
at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.getDatanodeStorageInfos(DatanodeManager.java:513)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.updatePipelineInternal(FSNamesystem.java:6379)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.updatePipeline(FSNamesystem.java:6344)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.updatePipeline(NameNodeRpcServer.java:822)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.updatePipeline(ClientNamenodeProtocolServerSideTranslatorPB.java:971)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
] occurred while accessing external file.
Inserting fewer than 30000 records works fine and the data lands in Hive.
Could this error be due to one of the following reasons?
1- There is a limit on the number of records that can be inserted into the external table.
2- There is a limit in the PolyBase configuration.
3- Some other problem in Hive.
... View more
Labels:
- Apache Hadoop
- Apache Hive
12-08-2016
10:55 AM
Another question, please: what is the benefit of installing the Hive server on nodes other than the namenode? If we choose another node for the Hive server rather than the namenode, will Hive commands be handled from that node, or should we install Hive on the namenode first?
... View more
12-08-2016
08:57 AM
I don't know a lot about Hive.
I have explored many tutorials about Hive, and all of them talk about Hive command syntax,
but I want to ask about the ecosystem itself.
If we have a cluster and install the Hive service on the namenode only,
then create a table in Hive and insert 10 records into it:
Is the Hive table going to be replicated across all the cluster's datanodes when the replication factor includes all datanodes?
Or is it going to exist only on the namenode, with no replication?
Should Hive be installed on all cluster nodes?
Is automatic replication only for HDFS files and not for Hive?
Is a Hive table equal to an HDFS file?
How is a Hive table represented, and how do I find this table when working
with HDFS if we didn't specify a location in its creation statement?
Are the blocks stored for a Hive table readable if we explore them the way we explore HDFS files?
Can you give me links please?
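To make the question concrete, here is a minimal sketch of where I would expect a managed Hive table's data to show up in HDFS, assuming the default warehouse directory /user/hive/warehouse and a hypothetical table named mytable in the default database (please correct me if this is wrong):
# Files backing a managed table created without a LOCATION clause (default warehouse path assumed)
hdfs dfs -ls /user/hive/warehouse/mytable
# How those files are split into blocks and replicated across datanodes
hdfs fsck /user/hive/warehouse/mytable -files -blocks -locations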
... View more
- Tags:
- Hadoop Core
- Hive
Labels:
- Apache Hive
12-08-2016
08:51 AM
We know that Hadoop's main purpose is to increase performance by adding more datanodes. But my question is: if we only want to retrieve the data, without needing to process or analyze it, will adding more datanodes still be useful, or does it not increase performance at all, since we have retrieve operations only, without any computations or MapReduce jobs?
... View more
- Tags:
- hadoop
- Hadoop Core
Labels:
- Apache Hadoop
07-30-2016
11:49 AM
In my searching I always encounter these tools: Hive, Hue, Sqoop, and each
one has a specific installation procedure and requirements: a specific operating
system version, a specific Hadoop version, a specific environment to work
with such as Cloudera or Ambari. But I am still not able to understand
the relationship between these tools. I mean, is Hive part of Hue, or can it
be a standalone tool for importing data from and exporting data to SQL
Server? Is Sqoop another tool for data processing, and can it be a
standalone tool? I would like someone to explain this Hadoop infrastructure
and the relationships between these tools, which I can't
find directly while searching Google. What is the best tool for importing
data from and exporting data to SQL Server?
... View more
- Tags:
- hadoop
- Hadoop Core
Labels:
- Apache Hadoop