Member since: 02-10-2016
Posts: 36
Kudos Received: 14
Solutions: 0
12-22-2017
11:15 AM
Thank you
12-19-2017
01:31 PM
Thanks for the response. I wanted to know whether the memory assignment could be done without providing these values while submitting jobs.
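For example, something along these lines in the cluster-wide mapred-site.xml is what I have in mind, so that jobs pick up defaults when nothing is passed at submission time (these are the standard MapReduce container-size properties; the values are only illustrative):
mapreduce.map.memory.mb=2048
mapreduce.reduce.memory.mb=4096
mapreduce.map.java.opts=-Xmx1638m
mapreduce.reduce.java.opts=-Xmx3276m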
12-18-2017
01:03 PM
Labels:
- Apache Hadoop
06-29-2017
11:34 AM
The DataNode process is running if I do a check on the machine using ps -ef, but Ambari incorrectly shows the DataNode process as stopped.
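This is roughly how I'm checking on the host; the PID file path assumes a default HDP layout and may differ, and my understanding that Ambari's status check relies on that PID file is an assumption on my part:
ps -ef | grep -i '[d]atanode'                         # shows the running DataNode JVM
cat /var/run/hadoop/hdfs/hadoop-hdfs-datanode.pid     # PID file I believe Ambari checks for status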
06-27-2017
01:13 PM
In the Ambari UI, the DataNode is in the stopped state a few seconds after starting it. As mentioned in the earlier reply, the newly added nodes are also listed by the hdfs fsck command, though Ambari doesn't recognize the addition.
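For reference, these are the commands I'm using to confirm that the new nodes are actually registered with the NameNode (output trimmed, since fsck on / is large):
hdfs dfsadmin -report                                 # live/dead DataNodes as the NameNode sees them
hdfs fsck / -files -blocks -locations | head -n 50    # the new nodes show up in the block locations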
06-27-2017
10:49 AM
I'm trying to add 2 new DataNodes to an existing HDP 2.3 cluster through Ambari. The existing 36 DataNodes each have 10 CPUs, 56 GB RAM, and 8.5 TB of disk; the DataNode heap size is set to 1 GB. The 2 new nodes to be added have 6 CPUs, 25 GB RAM, and 1 TB of disk. HDFS disk usage is 7%. I'm able to start the NodeManager and Ambari Metrics services on the new nodes, but the DataNode service goes down immediately after starting. Below are the logs from hadoop-hdfs-datanode-worker1.log:
2017-06-27 12:07:30,047 INFO datanode.DataNode (BPServiceActor.java:blockReport(488)) - Successfully sent block report 0x2235b2b47bf3a, containing 1 storage report(s), of which we sent 1. The reports had 19549 total blocks and used 1 RPC(s). This took 10 msec to generate and 695 msecs for RPC and NN processing. Got back no commands.
2017-06-27 12:07:36,003 ERROR datanode.DataNode (DataXceiver.java:run(278)) - worker1.bigdata.net.net:50010:DataXceiver error processing unknown operation src: /10.255.yy.yy:49656 dst: /10.255.xx.xx:50010
java.io.EOFException
at java.io.DataInputStream.readShort(DataInputStream.java:315)
at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.readOp(Receiver.java:58)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:227)
at java.lang.Thread.run(Thread.java:745)
2017-06-27 12:08:00,180 INFO datanode.DataNode (DataXceiver.java:writeBlock(655)) - Receiving BP-1320493910-10.255.zz.zz-1479412973603:blk_1100238956_26515824 src: /10.254.yy.yy:45293 dest: /10.255.xx.xx:50010
2017-06-27 12:08:00,326 INFO DataNode.clienttrace (BlockReceiver.java:finalizeBlock(1432)) - src: /10.254.yy.yy:45293, dest: /10.255.xx.xx:50010, bytes: 26872748, op: HDFS_WRITE, cliID: DFSClient_attempt_1498498030455_0521_r_000001_0_-908535141_1, offset: 0, srvID: f148bbe2-8f2a-489b-b03d-c8322aecd43e, blockid: BP-1320493910-10.255.zz.zz-1479412973603:blk_1100238956_26515824, duration: 122445075
2017-06-27 12:08:00,326 INFO datanode.DataNode (BlockReceiver.java:run(1405)) - PacketResponder: BP-1320493910-10.255.12.202-1479412973603:blk_1100238956_26515824, type=HAS_DOWNSTREAM_IN_PIPELINE terminating
Thanks in advance.
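In case it helps, these are the files I'm watching on the new nodes; the paths assume the default HDP log directory, and the .out file name is my guess based on the .log name quoted above:
tail -n 200 /var/log/hadoop/hdfs/hadoop-hdfs-datanode-worker1.log    # the DataNode log quoted above
tail -n 50 /var/log/hadoop/hdfs/hadoop-hdfs-datanode-worker1.out     # JVM startup/ulimit errors usually land here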
Labels:
- Apache Ambari
- Apache Hadoop
06-16-2017
10:58 AM
I have a multi-tenant HDP 2.3 cluster. It has been configured with an S3 endpoint in a custom hdfs-site.xml. Is it possible to add another S3 endpoint for another tenant? If so, what should the property name be?
Thanks in Advance.
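For context, the existing endpoint is configured roughly like this, assuming the s3a connector is in use; the per-bucket form below is only my guess at what a second tenant would need (the bucket name is made up), and I believe it requires a newer s3a client than the one shipped with HDP 2.3:
fs.s3a.endpoint=s3.eu-west-1.amazonaws.com
fs.s3a.bucket.tenant2-data.endpoint=s3.us-east-1.amazonaws.com    # per-bucket override, assumption only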
Labels:
- Hortonworks Data Platform (HDP)
06-15-2016
07:08 AM
1 Kudo
I have an HDP 2.0 cluster where I'm executing a MapReduce program that takes a Hive (0.14) table as input. There are a large number of small files in the Hive table, and hence a large number of mapper containers are being requested. Please let me know if there is a way to combine small files before they are input to the MapReduce job.
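To illustrate what I'm after, these are the split-combining settings I'm aware of; the values are only illustrative, and I'm not certain they apply to a plain MapReduce program reading the table's files rather than a Hive query:
hive.merge.mapfiles=true                                    # merge small output files when the table is written by Hive
hive.merge.smallfiles.avgsize=134217728
hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat
mapreduce.input.fileinputformat.split.maxsize=268435456     # caps each combined split if the job uses a CombineFileInputFormat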
Labels:
- Apache Hive
- Apache YARN
04-14-2016
10:17 AM
Thanks for the suggestions. Two of the data nodes in the cluster had to be replaced, as they didn't have enough disk space. I have also set the property below in the HDFS configuration, and the jobs started executing fine, even though I noticed the "Premature EOF" error in the DataNode logs.
dfs.client.block.write.replace-datanode-on-failure.policy=ALWAYS
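For completeness, this is roughly what went into the client-side HDFS configuration; the companion "enable" property is my assumption of the related setting (I believe it already defaults to true), while the policy value is the one quoted above:
dfs.client.block.write.replace-datanode-on-failure.enable=true
dfs.client.block.write.replace-datanode-on-failure.policy=ALWAYS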
04-12-2016
12:48 PM
I'm trying to execute a MapReduce streaming job in a 10-node Hadoop cluster (HDP 2.2). There are 5 DataNodes in the cluster. When the reduce phase reaches almost 100% completion, I'm getting the below error in the client logs:
Error: java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[x.x.x.x:50010], original=[x.x.x.x:50010]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
The DataNode on which the jobs were executing contained the logs below:
INFO datanode.DataNode (BlockReceiver.java:run(1222)) - PacketResponder: BP-203711345-10.254.65.246-1444744156994:blk_1077645089_3914844, type=HAS_DOWNSTREAM_IN_PIPELINE
java.io.EOFException: Premature EOF: no length prefix available
at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2203)
java.io.IOException: Premature EOF from inputStream
at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:194)
2016-04-10 08:12:14,477 WARN datanode.DataNode (BlockReceiver.java:run(1256)) - IOException in BlockReceiver.run(): java.io.IOException: Connection reset by peer
2016-04-10 08:13:22,431 INFO datanode.DataNode (BlockReceiver.java:receiveBlock(816)) - Exception for BP-203711345-x.x.x.x-1444744156994:blk_1077645082_3914836
java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/XX.XXX.XX.XX:50010 remote=/XX.XXX.XX.XXX:57649]
The NameNode logs contained the below warning:
WARN blockmanagement.BlockPlacementPolicy (BlockPlacementPolicyDefault.java:chooseTarget(383)) - Failed to place enough replicas, still in need of 1 to reach 2 (unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=false) For more information, please enable DEBUG log level on org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
I have tried setting the below parameters in hdfs-site.xml:
dfs.datanode.handler.count=10
dfs.client.file-block-storage-locations.num-threads=10
dfs.datanode.socket.write.timeout=20000
But the error still persists. Kindly suggest a solution. Thanks.
Labels:
- Labels:
-
Apache Hadoop