Member since: 02-10-2016
Posts: 36
Kudos Received: 14
Solutions: 0
12-22-2017
11:15 AM
Thank you
12-19-2017
01:31 PM
Thanks for the response. I wanted to know if the memory assignment could be done without providing these while submitting jobs.
12-18-2017
01:03 PM
Labels:
- Apache Hadoop
06-16-2017
10:58 AM
I have a multi-tenant HDP 2.3 cluster. It has been configured with an S3 end-point in the custom hdfs-site.xml. Is it possible to add another S3 end-point for another tenant? If so, what should the property name be?
Thanks in advance.
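One possible direction, sketched below under the assumption that the cluster uses the S3A connector: newer Hadoop releases (2.8 and later, so possibly not the Hadoop version bundled with HDP 2.3) support per-bucket overrides of the form fs.s3a.bucket.<bucket>.<option>, which would let each tenant's bucket point at its own endpoint. The bucket names below are hypothetical, and the same keys could equally be placed in the cluster's XML configuration.

import org.apache.hadoop.conf.Configuration;

public class PerTenantS3aEndpoints {
    public static void main(String[] args) {
        Configuration conf = new Configuration();

        // Cluster-wide default S3A endpoint.
        conf.set("fs.s3a.endpoint", "s3.amazonaws.com");

        // Hypothetical per-tenant buckets, each with its own endpoint override.
        // Pattern (Hadoop 2.8+): fs.s3a.bucket.<bucket-name>.<option>
        conf.set("fs.s3a.bucket.tenant-a-data.endpoint", "s3.eu-west-1.amazonaws.com");
        conf.set("fs.s3a.bucket.tenant-b-data.endpoint", "s3.us-west-2.amazonaws.com");

        System.out.println(conf.get("fs.s3a.bucket.tenant-a-data.endpoint"));
    }
}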
Labels:
- Hortonworks Data Platform (HDP)
06-15-2016
07:08 AM
1 Kudo
I have an HDP 2.0 cluster where I'm executing a MapReduce program which takes a Hive (0.14) table as input. There are a large number of small files for the Hive table, and hence a large number of mapper containers are being requested. Please let me know if there is a way to combine the small files before they are input to the MapReduce job.
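A minimal sketch of one common approach, assuming the Hive table is stored as plain text files: switch the job to CombineTextInputFormat so that many small files are packed into each split (and hence fewer mapper containers are requested). The input/output paths and the split size here are hypothetical.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.CombineTextInputFormat;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class CombineSmallFilesDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "combine-small-files");
        job.setJarByClass(CombineSmallFilesDriver.class);

        // Pack many small files into each split; ~256 MB per split is illustrative.
        job.setInputFormatClass(CombineTextInputFormat.class);
        CombineTextInputFormat.setMaxInputSplitSize(job, 256L * 1024 * 1024);

        // Hypothetical paths for the Hive table's warehouse directory and the output.
        FileInputFormat.addInputPath(job, new Path("/apps/hive/warehouse/my_table"));
        FileOutputFormat.setOutputPath(job, new Path("/tmp/combine-output"));

        job.setOutputKeyClass(LongWritable.class);
        job.setOutputValueClass(Text.class);

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}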
Labels:
- Apache Hive
- Apache YARN
04-14-2016
10:17 AM
Thanks for the suggestions. Two of the data nodes in the cluster had to be replaced, as they didn't have enough disk space. I have also set the property below in the HDFS configuration, and the jobs started executing fine, even though I have still noticed the "Premature EOF" error in the data node logs:
dfs.client.block.write.replace-datanode-on-failure.policy=ALWAYS
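For reference, a minimal sketch of how these client-side pipeline-recovery properties can be supplied programmatically on a job's Configuration, equivalent to setting them in hdfs-site.xml as above; the accompanying .enable property is the standard HDFS client setting and defaults to true.

import org.apache.hadoop.conf.Configuration;

public class PipelineRecoverySettings {
    public static void main(String[] args) {
        Configuration conf = new Configuration();

        // Enable datanode replacement on pipeline failure (default: true).
        conf.setBoolean("dfs.client.block.write.replace-datanode-on-failure.enable", true);

        // ALWAYS: always attempt to replace a failed datanode in the write pipeline
        // (other accepted values are DEFAULT and NEVER).
        conf.set("dfs.client.block.write.replace-datanode-on-failure.policy", "ALWAYS");

        System.out.println(conf.get("dfs.client.block.write.replace-datanode-on-failure.policy"));
    }
}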
04-12-2016
12:48 PM
I'm trying to execute a MapReduce streaming job on a 10-node Hadoop cluster (HDP 2.2). There are 5 datanodes in the cluster. When the reduce phase reaches almost 100% completion, I'm getting the below error in the client logs:

Error: java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[x.x.x.x:50010], original=[x.x.x.x:50010]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.

The data node on which the jobs were executing contained the below logs:

INFO datanode.DataNode (BlockReceiver.java:run(1222)) - PacketResponder: BP-203711345-10.254.65.246-1444744156994:blk_1077645089_3914844, type=HAS_DOWNSTREAM_IN_PIPELINE
java.io.EOFException: Premature EOF: no length prefix available
  at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2203)
java.io.IOException: Premature EOF from inputStream
  at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:194)
2016-04-10 08:12:14,477 WARN datanode.DataNode (BlockReceiver.java:run(1256)) - IOException in BlockReceiver.run(): java.io.IOException: Connection reset by peer
2016-04-10 08:13:22,431 INFO datanode.DataNode (BlockReceiver.java:receiveBlock(816)) - Exception for BP-203711345-x.x.x.x-1444744156994:blk_1077645082_3914836
java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/XX.XXX.XX.XX:50010 remote=/XX.XXX.XX.XXX:57649]

The NameNode logs contained the below warning:

WARN blockmanagement.BlockPlacementPolicy (BlockPlacementPolicyDefault.java:chooseTarget(383)) - Failed to place enough replicas, still in need of 1 to reach 2 (unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=false) For more information, please enable DEBUG log level on org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy

I had tried setting the below parameters in hdfs-site.xml:

dfs.datanode.handler.count=10
dfs.client.file-block-storage-locations.num-threads=10
dfs.datanode.socket.write.timeout=20000

But the error still persists. Kindly suggest a solution. Thanks.
Labels:
- Apache Hadoop
03-22-2016
06:00 AM
I have upgraded to Hadoop 2.7 now. I have made the configuration changes for s3a, and the queries are executing successfully. Thank you.
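For anyone following along, a hedged sketch of the core s3a properties such a configuration change typically involves, shown programmatically here; the same keys go into core-site.xml or hdfs-site.xml, and the credential values are placeholders.

import org.apache.hadoop.conf.Configuration;

public class S3aBasicConfig {
    public static void main(String[] args) {
        Configuration conf = new Configuration();

        // Bind the s3a:// scheme to the S3A filesystem implementation
        // (already present in core-default.xml on recent Hadoop versions; shown for clarity).
        conf.set("fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem");

        // Placeholder credentials; in practice prefer a credential provider or IAM roles.
        conf.set("fs.s3a.access.key", "YOUR_ACCESS_KEY");
        conf.set("fs.s3a.secret.key", "YOUR_SECRET_KEY");
    }
}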
02-26-2016
06:49 AM
1 Kudo
Though I have not yet upgraded to Hadoop 2.7, I made the configuration changes for s3a as per the documentation. On executing a Hive CREATE query, I got the below exception:
FAILED: AmazonClientException Unable to execute HTTP request: Connect to hive-bucket.s3.amazonaws.com:443 timed out
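A connect timeout to the S3 endpoint on port 443 often points to outbound connectivity rather than the Hadoop configuration itself. Below is a hedged sketch of S3A knobs commonly involved in that situation on Hadoop 2.7+ (proxy settings for clusters without direct internet access, plus connection timeouts); the proxy host and port values are hypothetical.

import org.apache.hadoop.conf.Configuration;

public class S3aConnectivitySettings {
    public static void main(String[] args) {
        Configuration conf = new Configuration();

        // Hypothetical HTTP proxy for clusters with no direct outbound internet access.
        conf.set("fs.s3a.proxy.host", "proxy.example.com");
        conf.setInt("fs.s3a.proxy.port", 8080);

        // Socket timeouts (milliseconds) used by the S3A client.
        conf.setInt("fs.s3a.connection.establish.timeout", 5000);
        conf.setInt("fs.s3a.connection.timeout", 200000);
    }
}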
02-22-2016
10:09 AM
1 Kudo
@Artem Ervits I copied jets3t.properties to all the data nodes. Currently I'm getting the below exception:
org.apache.hadoop.fs.s3.S3Exception: org.jets3t.service.ServiceException: S3 Error Message. -- ResponseCode: 403, ResponseStatus: Forbidden, XML Error Message: <?xml version="1.0" encoding="UTF-8"?><Error><Code>AccessDenied</Code><Message>Access Denied</Message><Resource>/hive-bucket</Resource><RequestId></RequestId></Error>
  at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.processException(Jets3tNativeFileSystemStore.java:470)
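A 403 from S3 is usually a credentials or bucket-permission problem rather than connectivity. A minimal sketch, assuming the s3n connector implied by the Jets3t stack trace, of where the s3n credential properties go; the key values are placeholders, and the bucket's IAM or bucket policy would also need to allow the caller.

import org.apache.hadoop.conf.Configuration;

public class S3nCredentials {
    public static void main(String[] args) {
        Configuration conf = new Configuration();

        // Placeholder credentials for the s3n:// connector; the same keys can be
        // set in core-site.xml instead of in code.
        conf.set("fs.s3n.awsAccessKeyId", "YOUR_ACCESS_KEY");
        conf.set("fs.s3n.awsSecretAccessKey", "YOUR_SECRET_KEY");
    }
}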