Member since: 03-03-2016
Posts: 69
Kudos Received: 46
Solutions: 0
11-15-2016
10:53 AM
1 Kudo
Hi, I have already tested my HDFS HA cluster with my Java application (a simple file upload/download). Now that I have configured a federated cluster, how can I use the same code with the necessary modifications? Here is my code example for the HDFS HA cluster; please suggest the changes needed for the federated HA cluster. The problem, I think, is that my Configuration object is coming up null when I get the FileSystem object for the federated cluster.

conf.set("fs.defaultFS", "hdfs://HadoopTestHA");
conf.set("dfs.replication","4");
conf.set("dfs.ha.namenodes.HadoopTestHA", "nn1,nn2");
conf.set("dfs.namenode.rpc-address.HadoopTestHA.nn1","hadoop4ind.india:8020");
conf.set("dfs.namenode.rpc-address.HadoopTestHA.nn2","hadoop5ind.india:8020");
conf.set("dfs.client.failover.proxy.provider.HadoopTestHA","org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider");
copyFromLocalHDFS(conf,"Fed_Java_Test"); // File uploading code
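For comparison, here is a minimal sketch of what the client configuration might look like for a federated cluster with two HA nameservices. The second nameservice (HadoopTestHA2), its NameNode hosts (hadoop6ind.india, hadoop7ind.india), and the ViewFS mount points are hypothetical names for illustration, not values from this cluster:

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

Configuration conf = new Configuration();
// Federation: declare every nameservice the client should know about.
conf.set("dfs.nameservices", "HadoopTestHA,HadoopTestHA2");
// HA settings for the first nameservice, as in the code above.
conf.set("dfs.ha.namenodes.HadoopTestHA", "nn1,nn2");
conf.set("dfs.namenode.rpc-address.HadoopTestHA.nn1", "hadoop4ind.india:8020");
conf.set("dfs.namenode.rpc-address.HadoopTestHA.nn2", "hadoop5ind.india:8020");
conf.set("dfs.client.failover.proxy.provider.HadoopTestHA",
        "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider");
// HA settings for the second (hypothetical) nameservice.
conf.set("dfs.ha.namenodes.HadoopTestHA2", "nn3,nn4");
conf.set("dfs.namenode.rpc-address.HadoopTestHA2.nn3", "hadoop6ind.india:8020");
conf.set("dfs.namenode.rpc-address.HadoopTestHA2.nn4", "hadoop7ind.india:8020");
conf.set("dfs.client.failover.proxy.provider.HadoopTestHA2",
        "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider");

// Option A: talk to one namespace directly via its nameservice URI.
FileSystem fs1 = FileSystem.get(URI.create("hdfs://HadoopTestHA/"), conf);

// Option B: a ViewFS mount table so both namespaces appear as one.
conf.set("fs.defaultFS", "viewfs://ClusterX/");
conf.set("fs.viewfs.mounttable.ClusterX.link./dataA", "hdfs://HadoopTestHA/dataA");
conf.set("fs.viewfs.mounttable.ClusterX.link./dataB", "hdfs://HadoopTestHA2/dataB");
FileSystem viewFs = FileSystem.get(conf);

With ViewFS, a path like /dataA resolves to the first nameservice and /dataB to the second, so the same upload code can run unchanged against the viewfs:// default filesystem.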
Labels: Apache Hadoop
10-21-2016
07:58 AM
Hi, for a testing scenario I had about one million files under-replicated in my HDFS cluster. They were re-replicated successfully, but I want to know which nodes participate in this operation. What are the roles of the NameNode and the DataNodes in this re-replication process?
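As a note on the mechanics (a sketch, not from this thread): the NameNode is the component that detects under-replication and schedules new replicas, while the DataNodes perform the actual block copies between themselves. One way to watch the NameNode's view of this is its JMX servlet; the snippet below assumes an unsecured NameNode web UI at hadoop4ind.india:50070, a hypothetical address:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;

public class UnderReplicatedCheck {
    public static void main(String[] args) throws Exception {
        // The FSNamesystem bean exposes the NameNode's UnderReplicatedBlocks counter.
        URL url = new URL("http://hadoop4ind.india:50070/jmx"
                + "?qry=Hadoop:service=NameNode,name=FSNamesystem");
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(url.openStream()))) {
            String line;
            while ((line = in.readLine()) != null) {
                if (line.contains("UnderReplicatedBlocks")) {
                    System.out.println(line.trim());
                }
            }
        }
    }
}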
Labels: Apache Hadoop
10-11-2016
01:19 PM
Point 1: I can't run cat /etc/hadoop/conf/dfs.exclude because it's not a file; I am using Ambari to manage the HDFS cluster. Point 2: The size is low now, but the issue was happening when HDFS usage was around 20 GB and I was uploading files of at most 1 MB each.
10-11-2016
11:20 AM
@Sagar Shimpi: Please have a look at the attached screenshot. On the second point, the value of dfs.hosts.exclude is as follows:

<property>
<name>dfs.hosts.exclude</name>
<value>/etc/hadoop/conf/dfs.exclude</value>
</property>
10-11-2016
06:10 AM
@David Streever: Can you please have a look at this? I have attached my DataNode details in the comment above, and I am stuck here; no more files are being uploaded to any of the DataNodes. I have also observed that if both DataNodes are up and running, the process doesn't throw an exception, but if one of them goes down, this exception is thrown after some time. How do I resolve this? Is it the case that if one DataNode stops communicating with the other for a few minutes, the whole upload process stops after that time?
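One possibility worth checking (a sketch under the assumption that the write pipeline is failing because no replacement DataNode exists on a two-node cluster): by default the HDFS client tries to replace a failed pipeline DataNode, and with only two nodes there is no replacement, so nodes get excluded until the write fails. The client-side properties below are real HDFS settings that relax that policy:

import org.apache.hadoop.conf.Configuration;

Configuration conf = new Configuration();
// Keep writing with the surviving pipeline nodes instead of aborting
// when no replacement DataNode is available (sensible only on tiny clusters).
conf.set("dfs.client.block.write.replace-datanode-on-failure.enable", "true");
conf.set("dfs.client.block.write.replace-datanode-on-failure.policy", "NEVER");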
10-10-2016
02:34 PM
Adding my DataNode health status as a comment:

[hdfs@hadoop3ind1 root]$ hdfs dfsadmin -report
Configured Capacity: 315522809856 (293.85 GB)
Present Capacity: 288754439992 (268.92 GB)
DFS Remaining: 285925896192 (266.29 GB)
DFS Used: 2828543800 (2.63 GB)
DFS Used%: 0.98%
Under replicated blocks: 1475
Blocks with corrupt replicas: 1
Missing blocks: 0
Missing blocks (with replication factor 1): 0
-------------------------------------------------
Live datanodes (2):
Name: 192.168.11.47:50010 (hadoop3ind1.india)
Hostname: hadoop3ind1.india
Decommission Status : Normal
Configured Capacity: 157761404928 (146.93 GB)
DFS Used: 1563369472 (1.46 GB)
Non DFS Used: 11485380608 (10.70 GB)
DFS Remaining: 144712654848 (134.77 GB)
DFS Used%: 0.99%
DFS Remaining%: 91.73%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 2
Last contact: Mon Oct 10 20:02:06 IST 2016

Name: 192.168.11.45:50010 (hadoop1ind1.india)
Hostname: hadoop1ind1.india
Decommission Status : Normal
Configured Capacity: 157761404928 (146.93 GB)
DFS Used: 1265174328 (1.18 GB)
Non DFS Used: 15282989256 (14.23 GB)
DFS Remaining: 141213241344 (131.52 GB)
DFS Used%: 0.80%
DFS Remaining%: 89.51%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 2
Last contact: Mon Oct 10 20:02:05 IST 2016
10-10-2016
01:58 PM
Hi, I am facing an issue while uploading files from the local file system to HDFS using the Java API method:

fs.copyFromLocalFile(new Path("D://50GBTtest"), new Path("/50GBTest"));

After around 1,000 files had uploaded, I got the following exception:

Oct 10, 2016 6:45:34 PM org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer nextBlockOutputStream
INFO: Abandoning BP-1106394772-192.168.11.45-1476099033336:blk_1073743844_3020
Oct 10, 2016 6:45:34 PM org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer nextBlockOutputStream
INFO: Excluding datanode 192.168.11.45:50010
Oct 10, 2016 6:45:34 PM org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer run
WARNING: DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /50GBTest/50GBTtest/disk10544.doc could only be replicated to 0 nodes instead of minReplication (=1). There are 2 datanode(s) running and 2 node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1640)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getNewBlockTargets(FSNamesystem.java:3161)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3085)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:830)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:500)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2273)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2269)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2267)
at org.apache.hadoop.ipc.Client.call(Client.java:1411)
at org.apache.hadoop.ipc.Client.call(Client.java:1364)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
at com.sun.proxy.$Proxy7.addBlock(Unknown Source)
at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy7.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:368)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1449)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1270)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:526)
I have checked that one DataNode is running and has plenty of free space. I have deliberately kept the other DataNode down for a test scenario. Does anybody have ideas, or has anyone faced this issue before?
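As a sanity check (a sketch, not from the original post), the client can ask the NameNode how many DataNodes it actually considers live before uploading, via DistributedFileSystem.getDataNodeStats():

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.DatanodeInfo;

// List the live DataNodes and their remaining capacity, to confirm
// how many targets exist for the write pipeline.
Configuration conf = new Configuration();
FileSystem fs = FileSystem.get(conf);
if (fs instanceof DistributedFileSystem) {
    DistributedFileSystem dfs = (DistributedFileSystem) fs;
    for (DatanodeInfo dn : dfs.getDataNodeStats()) {
        System.out.println(dn.getHostName()
                + " remaining=" + dn.getRemaining() + " bytes");
    }
}

If this prints only one live node, a failed pipeline DataNode cannot be replaced, and the write's fate depends on the client's pipeline-failure policy.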
Labels: Apache Hadoop
10-07-2016
06:05 AM
The previous comment was for the NameNode restart; here I am showing the DataNode restart after allocating more memory to Java.

Traceback (most recent call last):
File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/datanode.py", line 167, in <module>
DataNode().execute()
File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 219, in execute
method(env)
File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 530, in restart
self.start(env, upgrade_type=upgrade_type)
File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/datanode.py", line 62, in start
datanode(action="start")
File "/usr/lib/python2.6/site-packages/ambari_commons/os_family_impl.py", line 89, in thunk
return fn(*args, **kwargs)
File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_datanode.py", line 72, in datanode
create_log_dir=True
File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/utils.py", line 267, in service
Execute(daemon_cmd, not_if=process_id_exists_command, environment=hadoop_env_exports)
File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 154, in __init__
self.env.run()
File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 160, in run
self.run_action(resource, action)
File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 124, in run_action
provider_action()
File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 238, in action_run
tries=self.resource.tries, try_sleep=self.resource.try_sleep)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 70, in inner
result = function(command, **kwargs)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 92, in checked_call
tries=tries, try_sleep=try_sleep)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 140, in _call_wrapper
result = _call(command, **kwargs_copy)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 291, in _call
raise Fail(err_msg)
resource_management.core.exceptions.Fail: Execution of 'ambari-sudo.sh su hdfs -l -s /bin/bash -c 'ulimit -c unlimited ; /usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh --config /usr/hdp/current/hadoop-client/conf start datanode'' returned 1. starting datanode, logging to /var/log/hadoop/hdfs/hadoop-hdfs-datanode-hadoop1ind1.india.out
@Pradeep kumar: Can you please have a look at both logs and help me with extending my current HDFS storage?
10-07-2016
05:59 AM
I have done all the steps you gave above, and I am now facing an issue while restarting the HDFS service. The log is attached below.

Traceback (most recent call last):
File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/datanode.py", line 167, in <module>
DataNode().execute()
File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 219, in execute
method(env)
File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 530, in restart
self.start(env, upgrade_type=upgrade_type)
File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/datanode.py", line 62, in start
datanode(action="start")
File "/usr/lib/python2.6/site-packages/ambari_commons/os_family_impl.py", line 89, in thunk
return fn(*args, **kwargs)
File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_datanode.py", line 72, in datanode
create_log_dir=True
File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/utils.py", line 267, in service
Execute(daemon_cmd, not_if=process_id_exists_command, environment=hadoop_env_exports)
File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 154, in __init__
self.env.run()
File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 160, in run
self.run_action(resource, action)
File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 124, in run_action
provider_action()
File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 238, in action_run
tries=self.resource.tries, try_sleep=self.resource.try_sleep)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 70, in inner
result = function(command, **kwargs)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 92, in checked_call
tries=tries, try_sleep=try_sleep)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 140, in _call_wrapper
result = _call(command, **kwargs_copy)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 291, in _call
raise Fail(err_msg)
resource_management.core.exceptions.Fail: Execution of 'ambari-sudo.sh su hdfs -l -s /bin/bash -c 'ulimit -c unlimited ; /usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh --config /usr/hdp/current/hadoop-client/conf start datanode'' returned 1. starting datanode, logging to /var/log/hadoop/hdfs/hadoop-hdfs-datanode-hadoop1ind1.india.out
Java HotSpot(TM) 64-Bit Server VM warning: INFO: os::commit_memory(0x00000000bc800000, 864026624, 0) failed; error='Cannot allocate memory' (errno=12)
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (malloc) failed to allocate 864026624 bytes for committing reserved memory.
# An error report file with more information is saved as:
# /var/log/hadoop/hdfs/hs_err_pid51884.log

Can you please take a look and tell me what exactly went wrong?
10-05-2016
05:17 AM
No, this is not just for learning purposes; I am doing research for my organization so that we can take this into production later as well. Thanks @Arun