Member since: 03-03-2016
Posts: 69
Kudos Received: 46
Solutions: 0
11-15-2016
10:53 AM
1 Kudo
Hi, I have already tested my HDFS HA cluster with my Java application (a simple file upload/download). Now that I have configured a federated cluster, how can I use the same code with the necessary modifications? Here is my code example for the HDFS HA cluster; please suggest the changes needed for the federated HA cluster. The problem, I think, is that my Configuration object is coming up null when I get the FileSystem object for the federated cluster.

conf.set("fs.defaultFS", "hdfs://HadoopTestHA");
conf.set("dfs.replication","4");
conf.set("dfs.ha.namenodes.HadoopTestHA", "nn1,nn2");
conf.set("dfs.namenode.rpc-address.HadoopTestHA.nn1","hadoop4ind.india:8020");
conf.set("dfs.namenode.rpc-address.HadoopTestHA.nn2","hadoop5ind.india:8020");
conf.set("dfs.client.failover.proxy.provider.HadoopTestHA","org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider");
copyFromLocalHDFS(conf,"Fed_Java_Test"); // File uploading code
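For comparison, here is a minimal sketch of what the client configuration might look like for a federated cluster with two HA nameservices. The second nameservice (HadoopTestHA2), its NameNode hosts (hadoop6ind.india, hadoop7ind.india), and the ViewFS mount points are hypothetical names for illustration, not values from this cluster:

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

Configuration conf = new Configuration();
// Federation: declare every nameservice the client should know about.
conf.set("dfs.nameservices", "HadoopTestHA,HadoopTestHA2");
// HA settings for the first nameservice, as in the code above.
conf.set("dfs.ha.namenodes.HadoopTestHA", "nn1,nn2");
conf.set("dfs.namenode.rpc-address.HadoopTestHA.nn1", "hadoop4ind.india:8020");
conf.set("dfs.namenode.rpc-address.HadoopTestHA.nn2", "hadoop5ind.india:8020");
conf.set("dfs.client.failover.proxy.provider.HadoopTestHA",
        "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider");
// HA settings for the second (hypothetical) nameservice.
conf.set("dfs.ha.namenodes.HadoopTestHA2", "nn3,nn4");
conf.set("dfs.namenode.rpc-address.HadoopTestHA2.nn3", "hadoop6ind.india:8020");
conf.set("dfs.namenode.rpc-address.HadoopTestHA2.nn4", "hadoop7ind.india:8020");
conf.set("dfs.client.failover.proxy.provider.HadoopTestHA2",
        "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider");

// Option A: talk to one namespace directly via its nameservice URI.
FileSystem fs1 = FileSystem.get(URI.create("hdfs://HadoopTestHA/"), conf);

// Option B: a ViewFS mount table so both namespaces appear as one.
conf.set("fs.defaultFS", "viewfs://ClusterX/");
conf.set("fs.viewfs.mounttable.ClusterX.link./dataA", "hdfs://HadoopTestHA/dataA");
conf.set("fs.viewfs.mounttable.ClusterX.link./dataB", "hdfs://HadoopTestHA2/dataB");
FileSystem viewFs = FileSystem.get(conf);

With ViewFS, a path like /dataA resolves to the first nameservice and /dataB to the second, so the same upload code can run unchanged against the viewfs:// default filesystem.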
Labels: Apache Hadoop
10-21-2016
07:58 AM
Hi, for a testing scenario I had about one million files under-replicated in my HDFS cluster. They were re-replicated successfully, but I want to know which nodes participate in this operation. What are the roles of the NameNode and the DataNodes in this re-replication process?
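As a note on the mechanics (a sketch, not from this thread): the NameNode is the component that detects under-replication and schedules new replicas, while the DataNodes perform the actual block copies between themselves. One way to watch the NameNode's view of this is its JMX servlet; the snippet below assumes an unsecured NameNode web UI at hadoop4ind.india:50070, a hypothetical address:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;

public class UnderReplicatedCheck {
    public static void main(String[] args) throws Exception {
        // The FSNamesystem bean exposes the NameNode's UnderReplicatedBlocks counter.
        URL url = new URL("http://hadoop4ind.india:50070/jmx"
                + "?qry=Hadoop:service=NameNode,name=FSNamesystem");
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(url.openStream()))) {
            String line;
            while ((line = in.readLine()) != null) {
                if (line.contains("UnderReplicatedBlocks")) {
                    System.out.println(line.trim());
                }
            }
        }
    }
}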
Labels: Apache Hadoop
10-11-2016
01:19 PM
Point 1: I can't run cat /etc/hadoop/conf/dfs.exclude because it's not a file; I am using Ambari to manage the HDFS cluster. Point 2: The size is low now, but the issue was happening when HDFS usage was around 20 GB and I was uploading files of at most 1 MB each.
10-11-2016
11:20 AM
@Sagar Shimpi: Please have a look at the attached screenshot. On the second point, the value of dfs.hosts.exclude is as follows:

<property>
<name>dfs.hosts.exclude</name>
<value>/etc/hadoop/conf/dfs.exclude</value>
</property>
10-11-2016
06:10 AM
@David Streever: Can you please have a look at this? I have attached my DataNode details in the comment above, and I am stuck here; no more files are being uploaded to any of the DataNodes. I have also observed that if both DataNodes are up and running, the process doesn't throw an exception, but if one of them goes down, this exception is thrown after some time. How do I resolve this? Is it the case that if one DataNode stops communicating with the other for a few minutes, the whole upload process stops after that time?
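One possibility worth checking (a sketch under the assumption that the write pipeline is failing because no replacement DataNode exists on a two-node cluster): by default the HDFS client tries to replace a failed pipeline DataNode, and with only two nodes there is no replacement, so nodes get excluded until the write fails. The client-side properties below are real HDFS settings that relax that policy:

import org.apache.hadoop.conf.Configuration;

Configuration conf = new Configuration();
// Keep writing with the surviving pipeline nodes instead of aborting
// when no replacement DataNode is available (sensible only on tiny clusters).
conf.set("dfs.client.block.write.replace-datanode-on-failure.enable", "true");
conf.set("dfs.client.block.write.replace-datanode-on-failure.policy", "NEVER");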
10-10-2016
02:34 PM
Adding my DataNode health status as a comment:

[hdfs@hadoop3ind1 root]$ hdfs dfsadmin -report
Configured Capacity: 315522809856 (293.85 GB)
Present Capacity: 288754439992 (268.92 GB)
DFS Remaining: 285925896192 (266.29 GB)
DFS Used: 2828543800 (2.63 GB)
DFS Used%: 0.98%
Under replicated blocks: 1475
Blocks with corrupt replicas: 1
Missing blocks: 0
Missing blocks (with replication factor 1): 0
-------------------------------------------------
Live datanodes (2):
Name: 192.168.11.47:50010 (hadoop3ind1.india)
Hostname: hadoop3ind1.india
Decommission Status : Normal
Configured Capacity: 157761404928 (146.93 GB)
DFS Used: 1563369472 (1.46 GB)
Non DFS Used: 11485380608 (10.70 GB)
DFS Remaining: 144712654848 (134.77 GB)
DFS Used%: 0.99%
DFS Remaining%: 91.73%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 2
Last contact: Mon Oct 10 20:02:06 IST 2016

Name: 192.168.11.45:50010 (hadoop1ind1.india)
Hostname: hadoop1ind1.india
Decommission Status : Normal
Configured Capacity: 157761404928 (146.93 GB)
DFS Used: 1265174328 (1.18 GB)
Non DFS Used: 15282989256 (14.23 GB)
DFS Remaining: 141213241344 (131.52 GB)
DFS Used%: 0.80%
DFS Remaining%: 89.51%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 2
Last contact: Mon Oct 10 20:02:05 IST 2016
10-10-2016
01:58 PM
Hi, I am facing an issue while uploading files from the local file system to HDFS using the Java API method:

fs.copyFromLocalFile(new Path("D://50GBTtest"), new Path("/50GBTest"));

After around 1,000 files had uploaded, I got the following exception:

Oct 10, 2016 6:45:34 PM org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer nextBlockOutputStream
INFO: Abandoning BP-1106394772-192.168.11.45-1476099033336:blk_1073743844_3020
Oct 10, 2016 6:45:34 PM org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer nextBlockOutputStream
INFO: Excluding datanode 192.168.11.45:50010
Oct 10, 2016 6:45:34 PM org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer run
WARNING: DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /50GBTest/50GBTtest/disk10544.doc could only be replicated to 0 nodes instead of minReplication (=1). There are 2 datanode(s) running and 2 node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1640)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getNewBlockTargets(FSNamesystem.java:3161)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3085)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:830)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:500)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2273)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2269)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2267)
at org.apache.hadoop.ipc.Client.call(Client.java:1411)
at org.apache.hadoop.ipc.Client.call(Client.java:1364)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
at com.sun.proxy.$Proxy7.addBlock(Unknown Source)
at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy7.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:368)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1449)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1270)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:526)
I have checked that one DataNode is running and has plenty of free space. I have deliberately kept the other DataNode down for a test scenario. Does anybody have ideas, or has anyone faced this issue before?
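As a sanity check (a sketch, not from the original post), the client can ask the NameNode how many DataNodes it actually considers live before uploading, via DistributedFileSystem.getDataNodeStats():

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.DatanodeInfo;

// List the live DataNodes and their remaining capacity, to confirm
// how many targets exist for the write pipeline.
Configuration conf = new Configuration();
FileSystem fs = FileSystem.get(conf);
if (fs instanceof DistributedFileSystem) {
    DistributedFileSystem dfs = (DistributedFileSystem) fs;
    for (DatanodeInfo dn : dfs.getDataNodeStats()) {
        System.out.println(dn.getHostName()
                + " remaining=" + dn.getRemaining() + " bytes");
    }
}

If this prints only one live node, a failed pipeline DataNode cannot be replaced, and the write's fate depends on the client's pipeline-failure policy.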
Labels: Apache Hadoop
10-07-2016
06:05 AM
The previous comment was for the NameNode restart; here I am showing the DataNode restart after allocating more memory to Java.

Traceback (most recent call last):
File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/datanode.py", line 167, in <module>
DataNode().execute()
File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 219, in execute
method(env)
File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 530, in restart
self.start(env, upgrade_type=upgrade_type)
File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/datanode.py", line 62, in start
datanode(action="start")
File "/usr/lib/python2.6/site-packages/ambari_commons/os_family_impl.py", line 89, in thunk
return fn(*args, **kwargs)
File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_datanode.py", line 72, in datanode
create_log_dir=True
File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/utils.py", line 267, in service
Execute(daemon_cmd, not_if=process_id_exists_command, environment=hadoop_env_exports)
File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 154, in __init__
self.env.run()
File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 160, in run
self.run_action(resource, action)
File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 124, in run_action
provider_action()
File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 238, in action_run
tries=self.resource.tries, try_sleep=self.resource.try_sleep)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 70, in inner
result = function(command, **kwargs)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 92, in checked_call
tries=tries, try_sleep=try_sleep)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 140, in _call_wrapper
result = _call(command, **kwargs_copy)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 291, in _call
raise Fail(err_msg)
resource_management.core.exceptions.Fail: Execution of 'ambari-sudo.sh su hdfs -l -s /bin/bash -c 'ulimit -c unlimited ; /usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh --config /usr/hdp/current/hadoop-client/conf start datanode'' returned 1. starting datanode, logging to /var/log/hadoop/hdfs/hadoop-hdfs-datanode-hadoop1ind1.india.out
@Pradeep kumar: Can you please have a look at both logs and help me with extending my current HDFS storage?
10-07-2016
05:59 AM
I have done all the steps you gave above, and I am now facing an issue while restarting the HDFS service. The log is attached below.

Traceback (most recent call last):
File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/datanode.py", line 167, in <module>
DataNode().execute()
File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 219, in execute
method(env)
File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 530, in restart
self.start(env, upgrade_type=upgrade_type)
File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/datanode.py", line 62, in start
datanode(action="start")
File "/usr/lib/python2.6/site-packages/ambari_commons/os_family_impl.py", line 89, in thunk
return fn(*args, **kwargs)
File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_datanode.py", line 72, in datanode
create_log_dir=True
File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/utils.py", line 267, in service
Execute(daemon_cmd, not_if=process_id_exists_command, environment=hadoop_env_exports)
File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 154, in __init__
self.env.run()
File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 160, in run
self.run_action(resource, action)
File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 124, in run_action
provider_action()
File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 238, in action_run
tries=self.resource.tries, try_sleep=self.resource.try_sleep)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 70, in inner
result = function(command, **kwargs)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 92, in checked_call
tries=tries, try_sleep=try_sleep)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 140, in _call_wrapper
result = _call(command, **kwargs_copy)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 291, in _call
raise Fail(err_msg)
resource_management.core.exceptions.Fail: Execution of 'ambari-sudo.sh su hdfs -l -s /bin/bash -c 'ulimit -c unlimited ; /usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh --config /usr/hdp/current/hadoop-client/conf start datanode'' returned 1. starting datanode, logging to /var/log/hadoop/hdfs/hadoop-hdfs-datanode-hadoop1ind1.india.out
Java HotSpot(TM) 64-Bit Server VM warning: INFO: os::commit_memory(0x00000000bc800000, 864026624, 0) failed; error='Cannot allocate memory' (errno=12)
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (malloc) failed to allocate 864026624 bytes for committing reserved memory.
# An error report file with more information is saved as:
# /var/log/hadoop/hdfs/hs_err_pid51884.log

Can you please take a look and tell me what exactly went wrong?
10-05-2016
05:17 AM
No, this is not just for learning purposes; I am doing research for my organization so that we can take this into production later as well. Thanks @Arun