Member since: 11-17-2016
Posts: 63
Kudos Received: 7
Solutions: 5
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2160 | 11-23-2017 10:50 AM |
| | 4777 | 05-12-2017 02:13 PM |
| | 15506 | 01-11-2017 04:20 PM |
| | 8567 | 01-06-2017 04:03 PM |
| | 6318 | 01-06-2017 03:49 PM |
05-11-2017 11:58 AM
What I did:
1. Increased the memory of the NN.
2. Increased the disk of the overall cluster.
3. Increased dfs.blocksize from 64 MB to 128 MB.
4. Increased the block count threshold.
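For reference, a minimal way to double-check the block size change from the command line (just a sketch; the file path below is hypothetical):

```bash
# Effective block size for newly written files, in bytes (134217728 = 128 MB).
hdfs getconf -confKey dfs.blocksize

# Block layout of an existing file, to confirm what size its blocks were written with
# (/user/hdfs/sample.txt is only an example path).
hdfs fsck /user/hdfs/sample.txt -files -blocks
```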
05-11-2017 11:54 AM
Hi All, I have a 3-node Cloudera cluster running Cloudera 5.9. I want to build a web crawler and therefore want to install Apache Nutch. Can anyone please guide me on how to install it on an existing Hadoop cluster (Hadoop version 2.6.0)? I have downloaded the tar from http://www.apache.org/dyn/closer.lua/nutch/2.3.1/apache-nutch-2.3.1-src.tar.gz and extracted the folder, but when I go inside, I only see these files:
[hdfs@X.X.X.X bin]$ pwd
/var/lib/hadoop-hdfs/nutch/apache-nutch-2.3.1/src/bin
[hdfs@X.X.X.X bin]$ ll
total 20
-rwxr-xr-x 1 hdfs hadoop 5453 Jan 10 2016 crawl
-rwxr-xr-x 1 hdfs hadoop 8801 Jan 10 2016 nutch
[hdfs@X.X.X.X apache-nutch-2.3.1]$ ll
total 488
-rw-r--r-- 1 hdfs hadoop 46132 Jan 10 2016 build.xml
-rw-r--r-- 1 hdfs hadoop 82375 Jan 10 2016 CHANGES.txt
drwxr-xr-x 2 hdfs hadoop 4096 May 11 13:23 conf
-rw-r--r-- 1 hdfs hadoop 4903 Jan 10 2016 default.properties
drwxr-xr-x 3 hdfs hadoop 4096 Jan 10 2016 docs
drwxr-xr-x 2 hdfs hadoop 4096 May 11 13:23 ivy
drwxr-xr-x 3 hdfs hadoop 4096 Jan 10 2016 lib
-rw-r--r-- 1 hdfs hadoop 329066 Jan 10 2016 LICENSE.txt
-rw-r--r-- 1 hdfs hadoop 429 Jan 10 2016 NOTICE.txt
drwxr-xr-x 9 hdfs hadoop 4096 Jan 10 2016 src

Thanks, Shilpa
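For context, the src tarball normally has to be built before it can be run; a minimal sketch, assuming Apache Ant is installed on the node and that the Gora storage backend required by Nutch 2.x (e.g. HBase) is set up separately:

```bash
# Build Nutch from the extracted source tree; Ivy downloads the dependencies and
# the build produces runtime/local (standalone) and runtime/deploy (Hadoop job) layouts.
cd /var/lib/hadoop-hdfs/nutch/apache-nutch-2.3.1
ant runtime

# After the build, the runnable scripts and job artifacts appear under runtime/.
ls runtime/local/bin runtime/deploy
```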
Labels:
- Apache Solr
- HDFS
05-05-2017 10:08 AM
After increasing the heap size of the DataNode role on my NN, I have not seen .hprof files being created. The issue is resolved. Thanks @mathieu.d
05-04-2017 10:12 AM
The Java heap for the DataNode was 1GB on all 3 DNs, so I changed the heap for only the NN's DataNode to 3GB. I also changed the OOM heap dump path back to /tmp, just to see whether .hprof files are still being generated. Thanks, Shilpa
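For context, the dump-on-OOM behaviour being toggled here is controlled by standard HotSpot JVM flags (the exact Cloudera Manager property used to set them is not shown in this thread); a hedged sketch:

```bash
# Standard JVM flags behind the heap-dump setting (normally placed in the
# DataNode Java options in Cloudera Manager):
#   -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp

# Quick check for whether new dumps are still appearing under /tmp:
ls -lh /tmp/*.hprof 2>/dev/null
```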
05-04-2017 09:51 AM
Thanks @mathieu.d. As a workaround, I have changed the heap dump path to /dev/null for the DataNode. I will check what you have suggested and get back, as my NN also has a DataNode role.
05-03-2017 01:28 PM
Ok. Thanks.
05-03-2017 10:20 AM
My current main configs after increasing the memory are:
- yarn.nodemanager.resource.memory-mb: 12 GB
- yarn.scheduler.maximum-allocation-mb: 16 GB
- mapreduce.map.memory.mb: 4 GB
- mapreduce.reduce.memory.mb: 4 GB
- mapreduce.map.java.opts.max.heap: 3 GB
- mapreduce.reduce.java.opts.max.heap: 3 GB
- namenode_java_heapsize: 6 GB
- secondarynamenode_java_heapsize: 6 GB
- dfs_datanode_max_locked_memory: 3 GB
- dfs.blocksize: 128 MB

Do you think I should change something?
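To cross-check, the values the cluster is actually running with can be read back from the deployed client configuration; a small sketch (the /etc/hadoop/conf path is the usual CDH location and is an assumption here):

```bash
# HDFS-side values can be read back directly by property name.
hdfs getconf -confKey dfs.datanode.max.locked.memory

# YARN/MapReduce limits as present in the deployed client config.
grep -A1 'yarn.nodemanager.resource.memory-mb' /etc/hadoop/conf/yarn-site.xml
grep -A1 'yarn.scheduler.maximum-allocation-mb' /etc/hadoop/conf/yarn-site.xml
```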
05-03-2017 10:14 AM
Hi All, I have a 3-node cluster running on CentOS 6.7. The NameNode has been facing an issue with .hprof files in the /tmp directory, leading to 100% disk usage on the / mount. The owner of these files is hdfs:hadoop. I know an .hprof file is created when there is a heap dump of the process at the time of a failure, which is typically seen in scenarios with "java.lang.OutOfMemoryError". Hence I increased the RAM of my NN from 56GB to 112GB. My configs are:
- yarn.nodemanager.resource.memory-mb: 12 GB
- yarn.scheduler.maximum-allocation-mb: 16 GB
- mapreduce.map.memory.mb: 4 GB
- mapreduce.reduce.memory.mb: 4 GB
- mapreduce.map.java.opts.max.heap: 3 GB
- mapreduce.reduce.java.opts.max.heap: 3 GB
- namenode_java_heapsize: 6 GB
- secondarynamenode_java_heapsize: 6 GB
- dfs_datanode_max_locked_memory: 3 GB

The DataNode log on the NN has the error below, but it is also present on the other DNs (on all 3 nodes, basically):
2017-05-03 10:03:17,914 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DataNode{data=FSDataset{dirpath='[/bigdata/dfs/dn/current]'}, localName='XXXX.azure.com:50010', datanodeUuid='4ea75665-b223-4456-9308-1defcad54c89', xmitsInProgress=0}:Exception transfering block BP-939287337-X.X.X.4-1484085163925:blk_1077604623_3864267 to mirror X.X.X.5:50010: java.net.SocketTimeoutException: 65000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/X.X.X.4:43801 remote=X.X.X.5:50010]
2017-05-03 10:03:17,922 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: XXXX.azure.com:50010:DataXceiver error processing WRITE_BLOCK operation src: /X.X.X.4:53902 dst: /X.X.X.4:50010
java.net.SocketTimeoutException: 65000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/X.X.X.4:43801 remote=/X.X.X.5:50010]
at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:118)
at java.io.FilterInputStream.read(FilterInputStream.java:83)
at java.io.FilterInputStream.read(FilterInputStream.java:83)
at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2241)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:743)
at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
at java.lang.Thread.run(Thread.java:745)
2017-05-03 10:04:52,371 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: XXXX.azure.com:50010:DataXceiver error processing WRITE_BLOCK operation src: /X.X.X.4:54258 dst: /X.X.X.4:50010
java.io.IOException: Premature EOF from inputStream
at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:201)
at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:213)
at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:134)
at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109)
at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:500)
at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:896)

The log shows these errors even at night or early in the morning when nothing is running. My cluster is used for fetching webpage info with wget and then processing the data with SparkR. Apart from this, I am also getting "block count more than threshold" alerts, for which I have another thread: http://community.cloudera.com/t5/Storage-Random-Access-HDFS/Datanodes-report-block-count-more-than-threshold-on-datanode-and/m-p/54170#M2851 Please help! I am worried about my cluster.

Cluster configs (after recent upgrades):
- NN: RAM 112GB, 16 cores, disk 500GB
- DN1: RAM 56GB, 8 cores, disk 400GB
- DN2: RAM 28GB, 4 cores, disk 400GB

Thanks, Shilpa
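Until the root cause is fixed, a hedged housekeeping sketch for keeping /tmp from filling up with heap dumps (review the file list before deleting anything; the one-day retention below is only an example):

```bash
# See which heap dumps are consuming the / mount and how large they are.
find /tmp -maxdepth 1 -name '*.hprof' -exec ls -lh {} \;
df -h /

# Remove dumps older than a day once they are no longer needed for analysis.
find /tmp -maxdepth 1 -name '*.hprof' -mtime +1 -delete
```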
Labels:
- Apache Spark
- HDFS
05-03-2017 10:10 AM
Hi @saranvisa and @Fawze, I have increased the RAM of my NN from 56GB to 112GB and increased the heap and Java memory settings in the configs. However, I still face the issue. Due to some client work I cannot increase the disk until this Friday. I will update once I do that. Thanks
04-26-2017 02:28 PM
Thanks @saranvisa. I tried to rebalance, but it did not help. The number of blocks on each node is too high and my threshold is 500,000. Do you think adding some disk to the server and then raising the threshold would help?
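For reference, a minimal sketch of the rebalance and block-count checks being discussed (run as the hdfs superuser; the 10% threshold is only an example value, not taken from this thread):

```bash
# Per-DataNode capacity and usage report.
sudo -u hdfs hdfs dfsadmin -report

# Rebalance until no DataNode deviates more than 10% from the cluster average
# (note: the balancer evens out space usage, not block counts).
sudo -u hdfs hdfs balancer -threshold 10

# Directory and file counts help locate where large numbers of small files
# (and hence blocks) are accumulating.
sudo -u hdfs hdfs dfs -count /user/*
```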