Member since: 04-27-2016
Posts: 14
Kudos Received: 19
Solutions: 0
01-24-2023
05:34 AM
@bvishal, as this is an older post, you would have a better chance of receiving a resolution by starting a new thread. A new thread is also an opportunity to provide details specific to your environment, which could help others give you a more accurate answer. You can link this thread as a reference in your new post.
05-10-2016
05:11 PM
2 Kudos
Here is a great writeup on file compression in Hadoop - http://comphadoop.weebly.com/
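If you just want to try compression out, here is a minimal sketch of enabling compressed MapReduce output from the command line. The jar and class names are placeholders, and passing -D properties this way assumes the job uses ToolRunner/GenericOptionsParser:

  # Compress job output with Snappy (assumes the Snappy native libs are installed on the cluster)
  hadoop jar my-job.jar MyJob \
    -D mapreduce.output.fileoutputformat.compress=true \
    -D mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.SnappyCodec \
    in/ out/

Snappy is just one choice; the writeup above goes into the trade-offs between the codecs, including which ones are splittable.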
05-10-2016
01:42 PM
Hi Ed, it would be useful to know whether you are aiming for HA or for performance. Since it is a small cluster, you may be using it as a POC and not care much about HA. One option not mentioned below is going with 3 masters and 3 slaves in a small HA cluster setup. That allows you to balance services across the masters more evenly and/or dedicate one of them to act mostly as an edge node; if security is a concern, that may come in handy. Cheers, Christian
10-31-2016
11:03 PM
Hi @azeltov, I am trying to install RStudio on the Hortonworks Sandbox 2.5 and I am running into this exception at the verify-installation step: initctl: Unable to connect to Upstart: Failed to connect to socket /com/ubuntu/upstart: Connection refused. I have tried starting and stopping the RStudio server, and it shows the same message. PS: Since the sandbox runs in a Docker container and port 8787 is not open, I have configured /etc/rstudio/rserver.conf to use port 9000.
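In case it helps, this is roughly what I have. The port change is the only non-default line; running rserver directly is a workaround I have seen suggested for containers without a working Upstart, not something from the official install steps:

  # /etc/rstudio/rserver.conf
  www-port=9000

  # Inside the container, bypass the init script and launch the server binary directly:
  /usr/lib/rstudio-server/bin/rserver
  # (add --server-daemonize=0 to keep it in the foreground)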
01-03-2019
01:25 PM
1 Kudo
Hi, I'd like to share a situation we encountered where 99% of our HDFS blocks were reported missing and we were able to recover them.

We had a system with 2 NameNodes with high availability enabled. For some reason, under the data folders of the DataNodes, i.e. /data0x/hadoop/hdfs/data/current, there were 2 block pool folders listed (an example of such a folder is BP-1722964902-1.10.237.104-1541520732855). One folder contained the IP of namenode1 and the other contained the IP of namenode2. All the data was under the block pool of namenode1, but in the VERSION files of the NameNodes (/data0x/hadoop/hdfs/namenode/current/) the block pool ID and the namespace ID were those of namenode2, so the NameNode was looking for blocks in the wrong block pool folder. I don't know how we got to the point of having 2 block pool folders, but we did.

To fix the problem and get HDFS healthy again, we just needed to update the VERSION file on all the NameNode disks (on both NN machines) and on all the JournalNode disks (on all JN machines) to point to namenode1. We then restarted HDFS and made sure all the blocks were reported and there were no more missing blocks.
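For anyone hitting something similar, a minimal sketch of how to spot this mismatch, assuming the same /data0x disk layout as above (adjust the paths to your environment):

  # List the block pool folders actually present on a DataNode disk:
  ls /data0x/hadoop/hdfs/data/current/

  # Compare with the block pool the NameNode expects:
  cat /data0x/hadoop/hdfs/namenode/current/VERSION
  # The relevant fields look like:
  #   namespaceID=<id>
  #   blockpoolID=BP-1722964902-1.10.237.104-1541520732855

If the blockpoolID and namespaceID in the VERSION files do not match the block pool folder that actually holds the data, that is the mismatch described above: edit those fields in the VERSION files on every NameNode and JournalNode disk, then restart HDFS.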