Member since: 12-26-2016
Posts: 15
Kudos Received: 1
Solutions: 1
My Accepted Solutions
Title | Views | Posted
--- | --- | ---
 | 2926 | 08-19-2018 03:43 AM
08-20-2018 02:57 PM
1 Kudo
@rabbit, the path is already defined in Ambari (HDFS -> Configs -> DataNode directories). This is the path you defined for HDFS to write block data to the actual disk location on each data node. In your case, this path must have been defined in Ambari as /data/hadoop-data/dn/ - under this, HDFS creates the remaining folders, starting with "current". Please check your Ambari -> HDFS properties and confirm. I hope this helps you.
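If it helps, here is a quick way to confirm what the DataNodes are actually using; this is only a sketch, and the /data/hadoop-data/dn path is taken from your case:
# Print the configured DataNode data directories (dfs.datanode.data.dir)
$ hdfs getconf -confKey dfs.datanode.data.dir
# On a data node, HDFS lays out its block-pool folders under each configured directory
$ ls /data/hadoop-data/dn/current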
08-19-2018 01:48 PM
Hi, please double-check the HiveServer2 port: if you want to connect to HS2 in HTTP mode, the default port is 10001. Also make sure you are connecting to the right HS2 server, i.e. one that is actually running in HTTP mode.
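For reference, a typical Beeline connection string for HTTP transport mode looks roughly like the following; the host name is a placeholder and the httpPath may differ in your setup:
# Connect to HiveServer2 over HTTP transport (default port 10001, default httpPath "cliservice")
$ beeline -u 'jdbc:hive2://<hs2-host>:10001/default;transportMode=http;httpPath=cliservice'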
08-19-2018 03:43 AM
One is a large environment, 20+ PB in size, and its data is completely different from the other environment's data. The reasons for separate lakes are that they belong to different internal departments, the data is different, and the customers are also different. Depending on the data, these clusters' servers are located in different data centers; one is open to the company-wide enterprise network, and the others are open only to an internal network within the enterprise network.
08-18-2018 10:01 PM
No. Once you bring the node back with a new drive, the NameNode sees the new working drive and will start allocating new blocks to it. In none of the above operations do I see any reason for data on other drives or on other nodes to be corrupted or deleted. This happens all the time in any production environment.
08-18-2018 06:05 PM
Hi, not sure if my answer helps you or not, but I can give you some details. We built an enterprise data lake using HDP 2.x. How many data lakes (environments) you want to build depends on the data and the requirements. At my workplace we have multiple production environments with different kinds of data, and we enabled distcp between a couple of environments to get some data feeds from the others, but the end users and requirements are clearly different for these environments. Another difference is the variety of end users and data, and the multiple ways they can access these environments (some want the data in NRT (near real time), and some users can wait for the results). So we provided multiple ways to access and get data from our data lake, and end users chose the way that best meets their requirements. Hope this helps.
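As an illustration of the distcp feed between environments, a minimal command looks something like this; the cluster hosts and paths are placeholders, not our actual setup:
# Copy a dataset from the source lake to the destination lake over HDFS (placeholder hosts/paths)
$ hadoop distcp hdfs://source-nn.example.com:8020/data/feed hdfs://dest-nn.example.com:8020/landing/feed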
08-18-2018 05:20 PM
Hello, in our environment (in fact in any environment) this is very common; we face these disk failure issues all the time, and we usually do not shut down that node. Instead, we note down the node details and replace the faulty drive at a later point in time. When a drive fails, the NameNode identifies the faulty drive and the missing blocks, and it takes care of those missing blocks by re-replicating them from another datanode/drive. In my opinion, you don't really have to worry about a faulty disk on a datanode; you can bring the node back and integrate it with the cluster, so that the cluster can use the remaining good disks on that node. Thanks,
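If you want to double-check that HDFS has taken care of the affected blocks, a couple of standard checks (just a sketch, run as the hdfs user):
# Cluster-wide summary, including missing and under-replicated block counts
$ hdfs fsck / | tail -n 30
# Per-datanode capacity and failed-volume information
$ hdfs dfsadmin -report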
05-12-2017 09:01 PM
dvillarreal, thank you for the response. We actually found the root cause; apologies for not posting the solution we implemented on this forum earlier. After enabling debug, we found that our Knox hosts were not able to connect to the individual datanodes on port 1022; once this firewall issue was resolved, our external tool user was able to read a file from HDFS. However, we are still having one issue: when I tried using a curl command to read a file from an edge node going through Knox, I was still not able to connect, and the output log shows it was trying to hit the KNOX LB URL on port 8443 (the KNOX LB listens on port 443). In the Knox config we have a front-end URL of <KNOX LB URL>/gateway, but when we pursued this with HW support, they told me we need to change the front-end URL to <KNOX LB URL>:443/gateway; by default it was trying to hit the KNOX LB on port 8443, so to avoid this they asked me to include port 443 as well. My concern is that our external tool user is able to use Knox without any issues, so I am not sure whether to make this change or not. Can you please advise me?
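For what it's worth, one way to test from the edge node is to spell out port 443 on the LB explicitly and see whether the redirect still goes to 8443; this is only a sketch with placeholder host and user values:
# Explicitly target port 443 on the Knox load balancer to rule out the 8443 redirect
$ curl -i -k -u <user id> 'https://<KNOX LB URL>:443/gateway/default/webhdfs/v1/user/<user id>?op=LISTSTATUS'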
04-24-2017 04:54 PM
Friends, can anyone please help me with the following Knox "read a file" issue? Listing a file (using "ListStatus") currently works with our Knox Gateway load balancer URL: $ curl -i -k -L -u <user id> 'https://knoxgateway.<CORP DOMAIN>:443/gateway/default/webhdfs/v1/user/<user id>/servers?op=ListStatus'
Enter host password for user '<user id>':
HTTP/1.1 200 OK
Set-Cookie: JSESSIONID=9m4tcprbrs1eapxa0ljk5sfj;Path=/gateway/default;Secure;HttpOnly
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Cache-Control: no-cache
Expires: Mon, 24 Apr 2017 13:25:28 GMT
Date: Mon, 24 Apr 2017 13:25:28 GMT
Pragma: no-cache
Expires: Mon, 24 Apr 2017 13:25:28 GMT
Date: Mon, 24 Apr 2017 13:25:28 GMT
Pragma: no-cache
Server: Jetty(6.1.26.hwx)
Content-Type: application/json
Content-Length: 281
{"FileStatuses":{"FileStatus":[{"accessTime":1492803070763,"blockSize":134217728,"childrenNum":0,"fileId":219423467,"group":"hdfs","length":249,"modificationTime":1492803071085,"owner":"<user id>","pathSuffix":"","permission":"777","replication":3,"storagePolicy":0,"type":"FILE"} Now, i want to read this file using the same KNOX GATEWAY LB URL: $curl -i -k -L -u <user id> 'https://knoxgateway.<CORP DOMAIN>:443/gateway/test/webhdfs/v1/user/<user id>/servers?op=OPEN' <
Enter host password for user '<user id>':
HTTP/1.1 307 Temporary Redirect
Set-Cookie: JSESSIONID=1sopqfdu53c61xutx0ufk0hij;Path=/gateway/test;Secure;HttpOnly
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Cache-Control: no-cache
Expires: Mon, 24 Apr 2017 13:34:31 GMT
Date: Mon, 24 Apr 2017 13:34:31 GMT
Pragma: no-cache
Expires: Mon, 24 Apr 2017 13:34:31 GMT
Date: Mon, 24 Apr 2017 13:34:31 GMT
Pragma: no-cache
Location: https://knoxgateway.<CORP DOMAIN>/knox/test/webhdfs/data/v1/webhdfs/v1/user/<user id>/servers?_=AAAACAAAABAAAAEAlcGYLi4LTj7bhrrDPr1o2u6UIMEkO_aYiGAxiS4hu39uo-Homt5CbB2pwJ9p0Lkl2-7-l0vxINRjR70Ub7SA3D_ZKcoN46q0Bj97ceByV8hZgwEiIvyZmwSYEdKTVRCKV3VOhbuw1peDAJMhlS8SwYoPsRUOmPsdbmX5NLysp7mM7qktkmbHJyf_qXiAwNYuXmIhPBW_PZMmwjmQXckj7mDGAk61P-qWy1rSPoyPZ5oZ6y-7Uwijew0C3FNZzISDJICX6ePU2ptLEJOu1G8FaQonOUi37pvblYUuKSo-0wiLnBKRIvzrjfPzvh0tKrXi7FbCQnbn9sG0IyFjWssqlIoOlUVbf-Jo9eVF653ZyIqGjIYn9aX-7g
Server: Jetty(6.1.26.hwx)
Content-Type: application/octet-stream
Content-Length: 0
HTTP/1.1 404 Not Found
Cache-Control: must-revalidate,no-cache,no-store
Content-Type: text/html;charset=ISO-8859-1
Content-Length: 1324
Server: Jetty(8.1.14.v20131031)
<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1"/>
<title>Error 404 Not Found</title>
</head>
<body>
<h2>HTTP ERROR: 404</h2>
<p>Problem accessing /knox/default/webhdfs/data/v1/webhdfs/v1/user/<user id>/servers. Reason:
<pre> Not Found</pre></p>
<hr /><i><small>Powered by Jetty://</small></i>
I am really confused by this error: "Problem accessing /knox/default/webhdfs/data/v1/webhdfs/v1/user/<user id>/servers". To me this path looks wrong, and I guess the request is being redirected to a wrong path by 'rewrite.xml'? But I don't think we ever modified this file during the Knox setup. Can anyone please help or guide me in fixing this issue? I greatly appreciate your help. Thank you,
Labels:
- Apache Hadoop
- Apache Knox
04-24-2017 02:19 PM
Hi Arpit, thank you for the response. Our issue was actually resolved after refreshing the client configs on the NameNode host. It looks like the NameNode had cached the old configuration for the DN. HW support asked us to restart the NameNode or, if that was not possible, at least refresh the client configs; we refreshed the client configs first, and that resolved our issue.
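After the refresh, one simple way to confirm the rack assignments the NameNode is now using (just a sanity check, run on the NameNode host as the hdfs user):
# Show each datanode and the rack it is currently mapped to
$ hdfs dfsadmin -printTopology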
04-20-2017 07:24 PM
Good morning experts, I am currently facing the following issue: we have a 200-node Hadoop cluster with rack awareness configured. Suddenly we noticed that one of the datanodes was missing from Ambari, even though the datanode process was still running on that particular node. When we looked at the logs, we noticed the following error: Initialization failed for Block pool BP-3x84848-92929299 (Datanode Uuid 6048438486-d001-47af-a899-6493aca15c4c) service to hostname.com/<data node ip>:8020 Failed to add /default-rack/<datanode ip>:1019: You cannot have a rack and a non-rack node at the same level of the network topology. We added the datanode again from Ambari, but after starting the datanode it is still complaining with the above errors in the datanode logs. I didn't see any similar question in the community, so I am looking for your help. Since this is currently an issue in our production cluster, can anyone please help me quickly? I greatly appreciate your quick help. Thanks, ~hdpadmin
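For context, this error generally means the topology mapping is returning rack paths of different depths for different nodes (e.g. /default-rack for some and /dc1/rack1 for others). As a purely hypothetical illustration of a consistent mapping that a topology script might return (the exact file and format depend on your setup):
# Every node resolves to a rack path of the same depth
# 10.0.1.11  /dc1/rack1
# 10.0.1.12  /dc1/rack1
# 10.0.2.21  /dc1/rack2
# Mixing a bare /default-rack entry in with these deeper paths would trigger the
# "rack and a non-rack node at the same level" error shown above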
Labels:
- Apache Hadoop
03-12-2017 08:54 PM
Hi Jay, thank you for the response. We actually had all of these things in place, but I realized that our truststore password was incorrect, and I was able to fix that issue. It then complained about a self-signed cert on the Ranger Admin server, so I imported the Hive cert into the Ranger truststore, set the common name correctly in the Hive/Ranger configuration, and finally my issue was resolved. Thanks again for the reply. Subrah.
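For anyone hitting the same thing, the cert import was a standard keytool command along these lines; the alias, file paths, and password are placeholders, not our actual values:
# Import the Hive certificate into the Ranger truststore (placeholder alias/paths/password)
$ keytool -importcert -alias hiveserver2 -file /tmp/hiveserver2.crt \
      -keystore /path/to/ranger-truststore.jks -storepass <truststore password>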
03-12-2017 04:55 AM
Friends, I need your advice/help on the following issue: we have successfully configured HiveServer2 / SSL Ranger plugin / Kerberos, but I had never tested it before. Recently I found a Hive/Ranger plugin issue: when I tried to connect to HiveServer2 through Beeline, I was able to connect, but when I typed 'show databases' I got no result, and in the HiveServer2 logs I found the following errors. Here is the HiveServer2 log: 2017-03-11 22:46:56,026 ERROR [Thread-9]: util.RangerRESTClient (RangerRESTClient.java:getTrustManagers(342)) - Unable to read the necessary SSL Keystore and TrustStore Files
java.io.IOException: Keystore was tampered with, or password was incorrect
at sun.security.provider.JavaKeyStore.engineLoad(JavaKeyStore.java:780)
at sun.security.provider.JavaKeyStore$JKS.engineLoad(JavaKeyStore.java:56)
at sun.security.provider.KeyStoreDelegator.engineLoad(KeyStoreDelegator.java:225)
at sun.security.provider.JavaKeyStore$DualFormatJKS.engineLoad(JavaKeyStore.java:70)
at java.security.KeyStore.load(KeyStore.java:1445)
at org.apache.ranger.plugin.util.RangerRESTClient.getTrustManagers(RangerRESTClient.java:323)
at org.apache.ranger.plugin.util.RangerRESTClient.buildClient(RangerRESTClient.java:190)
at org.apache.ranger.plugin.util.RangerRESTClient.getClient(RangerRESTClient.java:177)
at org.apache.ranger.plugin.util.RangerRESTClient.getResource(RangerRESTClient.java:157)
at org.apache.ranger.admin.client.RangerAdminRESTClient.createWebResource(RangerAdminRESTClient.java:162)
at org.apache.ranger.admin.client.RangerAdminRESTClient.getServicePoliciesIfUpdated(RangerAdminRESTClient.java:70)
at org.apache.ranger.plugin.util.PolicyRefresher.loadPolicyfromPolicyAdmin(PolicyRefresher.java:215)
at org.apache.ranger.plugin.util.PolicyRefresher.loadPolicy(PolicyRefresher.java:183)
at org.apache.ranger.plugin.util.PolicyRefresher.run(PolicyRefresher.java:156)
Caused by: java.security.UnrecoverableKeyException: Password verification failed
at sun.security.provider.JavaKeyStore.engineLoad(JavaKeyStore.java:778)
... 13 more
2017-03-11 22:46:56,027 ERROR [Thread-9]: util.PolicyRefresher (PolicyRefresher.java:loadPolicyfromPolicyAdmin(238)) - PolicyRefresher(serviceName=TEST_hive): failed to refresh policies. Will continue to use last known version of policies (-1)
java.lang.IllegalArgumentException: SSLContext must not be null
I have verified the Java keystore and truststore passwords, as I was able to list both stores with the keytool command using those passwords. Can anyone please help me resolve this issue? Thank you, Subrah
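For reference, the check I ran was essentially the following; the store paths are placeholders for the keystore/truststore configured for the Ranger Hive plugin:
# List the contents of the plugin's keystore and truststore with the configured passwords
$ keytool -list -keystore /path/to/ranger-plugin-keystore.jks -storepass <keystore password>
$ keytool -list -keystore /path/to/ranger-plugin-truststore.jks -storepass <truststore password>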
Labels:
- Apache Hive
- Apache Ranger