Member since: 03-23-2016
- 21 Posts
- 5 Kudos Received
- 1 Solution
        My Accepted Solutions
| Title | Views | Posted |
| --- | --- | --- |
|  | 6876 | 08-12-2016 12:05 PM |
04-06-2017 12:07 PM · 1 Kudo

This problem has been happening on our side for many months as well, with both Spark 1 and Spark 2, and both when running jobs in the shell and in Python notebooks. It is very easy to reproduce: just open a notebook and let it run for a couple of hours, or run some simple DataFrame operations in an infinite loop, as sketched below. There seems to be something fundamentally wrong with the timeout configuration in the core of Spark; we will open a case for it, because no matter what configurations we have tried, the problem persists.
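A minimal sketch of that reproduction loop, assuming a PySpark notebook or shell where a `SparkSession` named `spark` is already available (the DataFrame contents and column name are illustrative only, not from the original post):

```python
import time

# Small throwaway DataFrame; the content does not matter,
# only that the session keeps issuing work over a long period.
df = spark.range(1000)

while True:
    # A trivial transformation plus an action, repeated forever.
    df.selectExpr("id * 2 AS doubled").count()
    # Idle between actions, mimicking a long-running notebook;
    # after a couple of hours the timeout errors start to appear.
    time.sleep(60)
```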
						
					
08-12-2016 12:05 PM · 1 Kudo

I found the cause of the problem: it was a configuration issue. The NameNode was in fact installed on master01, but the following parameter was set to worker02 (which runs no NameNode): dfs.namenode.http-address was worker02.cl02.sr.private:50070 instead of master01.cl02.sr.private:50070. The configuration had been altered because the cluster was moved to an HA configuration and then taken back to non-HA; one of the NameNodes (the one on worker02) was then deleted without noticing that the remaining configuration still pointed to worker02. The corrected entry is sketched below. Hope I'm clear 🙂
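For anyone checking their own cluster, this is roughly what the corrected hdfs-site.xml entry would look like (hostnames taken from the post above; substitute your own NameNode host):

```xml
<!-- dfs.namenode.http-address must point at the host that actually
     runs the NameNode (here master01, not worker02). -->
<property>
  <name>dfs.namenode.http-address</name>
  <value>master01.cl02.sr.private:50070</value>
</property>
```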
						
					