About mph

mph · ‎01-10-2017

A year ago I implemented an HDP platform. Soon after, NiFi was established as the de facto way of integrating external data flows into the cluster. A year on, I'm reimplementing the architecture and HDF is now available. Is the assumption now that HDF runs on a node outside the HDP cluster and pushes data to it, as opposed to my previous setup, where NiFi was installed on a node within the HDP cluster?

mph · ‎01-08-2017

Hi Chris, my cluster was hacked and the HDFS data was deleted (including /user/ and the trash files). In /hadoop/hdfs/namenode I can see the fsimage_ file from before the deletes were applied. Could you explain how I would go about reverting to the older fsimage_ file?
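Before attempting any revert, it may help to see which checkpoints are actually on disk. The following is a minimal sketch, not a recovery procedure: it only lists fsimage_ files by transaction ID and changes nothing. The `current` subdirectory and the helper name `list_fsimages` are assumptions for illustration; only /hadoop/hdfs/namenode comes from the post.

```python
import re
from pathlib import Path

# Checkpoint files are named fsimage_<transaction id>.
# The .md5 sidecar files and edits_* files do not match this pattern.
FSIMAGE_RE = re.compile(r"^fsimage_(\d+)$")


def list_fsimages(current_dir):
    """Return (txid, path) pairs for fsimage files in current_dir, oldest first."""
    found = []
    for entry in Path(current_dir).iterdir():
        match = FSIMAGE_RE.match(entry.name)
        if match and entry.is_file():
            found.append((int(match.group(1)), entry))
    return sorted(found, key=lambda pair: pair[0])


if __name__ == "__main__":
    # Assumption: checkpoints live in the "current" subdirectory of the
    # NameNode directory mentioned in the post.
    name_dir = Path("/hadoop/hdfs/namenode/current")
    if name_dir.is_dir():
        for txid, path in list_fsimages(name_dir):
            print(txid, path)
```

Taking a copy of the whole NameNode directory before touching anything keeps the older checkpoint safe while a revert is worked out.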

mph · ‎11-24-2016

PS: I tried pickling the model object, but that didn't work.

mph · ‎11-24-2016

Hi, I need to save a model in Python Spark 1.6.0. I know save()/load() functions are available in 2.0, but I'm not in a position to upgrade our HDP cluster at this time and need a hack. I know the Scala API in 1.6 does support saving models. Is there some way I could share my model object from Python to Scala? I'm using Zeppelin, so might this work between paragraphs? Any help much appreciated.

mph · ‎08-12-2016

..I found that this was not specified in my Ambari settings. Setting it to 0 and setting tez.session.am.dag.submit.timeout.secs to a smaller value gave me the behaviour I was looking for.
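tez.session.am.dag.submit.timeout.secs is the number of seconds an idle Tez session application master waits for another DAG before shutting down. As an alternative to changing it in Ambari, the sketch below builds a beeline invocation that passes the property with `--hiveconf`. The JDBC URL, the query, the table name, and the helper name `beeline_command` are all hypothetical, and whether a per-session override takes effect depends on how HiveServer2 manages its Tez sessions, so treat this as something to try rather than a confirmed fix.

```python
def beeline_command(jdbc_url, timeout_secs, query):
    """Build a beeline argument list that overrides the Tez AM idle timeout.

    Uses beeline's -u (JDBC URL), --hiveconf (property=value) and -e (query)
    options. Nothing is executed here; the list can be passed to
    subprocess.run by the caller.
    """
    if timeout_secs < 0:
        raise ValueError("timeout_secs must be non-negative")
    return [
        "beeline",
        "-u", jdbc_url,
        "--hiveconf", f"tez.session.am.dag.submit.timeout.secs={timeout_secs}",
        "-e", query,
    ]


if __name__ == "__main__":
    # Hypothetical host, port, and table, for illustration only.
    cmd = beeline_command(
        "jdbc:hive2://localhost:10000/default",
        30,
        "INSERT INTO example_table VALUES (1)",
    )
    print(" ".join(cmd))
```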

mph · ‎08-11-2016

@Sunile Manjee I checked this and it is false. The remaining container seems to be the application master. When I run the Hive jobs via MapReduce2 they complete fine; it's only when they are run in Tez that I see this behaviour.

mph · ‎08-11-2016

Hi, when I run an INSERT INTO command through beeline, Hive/Tez requests two containers. Once beeline reports that the row was successfully inserted into the table, the job it created (seen in the YARN Manager UI) is still running and holds on to one of the two containers. When I terminate beeline, the job in the Manager UI is then listed as completed. Why is this happening, and how can I change my Hadoop configuration to stop it? Thanks, Mike

mph · ‎08-02-2016

Running on HDP 2.4.

mph · ‎08-02-2016

It's 0.6.0 > https://zeppelin.apache.org/download.html

mph · ‎08-02-2016

..also, is there a way in SparkR to check first whether the SparkContext and HiveContext are running?

Member Since	‎04-13-2016 05:05 PM
Last Visited	‎07-16-2020 05:48 AM
Posts	80
Kudos received	12

Cloudera Community

Re: Zeppelin error on restart

Which of these approaches to HDF (Nifi) and HDP in...

Re: When Secondary NameNode performs checkpoint i....

Re: Is there a way to save a model in PYSPARK (pyt...

Is there a way to save a model in PYSPARK (python)...

Re: How do I stop beeline holding onto YARN contai...

Re: How do I stop beeline holding onto YARN contai...

How do I stop beeline holding onto YARN containers...

Re: Does the HiveContext object expire in Zeppelin...

Re: Does the HiveContext object expire in Zeppelin...

Re: Does the HiveContext object expire in Zeppelin...