Member since: 06-21-2017
Posts: 44
Kudos Received: 1
Solutions: 0
01-06-2021
03:24 AM
I'm trying to monitor the status of a running Oozie job. However, when running the following from a general-purpose user ("hue", which also starts these jobs through runAs):

curl --negotiate -u : http://FQDN:8088/proxy/application_1609929757167_0004/ws/v1/mapreduce/jobs/job_1609929757167_0004/tasks

I get an error that doesn't really say anything useful: {"RemoteException":{"exception":"WebApplicationException","javaClassName":"javax.ws.rs.WebApplicationException"}} On the other hand, I get the correct response as the yarn user, which suggests an issue with the ACLs. My question: which ACLs should I grant to the general-purpose user, and where, so that it can see all the information for a running task? I've added the user to ``yarn.admin.acl``, as well as to mapreduce.job.acl-modify-job, mapreduce.job.acl-view-job and mapreduce.jobhistory.admin.acl, but none of these helps. What am I missing?
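For reference, a sketch of how I've been applying these ACLs (the property names are the stock Hadoop ones; "hue" is simply the user in question, and the snippets go into mapred-site.xml and yarn-site.xml respectively):

<property>
  <name>mapreduce.job.acl-view-job</name>
  <value>hue</value>
</property>
<property>
  <name>yarn.admin.acl</name>
  <value>yarn,hue</value>
</property>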
11-25-2020
05:52 AM
I'm running an HDP 3.1 cluster with a TEZ-UI instance configured separately on Apache Tomcat, set up following this guide. Everything seems to be working fine; however, when launching Hive LLAP queries I don't see the generated DAG IDs and, for example, can't see the error messages after a query crashes (example screenshot omitted). What additional configuration is necessary for this to work? Side note: I don't know whether this matters, but in the same manner the LLAP Master monitor (port :10502) doesn't register any queries, even though Hive is configured to use LLAP and the executors are always active and being called; when inspecting the nodes at port :15002, the LLAP workers are clearly in active use (screenshots omitted).
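In case it helps with comparing setups, these are the kinds of tez-site.xml properties that an external TEZ-UI deployment hinges on (the property names are standard Tez ones; the host, port and logging class below are placeholders to adapt, not my actual values):

<property>
  <name>tez.tez-ui.history-url.base</name>
  <value>http://tomcat-host:8080/tez-ui/</value>
</property>
<property>
  <name>tez.history.logging.service.class</name>
  <value>org.apache.tez.dag.history.logging.ats.ATSHistoryLoggingService</value>
</property>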
10-20-2020
01:06 AM
I'm trying to upgrade the Kerberos encryption types for an existing HDP 2.6 cluster. The problem is that I want to use the same KDC servers, with a single configuration, for multiple realms, one of which would be set up on CentOS 7.8 or CentOS 8, which no longer supports DES-type encryption. The HDP 2.6 cluster, however, doesn't seem to work with anything other than the DES-family encryptions. I'm trying the following krb5.conf:

default_tkt_enctypes = aes256-cts-hmac-sha1-96 des3-cbc-sha1 arcfour-hmac-md5 des-cbc-crc des-cbc-md5 des-cbc-md4
default_tgs_enctypes = aes256-cts-hmac-sha1-96 des3-cbc-sha1 arcfour-hmac-md5 des-cbc-crc des-cbc-md5 des-cbc-md4

In my understanding, these should cover both the old, weak des3-cbc-sha1 type and the newer aes256 types for the newer system. However, with this configuration set, and after doing the Keytab Regeneration through Ambari, the HDFS services don't start, due to what is probably a GSS issue (same errors as in https://community.cloudera.com/t5/Support-Questions/Cloudera-Kerberos-GSS-initiate-failed/m-p/78727). When inspecting the auto-generated keytab, only one entry is created, with the "des3-cbc-sha1" type. While this should work (and it does allow a kinit), something is not okay for the namenode, and it still results in the GSS errors while starting. What could be the issue here? What is the correct enctype setting that works with HDP?

--------

I can reframe the question as follows: why does the HDFS NameNode work (on HDP 2.6) only with the following krb5.conf entries?

default_tkt_enctypes = des3-cbc-sha1 des3-hmac-sha1 des3-cbc-sha1-kd
default_tgs_enctypes = des3-cbc-sha1 des3-hmac-sha1 des3-cbc-sha1-kd
permitted_enctypes = des3-cbc-sha1 des3-hmac-sha1 des3-cbc-sha1-kd

Nothing else works if I try moving away from the DES encryptions.
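For anyone debugging the same thing: the enctypes actually present in a keytab can be listed with standard MIT Kerberos tooling (the path below is the usual HDP NameNode keytab location; adjust to your layout):

klist -kte /etc/security/keytabs/nn.service.keytab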
07-15-2020
06:52 AM
Thank you! This is a very helpful response! Regarding the 3x replication for DFS: is this still relevant with the latest Hadoop releases? For example, Hadoop 3 introduced Erasure Coding to address exactly this, although I'm not yet sure whether it is used by default in HDP 3 and similar platforms. The other considerations you mentioned are very useful; I'll keep them in mind. It seems like a good idea to start with the minimum viable setup and scale up from there for HA. In my experience, migrating services between nodes is simple enough through Ambari, so that should have us covered.
07-14-2020
09:42 AM
Thanks for the response! Could you elaborate a little on choosing to split into 3 VMs over 2? Apart from allowing 3x replication, would this add further performance gains? While this will be a "dev" cluster, both data durability and performance are important. Depending on performance and possible bottlenecks, the goal is to expand to additional machines in the future; in the meantime, though, we want a clean and future-proof start with the existing machine and configuration.
07-14-2020
03:41 AM
Hi all, what is the current best practice for setting up HDP 3.1 on a single machine with 20 cores and 512 GB RAM? Do we still need to split the machine into, e.g., 2 VMs in order to ensure at least 2x DFS replication? Or would a single-node cluster be more efficient? If I understand correctly, even with a JBOD disk setup for HDFS, a single-node cluster allows effectively at most 1x replication via the ``dfs.replication`` setting? What are your recommendations at this point? @Shelton
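For concreteness, the setting in question (hdfs-site.xml; a value of 1 means a single copy of each block, i.e., no redundancy):

<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>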
08-06-2019
02:46 PM
Important comment: this is 100% related to masking of some columns in the tables used. However, both the imposed masking AND the ability to create new tables from the initially masked one are important. Any known workaround?
08-06-2019
02:32 PM
Running on HDP 2.6. This query seemed to be working fine (a simple create-table operation; doing just the select works without problems), but now Hive LLAP returns the following error when trying to create a table: "Error running query: java.lang.AssertionError: Unexpected type UNEXPECTED". From the log:

2019-08-06T17:29:20,701 WARN [HiveServer2-Handler-Pool: Thread-1858]: thrift.ThriftCLIService (ThriftCLIService.java:ExecuteStatement(516)) - Error executing statement:
org.apache.hive.service.cli.HiveSQLException: Error running query: java.lang.AssertionError: Unexpected type UNEXPECTED
at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:225) ~[hive-service-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129]
at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:276) ~[hive-service-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129]
at org.apache.hive.service.cli.operation.Operation.run(Operation.java:312) ~[hive-service-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129]
at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:508) ~[hive-service-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129]
at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:495) ~[hive-service-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129]
at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:309) ~[hive-service-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129]
at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:506) ~[hive-service-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129]
at org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1437) ~[hive-exec-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129]
at org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1422) ~[hive-exec-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129]
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) ~[hive-exec-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129]
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) ~[hive-exec-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129]
at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:599) ~[hive-exec-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129]
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) ~[hive-exec-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_112]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_112]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_112]
Caused by: java.lang.AssertionError: Unexpected type UNEXPECTED
at org.apache.hadoop.hive.ql.parse.CalcitePlanner.fixUpCtasAndInsertAfterCbo(CalcitePlanner.java:952) ~[hive-exec-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129]
at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:378) ~[hive-exec-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129]
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:11167) ~[hive-exec-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129]
at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:290) ~[hive-exec-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129]
at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:257) ~[hive-exec-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129]
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:455) ~[hive-exec-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129]
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:336) ~[hive-exec-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129]
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1197) ~[hive-exec-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129]
at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1184) ~[hive-exec-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129]
at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:191) ~[hive-service-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129]
... 15 more

This is the first time I'm seeing such an error. What does it even point to? Any known solutions? EDIT: Important comment: this is 100% related to masking, through Apache Ranger, of some columns in the tables used. However, both the imposed masking AND the ability to create new tables from the initially masked one are important. Any known workaround?
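For clarity, a minimal sketch of the failing pattern (the table and column names are made up for illustration; the source table has a Ranger column-masking policy applied):

select * from masked_db.source limit 10;  -- works, values come back masked
create table masked_db.source_copy as select * from masked_db.source;  -- fails with the AssertionError above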
07-15-2019
12:19 PM
Thanks for clarifying. It was indeed recreated after a full restart. I'm noticing tons of errors, though; hopefully they're just related to the slow start of all the services.
07-15-2019
11:57 AM
On a perfectly running HDP 3.1 cluster, the ATSv2 Timeline Reader stopped responding at some point. Following a guide, I tried refreshing the service by cleaning the configs and restarting it via:

yarn app -destroy ats-hbase

After this, restarting the service doesn't help, and starting it with yarn app -start ats-hbase doesn't work either, since the config JSON is missing:

ERROR client.ApiServiceClient: File does not exist: <>/user/yarn-ats/.yarn/services/ats-hbase/ats-hbase.json

What is this .json, and how do I get the service running again? Are there example templates of a working .json anywhere that I could use? I remember this file existing somewhere on a fresh install, but sadly a reinstall of the cluster is not an option. Another thing: a fresh install on my laptop VM doesn't create such a JSON file either -- I guess the laptop is too small to trigger the hbase-backed setup? I didn't find anything in the official documentation, so I'd be thankful for any advice! @Geoffrey Shelton Okot
02-27-2019
01:41 PM
This works, great workaround.
02-27-2019
09:53 AM
Hi all, after setting up a fresh kerberized HDP 3.1 cluster with Hive LLAP, Spark2 and Livy, we're having trouble connecting to Hive's database through Livy. PySpark from the shell works without problems, but something breaks when going through Livy.

1. Livy settings are the Ambari defaults, with additionally specified jars and pyfiles for the HWC connector, plus spark.sql.hive.hiveserver2.jdbc.url and spark.security.credentials.hiveserver2.enabled set to true. These are enough for the pyspark shell to work without problems. 2. The connection is made through the latest HWC connector described here, since apparently this is the only one that works for Hive 3 and Spark2.

The problem: 1. When spark.master is set to yarn client mode (see, for example, the comment here), the connector appends the principal "hive/_HOST@DOMAIN" and the connection returns a GSS error, failing to find any Kerberos tgt (although the ticket is there, and Livy has access to HiveServer2). 2. When spark.master is set to yarn cluster mode, ";auth=delegationToken" is appended to the connection, and the error then says a "PLAIN" connection was made where a kerberized one was expected.

Notes: I've tried various settings -- zookeeper JDBC links vs. direct through port 10500, hive.doAs = true vs. false, various principals -- but nothing works. Note 2: everything works fine when connecting both through beeline (to Hive on port 10500) and through the pyspark shell. Note 3: HWC connection snippet (from the examples):

from pyspark_llap import HiveWarehouseSession
hive = HiveWarehouseSession.session(spark).build()
hive.showDatabases().show(100)

Any ideas? It feels like some setting on Livy is missing. That "failed to find any Kerberos tgt" is especially weird: where is it looking for the ticket, and why doesn't it see the one from kinit? @Geoffrey Shelton Okot @Hyukjin Kwon @Eric Wohlstadter
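For reference, a sketch of the Spark-side properties involved (the property names are the documented HWC ones; the URL and jar paths are placeholders with a made-up <version>, not our actual values):

spark.sql.hive.hiveserver2.jdbc.url jdbc:hive2://hs2-host:10500/
spark.security.credentials.hiveserver2.enabled true
spark.jars /usr/hdp/current/hive_warehouse_connector/hive-warehouse-connector-assembly-<version>.jar
spark.submit.pyFiles /usr/hdp/current/hive_warehouse_connector/pyspark_hwc-<version>.zip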
01-24-2019
11:34 AM
Hello all, I have set up an HDP 3 cluster with Hive running on LLAP (Interactive). However, when connecting through the HiveServer2-Interactive port (10500 by default, or through ZooKeeper), every query that runs goes through plain YARN/Tez containers. That is, when I check the LLAP dashboard (http://datanode:10502), I see 0 active/passive sessions; in my experience with HDP 2.6, every query should be listed there. What can be the problem? LLAP is enabled with the default settings from Ambari 2.7, which are:

hive.execution.mode = llap       # in hive-interactive-site
hive.llap.execution.mode = all   # same behaviour with the default "only"
hive.execution.mode = container  # in hive-site

Is this the default behaviour? Currently I'm worried that "hive.execution.mode" is specified in two different places, so I'll try playing with those. But other than that, I'm not sure what further settings to check. @Geoffrey Shelton Okot
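A quick way to confirm which mode a given session actually resolves to is to ask from within beeline (standard Hive syntax for printing a setting's current value):

set hive.execution.mode;
set hive.llap.execution.mode;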
01-17-2019
09:20 AM
@Geoffrey Shelton Okot Important question (should I post it as a new question? It does follow up from your latest comment, so I'll post it here): how should "default_tkt_enctypes", "default_tgs_enctypes" and "permitted_enctypes" ideally look for a normal HDP cluster (not a test sandbox), so that they work 100% of the time and also provide a high level of security? 1. When I tried the default suggested settings of "des3-cbc-sha1 des3-hmac-sha1 des3-cbc-sha1-kd", I got errors that the security level was too low. I then further added "aes256-cts-hmac-sha1-96", but it seems more than one decent enctype is required for proper encryption? 2. The default Kerberos settings suggested by Ambari also list "des3-cbc-sha1 des3-hmac-sha1 des3-cbc-sha1-kd", but comment them out by default, so I guess it ends up using some library defaults, which doesn't seem stable (what if the defaults change over time or with a new Kerberos version?). 3. Now I've added all possible types, "aes256-cts-hmac-sha1-96 aes256-cts:normal aes128-cts:normal des3-hmac-sha1:normal arcfour-hmac:normal camellia256-cts:normal camellia128-cts:normal des-hmac-sha1:normal des-cbc-md5:normal des-cbc-crc:normal", but when exporting with ``xst -k`` from the ``kadmin`` service, only around 2-3 entries with different encryptions end up in the keytab, not all 8+. This suggests that only some of the types actually matter.
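For context, this is the kind of export I'm running (made-up principal and path; the command form is standard kadmin, and the resulting entries can be counted with klist):

kadmin: xst -k /tmp/test.keytab HTTP/myhost.example.com@EXAMPLE.COM
klist -kte /tmp/test.keytab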
01-17-2019
08:58 AM
1 Kudo
@Geoffrey Shelton Okot Thanks, I think I solved it. You know what the problem was? Ambari wasn't creating/re-creating keytabs and principals for HTTP/_HOST@DOMAIN.COM; I had to do that by hand, with the correct encryption, too... Thank you for your help! I'm just curious: did you have to create the HTTP/_HOST principal yourself, or did Ambari create it automatically for you? If the latter, I wonder why it didn't on my machine. By the way, I'm using OpenLDAP as the LDAP/Kerberos database.
01-16-2019
03:40 PM
Hello all, after a fresh kerberization of an Ambari 2.7.3 / HDP 3 cluster, the HDFS NameNode isn't able to start because the hdfs user can't talk to WebHDFS. The following error is returned:

GSSException: Failure unspecified at GSS-API level (Mechanism level: Checksum failed)

It's not only from Ambari: I can reproduce this error with a simple curl call as the hdfs user:

su - hdfs
curl --negotiate -u : http://datanode:50070/webhdfs/v1/tmp?op=GETFILESTATUS

which returns:

</head>
<body><h2>HTTP ERROR 403</h2>
<p>Problem accessing /webhdfs/v1/tmp. Reason:
<pre> GSSException: Failure unspecified at GSS-API level (Mechanism level: Checksum failed)</pre></p>
</body>
</html>

Overall, permissions for this user should be intact, since I'm able to run hdfs operations from the shell and kinit without problems. What could be the problem? I've tried recreating keytabs several times and fiddling with ACL settings in the config, but nothing works. What principal is WebHDFS expecting? I get the same results when accessing it with the HTTP/host@EXAMPLE.COM principal. NB: there's nothing fancy in the HDFS settings, mainly the stock/default config. NB2: I've added all possible encryption types to krb5.conf that I could find, but none of these helped:

default_tkt_enctypes = aes256-cts-hmac-sha1-96 aes256-cts:normal aes128-cts:normal des3-hmac-sha1:normal arcfour-hmac:normal camellia256-cts:normal camellia128-cts:normal des-hmac-sha1:normal des-cbc-md5:normal des-cbc-crc:normal
default_tgs_enctypes = aes256-cts-hmac-sha1-96 aes256-cts:normal aes128-cts:normal des3-hmac-sha1:normal arcfour-hmac:normal camellia256-cts:normal camellia128-cts:normal des-hmac-sha1:normal des-cbc-md5:normal des-cbc-crc:normal
permitted_enctypes = aes256-cts-hmac-sha1-96 aes256-cts:normal aes128-cts:normal des3-hmac-sha1:normal arcfour-hmac:normal camellia256-cts:normal camellia128-cts:normal des-hmac-sha1:normal des-cbc-md5:normal des-cbc-crc:normal

@Geoffrey Shelton Okot
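In case it helps anyone diagnosing the same "Checksum failed" class of error, the checks I know of use standard MIT Kerberos commands (the keytab path below is the stock HDP location for the SPNEGO/HTTP principal; hostname and realm are placeholders):

klist -kte /etc/security/keytabs/spnego.service.keytab
kvno HTTP/datanode@EXAMPLE.COM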
01-16-2019
12:09 PM
This is to the point! I got the same error when I tried changing the default Ambari-suggested enctypes to my own custom ones. The custom ones work fine directly with the MIT KDC, but apparently not with Ambari.
01-15-2019
03:54 PM
Thank you, I had this exact issue with the same errors, and nothing in the comment discussion helped. However, several ambari-server restarts and dumb retries of the "Kerberos wizard" with similar settings magically resolved this. I'm not at all sure what the problem was.
01-15-2019
02:02 PM
Thanks, I noticed that too after posting. While -S kadmin/admin worked, -S kadmin/FQDN didn't, so reconfiguring this part on the KDC solved the problem. It's just interesting that I didn't bump into this with the HDP 2.6 Ambari. About the future release of Ambari -- any ETA yet? 🙂
01-15-2019
12:21 PM
@Geoffrey Shelton Okot @huzaira bashir Did you manage to solve this yet? What was the problem?
01-15-2019
11:16 AM
Hello all, I'm trying to kerberize an Ambari 2.7.3 cluster. However, during the setup I get the following error:

Caused by: org.apache.ambari.server.serveraction.kerberos.KerberosOperationException: Unexpected error condition executing the kadmin command. STDERR: kadmin: Matching credential not found (filename: /tmp/ambari_krb_142308985016794830cc) while initializing kadmin interface
at org.apache.ambari.server.serveraction.kerberos.MITKerberosOperationHandler.invokeKAdmin(MITKerberosOperationHandler.java:323)
at org.apache.ambari.server.serveraction.kerberos.MITKerberosOperationHandler.principalExists(MITKerberosOperationHandler.java:123)
at org.apache.ambari.server.serveraction.kerberos.KerberosOperationHandler.testAdministratorCredentials(KerberosOperationHandler.java:314)
at org.apache.ambari.server.controller.KerberosHelperImpl.validateKDCCredentials(KerberosHelperImpl.java:2133)

All of the authentication settings are okay, because I am able to kinit and use the kadmin interface from the shell. The problem seems to be that Ambari tries to do the following:

kinit -p admin/admin@EXAMPLE.COM
kadmin -c /tmp/ambari_krb_...

while it should be doing:

kinit -S kadmin/admin@EXAMPLE.COM admin/admin@EXAMPLE.COM
kadmin -c /tmp/ambari_krb...

I've replicated the two sequences and confirmed my guess: the second one works from the shell. Furthermore, if I intercept the credentials temporarily generated by Ambari and replace them with my own, the flow works. How can I fix this behaviour? This looks like a bug in the Ambari code -- which part should I edit to fix it?
10-09-2018
08:07 AM
We will be setting up an HDP 3 cluster, and the idea is to split 1x RAID1 and 2x HDD across 2 virtual machines. The RAID1 volume would handle the OS and local user files, while each of the 2 HDDs would be used for an HDFS /grid/ mount. My question: are 2 HDDs enough to ensure file redundancy? I've read that Hadoop 3 does not require 3 replicas as Hadoop 2 did; however, in case of a drive failure, would the files on the remaining HDD still be enough for a full recovery? Or would you advise partitioning some of the RAID1 volume as a third /grid/ drive for HDFS? That seems to introduce a lot of overhead due to RAID. What is the minimum number of HDDs required for redundant HDFS operation?
10-01-2018
08:24 AM
Half of the YARN resources in our cluster are dedicated to Hive LLAP, while the other half is left free for various other tasks, such as Spark or MapReduce containers. However, there are times when no other jobs are running and LLAP is loaded to its fullest. Is there a setting that allows dynamic allocation of all resources? That is, in such cases, would it be possible to allocate all, or say 80%, of the unused capacity to LLAP, and return it to normal after the jobs finish? As far as I know, the only way to do that would require an LLAP or YARN restart, which is not at all preferred.
08-28-2018
08:13 AM
In other words, this works:
create table config.test (jobid string, param string, value string)
partitioned by (dummy string)
clustered by (param) into 3 buckets
stored as orcfile
tblproperties('transactional'='true');
insert into config.test partition(dummy) values ('1', '2', '3', '1');
insert into config.test partition(dummy) values ('2', '3', '4', '1');
insert into config.test partition(dummy) values ('4', '4', '4', '1');
update config.test set value = '99' where jobid = '4' and param = '4' ;
But it does require a "dummy" field which does not serve any purpose. Are there any cleaner workarounds?
08-28-2018
07:59 AM
Consider this example table:

create table config.test (param string, value string)
partitioned by (jobid string)
clustered by (param) into 3 buckets
stored as orcfile
tblproperties('transactional'='true');

I want this to be an ACID table; I don't really need the partitioning and clustering, but I'm creating it this way following all the guides. Now assume I want to update some values:

insert into config.test partition(jobid) values ('1', '2', '3');
insert into config.test partition(jobid) values ('2', '3', '4');
insert into config.test partition(jobid) values ('4', '4', '4');
update config.test set value = '99' where jobid = '4' and param = '4';
Inserts work fine; the UPDATE returns the following "java.lang.NegativeArraySizeException" error:

Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Reducer 2, vertexId=vertex_1535365185102_0002_140_01, diagnostics=[Task failed, taskId=task_1535365185102_0002_140_01_000000, diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task ( failure ) : attempt_1535365185102_0002_140_01_000000_0:java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NegativeArraySizeException
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:218)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:172)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:370)
at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:110)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NegativeArraySizeException
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.initializeOp(FileSinkOperator.java:442)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:366)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:556)
at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:508)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:376)
at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.init(ReduceRecordProcessor.java:213)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:188)
... 15 more
Caused by: java.lang.NegativeArraySizeException
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.initializeOp(FileSinkOperator.java:357)
... 21 more
], TaskAttempt 1 failed, TaskAttempt 2 failed, TaskAttempt 3 failed (each with a stack trace identical to attempt 0)], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:0, Vertex vertex_1535365185102_0002_140_01 [Reducer 2] killed/failed due to:OWN_TASK_FAILURE]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0

What could be the problem? I'm working on a perfectly running cluster, and we do have some ACID tables working, but I can't figure out why this simple example doesn't. What works: the UPDATE succeeds when "jobid = '4'" is NOT in the WHERE clause, but then it updates too many rows. Why is this a problem? I'm updating the "value" field, which is neither a bucketing nor a partition column; "jobid" and "param" appear only in the WHERE clause. If this is really because "jobid" is a partitioning field, I could create dummy/mirror fields that hold the same info without being partition keys, but that does seem redundant. Any ideas?
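For the record, a generic ACID fallback I may try instead of UPDATE, sketched under the assumption that the row can be fully identified: delete the row and re-insert it with the new value (column order here is param, value, with jobid as the dynamic partition, matching the table above).

delete from config.test where jobid = '4' and param = '4';
insert into config.test partition(jobid) values ('4', '99', '4');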
08-27-2018
07:05 PM
Oh, that's good to hear; looking forward to upgrading then! Thank you for the heads-up!
08-27-2018
12:09 PM
When a large Hive query takes up the whole cluster (all LLAP executors busy), simultaneously running even the simplest query takes a long time, probably because the process is waiting for a free executor. Is there a setting that allows the queues to be restricted/managed in a manner similar to YARN's, i.e., so that there would always be at least one free executor for a quick query, or so that tasks from concurrent queries would be pushed to the front of the line, guaranteeing them some minimal exposure to the executors?