Member since 06-21-2017 · 48 Posts · 1 Kudos Received · 0 Solutions
05-30-2022 01:30 PM
What are the answers to this question as of 2022? From what I'm seeing, links to repos of HDP 3.1.4 and earlier are behind a paywall as well. Is there no free/open-source version of HDP/CDP anymore? And will there never be one?
02-01-2021 08:32 AM
Thanks for the response. The cluster is Kerberized. I think we got around these errors (partly) by trial and error, so it's hard to pinpoint the exact configuration causing the issues. They probably involved permissions in Ranger, but I can't tell exactly since there's no config versioning there. I say _partly_ because we still haven't figured out Oozie Bundles (jobs started through bundles still don't show logs properly in HUE), but workflows and coordinators seem to be working as expected.
01-06-2021 03:24 AM
I'm trying to monitor the status of a running Oozie job. However, when running the following from a general-purpose user ("hue", which is also the user starting these jobs through runAs):

curl --negotiate -u : http://FQDN:8088/proxy/application_1609929757167_0004/ws/v1/mapreduce/jobs/job_1609929757167_0004/tasks

I'm getting an error which does not really say anything useful:

{"RemoteException":{"exception":"WebApplicationException","javaClassName":"javax.ws.rs.WebApplicationException"}}

On the other hand, I'm able to see the correct response as the yarn user, suggesting some issue with the ACLs. My question: which ACLs, and where, should I grant to the general-purpose user so that it can see all the information of the running task? I've added the user to ``yarn.admin.acl``, as well as ``mapreduce.job.acl-modify-job``, ``mapreduce.job.acl-view-job``, and ``mapreduce.jobhistory.admin.acl``, but none of these help. What am I missing?
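For reference, a minimal sketch of where those properties would live, assuming they are pushed out to mapred-site.xml and yarn-site.xml through Ambari (the value "hue" is just our general-purpose user). I'm not certain this is the missing piece, but job-level ACLs are only evaluated when ``mapreduce.cluster.acls.enabled`` is true, so that's worth checking:

<!-- mapred-site.xml: job-level ACLs are ignored unless this is enabled -->
<property>
  <name>mapreduce.cluster.acls.enabled</name>
  <value>true</value>
</property>
<!-- let the "hue" user view counters/tasks of running jobs -->
<property>
  <name>mapreduce.job.acl-view-job</name>
  <value>hue</value>
</property>

<!-- yarn-site.xml: ACL checks must be switched on cluster-wide for yarn.admin.acl to take effect -->
<property>
  <name>yarn.acl.enable</name>
  <value>true</value>
</property>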
07-15-2020 06:52 AM
Thank you! This is a very helpful response! Regarding the 3x replication for DFS: is this still relevant with the latest Hadoop releases? Starting with Hadoop 3, Erasure Coding was introduced to address this, although I'm not yet sure whether it is used by default in HDP 3 and similar packs/platforms. The other considerations you mentioned are very useful, and I will keep them in mind. It seems like a good idea to start with the minimum viable setup and scale up from there for HA. In my experience, migrating services between nodes is simple enough through Ambari, so that should have us covered.
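As a sketch of what I mean by checking whether EC is actually in use (standard hdfs CLI in Hadoop 3; as far as I understand, EC is opt-in per directory and 3x replication remains the default, but I'd be glad to be corrected; the /data path is just an example):

hdfs ec -listPolicies                                  # show available EC policies and whether they are enabled
hdfs ec -getPolicy -path /data                         # check which policy, if any, applies to a directory
hdfs ec -setPolicy -path /data -policy RS-6-3-1024k    # opt a directory in to erasure coding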
07-14-2020 09:42 AM
Thanks for the response! Could you elaborate a little on choosing to split into 3 VMs over 2 VMs? Apart from allowing 3x replication, would this add further performance gains? While this will be a "dev" cluster, both data loss and performance are important. Depending on the performance and possible bottlenecks, the goal is to expand to additional machines in the future. In the meantime, however, I would like a clean and future-proof start with the existing machine/configuration.
07-14-2020 03:41 AM
Hi all, what is the current best practice for setting up an HDP 3.1 cluster on a single machine with 20 cores and 512 GB RAM? Do we still need to split the machine into, e.g., 2 VMs in order to ensure at least 2x DFS replication, or would a single-node cluster be more efficient? If I understand correctly, even with a JBOD disk setup for HDFS, a single-node cluster can use at most 1x replication through the ``dfs.replication`` setting? What are your recommendations at this point? @Shelton
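To illustrate, the setting I mean is the standard hdfs-site.xml property below; with a single DataNode, any higher value would just leave blocks permanently under-replicated, if I understand correctly:

<!-- hdfs-site.xml: on a single-node cluster there is only one DataNode to place replicas on -->
<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>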
07-15-2019 12:19 PM
Thanks for clarifying. It was indeed recreated after a full restart. Though I'm noticing tons of errors; hopefully these are just related to the slow start of all the services.
07-15-2019 11:57 AM
On a perfectly running HDP 3.1 cluster, the ATSv2 Timeline Reader stopped responding at one point. Following some guide, I tried refreshing the service by cleaning the configs and restarting it with:

yarn app -destroy ats-hbase

After this, restarting the service doesn't help, and starting it with

yarn app -start ats-hbase

doesn't work either, since the config JSON is missing:

ERROR client.ApiServiceClient: File does not exist: <>/user/yarn-ats/.yarn/services/ats-hbase/ats-hbase.json

What is this .json, and how do I get the service running again? Are there example templates of a working .json anywhere that I could use? I remember this file existing somewhere on a fresh install, but sadly reinstalling the cluster is not an option. Another thing: a fresh install on my laptop VM does not create such a JSON file either -- I guess the reason is that the laptop is too small to force HBase use? I didn't find anything in the official documentation, so I would be thankful for any advice! @Geoffrey Shelton Okot
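For completeness, this is how I inspected the state, using the standard YARN service and HDFS CLIs (the path is the one from the error above):

yarn app -status ats-hbase                               # check whether the YARN service definition still exists
hdfs dfs -ls /user/yarn-ats/.yarn/services/ats-hbase/    # look for the missing ats-hbase.json

(As noted in my reply above, in the end a full restart recreated the file.)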
02-27-2019 01:41 PM
This works, great workaround.