About myoung

nnasar · ‎12-13-2016

Michael Young, awesome steps with snap. Please suggest what the minimum cluster size supported by either AWS or GCP would be. What type/size of machine would be ideal?

arooney · ‎09-22-2016

Awesome! Thanks Dominika!

cstanca · ‎08-17-2016

@Michael Young 🙂 I'm traveling and did not remember the path and the file name. Sorry for the somewhat imprecise answer; I was trying to help in a timely manner.

myoung · ‎08-13-2016

It looks like those are settings within your AWS Console. http://docs.aws.amazon.com/general/latest/gr/managing-aws-access-keys.html https://console.aws.amazon.com/iam/home?region=us-east-1#security_credential

bob_heckel · ‎08-23-2017

I should have mentioned I was using VirtualBox v5.1.14

sbomma · ‎08-11-2016

This is a great article for anyone looking to ingest data quickly and store it in compressed formats. This will work very well for POC, testing, and sandbox-type activities. I used something like this and made it production grade at a client by automating some of the jobs using Oozie. Once the data was loaded, we also had verification scripts that would audit what came in and what got dropped. We also had cleanup scripts that would remove all the raw data from HDFS once the data was set in Hive in ORC format, compressed and partitioned. With the advent of NiFi and Spark, it's worth looking at building a NiFi processor in conjunction with Spark jobs to load the data seamlessly into Hive/HBase in compressed formats as it is being loaded.

shivkumar82015 · ‎08-22-2016

Thanks for your reply. Yes, I understand your update: I need to install all the RPM packages even if I install without using Ambari. I am correcting my question accordingly and will update it.

myoung · ‎08-17-2016

@sbhat Is any of this documented somewhere? Looking at: https://github.com/apache/ambari/blob/trunk/ambari-server/docs/api/v1/index.md I don't see any reference to creating users.

rlevas · ‎03-24-2017

@Eric Hanson I don't have an official opinion on this. It really depends on the available resources. If the cluster is really large, then it may be beneficial to put the KDC on its own VM; but for a small cluster (<15 hosts), that may be a bit overkill, and the least utilized host may be sufficient for the KDC. That said, the workload could be spread out by placing one or more slave KDCs around the cluster. There is also the option to separate the kadmin and krb5kdc processes onto different hosts, though this is more for security concerns than for performance or resource concerns. One thing to keep in mind: for Ambari server versions 2.5.0 and below, it appears that the cluster does an abnormal number of kinits. This is currently being looked into. So far, it is unclear whether this is a bug, expected behavior, or something in between. The effect of this issue on a small cluster is minimal and not noticeable over a short period of time. On a large cluster (say 900 nodes), the Kerberos log files tend to get large quickly. Performance of the KDC on such a cluster, even when the KDC exists on a host with Hadoop services, does not appear to be affected. The main issue is merely log file size. However, if an issue is found and fixed, fewer kinits couldn't hurt. 🙂

gaurab_dawn · ‎09-13-2016

Thanks, it was really helpful.

Member Since	‎02-09-2016 09:44 PM
Last Visited	‎02-08-2019 07:03 PM
Posts	559
Kudos received	413

Cloudera Community

Re: How can I force the getTwitter processor to no...

Re: Send Ambari Metric to Elasticsearch

Re: Ingesting unformatted, unordered data from hdf...

Re: What would the audit record on Zeppelin users ...

Re: Automate loading data into HDFS

Re: How to test HDP 2.5 TP using Cloudbreak 1.4 on...

Re: hortonworks cloud for aws technical preview cr...

Re: How do you fix login issues after restarting C...

Re: What are access key and secret access key when...

Re: What is the best way to shut down Hortonworks ...

Re: Using Pig to convert uncompressed data to comp...

Re: Adding HDFS service in HDP 2.1, asking more d...

Re: Managing Ambari Users and groups using Rest AP...

Re: Is it recommended to use an existing KDC on an...

Re: Trying to create two node labels "x" as exclus...