Member since: 02-09-2016
Posts: 559
Kudos Received: 422
Solutions: 98
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2808 | 03-02-2018 01:19 AM |
| | 4471 | 03-02-2018 01:04 AM |
| | 3024 | 08-02-2017 05:40 PM |
| | 2808 | 07-17-2017 05:35 PM |
| | 2070 | 07-10-2017 02:49 PM |
12-13-2016
01:06 PM
Michael Young,
Awesome steps with snap... What is the minimum cluster size supported by either AWS or GCP?
What type/size of machine would be ideal?
09-22-2016
10:35 AM
Awesome! Thanks Dominika!
08-17-2016
09:30 PM
@Michael Young 🙂 I'm traveling and did not remember the path and the file name. Sorry for the somewhat imprecise answer; I was trying to help in a timely manner.
08-13-2016
01:50 AM
It looks like those are settings within your AWS Console. http://docs.aws.amazon.com/general/latest/gr/managing-aws-access-keys.html https://console.aws.amazon.com/iam/home?region=us-east-1#security_credential
08-23-2017
07:52 PM
I should have mentioned I was using VirtualBox v5.1.14
08-11-2016
03:38 PM
1 Kudo
This is a great article for anyone looking to ingest data quickly and store it in compressed formats. This will work very well for POC, testing, and sandbox-type activities. I used something like this and made it production grade at a client by automating some of the jobs using Oozie. Once the data was loaded, we also had verification scripts that would audit what came in and what got dropped. We also had cleanup scripts that would remove all the raw data from HDFS once the data was stored in Hive in ORC format, compressed and partitioned. With the advent of NiFi and Spark, it's worth looking at building a NiFi processor in conjunction with Spark jobs to load the data seamlessly into Hive/HBase in compressed formats as it's being loaded.
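For anyone wondering what such a verification step might look like, here is a minimal sketch in Python. It is not the script used at that client; the names `AuditResult` and `audit_load` are hypothetical, and it assumes the row counts have already been collected elsewhere (for example from the raw files and from a count query against the Hive table). The idea is simply to reconcile what came in against what was loaded or rejected, and to allow raw-data cleanup only when every row is accounted for.

```python
from dataclasses import dataclass


@dataclass
class AuditResult:
    """Row counts for one load (hypothetical structure, for illustration)."""
    raw_rows: int       # rows found in the raw input
    loaded_rows: int    # rows that landed in the target table
    rejected_rows: int  # rows explicitly dropped during the load

    @property
    def unaccounted_rows(self) -> int:
        # Rows that were neither loaded nor explicitly rejected.
        return self.raw_rows - self.loaded_rows - self.rejected_rows

    @property
    def safe_to_clean_up(self) -> bool:
        # Only allow raw-data cleanup when every row is accounted for.
        return self.unaccounted_rows == 0


def audit_load(raw_rows: int, loaded_rows: int, rejected_rows: int = 0) -> AuditResult:
    """Reconcile the counts for one load and return the audit result."""
    if min(raw_rows, loaded_rows, rejected_rows) < 0:
        raise ValueError("row counts must be non-negative")
    return AuditResult(raw_rows, loaded_rows, rejected_rows)


if __name__ == "__main__":
    result = audit_load(raw_rows=1000, loaded_rows=980, rejected_rows=10)
    print(result.unaccounted_rows, result.safe_to_clean_up)
```

A cleanup job would then check `safe_to_clean_up` before removing anything from HDFS, so a partial load never deletes its own source data.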
08-22-2016
05:52 AM
Thanks for your reply. Yes, I understand your update: I need to install all the RPM packages, even when installing without Ambari. I am correcting my question and will update it.
08-17-2016
05:19 PM
@sbhat Is any of this documented somewhere? Looking at: https://github.com/apache/ambari/blob/trunk/ambari-server/docs/api/v1/index.md I don't see any reference to creating users.
03-24-2017
03:26 PM
@Eric Hanson I don't have an official opinion on this. It really depends on the available resources. If the cluster is really large, then it may be beneficial to put the KDC on its own VM; but for a small cluster (<15 hosts), that may be a bit overkill, and the least utilized host may be sufficient for the KDC. That said, the workload could be spread out by placing one or more slave KDCs around the cluster. There is also the option to separate the kadmin and krb5kdc processes onto different hosts, though this is more for security concerns than for performance or resource concerns. One thing to keep in mind: for Ambari server versions 2.5.0 and below, it appears that the cluster does an abnormal number of kinits. This is currently being looked into. So far, it is unclear whether this is a bug, expected behavior, or something in between. The effect of this issue on a small cluster is minimal and not noticeable over a short period of time. On a large cluster (say 900 nodes), the Kerberos log files tend to get large quickly. Performance of the KDC on such a cluster, even when the KDC exists on a host with Hadoop services, does not appear to be affected. The main issue is merely log file size. However, if an issue is found and fixed, fewer kinits couldn't hurt. 🙂
09-13-2016
10:09 AM
Thanks, it was really helpful.