Member since: 09-11-2015
Posts: 41
Kudos Received: 48
Solutions: 14
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2233 | 02-03-2017 09:39 AM |
| | 1971 | 01-31-2017 12:41 PM |
| | 3002 | 01-20-2017 12:38 PM |
| | 4307 | 01-18-2017 01:26 PM |
| | 7684 | 01-11-2017 02:35 PM |
01-19-2017
01:43 PM
@Sankar T Also check the Ranger audit logs, if you have Ranger installed and the HDFS plugin enabled. In general, if you're worried about who does what on your system, you should consider using at least Ranger, and possibly Atlas as well.
01-18-2017
01:26 PM
@Baruch AMOUSSOU DJANGBAN Currently this is not possible. HADOOP-10019 is the community JIRA to add this functionality to HDFS.
01-11-2017
02:35 PM
3 Kudos
@sagar pavan This will happen when there are not enough resources (memory) to run the ApplicationMaster (AM) container needed to control the Tez job. In YARN's capacity-scheduler.xml there is a property, yarn.scheduler.capacity.maximum-am-resource-percent, which controls the percentage of total cluster memory that can be used by AM containers. If you have several jobs running, each AM will consume the memory required for one container. If this exceeds that percentage of total cluster memory, the next AM to run will wait until there are free resources for it. You'll need to increase yarn.scheduler.capacity.maximum-am-resource-percent to get the AM to run.
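As a rough sketch (the 0.4 below is only an illustrative value, not a recommendation; the shipped default is 0.1), the property sits in capacity-scheduler.xml:

```xml
<!-- capacity-scheduler.xml: maximum fraction of cluster resources that can be
     used to run ApplicationMaster containers. 0.4 is an example value only. -->
<property>
  <name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
  <value>0.4</value>
</property>
```

After changing it, refresh the scheduler configuration (for example with yarn rmadmin -refreshQueues, or by restarting the ResourceManager through Ambari).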
01-10-2017
01:32 PM
@Timo Burmeister I don't believe there are any recommendations written down. Knox itself is more or less just a proxy server. It uses Jetty internally and will run comfortably in 8GB of RAM on most production systems. As Knox is multi-threaded, having more CPU threads will allow it to process more simultaneous requests. I'm not aware of any performance testing having been done, so you'd need to experiment to find out how many CPU threads you need for your expected load, but in general 8 CPU threads should be a good starting point. The number of NICs needed again depends on the workload you expect. If you're pushing large volumes of data through Knox then obviously you'll need to think about 10GbE or multiple bonded 1GbE NICs. You should probably also have a separate NIC for the external network and the internal cluster network, unless you're using VLANs or virtual IPs on a single NIC. For most starting configurations I'd say a 1GbE NIC would be sufficient.
12-22-2016
11:20 AM
1 Kudo
@Sagar Shimpi Probably worth pointing out that this will be fixed in Knox 0.10 by the looks of it: KNOX-644
12-21-2016
11:48 AM
3 Kudos
@Jay SenSharma It's worth pointing out that unless you're using Ranger 0.4 or below, that API is obsolete. You should be using the v2 API linked to by @mvaradkar: https://cwiki.apache.org/confluence/display/RANGER/REST+APIs+for+Service+Definition%2C+Service+and+Policy+Management
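If you want a quick way to verify the v2 API against your Ranger Admin instance, a minimal curl sketch (the host, port 6080 and admin credentials below are placeholders for your own setup):

```bash
# List all policies through the Ranger v2 public REST API.
# Replace host, port and credentials with your own Ranger Admin details.
curl -u admin:password -H "Accept: application/json" \
  "http://ranger-admin.example.com:6080/service/public/v2/api/policy"
```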
12-06-2016
09:36 PM
1 Kudo
@Sami Ahmad These messages are the Ranger plugins for HDFS and Hive connecting to Ranger Admin to check that they have the latest policies. If you want to stop these messages you'll need to turn the Ranger Admin logging down to WARN, but the messages themselves are normal. If you look at the frequency, they occur every 30 seconds, which is the default polling period for the plugins.
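For reference, the interval itself is a per-plugin setting; a sketch for the HDFS plugin (normally managed through Ambari rather than edited by hand, and exact file locations vary by version):

```xml
<!-- ranger-hdfs-security.xml: how often the HDFS plugin polls Ranger Admin
     for policy updates, in milliseconds (30000 = the default 30 seconds). -->
<property>
  <name>ranger.plugin.hdfs.policy.pollIntervalMs</name>
  <value>30000</value>
</property>
```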
12-01-2016
04:42 PM
1 Kudo
@Arpan Rajani Yes, you can use a wildcard certificate (see https://en.wikipedia.org/wiki/Wildcard_certificate). If you're buying from a commercial CA, most will generate wildcard certificates for you. If you're using an internal CA or self-signed certificates, this guide shows you how: https://serversforhackers.com/self-signed-ssl-certificates
In terms of using it for Hadoop, it is used in the same way as a regular certificate, except that you only have one certificate for all the services. The main security issue is that if someone gets hold of the certificate and its private key, they can install it on any host in your network whose DNS name matches the wildcard domain (for example *.example.com) and present a valid certificate from that host.
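If you want to try this out with a self-signed wildcard certificate before involving a CA, a minimal openssl sketch (the domain and validity period are placeholders, and this is for testing only):

```bash
# Generate a self-signed wildcard certificate and private key for *.example.com,
# valid for one year, with no passphrase on the key (testing only).
openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
  -keyout wildcard.key -out wildcard.crt \
  -subj "/CN=*.example.com"
```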
11-30-2016
05:39 PM
@Pradheep Shan Currently modifying the Grafana dashboards is not supported in Ambari.
11-24-2016
09:29 AM
1 Kudo
@Chris L The Ambari Agent is a Python process, so it uses Python's logging facility. To change the pattern, go to /etc/ambari-agent/conf and copy logging.conf.sample to logging.conf, then edit it and look for these lines:

```ini
[formatter_logfileformatter]
format=%(levelname)s %(asctime)s %(filename)s:%(lineno)d - %(message)s
```

Change the 'format' line to suit your needs. The format string syntax, and the attributes available from each log record, are documented in the Python logging module's documentation.
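Purely as an illustration (not a recommended format), a customised entry that adds the thread name and a fixed timestamp format could look like this; the %(...)s attribute names come from Python's standard logging module:

```ini
[formatter_logfileformatter]
format=%(asctime)s %(levelname)s [%(threadName)s] %(filename)s:%(lineno)d - %(message)s
datefmt=%Y-%m-%d %H:%M:%S
```

Restart the agent (ambari-agent restart) for the change to take effect.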