Member since
09-15-2015
457
Posts
507
Kudos Received
90
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 16845 | 11-01-2016 08:16 AM |
| | 12446 | 11-01-2016 07:45 AM |
| | 11365 | 10-25-2016 09:50 AM |
| | 2435 | 10-21-2016 03:50 AM |
| | 5081 | 10-14-2016 03:12 PM |
02-02-2016
10:51 AM
1 Kudo
Check whether the policies have been synced:
- In Ranger, go to Audit -> Plugins; the last policy updates are listed in this table.
- On the NameNode, check the directory /etc/ranger/<hdfs repository name>; there should be a JSON file with all the policies inside.
- Add a second resource path called /data/raw/* and see if it works.
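The NameNode-side check above can be sketched as follows; `<repo>` stands for your HDFS repository name from Ranger, and the exact file layout can vary by Ranger version, so treat this as an assumption to adapt:

```shell
# On the NameNode: list the Ranger policy cache directory for the HDFS repository
# (<repo> is a placeholder for the repository name configured in Ranger)
ls -l /etc/ranger/<repo>/

# Confirm the cached policy file is well-formed JSON and skim its contents;
# a recent modification time here indicates the plugin pulled the policies
python -m json.tool /etc/ranger/<repo>/*.json | head -n 40
```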
02-02-2016
05:47 AM
1 Kudo
Importing permissions into Ranger should be easy; however, I don't think there is an easy way to export all the HBase permissions. @Nick Dimiduk @nmaillard any idea how to export all user permissions?
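One possible starting point, as a hedged sketch: the `user_permission` command in the HBase shell lists grants, and with a table regex it can cover all tables. Whether this captures every ACL (namespace and global grants) may depend on your HBase version:

```shell
# Dump table-level HBase grants for all tables matching the regex '.*'
# into a text file for later review; output format is the shell's, not JSON
echo "user_permission '.*'" | hbase shell > hbase_acls.txt
```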
02-01-2016
05:43 PM
3 Kudos
As far as I know there is no easy way of doing this, and personally I would not recommend changing the default location(s) under /usr/hdp/..... I have seen similar objectives/projects (e.g. changing the default log directories away from /var/log), and it is somewhat difficult to change all the necessary configurations and run tests to be sure everything works as expected. Could you elaborate on why you need/want to change the default location?
01-30-2016
09:43 PM
Thanks @Kyle Pifer, this is strange. Looking at your ps aux output above, the Ambari server is running and has even established a connection to the database, but I don't see it listening on any port for incoming connections. Did you enable wire encryption (SSL) for Ambari?
01-30-2016
08:09 PM
Since Ambari is not listening on port 8080, something went wrong during the Ambari server startup. It could be that the database service is down or that the Ambari server port is blocked. Open the ambari-server log (tail -fn 100 /var/log/ambari-server/ambari-server.log) and use a second console to start the server (ambari-server start). Check the log in the other window and see if there are any errors 🙂
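The two-console workflow above, sketched as commands (paths are the Ambari defaults; adjust if your install differs):

```shell
# Console 1: follow the last 100 lines of the server log and keep watching
tail -fn 100 /var/log/ambari-server/ambari-server.log

# Console 2: start the server, then watch console 1 for stack traces
# (database connection errors or a port conflict would show up here)
ambari-server start
```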
01-30-2016
08:37 AM
4 Kudos
Could you also restart the Ambari agents of your cluster? It'd be good to see some of the errors from the ambari-server log; could you post an excerpt or upload the log file? After you restarted the Ambari server, did it show up in the process list (ps aux | grep ambari-server)? Also make sure you see the Ambari process listening on port 8080 (netstat -anop).
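The checks above can be run like this on each node (assuming the standard Ambari service scripts; 8080 is Ambari's default port):

```shell
# Restart the agent on every cluster node
ambari-agent restart

# On the Ambari server host: is the server process running?
# ([a]mbari keeps grep from matching its own process line)
ps aux | grep [a]mbari-server

# Is anything listening on the Ambari port?
netstat -anop | grep 8080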
01-28-2016
06:35 PM
Glad it worked! Happy Hadooping 🙂
01-28-2016
10:44 AM
4 Kudos
Take a look at this: https://hadoop.apache.org/docs/current/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html#Mapper

How Many Maps?
The number of maps is usually driven by the total size of the inputs, that is, the total number of blocks of the input files. The right level of parallelism for maps seems to be around 10-100 maps per-node, although it has been set up to 300 maps for very cpu-light map tasks. Task setup takes a while, so it is best if the maps take at least a minute to execute. Thus, if you expect 10TB of input data and have a blocksize of 128MB, you'll end up with 82,000 maps, unless Configuration.set(MRJobConfig.NUM_MAPS, int) (which only provides a hint to the framework) is used to set it even higher.

Reducer
Reducer reduces a set of intermediate values which share a key to a smaller set of values. The number of reduces for the job is set by the user via Job.setNumReduceTasks(int). Overall, Reducer implementations are passed the Job for the job via the Job.setReducerClass(Class) method and can override it to initialize themselves. The framework then calls the reduce(WritableComparable, Iterable<Writable>, Context) method for each <key, (list of values)> pair in the grouped inputs. Applications can then override the cleanup(Context) method to perform any required cleanup. Reducer has 3 primary phases: shuffle, sort and reduce.

So yes, usually you have one map task for every block (unless configured differently), and the number of reducers is set by the user when the job is submitted. This might also be helpful: https://wiki.apache.org/hadoop/HowManyMapsAndReduces
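The map-count estimate quoted from the tutorial is just input size divided by block size, which you can verify with shell arithmetic:

```shell
# 10 TB of input expressed in MB, divided by a 128 MB block size
input_mb=$((10 * 1024 * 1024))
block_mb=128
echo $(( input_mb / block_mb ))
```

This prints 81920, i.e. the "82,000 maps" the tutorial rounds to. The reducer count, by contrast, is whatever the job sets, e.g. via `-D mapreduce.job.reduces=N` on the command line or `Job.setNumReduceTasks(int)` in code.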
01-28-2016
07:40 AM
2 Kudos
This address is available on all nodes that have a YARN NodeManager installed (usually they run on the same nodes as the DataNodes). You might have to use the external address that Amazon EC2 provides. http://<nodemanager-address>:8042/node shows node-level information such as the node status and the applications and containers running on it.
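If you prefer the command line over the browser, the NodeManager also exposes the same node information over its REST API on the web UI port (8042 is the YARN default; substitute the EC2 public DNS name for the placeholder):

```shell
# Fetch node info (health status, total memory/vcores, NM version) as JSON
curl -s http://<nodemanager-address>:8042/ws/v1/node/info
```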