Member since
09-15-2015
457
Posts
507
Kudos Received
90
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 16845 | 11-01-2016 08:16 AM |
| | 12446 | 11-01-2016 07:45 AM |
| | 11365 | 10-25-2016 09:50 AM |
| | 2435 | 10-21-2016 03:50 AM |
| | 5081 | 10-14-2016 03:12 PM |
02-02-2016
10:51 AM
1 Kudo
Check whether the policies have been synced:
- In Ranger, go to Audit -> Plugins; the last policy updates are listed in this table.
- On the NameNode, check the directory /etc/ranger/<hdfs repository name>; there should be a JSON file with all the policies inside.
- Add a second resource path called /data/raw/* and see if it works.
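The NameNode-side check above can be sketched as follows; `<repo>` stands for your HDFS repository name from Ranger, and the exact file layout can vary by Ranger version, so treat this as an assumption to adapt:

```shell
# On the NameNode: list the Ranger policy cache directory for the HDFS repository
# (<repo> is a placeholder for the repository name configured in Ranger)
ls -l /etc/ranger/<repo>/

# Confirm the cached policy file is well-formed JSON and skim its contents;
# a recent modification time here indicates the plugin pulled the policies
python -m json.tool /etc/ranger/<repo>/*.json | head -n 40
```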
02-02-2016
05:47 AM
1 Kudo
Importing permissions into Ranger should be easy; however, I don't think there is an easy way to export all the HBase permissions. @Nick Dimiduk @nmaillard any idea how to export all user permissions?
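One possible starting point, as a hedged sketch: the `user_permission` command in the HBase shell lists grants, and with a table regex it can cover all tables. Whether this captures every ACL (namespace and global grants) may depend on your HBase version:

```shell
# Dump table-level HBase grants for all tables matching the regex '.*'
# into a text file for later review; output format is the shell's, not JSON
echo "user_permission '.*'" | hbase shell > hbase_acls.txt
```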
02-01-2016
05:43 PM
3 Kudos
As far as I know there is no easy way of doing this, and personally I would not recommend changing the default location(s) under /usr/hdp/..... I have seen similar objectives/projects (e.g. changing the default log directories away from /var/log), and it is somewhat difficult to change all the necessary configurations and run tests to be sure everything works as expected. Could you elaborate on why you need/want to change the default location?
01-30-2016
09:43 PM
Thanks @Kyle Pifer, this is strange. Looking at your ps aux output above, the Ambari server is running and has even established a connection to the database, but I don't see it listening on any port for incoming connections. Did you enable wire encryption (SSL) for Ambari?
01-30-2016
08:09 PM
Since Ambari is not listening on port 8080, something went wrong during the Ambari server startup. It could be that the database service is down or that the Ambari server port is blocked. Open the ambari-server log (tail -fn 100 /var/log/ambari-server/ambari-server.log) and use a second console to start the server (ambari-server start). Check the log in the other window and see if there are any errors 🙂
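The two-console workflow above, sketched as commands (paths are the Ambari defaults; adjust if your install differs):

```shell
# Console 1: follow the last 100 lines of the server log and keep watching
tail -fn 100 /var/log/ambari-server/ambari-server.log

# Console 2: start the server, then watch console 1 for stack traces
# (database connection errors or a port conflict would show up here)
ambari-server start
```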
01-30-2016
08:37 AM
4 Kudos
Could you also restart the Ambari agents of your cluster? It'd be good to see some of the errors from the ambari-server log; could you post an excerpt or upload the log file? After you restarted the Ambari server, did it show up in the process list (ps aux | grep ambari-server)? Also make sure you see the Ambari process listening on port 8080 (netstat -anop).
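The checks above can be run like this on each node (assuming the standard Ambari service scripts; 8080 is Ambari's default port):

```shell
# Restart the agent on every cluster node
ambari-agent restart

# On the Ambari server host: is the server process running?
# ([a]mbari keeps grep from matching its own process line)
ps aux | grep [a]mbari-server

# Is anything listening on the Ambari port?
netstat -anop | grep 8080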
01-28-2016
06:35 PM
Glad it worked! Happy Hadooping 🙂
01-28-2016
10:44 AM
4 Kudos
Take a look at this: https://hadoop.apache.org/docs/current/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html#Mapper

How Many Maps?
The number of maps is usually driven by the total size of the inputs, that is, the total number of blocks of the input files. The right level of parallelism for maps seems to be around 10-100 maps per-node, although it has been set up to 300 maps for very cpu-light map tasks. Task setup takes a while, so it is best if the maps take at least a minute to execute. Thus, if you expect 10TB of input data and have a blocksize of 128MB, you'll end up with 82,000 maps, unless Configuration.set(MRJobConfig.NUM_MAPS, int) (which only provides a hint to the framework) is used to set it even higher.

Reducer
Reducer reduces a set of intermediate values which share a key to a smaller set of values. The number of reduces for the job is set by the user via Job.setNumReduceTasks(int). Overall, Reducer implementations are passed the Job for the job via the Job.setReducerClass(Class) method and can override it to initialize themselves. The framework then calls the reduce(WritableComparable, Iterable<Writable>, Context) method for each <key, (list of values)> pair in the grouped inputs. Applications can then override the cleanup(Context) method to perform any required cleanup. Reducer has 3 primary phases: shuffle, sort and reduce.

So yes, usually you have one map task for every block (unless configured differently), and the number of reducers is set by the user when the job is submitted. This might also be helpful: https://wiki.apache.org/hadoop/HowManyMapsAndReduces
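The map-count estimate quoted from the tutorial is just input size divided by block size, which you can verify with shell arithmetic:

```shell
# 10 TB of input expressed in MB, divided by a 128 MB block size
input_mb=$((10 * 1024 * 1024))
block_mb=128
echo $(( input_mb / block_mb ))
```

This prints 81920, i.e. the "82,000 maps" the tutorial rounds to. The reducer count, by contrast, is whatever the job sets, e.g. via `-D mapreduce.job.reduces=N` on the command line or `Job.setNumReduceTasks(int)` in code.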
01-28-2016
07:40 AM
2 Kudos
This address is available on all nodes that have a YARN NodeManager installed (usually they run on the same nodes as the DataNodes). You might have to use the external address that Amazon EC2 provides. http://<nodemanager-address>:8042/node shows node-level information such as the node status and the applications and containers running on it.
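If you prefer the command line over the browser, the NodeManager also exposes the same node information over its REST API on the web UI port (8042 is the YARN default; substitute the EC2 public DNS name for the placeholder):

```shell
# Fetch node info (health status, total memory/vcores, NM version) as JSON
curl -s http://<nodemanager-address>:8042/ws/v1/node/info
```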