About jstraub

jstraub · ‎10-28-2015

Be careful with the docs, though: if you follow the procedure as stated in the documentation (=> ambari-server setup), Ambari will also ask you about changing user accounts and database configurations. Although the DB scripts are not executed again, this might be a bit confusing.

jstraub · ‎10-27-2015

Hi Cassandra, we did this upgrade last week on an HDP 2.3.2 / Ambari 2.1.2 cluster. We basically installed the Java JDK 1.8.0_60 including the latest JCE files, changed the paths in the /etc/ambari-server/conf/ambari.properties file and restarted the Ambari server. Make sure you restart all services and check the ps output to confirm that all services picked up the new JDK. I am not familiar with serverSetup.py, but the above procedure worked and all functionality is running. (Engineering confirmed that the JDK 1.8.0_60 upgrade should not produce any problems at the moment.)

Jonas
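The "check the ps output" step can be scripted. The following is only a minimal sketch: the function names and the JDK path /usr/jdk64/jdk1.8.0_60 are assumptions for illustration, so use the path you actually set in ambari.properties.

```python
import subprocess


def java_processes_not_on_jdk(ps_lines, expected_java_home):
    """Return the ps lines of Java processes not started from expected_java_home."""
    prefix = expected_java_home.rstrip("/") + "/"
    stale = []
    for line in ps_lines:
        # Look for the first token that is a java binary, e.g. /usr/jdk64/.../bin/java
        java_bins = [tok for tok in line.split() if tok.endswith("/bin/java")]
        if java_bins and not java_bins[0].startswith(prefix):
            stale.append(line)
    return stale


def list_running_commands():
    """Return the full command line of every running process (one per list entry)."""
    result = subprocess.run(
        ["ps", "-eo", "args"], capture_output=True, text=True, check=True
    )
    return result.stdout.splitlines()


# Example usage on a cluster node (the path is an assumed example):
#   for line in java_processes_not_on_jdk(list_running_commands(),
#                                         "/usr/jdk64/jdk1.8.0_60"):
#       print("still on old JDK:", line)
```

Any line that is printed belongs to a service that has not been restarted yet or is still configured with the old JDK path.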

jstraub · ‎10-27-2015

What is the purpose of the following two configuration parameters in mapred-site.xml? What are the recommended values?

mapreduce.input.fileinputformat.split.minsize
mapreduce.input.fileinputformat.split.maxsize

Thanks 🙂

jstraub · ‎10-27-2015

Add the following variable to the yarn-log4j configuration via Ambari:

export YARN_ROOT_LOGGER=DEBUG,RFA

Be careful, this will fill up the ResourceManager log with a lot of details 🙂

jstraub · ‎10-26-2015

I think the command has changed, it's not hdfs dfsadmin anymore. Try this:

Set a storage policy to a file or a directory.
hdfs storagepolicies -setStoragePolicy -path <path> -policy <policy>

Get the storage policy of a file or a directory.
hdfs storagepolicies -getStoragePolicy -path <path>

Source: https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/ArchivalStorage.html#Set_Storage_Policy
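If you want to script these two commands, a thin wrapper is enough. This is just a sketch: the helper names are made up for illustration, and it assumes the hdfs client is on the PATH of the user running it.

```python
import subprocess


def set_storage_policy_cmd(path, policy):
    """Build the argument list for: hdfs storagepolicies -setStoragePolicy ..."""
    return ["hdfs", "storagepolicies", "-setStoragePolicy",
            "-path", path, "-policy", policy]


def get_storage_policy_cmd(path):
    """Build the argument list for: hdfs storagepolicies -getStoragePolicy ..."""
    return ["hdfs", "storagepolicies", "-getStoragePolicy", "-path", path]


def run(cmd):
    """Run a command and return its stdout; raises CalledProcessError on failure."""
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


# Example usage (path and policy are placeholders for your own values):
#   run(set_storage_policy_cmd("/data/archive", "COLD"))
#   print(run(get_storage_policy_cmd("/data/archive")))
```

Passing the arguments as a list avoids shell quoting problems with unusual path names.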

jstraub · ‎10-26-2015

Can you check whether the users are really in the defined groups? (run: hdfs groups <userid>) Are you on Ranger 0.4 or 0.5? My group permissions work on Ranger 0.5, and I haven't seen any issues so far.

jstraub · ‎10-23-2015

The minimum is 2 letters, the maximum is > 30 letters. I just tested it on my cluster. What version of Ranger do you have?

jstraub · ‎10-23-2015

What do you mean by "is not working correctly"? Any errors in the UI or in the log? It should be possible, I have created users like solr, admin, user1, ...

jstraub · ‎10-23-2015

One task everybody faces when setting up a new Hadoop cluster is the allocation of services. Administrators of an existing cluster, on the other hand, might ask themselves: how are my services allocated? I have discussed the visualization of HDP clusters and services quite often recently and therefore decided to share my application for visualizing the current and future state of a cluster (see the link to the hosted app at the end of the article).

What does Service Allocation mean?

Planning a Hadoop cluster involves many steps and tasks that need to be considered. Almost no setup is the same (although there are some similarities). The service allocation is the part that tells you which services and components will run on which node, and how many nodes you have or need. This can be quite tedious and difficult: not all services play along well with each other, every service has different hardware/setup requirements, and adding many services can get confusing. This makes it even more important to have a sound overview of your service allocation.

To plan, document and visualize the service allocation or a complete Hadoop cluster, I have used paper sketches, Excel sheets, text files, PowerPoint slides, Photoshop and other tools. However, these approaches are often time-consuming, hard to edit or re-use, and in general not the best option. In need of a proper tool, I created this rather small and simple Angular application (at least it was at the beginning), which visualizes a cluster using a simple JSON document as its input source (see below).

There are three ways to create a cluster visualization:

- Export a live cluster via Ambari's API
- Create a cluster by writing a JSON document as seen below
- Build a new cluster with the latest drag-n-drop build feature

Let's say we have a cluster with 2 master nodes, 1 data node and a couple of different services.
The cluster is defined as:

    {
      "stack_version": "HDP-2.2",
      "security_type": "KERBEROS",
      "name": "bigdata",
      "hosts_info": [
        {
          "host_name": "c4068.ambari.apache.org",
          "components": [
            "NAMENODE", "RESOURCEMANAGER", "APP_TIMELINE_SERVER", "HISTORYSERVER",
            "TEZ_CLIENT", "YARN_CLIENT", "HDFS_CLIENT", "HIVE_CLIENT", "MAPREDUCE2_CLIENT"
          ]
        },
        {
          "host_name": "c4069.ambari.apache.org",
          "components": [
            "SECONDARY_NAMENODE", "HIVE_METASTORE", "HIVE_SERVER", "HCAT", "WEBHCAT_SERVER",
            "TEZ_CLIENT", "YARN_CLIENT", "HDFS_CLIENT", "HIVE_CLIENT", "MAPREDUCE2_CLIENT"
          ]
        },
        {
          "host_name": "c4070.ambari.apache.org",
          "components": [
            "DATANODE", "NODEMANAGER",
            "TEZ_CLIENT", "YARN_CLIENT", "HDFS_CLIENT", "HIVE_CLIENT", "MAPREDUCE2_CLIENT"
          ]
        }
      ]
    }

As soon as the cluster is imported, you can choose between three views.

Design flexibility through Environments

Environments are exportable stack templates that contain information about the available services and components as well as their groups and colors. To customize the visualization configuration (colors, sorting, ...), you can edit the services and components within the application or in the exported Environment (JSON). This makes it possible to use different output formats for specific clusters, departments, companies and so on, simply by importing the matching environment when a cluster is imported.

Why you might find this app useful:

- Planning a new cluster
- Easy Ambari Blueprint generation
- Visualizing a cluster for a concept or documentation
- Quick overview of a cluster (e.g. for support, sysadmins, ...)
- Consistent visualization/documentation
- ...

If more people are interested in this project, I will add new features. For example:

- Filter by node groups (type of node or service or any custom group)
- Group nodes (Master, Worker, Edge, ...)
- Implement as Ambari View (?)
- ...

I hope some might find this tool useful.
Looking forward to your feedback 🙂

You can find more screenshots here: https://github.com/mr-jstraub/ambari-node-view/tree/master/screens
Project & Setup: https://github.com/mr-jstraub/ambari-node-view

The above article mainly focused on version 0.3. Since then, a new version has been released with exciting new features. Read more in the next section below.

Export, Build, Visualize and Deploy - What's new in v0.4

Since the above article was published in October, a lot of changes have been made and the web application has been heavily extended. In this short section I only want to touch on the latest enhancements; more details will follow in an additional article.

What's New?

- The nodes and their services/components have been completely redesigned/restructured
- Added an option to switch between full names and acronyms (e.g. Namenode and NN)
- New data structure for nodes. Nodes can have multiple hostnames now; this is a major change, since it reduces the data footprint immensely and allows the creation of simpler cluster templates
- Build a Cluster! - A drag-n-drop based user interface to build a cluster
- Blueprint Generator! - Generate Ambari Blueprints directly from imported or built clusters

Build a Cluster - New -

This is definitely one of my favorite features. Instead of writing JSON templates to plan and visualize a cluster, or exporting an existing cluster (although this is the easiest way), it is now possible to build a new cluster by using drag-n-drop. The tool supports up to 1000 nodes, dynamic hostnames, HDFS & YARN HA, ...

Blueprints (Beta) - New -

Generate Ambari Blueprints directly from imported or built clusters. General and hostgroup-specific configurations can be added manually. The tool offers more than one thousand suggested configuration parameters and categories. Read more in this article about Blueprints and "Build a Cluster".
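To give an idea of what can be done with the cluster JSON from the article, here is a small Python sketch. It is not the code of the application: the function names, the host group naming scheme and the shortened sample document are assumptions made for this example. It answers "which hosts run component X?" and derives a bare Ambari Blueprint skeleton (no configurations, one host group per host).

```python
import json
from collections import defaultdict

# Shortened version of the cluster document from the article (fewer components).
CLUSTER_JSON = """
{
  "stack_version": "HDP-2.2",
  "security_type": "KERBEROS",
  "name": "bigdata",
  "hosts_info": [
    {"host_name": "c4068.ambari.apache.org",
     "components": ["NAMENODE", "RESOURCEMANAGER", "HDFS_CLIENT"]},
    {"host_name": "c4069.ambari.apache.org",
     "components": ["SECONDARY_NAMENODE", "HIVE_SERVER", "HDFS_CLIENT"]},
    {"host_name": "c4070.ambari.apache.org",
     "components": ["DATANODE", "NODEMANAGER", "HDFS_CLIENT"]}
  ]
}
"""


def hosts_by_component(cluster):
    """Invert the host -> components listing into component -> list of hosts."""
    index = defaultdict(list)
    for host in cluster["hosts_info"]:
        for component in host["components"]:
            index[component].append(host["host_name"])
    return dict(index)


def blueprint_skeleton(cluster, blueprint_name):
    """Build a minimal blueprint dict with one host group per host.

    Component names are copied unchanged and not validated against the stack;
    security settings and configurations are left out on purpose.
    """
    stack_name, _, stack_version = cluster["stack_version"].partition("-")
    host_groups = []
    for number, host in enumerate(cluster["hosts_info"], start=1):
        host_groups.append({
            "name": f"host_group_{number}",  # assumed naming scheme
            "cardinality": "1",
            "components": [{"name": name} for name in host["components"]],
        })
    return {
        "Blueprints": {
            "blueprint_name": blueprint_name,
            "stack_name": stack_name,
            "stack_version": stack_version,
        },
        "host_groups": host_groups,
    }


cluster = json.loads(CLUSTER_JSON)
component_index = hosts_by_component(cluster)
blueprint = blueprint_skeleton(cluster, "bigdata-blueprint")
```

Printing json.dumps(blueprint, indent=2) gives a document you can compare with the blueprint the tool generates; hosts with identical component lists could be merged into one host group, which this sketch does not attempt.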

jstraub · ‎10-22-2015

Good point! But I think if you run into an OOM issue, even the Capacity Scheduler wouldn't help. It's probably a good idea to validate the YARN configuration, especially the memory settings.


Member Since	‎09-15-2015 02:21 PM
Last Visited	‎08-18-2019 08:21 AM
Posts	457
Kudos received	472

Cloudera Community

Re: NiFi: How do I see the flowfile attributes nam...

Re: NiFi: JSON Array split

Re: Securing Solr with Ranger ERROR 500

Re: Is Ambari Infra open source?

Re: After disabling kerberos , ZKfailover not comi...

Re: Is there a set way to change jdk from 1.8.0_40...

Re: Is there a set way to change jdk from 1.8.0_40...

Recommended config: mapreduce.input.fileinputforma...

Re: How to Modify the Verbosity of Yarn Logs

Re: How do I configure HDFS storage types and poli...

Re: Ranger group policy not being applied to the u...

Re: Is there a minimum length for a user name in R...

Re: Is there a minimum length for a user name in R...

Visualize Cluster and Service Allocation - Reloade...

Re: One dead big job blocks all jobs