Created on 09-26-2016 05:03 PM - edited 08-17-2019 09:40 AM
This article, "Configure SAP Vora HDP Ambari - Part 2", is a continuation of "Getting started with SAP Hana and Vora with HDP using Apache Zeppelin for Data Analysis - Part 1 In...
Log back in to SAP Cloud Appliance Library - the free service to manage your SAP solutions in the public cloud. You should have HANA and Vora instances up and running:
The Ambari web UI port has been preconfigured for you in SAP HANA Vora, developer edition, in CAL. The port has also been opened as one of the default Access Points. As you might remember, this translates into the appropriate inbound rule in the corresponding AWS security group.
Log in to Ambari with the user admin and the master password you defined when you created the instance in CAL.
You use this interface to start/stop cluster components if needed during operations or troubleshooting.
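The same start/stop operations can also be scripted against Ambari's REST API. The following is a minimal sketch, not part of the original walkthrough: the host name, cluster name, and password are hypothetical placeholders, and port 8080 is assumed to be the Ambari port on your instance. It only builds the request, so nothing is sent unless you uncomment the last line.

```python
import base64
import json
import urllib.request


def build_service_state_request(base_url, cluster, service, state, user, password):
    """Build (but do not send) an Ambari REST request that changes a service state.

    state: "STARTED" starts the service, "INSTALLED" stops it.
    """
    if state not in ("STARTED", "INSTALLED"):
        raise ValueError("state must be 'STARTED' or 'INSTALLED'")
    url = f"{base_url}/api/v1/clusters/{cluster}/services/{service}"
    body = {
        "RequestInfo": {"context": f"Set {service} to {state} via REST"},
        "Body": {"ServiceInfo": {"state": state}},
    }
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode(),
        method="PUT",
        headers={
            "Authorization": f"Basic {token}",
            # Ambari rejects modifying requests that lack this header.
            "X-Requested-By": "ambari",
        },
    )


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    req = build_service_state_request(
        "http://ambari.example.com:8080", "mycluster", "YARN",
        "INSTALLED", "admin", "secret",
    )
    print(req.get_method(), req.full_url)
    # Uncomment to actually send the request:
    # urllib.request.urlopen(req)
```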
Please refer to the official Apache Ambari documentation if you need additional information or training on how to use it.
For a detailed review of all SAP HANA Vora components and their purpose, please see the SAP HANA Vora help.
We need to make a few configuration changes to get the HDFS View working in Ambari and to adjust the YARN scheduler.
Setup HDFS Ambari View:
Creating and Configuring a Files View Instance
|Property||Description||Example value|
|Instance Name||This is the Files view instance name. This value should be unique for all Files view instances you create. This value cannot contain spaces and is required.|| |
|Display Name||This is the name of the view link displayed to the user in Ambari Web.||MyFiles|
|Description||This is the description of the view displayed to the user in Ambari Web.||Browse HDFS files and directories.|
|Visible||This checkbox determines whether the view is displayed to users in Ambari Web.||Visible or Not Visible|
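The rules in the table above (Instance Name is required, unique, and free of spaces) can be checked before you fill in the form. This is only an illustrative sketch: the `FilesViewInstance` class, the `validate_instance` helper, and the instance name `HDFS_FILES` are hypothetical names, not part of Ambari.

```python
from dataclasses import dataclass


@dataclass
class FilesViewInstance:
    """Settings entered when creating a Files view instance."""
    instance_name: str
    display_name: str
    description: str = ""
    visible: bool = True


def validate_instance(candidate, existing_names):
    """Return a list of problems; an empty list means the settings look valid."""
    problems = []
    if not candidate.instance_name:
        problems.append("Instance Name is required")
    elif any(ch.isspace() for ch in candidate.instance_name):
        problems.append("Instance Name cannot contain spaces")
    elif candidate.instance_name in existing_names:
        problems.append("Instance Name must be unique")
    return problems


if __name__ == "__main__":
    view = FilesViewInstance(
        instance_name="HDFS_FILES",  # hypothetical name
        display_name="MyFiles",
        description="Browse HDFS files and directories.",
    )
    print(validate_instance(view, existing_names=set()))
```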
You should now see the new HDFS view in Ambari.
Now let's test that you can open the HDFS view and browse files.
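If you prefer to double-check outside the Ambari UI, HDFS also exposes directory listings through the WebHDFS REST API (`op=LISTSTATUS`). The sketch below builds such a URL and parses a response. The NameNode address, port 50070, the path `/user/vora`, and the file names in the sample response are assumptions for illustration, and the port may not be open on your CAL instance.

```python
import json
from urllib.parse import quote, urlencode


def liststatus_url(namenode, path, user=None):
    """Build a WebHDFS LISTSTATUS URL for an absolute HDFS path."""
    if not path.startswith("/"):
        raise ValueError("path must be absolute")
    params = {"op": "LISTSTATUS"}
    if user:
        params["user.name"] = user
    return f"{namenode}/webhdfs/v1{quote(path)}?{urlencode(params)}"


def parse_liststatus(payload):
    """Turn a LISTSTATUS JSON response into (name, type, size) tuples."""
    statuses = json.loads(payload)["FileStatuses"]["FileStatus"]
    return [(s["pathSuffix"], s["type"], s["length"]) for s in statuses]


if __name__ == "__main__":
    # Hypothetical NameNode address and path.
    print(liststatus_url("http://namenode.example.com:50070", "/user/vora", user="vora"))

    # Hand-written sample response in the documented LISTSTATUS shape.
    sample = json.dumps({"FileStatuses": {"FileStatus": [
        {"pathSuffix": "data.csv", "type": "FILE", "length": 1024},
        {"pathSuffix": "archive", "type": "DIRECTORY", "length": 0},
    ]}})
    for name, kind, size in parse_liststatus(sample):
        print(f"{kind:<10} {size:>8} {name}")
```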
Next we will reconfigure YARN to fix an issue that occurs when submitting YARN jobs. I hit it when running a Sqoop job to import data from SAP HANA to HDFS (this will be covered in a separate how-to article published soon):
YarnApplicationState: ACCEPTED: waiting for AM container to be allocated, launched and register with RM.
The job stayed stuck in this state for a while.
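To spot applications stuck in this state, you can query the ResourceManager REST API, which lists applications filtered by state. The sketch below is an illustration only: the ResourceManager address and port 8088 are assumptions, and the sample payload is hand-written in the shape that API returns.

```python
import json


def apps_url(rm_base, states=("ACCEPTED",)):
    """Build the ResourceManager REST URL listing apps in the given states."""
    return f"{rm_base}/ws/v1/cluster/apps?states={','.join(states)}"


def stuck_apps(payload, min_wait_ms, now_ms):
    """Return ids of apps that have sat in ACCEPTED for at least min_wait_ms."""
    # "apps" is null in the response when no application matches.
    apps = (json.loads(payload).get("apps") or {}).get("app", [])
    return [
        a["id"]
        for a in apps
        if a["state"] == "ACCEPTED" and now_ms - a["startedTime"] >= min_wait_ms
    ]


if __name__ == "__main__":
    # Hypothetical ResourceManager address.
    print(apps_url("http://rm.example.com:8088"))

    # Hand-written sample; ids and timestamps are made up.
    sample = json.dumps({"apps": {"app": [
        {"id": "application_1_0001", "state": "ACCEPTED", "startedTime": 1_000},
        {"id": "application_1_0002", "state": "ACCEPTED", "startedTime": 590_000},
    ]}})
    print(stuck_apps(sample, min_wait_ms=300_000, now_ms=600_000))
```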
Let's set yarn.scheduler.capacity.maximum-am-resource-percent=0.6. Go to YARN -> Configs and look for the property yarn.scheduler.capacity.maximum-am-resource-percent.
|yarn.scheduler.capacity.maximum-am-resource-percent / yarn.scheduler.capacity.<queue-path>.maximum-am-resource-percent||Maximum percent of resources in the cluster which can be used to run application masters; it controls the number of concurrent active applications. Limits on each queue are directly proportional to their queue capacities and user limits. Specified as a float, i.e. 0.5 = 50%. Default is 10%. This can be set for all queues with yarn.scheduler.capacity.maximum-am-resource-percent and can also be overridden on a per-queue basis by setting yarn.scheduler.capacity.<queue-path>.maximum-am-resource-percent|
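A rough calculation shows why raising this value unblocks the job. The sketch below is a deliberately simplified model: it ignores user limits, container size rounding, and other Capacity Scheduler rules, and the cluster memory and application master (AM) container size are hypothetical numbers, not values measured on the CAL instance.

```python
import math


def max_concurrent_ams(cluster_memory_mb, am_percent, am_container_mb, queue_capacity=1.0):
    """Estimate how many application masters fit in the AM resource budget.

    Simplified model: budget = cluster memory * queue capacity * am_percent.
    """
    if not 0 < am_percent <= 1:
        raise ValueError("am_percent must be in (0, 1]")
    am_budget_mb = cluster_memory_mb * queue_capacity * am_percent
    return math.floor(am_budget_mb / am_container_mb)


if __name__ == "__main__":
    # Hypothetical single-node cluster: 16 GB for YARN, 1.5 GB per AM container.
    for percent in (0.1, 0.6):
        n = max_concurrent_ams(16384, percent, 1536)
        print(f"maximum-am-resource-percent={percent}: about {n} concurrent AM(s)")
```

Under these assumed numbers, the default of 0.1 leaves room for only one application master, so a second job waits in ACCEPTED while another application is running; 0.6 leaves room for several.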
Now let's connect to Apache Zeppelin and load sample data from files already created in HDFS in SAP HANA Vora.
SAP HANA Vora provides its own %vora interpreter, which allows Spark/Vora features to be used from Zeppelin. Zeppelin allows queries to be written directly in Spark SQL.
The 0_DemoData notebook will open up. Now you can click the Run all paragraphs button at the top of the page to create tables in SAP HANA Vora using data from the existing HDFS files preloaded on the instance in CAL. You will also need these tables later in the exercises.
A dialog window will pop up asking you to confirm: Run all paragraphs? Click OK.
The Vora code will load the .csv files and create tables in Vora Spark. You can navigate to the HDFS files using the view created earlier to preview the data right on HDFS.
At this point we have set up an Ambari HDFS view to browse the distributed file system on HDP and verified that the Vora connectivity to HDFS is working.
Stay tuned for the next article, "How to connect SAP Vora to SAP HANA using Apache Zeppelin", where we will use Apache Zeppelin to connect to the SAP HANA system from Part 1.