Member since: 01-07-2019
Posts: 217
Kudos Received: 135
Solutions: 18
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 2081 | 12-09-2021 09:57 PM
 | 1942 | 10-15-2018 06:19 PM
 | 9406 | 10-10-2018 07:03 PM
 | 4215 | 07-24-2018 06:14 PM
 | 1539 | 07-06-2018 06:19 PM
08-09-2017
09:30 AM
@Dominika Bialek (Q1): It is assumed to be the name of an already existing bucket. (Q2): It is an optional (and deprecated) configuration parameter (mapped to "fs.gs.system.bucket" in core-site.xml) that sets the GCS bucket to use as the default bucket for URIs without the "gs:" prefix. Please see more here and here and here. Hope this helps!
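For illustration only (the bucket name below is a placeholder), the corresponding core-site.xml entry would look like this:
<property>
  <name>fs.gs.system.bucket</name>
  <value>my-existing-bucket</value>
</property>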
06-16-2017
07:55 PM
2 Kudos
With HDCloud for AWS 1.16 and Cloudbreak 1.16.1, you can optionally use Hortonworks flex subscriptions to cover the controller and all clusters created. To do that you must:
Obtain a flex subscription. For general information about the Hortonworks Flex Support Subscription, visit the Hortonworks Support page at https://hortonworks.com/services/support/enterprise/.
Configure SmartSense ID and Telemetry when launching the cloud controller (HDCloud for AWS) or Cloudbreak.
Register your flex subscription in the cloud controller or Cloudbreak web UI.
When creating a cluster, select a flex subscription to use for the cluster.

Configure and Manage Flex in HDCloud
1. To configure flex in HDCloud, you must provide your SmartSense ID and Telemetry Opt In when launching the cloud controller. These options are available in the SmartSense Configuration section of the CloudFormation template.
2. Once you log in to the cloud controller web UI, you can manage your flex subscriptions from Settings > Manage Flex Subscriptions.
3. From this page, you can:
Register a new flex subscription. The first subscription that you add is used as the default for the cloud controller and all clusters created by it.
Set a default flex subscription (if using multiple subscriptions).
Select a flex subscription to be used for the cloud controller (if using multiple subscriptions).
Delete a flex subscription.
Check which clusters are connected to a specific subscription.
4. When creating a cluster, you will see under GENERAL CONFIGURATION > SHOW ADVANCED OPTIONS that your default flex subscription is automatically assigned to the cluster.
5. Equivalent options are also available via the HDC CLI. See https://docs.hortonworks.com/HDPDocuments/HDCloudAWS/HDCloudAWS-1.16.0/bk_hdcloud-aws/content/cli-using/index.html#managing-flex-subscriptions.

Configure and Manage Flex in Cloudbreak
1. To configure flex in Cloudbreak, enable SmartSense in the Profile by adding the following variables:
export CB_SMARTSENSE_ID=YOUR-SMARTSENSE-ID
export CB_SMARTSENSE_CONFIGURE=true
For example:
export CB_SMARTSENSE_ID=A-00000000-C-00000000
export CB_SMARTSENSE_CONFIGURE=true
You can do this in one of two ways:
- When initiating Cloudbreak
- After you've already initiated the Cloudbreak Deployer. In this case, you must restart Cloudbreak using cbd restart (see the sketch at the end of this article).
2. Once you log in to the Cloudbreak web UI, you can manage your flex subscriptions from the Manage Flex Subscriptions tab.
3. From this page, you can:
Register a new flex subscription.
Set a default flex subscription.
Select a flex subscription to be used for the cloud controller.
Delete a flex subscription.
Check which clusters are connected to a specific subscription.
4. When creating a cluster using the advanced options, under CONFIGURE CLUSTER > Flex Subscriptions, you can select the flex subscription that you want to use.
5. Equivalent options are also available via the Cloudbreak Shell.

Hortonworks Documentation:
https://docs.hortonworks.com/HDPDocuments/HDCloudAWS/HDCloudAWS-1.16.0/bk_hdcloud-aws/content/flex/index.html
http://sequenceiq.com/cloudbreak-docs/release-1.16.1/help/
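As a minimal sketch (assuming you run this from the Cloudbreak deployment directory that contains the Profile file), enabling flex on an already-initiated deployment could look like this:
# append the SmartSense variables to the Profile, then restart Cloudbreak
echo 'export CB_SMARTSENSE_ID=A-00000000-C-00000000' >> Profile
echo 'export CB_SMARTSENSE_CONFIGURE=true' >> Profile
cbd restart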
06-16-2017
07:35 AM
"\"Hortonworks Data Cloud - HDP Services\" product is designed to be used in conjunction with the "Hortonworks Data Cloud - Controller Service" "After you have subscribed to this product, you should also subscribe to the "Hortonworks Data Cloud - Controller Service" Marketplace product and use that product to first launch the cloud controller service." source: https://aws.amazon.com/marketplace/pp/B01M193KGR "To launch Hortonworks Data Cloud for AWS, you need to subscribe to the following Hortonworks Data Cloud products in AWS Marketplace:
Hortonworks Data Cloud - Controller Service Hortonworks Data Cloud - HDP Services" source: https://docs.hortonworks.com/HDPDocuments/HDCloudAWS/HDCloudAWS-1.16.0/bk_hdcloud-aws/content/subscribe/index.html
06-03-2017
08:14 PM
Pretty informative and useful. Thanks @Dominika Bialek for writing this. Keep it up!
06-02-2017
06:24 PM
7 Kudos
Overview
To control access, Azure uses Azure Active Directory (Azure AD), a multi-tenant cloud-based directory and identity management service. To learn more, refer to
https://docs.microsoft.com/en-us/azure/active-directory/active-directory-whatis.
In short, to configure authentication with ADLS using the client credential, you must register a new application with the Active Directory service and then give your application access to your ADL account. After you've performed these steps, you can configure your core-site.xml.
Note for Cloudbreak users: When you create a cluster with Cloudbreak, you can configure authentication with ADLS on the "Add File System" page of the create cluster wizard and then perform one additional step as described in the Cloudbreak documentation. If you do this, you do not need to perform the steps below. If you have already created a cluster with Cloudbreak but did not configure ADLS on the "Add File System" page of the create cluster wizard, follow the steps below.

Prerequisites
1. To use ADLS storage, you must have a subscription for Data Lake Storage.
2. To access ADLS data in HDP, you must have an HDP version that supports it. I am using HDP 2.6.1, which supports connecting to ADLS using the ADL connector.

Step 1: Register an application
1. Log in to the Azure Portal at
https://portal.azure.com/.
2. Navigate to your Active Directory and then select App Registrations.
3. Create a new web application by clicking +New application registration.
4. Specify an application name, type (Web app/API), and sign-on URLs. Remember the application name: you will later add it to your ADLS account as an authorized user.
5. Once the application is created, navigate to the application configuration and find the Keys in the application's settings.
6. Create a key by entering a key description, selecting a key duration, and then clicking Save. Make sure to copy and save the key value; you won't be able to retrieve it after you leave the page.
7. Write down the properties that you will need to authenticate: the application ID, the key, and the token endpoint.

Step 2: Assign permissions to your application
1. Log in to the Azure Portal.
2. If you don't have an ADL account, create one.
3. Navigate to your ADL account and then select Access Control (IAM).
4. Click +Add to add role-based permissions.
5. Under Role, select "Owner". Under Select, select your application. This grants the "Owner" role for this ADL account to your application.
Note: If you are not able to assign the "Owner" role, you can instead set fine-grained RWX ACL permissions for your application, allowing it access to the files and folders of your ADLS account, as documented here.
Note: If you are using a corporate Azure account, you may be unable to perform the role assignment step. In this case, contact your Azure admin to perform this step for you.
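If you prefer scripting over the portal, a rough Azure CLI equivalent of the role assignment is sketched below. This is an assumption on my part rather than part of the official steps; the subscription ID, resource group, and application ID are placeholders, and it assumes the az CLI is installed and logged in:
# hypothetical sketch: grant the registered application the Owner role on the ADL account
az role assignment create \
  --assignee APPLICATION-ID \
  --role Owner \
  --scope /subscriptions/SUBSCRIPTION-ID/resourceGroups/MY-RESOURCE-GROUP/providers/Microsoft.DataLakeStore/accounts/mytest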
Step 3: Configure core-site.xml
1. Add the following four properties to your core-site.xml. While "fs.adl.oauth2.access.token.provider.type" must be set to "ClientCredential", you can obtain the remaining three parameters from step 7 above.
<property>
  <name>fs.adl.oauth2.access.token.provider.type</name>
  <value>ClientCredential</value>
</property>
<property>
  <name>fs.adl.oauth2.client.id</name>
  <value>APPLICATION-ID</value>
</property>
<property>
  <name>fs.adl.oauth2.credential</name>
  <value>KEY</value>
</property>
<property>
  <name>fs.adl.oauth2.refresh.url</name>
  <value>TOKEN-ENDPOINT</value>
</property>
2. (Optional) It's recommended that you protect your credentials with credential providers. For instructions, refer to https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.1/bk_cloud-data-access/content/adls-protecting-credentials.html.
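As a minimal sketch of that approach (the JCEKS path below is just an example; the command prompts you for the secret value):
# hypothetical example: store the ADLS key in a Hadoop credential provider instead of plain text
hadoop credential create fs.adl.oauth2.credential -provider jceks://hdfs/user/hdfs/adls.jceks
You would then reference that store via hadoop.security.credential.provider.path in core-site.xml rather than keeping the raw key in fs.adl.oauth2.credential.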
Step 4: Validate access to ADLS
To make sure that the authentication works, try accessing data. To test access, SSH to any cluster node, switch to the hdfs user by using sudo su hdfs, and then try accessing your data. The URL structure is:
adl://<data_lake_store_name>.azuredatalakestore.net/dir/file
For example, to access "testfile" located in a directory called "testdir", stored in a data lake store called "mytest", the URL is:
adl://mytest.azuredatalakestore.net/testdir/testfile
The following FileSystem shell commands demonstrate access to a data lake store named mytest:
hadoop fs -ls adl://mytest.azuredatalakestore.net/
hadoop fs -mkdir adl://mytest.azuredatalakestore.net/testDir
hadoop fs -put testFile adl://mytest.azuredatalakestore.net/testDir/testFile
hadoop fs -cat adl://mytest.azuredatalakestore.net/testDir/testFile
test file content
Learn more
For more information about working with ADLS, refer to Getting Started with ADLS in Hortonworks documentation.
06-02-2017
04:25 PM
1 Kudo
We are excited to introduce the new Cloud Data Access guide for HDP 2.6.1. The goal of this guide is to provide information and steps required for configuring, using, securing, tuning performance, and troubleshooting access to the cloud storage services using HDP cloud storage connectors available for Amazon Web Services (Amazon S3) and Microsoft Azure (ADLS, WASB). To learn about the architecture of the cloud connectors, refer to Introducing the Cloud Storage Connectors. To get started with your chosen cloud storage service, refer to:
Getting Started with Amazon S3
Getting Started with ADLS
Getting Started with WASB
Once you have configured authentication with the chosen cloud storage service, you can start working with the data. To get started, refer to:
Accessing Cloud Data with Hive
Accessing Cloud Data with Spark
Copying Cloud Data with Hadoop
If you have comments, suggestions, corrections, or updates regarding our documentation, let us know on HCC. Help us continue to improve our documentation!
Thanks!
Hortonworks Technical Documentation Team
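As an illustrative taste of the kind of commands those chapters cover (the bucket and paths below are placeholders), copying data from HDFS to S3 with DistCp looks roughly like this:
# hypothetical example: copy a directory from HDFS to an S3 bucket using DistCp
hadoop distcp /source/dir s3a://my-bucket/dest/dir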
07-21-2017
11:06 AM
@Dominika Bialek Thanks for checking in. I ended up switching regions.
04-05-2017
05:25 PM
5 Kudos
HDCloud for AWS general availability version 1.14.1 is now available, including six new HDP 2.6 and Ambari 2.5 cluster configurations and new cloud controller features. If you are new to HDCloud, you can get started using this tutorial (updated for 1.14.1). Official HDCloud for AWS documentation is available here.

HDP 2.6 and Ambari 2.5
The following HDP 2.6 configurations are now available. For the list of all available HDP 2.5 and HDP 2.6 configurations, refer to the Cluster Configurations documentation.

Resource Tagging
When creating a cluster, you can optionally add custom tags that will be displayed on the CloudFormation stack and on EC2 instances, allowing you to keep track of the resources that the cloud controller creates on your behalf. For more information, refer to the Resource Tagging documentation.

Node Auto Repair
The cloud controller monitors clusters by checking for the Ambari Agent heartbeat on all cluster nodes. If the Ambari Agent heartbeat is lost on a node, a failure is reported for that node. Once the failure is reported, it is fixed automatically (if auto repair is enabled), or options are available for you to fix the failure manually (if auto repair is disabled). You can configure auto repair settings for each cluster when you create it. For more information, refer to the Node Auto Repair documentation.

Auto Scaling
Auto Scaling provides the ability to increase or decrease the number of nodes in a cluster according to the auto scaling policies that you define. After you create an auto scaling policy, the cloud controller will execute the policy when the conditions that you specified are met. You can create an auto scaling policy when creating a cluster, or you can manage the auto scaling settings and policies once the cluster is already running. For more information, refer to the Auto Scaling documentation.

Protected Gateway
HDCloud now configures a protected gateway on the cluster master node. This gateway is designed to provide access to various cluster resources from a single network port.

Shared Druid Metastore (Technical Preview)
When creating an HDP 2.6 cluster based on the BI configuration, you have the option to have a Druid metastore database created with the cluster, or you can use an external Druid metastore that is backed by Amazon RDS. Using an external Amazon RDS database for a Druid metastore allows you to preserve the Druid metastore metadata and reuse it between clusters. For more information, refer to the Managing Shared Metastores documentation.

These features are available via the cloud controller UI or CLI.
03-27-2017
07:57 PM
2 Kudos
Thanks @Ram Venkatesh. After registering the metastore, I can see the named entry in my JSON. Thank you!