Member since: 02-07-2019
Posts: 1792
Kudos Received: 1
Solutions: 0
03-29-2022
07:40 AM
@aarif, thanks for reaching out. You will have a better chance of getting responses if you post your query in the Support Questions section, along with a link to this article. Experts following the relevant components and labels will be able to help you quickly.
07-15-2021
07:40 AM
@renzhongpei In cluster mode, the log4j properties file can also be placed on an HDFS location and referenced from there in the --files argument of the spark-submit script: --files hdfs://namenode:8020/log4j-driver.properties#log4j-driver.properties
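A fuller invocation might look like the sketch below. This is a minimal example under assumptions: the NameNode address, application class, and JAR name are hypothetical placeholders, and passing the file alias to the driver via spark.driver.extraJavaOptions assumes a log4j 1.x configuration.

```shell
# Ship a log4j properties file from HDFS to the driver container in
# cluster mode. The fragment after '#' is the local alias the file
# receives in the container's working directory.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --files "hdfs://namenode:8020/log4j-driver.properties#log4j-driver.properties" \
  --conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=log4j-driver.properties" \
  --class com.example.MyApp \
  myapp.jar
```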
09-27-2020
11:14 PM
In this video, we'll review how to access data in S3 from the command line of a Data Hub cluster host using IDBroker. Some components in CDP work out of the box with IDBroker. However, most command-line tools like the Hadoop file system commands require a couple of additional steps to access data in S3. We'll demonstrate retrieving a keytab file for a workload user and using it to kinit on the Data Hub cluster host, enabling data access via IDBroker.
Open the video on YouTube here
Many command-line tools in CDP Public Cloud Data Hub clusters require a Kerberos ticket granting ticket (TGT) for a workload user in order to obtain a short-term access token for S3 or ADLS Gen 2 via IDBroker.
This video demonstrates the following steps:
Granting a data access role to a workload user
Retrieving a keytab file for the workload user
Copying the keytab file to a host in the data hub cluster
Using the keytab file to kinit
Confirming the TGT using klist
Accessing data in S3 via IDBroker
It mentions, but does not demonstrate, retrieving a keytab file via the cdp command-line tool. Instructions for doing so are available in CDP documentation.
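The command-line portion of the steps above can be sketched as follows. This is a rough outline under assumptions: the keytab filename, cluster hostname, workload user, and S3 bucket are hypothetical placeholders, and the Kerberos realm may need to be appended to the principal in your environment.

```shell
# Copy the keytab (retrieved from the CDP Management Console or via
# the cdp CLI) to a Data Hub cluster host.
scp wuser.keytab wuser@datahub-host.example.com:~/

# On the cluster host, obtain a Kerberos TGT for the workload user.
kinit -kt ~/wuser.keytab wuser

# Confirm the TGT was issued.
klist

# Access data in S3; the S3A connector exchanges the TGT for a
# short-term access token via IDBroker.
hdfs dfs -ls s3a://example-bucket/data/
```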
08-20-2020
01:47 AM
This video covers Livy's features and operational flow, and includes a basic demo.
Open the video on YouTube here
07-30-2020
01:33 AM
This video covers the default Zeppelin UI options and includes a basic navigation demo.
References:
Cloudera Product document page: https://docs.cloudera.com/runtime/7.1...
Cloudera tutorials: https://www.cloudera.com/tutorials/ge...
Visit Apache Zeppelin Website: http://zeppelin.apache.org
Cloudera Community: https://community.cloudera.com/
07-30-2020
01:30 AM
1 Kudo
This video covers Zeppelin’s backend operations and gives an overview of impersonation concepts in Zeppelin.
References:
Cloudera Product document page: https://docs.cloudera.com/runtime/7.1...
Cloudera tutorials: https://www.cloudera.com/tutorials/ge...
Visit Apache Zeppelin Website: http://zeppelin.apache.org
Cloudera Community: https://community.cloudera.com/
07-30-2020
01:26 AM
This video gives a high-level overview of Zeppelin’s architecture and operational flow.
References:
Cloudera Product document page: https://docs.cloudera.com/runtime/7.1...
Cloudera tutorials: https://www.cloudera.com/tutorials/ge...
Visit Apache Zeppelin Website: http://zeppelin.apache.org
Cloudera Community: https://community.cloudera.com/
05-22-2020
09:45 AM
We have created a new Support Video based on this topic:
How to mask Hive columns using Atlas tags and Ranger?
05-22-2020
09:20 AM
1 Kudo
Masking of Hive columns can be achieved using Hive resource-based masking policies in Ranger for databases, tables, and columns. Dynamic masking, however, can be achieved using Atlas tags, or classifications (from HDP 3.x), which empower users to regulate the visibility of sensitive data by leveraging Atlas tag-based policies in Ranger.
Prerequisites for this tutorial include a healthy HDP cluster with existing tables and databases in Hive, Atlas configured with Hive and Ranger, and Audit to Solr enabled for Ranger.
Open the video on YouTube here
This video is based on the original article How to Mask Columns in Hive with Atlas and Ranger.
Other references: Providing Authorization with Apache Ranger
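For readers who prefer automation over the Ranger UI shown in the video, a tag-based masking policy can also be created through Ranger's public REST API. The sketch below is an assumption-laden illustration, not the method the video uses: the tag service name ("cm_tag"), tag name ("PII"), user, host, and admin credentials are all hypothetical placeholders, and the exact policy JSON fields should be checked against your Ranger version.

```shell
# Sketch: create a tag-based masking policy via the Ranger REST API,
# so any Hive column classified 'PII' in Atlas is masked for 'analyst'.
curl -u admin:admin -H "Content-Type: application/json" \
  -X POST "http://ranger-host:6080/service/public/v2/api/policy" -d '{
    "service": "cm_tag",
    "name": "mask-pii-columns",
    "policyType": 1,
    "resources": { "tag": { "values": ["PII"] } },
    "dataMaskPolicyItems": [{
      "accesses": [{ "type": "hive:select", "isAllowed": true }],
      "users": ["analyst"],
      "dataMaskInfo": { "dataMaskType": "hive:MASK" }
    }]
  }'
```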
04-21-2020
04:32 AM
This video describes how to register an HDP cluster in DataPlane:
Open the video on YouTube here
DataPlane is a portfolio of data solutions that supports the management and discovery of data (whether at rest or in motion) and enables an enterprise hybrid data strategy (from the data center to the cloud).
DataPlane is composed of a core platform (“DP Platform” or “Platform”) and an extensible set of apps (“DP Apps”) that are installed on the platform. Depending on the app you plan to use, you may need to install an agent in a cluster to support that app, as well as meet other cluster requirements.
The following are documents for reference:
Configure Knox Gateway for DataPlane
Configure Knox SSO for DataPlane