Member since
06-05-2019
128
Posts
133
Kudos Received
11
Solutions
My Accepted Solutions
Title | Views | Posted |
---|---|---|
1641 | 12-17-2016 08:30 PM | |
1226 | 08-08-2016 07:20 PM | |
2248 | 08-08-2016 03:13 PM | |
2316 | 08-04-2016 02:49 PM | |
2133 | 08-03-2016 06:29 PM |
08-02-2016
04:01 PM
@mike harding -> are you running Zeppelin (within HDP 2.5 Sandbox) or running Zeppelin standalone?
... View more
08-02-2016
03:32 PM
Hi @mike harding What version of Zeppelin are you using?
... View more
07-15-2016
11:28 PM
8 Kudos
Teradata's JDBC connector contains two jar files (tdgssconfig.jar and terajdbc4.jar) that must both be contained within the classpath. NiFi Database processors like ExecuteSQL or PutSQL use a connection pool such as DBCPConnectionPool which defines your JDBC connection to a database like Teradata. Follow the steps below to integrate Teradata JDBC connector into your DBCPConnectionPool: 1) Download the Teradata connectors (tdgssconfig.jar and terajdbc4.jar) - you can download the Teradata v1.4.1 connector on http://hortonworks.com/downloads/ 2) Extract the jar files (tdgssconfig.jar and terajdbc4.jar) from hdp-connector-for-teradata-1.4.1.2.3.2.0-2950-distro.tar.gz and move these files to your NIFI_DIRECTORY/lib/* 3) Restart NiFi 4) Under your DBCPConnectionPool (Controller > Controller Services), Edit your existing DBCPConnectionPool (if your pool is active, disable it before editing) 5) Under the Configuration Controller Service > Properties, define the following Database Connection URL: your Teradata jdbc connection url Database Driver Class Name: com.teradata.jdbc.TeraDriver Database Driver Jar Url: * Do not define anything, since you added the two jars to the NiFi classpath (nifi/lib), the driver jars will be automatically picked up -> you could only add one Jar here and you need two *which is why we added to the nifi/lib directory Database User: Provide Database user Password: Provide password for Database user You're all set, you'll now be able to connect to Teradata from NiFi!
... View more
Labels:
07-14-2016
04:14 PM
1 Kudo
Hi @Angel Kafazov If you upgraded HDP and Ambari to their latest versions, what are you worried about on updating your Ubuntu packages (since HDP and Ambari are already at their latest versions)? Referring to https://docs.hortonworks.com/HDPDocuments/Ambari-2.2.2.0/bk_Installing_HDP_AMB/content/_download_the_ambari_repo_ubuntu14.html if you want a version of Ambari later than 2.2.2, wouldn't you need to add a new repo? wget -nv http://public-repo-1.hortonworks.com/ambari/ubuntu14/2.x/updates/2.2.2.0/ambari.list -O /etc/apt/sources.list.d/ambari.list (later than 2.2.2)?
... View more
07-11-2016
03:45 PM
Hi @Pardeep Gorla I have not configured freeipa - but have configured MIT KDC, so I'll refer to MIT KDC. Are you referring to logging into an edge node with a ppk file in AWS? Once you're logged into the edge node, you'll need to run kinit (kdc user) - and provide an additional password, which will generate a Kerberos ticket. Once you have the Kerberos ticket, you'll be able to access HDFS (and other services integrated with Kerberos). If something goes wrong in the automated setup, you can simply disable Kerberos -> where the services will shutdown, remove their Kerberos permissions and then start back up.
... View more
06-29-2016
01:54 PM
5 Kudos
Security is a key element when discussing Big Data. A common requirement with security is data encryption. By following the instructions below, you'll be able to setup transparent data encryption in HDFS on defined directories otherwise known as encryption zones "EZ". Before starting this step-by-step tutorial, there are three HDP services that are essential (must be installed): 1) HDFS 2) Ranger 3) Ranger KMS Step 1: Prepare environment As explained in the HDFS "Data at Rest" Encryption manual a) If using Oracle JDK, verify JCE is installed (OpenJDK has JCE installed by default) If the server running Ranger KMS is using Oracle JDK, you must install JCE (necessary for Ranger KMS to run) instructions on installing JCE can be found here b) CPU Support for AES-NI optimization AES-NI optimization requires an extended CPU instruction set for AES hardware acceleration. There are several ways to check for this; for example: cat /proc/cpuinfo | grep aes
Look for output with flags and 'aes'. c) Library Support for AES-NI optimization You will need a version of the libcrypto.so library that supports hardware acceleration, such as OpenSSL 1.0.1e. (Many OS versions have an older version of the library that does not support AES-NI.) A version of the libcrypto.so libary with AES-NI support must be installed on HDFS cluster nodes and MapReduce client hosts -- that is, any host from which you issue HDFS or MapReduce requests. The following instructions describe how to install and configure the libcrypto.so library. RHEL/CentOS 6.5 or later On HDP cluster nodes, the installed version of libcrypto.so supports AES-NI, but you will need to make sure that the symbolic link exists: sudo ln -s /usr/lib64/libcrypto.so.1.0.1e /usr/lib64/libcrypto.so On MapReduce client hosts, install the openssl-devel package: sudo yum install openssl-devel d) Verify AES-NI support To verify that a client host is ready to use the AES-NI instruction set optimization for HDFS encryption, use the following command: hadoop checknative You should see a response similar to the following: 15/08/12 13:48:39 INFO bzip2.Bzip2Factory: Successfully loaded & initialized native-bzip2 library system-native
14/12/12 13:48:39 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
Native library checking:
hadoop: true /usr/lib/hadoop/lib/native/libhadoop.so.1.0.0
zlib: true /lib64/libz.so.1
snappy: true /usr/lib64/libsnappy.so.1
lz4: true revision:99
bzip2: true /lib64/libbz2.so.1
openssl: true /usr/lib64/libcrypto.so Step 2: Create an Encryption key This step will outline how to create an encryption key using Ranger. a) Login to Ranger http://RANGER_FQDN_ADDR:6080/ * To access Ranger KMS (Encryption) - login using the username "keyadmin", the default password is "keyadmin" - remember to change this password b) Choose Encryption > Key Manager * In this tutorial, "hdptutorial" is the name of the HDP cluster. Your name will be different, depending on your cluster name. c) Choose Select Service > yourclustername_kms
d) Choose "Add New Key"
e) Create the new key Length - either 128 or 256 * Length of 256 requires JCE installed on all hosts in the cluster"The default key size is 128 bits. The optional -size parameter supports 256-bit keys, and requires the Java Cryptography Extension (JCE) Unlimited Strength Jurisdiction Policy File on all hosts in the cluster. For installation information, see the Ambari Security Guide." Step 3: Add KMS Ranger Policies for encrypting directory a) Login to Ranger http://RANGER_FQDN_ADDR:6080/ * To access Ranger KMS (Encryption) - login using the username "keyadmin", the default password is "keyadmin" - remember to change this password b) Choose Access Manager > Resource Based Policies c) Choose Add New Policy d) Create a policy - the user hdfs must be added to GET_METDATA and GENERATE_EEK -> using any user calls the user hdfs in the background - the user "nicole" is a custom user I created to be able to read/write data using the key "yourkeyname"
Step 4: Create an Encryption Zone a) Create a new directory hdfs dfs -mkdir /zone_encr * Leave the directory empty until the directory has been encrypted (recommend using a superuser to create the directory) b) Create an encryption zone hdfs crypto -createZone -keyName yourkeyname -path /zone_encr * Using the user "nicole" above to create the encryption zone c) Validate the encryption zone exists hdfs crypto -listZones * must be a superuser to call this command (or part of a superuser group like hdfs) The command should output: [nicole@hdptutorial01 security]$ hdfs crypto -listZones
/zone_encr yourkeyname
* You will now be able to read/write data to your encrypted directory /zone_encr. If you receive any errors - including "IOException:" when creating an encryption zone in Step 4 (b) take a look at your Ranger KMS server /var/log/ranger/kms/kms.log -> there usually is a permission issue accessing the key * To find out more about how transparent data encryption in HDFS works, refer to the Hortonworks blog here Tested in HDP: 2.4.2
... View more
06-28-2016
01:24 AM
Hi @Ethan Hsieh Can you try running the commands in beeline or Ambari View? I wouldn't recommend running in Hive CLI - let me know if Beeline or Ambari View output the same error.
... View more
06-21-2016
06:55 PM
1 Kudo
Hi @Pardeep Gorla To re-iterate the great answers above - yes, you can enable Kerberos without AD/LDAP - its called MIT Kerberos. Please follow the instructions here You're creating a new MIT Kerberos Instance: https://docs.hortonworks.com/HDPDocuments/Ambari-2.2.2.0/bk_Ambari_Security_Guide/content/_optional_install_a_new_mit_kdc.html You're using an existing MIT Kerberos Instance: https://docs.hortonworks.com/HDPDocuments/Ambari-2.2.2.0/bk_Ambari_Security_Guide/content/_use_an_exisiting_mit_kdc.html
... View more
06-15-2016
09:52 PM
Hi All, I've searched around HCC and was unable to come up with an answer to: There doesn't seem to be a way to automatically define a retention policy for the Ranger Audit Data (audit data is kept indefinitely unless we manually remove it). Is there a plan to add an automatic retention policy for these audit logs in HDFS and/or Solr? * Falcon can be used for retention in HDFS - but will there be an easy-to-configure option under Ambari>Ranger under Audit?
... View more
Labels:
- Labels:
-
Apache Ranger
06-10-2016
11:35 PM
2 Kudos
A remote Linux system can use NFS (Network File System) to mount an HDFS file system and interact with the file system. Before proceeding, it's important to understand that your linux instance is directly accessing your HDFS system through the network, therefore you will incur network latency. Depending on your dataset size, you have to remember you could be potentially processing gigabytes or more of data on a single machine therefore this is not the best approach for large datasets. These steps will show you how to mount and interact with a remote HDFS node within your Linux system: 1) The linux system must have NFS installed (CentOS for demo) yum install nfs-utils nfs-utils-lib 2) Your HDP cluster must have an NFS Gateway installed (Ambari allows this option with one click) * Keep track of either the FQDN or IP address of the NFSGateway 3) In Ambari, under HDFS > Advanced > General set Access time precision = 3600000 3) Mount the NFS Gateway on your linux system (must be root) mount -t nfs -o vers=3,proto=tcp,nolock myipaddressorfqdnofnfsgateway:/ /opt/remotedirectory
4) On both your HDFS node & remote Linux system add the same user with the same uid (making sure neither already exist) useradd -u 1234 testuser * If your user/uid doesn't match between HDFS node and your remote Linux system - whatever uid you are logged in as on your remote Linux system will be passed and interpreted by the NFS Gateway. For example if your Linux system has usertest (uid = 501) and you write a file to HDFS's /tmp, the file owner of the file will be whichever user on the HDFS node matches uid=501 - therefore it is good practice to match both the username and the uid across both systems. 5) On your remote Linux system, login as your "testuser" and go-to your mounted NFS directory cd /opt/remotedirectory You will now be able to interact with HDFS with native linux command such as cp, less, more, etc:.
... View more
Labels: