Member since: 06-05-2019
Posts: 126
Kudos Received: 133
Solutions: 11
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1387 | 12-17-2016 08:30 PM |
| | 988 | 08-08-2016 07:20 PM |
| | 2006 | 08-08-2016 03:13 PM |
| | 2003 | 08-04-2016 02:49 PM |
| | 1830 | 08-03-2016 06:29 PM |
09-24-2016
12:24 AM
1 Kudo
You may be in a bind if you need to install HDP on Azure with CentOS 6 or RHEL 6 and only certain services (not everything). By following the steps below, you will be able to use ambari-server to install HDP on any of the supported Hortonworks/Azure VMs.

1) Configure your VMs - use the same VNet for all VMs

Run the next steps as root or sudo the commands:

2) Update /etc/hosts on all your machines: vi /etc/hosts
172.1.1.0 master1.jd32j3j3kjdppojdf3349dsfeow0.dx.internal.cloudapp.net
172.1.1.1 master2.jd32j3j3kjdppojdf3349dsfeow0.dx.internal.cloudapp.net
172.1.1.2 master3.jd32j3j3kjdppojdf3349dsfeow0.dx.internal.cloudapp.net
172.1.1.3 worker1.jd32j3j3kjdppojdf3349dsfeow0.dx.internal.cloudapp.net
172.1.1.4 worker2.jd32j3j3kjdppojdf3349dsfeow0.dx.internal.cloudapp.net
172.1.1.5 worker3.jd32j3j3kjdppojdf3349dsfeow0.dx.internal.cloudapp.net

* Use the FQDN (find it by typing hostname -f). The IP addresses are internal and can be found on eth0 by typing ifconfig.

3) Edit /etc/sudoers.d/waagent so that you don't need to type a password when sudoing

a) Change permissions on /etc/sudoers.d/waagent: chmod 600 /etc/sudoers.d/waagent
b) Update the file so that "username ALL = (ALL) ALL" becomes "username ALL = (ALL) NOPASSWD: ALL": vi /etc/sudoers.d/waagent

c) Change permissions back on /etc/sudoers.d/waagent: chmod 440 /etc/sudoers.d/waagent

* Change username to the user that you sudo with (the user that will install Ambari).

4) Disable iptables

a) service iptables stop
b) chkconfig iptables off

* If you need iptables enabled, make the necessary port configuration changes found here.

5) Disable transparent huge pages

a) Run the following in your shell: cat > /usr/local/sbin/ambari-thp-disable.sh <<-'EOF'
#!/usr/bin/env bash
# disable transparent huge pages: for Hadoop
thp_disable=true
if [ "${thp_disable}" = true ]; then
for path in redhat_transparent_hugepage transparent_hugepage; do
for file in enabled defrag; do
if test -f /sys/kernel/mm/${path}/${file}; then
echo never > /sys/kernel/mm/${path}/${file}
fi
done
done
fi
exit 0
EOF
b) chmod 755 /usr/local/sbin/ambari-thp-disable.sh
c) sh /usr/local/sbin/ambari-thp-disable.sh

* Perform a-c on all hosts to disable transparent huge pages.

6) If you don't have a private key generated (so that the host running ambari-server can use a private key to log in to all the hosts), perform this step:

a) ssh-keygen -t rsa -b 2048 -C "username@master1.jd32j3j3kjdppojdf3349dsfeow0.dx.internal.cloudapp.net"
b) ssh-copy-id -i /locationofgeneratedinaabove/id_rsa.pub username@master1

* Run b above for all hosts; this way you can ssh as the username into all hosts from the ambari-server host without a password.

7) Install the Ambari repo on the server where you'll install Ambari (documentation): wget -nv http://public-repo-1.hortonworks.com/ambari/centos6/2.x/updates/2.2.2.0/ambari.repo -O /etc/yum.repos.d/ambari.repo

8) Install ambari-server: yum install ambari-server

9) Set up ambari-server: ambari-server setup

* You can use the defaults by pressing ENTER.

10) Start ambari-server: ambari-server start
* This could take a few minutes to start up depending on the speed of your machine.

11) Open your browser and go to the IP address where ambari-server is running: http://ambariipaddress:8080

* Continue with your HDP 2.4.3 installation.
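If you have several hosts, the per-host prep in steps 2-5 can be scripted. Below is a minimal sketch, assuming the example short hostnames from step 2, a user named username with the NOPASSWD sudo rule from step 3, and the key-based ssh from step 6 - adjust names and paths for your environment:

#!/usr/bin/env bash
# Hypothetical helper: copy /etc/hosts and the THP script to each host,
# then disable THP and iptables there. Hostnames are illustrative.
HOSTS="master1 master2 master3 worker1 worker2 worker3"
for h in $HOSTS; do
  scp /etc/hosts /usr/local/sbin/ambari-thp-disable.sh "username@${h}:/tmp/"
  ssh "username@${h}" "sudo cp /tmp/hosts /etc/hosts && \
    sudo cp /tmp/ambari-thp-disable.sh /usr/local/sbin/ && \
    sudo chmod 755 /usr/local/sbin/ambari-thp-disable.sh && \
    sudo sh /usr/local/sbin/ambari-thp-disable.sh && \
    sudo service iptables stop && sudo chkconfig iptables off"
done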
09-14-2016
05:25 AM
1 Kudo
Hi @Ryan Cicak Several processors that call an API (if not all) have a Connection Timeout property. You can set this property to wait for a fixed duration depending on your data source, network conditions, and so on (look at GetHttp, for instance). You can combine this property with a max-retry strategy: the processor waits until the timeout expires, then tries again until it reaches a maximum number of retries. If the max retry count is reached, the flowfile goes to a processor that handles this special case (alert an admin, store the data in an error directory, etc.).
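The same timeout-plus-retry pattern, sketched outside NiFi as a shell loop with curl (the URL, timeout, and retry count here are made-up values for illustration):

MAX_RETRIES=3
for attempt in $(seq 1 $MAX_RETRIES); do
  # -sf: silent, fail on HTTP errors; --connect-timeout caps the wait
  curl -sf --connect-timeout 10 http://example.com/api/data && break
  echo "attempt ${attempt} failed or timed out" >&2
  [ "$attempt" -eq "$MAX_RETRIES" ] && echo "max retries reached - handle the error case" >&2
done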
09-14-2016
11:43 AM
2 Kudos
What you described is the correct process: the NAR needs to be copied to the lib directory on each node of the cluster, and then the nodes need to be restarted. Nothing in 1.0.0 changes this approach.
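For example, a minimal deployment sketch, assuming hypothetical node names, a hypothetical NAR file name, and a NiFi install under /opt/nifi:

for node in nifi-node1 nifi-node2 nifi-node3; do
  scp my-custom-processors.nar "${node}:/opt/nifi/lib/"   # hypothetical NAR name
  ssh "${node}" "/opt/nifi/bin/nifi.sh restart"
done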
08-08-2016
03:13 PM
1 Kudo
Hi @Mayank Pandey If you have existing tables (not in ORC format), I'd recommend creating the ORC tables first. Then run:

insert into yourorctable
select * from yourexistingtable;

Is this how you are currently inserting data?
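If the ORC table doesn't exist yet, a CREATE TABLE AS SELECT can do both steps at once - a sketch from the shell, keeping the placeholder table names from above:

hive -e "CREATE TABLE yourorctable STORED AS ORC AS SELECT * FROM yourexistingtable;"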
08-08-2016
07:32 PM
1 Kudo
Hi @john doe I recently ran PutKafka and GetKafka in NiFi (connecting to a local VM). I found that adding the FQDN and IP to /etc/hosts made this work for me. For example, if the FQDN is host1.local and the IP is 192.168.4.162, then adding 192.168.4.162 host1.local to /etc/hosts made it work.
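One way to append that mapping (using the example values above):

echo "192.168.4.162 host1.local" | sudo tee -a /etc/hosts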
08-04-2016
03:50 PM
Hi @sbhat - this is certainly helpful, thank you for the reference!
08-03-2016
03:45 AM
Thank you! I believe a Java upgrade is not required; an OpenSSL upgrade alone should fix this.
07-15-2016
11:28 PM
8 Kudos
Teradata's JDBC connector consists of two jar files (tdgssconfig.jar and terajdbc4.jar) that must both be on the classpath. NiFi database processors like ExecuteSQL or PutSQL use a connection pool such as DBCPConnectionPool, which defines your JDBC connection to a database like Teradata. Follow the steps below to integrate the Teradata JDBC connector into your DBCPConnectionPool:

1) Download the Teradata connectors (tdgssconfig.jar and terajdbc4.jar) - you can download the Teradata v1.4.1 connector at http://hortonworks.com/downloads/

2) Extract the jar files (tdgssconfig.jar and terajdbc4.jar) from hdp-connector-for-teradata-1.4.1.2.3.2.0-2950-distro.tar.gz and move them to NIFI_DIRECTORY/lib/

3) Restart NiFi

4) Under Controller > Controller Services, edit your existing DBCPConnectionPool (if your pool is active, disable it before editing)

5) Under Configure Controller Service > Properties, define the following:

Database Connection URL: your Teradata JDBC connection URL

Database Driver Class Name: com.teradata.jdbc.TeraDriver

Database Driver Jar Url: do not define anything - since you added the two jars to the NiFi classpath (nifi/lib), the driver jars will be picked up automatically. You can only add one jar here, and you need two, which is why we added them to the nifi/lib directory.

Database User: provide the database user

Password: provide the password for the database user

You're all set - you'll now be able to connect to Teradata from NiFi!
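Steps 1-3 from the shell might look like this - a sketch assuming NIFI_DIRECTORY points at your NiFi install, and that the jar locations inside the extracted tarball may vary:

tar -xzf hdp-connector-for-teradata-1.4.1.2.3.2.0-2950-distro.tar.gz
# locate the two jars wherever they land in the extracted tree and copy them
find . \( -name tdgssconfig.jar -o -name terajdbc4.jar \) -exec cp {} "$NIFI_DIRECTORY/lib/" \;
"$NIFI_DIRECTORY/bin/nifi.sh" restart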
07-28-2016
08:53 AM
I recommend you try it in a dev or virtual environment first. Did you use Ubuntu 14.04 to install HDP? Probably not. Do one machine at a time.
06-29-2016
01:54 PM
5 Kudos
Security is a key element when discussing Big Data, and a common security requirement is data encryption. By following the instructions below, you'll be able to set up transparent data encryption in HDFS on defined directories, otherwise known as encryption zones ("EZ"). Before starting this step-by-step tutorial, three HDP services are essential (must be installed):

1) HDFS

2) Ranger

3) Ranger KMS

Step 1: Prepare the environment

As explained in the HDFS "Data at Rest" Encryption manual:

a) If using Oracle JDK, verify JCE is installed (OpenJDK has JCE installed by default). If the server running Ranger KMS is using Oracle JDK, you must install JCE (necessary for Ranger KMS to run); instructions on installing JCE can be found here.

b) CPU support for AES-NI optimization. AES-NI optimization requires an extended CPU instruction set for AES hardware acceleration. There are several ways to check for this; for example: cat /proc/cpuinfo | grep aes
Look for output with flags and 'aes'.

c) Library support for AES-NI optimization. You will need a version of the libcrypto.so library that supports hardware acceleration, such as OpenSSL 1.0.1e. (Many OS versions have an older version of the library that does not support AES-NI.) A version of the libcrypto.so library with AES-NI support must be installed on HDFS cluster nodes and MapReduce client hosts -- that is, any host from which you issue HDFS or MapReduce requests. The following instructions describe how to install and configure the libcrypto.so library.

RHEL/CentOS 6.5 or later: on HDP cluster nodes, the installed version of libcrypto.so supports AES-NI, but you will need to make sure that the symbolic link exists: sudo ln -s /usr/lib64/libcrypto.so.1.0.1e /usr/lib64/libcrypto.so

On MapReduce client hosts, install the openssl-devel package: sudo yum install openssl-devel

d) Verify AES-NI support. To verify that a client host is ready to use the AES-NI instruction set optimization for HDFS encryption, use the following command: hadoop checknative

You should see a response similar to the following:

15/08/12 13:48:39 INFO bzip2.Bzip2Factory: Successfully loaded & initialized native-bzip2 library system-native
14/12/12 13:48:39 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
Native library checking:
hadoop: true /usr/lib/hadoop/lib/native/libhadoop.so.1.0.0
zlib: true /lib64/libz.so.1
snappy: true /usr/lib64/libsnappy.so.1
lz4: true revision:99
bzip2: true /lib64/libbz2.so.1
openssl: true /usr/lib64/libcrypto.so

Step 2: Create an encryption key

This step outlines how to create an encryption key using Ranger.

a) Log in to Ranger at http://RANGER_FQDN_ADDR:6080/

* To access Ranger KMS (Encryption), log in with the username "keyadmin"; the default password is "keyadmin" - remember to change this password.

b) Choose Encryption > Key Manager

* In this tutorial, "hdptutorial" is the name of the HDP cluster. Yours will differ, depending on your cluster name.

c) Choose Select Service > yourclustername_kms
d) Choose "Add New Key"
e) Create the new key, with a length of either 128 or 256.

* A length of 256 requires JCE installed on all hosts in the cluster: "The default key size is 128 bits. The optional -size parameter supports 256-bit keys, and requires the Java Cryptography Extension (JCE) Unlimited Strength Jurisdiction Policy File on all hosts in the cluster. For installation information, see the Ambari Security Guide."

Step 3: Add Ranger KMS policies for the encrypted directory

a) Log in to Ranger at http://RANGER_FQDN_ADDR:6080/

* To access Ranger KMS (Encryption), log in with the username "keyadmin"; the default password is "keyadmin" - remember to change this password.

b) Choose Access Manager > Resource Based Policies

c) Choose Add New Policy

d) Create a policy - the user hdfs must be added to GET_METADATA and GENERATE_EEK (operations performed as any user call the hdfs user in the background) - the user "nicole" is a custom user I created to be able to read/write data using the key "yourkeyname"
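As a quick command-line check once the policy is in place - a sketch using the example user "nicole" from above - the key should be visible to users the policy allows:

sudo -u nicole hadoop key list    # the new key should appear in the output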
Step 4: Create an encryption zone

a) Create a new directory: hdfs dfs -mkdir /zone_encr

* Leave the directory empty until it has been encrypted (I recommend using a superuser to create the directory).

b) Create an encryption zone: hdfs crypto -createZone -keyName yourkeyname -path /zone_encr

* Using the user "nicole" from above to create the encryption zone.

c) Validate that the encryption zone exists: hdfs crypto -listZones

* You must be a superuser to call this command (or part of a superuser group like hdfs).

The command should output:

[nicole@hdptutorial01 security]$ hdfs crypto -listZones
/zone_encr yourkeyname
* You will now be able to read/write data in your encrypted directory /zone_encr. If you receive any errors - including an "IOException:" when creating the encryption zone in Step 4 (b) - take a look at the Ranger KMS server log at /var/log/ranger/kms/kms.log; there is usually a permission issue accessing the key.

* To find out more about how transparent data encryption in HDFS works, refer to the Hortonworks blog here.

Tested in HDP: 2.4.2
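A short smoke test of the zone - a sketch using the example user and path from this tutorial; /.reserved/raw is the HDFS namespace that exposes the raw, still-encrypted bytes to superusers:

sudo -u nicole hdfs dfs -put /etc/hosts /zone_encr/testfile
sudo -u nicole hdfs dfs -cat /zone_encr/testfile                  # decrypted transparently
sudo -u hdfs hdfs dfs -cat /.reserved/raw/zone_encr/testfile | head -c 64   # ciphertext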