Member since
06-20-2017
04-05-2018
12:10 AM
The script provided here can be used to perform a health check of a node before installing DSX. IBM DSX has certain prerequisites without which the installation cannot go through. This script validates the health of a node and checks whether it is fit for a DSX installation.
Note: Adjust the two dd commands (lines 166 and 182 of the script) for your installation. The device may vary depending on where you are doing the installation.
#!/bin/bash
function checkRAM(){
local size="$1"
local limit="$2"
if [[ ${size} -lt ${limit} ]]; then
echo "WARNING: RAM size is ${size}GB, while requirement is ${limit}GB" | tee -a ${OUTPUT}
return 1
fi
}
function checkCPU(){
local size="$1"
local limit="$2"
if [[ ${size} -lt ${limit} ]]; then
echo "WARNING: CPU cores are ${size}, while requirement is ${limit}" | tee -a ${OUTPUT}
return 1
fi
}
function usage(){
echo "This script checks if this node meets requirements to install DSX-Local. "
echo "Arguments: "
echo "--type=[9nodes_master|9nodes_storage|9nodes_compute|3nodes] To specify a node type"
echo "--help To see help "
}
function helper(){
echo "##########################################################################################
Help:
./$(basename $0) --type=[9nodes_master|9nodes_storage|9nodes_compute|3nodes]
Specify a node type and start the validation
Checking preReq before DSX-local installation
Please run this script in all the nodes of your cluster
Different node types have different RAM/CPU requirements
List of validation:
CPU
WARNING for 9node master cpu core < 8, 9node storage cpu core < 16, 9node compute cpu core < 32; for 3node cpu core < 8
WARNING for 3node cpu core < 8
RAM
WARNING for 9node master RAM < 16GB, 9node storage RAM < 32GB, 9node compute RAM size < 64GB; for 3node RAM size < 16GB
WARNING for 3node RAM < 16GB
Disk latency test:
WARNING dd if=/dev/zero of=/root/testfile bs=512 count=1000 oflag=dsync The value should be less than 10s for copying 512 kB
ERROR: must be less than 60s for copying 512 kB,
Disk throughput test:
WARNING dd if=/dev/zero of=/root/testfile bs=1G count=1 oflag=dsync The value should be less than 5s for copying 1.1 GB
ERROR: must be less than 35s for copying 1.1 GB
Chrony/NTP
WARNING check if ntp/chrony is setup
Firewall disabled
WARNING check that firewalld and iptables are disabled
Disk
ERROR root directory should have at least 10 GB
WARNING partition for installer files should have one xfs disk formatted and mounted > ${INSTALLPATH_SIZE}GB
WARNING partition for data storage should have one xfs disk formatted and mounted > ${DATAPATH_SIZE}GB
Cron job check
ERROR check whether this node has a cronjob changes ip route, hosts file or firewall setting during installation
DSX Local 443 port check
ERROR check port 443 is open
SELinux check
ERROR check SElinux is either in enforcing or permissive mode
Gateway check
ERROR check if gateway is setup
DNS check
ERROR check if DNS service is setup which allows hostname to map to ip
Docker check
ERROR Check to confirm Docker is not installed
Kubernetes check
ERROR Check to confirm Kubernetes is not installed
##########################################################################################"
}
function checkpath(){
local mypath="$1"
if [[ "$mypath" = "/" ]]; then
echo "ERROR: Can not use root path / as path" | tee -a ${OUTPUT}
usage
exit 1
fi
if [ ! -d "$mypath" ]; then
echo "ERROR: $mypath not found in node." | tee -a ${OUTPUT}
usage
exit 1
fi
}
#for internal usage
MASTERONE="MASTERONE_PLACEHOLDER" #if master one internal run will not check docker since we already install it
INSTALLPATH="INSTALLPATH_PLACEHOLDER"
DATAPATH="DATAPATH_PLACEHOLDER"
CPU=0
RAM=0
#Global parameter
INSTALLPATH_SIZE=150
DATAPATH_SIZE=350
#setup output file
OUTPUT="/tmp/preInstallCheckResult"
rm -f ${OUTPUT}
WARNING=0
ERROR=0
LOCALTEST=0
USE_SUDO=""
[[ "$(whoami)" != "root" ]] && USE_SUDO="sudo"
#input check
if [[ $# -ne 1 ]]; then
if [[ "$INSTALLPATH" != "" ]]; then
# This means the script is called internally; INSTALLPATH, DATAPATH, CPU and RAM have already been edited in by a sed command
checkpath "$INSTALLPATH"
if [[ "$DATAPATH" != "" ]]; then
checkpath "$DATAPATH"
fi
else
usage
exit 1
fi
else
# This means the user runs the script; prompt the user to input INSTALLPATH and DATAPATH
if [[ "$1" = "--help" ]]; then
helper
exit 1
elif [ "$1" == "--type=9nodes_master" ] || [ "$1" == "--type=9nodes_storage" ] || [ "$1" == "--type=9nodes_compute" ] || [ "$1" == "--type=3nodes" ]; then
echo "Please enter the path of partition for installer files"
read INSTALLPATH
checkpath "$INSTALLPATH"
if [[ "$1" = "--type=9nodes_storage" ]]; then
echo "Please enter the path of partition for data storage"
read DATAPATH
checkpath "$DATAPATH"
CPU=16
RAM=32
elif [[ "$1" = "--type=9nodes_master" ]]; then
CPU=8
RAM=16
elif [[ "$1" = "--type=9nodes_compute" ]]; then
CPU=32
RAM=64
elif [[ "$1" = "--type=3nodes" ]]; then
echo "Please enter the path of partition for data storage"
read DATAPATH
checkpath "$DATAPATH"
CPU=32
RAM=64
else
echo "please only specify type among 9nodes_master/9nodes_storage/9nodes_compute/3nodes"
exit 1
fi
else
echo "Sorry the argument is invalid"
usage
exit 1
fi
fi
echo "##########################################################################################" > ${OUTPUT} 2>&1
echo "Checking Disk latency and Disk throughput" | tee -a ${OUTPUT}
# Note: /dev/xvdb is used here because it is the storage device where I mounted /install.
# Check your storage before running the test and update accordingly
${USE_SUDO} dd if=/dev/xvdb of=${INSTALLPATH}/testfile bs=512 count=1000 oflag=dsync &> output
res=$(cat output | tail -n 1 | awk '{print $6}')
# writing this since bc may not be default support in customer environment
res_int=$(echo $res | grep -E -o "[0-9]+" | head -n 1)
if [[ $res_int -gt 60 ]]; then
echo "ERROR: Disk latency test failed. By copying 512 kB, the time must be shorter than 60s, recommended to be shorter than 10s, validation result is ${res_int}s " | tee -a ${OUTPUT}
ERROR=1
LOCALTEST=1
elif [[ $res_int -gt 10 ]]; then
echo "WARNING: Disk latency test failed. By copying 512 kB, the time recommended to be shorter than 10s, validation result is ${res_int}s " | tee -a ${OUTPUT}
WARNING=1
LOCALTEST=1
fi
# Note: /dev/xvdb is used here because it is the storage device where I mounted /install.
# Check your storage before running the test and update accordingly
${USE_SUDO} dd if=/dev/xvdb of=${INSTALLPATH}/testfile bs=1G count=1 oflag=dsync &> output
res=$(cat output | tail -n 1 | awk '{print $6}')
# writing this since bc may not be default support in customer environment
res_int=$(echo $res | grep -E -o "[0-9]+" | head -n 1)
if [[ $res_int -gt 35 ]]; then
echo "ERROR: Disk throughput test failed. By copying 1.1 GB, the time must be shorter than 35s, recommended to be shorter than 5s, validation result is ${res_int}s " | tee -a ${OUTPUT}
ERROR=1
LOCALTEST=1
elif [[ $res_int -gt 5 ]]; then
echo "WARNING: Disk throughput test failed. By copying 1.1 GB, the time is recommended to be shorter than 5s, validation result is ${res_int}s " | tee -a ${OUTPUT}
WARNING=1
LOCALTEST=1
fi
rm -f output > /dev/null 2>&1
rm -f ${INSTALLPATH}/testfile > /dev/null 2>&1
if [[ ${LOCALTEST} -eq 0 ]]; then
echo "PASS" | tee -a ${OUTPUT}
fi
echo
echo "##########################################################################################" >> ${OUTPUT} 2>&1
LOCALTEST=0
echo "##########################################################################################" >> ${OUTPUT} 2>&1
echo "Checking gateway" | tee -a ${OUTPUT}
${USE_SUDO} ip route | grep "default" > /dev/null 2>&1
if [[ $? -ne 0 ]]; then
echo "ERROR: default gateway is not setup " | tee -a ${OUTPUT}
ERROR=1
LOCALTEST=1
fi
if [[ ${LOCALTEST} -eq 0 ]]; then
echo "PASS" | tee -a ${OUTPUT}
fi
echo
echo "##########################################################################################" >> ${OUTPUT} 2>&1
LOCALTEST=0
echo "##########################################################################################" >> ${OUTPUT} 2>&1
echo "Checking DNS" | tee -a ${OUTPUT}
${USE_SUDO} cat /etc/resolv.conf | grep -E "nameserver [0-9]+.[0-9]+.[0-9]+.[0-9]+" &> /dev/null
if [[ $? -ne 0 ]]; then
echo "ERROR: DNS is not properly setup " | tee -a ${OUTPUT}
ERROR=1
LOCALTEST=1
fi
if [[ ${LOCALTEST} -eq 0 ]]; then
echo "PASS" | tee -a ${OUTPUT}
fi
echo
echo "##########################################################################################" >> ${OUTPUT} 2>&1
LOCALTEST=0
echo "##########################################################################################" >> ${OUTPUT} 2>&1
echo "Checking chrony / ntp" | tee -a ${OUTPUT}
TIMESYNCON=1 # 1 for not sync 0 for sync
${USE_SUDO} systemctl status ntpd > /dev/null 2>&1
if [[ $? -eq 0 || $? -eq 3 ]]; then # 0 is active, 3 is inactive (not running); both are ok here
TIMESYNCON=0
fi
${USE_SUDO} systemctl status chronyd > /dev/null 2>&1
if [[ $? -eq 0 || $? -eq 3 ]]; then # 0 is active, 3 is inactive (not running); both are ok here
TIMESYNCON=0
fi
if [[ ${TIMESYNCON} -ne 0 ]]; then
echo "WARNING: NTP/Chronyc is not setup " | tee -a ${OUTPUT}
WARNING=1
LOCALTEST=1
fi
if [[ ${LOCALTEST} -eq 0 ]]; then
echo "PASS" | tee -a ${OUTPUT}
fi
echo
echo "##########################################################################################" >> ${OUTPUT} 2>&1
LOCALTEST=0
echo "##########################################################################################" >> ${OUTPUT} 2>&1
echo "Checking if firewall is shutdown" | tee -a ${OUTPUT}
${USE_SUDO} service iptables status > /dev/null 2>&1
if [ $? -eq 0 ]; then
echo "WARNING: iptable is not disabled" | tee -a ${OUTPUT}
LOCALTEST=1
WARNING=1
fi
${USE_SUDO} service ip6tables status > /dev/null 2>&1
if [ $? -eq 0 ]; then
echo "WARNING: ip6table is not disabled" | tee -a ${OUTPUT}
LOCALTEST=1
WARNING=1
fi
${USE_SUDO} systemctl status firewalld > /dev/null 2>&1
if [ $? -eq 0 ]; then
echo "WARNING: firewalld is not disabled" | tee -a ${OUTPUT}
LOCALTEST=1
WARNING=1
fi
if [[ ${LOCALTEST} -eq 0 ]]; then
echo "PASS" | tee -a ${OUTPUT}
fi
echo
echo "##########################################################################################" >> ${OUTPUT} 2>&1
LOCALTEST=0
echo "##########################################################################################" >> ${OUTPUT} 2>&1
echo "Checking SELinux" | tee -a ${OUTPUT}
selinux_res="$(${USE_SUDO} getenforce 2>&1)"
if [[ ! "${selinux_res}" =~ ("Permissive"|"permissive"|"Enforcing"|"enforcing") ]]; then
echo "ERROR: SElinux is not in enforcing or permissive mode" | tee -a ${OUTPUT}
LOCALTEST=1
ERROR=1
fi
if [[ ${LOCALTEST} -eq 0 ]]; then
echo "PASS" | tee -a ${OUTPUT}
fi
echo
echo "##########################################################################################" >> ${OUTPUT} 2>&1
LOCALTEST=0
echo "##########################################################################################" >> ${OUTPUT} 2>&1
echo "Checking pre-existing cronjob" | tee -a ${OUTPUT}
${USE_SUDO} crontab -l | grep -E "*" &> /dev/null
if [[ $? -eq 0 ]] ; then
echo "WARNING: Found cronjob set up in background. Please make sure cronjob will not change ip route, hosts file or firewall setting during installation" | tee -a ${OUTPUT}
LOCALTEST=1
WARNING=1
fi
if [[ ${LOCALTEST} -eq 0 ]]; then
echo "PASS" | tee -a ${OUTPUT}
fi
echo
echo "##########################################################################################" >> ${OUTPUT} 2>&1
LOCALTEST=0
echo "##########################################################################################" >> ${OUTPUT} 2>&1
echo "Checking size of root partition" | tee -a ${OUTPUT}
ROOTSIZE=$(${USE_SUDO} df -k -BG "/" | awk '{print($4 " " $6)}' | grep "/" | cut -d' ' -f1 | sed 's/G//g')
if [[ $ROOTSIZE -lt 10 ]] ; then
echo "ERROR: size of root partition is smaller than 10G" | tee -a ${OUTPUT}
LOCALTEST=1
ERROR=1
fi
if [[ ${LOCALTEST} -eq 0 ]]; then
echo "PASS" | tee -a ${OUTPUT}
fi
echo
echo "##########################################################################################" >> ${OUTPUT} 2>&1
LOCALTEST=0
echo "##########################################################################################" >> ${OUTPUT} 2>&1
echo "Checking if install path: ${INSTALLPATH} have enough space (${INSTALLPATH_SIZE}GB)" | tee -a ${OUTPUT}
PARTITION=$(${USE_SUDO} df -k -BG | grep ${INSTALLPATH})
if [[ $? -ne 0 ]]; then
echo "ERROR: can not find the ${INSTALLPATH} partition you specified in install_path" | tee -a ${OUTPUT}
LOCALTEST=1
ERROR=1
else
PARTITION=$(echo $PARTITION | tail -n 1 | awk '{print $2}' | sed 's/G//g')
if [[ ${PARTITION} -lt ${INSTALLPATH_SIZE} ]]; then
echo "WARNING: size of install_path ${INSTALLPATH} is smaller than requirement (${INSTALLPATH_SIZE}GB)" | tee -a ${OUTPUT}
LOCALTEST=1
WARNING=1
fi
fi
if [[ ${LOCALTEST} -eq 0 ]]; then
echo "PASS" | tee -a ${OUTPUT}
fi
echo
echo "##########################################################################################" >> ${OUTPUT} 2>&1
if [[ $DATAPATH != "" && $DATAPATH != "DATAPATH_PLACEHOLDER" ]]; then
LOCALTEST=0
echo "##########################################################################################" >> ${OUTPUT} 2>&1
echo "This is a storage node, checking if data path: ${DATAPATH} have enough space (${DATAPATH_SIZE}GB)" | tee -a ${OUTPUT}
cmd='df -k -BG | grep ${DATAPATH}'
PARTITION=$(${USE_SUDO} df -k -BG | grep ${DATAPATH})
if [[ $? -ne 0 ]]; then
echo "ERROR: can not find the ${DATAPATH} partition you specified in data_path" | tee -a ${OUTPUT}
LOCALTEST=1
ERROR=1
else
PARTITION=$(echo $PARTITION | tail -n 1 | awk '{print $2}' | sed 's/G//g')
if [[ ${PARTITION} -lt ${DATAPATH_SIZE} ]]; then
echo "WARNING: size of data_path ${DATAPATH} is smaller than requirement (${DATAPATH_SIZE}GB)" | tee -a ${OUTPUT}
LOCALTEST=1
WARNING=1
fi
fi
if [[ ${LOCALTEST} -eq 0 ]]; then
echo "PASS" | tee -a ${OUTPUT}
fi
echo
echo "##########################################################################################" >> ${OUTPUT} 2>&1
fi
LOCALTEST=0
echo "##########################################################################################" >> ${OUTPUT} 2>&1
echo "Checking if xfs is enabled" | tee -a ${OUTPUT}
${USE_SUDO} xfs_info ${INSTALLPATH} | grep "ftype=1" > /dev/null 2>&1
if [[ $? -ne 0 ]] ; then
echo "ERROR: xfs is not enabled, ftype=0, should be 1" | tee -a ${OUTPUT}
LOCALTEST=1
ERROR=1
fi
if [[ ${LOCALTEST} -eq 0 ]]; then
echo "PASS" | tee -a ${OUTPUT}
fi
echo
echo "##########################################################################################" >> ${OUTPUT} 2>&1
LOCALTEST=0
echo "##########################################################################################" >> ${OUTPUT} 2>&1
echo "Checking CPU core numbers and RAM size" | tee -a ${OUTPUT}
# Get CPU numbers and min frequency
cpunum=$(${USE_SUDO} cat /proc/cpuinfo | grep '^processor' |wc -l | xargs)
if [[ ! ${cpunum} =~ ^[0-9]+$ ]]; then
echo "ERROR: Invalid cpu numbers '${cpunum}'" | tee -a ${OUTPUT}
else
checkCPU ${cpunum} ${CPU}
if [[ $? -eq 1 ]]; then
LOCALTEST=1
WARNING=1
fi
fi
mem=$(${USE_SUDO} cat /proc/meminfo | grep MemTotal | awk '{print $2}')
# Get Memory info
mem=$(( $mem/1000000 ))
if [[ ! ${mem} =~ ^[0-9]+$ ]]; then
echo "ERROR: Invalid memory size '${mem}'" | tee -a ${OUTPUT}
else
checkRAM ${mem} ${RAM}
if [[ $? -eq 1 ]]; then
LOCALTEST=1
WARNING=1
fi
fi
if [[ ${LOCALTEST} -eq 0 ]]; then
echo "PASS" | tee -a ${OUTPUT}
fi
echo
echo "##########################################################################################" >> ${OUTPUT} 2>&1
if [[ ${MASTERONE} = "NO" || $# -eq 1 ]]; then
LOCALTEST=0
echo "##########################################################################################" >> ${OUTPUT} 2>&1
echo "Checking to confirm docker is not installed " | tee -a ${OUTPUT}
${USE_SUDO} which docker > /dev/null 2>&1
rc1=$?
${USE_SUDO} systemctl status docker &> /dev/null
rc2=$?
if [[ ${rc1} -eq 0 ]] || [[ ${rc2} -eq 0 ]]; then
echo "ERROR: Docker is already installed with a different version or settings, please uninstall Docker" | tee -a ${OUTPUT}
LOCALTEST=1
ERROR=1
fi
if [[ ${LOCALTEST} -eq 0 ]]; then
echo "PASS" | tee -a ${OUTPUT}
fi
echo
echo "##########################################################################################" >> ${OUTPUT} 2>&1
fi
LOCALTEST=0
echo "##########################################################################################" >> ${OUTPUT} 2>&1
echo "Checking to confirm Kubernetes is not installed" | tee -a ${OUTPUT}
${USE_SUDO} systemctl status kubelet &> /dev/null
if [[ $? -eq 0 ]]; then
echo "ERROR: Kubernetes is already installed with a different version or settings, please uninstall Kubernetes" | tee -a ${OUTPUT}
LOCALTEST=1
ERROR=1
else
${USE_SUDO} which kubectl &> /dev/null
if [[ $? -eq 0 ]]; then
echo "ERROR: Kubernetes is already installed with a different version or settings, please uninstall Kubernetes" | tee -a ${OUTPUT}
LOCALTEST=1
ERROR=1
fi
fi
if [[ ${LOCALTEST} -eq 0 ]]; then
echo "PASS" | tee -a ${OUTPUT}
fi
echo
echo "##########################################################################################" >> ${OUTPUT} 2>&1
#log result
if [[ ${ERROR} -eq 1 ]]; then
echo "Finished with ERROR, please check ${OUTPUT}"
exit 2
elif [[ ${WARNING} -eq 1 ]]; then
echo "Finished with WARNING, please check ${OUTPUT}"
exit 1
else
echo "Finished successfully! This node meets the requirement" | tee -a ${OUTPUT}
exit 0
fi
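The disk tests in the script read the elapsed time from the sixth whitespace-separated field of the last line of dd's output, which depends on the exact wording of that summary line. As a minimal sketch, assuming the summary contains the pattern ", <seconds> s,", the hypothetical helper below (dd_elapsed_seconds is not part of the original script) extracts the whole seconds by pattern instead of by field position:

```shell
#!/bin/bash
# Hypothetical helper, not part of the original script: print the whole
# number of elapsed seconds found in a dd summary line.
dd_elapsed_seconds() {
    local summary="$1"
    local secs
    # Look for ", <integer>[.<fraction>] s," and keep only the integer part.
    secs=$(printf '%s\n' "${summary}" |
        sed -n 's/.*, \([0-9][0-9]*\)\(\.[0-9][0-9]*\)\{0,1\} s,.*/\1/p')
    # An unrecognized format (for example scientific notation) is a failure.
    [ -n "${secs}" ] || return 1
    printf '%s\n' "${secs}"
}
```

It could be used as res_int=$(dd_elapsed_seconds "$(tail -n 1 output)"); a non-zero return status then means the summary line was not understood and the check should not be trusted.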
04-05-2018
12:10 AM
Guide to deploying a 3-node DSX Local Cluster on AWS

Part 1: Deploying AWS Cluster
On your EC2 dashboard, click Launch Instance.
Step 1: Choose an Amazon Machine Image (AMI)
Select AWS Marketplace and search for CentOS, then choose the CentOS version you need (I chose CentOS 7 (x86_64) - with Updates HVM).
Step 2: Choose an Instance Type
Select m4.4xlarge or m4.2xlarge.
Step 3: Configure Instance Details
Set Number of Instances to 3.
Step 4: Add Storage
Add root storage of at least 50GB and additional EBS storage of around 500GB (if you are deploying just for test purposes, 300GB is fine). DSX usually requires high IOPS, so provisioned SSDs are preferred. Be advised that provisioned SSDs are expensive because of the high IOPS.
Step 5: Add Tags
Optional.
Step 6: Configure Security Group
Set up a custom security group, which should include:
- Custom TCP rule, port 6443 and range 172.31.0.0/16 (used for internal connections)
- Allow all connections between nodes
- Allow restricted connections from outside
Step 7: Review and Instance Launch
You will be required to create a key pair that allows you to access the cluster over SSH.

Part 2: Prepare nodes to Install DSX
After the nodes have been launched successfully, set one node as the master node. Follow the steps below to prepare the nodes for installation.
Step 1. Enable root login on each node
SSH into each of the three nodes and use the following commands:
sudo su
sudo sed -i '/^#PermitRootLogin.*/s/^#//' /etc/ssh/sshd_config
cat /etc/ssh/sshd_config | grep PermitRootLogin
Now edit ~/.ssh/authorized_keys to remove the following prefix from the key entry:
no-port-forwarding,no-agent-forwarding,no-X11-forwarding,command="echo 'Please login as the user \"centos\" rather than the user \"root\".';echo;sleep 10"
Step 2. Configure password-less ssh between all the nodes
Create an ssh key:
ssh-keygen
Copy the content of ~/.ssh/id_rsa.pub to ~/.ssh/authorized_keys on all nodes. Test the ssh connection between all the nodes, including from each node to itself.
Note: Use the private IP of each node to ssh.
Step 3. Create two partitions for install and data
Create two partitions using the fdisk utility. You can create partitions of any size, but it is usually advised to have more than 200GB of space for both partitions. More details here - https://content-dsxlocal.mybluemix.net/docs/content/local/requirements.html
Confirm the creation of the partitions with lsblk.
Step 4. Format the two partitions
Format the partitions as XFS:
mkfs.xfs -f -n ftype=1 -i size=512 -n size=8192 /dev/xvdb1 #//data
mkfs.xfs -f -n ftype=1 -i size=512 -n size=8192 /dev/xvdb2 #//install
Step 5. Create directories /install and /data
Step 6. Mount the new partitions onto the locations
mount /dev/xvdb1 /data
mount /dev/xvdb2 /install
Note: Make the following entries in /etc/fstab to ensure the partitions are mounted even after a reboot.
/dev/xvdb1 /data auto defaults,nofail,comment=cloudconfig 0 2
/dev/xvdb2 /install auto defaults,nofail,comment=cloudconfig 0 2
Step 7. Install required packages
sudo yum install -y epel-release
sudo yum update -y
sudo yum install -y wget python-pip fio screen
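Before moving on, the /etc/fstab entries from Step 6 can be sanity-checked on each node. The following is only a sketch: fstab_has_mount is a hypothetical helper name, and it checks nothing more than that a non-comment line lists the given mount point in its second field.

```shell
#!/bin/bash
# Hypothetical helper: succeed if an fstab-style file has an active
# (non-comment) entry whose mount point, the second field, equals the path.
fstab_has_mount() {
    local fstab_file="$1"
    local mount_point="$2"
    awk -v mp="${mount_point}" '
        /^[[:space:]]*#/ { next }      # ignore commented-out entries
        $2 == mp { found = 1 }         # second field is the mount point
        END { exit(found ? 0 : 1) }
    ' "${fstab_file}"
}
```

For example, fstab_has_mount /etc/fstab /install || echo "missing /install entry" reports a node where the entry was forgotten.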
Step 8. Improve Disk IO by warming the disks (optional)
fio --filename=/dev/xvdb --rw=read --bs=128k --iodepth=32 --ioengine=libaio --direct=1 --name=volume-initialize
Note: This will take ~40 minutes to complete.

Part 3: Create Load balancers
1. Internal TCP Load balancer for Nodes
Step 1: Configure Load Balancer
Step 2: Configure Routing
Make sure the port is set to 6443.
Step 3: Register Targets
Register the nodes in your cluster, then review and launch the load balancer.
Once the ELB has been provisioned, get the IP address attached to the ELB from its DNS name:
ping DNS_Name
Also, make sure that the target nodes have been attached to the ELB.
Note: You might get a warning that the nodes are not healthy, but that is fine.
2. External Load Balancer to access cluster over HTTPS
Step 1: Configure Load Balancer
Note that here we are using port 443.
Step 2: Configure Security Settings
You can either choose your own ACM certificate or upload a certificate to ACM.
Note: This step will work only once you have kicked off the installation. In this scenario you are creating this load balancer after the installation has been kicked off, which is fine too.
One way to create an ACM certificate is to use the files under /etc/kubernetes/ssl:
cd /etc/kubernetes/ssl
- For private key use apiserver-key.pem
- For certificate body use apiserver.pem
- For certificate chain use ca.pem
Step 3: Configure Security Groups
Now you can configure your security group. Make sure you use the same security group you used earlier to launch your instances.
Step 4: Configure Routing
Make sure to use HTTPS and set the path for the health check to /auth/login/login.html
Step 5: Register Targets
Review and create the load balancer.

Part 4: Running DSX Installer
Step 1: Run pre-installation check
Before running the installation, make sure that the nodes pass some basic tests. You can get the script here. Run the script and see if any errors are reported while it runs.
Note: It might complain about the number of cores, but you can ignore it.
Step 2: Get installer on /install folder
Download the DSX installer (optional; not needed if you already have it locally) and make it executable.
Step 3: Create Config file for DSX Installation
Before running the installer, create a config file called wdp.conf. This file can be created with:
your_installer --get-conf-key --three-nodes
This will create the wdp.conf file. Now edit this file by replacing the "virtual_ip_address" key with "load_balancer_ip_address" and setting it to the IP address of the TCP load balancer. You also need to add two options to this file, suppress_warning and no_path_check. These options suppress any warnings that you get during installation. Any errors can be looked up later in the error logs.
load_balancer_ip_address=IP addr of TCP lb
ssh_key=/root/.ssh/id_rsa
ssh_port=22
node_1=internal ip address of master node
node_path_1=/install
node_data_1=/data
node_2=internal ip address of node2
node_path_2=/install
node_data_2=/data
node_3=internal ip address of node3
node_path_3=/install
node_data_3=/data
suppress_warning=true
no_path_check=true
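If you rebuild the cluster often, the file above can also be generated rather than edited by hand. This is a sketch only: write_wdp_conf is a hypothetical helper, the IP addresses in the usage line are placeholders, and it writes exactly the keys listed above with /install and /data as the node paths.

```shell
#!/bin/bash
# Hypothetical helper: write a wdp.conf with the keys listed above.
# Arguments: output file, TCP load balancer IP, then one private IP per node.
write_wdp_conf() {
    local out="$1" lb_ip="$2"
    local i=1 ip
    shift 2
    {
        echo "load_balancer_ip_address=${lb_ip}"
        echo "ssh_key=/root/.ssh/id_rsa"
        echo "ssh_port=22"
        # One block of three keys per node, numbered from 1.
        for ip in "$@"; do
            echo "node_${i}=${ip}"
            echo "node_path_${i}=/install"
            echo "node_data_${i}=/data"
            i=$((i + 1))
        done
        echo "suppress_warning=true"
        echo "no_path_check=true"
    } > "${out}"
}
```

A call such as write_wdp_conf wdp.conf 172.31.0.10 172.31.0.11 172.31.0.12 172.31.0.13 would produce a three-node file; the first node IP is taken to be the master node.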
Step 4: Kick off the installation
Start the installer.
Note: We recommend using screen, as the installation takes ~2 hr, so you can restore the session if it gets disconnected from your local machine.
Your_Installer --three-nodes

Part 5: Launch DSX and check the setup
After the installation has completed, launch DSX in your web browser. The IP address listed on your installer screen will not work. You need to get the IP address of the external load balancer by pinging its DNS name.
https://your_dns_ip_address/auth/login/login.html
Note: You will see a warning message when you load the URL; you can ignore this warning and proceed. You should see the login screen once you proceed to the next step.
The default username and password for first login are admin and password.
03-24-2018
10:33 PM
@Thomas Bury I had a similar problem. I had set authentication to none and had all the required packages (sasl, thrift, pyhive). I was missing the SASL PLAIN mechanism plugin, which was being used for authentication, and had to do the following:
yum install cyrus-sasl-plain
You can read more here - http://grokbase.com/t/hive/user/144tajctxv/error-in-sasl-client-start-4-sasl-4-no-mechanism-available-no-worthy-mechs-found
02-03-2018
04:47 AM
In this article, we will see how an organization can tackle the data science challenge for an Internet of Things use case. This article shows how a trucking company can use data and analytics to solve its logistics problem by leveraging Hortonworks Data Platform (HDP), Apache NiFi, and IBM’s Data Science Experience (DSX) to build a real-time data science solution that predicts violation events by the driver.
IBM DSX is a data science platform from IBM that provides the popular data science toolkits in one place. DSX and HDP solve some of the major pain points of running data science at scale, such as ease of deployment and maintenance of machine learning models, scaling the developed model to big data scale, ease of using different open source tools, and collaboration. DSX comes in multiple flavors: cloud, desktop, and local. DSX can run on top of your HDP cluster to provide advanced analytics capabilities.
The following diagram shows the major stages of a data science project -
Stage 1: Problem Definition
Imagine a trucking company that dispatches trucks across the country. The trucks are outfitted with sensors that collect data, for instance the location of the driver, weather conditions, and recent events such as speeding, the truck weaving out of its lane, or following too closely. This data is generated once per second and streamed back to the company’s servers. The company needs a way to process this stream of data and analyze it so that it can make sure trucks are traveling safely and that the driver is not likely to commit a violation anytime soon. And all this has to be done in real time.
Stage 2: ETL and Feature Extractions
I am creating a Jupyter notebook with Python, which is provided on DSX, to solve this problem. DSX also offers the choice of Scala and R. This saves me time in installing and preparing my work environment. Now, to predict violation events, we need to decide which features of the gathered data are important.
We are using historical data that has been gathered about the trucks. With DSX we don't have to worry about fetching data: DSX offers connectors that let a user fetch data from multiple sources such as HDFS, Hive, DB2 or any other source. Our data resides in HDFS with several features such as location, miles driven by a driver, and weather conditions. We perform feature engineering to create new features and exploratory analysis to find correlations between different features. DSX supports open source libraries such as Matplotlib and Seaborn, and also offers IBM visualization libraries such as Brunel and PixieDust.
Stage 3: Machine Learning
Once the data is cleaned and ready, we build a binary classification model. In this demo, we use the SparkML Random Forest classifier to classify whether a driver is likely to make a violation (T) or not (F). We split the data into training and testing sets [80:20] and train the model on the training data. After training, the model is evaluated against the test data for accuracy, precision and recall, and area under the ROC curve. Once satisfied with the model accuracy, we save the model and deploy it.
Stage 4: Model Deployment and Management
DSX models are deployed as REST endpoints, which allows developers to easily access the deployed models and integrate them into applications. This approach drastically reduces the time to build, deploy, and integrate new models. All that needs to change in existing code is the REST endpoint. DSX models can be tested through the DSX UI or by making RESTful API calls from a terminal. In addition to the ease of model deployment, managing the deployed models is easier too. DSX offers scheduled evaluations of deployed models, so a user can consistently monitor model performance and retrain the model if the performance of the current model falls short of the expected level.
Now that the model has been deployed, we can integrate it into NiFi to demonstrate real-time event scoring. We simulate an end-to-end flow in Apache NiFi, shown in the diagram below.
End-To-End Application Flow
We generate simulated events using one of the processors; these events are sent to the model hosted at the REST endpoint. The model returns T or F based on the probability that the incoming event is a violation. Based on the prediction, we create a subsequent flow where we can alert the driver.
Summary
A next step for this demo could be to build a dashboard that visualizes the results, improves performance monitoring, and triggers alerts for the trucking fleet. These alerts would be useful both to the trucking management and to the individual drivers, who could take corrective action to diminish the probability of a violation. For further reading and a detailed demo, please refer to this article: IOT AND DATA SCIENCE – A TRUCKING DEMO ON DSX LOCAL WITH APACHE NIFI
07-12-2017
10:37 PM
I ran into a similar problem as @Hari Krishnan Umapathy: my HiveServer2 was going down as soon as I started it. I recommend the following steps to debug the issue:
- Check the error logs (.err) under /var/log/hive
- Restart the service and put a tail on the new logs: tail -f hiveserver2.log
Note: I was running into an out of memory error on my node; I stopped some of the unwanted services and was then able to bring up hive and beeline.
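To narrow down which log mentions the memory problem, a quick scan of the log directory can help. This is a sketch under assumptions: find_oom_logs is a hypothetical helper, and the search strings are a guess at what an out-of-memory failure looks like in these logs, so adjust them to what your logs actually contain.

```shell
#!/bin/bash
# Hypothetical helper: list .err and .log files in a directory that mention
# an out-of-memory condition. The search strings are assumptions.
find_oom_logs() {
    local log_dir="${1:-/var/log/hive}"
    # -l prints only the names of matching files; errors from unmatched
    # globs or unreadable files are discarded.
    grep -l -i -E 'OutOfMemoryError|out of memory' \
        "${log_dir}"/*.err "${log_dir}"/*.log 2>/dev/null
}
```

Running find_oom_logs with no argument scans /var/log/hive; it returns a non-zero status when no file matches.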