Member since: 05-23-2016
Posts: 16
Kudos Received: 22
Solutions: 0
09-27-2017
08:59 AM
@Mohammad Shazreen Bin Haini, Based on the question tag, I believe you're using the Sandbox. You will need to SSH into the Sandbox and type the commands there. You can learn more here: https://hortonworks.com/hadoop-tutorial/learning-the-ropes-of-the-hortonworks-sandbox/
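For example, from a terminal on your host machine (assuming the sandbox's default SSH port of 2222; adjust the port and user if your sandbox is configured differently):

# SSH into the sandbox, then run your commands there
ssh root@127.0.0.1 -p 2222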
09-15-2017
02:42 AM
3 Kudos
Hi Team, We have a few use cases that require using CSV tables as lookups. Based on the docs, this is something the LookupRecord processor is designed to do. Do we have any example flows / templates / tutorials that we can use? For a bit more context, here's what we're trying to achieve:
- We have a flowfile with a certain attribute name.
- The value of this attribute represents a key that we'd like to search for.
- We have a CSV file with 3 columns and a header (let's say key, col1, col2).
- We'd like to add col1 and col2 as attributes, with their respective values as attribute values, for the given key.
A made-up example follows below.
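To make this concrete, the lookup file might look like this (values are made up for illustration):

key,col1,col2
cust-001,Alice,Premium
cust-002,Bob,Standard

So if an incoming flowfile has an attribute whose value is "cust-002", we'd like it to come out with col1=Bob and col2=Standard added as attributes.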
Labels:
- Apache NiFi
10-05-2016
05:28 AM
1 Kudo
Hi, I was wondering if anyone has tried setting up Zeppelin to work with Spark2 on HDP 2.5? As far as I understand, I can switch the Spark version via the SPARK_MAJOR_VERSION variable, and it works for spark-submit, but Zeppelin is still using Spark1. Ideally, I want to have two different interpreter groups, one for each major Spark version, but just switching it to Spark2 would be a good start. Any ideas?
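For reference, this is how we currently switch versions on the command line (it works for spark-submit, but Zeppelin doesn't pick it up):

# select Spark2 for command-line tools on HDP 2.5
export SPARK_MAJOR_VERSION=2
spark-submit --version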
Labels:
09-28-2016
06:49 AM
1 Kudo
Hi, After restarting the Ambari server, it failed to launch and we started to see the following error:
Ambari database consistency check started...
No errors were found.
ERROR: Exiting with exit code 1.
REASON: Database check failed to complete. Please check /var/log/ambari-server/ambari-server.log and /var/log/ambari-server/ambari-server-check-database.log for more information.
In the logs, we can find this:
Error injecting constructor, org.apache.ambari.server.AmbariException: Unable to find stack definitions under stackRoot = /var/lib/ambari-server/resources/stacks
Has anyone encountered this error before? What might be the reason it happened, and what are the possible solutions? We're using HDF 2.0 on RHEL 7.
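In case it helps with diagnosis, these are read-only checks we can run against the stack root named in the error (paths taken from the message above; not a fix):

# the directory the error complains about; it should contain a stack definition
ls -la /var/lib/ambari-server/resources/stacks
# see which stack-related settings ambari-server is configured with
grep -i stack /etc/ambari-server/conf/ambari.properties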
Labels:
- Apache Ambari
- Cloudera DataFlow (CDF)
09-06-2016
12:55 PM
7 Kudos
Setting up GPU-enabled TensorFlow to work with Zeppelin

Sometimes we want to do some quick Deep Learning prototyping using TensorFlow. We also want to take advantage of Spark for data pre-processing, scaling, and feature extraction, while keeping it all in the same place for a demo. This step-by-step guide will go through the process of setting up these tools to work with each other.

My setup:
- AWS GPU-enabled instance (any g2)
- Ubuntu 14.04
- HDP 2.5

This tutorial is partly based on this script: https://gist.github.com/erikbern/78ba519b97b440e10640, but has been heavily updated and modified.

First, let's install and set up some prerequisites:

sudo apt-get update
sudo apt-get upgrade -y # choose the maintainer's version when prompted
sudo apt-get install -y build-essential python-pip python-dev git python-numpy swig default-jdk zip zlib1g-dev

Then, disable nouveau and update initramfs:

echo -e "blacklist nouveau\nblacklist lbm-nouveau\noptions nouveau modeset=0\nalias nouveau off\nalias lbm-nouveau off\n" | sudo tee /etc/modprobe.d/blacklist-nouveau.conf
echo options nouveau modeset=0 | sudo tee -a /etc/modprobe.d/nouveau-kms.conf
sudo update-initramfs -u
sudo reboot # we do actually need to reboot it
sudo apt-get install -y linux-image-extra-virtual
sudo reboot

Let's also install linux-headers and linux-source:

sudo apt-get install -y linux-source linux-headers-`uname -r`

Now we need to get CUDA. At the time of writing this article, the latest version was 7.5.

wget http://developer.download.nvidia.com/compute/cuda/7.5/Prod/local_installers/cuda_7.5.18_linux.run
chmod +x cuda_7.5.18_linux.run
./cuda_7.5.18_linux.run -extract=`pwd`/nvidia_installers

cd nvidia_installers
sudo ./NVIDIA-Linux-x86_64-352.39.run
sudo modprobe nvidia
sudo ./cuda-linux64-rel-7.5.18-19867135.run
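# optional sanity check, assuming the installers above finished without errors:
# the driver should report the GPU
nvidia-smi
# the toolkit installs under /usr/local/cuda-7.5 with a /usr/local/cuda symlink by default
/usr/local/cuda/bin/nvcc --version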
cd ..

It would be too easy for NVIDIA if that was it. Now we need to get cuDNN from their Accelerated Computing Program. You will need to apply for it here: https://developer.nvidia.com/cudnn. Approval shouldn't take more than a couple of hours. Once approved, go to the download page (https://developer.nvidia.com/rdp/cudnn-download) and get cuDNN v5 for CUDA 7.5. Then, scp this file from your local machine to the remote host.
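Something like this from your local machine (the user and host are placeholders; use whatever you normally connect to the instance with):

scp cudnn-7.5-linux-x64-v5.0-ga.tar ubuntu@your-gpu-host:~/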
Once it’s there, do:

tar -xf cudnn-7.5-linux-x64-v5.0-ga.tar
sudo cp cuda/lib64/* /usr/local/cuda/lib64/
sudo cp cuda/include/cudnn.h /usr/local/cuda/include/
Next, install Java 8, which Bazel and the TensorFlow build need:

sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update
sudo apt-get install oracle-java8-installer

We will also need a build tool called Bazel. I used Bazel 0.3.1, but a newer version should work too.

wget https://github.com/bazelbuild/bazel/releases/download/0.3.1/bazel-0.3.1-installer-linux-x86_64.sh
chmod +x bazel-0.3.1-installer-linux-x86_64.sh
./bazel-0.3.1-installer-linux-x86_64.sh --user

Let's save environment variables to ~/.bashrc:

echo 'export PATH="$PATH:$HOME/bin"' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64"' >> ~/.bashrc
echo 'export CUDA_HOME=/usr/local/cuda' >> ~/.bashrc
source ~/.bashrc

Getting TensorFlow from GitHub:

git clone --recurse-submodules https://github.com/tensorflow/tensorflow
cd tensorflow

Here we need to do some manual setup. Open third_party/gpus/crosstool/CROSSTOOL and add "/usr/local/cuda-7.5/include" to the cxx_builtin_include_directory entries. It should look something like the sketch below.
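Abridged sketch of the relevant section of CROSSTOOL after the edit (the surrounding entries in your copy may differ; the added line is the cuda-7.5 include path):

toolchain {
  ...
  cxx_builtin_include_directory: "/usr/local/include"
  cxx_builtin_include_directory: "/usr/include"
  cxx_builtin_include_directory: "/usr/local/cuda-7.5/include"
  ...
}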
Then, start configuration and building:

./configure
bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg

Finally, install the TensorFlow package from the wheel:

sudo pip install /tmp/tensorflow_pkg/tensorflow-0.10.0rc0-cp27-none-linux_x86_64.whl

At this point, we have TensorFlow working with GPU support, but we also need to make it available for Zeppelin. To do that, we need to show Zeppelin where our CUDA is installed:

echo 'export PATH="$PATH:$HOME/bin"' >> /etc/zeppelin/conf/zeppelin-env.sh
echo 'export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64"' >> /etc/zeppelin/conf/zeppelin-env.sh
echo 'export CUDA_HOME=/usr/local/cuda' >> /etc/zeppelin/conf/zeppelin-env.sh
This should work with Zeppelin now. If Python throws an error saying it can't find CUDA while importing TensorFlow, try again after restarting Zeppelin.
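A quick smoke test of the GPU build (a minimal sketch; run it on the host as shown, or paste the Python part into a %pyspark Zeppelin paragraph):

python - <<'EOF'
import tensorflow as tf
# With log_device_placement=True, TensorFlow logs which device each op lands on;
# look for a /gpu:0 entry in the output to confirm the GPU is actually being used.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
print(sess.run(tf.constant('TensorFlow imported and session created OK')))
EOF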
06-20-2016
06:33 AM
3 Kudos
@Timothy Spann If it's still relevant, there is a project called TensorFrames that is pretty much a TensorFlow wrapper for Spark's DataFrames: https://github.com/tjhunter/tensorframes. Although I am not sure whether it is being actively developed, it looked functional the last time I checked.