Member since: 03-16-2016
Posts: 707
Kudos Received: 1753
Solutions: 203
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 5105 | 09-21-2018 09:54 PM |
 | 6484 | 03-31-2018 03:59 AM |
 | 1964 | 03-31-2018 03:55 AM |
 | 2171 | 03-31-2018 03:31 AM |
 | 4800 | 03-27-2018 03:46 PM |
06-21-2016
08:24 PM
1 Kudo
@Pirlouis Pirlouis Yes. That is the same for RHEL 5.x. For the other question, I am not aware of any special issues.
06-21-2016
03:55 PM
2 Kudos
@Pirlouis Pirlouis If you are on a newer version of HDP (2.3.x or 2.4.x), we *really* suggest using ODBC driver 2.1.2+ and upgrading to RHEL 6.x. The last ODBC driver Hortonworks provided for CentOS 5.x was 2.0.0; nothing after 2.0.0 is supported on RHEL 5.x. This is the repo you need: http://public-repo-1.hortonworks.com/HDP/hive-odbc/2.0.0-1000/centos5/hive-odbc-native-2.0.0.1000-centos5.tar.gz, or you can go to http://hortonworks.com/downloads/#data-platform, expand the Archive section and scroll down to
HDP 2.2 Add-Ons --> Hortonworks ODBC Driver for Apache Hive (v2.0) --> CentOS 5.x (a short download sketch follows). If this is what you wanted, please vote the response and accept it as the best answer.
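For reference, fetching and unpacking that archive from the command line could look like this (a sketch only; the tarball name is taken from the URL above, and you would still follow the install instructions shipped inside it):

wget http://public-repo-1.hortonworks.com/HDP/hive-odbc/2.0.0-1000/centos5/hive-odbc-native-2.0.0.1000-centos5.tar.gz
tar -xzf hive-odbc-native-2.0.0.1000-centos5.tar.gz   # unpack the driver package for CentOS 5.x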
06-16-2016
11:33 PM
2 Kudos
@atul gupta For the Kafka log-miner use case (and not only that), see: https://github.com/linkedin/databus. Databus understands Oracle redo logs. For GoldenGate, that is an out-of-the-box capability. It is not clear why you would use customized open-source technologies to push data to GoldenGate; maybe you want to replace GoldenGate altogether. Your business case needs more clarification. If you like this response, please vote for it.
06-16-2016
11:07 PM
3 Kudos
@ammu ch This is a loaded question. It depends on the version of Spark you installed, the version of Ambari, the version of HDP, whether you want to use the power of YARN, and whether you want to keep track of configuration changes (keep in mind that Ambari provides configuration management, integration with dashboards, etc.). It can get complex, and Hortonworks does not support this approach. You must have a serious reason not to use "Add Service" from Ambari to install your new Spark cluster and to be willing to deal with all these complexities. You should save your Spark configurations and replicate them within Ambari. If you still want to explore the options, please be more specific about the versions of the above, plus the use of YARN, plus the topology of the Spark cluster you installed (multi-node?).
06-13-2016
02:41 PM
15 Kudos
Objective

Deploy a 4-node HDP 2.4.2 cluster with Apache Ambari 2.2.2, Vagrant and VirtualBox on an OS X host. This is helpful for development and proofs of concept.

Scope

This approach has been tested on an OS X host, but it should work in all supported Vagrant and VirtualBox environments.

Pre-requisites

Minimum 9 GB of RAM for the HDP 2.4.2 cluster
Download and install Vagrant for your host OS: https://www.vagrantup.com/downloads.html
Download and install VirtualBox for your host OS: https://www.virtualbox.org/wiki/Downloads
Download and install a git client for your host
Open a command shell, change to the folder where you plan to clone the GitHub repository, and clone it:

git clone https://github.com/cstanca1/hdp2_4_2-vagrant.git

Create and Start VMs

Change directory to hdp_2.4.2-vagrant, the cloned folder that includes the Vagrantfile, and create a data folder:

mkdir data

This data folder will be shared between the guest VMs and the host. Vagrant (via the Vagrantfile) is configured to use CentOS 6.7 as the base box and includes the pre-requisites for installing HDP. Four VMs will be created: 1 Ambari server (ambari1), 1 Hadoop master (master1) and 2 slaves (slave1, slave2).

vagrant up ambari1
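Before moving on to the Ambari setup, it can help to confirm the first VM is actually running; vagrant status and vagrant ssh are standard Vagrant commands, nothing specific to this repository:

vagrant status       # ambari1 should be reported as "running"
vagrant ssh ambari1  # optional: log in to the VM to verify, then type exit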
Install and Setup Ambari Server

Set a Local Reference to a Remote Ambari Repo

vagrant ssh ambari1
sudo su -
cd /etc/yum.repos.d
wget http://public-repo-1.hortonworks.com/ambari/centos6/2.x/updates/2.2.2.0/ambari.repo

Setup SSH Access and Start the Other 3 VMs

At the same path as the files downloaded from the repository, add your id_rsa and id_rsa.pub keys (see https://wiki.centos.org/HowTos/Network/SecuringSSH, section 7, for instructions on CentOS). You can perform these steps on the ambari1 VM and copy the two files to your /vagrant_data folder, which shares data between guest and host (a sketch of these key-related commands appears at the end of this post). Only after you copy those two files, start the other three VMs:

vagrant up master1
vagrant up slave1
vagrant up slave2

Install Ambari Server

yum install ambari-server

Setup Ambari Server

Run the setup command to configure your Ambari Server, database, JDK, LDAP, and other options:

ambari-server setup

Start Ambari Server

ambari-server start

Deploy Cluster using Ambari Web UI

Open a web browser and go to http://ambari1:8080. Log in with username admin and password admin, then follow the on-screen instructions, using the hosts you created and selecting the services of interest. For more details, see "Automated Install" at: https://docs.hortonworks.com/HDPDocuments/Ambari/Ambari-2.2.2.0/index.html
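As promised above, a minimal sketch of the SSH key step, assuming you generate the keys inside the ambari1 VM and use the /vagrant_data shared folder to hand them to the host; the exact paths are assumptions, so adjust them to your environment:

vagrant ssh ambari1
sudo su -
ssh-keygen -t rsa -N "" -f /root/.ssh/id_rsa                 # creates id_rsa and id_rsa.pub (empty passphrase, fine for a dev cluster)
cp /root/.ssh/id_rsa /root/.ssh/id_rsa.pub /vagrant_data/    # shared folder visible on the host, so the other VMs can pick the keys up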
06-11-2016
11:03 PM
@Armando Segnini Thank you so much for your review. Your findings were spot-on. I had a few typos and omitted a mv command. Excellent catches.
06-11-2016
10:46 PM
@Artem Ervits, @Deepesh, @Mike Riggs I'm trying this in the HDP 2.4 sandbox with SQL Server Express 2014. Connectivity is OK. Even if you add the driver jar to the /var/lib/sqoop/lib/ or /usr/lib/sqoop/lib folder, how do you get past the fact that a connection manager needs to be set to use a factory class in order to use the Microsoft driver? The error (even though it is only shown as a WARN) is:

WARN sqoop.ConnFactory: Parameter --driver is set to an explicit driver however appropriate connection manager is not being set (via --connection-manager). Sqoop is going to fall back to org.apache.sqoop.manager.GenericJdbcManager. Please specify explicitly which connection manager should be used next time.

Due to this, the command does not work. The error is thrown after executing a command like this:

sqoop list-databases --driver com.microsoft.sqlserver.jdbc.SQLServerDriver --connect jdbc:sqlserver://10.226.170.191\poc:1433 --username WHATEVERUSER --password WHATEVERPASSWORD

A --connection-manager directive seems to be needed. What needs to be added?
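For illustration only, a sketch of the same command with an explicit connection manager; org.apache.sqoop.manager.SQLServerManager ships with Sqoop, but whether it works with list-databases and the named-instance connect string here is exactly what I'm trying to figure out:

sqoop list-databases \
  --connection-manager org.apache.sqoop.manager.SQLServerManager \
  --driver com.microsoft.sqlserver.jdbc.SQLServerDriver \
  --connect "jdbc:sqlserver://10.226.170.191\poc:1433" \
  --username WHATEVERUSER --password WHATEVERPASSWORD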
06-09-2016
03:32 AM
1 Kudo
@Micheal Kubbo What HDP sandbox version do you use? I am intrigued whether your sandbox supports HTTP 1.0 or 1.1. True, 1.0 is very old and it is unlikely to be the case, but you should still check. I ran the following command on the HDP 2.4 sandbox: curl --head 127.0.0.1 and the result is: HTTP/1.1 200 OK
Date: Thu, 09 June 2016 03:26:25 GMT
Server: gunicorn/19.1.1
...
06-09-2016
03:02 AM
4 Kudos
@sameer lail What you did is not stupid. CSV is a file format, not a data structure in R. What you could do is create a dataframe with a single column with all values separated by commas, then use hdfs write to output that as a file with a .csv extension. Another option is to write map-reduce with R and the streaming API and set the output to be csv, as sketched below. If any of my responses were helpful, please don't forget to vote for them.
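For the second (streaming) option, a rough sketch of what the job submission could look like on HDP; to_csv.R is a hypothetical mapper script that reads stdin and emits comma-separated lines, and this assumes Rscript is installed on every cluster node:

hadoop jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-streaming.jar \
  -D mapreduce.job.reduces=0 \
  -input /user/sameer/input \
  -output /user/sameer/output_csv \
  -mapper to_csv.R \
  -file to_csv.R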
06-06-2016
09:21 PM
4 Kudos
@sameer lail What data format is the file that you assign to the modelfile dataframe? If it is not csv, then you would need to convert it to csv before writing it to HDFS. If it is csv, then check this Q&A: https://community.hortonworks.com/questions/36583/how-to-save-data-in-hdfs-using-r.html