Created on 06-15-2016 01:21 AM - edited 08-17-2019 12:02 PM
Hi all, here are a few steps to get a quick example working of
sqooping some Oracle data into HDFS and a Hive table, using the Oracle
Developer VM and the Hortonworks Sandbox.
It is very simple, but may provide some help for people just starting out. I will be using VirtualBox for this walkthrough,
so I am assuming you already have it installed.
Also, the Oracle VM will require about 2 GB of memory and the Sandbox about
8 GB, so you will need a machine with a decent amount of memory to give this a
try. I used a Mac with 16 GB and it ran fine.
Step1: Download the Hortonworks Sandbox and import the ova into VirtualBox
Step2: Download the Oracle Developer VM and import the ova into VirtualBox
Step3: Set up the 2 VMs so they can communicate with each
other. There are many options here, but
I set up a NAT Network on the second network adapter of both VMs for this
test. A few diagrams are below to help. Basically, set up a new NAT Network in VirtualBox
under the VirtualBox Preferences menu ->
select the Network icon and add a new NAT Network (displayed below; I called it
DensNetwork). Then go into the settings
for both VMs, go to Network, click on the 2nd adapter and follow
the diagrams below.
VB natnetwork diagram
VB settings – sandbox
VB settings – Oracle VM
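If you prefer the command line to the GUI, the same setup can be sketched with VBoxManage. This is only a rough equivalent of the diagrams above: the network range and the two VM names are assumptions (use whatever names appear in your VirtualBox manager), and both VMs should be powered off before changing their adapters. By default the script only prints the commands; set DRY_RUN=0 to actually run them.

```shell
#!/bin/sh
# Name of the NAT Network, as used in this walkthrough.
NETWORK_NAME="DensNetwork"
# Assumption: any free private range works; this one is only an example.
NETWORK_CIDR="10.0.2.0/24"
# Hypothetical VM names; replace with the names shown in VirtualBox.
SANDBOX_VM="Hortonworks Sandbox"
ORACLE_VM="Oracle Developer VM"

# Print each command, and execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  fi
}

# Create the NAT Network with DHCP enabled.
run VBoxManage natnetwork add --netname "$NETWORK_NAME" \
    --network "$NETWORK_CIDR" --enable --dhcp on

# Attach the 2nd adapter of each VM to the NAT Network.
for vm in "$SANDBOX_VM" "$ORACLE_VM"; do
  run VBoxManage modifyvm "$vm" --nic2 natnetwork --nat-network2 "$NETWORK_NAME"
done
```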
Step4: Fire up the VMs, open a terminal session, and ssh
into the sandbox
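As a rough guide, the commands for this step might look like the sketch below. It assumes the sandbox's usual port forward (host port 2222 to the guest's SSH port) and the root user, and the Oracle VM address is purely hypothetical, so check the address your NAT Network actually handed out. The script only prints the commands so you can copy them.

```shell
#!/bin/sh
# Assumption: the sandbox forwards its SSH port to 2222 on the host.
SANDBOX_HOST="127.0.0.1"
SANDBOX_SSH_PORT="2222"
SANDBOX_USER="root"

# Hypothetical address of the Oracle VM on the NAT Network; substitute your own.
ORACLE_VM_IP="10.0.2.5"

# Command to run on the host to log in to the sandbox.
SSH_CMD="ssh -p ${SANDBOX_SSH_PORT} ${SANDBOX_USER}@${SANDBOX_HOST}"
# Command to run inside the sandbox to confirm it can reach the Oracle VM.
PING_CMD="ping -c 3 ${ORACLE_VM_IP}"

echo "On the host:        ${SSH_CMD}"
echo "Inside the sandbox: ${PING_CMD}"
```

If the ping gets no reply, recheck that the 2nd adapter on both VMs is attached to the same NAT Network.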
Step5:
You can read up a little on Oracle CDBs (container databases) and PDBs (pluggable databases); it will help with understanding
the JDBC connection if needed.
The Oracle VM database has a SID of orcl12c and a pluggable DB of
orcl; all passwords are oracle.
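To make the CDB/PDB difference concrete, here is a small sketch of the two JDBC URL styles. The SID form (host:port:SID) points at the container database, while the service-name form (//host:port/service) points at the pluggable database. The host address is hypothetical, port 1521 is the usual Oracle listener default rather than something confirmed for this VM, and I am assuming the PDB's service name matches its name (orcl) and that the system account is used. The Sqoop call only runs if sqoop is on your PATH (for example inside the sandbox) and needs the Oracle JDBC driver jar in Sqoop's lib directory.

```shell
#!/bin/sh
# Hypothetical Oracle VM address; replace with your own.
ORACLE_HOST="10.0.2.5"
# Assumption: the listener runs on the Oracle default port.
ORACLE_PORT="1521"

# SID-style URL: connects to the container database (CDB).
CDB_URL="jdbc:oracle:thin:@${ORACLE_HOST}:${ORACLE_PORT}:orcl12c"
# Service-name URL: connects to the pluggable database (PDB).
PDB_URL="jdbc:oracle:thin:@//${ORACLE_HOST}:${ORACLE_PORT}/orcl"

echo "CDB: ${CDB_URL}"
echo "PDB: ${PDB_URL}"

# Quick connectivity check: list the tables visible in the PDB.
# Assumption: the 'system' account; the password comes from the walkthrough.
if command -v sqoop >/dev/null 2>&1; then
  sqoop list-tables --connect "$PDB_URL" --username system --password oracle \
    || echo "sqoop could not connect; check the address, port and JDBC driver"
fi
```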