Created on 06-24-2016 12:41 AM - edited 08-17-2019 11:56 AM
How to make Mysql Database as Hive’s instance:
Install Mysql if not available:
brew update
brew doctor
brew upgrade
brew install mysql
mysql.server restart
mysql_secure_installation
login to mysql -> mysql –u root –p
Enter password:
Happy Mysql learning….
Mysql is already installed on Hortonworks sandbox.
Steps:
Confirm with
mysql –u root –p
Import an already available database into Mysql:
Configuration of Mysql Instance with Hive:
From HIVE create Mysql metastore
[root@host]# mysqladmin -u root create hivedb
mysql> USE hivedb;
mysql> CREATE USER 'hive'@'localhost' IDENTIFIED BY 'hive';
mysql> GRANT ALL PRIVILEGES ON *.* TO 'hive'@'localhost';
With this we confirm that mysql database is the Hive’s new metastore.
Suppose to perform a full import of the ‘employees’ and ‘salaries’ tables into HDP:
Tables created in Hive
Create database employees;
Use employees;
CREATE EXTERNAL TABLE IF NOT EXISTS employees (
emp_no INT,
birth_date DATE,
first_name VARCHAR(14),
last_name VARCHAR(16),
gender STRING,
hire_date DATE
)
STORED AS TEXTFILE;
CREATE TABLE IF NOT EXISTS salaries (
salary INT,
from_date DATE,
to_date DATE
) STORED AS TEXTFILE;
sqoop import --connect jdbc:mysql://172.16.16.128:3306/employees --username=hive --password=hive --driver com.mysql.jdbc.Driver --table=employees --hive-import --hive-table=empl.employees --target-dir=wp_users_import –direct
sqoop import --connect jdbc:mysql://172.16.16.128:3306/employees --username=hive --password=hive --driver com.mysql.jdbc.Driver --table=employees --hive-import --hive-table=empl.salaries --target-dir=wp_users_import –direct
Suppose we need to perform some cleansing of data using Regex expressions of Hive:
use empl;
drop table empl.empl_clean;
show tables;
create table empl.empl_clean(emp_no INT, birth_date STRING, first_name STRING,
last_name STRING,gender STRING, hire_date STRING
);
insert overwrite table empl.empl_clean
SELECT
regexp_replace(employees.emp_no, '\t', '')emp_no,
regexp_replace(employees.birth_date, '\t', '')birth_date,
regexp_replace(employees.first_name, '\t', '')first_name,
regexp_replace(employees.last_name, '\t', '')last_name,
regexp_replace(employees.gender, '\t', '')gender,
regexp_replace(employees.hire_date, '\t', '')hire_date
from empl.employees;
select * from empl.empl_clean limit 100;
Cleansing the salaries table:
drop table empl.salary_clean;
create table empl.salary_clean(emp_no INT,salary INT, from_date STRING, to_date STRING);
insert overwrite table empl.salary_clean
regexp_replace(salaries.emp_no, '\t', '')emp_no,
regexp_replace(salaries.salary, '\t', '')salary,
regexp_replace(salaries.from_date, '\t', '')from_date,
regexp_replace(salaries.to_date, '\t', '')to_date
from empl.salaries;
select * from empl.salary_clean limit 100;
Happy Learning….