Member since: 05-22-2018
Posts: 69
Kudos Received: 1
Solutions: 2
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 3915 | 06-07-2018 05:33 AM
 | 957 | 05-30-2018 06:30 AM
06-11-2018
12:41 PM
Hi all, I have installed the Hortonworks Sandbox in an Oracle VirtualBox virtual machine. Now I want to upload files from Windows to HDFS. I have tried the following command:

C:\InputFileWindows>scp -p 22 datafile.txt root@localhost:

But I am not able to do that; it gives me a permission denied error.
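A sketch of the usual fix, assuming the sandbox forwards SSH on port 2222 (the VirtualBox default for the Hortonworks Sandbox); note that scp's port flag is a capital -P, while lowercase -p only preserves file attributes:

# copy the file from Windows onto the sandbox's local filesystem
scp -P 2222 datafile.txt root@localhost:/tmp/
# then, from an SSH session inside the sandbox, push it into HDFS
hdfs dfs -put /tmp/datafile.txt /user/root/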
Labels:
- Apache Hadoop
06-07-2018
05:33 AM
Hi all, I have solved the issue as follows: I copied sqljdbc42.jar into the Hive and HCatalog sharelib paths on HDFS, and appended the job.properties file with values for the oozie.action.sharelib.for.sqoop and oozie.action.sharelib.for.hive properties. job.properties:
nameNode=hdfs://sandbox.hortonworks.com:8020
jobTracker=sandbox.hortonworks.com:8050
queueName=default
appPath=${nameNode}/<HDFS_path_where_workflow.xml_file>
oozie.use.system.libpath=true
oozie.libpath=${nameNode}/user/oozie/share/lib/lib_20161025075203/
oozie.wf.application.path=${appPath}
#SHARELIB PATH FOR ACTION#
oozie.action.sharelib.for.sqoop=hive,hcatalog,sqoop
oozie.action.sharelib.for.hive=hive,hcatalog,sqoop

Note: You could exclude oozie.libpath from job.properties. Regards, Jay.
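For reference, a sketch of the jar-copy step; the sharelib timestamp directory is taken from oozie.libpath above (yours will differ), and the Oozie URL assumes the sandbox default:

# copy the SQL Server JDBC driver into the Hive and HCatalog sharelib directories
hdfs dfs -put sqljdbc42.jar /user/oozie/share/lib/lib_20161025075203/hive/
hdfs dfs -put sqljdbc42.jar /user/oozie/share/lib/lib_20161025075203/hcatalog/
# tell Oozie to pick up the updated sharelib
oozie admin -oozie http://sandbox.hortonworks.com:11000/oozie -sharelibupdate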
06-04-2018
02:02 PM
Hey @Shu, I have gone through both links you gave. That link is about "how to import data from multiple sources through Sqoop?"; it says nothing about importing only specific tables using Sqoop. They created a script for importing tables and also listed all 100 table names, i.e., table1, table2, table3, table4, table5, table6, and so on. But I don't want to list all those table names. Regards, Jay.
06-04-2018
01:55 PM
@Geoffrey Shelton Okot, thank you. But 98 is just a number; let's think about a bigger number and change the scenario. What if the MsSQL database has 500 tables and I want to import only 100 of them? Do I need to mention all 100 tables in the --table argument? Regards, Jay.
06-04-2018
01:44 PM
Hi @Geoffrey Shelton Okot, thanks. Yes, I tried the --exclude-tables argument to exclude some of the tables. But what if I want to import only 2 tables out of 100 from MsSQL into a Hive database? Do I need to mention all 98 table names in the --exclude-tables argument?
06-04-2018
12:40 PM
Hi all, I want to import specific tables from MsSQL into a Hive database. I have tried the --exclude-tables argument to exclude some tables while importing, but my scenario is a little different. For example, if I have 100 tables and want to import only 98 of them, I can exclude those 2 tables using the --exclude-tables argument. But what if I want to import only 2 tables out of those 100? I tried giving multiple table names in the --table argument:

import-all-tables \
  --connect "jdbc:sqlserver://<HOST>:<port>;databasename=<mssql_database_name>" \
  --username xxxxx \
  --password xxxx \
  --table mssql_table1,mssql_table2 \
  --hive-import \
  --hive-database <hive_database_name> \
  --fields-terminated-by "," \
  -m 1

Does anyone have an idea? Regards, Jay.
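A minimal sketch of one possible workaround, assuming a plain shell loop that runs a separate sqoop import per wanted table (placeholders as in the command above):

# import only the named tables, one sqoop job each
for TBL in mssql_table1 mssql_table2; do
  sqoop import \
    --connect "jdbc:sqlserver://<HOST>:<port>;databasename=<mssql_database_name>" \
    --username xxxxx \
    --password xxxx \
    --table "$TBL" \
    --hive-import \
    --hive-database <hive_database_name> \
    --fields-terminated-by "," \
    -m 1
done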
Labels:
- Apache Hive
- Apache Sqoop
05-30-2018
06:30 AM
1. It is a streaming platform. It distributes data on a publish-subscribe model, with a storage layer and a processing layer.
2. It is an enterprise messaging system. Big Data infrastructure is largely open source, yet the big data market is worth approximately $40B per year and growing day by day, and much of that money goes into hardware. Despite the open-source nature of much of this software, there's a lot of money to be made.
3. Kafka has connectors that import and export data from databases and other systems. Kafka Connect provides connectors, i.e. source connectors, sink connectors, and the JDBC connector; it provides a facility for importing data from sources and exporting it to multiple targets.

Producers can only push data to a Kafka broker, or we can say publish data. Consumers can only pull data from the Kafka broker. Thank you, Jay.
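A quick sketch of the producer and consumer roles using the console clients that ship with Kafka, assuming a broker on localhost:9092 and a topic named test:

# producer: pushes (publishes) messages to the broker
kafka-console-producer.sh --broker-list localhost:9092 --topic test
# consumer: pulls messages from the broker
kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test --from-beginning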
05-30-2018
06:11 AM
@Shu Thank you so much; your command works for me. So, as per my observation of the sqoop-import command: we cannot use the --hive-import and --target-dir/--warehouse-dir arguments together if we have already created an external Hive table at the target directory. Note: If we want to import an RDBMS table into a specific directory in HDFS, use only the --target-dir argument. If we want to import an RDBMS table into a Hive table backed by a specific HDFS directory, first create the external Hive table and use only the --hive-import argument. When we use the --query argument, we can use both arguments at once, i.e., --hive-import and --target-dir/--warehouse-dir. Regards, Jay.
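To summarize the two cases as a sketch (placeholders as in the earlier commands; illustrative rather than verified on every Sqoop version):

# case 1: plain HDFS import into a chosen directory, no Hive involved
sqoop import \
  --connect "jdbc:sqlserver://<HOST>:<PORT>;databasename=<mssql_database_name>" \
  --username XXXX --password XXXX \
  --table <mssql_table> \
  --target-dir '/user/root/hivetable/' \
  -m 1
# case 2: import into a pre-created external Hive table, letting Hive resolve the location
sqoop import \
  --connect "jdbc:sqlserver://<HOST>:<PORT>;databasename=<mssql_database_name>" \
  --username XXXX --password XXXX \
  --table <mssql_table> \
  --hive-import \
  --hive-table <hivedatabase.hivetable> \
  -m 1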
05-29-2018
02:12 PM
Thank you for the response, @Shu. Yes, I deleted the --create-hive-table and --hive-import arguments, created an external Hive table in the '/user/root/hivetable/' directory, and executed the following command:

sqoop import \
  --connect jdbc:sqlserver://<HOST>:<PORT> \
  --username XXXX --password XXXX \
  --table <mssql_table> \
  --hive-table <hivedatabase.hivetable> \
  --target-dir '/user/root/hivetable/' \
  --fields-terminated-by ',' \
  -m 1

But it says "File already exists". Regards, Jay.
05-29-2018
10:16 AM
Hi @Shu, thank you for the positive reply. But from my observation, it did not import the MsSQL table into the selected target directory in HDFS, i.e., not into '/user/root/hivetable'; it is storing the table into the '/apps/hive/warehouse/' directory. Jay.