Member since 05-17-2020
2 Posts, 0 Kudos Received, 0 Solutions
05-17-2020 05:49 PM
Resolved the issue by configuring the Sqoop action with individual <arg> elements instead of a single <command>, and by setting --hcatalog-home. With <arg>, each value reaches Sqoop as one argument, so the quoted storage stanza is no longer broken apart at its spaces. The working action is below.

<action name="run-sqoop" cred="hcat_credentials">
    <sqoop xmlns="uri:oozie:sqoop-action:0.4">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <job-xml>/apps/dif/config/hive-site.xml</job-xml>
        <configuration>
            <property>
                <name>mapred.job.queue.name</name>
                <value>${queueName}</value>
            </property>
            <property>
                <name>oozie.launcher.mapred.job.queue.name</name>
                <value>${oozieQueueName}</value>
            </property>
            <property>
                <name>hive.execution.engine</name>
                <value>mr</value>
            </property>
        </configuration>
        <arg>import</arg>
        <arg>-Dhadoop.security.credential.provider.path=${jceks_locn}</arg>
        <arg>--connect</arg>
        <arg>${jdbc_url}</arg>
        <arg>--username</arg>
        <arg>${user_name}</arg>
        <arg>--password-alias</arg>
        <arg>${password_aloas}</arg>
        <arg>--compress</arg>
        <arg>--compression-codec</arg>
        <arg>org.apache.hadoop.io.compress.SnappyCodec</arg>
        <arg>--table</arg>
        <arg>${table_name}</arg>
        <arg>--validate</arg>
        <arg>--escaped-by</arg>
        <arg>\\</arg>
        <arg>--null-string</arg>
        <arg>\\N</arg>
        <arg>--null-non-string</arg>
        <arg>\\N</arg>
        <arg>--map-column-hive</arg>
        <arg>${map_coluns}</arg>
        <arg>--hive-delims-replacement</arg>
        <arg>" "</arg>
        <arg>--hcatalog-home</arg>
        <arg>/usr/hdp/current/hive-webhcat</arg>
        <arg>--hcatalog-database</arg>
        <arg>${target_db_name}</arg>
        <arg>--hcatalog-table</arg>
        <arg>${target_table_name}</arg>
        <arg>--drop-and-create-hcatalog-table</arg>
        <arg>--hcatalog-storage-stanza</arg>
        <arg>"STORED AS ORC tblproperties ('orc.compress'='SNAPPY')"</arg>
        <arg>--m</arg>
        <arg>1</arg>
        <arg>--skip-dist-cache</arg>
        <file>/apps/dif/config/hive-site.xml</file>
    </sqoop>
    <ok to="end"/>
    <error to="fail"/>
</action>
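For anyone converting a long Sqoop command into this form by hand, here is a minimal Python sketch that emits one <arg> element per argument. It is only an illustration: the helper name build_arg_elements and the shortened argument list are my own, not part of Oozie or Sqoop, and the output still has to be pasted into a real workflow.xml.

```python
import xml.etree.ElementTree as ET


def build_arg_elements(args):
    """Return one <arg> element per Sqoop argument (hypothetical helper).

    Each list entry becomes the text of exactly one element, so a value
    that contains spaces stays a single argument.
    """
    elements = []
    for value in args:
        arg = ET.Element("arg")
        arg.text = value
        elements.append(arg)
    return elements


# Shortened, illustrative argument list based on the action above.
sqoop_args = [
    "--hcatalog-storage-stanza",
    "\"STORED AS ORC tblproperties ('orc.compress'='SNAPPY')\"",
    "--m",
    "1",
]

for element in build_arg_elements(sqoop_args):
    # ElementTree escapes &, < and > in the text if they ever appear.
    print(ET.tostring(element, encoding="unicode"))
```

Building the elements from a list rather than from one command string avoids having to think about shell-style quoting at all, since the list already says where each argument starts and ends.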
05-17-2020 12:40 PM
I want to import data from RDBMS systems into HDFS in ORC format with Snappy compression. Following the instructions given in the forum, I am using the Sqoop HCatalog approach to achieve this. However, --hcatalog-storage-stanza does not seem to support ORC file options; only rcfile works. Kindly provide your inputs so I can move forward with Sqoop HCatalog. Below are my observations and additional details.

Hadoop Distribution: HDP 2.6.5.165-3
Running Sqoop version: 1.4.6.2.6.5.165-3

Below is the Sqoop import command configured in Oozie.

<command>import -Dhadoop.security.credential.provider.path=jceks://hdfs/appz/xyz/credential/centralstore.jceks --connect jdbc:oracle:thin:@oxxx21:1621/oxxx21 --password-alias hdpedmingest@oram21 --username hdpedmingest --compression-codec org.apache.hadoop.io.compress.SnappyCodec --table CO0101.ADDRESS --validate --escaped-by \\ --null-string \\N --null-non-string \\N -m 1 --hcatalog-database srcpub_oracle_oxxx21_co0101 --hcatalog-table address_orc --drop-and-create-hcatalog-table --hcatalog-storage-stanza 'stored as orc tblproperties ("orc.compress"="SNAPPY")' </command>

Error:

2020-05-17 15:27:43,825 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: AS
2020-05-17 15:27:43,825 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: ORC
2020-05-17 15:27:43,825 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: tblproperties
2020-05-17 15:27:43,825 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: ("orc.compress"="SNAPPY")'

If I remove --hcatalog-storage-stanza, I get the error below.
2020-05-17 14:26:35,731 [main] INFO org.apache.sqoop.mapreduce.hcat.SqoopHCatUtilities - HCatalog Create table statement:
drop table `srcpub_oracle_oram21_co0101`.`address_orc`;
create table `srcpub_oracle_oram21_co0101`.`address_orc` (
    `cl_id` varchar(9),
    `cl_ad_sqn` decimal(10),
    `ad_lst_cg_dt` string,
    `cty_ad` varchar(31),
    `state_cd` varchar(2),
    `zip` varchar(9),
    `extnd_zip_cd` varchar(4),
    `str_one_ad` varchar(31),
    `str_two_ad` varchar(31),
    `str_thre_ad` varchar(31),
    `cntry_cd` varchar(3),
    `us_cnty_cd` varchar(3),
    `uclmd_mail_dt` string,
    `ad_ovrd_ir` varchar(1),
    `cntc_pt_tp_cd` varchar(3),
    `crea_logn_id_cd` varchar(8),
    `crea_ts` string,
    `lst_mod_logn_id_cd` varchar(8),
    `lst_mod_ts` string)
stored as rcfile
2020-05-17 15:36:35,731 [main] INFO org.apache.sqoop.mapreduce.hcat.SqoopHCatUtilities - Executing external HCatalog CLI process with args :-f,/tmp/hcat-script-1589739995731
2020-05-17 15:36:35,735 [main] ERROR org.apache.sqoop.Sqoop - Got exception running Sqoop: java.lang.NullPointerException
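
A note on the first set of errors (the "Unrecognized argument" lines): they look like what happens when the <command> text is split on every space, which is how the Oozie Sqoop action documentation describes <command> handling. Quotes are not interpreted in that case. Below is a rough Python model of that splitting, using a shortened version of the command; the variable names are illustrative, and plain str.split only approximates what Oozie does internally.

```python
# Shortened version of the <command> text from the post.
command = (
    "import --hcatalog-table address_orc "
    "--hcatalog-storage-stanza "
    "'stored as orc tblproperties (\"orc.compress\"=\"SNAPPY\")'"
)

# Model of splitting on every space: the single quotes are just characters.
tokens = command.split(" ")

# The token right after the flag is consumed as its value;
# everything after that is left over as stray arguments.
flag_index = tokens.index("--hcatalog-storage-stanza")
stanza_value = tokens[flag_index + 1]
stray_tokens = tokens[flag_index + 2:]

print("value seen by the flag:", stanza_value)
print("stray tokens:", stray_tokens)
```

In this model the flag only receives the first word of the stanza, and four stray tokens remain, the same pattern as the four "Unrecognized argument" lines in the log.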
Labels: Apache Oozie, Apache Sqoop