Member since: 04-03-2020
Posts: 25
Kudos Received: 2
Solutions: 1
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 2342 | 06-03-2020 12:54 PM
10-16-2020 01:48 PM
1 Kudo
@Hari When you create a table in Impala with data in Kudu, the metadata resides in the Hive metastore, so you can see the table in Hive as well, since Hive and Impala share the same metastore. But if you try to access the data by running a query in Hive over the Kudu table, you will get the error below:

FAILED: RuntimeException java.lang.ClassNotFoundException: org.apache.kudu.mapreduce.KuduTableInputFormat

This is because Hive in CDH does not support accessing Kudu. Hive/Kudu integration was added in Hive 4.0 via HIVE-12971 and is designed to work with Kudu 1.2+. You can try it out in CDP, where this feature is available.

https://cwiki.apache.org/confluence/display/Hive/Kudu+Integration
https://issues.apache.org/jira/browse/HIVE-12971
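To illustrate the scenario, here is a minimal sketch of the round trip described above. The host names, table name, and schema are all hypothetical; only the error text comes from the thread.

```shell
# Hypothetical hosts/table -- a sketch, not a definitive setup.
# Create a Kudu-backed table from Impala; its metadata lands in the Hive metastore.
impala-shell -i impalad-host:21000 -q "
  CREATE TABLE my_kudu_table (
    id BIGINT,
    name STRING,
    PRIMARY KEY (id)
  )
  PARTITION BY HASH (id) PARTITIONS 4
  STORED AS KUDU;"

# The table is visible from Hive, because Hive and Impala share the metastore...
beeline -u jdbc:hive2://hiveserver2-host:10000 -e "SHOW TABLES LIKE 'my_kudu_table';"

# ...but querying it from Hive on CDH fails with:
#   FAILED: RuntimeException java.lang.ClassNotFoundException:
#           org.apache.kudu.mapreduce.KuduTableInputFormat
beeline -u jdbc:hive2://hiveserver2-host:10000 -e "SELECT * FROM my_kudu_table LIMIT 1;"
```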
06-16-2020 03:30 AM
1 Kudo
@Heri There seem to be missing Sentry privileges on the HDFS URI. Please refer to the document below to add the privileges.

https://docs.cloudera.com/documentation/enterprise/latest/topics/sg_hive_sql.html#grant_privilege_on_uri

Hope this helps,
Paras

Was your question answered? Make sure to mark the answer as the accepted solution. If you find a reply useful, say thanks by clicking on the thumbs-up button.
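As a rough sketch of what granting a Sentry URI privilege looks like: the role name, group, and HDFS path below are assumptions for illustration, not values from the thread.

```shell
# Hypothetical role/group/path -- adjust to your environment.
# Grants the role access to the HDFS location referenced by the failing query.
beeline -u "jdbc:hive2://hiveserver2-host:10000" -e "
  CREATE ROLE etl_role;
  GRANT ROLE etl_role TO GROUP etl_users;
  GRANT ALL ON URI 'hdfs://nameservice1/data/staging' TO ROLE etl_role;"
```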
06-12-2020 05:27 PM
Hi Heri,

Glad that it helped, and thanks for the info.

Cheers,
Eric
06-06-2020 09:15 PM
Glad to hear that you have finally found the root cause of this issue. Thanks for sharing, @Heri
06-03-2020 12:54 PM
1 Kudo
Made it work! 🙂 I had to use --map-column-hive:

sqoop import \
  -Dhadoop.security.credential.provider.path=jceks://hdfs/user/lrm0613/mydb2.jceks \
  --connection-manager org.apache.sqoop.manager.SQLServerManager \
  --driver net.sourceforge.jtds.jdbc.Driver \
  --connect 'jdbc:jtds:sqlserver://SQLQP002:1433;useNTLMv2=true;domain=JNL_NT;databaseName=TC31Scheduler' \
  --username 'lrm0613' \
  --password-alias sqlserver2.password \
  --query 'select * from Job where id=0 and $CONDITIONS' \
  --hcatalog-database dataengsandbox \
  --hcatalog-table Job \
  --compress \
  --compression-codec snappy \
  --map-column-hive 'excludednodes=varchar(160)','errorparams=varchar(160)' \
  -m 1 \
  --create-hcatalog-table \
  --hcatalog-storage-stanza 'stored as parquet'
05-22-2020 12:51 PM
1 Kudo
Sqoop can only insert into a single Hive partition at a time. To accomplish what you are trying to do, run two separate sqoop commands:

1. sqoop with --query ... where year(EventTime)=2019 (remove year(EventTime)=2020) and set --hive-partition-value 2019 (not 2020)
2. sqoop with --query ... where year(EventTime)=2020 (remove year(EventTime)=2019) and set --hive-partition-value 2020 (not 2019)

This way each sqoop run writes into the one partition you want. Since this is a one-time import, the solution should work just fine. Let me know if this works, and accept the answer if it makes sense.
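The two commands above could be sketched as follows. The connection string, source table, Hive database/table, and target directories are placeholders, not values from the thread.

```shell
# A sketch under assumed names -- one sqoop run per target partition.
# Import 1: 2019 rows only, into partition year=2019
sqoop import \
  --connect 'jdbc:...' \
  --query 'select * from Events where year(EventTime)=2019 and $CONDITIONS' \
  --hive-import \
  --hive-table mydb.events \
  --hive-partition-key year \
  --hive-partition-value 2019 \
  --target-dir /tmp/sqoop_events_2019 \
  -m 1

# Import 2: 2020 rows only, into partition year=2020
sqoop import \
  --connect 'jdbc:...' \
  --query 'select * from Events where year(EventTime)=2020 and $CONDITIONS' \
  --hive-import \
  --hive-table mydb.events \
  --hive-partition-key year \
  --hive-partition-value 2020 \
  --target-dir /tmp/sqoop_events_2020 \
  -m 1
```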
04-17-2020 07:58 AM
1 Kudo
Glad things are moving forward for you, Heri. Examining your sqoop command, I notice the following:

- --check-column EventTime tells sqoop to use this column as the timestamp column for its select logic.
- --incremental lastmodified tells sqoop that your source SQL table can have records both added to it AND updated in it. Sqoop assumes that when a record is added or updated, its EventTime is set to the current timestamp.

When you run this job for the first time, sqoop picks up ALL available records (initial load). It then prints a --last-value timestampX. This timestamp is the cutoff point for the next run of the job (i.e. the next time you run the job with --exec incjob, it will set --last-value to timestampX).

So, to answer your question: sqoop is indeed treating your job as an incremental load on the first run: [EventTime] < '2020-04-17 08:51:00.54'. When the job is kicked off again, it should pick up records from where it left off automatically. If you want, you can provide a manual --last-value timestamp for the initial load, but make sure you don't use it on subsequent incremental loads.

For more details, please review sections 7.2.7 and 11.4 of the Sqoop documentation. If this is helpful, don't forget to give kudos and accept the solution. Thank you!
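A saved incremental job of the kind discussed above could look like the sketch below. The job name incjob and --check-column EventTime come from the thread; the connection string, table, merge key, and target directory are assumptions.

```shell
# A sketch under assumed names -- lastmodified mode stores its own cutoff.
sqoop job --create incjob -- import \
  --connect 'jdbc:...' \
  --table Events \
  --check-column EventTime \
  --incremental lastmodified \
  --merge-key id \
  --target-dir /data/events \
  -m 1

# First run: imports all records and records the --last-value cutoff internally.
sqoop job --exec incjob

# Subsequent runs pick up only rows whose EventTime is past the stored cutoff.
sqoop job --exec incjob

# Inspect the stored last-value between runs:
sqoop job --show incjob
```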