Member since: 04-03-2020
Posts: 25
Kudos Received: 2
Solutions: 1
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 2342 | 06-03-2020 12:54 PM
10-16-2020 01:48 PM
1 Kudo
@Hari When you create a table in Impala with data in Kudu, the metadata resides in the Hive metastore, so you can see the table in Hive as well, since Hive and Impala share the same metastore. But if you try to access the data by running a query in Hive over the Kudu table, you will get the error below:

FAILED: RuntimeException java.lang.ClassNotFoundException: org.apache.kudu.mapreduce.KuduTableInputFormat

This is because Hive in CDH does not support accessing Kudu. Hive/Kudu integration was added in Hive 4.0 via HIVE-12971 and is designed to work with Kudu 1.2+. You can try it out in CDP, where this feature is available.

https://cwiki.apache.org/confluence/display/Hive/Kudu+Integration
https://issues.apache.org/jira/browse/HIVE-12971
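To illustrate the scenario, here is a minimal sketch of the round trip described above. The host names, table name, and schema are all hypothetical; only the error text comes from the thread.

```shell
# Hypothetical hosts/table -- a sketch, not a definitive setup.
# Create a Kudu-backed table from Impala; its metadata lands in the Hive metastore.
impala-shell -i impalad-host:21000 -q "
  CREATE TABLE my_kudu_table (
    id BIGINT,
    name STRING,
    PRIMARY KEY (id)
  )
  PARTITION BY HASH (id) PARTITIONS 4
  STORED AS KUDU;"

# The table is visible from Hive, because Hive and Impala share the metastore...
beeline -u jdbc:hive2://hiveserver2-host:10000 -e "SHOW TABLES LIKE 'my_kudu_table';"

# ...but querying it from Hive on CDH fails with:
#   FAILED: RuntimeException java.lang.ClassNotFoundException:
#           org.apache.kudu.mapreduce.KuduTableInputFormat
beeline -u jdbc:hive2://hiveserver2-host:10000 -e "SELECT * FROM my_kudu_table LIMIT 1;"
```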
06-16-2020 03:30 AM
1 Kudo
@Heri There seem to be missing Sentry privileges on the HDFS URI. Please refer to the document below to add the privileges.

https://docs.cloudera.com/documentation/enterprise/latest/topics/sg_hive_sql.html#grant_privilege_on_uri

Hope this helps,
Paras

Was your question answered? Make sure to mark the answer as the accepted solution. If you find a reply useful, say thanks by clicking on the thumbs-up button.
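As a rough sketch of what granting a Sentry URI privilege looks like: the role name, group, and HDFS path below are assumptions for illustration, not values from the thread.

```shell
# Hypothetical role/group/path -- adjust to your environment.
# Grants the role access to the HDFS location referenced by the failing query.
beeline -u "jdbc:hive2://hiveserver2-host:10000" -e "
  CREATE ROLE etl_role;
  GRANT ROLE etl_role TO GROUP etl_users;
  GRANT ALL ON URI 'hdfs://nameservice1/data/staging' TO ROLE etl_role;"
```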
06-12-2020 05:27 PM
Hi Heri,

Glad that it helped, and thanks for the info.

Cheers,
Eric
06-06-2020 09:15 PM
Glad to hear that you have finally found the root cause of this issue. Thanks for sharing, @Heri
06-03-2020 12:54 PM
1 Kudo
Made it work! 🙂 I had to use --map-column-hive:

sqoop import \
  -Dhadoop.security.credential.provider.path=jceks://hdfs/user/lrm0613/mydb2.jceks \
  --connection-manager org.apache.sqoop.manager.SQLServerManager \
  --driver net.sourceforge.jtds.jdbc.Driver \
  --connect 'jdbc:jtds:sqlserver://SQLQP002:1433;useNTLMv2=true;domain=JNL_NT;databaseName=TC31Scheduler' \
  --username 'lrm0613' \
  --password-alias sqlserver2.password \
  --query 'select * from Job where id=0 and $CONDITIONS' \
  --hcatalog-database dataengsandbox \
  --hcatalog-table Job \
  --compress \
  --compression-codec snappy \
  --map-column-hive 'excludednodes=varchar(160)','errorparams=varchar(160)' \
  -m 1 \
  --create-hcatalog-table \
  --hcatalog-storage-stanza 'stored as parquet'
05-22-2020 12:51 PM
1 Kudo
Sqoop can only insert into a single Hive partition at a time. To accomplish what you are trying to do, run two separate sqoop commands:

1. sqoop with --query ... where year(EventTime)=2019 (remove year(EventTime)=2020) and set --hive-partition-value 2019 (not 2020)
2. sqoop with --query ... where year(EventTime)=2020 (remove year(EventTime)=2019) and set --hive-partition-value 2020 (not 2019)

This way each sqoop run writes into the one partition you want. Since this is a one-time import, the solution should work just fine. Let me know if this works, and accept the answer if it makes sense.
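The two commands above could be sketched as follows. The connection string, source table, Hive database/table, and target directories are placeholders, not values from the thread.

```shell
# A sketch under assumed names -- one sqoop run per target partition.
# Import 1: 2019 rows only, into partition year=2019
sqoop import \
  --connect 'jdbc:...' \
  --query 'select * from Events where year(EventTime)=2019 and $CONDITIONS' \
  --hive-import \
  --hive-table mydb.events \
  --hive-partition-key year \
  --hive-partition-value 2019 \
  --target-dir /tmp/sqoop_events_2019 \
  -m 1

# Import 2: 2020 rows only, into partition year=2020
sqoop import \
  --connect 'jdbc:...' \
  --query 'select * from Events where year(EventTime)=2020 and $CONDITIONS' \
  --hive-import \
  --hive-table mydb.events \
  --hive-partition-key year \
  --hive-partition-value 2020 \
  --target-dir /tmp/sqoop_events_2020 \
  -m 1
```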
04-17-2020 07:58 AM
1 Kudo
Glad things are moving forward for you, Heri. Examining your sqoop command, I notice the following:

- --check-column EventTime tells sqoop to use this column as the timestamp column for its select logic.
- --incremental lastmodified tells sqoop that your source SQL table can have records both added to it AND updated in it. Sqoop assumes that when a record is added or updated, its EventTime is set to the current timestamp.

When you run this job for the first time, sqoop picks up ALL available records (initial load). It then prints a --last-value timestampX. This timestamp is the cutoff point for the next run of the job (i.e. the next time you run the job with --exec incjob, it will set --last-value to timestampX).

So, to answer your question: sqoop is indeed treating your job as an incremental load on the first run: [EventTime] < '2020-04-17 08:51:00.54'. When the job is kicked off again, it should pick up records from where it left off automatically. If you want, you can provide a manual --last-value timestamp for the initial load, but make sure you don't use it on subsequent incremental loads.

For more details, please review sections 7.2.7 and 11.4 of the Sqoop documentation. If this is helpful, don't forget to give kudos and accept the solution. Thank you!
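A saved incremental job of the kind discussed above could look like the sketch below. The job name incjob and --check-column EventTime come from the thread; the connection string, table, merge key, and target directory are assumptions.

```shell
# A sketch under assumed names -- lastmodified mode stores its own cutoff.
sqoop job --create incjob -- import \
  --connect 'jdbc:...' \
  --table Events \
  --check-column EventTime \
  --incremental lastmodified \
  --merge-key id \
  --target-dir /data/events \
  -m 1

# First run: imports all records and records the --last-value cutoff internally.
sqoop job --exec incjob

# Subsequent runs pick up only rows whose EventTime is past the stored cutoff.
sqoop job --exec incjob

# Inspect the stored last-value between runs:
sqoop job --show incjob
```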