Member since: 02-27-2020
Posts: 173
Kudos Received: 42
Solutions: 48
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1046 | 11-29-2023 01:16 PM |
| | 1151 | 10-27-2023 04:29 PM |
| | 1138 | 07-07-2023 10:20 AM |
| | 2489 | 03-21-2023 08:35 AM |
| | 886 | 01-25-2023 08:50 PM |
11-20-2020
09:42 AM
Ok, fair enough. There is also a CDH-specific connector for Teradata, available here: https://www.cloudera.com/downloads/connectors/sqoop/teradata/1-7c6.html. Try that. The installation and usage guide is here: https://docs.cloudera.com/documentation/other/connectors/teradata/1-x/PDF/Cloudera-Connector-for-Teradata.pdf
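Once the connector is installed, a basic import might look like the sketch below. This is only an illustration: the host, database, table, user, and paths are placeholders, not details from this thread.

```shell
# Hypothetical sqoop import via the Teradata JDBC connector.
# Host, database, table, user, and paths are placeholders; adjust for your environment.
sqoop import \
  --connect jdbc:teradata://td-host.example.com/DATABASE=sales \
  --username etl_user \
  --password-file hdfs:///user/etl_user/.td_password \
  --table ORDERS \
  --target-dir /data/raw/orders \
  -m 4
```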
11-19-2020
09:52 AM
Your connection string says "teradat" instead of "teradata", which can also lead to a parsing error. Otherwise, have you tried exporting the entire table, without specifying individual columns?
11-18-2020
09:09 PM
The issue is not actually with the --columns parameter. The problem is that Sqoop can't parse the command because it expects "-m" instead of "--m". Remember: a single-letter short-form option takes a single dash, while long-form options take a double dash. Hope this helps!
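To illustrate the dash rule, a corrected command might look like this sketch (connection string and table name are placeholders):

```shell
# "--m 4" fails to parse; the single-letter option takes one dash: "-m 4".
# The equivalent long form is "--num-mappers 4".
sqoop export \
  --connect jdbc:teradata://td-host.example.com/DATABASE=sales \
  --table ORDERS \
  --export-dir /data/export/orders \
  -m 4
```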
11-13-2020
01:00 PM
With your original approach, each query can filter out whole partitions of the table based on its WHERE clause (that is, if your table is partitioned and at least some of the columns in the clause match those partitions). However, if your WHERE clauses are quite different from each other, then you will be scanning a large portion of the table for every one of your 100+ queries. With the suggested approach, there is only one scan of the table, but more processing happens for each row. The best way to find out which performs better is simply to test both and go with the winner.
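One way to get the single-scan behavior in Hive is a multi-table INSERT, where the source is read once and each branch applies its own filter per row. A minimal sketch, with invented table and column names:

```shell
# Hypothetical sketch: MyTable is scanned once, and each INSERT branch
# applies its own WHERE clause to every row. All names are placeholders.
hive -e "
FROM MyTable t
INSERT OVERWRITE TABLE result_open   SELECT t.* WHERE t.status = 'open'
INSERT OVERWRITE TABLE result_closed SELECT t.* WHERE t.status = 'closed';
"
```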
11-12-2020
10:25 AM
Do your WHERE conditions rely on different columns in MyTable, or all the same columns with different filter criteria? If it's the latter, then the answer is to partition your Hive table on those key columns. Also, if MyTable is not too big, it would be most efficient to run your 100 queries in memory with something like Spark SQL rather than Hive.
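If the filters do share key columns, partitioning on them might look like the sketch below, so each query prunes down to its own partition instead of scanning the full table. All names here are invented for illustration:

```shell
# Hypothetical sketch: repartition MyTable by the column the WHERE clauses
# share. Table and column names are placeholders.
hive -e "
SET hive.exec.dynamic.partition.mode=nonstrict;
CREATE TABLE MyTable_part (id BIGINT, amount DOUBLE)
PARTITIONED BY (region STRING)
STORED AS ORC;
INSERT OVERWRITE TABLE MyTable_part PARTITION (region)
SELECT id, amount, region FROM MyTable;
"
```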
11-03-2020
09:57 AM
1 Kudo
The error likely indicates that some AWS resources were not reachable from the CDP control plane. Double-check your security policy settings and any proxy settings. Reach out to support; they can assist better by looking at your particular environment setup. Regarding the logs: if the CM instance was stood up in your Data Lake, you can search the logs by clicking "Command logs" or "Service logs" in the Data Lake tab of your Data Lake environment.
11-03-2020
09:38 AM
The error indicates an issue with Kerberos. If things ran fine last week, perhaps your Kerberos ticket expired and needs to be renewed.
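A quick way to check and renew the ticket from the node where the job runs (the principal below is a placeholder):

```shell
# Check cached tickets and their expiry times.
klist

# Obtain a fresh ticket interactively (principal is a placeholder).
kinit user@EXAMPLE.COM

# For a headless service account, use a keytab instead.
kinit -kt /etc/security/keytabs/user.keytab user@EXAMPLE.COM
```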
10-21-2020
09:02 PM
Try adding these options to your sqoop command:

--relaxed-isolation
--metadata-transaction-isolation-level TRANSACTION_READ_UNCOMMITTED

More info here: https://docs.cloudera.com/HDPDocuments/HDP2/HDP-2.6.3/bk_data-movement-and-integration/content/controlling_trans_isol.html and here: https://www.tutorialspoint.com/java-connection-settransactionisolation-method-with-example

Hope this helps. Regards, Alex
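Put together, the command might look like this sketch (the connection string, table, and paths are placeholders, not from this thread):

```shell
# Hypothetical import with relaxed isolation, so metadata queries read
# uncommitted data instead of taking locks. All names are placeholders.
sqoop import \
  --connect jdbc:mysql://db-host.example.com/sales \
  --table ORDERS \
  --target-dir /data/raw/orders \
  --relaxed-isolation \
  --metadata-transaction-isolation-level TRANSACTION_READ_UNCOMMITTED
```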
10-21-2020
05:48 PM
Can you share a little more about your use case? Are you only appending to the table or also updating records (UPSERT)? Will there be duplicate records? Also, what version of Hive are you working with?
10-20-2020
03:17 PM
Hi Priyanshu, This may not be the answer you are looking for, but this may be a bug in Apache Oozie. Looking at the source code for CallbackService.java, the string that the callback tries to return is CALL_BACK_QUERY_STRING = "{0}?" + ID_PARAM + "{1}" + "&" + STATUS_PARAM + "{2}". Note that there is an & character; if it is not properly escaped, the XML/HTTP response will come back with exactly the error you mention (see here). Your best bet may be to use the alternate solution you proposed. Regards, Alex