Member since: 02-27-2020
Posts: 173
Kudos Received: 42
Solutions: 48

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 2399 | 11-29-2023 01:16 PM |
|  | 2941 | 10-27-2023 04:29 PM |
|  | 2415 | 07-07-2023 10:20 AM |
|  | 4910 | 03-21-2023 08:35 AM |
|  | 1670 | 01-25-2023 08:50 PM |
05-26-2020
11:59 AM
Ok, so to be able to purge the table created by sqoop, because it is external, you'll need to add this to your stanza: --hcatalog-storage-stanza 'stored as parquet TBLPROPERTIES("external.table.purge"="true")' Then, when you load the data for the first time, purging will be enabled on that table. Executing the purge command you have will then remove both the metadata and the data in the external table. Let me know if that works and if the solution is acceptable.
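For reference, a minimal sketch of how the stanza fits into a full import. The connection string and table names are placeholders, not from this thread:

```
# hypothetical connection and table names, shown only to place the stanza
sqoop import \
  --connect jdbc:mysql://db.example.com/sales \
  --table events \
  --hcatalog-database default \
  --hcatalog-table events \
  --create-hcatalog-table \
  --hcatalog-storage-stanza 'stored as parquet TBLPROPERTIES("external.table.purge"="true")'
```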
05-22-2020
12:51 PM
1 Kudo
Sqoop can only insert into a single Hive partition at a time. To accomplish what you are trying to do, you can run two separate sqoop commands:

- sqoop with --query ... where year(EventTime)=2019 (remove year(EventTime)=2020) and set --hive-partition-value 2019 (not 2020)
- sqoop with --query ... where year(EventTime)=2020 (remove year(EventTime)=2019) and set --hive-partition-value 2020 (not 2019)

This way sqoop will write into the one partition you want; a sketch of the two commands follows below. Since this is a one-time import, the solution should work just fine. Let me know if this works and accept the answer if it makes sense.
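A minimal sketch of the two imports; the connection string, table name, and partition key name are assumptions for illustration:

```
# hypothetical connection and names; $CONDITIONS is required in sqoop free-form queries
sqoop import \
  --connect 'jdbc:sqlserver://db.example.com;database=events' \
  --query 'SELECT * FROM Events WHERE year(EventTime)=2019 AND $CONDITIONS' \
  --hive-import --hive-table events \
  --hive-partition-key event_year --hive-partition-value 2019 \
  --target-dir /tmp/sqoop_events_2019 -m 1

# second run: change both the WHERE clause and the partition value to 2020
```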
05-22-2020
12:09 PM
Hi Heri, After you execute drop table test purge; can you check that the data is actually deleted? Query the table first, but also check with hdfs dfs to see whether the underlying files have been deleted from HDFS (they should be). Let me know what you see. You may be right that EXTERNAL table data does not get deleted and only the metadata is removed; that's why I'm asking you to check for data with hdfs dfs. Now, to be able to drop the EXTERNAL table (both metadata and data) you'd need to follow the steps here: https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.4/using-hiveql/content/hive_drop_external_table_data.html Hope that helps.
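A quick way to check, assuming the default HDP 3.x external warehouse location (adjust the path to wherever your table's data actually lives):

```
hdfs dfs -ls /warehouse/tablespace/external/hive/test
```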
05-22-2020
11:28 AM
Some additional information you could provide to help the community answer the question:

- Are there any errors returned by Java when querying HBase, or does it just silently return no rows?
- Is the same user executing both tasks (through the shell and through Java)?
- Can any other rows be retrieved from Java?
04-17-2020
07:58 AM
1 Kudo
Glad things are moving forward for you, Heri. Examining your sqoop command, I notice the following:

- --check-column EventTime tells sqoop to use this column as the timestamp column for its select logic.
- --incremental lastmodified tells sqoop that your source SQL table can have records both added to it AND updated in it. Sqoop assumes that when a record is updated or added, its EventTime is set to the current timestamp.

When you run this job for the first time, sqoop will pick up ALL available records (the initial load). It will then print out a --last-value timestampX. This timestamp is the cutoff point for the next run of the job (i.e. the next time you run the job with --exec incjob, it will set --last-value timestampX). So, to answer your question, it looks like sqoop is treating your job as an incremental load on the first run: [EventTime] < '2020-04-17 08:51:00.54'. When this job is kicked off again, it should pick up records from where it left off automatically. If you want, you can provide a manual --last-value timestamp for the initial load (see the sketch below), but make sure you don't use it on subsequent incremental loads. For more details, please review sections 7.2.7 and 11.4 of the Sqoop documentation. If this is helpful, don't forget to give kudos and accept the solution. Thank you!
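A minimal sketch of a saved incremental job; the connection string and table name are placeholders, incjob matches the job name in your --exec call, and the --last-value shown is the optional manual cutoff for the initial load:

```
sqoop job --create incjob -- import \
  --connect 'jdbc:sqlserver://db.example.com;database=events' \
  --table Events \
  --check-column EventTime \
  --incremental lastmodified \
  --last-value '2020-01-01 00:00:00' \
  --target-dir /data/events -m 1

# later runs reuse the saved cutoff automatically:
sqoop job --exec incjob
```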
04-15-2020
12:55 PM
2 Kudos
You will need to get both the year and the month of the original date (because of leap-year considerations). In terms of how to accomplish this in NiFi, one way is to use the ExecuteScript processor and Python's calendar library: calendar.monthrange(year, month) returns the number of days in the month (i.e. the last day of the month) as the second element of its result.
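A minimal sketch of the lookup itself (the surrounding ExecuteScript boilerplate is omitted; the year and month values are illustrative):

```python
import calendar

year, month = 2020, 2
_, last_day = calendar.monthrange(year, month)  # returns (weekday of the 1st, days in month)
print(last_day)  # 29 -- 2020 is a leap year
```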
04-15-2020
08:16 AM
1 Kudo
You can try the ReplaceText NiFi processor with the approach described here. That will be a clean way of doing what you want without much scripting.
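Purely for illustration (the linked approach is not reproduced here), a ReplaceText configuration generally takes this shape; the regex and replacement below are hypothetical:

```
Replacement Strategy : Regex Replace
Search Value         : (\d{4}),(\d{1,3}),(\d{1,2})
Replacement Value    : $1/$2/$3
```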
04-15-2020
08:10 AM
1 Kudo
This is the offending code:

```python
new = year+month+day+hour
# Considering date is in mm/dd/yyyy format
# converting the appended list to strings instead of ints
b = [str(x) for x in new]
# joining all the data without adding
b = '/'.join(b)
# convert to unix
dt_object2 = datetime.strptime(b, "%Y/%m/%d/%H")
```

It looks like at some point the values of year, month, day, hour are set to the strings "Year", "month", "DOY", "Hour". Then, when new = year+month+day+hour is called, the strings get concatenated into "YearmonthDOYHour". You then split and join that string, which is why you see a '/' character between each character in the Python error message. I'll leave it to you to debug this, as I've lost track of all the code changes at this point. Also note that the incoming data may be providing you with day of year (DOY) instead of day of month, which is what %d expects. You may need to use %j to parse that out, with zero padding (see the documentation; a sketch follows below). If this is helpful, don't forget to give kudos or accept the solution.
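A minimal sketch of parsing year / day-of-year / hour with %j; the values are illustrative:

```python
from datetime import datetime

year, doy, hour = 2020, 106, 7                # DOY 106 is April 15 in 2020
b = "%04d/%03d/%02d" % (year, doy, hour)      # zero-pad so %j can match day of year
dt_object2 = datetime.strptime(b, "%Y/%j/%H")
print(dt_object2)                             # 2020-04-15 07:00:00
```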
04-15-2020
07:37 AM
1 Kudo
Once you call IOUtils.toString, you get the text variable containing your message(s). Then it is appropriate to call json.loads on that text variable, as that is the function that converts a JSON text structure into a Python object (a dict). It should be something like this:

```python
text = IOUtils.toString(inputStream, StandardCharsets.UTF_8)
json_data = json.loads(text)
```

After this you should be able to access the elements of the JSON with json_data['Year']. Let me know if that works.
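For context, a minimal sketch of how this fits into an ExecuteScript (Jython) read callback; session, flowFile, and REL_SUCCESS are provided by the processor's scripting context, and the 'Year' key is assumed from your data:

```python
import json
from org.apache.commons.io import IOUtils
from java.nio.charset import StandardCharsets
from org.apache.nifi.processor.io import InputStreamCallback

class ReadJson(InputStreamCallback):
    def __init__(self):
        self.year = None
    def process(self, inputStream):
        text = IOUtils.toString(inputStream, StandardCharsets.UTF_8)
        json_data = json.loads(text)   # JSON text -> Python dict
        self.year = json_data['Year']

flowFile = session.get()
if flowFile is not None:
    callback = ReadJson()
    session.read(flowFile, callback)
    # callback.year now holds the parsed value
    session.transfer(flowFile, REL_SUCCESS)
```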
04-14-2020
04:32 PM
1 Kudo
Python is complaining about this line, most likely:

```python
#json_data = json.dumps(text)
json_data = json.loads(json_data)
```

Why is the initial assignment commented out? Without it you have a circular assignment for json_data (it is used on the right-hand side before it has ever been defined), and Python doesn't know what to do.
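A hedged fix, assuming text still holds the JSON string read from the flow file upstream: parse it directly rather than re-dumping it:

```python
json_data = json.loads(text)   # 'text' is the JSON string read earlier
```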