Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

How do I split columns on table that has varchar datatypes only?

Solved Go to solution

How do I split columns on table that has varchar datatypes only?

New Contributor
 
1 ACCEPTED SOLUTION

Accepted Solutions

Re: How do I split columns on table that has varchar datatypes only?

@Christian Lunesa

If you are talking about a Sqoop Import, that's the only tag with your question :), it is always highly recommended to use an Integral column as split-by. But since you have only string/varchar columns in your data source, you can try following options based on the data that you have.

  1. add surrogate int PK and use it also as a split or
  2. split your data manually using a custom query with WHERE clause and run sqoop few times with num-mappers=1, or
  3. apply some deterministic Integer non-aggregation function to you varchar column, for example cast(substr(...) as int) as split-column.

Let know if you need any other help!

4 REPLIES 4

Re: How do I split columns on table that has varchar datatypes only?

@Christian Lunesa

If you are talking about a Sqoop Import, that's the only tag with your question :), it is always highly recommended to use an Integral column as split-by. But since you have only string/varchar columns in your data source, you can try following options based on the data that you have.

  1. add surrogate int PK and use it also as a split or
  2. split your data manually using a custom query with WHERE clause and run sqoop few times with num-mappers=1, or
  3. apply some deterministic Integer non-aggregation function to you varchar column, for example cast(substr(...) as int) as split-column.

Let know if you need any other help!

Re: How do I split columns on table that has varchar datatypes only?

@Christian Lunesa

If the answer helped solving your query, please mark the answer as accepted :)

Re: How do I split columns on table that has varchar datatypes only?

New Contributor

thank you for answering my questions sir, do you mean I have to alter the table in mysql when adding surrogate int PK?

Highlighted

Re: How do I split columns on table that has varchar datatypes only?

New Contributor

I want to use the third option on the split-by but i am not seeing example of how to pass in that value. I have tried with single quote 'cast(id as unsigned)' and it throws error saying column not found on the sqoop run.

Don't have an account?
Coming from Hortonworks? Activate your account here